subtitles/zh-CN/10_instantiate-a-transformers-model-(pytorch).srt
1
00:00:00,519 --> 00:00:03,186
(标志嗖嗖声)
(logo swooshes)
2
00:00:05,310 --> 00:00:08,483
- 如何实例化 Transformers 模型。
- How to instantiate a Transformers model.
3
00:00:08,483 --> 00:00:11,790
在本视频中,我们将了解如何创建自己的模型,
In this video, we'll look at how we can create our own model

4
00:00:11,790 --> 00:00:13,290
使用 Transformers 库。
from the Transformers library.
5
00:00:14,310 --> 00:00:17,100
正如我们之前看到的,AutoModel 类允许
As we have seen before, the AutoModel class allows
6
00:00:17,100 --> 00:00:19,140
您去实例化一个预训练的模型。
you to instantiate a pretrained model
7
00:00:19,140 --> 00:00:21,513
从 Hugging Face Hub 的任何 checkpoint
*[译者注: checkpoint 意思是 检查点, 作为训练模型在训练时的备份]
from any checkpoint on the Hugging Face Hub.
8
00:00:22,350 --> 00:00:23,910
它会挑选合适的模型类
It'll pick the right model class
9
00:00:23,910 --> 00:00:26,654
从开源库中, 以实例化对应的结构
from the library to instantiate the proper architecture
10
00:00:26,654 --> 00:00:29,793
并在其中加载预训练模型的权重。
and load the weights of the pretrained model inside it.
11
00:00:30,690 --> 00:00:33,810
正如我们所见,当给定一个 BERT checkpoint 时,
As we can see, when given a BERT checkpoint
12
00:00:33,810 --> 00:00:38,043
我们最终会得到一个 BertModel;类似地,GPT-2 或 BART 也是如此。
we end up with a BertModel and similarly, for GPT-2 or BART.
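[译者注: 下面是一个最小示例, 演示视频中描述的 AutoModel 行为; "bert-base-cased" 是课程中提到的 checkpoint, 首次调用需要联网下载。]
A minimal sketch of the behavior described above, using the bert-base-cased checkpoint mentioned later in this video (the first call downloads from the Hub):

```python
from transformers import AutoModel

# Download (and cache) the configuration and weights of a checkpoint
# from the Hugging Face Hub; requires network access on the first call.
model = AutoModel.from_pretrained("bert-base-cased")

# AutoModel picked the matching class for this checkpoint
print(type(model).__name__)  # BertModel
```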
13
00:00:40,020 --> 00:00:42,360
在幕后,这个 API 可以接受
Behind the scenes, this API can take the name
14
00:00:42,360 --> 00:00:44,250
Hub 上一个 checkpoint 的名字
of a checkpoint on the Hub
15
00:00:44,250 --> 00:00:46,980
在这种情况下,它将下载和缓存配置文件
in which case it will download and cache the configuration file
16
00:00:46,980 --> 00:00:48,843
以及模型权重文件。
as well as a model weights file.
17
00:00:49,698 --> 00:00:52,710
您也可以指定一个本地文件夹的路径,
You can also specify the path to a local folder
18
00:00:52,710 --> 00:00:55,290
这个文件夹包含一个有效的配置文件和
that contains a valid configuration file and
19
00:00:55,290 --> 00:00:56,390
一个模型权重文件。
a model weights file.
20
00:00:57,600 --> 00:00:59,479
为了实例化预训练的模型,
To instantiate the pretrained model,
21
00:00:59,479 --> 00:01:01,950
AutoModel API 会先打开配置文件
the AutoModel API will first open the configuration
22
00:01:01,950 --> 00:01:05,403
查看应该使用的配置类。
file to look at the configuration class that should be used.
23
00:01:06,420 --> 00:01:08,580
配置类取决于模型的类型,
The configuration class depends on the type
24
00:01:08,580 --> 00:01:12,663
例如模型 BERT、GPT-2 或 BART。
of the model: BERT, GPT-2 or BART, for instance.
25
00:01:13,680 --> 00:01:15,930
一旦有了一个合适的配置类,
Once it has a proper configuration class,
26
00:01:15,930 --> 00:01:18,390
就可以实例化那个配置,
it can instantiate that configuration
27
00:01:18,390 --> 00:01:21,900
其包含一张关于如何创建模型的蓝图。
which is a blueprint to know how to create the model.
28
00:01:21,900 --> 00:01:24,240
它还使用这个配置类
It also uses this configuration class to
29
00:01:24,240 --> 00:01:27,150
去寻找合适的模型类,其然后被
find the proper model class, which is then combined
30
00:01:27,150 --> 00:01:29,823
和加载后的配置一起来加载模型。
with the loaded configuration to load the model.
31
00:01:30,904 --> 00:01:33,210
这个模型还不是一个预训练模型,
This model is not yet a pretrained model
32
00:01:33,210 --> 00:01:35,883
因为它刚刚用随机权重完成了初始化。
as it has just been initialized with random weights.
33
00:01:36,840 --> 00:01:39,810
最后一步是从模型文件加载权重
The last step is to load the weights from the model file
34
00:01:39,810 --> 00:01:40,923
到模型里
inside this model.
35
00:01:42,330 --> 00:01:44,250
为了方便地加载一个模型的配置
To easily load the configuration of a model
36
00:01:44,250 --> 00:01:48,210
从任何 checkpoint 或包含配置文件的文件夹中,
from any checkpoint or folder containing the configuration file,
38
00:01:48,210 --> 00:01:50,373
我们可以使用 AutoConfig 类。
we can use the AutoConfig class.
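[译者注: 下面是 AutoConfig 用法的简单示例, 仅下载很小的配置文件。]
A small sketch of the AutoConfig behavior just described (only the small config.json is downloaded, not the weights):

```python
from transformers import AutoConfig, BertConfig

# AutoConfig reads the checkpoint's configuration file and returns
# an instance of the matching configuration class
config = AutoConfig.from_pretrained("bert-base-cased")
print(type(config).__name__)  # BertConfig
```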
39
00:01:51,240 --> 00:01:52,693
像 AutoModel 类一样,
Like the AutoModel class,
40
00:01:52,693 --> 00:01:55,693
它将从开源库中挑选合适的配置类。
it will pick the right configuration class from the library.
41
00:01:57,060 --> 00:01:59,220
我们也可以使用一个特定的类来对应
We can also use a specific class corresponding
42
00:01:59,220 --> 00:02:01,470
一个 checkpoint,但每当我们想尝试
to a checkpoint, but we will need to change

43
00:02:01,470 --> 00:02:03,000
不同的模型结构时,
the code each time we want to try

44
00:02:03,000 --> 00:02:04,550
就需要修改代码。
a different model architecture.
45
00:02:06,030 --> 00:02:07,860
正如我们刚才所说的,一个模型的配置就是
As we said before, the configuration
46
00:02:07,860 --> 00:02:10,350
一张蓝图,其包括了
of a model is a blueprint that contains all the
47
00:02:10,350 --> 00:02:13,830
创建模型架构所需的所有信息。
information necessary to create the model architecture.
48
00:02:13,830 --> 00:02:15,990
例如,关联到 bert-base-cased checkpoint 的
For instance, the BERT model associated
49
00:02:15,990 --> 00:02:19,980
BERT 模型有 12 层,
with the bert-base-cased checkpoint has 12 layers,
50
00:02:19,980 --> 00:02:24,980
768 的隐藏层大小 (hidden size) 和 28,996 的词表大小。
a hidden size of 768 and a vocabulary size of 28,996.
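[译者注: 下面用代码重现视频中列出的 bert-base-cased 架构参数。]
The architecture values just listed can be written out as a configuration by hand, as a sketch:

```python
from transformers import BertConfig

# The bert-base-cased architecture described in the video:
# 12 layers, hidden size 768, vocabulary size 28,996
config = BertConfig(
    num_hidden_layers=12,
    hidden_size=768,
    vocab_size=28996,
)
print(config.num_hidden_layers, config.hidden_size, config.vocab_size)
```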
51
00:02:28,020 --> 00:02:29,910
一旦我们有了配置,
Once we have the configuration,
52
00:02:29,910 --> 00:02:31,950
我们就可以创建一个和 checkpoint 有着同样架构的模型,
we can create a model that has the same architecture as our checkpoint,
53
00:02:31,950 --> 00:02:35,280
但是模型是随机初始化的。
but is randomly initialized.
54
00:02:35,280 --> 00:02:36,660
然后我们可以从头开始训练它,
We can then train it from scratch,

55
00:02:36,660 --> 00:02:38,010
就像任何其他 PyTorch 模块一样。
like any other PyTorch module.
56
00:02:39,497 --> 00:02:40,380
我们也可以通过改变
We can also change any part
57
00:02:40,380 --> 00:02:43,200
配置的任何部分, 使用关键字参数
of the configuration by using keyword arguments.
58
00:02:43,200 --> 00:02:46,138
第二段代码实例化了
The second snippet of code instantiates
59
00:02:46,138 --> 00:02:48,360
一个随机初始化的 BERT 模型,
a randomly initialized BERT model
60
00:02:48,360 --> 00:02:50,403
这个模型有 10 层而非 12 层。
with 10 layers instead of 12.
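[译者注: 下面是该片段所描述代码的简单示意: 用关键字参数覆盖配置, 得到随机初始化的 10 层 BERT。]
A sketch of the snippet described here, assuming the standard BertConfig keyword arguments:

```python
from transformers import BertConfig, BertModel

# Keyword arguments override any part of the default blueprint
config = BertConfig(num_hidden_layers=10)

# Built with this architecture but random weights, so it would
# need to be trained before being useful
model = BertModel(config)
print(model.config.num_hidden_layers)  # 10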
61
00:02:51,409 --> 00:02:55,051
一个模型被训练或微调后,想要保存这个模型是很容易的。
Saving a model once it's trained or fine-tuned is very easy.
62
00:02:55,051 --> 00:02:57,603
我们只需要使用 save_pretrained 方法。
We just have to use the save_pretrained method.
63
00:02:58,500 --> 00:03:01,417
这里模型将保存在当前工作目录下
Here the model will be saved in a folder named
64
00:03:01,417 --> 00:03:04,473
一个名为 "my-bert-model" 的文件夹中。
"my-bert-model" inside the current working directory.
65
00:03:05,400 --> 00:03:08,255
然后,已保存的模型可以使用
Such a model can then be reloaded using

66
00:03:08,255 --> 00:03:09,596
from_pretrained 函数重新加载进来。
the from_pretrained method.
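[译者注: 下面是保存再重新加载的完整示意; 这里用一个随机初始化的模型代替训练好的模型。]
A save-then-reload sketch of the two steps above; a randomly initialized model stands in for a trained one:

```python
from transformers import BertConfig, BertModel

# Stand-in for a trained or fine-tuned model
model = BertModel(BertConfig(num_hidden_layers=10))

# Writes the configuration file and the weights file
# into ./my-bert-model, the folder named in the video
model.save_pretrained("my-bert-model")

# Reload the saved model from the local folder
reloaded = BertModel.from_pretrained("my-bert-model")
print(reloaded.config.num_hidden_layers)  # 10
```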
67
00:03:09,596 --> 00:03:11,250
要了解如何轻松地将这个模型推送到 Hub,
To learn how to easily push this model

68
00:03:11,250 --> 00:03:13,473
请查看 push to Hub 相关视频。
to the Hub, check out the push to Hub video.