1
00:00:00,519 --> 00:00:03,186
(logo swooshes)

2
00:00:05,310 --> 00:00:08,483
- How to instantiate a Transformers model.

3
00:00:08,483 --> 00:00:11,790
In this video, we'll look at how we can create and use a model

4
00:00:11,790 --> 00:00:13,290
from the Transformers library.

5
00:00:14,310 --> 00:00:17,100
As we have seen before, the AutoModel class allows

6
00:00:17,100 --> 00:00:19,140
you to instantiate a pretrained model

7
00:00:19,140 --> 00:00:21,513
from any checkpoint on the Hugging Face Hub.

8
00:00:22,350 --> 00:00:23,910
It'll pick the right model class

9
00:00:23,910 --> 00:00:26,654
from the library to instantiate the proper architecture

10
00:00:26,654 --> 00:00:29,793
and load the weights of the pretrained model inside it.

11
00:00:30,690 --> 00:00:33,810
As we can see, when given a BERT checkpoint

12
00:00:33,810 --> 00:00:38,043
we end up with a BertModel, and similarly for GPT-2 or BART.

13
00:00:40,020 --> 00:00:42,360
Behind the scenes, this API can take the name

14
00:00:42,360 --> 00:00:44,250
of a checkpoint on the Hub,

15
00:00:44,250 --> 00:00:46,980
in which case it will download and cache the configuration file

16
00:00:46,980 --> 00:00:48,843
as well as the model weights file.

17
00:00:49,698 --> 00:00:52,710
You can also specify the path to a local folder

18
00:00:52,710 --> 00:00:55,290
that contains a valid configuration file and

19
00:00:55,290 --> 00:00:56,390
a model weights file.

20
00:00:57,600 --> 00:00:59,479
To instantiate the pretrained model,

21
00:00:59,479 --> 00:01:01,950
the AutoModel API will first open the configuration

22
00:01:01,950 --> 00:01:05,403
file to look at the configuration class that should be used.

23
00:01:06,420 --> 00:01:08,580
The configuration class depends on the type

24
00:01:08,580 --> 00:01:12,663
of the model: BERT, GPT-2 or BART, for instance.

25
00:01:13,680 --> 00:01:15,930
Once it has the proper configuration class,

26
00:01:15,930 --> 00:01:18,390
it can instantiate that configuration,

27
00:01:18,390 --> 00:01:21,900
which is a blueprint to know how to create the model.

28
00:01:21,900 --> 00:01:24,240
It also uses this configuration class to

29
00:01:24,240 --> 00:01:27,150
find the proper model class, which is then combined

30
00:01:27,150 --> 00:01:29,823
with the loaded configuration to load the model.

31
00:01:30,904 --> 00:01:33,210
This model is not yet a pretrained model

32
00:01:33,210 --> 00:01:35,883
as it has just been initialized with random weights.

33
00:01:36,840 --> 00:01:39,810
The last step is to load the weights from the model file

34
00:01:39,810 --> 00:01:40,923
inside this model.
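As a rough sketch of the loading behavior described above (the checkpoint names and the local path are illustrative examples, not necessarily the ones shown on screen):

```python
from transformers import AutoModel

# From a checkpoint name on the Hub: the configuration file and the
# model weights file are downloaded and cached automatically.
bert_model = AutoModel.from_pretrained("bert-base-cased")
gpt2_model = AutoModel.from_pretrained("gpt2")

# From a local folder containing a valid configuration file
# and a model weights file (placeholder path).
local_model = AutoModel.from_pretrained("./path/to/checkpoint")
```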
35
00:01:42,330 --> 00:01:44,250
To easily load the configuration of a model

36
00:01:44,250 --> 00:01:48,210
from any checkpoint or a folder containing the configuration file,

38
00:01:48,210 --> 00:01:50,373
we can use the AutoConfig class.

39
00:01:51,240 --> 00:01:52,693
Like the AutoModel class,

40
00:01:52,693 --> 00:01:55,693
it will pick the right configuration class from the library.

41
00:01:57,060 --> 00:01:59,220
We can also use the specific class corresponding

42
00:01:59,220 --> 00:02:01,470
to a checkpoint, but we would need to change

43
00:02:01,470 --> 00:02:03,000
the code each time we want to try

44
00:02:03,000 --> 00:02:04,550
a different model architecture.

45
00:02:06,030 --> 00:02:07,860
As we said before, the configuration

46
00:02:07,860 --> 00:02:10,350
of a model is a blueprint that contains all the

47
00:02:10,350 --> 00:02:13,830
information necessary to create the model architecture.

48
00:02:13,830 --> 00:02:15,990
For instance, the BERT model associated

49
00:02:15,990 --> 00:02:19,980
with the bert-base-cased checkpoint has 12 layers,

50
00:02:19,980 --> 00:02:24,980
a hidden size of 768 and a vocabulary size of 28,996.

51
00:02:28,020 --> 00:02:29,910
Once we have the configuration,

52
00:02:29,910 --> 00:02:31,950
we can create a model that has the same architecture as our checkpoint

53
00:02:31,950 --> 00:02:35,280
but is randomly initialized.

54
00:02:35,280 --> 00:02:36,660
We can then train it from scratch

55
00:02:36,660 --> 00:02:38,010
like any other PyTorch module.

56
00:02:39,497 --> 00:02:40,380
We can also change any part

57
00:02:40,380 --> 00:02:43,200
of the configuration by using keyword arguments.

58
00:02:43,200 --> 00:02:46,138
The second snippet of code instantiates

59
00:02:46,138 --> 00:02:48,360
a randomly initialized BERT model

60
00:02:48,360 --> 00:02:50,403
with 10 layers instead of 12.

61
00:02:51,409 --> 00:02:55,051
Saving a model once it's trained or fine-tuned is very easy.

62
00:02:55,051 --> 00:02:57,603
We just have to use the save_pretrained method.

63
00:02:58,500 --> 00:03:01,417
Here the model will be saved in a folder named

64
00:03:01,417 --> 00:03:04,473
"my-bert-model" inside the current working directory.

65
00:03:05,400 --> 00:03:08,255
Such a model can then be reloaded using the

66
00:03:08,255 --> 00:03:09,596
from_pretrained method.

67
00:03:09,596 --> 00:03:11,250
To learn how to easily push this model

68
00:03:11,250 --> 00:03:13,473
to the Hub, check out the push_to_hub API video.
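The following is a minimal sketch, using the standard Transformers API, of the configuration, keyword-argument, and save/reload steps the narration walks through; the exact snippets shown in the video may differ in detail.

```python
from transformers import AutoConfig, BertConfig, BertModel

# Load the configuration of a checkpoint; AutoConfig picks the right class.
bert_config = AutoConfig.from_pretrained("bert-base-cased")
print(bert_config.num_hidden_layers)  # 12
print(bert_config.hidden_size)        # 768
print(bert_config.vocab_size)         # 28996

# A model built from the configuration has the same architecture as the
# checkpoint but random weights, ready to be trained from scratch.
model = BertModel(bert_config)

# Keyword arguments override any part of the configuration:
# here, a randomly initialized BERT with 10 layers instead of 12.
smaller_config = BertConfig.from_pretrained("bert-base-cased", num_hidden_layers=10)
smaller_model = BertModel(smaller_config)

# Save a trained or fine-tuned model to a folder in the current working
# directory, then reload it later with from_pretrained.
model.save_pretrained("my-bert-model")
reloaded_model = BertModel.from_pretrained("my-bert-model")
```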