1
00:00:00,519 --> 00:00:03,186
(logo swooshes)

2
00:00:05,310 --> 00:00:08,483
- How to instantiate a Transformers model.

3
00:00:08,483 --> 00:00:11,790
In this video, we'll look at how we can create and use a model

4
00:00:11,790 --> 00:00:13,290
from the Transformers library.

5
00:00:14,310 --> 00:00:17,100
As we have seen before, the AutoModel class allows

6
00:00:17,100 --> 00:00:19,140
you to instantiate a pretrained model

7
00:00:19,140 --> 00:00:21,513
from any checkpoint on the Hugging Face Hub.

8
00:00:22,350 --> 00:00:23,910
It'll pick the right model class

9
00:00:23,910 --> 00:00:26,654
from the library to instantiate the proper architecture

10
00:00:26,654 --> 00:00:29,793
and load the weights of the pretrained model inside it.

11
00:00:30,690 --> 00:00:33,810
As we can see, when given a BERT checkpoint

12
00:00:33,810 --> 00:00:38,043
we end up with a BertModel, and similarly for GPT-2 or BART.

13
00:00:40,020 --> 00:00:42,360
Behind the scenes, this API can take the name

14
00:00:42,360 --> 00:00:44,250
of a checkpoint on the Hub,

15
00:00:44,250 --> 00:00:46,980
in which case it will download and cache the configuration file

16
00:00:46,980 --> 00:00:48,843
as well as the model weights file.

17
00:00:49,698 --> 00:00:52,710
You can also specify the path to a local folder

18
00:00:52,710 --> 00:00:55,290
that contains a valid configuration file and

19
00:00:55,290 --> 00:00:56,390
a model weights file.

20
00:00:57,600 --> 00:00:59,479
To instantiate the pretrained model,

21
00:00:59,479 --> 00:01:01,950
the AutoModel API will first open the configuration

22
00:01:01,950 --> 00:01:05,403
file to look at the configuration class that should be used.

23
00:01:06,420 --> 00:01:08,580
The configuration class depends on the type

24
00:01:08,580 --> 00:01:12,663
of the model: BERT, GPT-2 or BART, for instance.

25
00:01:13,680 --> 00:01:15,930
Once it has the proper configuration class,

26
00:01:15,930 --> 00:01:18,390
it can instantiate that configuration,

27
00:01:18,390 --> 00:01:21,900
which is a blueprint to know how to create the model.

28
00:01:21,900 --> 00:01:24,240
It also uses this configuration class to

29
00:01:24,240 --> 00:01:27,150
find the proper model class, which is then combined

30
00:01:27,150 --> 00:01:29,823
with the loaded configuration to load the model.

31
00:01:30,904 --> 00:01:33,210
This model is not yet a pretrained model

32
00:01:33,210 --> 00:01:35,883
as it has just been initialized with random weights.

33
00:01:36,840 --> 00:01:39,810
The last step is to load the weights from the model file

34
00:01:39,810 --> 00:01:40,923
inside this model.
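As a rough sketch of the loading behavior described above (the checkpoint names and the local path are illustrative examples, not necessarily the ones shown on screen):

```python
from transformers import AutoModel

# From a checkpoint name on the Hub: the configuration file and the
# model weights file are downloaded and cached automatically.
bert_model = AutoModel.from_pretrained("bert-base-cased")
gpt2_model = AutoModel.from_pretrained("gpt2")

# From a local folder containing a valid configuration file
# and a model weights file (placeholder path).
local_model = AutoModel.from_pretrained("./path/to/checkpoint")
```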
35
00:01:42,330 --> 00:01:44,250
To easily load the configuration of a model

36
00:01:44,250 --> 00:01:48,210
from any checkpoint or a folder containing the configuration file,

38
00:01:48,210 --> 00:01:50,373
we can use the AutoConfig class.

39
00:01:51,240 --> 00:01:52,693
Like the AutoModel class,

40
00:01:52,693 --> 00:01:55,693
it will pick the right configuration class from the library.

41
00:01:57,060 --> 00:01:59,220
We can also use the specific class corresponding

42
00:01:59,220 --> 00:02:01,470
to a checkpoint, but we would need to change

43
00:02:01,470 --> 00:02:03,000
the code each time we want to try

44
00:02:03,000 --> 00:02:04,550
a different model architecture.

45
00:02:06,030 --> 00:02:07,860
As we said before, the configuration

46
00:02:07,860 --> 00:02:10,350
of a model is a blueprint that contains all the

47
00:02:10,350 --> 00:02:13,830
information necessary to create the model architecture.

48
00:02:13,830 --> 00:02:15,990
For instance, the BERT model associated

49
00:02:15,990 --> 00:02:19,980
with the bert-base-cased checkpoint has 12 layers,

50
00:02:19,980 --> 00:02:24,980
a hidden size of 768 and a vocabulary size of 28,996.

51
00:02:28,020 --> 00:02:29,910
Once we have the configuration,

52
00:02:29,910 --> 00:02:31,950
we can create a model that has the same architecture as our checkpoint

53
00:02:31,950 --> 00:02:35,280
but is randomly initialized.

54
00:02:35,280 --> 00:02:36,660
We can then train it from scratch

55
00:02:36,660 --> 00:02:38,010
like any other PyTorch module.

56
00:02:39,497 --> 00:02:40,380
We can also change any part

57
00:02:40,380 --> 00:02:43,200
of the configuration by using keyword arguments.

58
00:02:43,200 --> 00:02:46,138
The second snippet of code instantiates

59
00:02:46,138 --> 00:02:48,360
a randomly initialized BERT model

60
00:02:48,360 --> 00:02:50,403
with 10 layers instead of 12.

61
00:02:51,409 --> 00:02:55,051
Saving a model once it's trained or fine-tuned is very easy.

62
00:02:55,051 --> 00:02:57,603
We just have to use the save_pretrained method.

63
00:02:58,500 --> 00:03:01,417
Here the model will be saved in a folder named

64
00:03:01,417 --> 00:03:04,473
"my-bert-model" inside the current working directory.

65
00:03:05,400 --> 00:03:08,255
Such a model can then be reloaded using the

66
00:03:08,255 --> 00:03:09,596
from_pretrained method.

67
00:03:09,596 --> 00:03:11,250
To learn how to easily push this model

68
00:03:11,250 --> 00:03:13,473
to the Hub, check out the push_to_hub API video.
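The following is a minimal sketch, using the standard Transformers API, of the configuration, keyword-argument, and save/reload steps the narration walks through; the exact snippets shown in the video may differ in detail.

```python
from transformers import AutoConfig, BertConfig, BertModel

# Load the configuration of a checkpoint; AutoConfig picks the right class.
bert_config = AutoConfig.from_pretrained("bert-base-cased")
print(bert_config.num_hidden_layers)  # 12
print(bert_config.hidden_size)        # 768
print(bert_config.vocab_size)         # 28996

# A model built from the configuration has the same architecture as the
# checkpoint but random weights, ready to be trained from scratch.
model = BertModel(bert_config)

# Keyword arguments override any part of the configuration:
# here, a randomly initialized BERT with 10 layers instead of 12.
smaller_config = BertConfig.from_pretrained("bert-base-cased", num_hidden_layers=10)
smaller_model = BertModel(smaller_config)

# Save a trained or fine-tuned model to a folder in the current working
# directory, then reload it later with from_pretrained.
model.save_pretrained("my-bert-model")
reloaded_model = BertModel.from_pretrained("my-bert-model")
```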