subtitles/zh-CN/11_instantiate-a-transformers-model-(tensorflow).srt (284 lines of code) (raw):

1 00:00:00,125 --> 00:00:02,958 (嘶嘶声) (whooshing sound) 2 00:00:05,463 --> 00:00:08,820 - 如何实例化 Transformers 模型? - How to instantiate the Transformers model? 3 00:00:08,820 --> 00:00:11,250 在本视频中,我们将了解如何创建 In this video, we will look at how we can create 4 00:00:11,250 --> 00:00:13,550 并使用 Transformers 库中的模型。 and use a model from the Transformers library. 5 00:00:15,000 --> 00:00:17,850 正如我们之前看到的,TFAutoModel 类 As we've seen before, the TFAutoModel class 6 00:00:17,850 --> 00:00:20,100 允许你实例化预训练模型, allows you to instantiate a pre-trained model 7 00:00:20,100 --> 00:00:22,503 从 Hugging Face Hub 上的任何一个 checkpoint。 from any checkpoint on the Hugging Face Hub. 8 00:00:23,430 --> 00:00:25,620 它将从库中选择正确的模型类 It will pick the right model class from the library 9 00:00:25,620 --> 00:00:27,750 来实例化适当的结构 to instantiate the proper architecture 10 00:00:27,750 --> 00:00:31,200 并在里面加载预训练模型的权重。 and load the weights of the pre-trained model inside. 11 00:00:31,200 --> 00:00:34,020 正如我们所见,当给定一个 BERT 的 checkpoint 时, As we can see, when given a BERT checkpoint, 12 00:00:34,020 --> 00:00:36,090 我们最终得到一个 TFBertModel, we end up with a TFBertModel, 13 00:00:36,090 --> 00:00:38,553 GPT2 或 BART 也类似。 and similarly for GPT2 or BART. 14 00:00:40,170 --> 00:00:42,510 在幕后,这个 API 接受 Behind the scenes, this API can take the name 15 00:00:42,510 --> 00:00:44,040 Hub 上的一个 checkpoint 的名字, of a checkpoint on the Hub, 16 00:00:44,040 --> 00:00:45,810 在这种情况下,它将下载并缓存 in which case it will download and cache 17 00:00:45,810 --> 00:00:48,660 配置文件以及模型权重文件。 the configuration file as well as the model weights file. 18 00:00:49,590 --> 00:00:52,020 你还可以指定本地文件夹的路径 You can also specify the path to a local folder 19 00:00:52,020 --> 00:00:54,090 包含有效的配置文件 that contains a valid configuration file 20 00:00:54,090 --> 00:00:55,340 和模型的权重文件。 and a model weights file. 21 00:00:56,670 --> 00:00:58,167 要实例化预训练模型, To instantiate the pre-trained model, 22 00:00:58,167 --> 00:01:02,400 TFAutoModel API 会首先打开配置文件 the TFAutoModel API will first open the configuration file 23 00:01:02,400 --> 00:01:05,253 以查看应该使用的配置类。 to look at the configuration class that should be used. 24 00:01:06,390 --> 00:01:09,660 配置类取决于模型的类型, The configuration class depends on the type of the model, 25 00:01:09,660 --> 00:01:12,333 例如 BERT、GPT2 或 BART。 BERT, GPT2 or BART for instance. 26 00:01:13,320 --> 00:01:15,720 一旦它有了正确的配置类, Once it has the proper configuration class, 27 00:01:15,720 --> 00:01:18,000 它可以实例化该配置, it can instantiate that configuration, 28 00:01:18,000 --> 00:01:21,090 这是了解如何创建模型的蓝图。 which is a blueprint to know how to create the model. 29 00:01:21,090 --> 00:01:22,770 它也使用这个配置类 It also uses this configuration class 30 00:01:22,770 --> 00:01:24,750 来找到合适的模型类, to find the proper model class, 31 00:01:24,750 --> 00:01:27,120 它与加载的配置相结合 which is combined with the loaded configuration 32 00:01:27,120 --> 00:01:28,143 加载模型。 to load the model. 33 00:01:29,250 --> 00:01:31,800 这个模型还不是我们的预训练模型 This model is not yet our pre-trained model 34 00:01:31,800 --> 00:01:34,560 因为它刚刚用随机权重初始化。 as it has just been initialized with random weights. 35 00:01:34,560 --> 00:01:36,690 最后一步是加载权重 The last step is to load the weights 36 00:01:36,690 --> 00:01:38,973 从该模型中的模型文件中。 from the model file inside this model. 37 00:01:40,230 --> 00:01:42,270 为了轻松加载模型的配置, To easily load the configuration of a model 38 00:01:42,270 --> 00:01:44,220 从任何 checkpoint 或 from any checkpoint or 39 00:01:44,220 --> 00:01:46,170 包含配置文件的文件夹, a folder containing the configuration file, 40 00:01:46,170 --> 00:01:47,790 我们可以使用 AutoConfig 类。 we can use the AutoConfig class. 41 00:01:47,790 --> 00:01:50,460 与 TFAutoModel 类一样, Like the TFAutoModel class, 42 00:01:50,460 --> 00:01:54,210 它将从库中选择正确的配置类。 it will pick the right configuration class from the library. 43 00:01:54,210 --> 00:01:56,040 我们也可以使用特定的类 We can also use the specific class 44 00:01:56,040 --> 00:01:57,840 对应一个 checkpoint , corresponding to a checkpoint, 45 00:01:57,840 --> 00:01:59,430 但我们需要更改代码 but we will need to change the code 46 00:01:59,430 --> 00:02:02,230 每当我们想尝试不同的模型结构时。 each time we want to try a different model architecture. 47 00:02:03,180 --> 00:02:05,353 正如我们之前所说,模型的配置 As we said before, the configuration of a model 48 00:02:05,353 --> 00:02:08,610 是包含所有必要信息的蓝图 is a blueprint that contains all the information necessary 49 00:02:08,610 --> 00:02:11,070 以创建模型结构。 to create the model architecture. 50 00:02:11,070 --> 00:02:12,750 例如,BERT 模型 For instance, the BERT model 51 00:02:12,750 --> 00:02:15,510 与 bert-base-cased 的 checkpoint 关联 associated with the bert-base-cased checkpoint 52 00:02:15,510 --> 00:02:19,710 有 12 层,隐藏大小为 768, has 12 layers, a hidden size of 768, 53 00:02:19,710 --> 00:02:23,403 词汇量为 28,996。 and a vocabulary size of 28,996. 54 00:02:24,810 --> 00:02:26,670 一旦我们有了配置, Once we have the configuration, 55 00:02:26,670 --> 00:02:28,890 我们可以创建一个具有相同结构的模型 we can create a model that has the same architecture 56 00:02:28,890 --> 00:02:32,160 作为我们的 checkpoint ,但是随机初始化的。 as our checkpoint but is randomly initialized. 57 00:02:32,160 --> 00:02:36,030 我们然后就可以像任何 TensorFlow 模型一样从头开始训练它。 We can then train it from scratch like any TensorFlow model. 58 00:02:36,030 --> 00:02:38,063 我们还可以更改配置的任何部分 We can also change any part of the configuration 59 00:02:38,063 --> 00:02:40,770 通过使用关键字参数。 by using keyword arguments. 60 00:02:40,770 --> 00:02:43,110 第二段代码实例化 The second snippet of code instantiates 61 00:02:43,110 --> 00:02:44,970 一个随机初始化的 BERT 模型 a randomly initialized BERT model 62 00:02:44,970 --> 00:02:46,983 有 10 层而不是 12 层。 with 10 layers instead of 12. 63 00:02:48,240 --> 00:02:51,360 训练或微调后保存模型非常容易。 Saving a model once it's trained or fine-tuned is very easy. 64 00:02:51,360 --> 00:02:53,880 我们只需要使用 save_pretrained 方法。 We just have to use the save_pretrained method. 65 00:02:53,880 --> 00:02:55,980 在这里,模型将保存在一个文件夹中 Here, the model will be saved in a folder 66 00:02:55,980 --> 00:02:59,463 在当前工作目录中命名为 my-bert-model。 named my-bert-model inside the current working directory. 67 00:03:00,480 --> 00:03:02,250 然后可以重新加载这样的模型 Such a model can then be reloaded 68 00:03:02,250 --> 00:03:04,500 使用 from_pretrained 方法。 using the from_pretrained method. 69 00:03:04,500 --> 00:03:06,600 要将其运行到 Hub 的项目模型, To run it to a projects model to the Hub, 70 00:03:06,600 --> 00:03:08,350 查看推送(咕哝)视频。 check out the push (mumbles) video. 71 00:03:09,355 --> 00:03:12,188 (嘶嘶声) (whooshing sound)