subtitles/zh-CN/33_the-push-to-hub-api-(pytorch).srt (436 lines of code) (raw):

1 00:00:00,321 --> 00:00:01,497 (空气呼啸) (air whooshing) 2 00:00:01,497 --> 00:00:02,330 (笑脸弹出) (smiley face popping) 3 00:00:02,330 --> 00:00:05,130 (空气呼啸) (air whooshing) 4 00:00:05,130 --> 00:00:06,830 - [Instructor] push_to_hub API。 - [Instructor] So push_to_hub API. 5 00:00:08,310 --> 00:00:10,533 让我们看一下 push_to_hub API。 Let's have a look at the push_to_hub API. 6 00:00:11,730 --> 00:00:14,640 你需要使用你的 Hugging Face 帐户登录 You will need to be logged in with your Hugging Face account 7 00:00:14,640 --> 00:00:17,400 你可以通过执行第一个单元格里的操作来登录, which you can do by executing this first cell, 8 00:00:17,400 --> 00:00:21,123 或者在 terminal 中输入 huggingface-cli login。 or by typing huggingface-cli login in a terminal. 9 00:00:21,990 --> 00:00:26,640 只需输入你的用户名和密码,然后点击登录, Just enter you username and password, then click login, 10 00:00:26,640 --> 00:00:28,620 登录后将存储一个授权 token this will store a authentication token 11 00:00:28,620 --> 00:00:30,670 到你正在使用的机器的缓存中。 in the cache of the machine you're using. 12 00:00:31,890 --> 00:00:35,790 现在,让我们基于 GLUE COLA 数据集 Now, let's launch a fine tuning of a BERT model 13 00:00:35,790 --> 00:00:37,920 对 BERT 模型进行微调。 on the GLUE COLA dataset. 14 00:00:37,920 --> 00:00:39,600 我们不会深入探讨微调代码 We won't go over the fine tuning code 15 00:00:39,600 --> 00:00:42,270 你可以在任何 transformer 教程 because you can find it in any transformer tutorial, 16 00:00:42,270 --> 00:00:44,670 或查看下面的视频链接找到相关参考。 or by looking at the videos link below. 17 00:00:44,670 --> 00:00:46,470 我们感兴趣的是 What interests us here is 18 00:00:46,470 --> 00:00:48,970 如何在训练期间利用 model hub。 how we can leverage the model hub during training. 19 00:00:49,860 --> 00:00:52,980 这是通过 “push_to_hub=true” 参数完成的 This is done with the "push_to_hub=true" argument 20 00:00:52,980 --> 00:00:55,530 将该参数添加到你的 TrainingArguments。 passed in your TrainingArguments. 21 00:00:55,530 --> 00:00:57,240 每次保存时将自动上传 This will automatically upload your model 22 00:00:57,240 --> 00:00:59,400 你的模型到 Hub,在我们的示例中, to the Hub each time it is saved, 23 00:00:59,400 --> 00:01:01,323 每个 epoch 都会如此操作。 so every epoch in our case. 24 00:01:02,280 --> 00:01:04,860 如果当前的被打断 This allows you to resume training from a different machine 25 00:01:04,860 --> 00:01:06,873 这允许你从不同的机器恢复训练之前的训练。 if the current one gets interrupted. 26 00:01:08,220 --> 00:01:10,440 该模型将使用 The model will be updated in your namespace 27 00:01:10,440 --> 00:01:14,640 你默认选择的输出目录的名称在你的 namespace 中更新。 with the name of the output directory you picked by default. 28 00:01:14,640 --> 00:01:16,020 你可以通过将其 You can choose another name 29 00:01:16,020 --> 00:01:19,113 传递给 hub_model_id 参数选择其他名称。 by passing it to the hub_model_id argument. 30 00:01:20,070 --> 00:01:23,370 你还可以通过传递完整的仓库名称 You can also push inside an organization you are a member of 31 00:01:23,370 --> 00:01:25,740 将模型 push 到你所属的组织内部, by passing a full repository name, 32 00:01:25,740 --> 00:01:28,933 使用 organization/ 的形式, with the name of the organization/, 33 00:01:28,933 --> 00:01:30,433 再加上你所选的 model ID。 the model ID you want to pick. 34 00:01:32,250 --> 00:01:34,650 完成后,我们就可以开始训练了, With that done, we can just launch training, 35 00:01:34,650 --> 00:01:36,093 稍等一下。 and wait a little bit. 36 00:01:36,960 --> 00:01:39,033 视频中会跳过等待的过程。 I'll cut the waiting time from the video. 37 00:01:43,260 --> 00:01:46,350 请注意,模型是异步 push 的, Note that the model is pushed asynchronously, 38 00:01:46,350 --> 00:01:47,730 意味着当你的模型上传到 hub 时, meaning that the training continues 39 00:01:47,730 --> 00:01:49,730 训练将继续进行。 while your model is uploaded to the hub. 40 00:01:51,060 --> 00:01:52,950 当你的第一次 commit 完成时, When your first commit is finished, 41 00:01:52,950 --> 00:01:55,650 你可以通过查看你的 namespace you can go inspect your model on the Hub 42 00:01:55,650 --> 00:01:57,960 去 Hub 检查你的模型, by looking inside your namespace, 43 00:01:57,960 --> 00:01:59,943 你会在最上面找到它。 and you'll find it at the very top. 44 00:02:01,980 --> 00:02:04,200 你甚至可以在继续训练的同时 You can even start playing with its inference widget 45 00:02:04,200 --> 00:02:06,630 开始使用它的 inference 小部件。 while it's continuing the training. 46 00:02:06,630 --> 00:02:09,270 Cola 数据集让模型确定 The Cola dataset tasks the model with determining 47 00:02:09,270 --> 00:02:11,970 句子在语法上是否是正确的。 if the sentence is grammatically correct on that. 48 00:02:11,970 --> 00:02:15,510 所以我们选择一个错误句子的例子来测试它。 So we pick an example of incorrect sentence to test it. 49 00:02:15,510 --> 00:02:16,950 请注意,第一次尝试使用它时, Note that it'll take a bit of time 50 00:02:16,950 --> 00:02:18,750 这需要一些时间才能在 inference API 中 to load your model inside the inference APIs, 51 00:02:18,750 --> 00:02:20,880 完成模型加载。 so first time you try to use it. 52 00:02:20,880 --> 00:02:23,280 我们将根据时间从视频中删掉。 We'll cut by time from the video. 53 00:02:23,280 --> 00:02:24,870 标签有点问题, There is something wrong with the labels, 54 00:02:24,870 --> 00:02:27,360 我们稍后会在本视频中修复它。 but we'll fix it later in this video. 55 00:02:27,360 --> 00:02:29,520 一旦你的训练完成, Once your training is finished, 56 00:02:29,520 --> 00:02:31,770 你应该使用 trainer.push_to_hub 方法 you should do one last push with the trainer 57 00:02:31,770 --> 00:02:33,840 最后再提交一次。 that pushed to a method. 58 00:02:33,840 --> 00:02:35,430 这其中有两个原因。 This is for two reason. 59 00:02:35,430 --> 00:02:36,750 首先,若你尚未完成 First, this will make sure 60 00:02:36,750 --> 00:02:39,180 这将确保你正在预测模型的 you are predicting the final version of your model 61 00:02:39,180 --> 00:02:40,680 最终版本。 if you didn't already. 62 00:02:40,680 --> 00:02:42,480 例如,如果你曾经是在每一步保存 For instance, if you used to save 63 00:02:42,480 --> 00:02:46,980 而不是每秒保存, every in step strategy instead of every second, 64 00:02:46,980 --> 00:02:48,180 这将创建一个 model card this will draft a model card 65 00:02:48,180 --> 00:02:51,120 那将是你的 model repo 的最初始页面。 that will be the landing page of your model repo. 66 00:02:51,120 --> 00:02:52,260 commit 完成后, Once the commit is done, 67 00:02:52,260 --> 00:02:54,810 让我们回到我们的 model 页面并刷新。 let's go back on our model page and refresh. 68 00:02:54,810 --> 00:02:56,820 我们可以看到 model card 的草稿 We can see the drafters model card 69 00:02:56,820 --> 00:02:58,080 其中包括信息, which includes information, 70 00:02:58,080 --> 00:03:00,381 以及哪个模型被调整过。 and which one model we find tuned. 71 00:03:00,381 --> 00:03:03,570 接下来是最终的评估 loss 和 metric, So final evaluation loss and metric, 72 00:03:03,570 --> 00:03:06,300 使用过的训练超参数, the training hyperparameter used, 73 00:03:06,300 --> 00:03:08,670 中间训练结果, the intermediate training results, 74 00:03:08,670 --> 00:03:10,320 以及我们使用的框架版本 and the framework versions we used 75 00:03:10,320 --> 00:03:13,173 以便其他人可以轻松地重现我们的结果。 so that other people can easily reproduce our results. 76 00:03:15,270 --> 00:03:16,860 在所有这些信息之上, On top of all that information, 77 00:03:16,860 --> 00:03:19,740 trainer 还包括一些 metadata,它可以 the trainer also included some metadata that is interpreted 78 00:03:19,740 --> 00:03:22,650 通过 model cloud 上的 HuggingFace 网站解析。 by the Hugging Face website in the model cloud. 79 00:03:22,650 --> 00:03:26,010 你将会得到一个漂亮的 widget 所返回的相关指标数值 You get the value of the metrics reported in a nice widget 80 00:03:26,010 --> 00:03:29,640 以及一个链接指向 leaderboard(Paper with Code)。 as well as a link to a leaderboard with paper with code. 81 00:03:29,640 --> 00:03:32,550 并且 Tensorboard 的运行结果也包含 So the Tensorboard runs have also been pushed 82 00:03:32,550 --> 00:03:34,560 在这份报告中,我们可以在 Model Hub 中 to this report, and we can look at them 83 00:03:34,560 --> 00:03:36,000 通过点击子菜单中的 directly from the model hub 84 00:03:36,000 --> 00:03:38,850 Training metrics 查看报告。 by clicking on the training metrics sub menu. 85 00:03:38,850 --> 00:03:39,795 如果你不使用 Trainer API If you are not using the Trainer API 86 00:03:39,795 --> 00:03:42,510 微调你的模型, to fine-tune your model, 87 00:03:42,510 --> 00:03:43,770 你可以在模型上直接 you can use a push_to_hub method 88 00:03:43,770 --> 00:03:46,427 使用 push_to_hub 方法和分词器。 on the model, and tokenizer directly. 89 00:03:46,427 --> 00:03:50,160 让我们测试一下以修复 inference widget 中的所有标签。 Let's test this to fix all labels in the inference widget. 90 00:03:50,160 --> 00:03:52,740 inference widget 使用不同的标签名称 The inference widget was using different names for labels 91 00:03:52,740 --> 00:03:54,810 因为我们没有在整数和标签名称之间 because we did not indicate the correspondence 92 00:03:54,810 --> 00:03:57,030 注明关联性。 between integer and label names. 93 00:03:57,030 --> 00:03:58,740 当推送模型配置到 hub 时, We can fix this in the configuration 94 00:03:58,740 --> 00:04:01,350 我们可以通过将 label2id by setting the label2id, 95 00:04:01,350 --> 00:04:04,170 和 id2label 字段设置为合适的值 and id2label fields through the proper values 96 00:04:04,170 --> 00:04:06,933 在配置中解决这个问题。 when pushing the model config to the hub. 97 00:04:07,950 --> 00:04:10,620 完成后,我们可以在网站上查看, Once this is done, we can check on the website, 98 00:04:10,620 --> 00:04:13,380 模型现在显示正确的标签。 and the model is now showing the proper label. 99 00:04:13,380 --> 00:04:15,240 现在模型在 hub 上, Now that the model is on the hub, 100 00:04:15,240 --> 00:04:17,370 我们可以在任何地方使用它 we can use it from anywhere 101 00:04:17,370 --> 00:04:19,920 就像我们对任何其他 Transformer 模型 as we would any other Transformer model 102 00:04:19,920 --> 00:04:21,113 使用 from_pretrained 方法 with the from_pretrained method 103 00:04:21,113 --> 00:04:22,923 或者使用 pipeline 函数。 or with the pipeline function. 104 00:04:34,350 --> 00:04:36,780 我们只需要使用 hub 的标识符, We just have to use the identifier from the hub, 105 00:04:36,780 --> 00:04:39,450 我们可以看到模型配置和权重 and we can see that the model configuration and weights 106 00:04:39,450 --> 00:04:42,483 以及分词处理后的文件会自动下载。 as well as the tokenized files are automatically downloaded. 107 00:04:53,880 --> 00:04:55,950 在下一次训练中尝试 push_to_hub API Try the push_to_hub API in the next training 108 00:04:55,950 --> 00:04:58,650 轻松与世界其他地方分享你的模型。 to easily share your model with the rest of the world. 109 00:05:01,151 --> 00:05:03,818 (空气呼啸) (air whooshing)