subtitles/zh-CN/75_writing-a-good-issue.srt (292 lines of code) (raw):

1 00:00:05,610 --> 00:00:08,557 - 如何在 GitHub 上写出好的 issue? - How to write a good issue on GitHub? 2 00:00:08,557 --> 00:00:10,080 GitHub 是主要的地方 GitHub is the main place 3 00:00:10,080 --> 00:00:12,000 对于 Hugging Face 开源库, for the Hugging Face open source libraries, 4 00:00:12,000 --> 00:00:14,010 你应该经常去那里报告错误 and you should always go there to report a bug 5 00:00:14,010 --> 00:00:16,020 或求新功能。 or ask for a new feature. 6 00:00:16,020 --> 00:00:18,660 对于更一般的问题或调试你自己的代码 For more general questions or to debug your own code 7 00:00:18,660 --> 00:00:21,707 使用论坛,请参阅下面链接的视频。 use the forums, see the video linked below. 8 00:00:21,707 --> 00:00:23,677 写好 issue 很重要 It's very important to write good issues 9 00:00:23,677 --> 00:00:27,232 因为它将帮助你发现的错误被立即修复。 as it will help the bug you uncovered be fixed in no time. 10 00:00:27,232 --> 00:00:29,750 对于这个视频,我们创建了一个版本的 Transformers For this video, we have created a version of Transformers 11 00:00:29,750 --> 00:00:31,066 带有一个错误。 with a bug. 12 00:00:31,066 --> 00:00:33,783 你可以通过在笔记本中执行此命令来安装它, You can install it by executing this command in a notebook, 13 00:00:33,783 --> 00:00:37,239 删除感叹号以在终端中执行它。 remove the exclamation mark to execute it in a terminal. 14 00:00:37,239 --> 00:00:41,016 在此版本中,以下示例失败。 In this version, the following example fails. 15 00:00:41,016 --> 00:00:42,472 错误是相当神秘的 The error is rather cryptic 16 00:00:42,472 --> 00:00:45,184 并且似乎不是来自我们代码中的任何内容, and does not seem to come from anything in our code, 17 00:00:45,184 --> 00:00:48,157 所以我们似乎有一个错误要报告。 so it seems we have a bug to report. 18 00:00:48,157 --> 00:00:49,858 在这种情况下要做的第一件事 The first thing to do in this case 19 00:00:49,858 --> 00:00:52,053 是试图找到尽可能少的代码 is to try to find the smallest amount of code possible 20 00:00:52,053 --> 00:00:54,059 重现错误。 that reproduces the bug. 21 00:00:54,059 --> 00:00:56,802 在我们的例子中,检查回溯, In our case, inspecting the traceback, 22 00:00:56,802 --> 00:00:59,645 我们看到失败发生在 pipeline 数内部 we see the failure happens inside the pipeline function 23 00:00:59,645 --> 00:01:03,158 当它调用 AutoTokenizer.from_pretrained 时。 when it calls AutoTokenizer.from_pretrained. 24 00:01:03,158 --> 00:01:06,609 使用调试器,我们找到传递给该方法的值 Using the debugger, we find the values passed to that method 25 00:01:06,609 --> 00:01:08,849 因此可以创建一个小的代码示例 and can thus create a small sample of code 26 00:01:08,849 --> 00:01:12,802 希望产生相同的错误。 that hopefully generates the same error. 27 00:01:12,802 --> 00:01:14,726 完成这一步非常重要 It's very important to go though this step 28 00:01:14,726 --> 00:01:16,770 你可能意识到错误在你这边 as you may realize the error was on your side 29 00:01:16,770 --> 00:01:18,360 而不是库中的错误, and not a bug in the library, 30 00:01:18,360 --> 00:01:20,610 但它也会让维护者更容易 but it also will make it easier for the maintainers 31 00:01:20,610 --> 00:01:22,320 来解决你的问题。 to fix your problem. 32 00:01:22,320 --> 00:01:24,030 在这里我们可以更多地使用这段代码 Here we can play around a bit more with this code 33 00:01:24,030 --> 00:01:26,460 并注意错误发生在不同的 checkpoints and notice the error happens for different checkpoints 34 00:01:26,460 --> 00:01:28,050 不只是这个, and not just this one, 35 00:01:28,050 --> 00:01:31,262 当我们使用 use_fast=False 时它会消失 and that it disappears when we use use_fast=False 36 00:01:31,262 --> 00:01:33,240 在我们的 tokenizer 调用中。 inside our tokenizer call. 37 00:01:33,240 --> 00:01:35,190 重要的是要有东西 The important part is to have something 38 00:01:35,190 --> 00:01:38,640 不依赖于任何外部文件或数据。 that does not depend on any external files or data. 39 00:01:38,640 --> 00:01:40,350 尝试用虚假的值替换你的数据 Try to replace your data by fake values 40 00:01:40,350 --> 00:01:41,450 如果你不能分享它。 if you can't share it. 41 00:01:42,750 --> 00:01:43,620 完成所有这些后, With all of this done, 42 00:01:43,620 --> 00:01:46,260 我们准备开始写我们的 issue 。 we are ready to start writing our issue. 43 00:01:46,260 --> 00:01:48,600 单击 Bug Report (错误报告) 旁边的按钮 Click on the button next to Bug Report 44 00:01:48,600 --> 00:01:51,300 你会发现有一个模板可以填写。 and you will discover that there is a template to fill. 45 00:01:51,300 --> 00:01:53,940 只需几分钟。 It will only take you a couple of minutes. 46 00:01:53,940 --> 00:01:56,460 首先是正确命名你的 issue 。 The first thing is to properly name your issue. 47 00:01:56,460 --> 00:01:59,100 不要选择过于模糊的标题。 Don't pick a title that is too vague. 48 00:01:59,100 --> 00:02:02,160 然后你必须填写你的环境信息。 Then you have to fill your environment information. 49 00:02:02,160 --> 00:02:03,330 提供了一个命令 There is a command provided 50 00:02:03,330 --> 00:02:05,700 由 Transformers 库来执行此操作。 by the Transformers library to do this. 51 00:02:05,700 --> 00:02:08,550 只需在笔记本或终端中执行即可 Just execute it in your notebook or in a terminal 52 00:02:08,550 --> 00:02:10,203 并复制粘贴结果。 and copy paste the results. 53 00:02:11,070 --> 00:02:13,530 最后两个问题需要手动填写, There are two last questions to fill manually, 54 00:02:13,530 --> 00:02:16,023 在我们的案例中,答案是否定的。 to which the answers are no and no in our case. 55 00:02:17,550 --> 00:02:20,460 接下来,我们需要确定标记谁。 Next, we need to determine who to tag. 56 00:02:20,460 --> 00:02:23,010 模板中有完整的用户名列表。 There is a full list of usernames in the template. 57 00:02:23,010 --> 00:02:25,043 由于我们的问题与 tokenizer 有关, Since our issue has to do with tokenizers, 58 00:02:25,043 --> 00:02:28,170 我们选择与他们相关的维护者。 we pick the maintainer associated with them. 59 00:02:28,170 --> 00:02:30,210 标记超过 3 个人是没有意义的, There is no point tagging more than 3 people, 60 00:02:30,210 --> 00:02:32,010 他们会将你重定向到合适的人 they will redirect you to the right person 61 00:02:32,010 --> 00:02:33,110 如果你犯了错误。 if you made a mistake. 62 00:02:34,410 --> 00:02:36,600 接下来,我们必须提供必要的信息 Next, we have to give the information necessary 63 00:02:36,600 --> 00:02:38,220 重现错误。 to reproduce the bug. 64 00:02:38,220 --> 00:02:41,010 我们粘贴样本,并将其放在两行之间 We paste our sample, and put it between two lines 65 00:02:41,010 --> 00:02:44,073 带有三个反引号,以便正确格式化。 with three backticks so that it's formatted properly. 66 00:02:45,210 --> 00:02:47,010 我们还粘贴了完整的回溯, We also paste the full traceback, 67 00:02:47,010 --> 00:02:49,740 仍然在两行三个反引号之间。 still between two lines of three backticks. 68 00:02:49,740 --> 00:02:52,650 最后,我们可以添加任何其他信息 Lastly, we can add any additional information 69 00:02:52,650 --> 00:02:55,200 关于我们试图调试手头问题的内容。 about what we tried to debug the issue at hand. 70 00:02:55,200 --> 00:02:57,030 有了这一切,你应该期待一个答案 With all of this, you should expect an answer 71 00:02:57,030 --> 00:02:59,880 非常快地解决你的 issue ,希望能快速解决。 to your issue pretty fast and hopefully a quick fix. 72 00:02:59,880 --> 00:03:01,770 请注意,此视频中的所有建议 Note that all the advise in this video 73 00:03:01,770 --> 00:03:04,203 适用于几乎所有的开源项目。 applies for almost every open-source project.