subtitles/zh-CN/28_tensorflow-predictions-and-metrics.srt

1 00:00:00,269 --> 00:00:02,936 （空气呼啸） (air whooshing) 2 00:00:05,700 --> 00:00:07,110 - 在我们的其他视频中， - In our other videos, 3 00:00:07,110 --> 00:00:09,000 和往常一样，下面会有链接 and as always, there'll be links below 4 00:00:09,000 --> 00:00:10,740 如果你想回看那些， if you want to check those out, 5 00:00:10,740 --> 00:00:13,230 我们向你展示了如何初始化和微调 we showed you how to initialize and fine-tune 6 00:00:13,230 --> 00:00:15,690 TensorFlow 中的 transformer 模型。 a transformer model in TensorFlow. 7 00:00:15,690 --> 00:00:18,600 所以现在的问题是，我们可以用模型做什么 So the question now is, what can we do with a model 8 00:00:18,600 --> 00:00:20,070 在我们训练之后？ after we train it? 9 00:00:20,070 --> 00:00:21,390 显而易见的尝试 The obvious thing to try 10 00:00:21,390 --> 00:00:23,790 是用它来获得对新数据的预测， is to use it to get predictions for new data, 11 00:00:23,790 --> 00:00:25,560 所以让我们看看如何做到这一点。 so let's see how to do that. 12 00:00:25,560 --> 00:00:28,320 同样，如果你熟悉 Keras，那么好消息是 Again, if you're familiar with Keras, the good news is 13 00:00:28,320 --> 00:00:31,860 因为只有标准的 Keras 模型， that because there are just standard Keras models, 14 00:00:31,860 --> 00:00:34,770 我们可以使用标准的 Keras 预测方法， we can use the standard Keras predict method, 15 00:00:34,770 --> 00:00:35,883 如此处所示。 as shown here. 16 00:00:36,990 --> 00:00:40,560 你只需将分词化的文本传递给此方法， You simply pass in tokenized text to this method, 17 00:00:40,560 --> 00:00:42,330 就像你从分词器那里得到的一样， like you'd get from a tokenizer, 18 00:00:42,330 --> 00:00:44,280 你得到你的结果。 and you get your results. 19 00:00:44,280 --> 00:00:46,740 我们的模型可以输出几种不同的东西， Our models can output several different things, 20 00:00:46,740 --> 00:00:48,510 根据你设置的选项， depending on the options you set, 21 00:00:48,510 --> 00:00:50,310 但大多数时候你想要的东西 but most of the time the thing you want 22 00:00:50,310 --> 00:00:52,290 是输出的 logits 。 is the output logits. 23 00:00:52,290 --> 00:00:54,900 如果你在之前没有遇到过它们，logits If you haven't come across them before, logits, 24 00:00:54,900 --> 00:00:57,630 有时发音成 logits ，因为没有人确定， sometimes pronounced to logits because no one's sure, 25 00:00:57,630 --> 00:01:00,390 是网络最后一层的输出 are the outputs of the last layer of the network 26 00:01:00,390 --> 00:01:03,150 因为在使用 softmax 之前。 because before a softmax has been applied. 27 00:01:03,150 --> 00:01:04,710 所以如果你想把 logits So if you want to turn the logits 28 00:01:04,710 --> 00:01:06,900 进成模型的概率输出， into the model's probability outputs, 29 00:01:06,900 --> 00:01:09,423 你只需使用一个 softmax，就像这样。 you just apply a softmax, like so. 30 00:01:10,981 --> 00:01:12,630 如果我们想这些概率 What if we want to turn those probabilities 31 00:01:12,630 --> 00:01:14,370 进成类别预测？ into class predictions? 32 00:01:14,370 --> 00:01:16,410 同样，它非常直接。 Again, it's very straightforward. 33 00:01:16,410 --> 00:01:19,470 我们只是为每个输出选择最大的概率 We just pick the biggest probability for each output 34 00:01:19,470 --> 00:01:23,070 你可以使用 argmax 函数立即获得它。 and you can get that immediately with the argmax function. 35 00:01:23,070 --> 00:01:24,870 argmax 将返回索引 argmax will return the index 36 00:01:24,870 --> 00:01:27,120 每行中的最大概率 of the largest probability in each row 37 00:01:27,120 --> 00:01:30,360 这意味着我们将得到一个整数向量。 which means that we'll get a vector of integers. 38 00:01:30,360 --> 00:01:34,950 如果最大概率在零位置，则为零， So zero if the largest probability was in the zero position, 39 00:01:34,950 --> 00:01:37,350 一个在第一个位置，依此类推。 one in the first position, and so on. 40 00:01:37,350 --> 00:01:40,380 所以这些是我们的类预测表明第零类， So these are our class predictions indicating class zero, 41 00:01:40,380 --> 00:01:42,300 第一类，等等。 class one, and so on. 42 00:01:42,300 --> 00:01:45,090 事实上，如果你想要的只是类别预测， In fact, if class predictions are all you want, 43 00:01:45,090 --> 00:01:47,310 你可以完全跳过 softmax 步骤 you can skip the softmax step entirely 44 00:01:47,310 --> 00:01:49,740 因为最大的 logit 永远是最大的 because the largest logit will always be the largest 45 00:01:49,740 --> 00:01:51,303 概率也一样。 probability as well. 46 00:01:52,500 --> 00:01:55,800 所以如果你想要概率和类别预测， So if probabilities and class predictions are all you want, 47 00:01:55,800 --> 00:01:58,350 那么此时你已经看到了所需的一切。 then you've seen everything you need at this point. 48 00:01:58,350 --> 00:02:00,630 但是如果你有兴趣对你的模型进行基准测试 But if you're interested in benchmarking your model 49 00:02:00,630 --> 00:02:02,190 或将其用于研究， or using it for research, 50 00:02:02,190 --> 00:02:05,010 你可能想更深入地研究你得到的结果。 you might want to delve deeper into the results you get. 51 00:02:05,010 --> 00:02:07,230 一种方法是计算一些指标 And one way to do that is to compute some metrics 52 00:02:07,230 --> 00:02:09,060 用于模型的预测。 for the model's predictions. 53 00:02:09,060 --> 00:02:10,920 如果你关注我们有关数据集 If you're following along with our datasets 54 00:02:10,920 --> 00:02:12,390 和微调的视频， and fine tuning videos, 55 00:02:12,390 --> 00:02:14,850 我们从 MRPC 数据集中获取数据， we got our data from the MRPC dataset, 56 00:02:14,850 --> 00:02:17,130 这是 GLUE 基准的一部分。 which is part of the GLUE benchmark. 57 00:02:17,130 --> 00:02:19,050 GLUE 数据集的每一个 Each of the GLUE datasets 58 00:02:19,050 --> 00:02:22,560 以及我们数据集 Light Hub 中的许多其他数据集 as well as many other datasets in our dataset, Light Hub 59 00:02:22,560 --> 00:02:24,510 有一些预定义的指标， has some predefined metrics, 60 00:02:24,510 --> 00:02:26,940 我们可以轻松加载它们 and we can load them easily 61 00:02:26,940 --> 00:02:29,880 使用数据集加载度量函数。 with the datasets load metric function. 62 00:02:29,880 --> 00:02:33,720 对于 MRPC 数据集，内置指标是准确性 For the MRPC dataset, the built-in metrics are accuracy 63 00:02:33,720 --> 00:02:35,790 它只是衡量百分比, 其 which just measures the percentage of the time 64 00:02:35,790 --> 00:02:37,830 模型的预测是正确的次数， the model's prediction was correct, 65 00:02:37,830 --> 00:02:39,780 和 F1 分数， and the F1 score, 66 00:02:39,780 --> 00:02:41,610 这是一个稍微复杂的度量 which is a slightly more complex measure 67 00:02:41,610 --> 00:02:43,920 衡量模型如何权衡 that measures how well the model trades off 68 00:02:43,920 --> 00:02:45,543 准确率和召回率。 precision and recall. 69 00:02:46,470 --> 00:02:49,110 要计算这些指标以对我们的模型进行基准测试， To compute those metrics to benchmark our model, 70 00:02:49,110 --> 00:02:51,480 我们只是将模型的预测传递给他们， we just pass them the model's predictions, 71 00:02:51,480 --> 00:02:53,220 和真实标签， and to the ground truth labels, 72 00:02:53,220 --> 00:02:56,880 我们在一个简单的 Python 字典中得到我们的结果。 and we get our results in a straightforward Python dict. 73 00:02:56,880 --> 00:02:58,740 如果你熟悉 Keras， If you're familiar with Keras though, 74 00:02:58,740 --> 00:03:00,870 你可能会注意到这是一种有点奇怪的方式 you might notice that this is a slightly weird way 75 00:03:00,870 --> 00:03:01,800 计算指标， to compute metrics, 76 00:03:01,800 --> 00:03:02,970 因为我们只计算指标 because we're only computing metrics 77 00:03:02,970 --> 00:03:04,440 在训练的最后。 at the very end of training. 78 00:03:04,440 --> 00:03:06,480 但是在 Keras 中，你有这个内置的能力 But in Keras, you have this built-in ability 79 00:03:06,480 --> 00:03:08,790 计算范围广泛的指标 to compute a wide range of metrics 80 00:03:08,790 --> 00:03:10,470 在你训练的过程中， on the fly while you're training, 81 00:03:10,470 --> 00:03:11,910 这给了你一个非常有用的理解 which gives you a very useful insight 82 00:03:11,910 --> 00:03:13,740 了解训练的进展情况。 into how training is going. 83 00:03:13,740 --> 00:03:15,900 所以如果你想使用内置指标， So if you want to use built-in metrics, 84 00:03:15,900 --> 00:03:17,280 这很简单 it's very straightforward 85 00:03:17,280 --> 00:03:19,350 然后你再次使用标准的 Keras 方法。 and you use the standard Keras approach again. 86 00:03:19,350 --> 00:03:23,160 你只需将一个度量参数传递给编译方法。 You just pass a metric argument to the compile method. 87 00:03:23,160 --> 00:03:25,740 与损失和优化器之类的东西一样， As with things like loss and optimizer, 88 00:03:25,740 --> 00:03:28,470 你可以通过字符串指定你想要的指标 you can specify the metrics you want by string 89 00:03:28,470 --> 00:03:30,810 或者你可以导入实际的指标对象 or you can import the actual metric objects 90 00:03:30,810 --> 00:03:33,240 并向他们传递具体的参数。 and pass specific arguments to them. 91 00:03:33,240 --> 00:03:35,610 但请注意，与损失和准确性不同， But note that unlike loss and accuracy, 92 00:03:35,610 --> 00:03:37,710 你必须以列表形式提供指标 you have to supply metrics as a list 93 00:03:37,710 --> 00:03:39,760 即使你只需要一个指标。 even if there's only one metric you want. 94 00:03:40,770 --> 00:03:43,140 一旦用度量标准编译了模型， Once a model has been compiled with a metric, 95 00:03:43,140 --> 00:03:45,360 它将报告该指标对于训练， it will report that metric for training, 96 00:03:45,360 --> 00:03:47,643 验证和预测。 validation, and predictions. 97 00:03:48,480 --> 00:03:50,820 假设有标签传递给预测。 Assuming there are labels passed to the predictions. 98 00:03:50,820 --> 00:03:53,400 你甚至可以编写自己的度量类。 You can even write your own metric classes. 99 00:03:53,400 --> 00:03:55,920 虽然这有点超出本课程的范围， Although this is a bit beyond the scope of this course, 100 00:03:55,920 --> 00:03:58,200 我将链接到下面的相关 TF 文档 I'll link to the relevant TF docs below 101 00:03:58,200 --> 00:03:59,580 因为它可以非常方便 because it can be very handy 102 00:03:59,580 --> 00:04:01,320 如果你想要一个不受支持的指标 if you want a metric that isn't supported 103 00:04:01,320 --> 00:04:02,850 默认情况下，在 Keras 中， by default in Keras, 104 00:04:02,850 --> 00:04:04,473 比如 F1 score. such as the F1 score. 105 00:04:06,076 --> 00:04:08,743 （空气呼啸） (air whooshing)

subtitles/zh-CN/28_tensorflow-predictions-and-metrics.srt (420 lines of code) (raw):