subtitles/zh-CN/28_tensorflow-predictions-and-metrics.srt (420 lines of code) (raw):
1
00:00:00,269 --> 00:00:02,936
(空气呼啸)
(air whooshing)
2
00:00:05,700 --> 00:00:07,110
- 在我们的其他视频中,
- In our other videos,
3
00:00:07,110 --> 00:00:09,000
和往常一样,下面会有链接
and as always, there'll be links below
4
00:00:09,000 --> 00:00:10,740
如果你想回看那些,
if you want to check those out,
5
00:00:10,740 --> 00:00:13,230
我们向你展示了如何初始化和微调
we showed you how to initialize and fine-tune
6
00:00:13,230 --> 00:00:15,690
TensorFlow 中的 transformer 模型。
a transformer model in TensorFlow.
7
00:00:15,690 --> 00:00:18,600
所以现在的问题是,我们可以用模型做什么
So the question now is, what can we do with a model
8
00:00:18,600 --> 00:00:20,070
在我们训练之后?
after we train it?
9
00:00:20,070 --> 00:00:21,390
显而易见的尝试
The obvious thing to try
10
00:00:21,390 --> 00:00:23,790
是用它来获得对新数据的预测,
is to use it to get predictions for new data,
11
00:00:23,790 --> 00:00:25,560
所以让我们看看如何做到这一点。
so let's see how to do that.
12
00:00:25,560 --> 00:00:28,320
同样,如果你熟悉 Keras,那么好消息是
Again, if you're familiar with Keras, the good news is
13
00:00:28,320 --> 00:00:31,860
因为只有标准的 Keras 模型,
that because there are just standard Keras models,
14
00:00:31,860 --> 00:00:34,770
我们可以使用标准的 Keras 预测方法,
we can use the standard Keras predict method,
15
00:00:34,770 --> 00:00:35,883
如此处所示。
as shown here.
16
00:00:36,990 --> 00:00:40,560
你只需将分词化的文本传递给此方法,
You simply pass in tokenized text to this method,
17
00:00:40,560 --> 00:00:42,330
就像你从分词器那里得到的一样,
like you'd get from a tokenizer,
18
00:00:42,330 --> 00:00:44,280
你得到你的结果。
and you get your results.
19
00:00:44,280 --> 00:00:46,740
我们的模型可以输出几种不同的东西,
Our models can output several different things,
20
00:00:46,740 --> 00:00:48,510
根据你设置的选项,
depending on the options you set,
21
00:00:48,510 --> 00:00:50,310
但大多数时候你想要的东西
but most of the time the thing you want
22
00:00:50,310 --> 00:00:52,290
是输出的 logits 。
is the output logits.
23
00:00:52,290 --> 00:00:54,900
如果你在之前没有遇到过它们,logits
If you haven't come across them before, logits,
24
00:00:54,900 --> 00:00:57,630
有时发音成 logits ,因为没有人确定,
sometimes pronounced to logits because no one's sure,
25
00:00:57,630 --> 00:01:00,390
是网络最后一层的输出
are the outputs of the last layer of the network
26
00:01:00,390 --> 00:01:03,150
因为在使用 softmax 之前。
because before a softmax has been applied.
27
00:01:03,150 --> 00:01:04,710
所以如果你想把 logits
So if you want to turn the logits
28
00:01:04,710 --> 00:01:06,900
进成模型的概率输出,
into the model's probability outputs,
29
00:01:06,900 --> 00:01:09,423
你只需使用一个 softmax,就像这样。
you just apply a softmax, like so.
30
00:01:10,981 --> 00:01:12,630
如果我们想这些概率
What if we want to turn those probabilities
31
00:01:12,630 --> 00:01:14,370
进成类别预测?
into class predictions?
32
00:01:14,370 --> 00:01:16,410
同样,它非常直接。
Again, it's very straightforward.
33
00:01:16,410 --> 00:01:19,470
我们只是为每个输出选择最大的概率
We just pick the biggest probability for each output
34
00:01:19,470 --> 00:01:23,070
你可以使用 argmax 函数立即获得它。
and you can get that immediately with the argmax function.
35
00:01:23,070 --> 00:01:24,870
argmax 将返回索引
argmax will return the index
36
00:01:24,870 --> 00:01:27,120
每行中的最大概率
of the largest probability in each row
37
00:01:27,120 --> 00:01:30,360
这意味着我们将得到一个整数向量。
which means that we'll get a vector of integers.
38
00:01:30,360 --> 00:01:34,950
如果最大概率在零位置,则为零,
So zero if the largest probability was in the zero position,
39
00:01:34,950 --> 00:01:37,350
一个在第一个位置,依此类推。
one in the first position, and so on.
40
00:01:37,350 --> 00:01:40,380
所以这些是我们的类预测表明第零类,
So these are our class predictions indicating class zero,
41
00:01:40,380 --> 00:01:42,300
第一类,等等。
class one, and so on.
42
00:01:42,300 --> 00:01:45,090
事实上,如果你想要的只是类别预测,
In fact, if class predictions are all you want,
43
00:01:45,090 --> 00:01:47,310
你可以完全跳过 softmax 步骤
you can skip the softmax step entirely
44
00:01:47,310 --> 00:01:49,740
因为最大的 logit 永远是最大的
because the largest logit will always be the largest
45
00:01:49,740 --> 00:01:51,303
概率也一样。
probability as well.
46
00:01:52,500 --> 00:01:55,800
所以如果你想要概率和类别预测,
So if probabilities and class predictions are all you want,
47
00:01:55,800 --> 00:01:58,350
那么此时你已经看到了所需的一切。
then you've seen everything you need at this point.
48
00:01:58,350 --> 00:02:00,630
但是如果你有兴趣对你的模型进行基准测试
But if you're interested in benchmarking your model
49
00:02:00,630 --> 00:02:02,190
或将其用于研究,
or using it for research,
50
00:02:02,190 --> 00:02:05,010
你可能想更深入地研究你得到的结果。
you might want to delve deeper into the results you get.
51
00:02:05,010 --> 00:02:07,230
一种方法是计算一些指标
And one way to do that is to compute some metrics
52
00:02:07,230 --> 00:02:09,060
用于模型的预测。
for the model's predictions.
53
00:02:09,060 --> 00:02:10,920
如果你关注我们有关数据集
If you're following along with our datasets
54
00:02:10,920 --> 00:02:12,390
和微调的视频,
and fine tuning videos,
55
00:02:12,390 --> 00:02:14,850
我们从 MRPC 数据集中获取数据,
we got our data from the MRPC dataset,
56
00:02:14,850 --> 00:02:17,130
这是 GLUE 基准的一部分。
which is part of the GLUE benchmark.
57
00:02:17,130 --> 00:02:19,050
GLUE 数据集的每一个
Each of the GLUE datasets
58
00:02:19,050 --> 00:02:22,560
以及我们数据集 Light Hub 中的许多其他数据集
as well as many other datasets in our dataset, Light Hub
59
00:02:22,560 --> 00:02:24,510
有一些预定义的指标,
has some predefined metrics,
60
00:02:24,510 --> 00:02:26,940
我们可以轻松加载它们
and we can load them easily
61
00:02:26,940 --> 00:02:29,880
使用数据集加载度量函数。
with the datasets load metric function.
62
00:02:29,880 --> 00:02:33,720
对于 MRPC 数据集,内置指标是准确性
For the MRPC dataset, the built-in metrics are accuracy
63
00:02:33,720 --> 00:02:35,790
它只是衡量百分比, 其
which just measures the percentage of the time
64
00:02:35,790 --> 00:02:37,830
模型的预测是正确的次数,
the model's prediction was correct,
65
00:02:37,830 --> 00:02:39,780
和 F1 分数,
and the F1 score,
66
00:02:39,780 --> 00:02:41,610
这是一个稍微复杂的度量
which is a slightly more complex measure
67
00:02:41,610 --> 00:02:43,920
衡量模型如何权衡
that measures how well the model trades off
68
00:02:43,920 --> 00:02:45,543
准确率和召回率。
precision and recall.
69
00:02:46,470 --> 00:02:49,110
要计算这些指标以对我们的模型进行基准测试,
To compute those metrics to benchmark our model,
70
00:02:49,110 --> 00:02:51,480
我们只是将模型的预测传递给他们,
we just pass them the model's predictions,
71
00:02:51,480 --> 00:02:53,220
和真实标签,
and to the ground truth labels,
72
00:02:53,220 --> 00:02:56,880
我们在一个简单的 Python 字典中得到我们的结果。
and we get our results in a straightforward Python dict.
73
00:02:56,880 --> 00:02:58,740
如果你熟悉 Keras,
If you're familiar with Keras though,
74
00:02:58,740 --> 00:03:00,870
你可能会注意到这是一种有点奇怪的方式
you might notice that this is a slightly weird way
75
00:03:00,870 --> 00:03:01,800
计算指标,
to compute metrics,
76
00:03:01,800 --> 00:03:02,970
因为我们只计算指标
because we're only computing metrics
77
00:03:02,970 --> 00:03:04,440
在训练的最后。
at the very end of training.
78
00:03:04,440 --> 00:03:06,480
但是在 Keras 中,你有这个内置的能力
But in Keras, you have this built-in ability
79
00:03:06,480 --> 00:03:08,790
计算范围广泛的指标
to compute a wide range of metrics
80
00:03:08,790 --> 00:03:10,470
在你训练的过程中,
on the fly while you're training,
81
00:03:10,470 --> 00:03:11,910
这给了你一个非常有用的理解
which gives you a very useful insight
82
00:03:11,910 --> 00:03:13,740
了解训练的进展情况。
into how training is going.
83
00:03:13,740 --> 00:03:15,900
所以如果你想使用内置指标,
So if you want to use built-in metrics,
84
00:03:15,900 --> 00:03:17,280
这很简单
it's very straightforward
85
00:03:17,280 --> 00:03:19,350
然后你再次使用标准的 Keras 方法。
and you use the standard Keras approach again.
86
00:03:19,350 --> 00:03:23,160
你只需将一个度量参数传递给编译方法。
You just pass a metric argument to the compile method.
87
00:03:23,160 --> 00:03:25,740
与损失和优化器之类的东西一样,
As with things like loss and optimizer,
88
00:03:25,740 --> 00:03:28,470
你可以通过字符串指定你想要的指标
you can specify the metrics you want by string
89
00:03:28,470 --> 00:03:30,810
或者你可以导入实际的指标对象
or you can import the actual metric objects
90
00:03:30,810 --> 00:03:33,240
并向他们传递具体的参数。
and pass specific arguments to them.
91
00:03:33,240 --> 00:03:35,610
但请注意,与损失和准确性不同,
But note that unlike loss and accuracy,
92
00:03:35,610 --> 00:03:37,710
你必须以列表形式提供指标
you have to supply metrics as a list
93
00:03:37,710 --> 00:03:39,760
即使你只需要一个指标。
even if there's only one metric you want.
94
00:03:40,770 --> 00:03:43,140
一旦用度量标准编译了模型,
Once a model has been compiled with a metric,
95
00:03:43,140 --> 00:03:45,360
它将报告该指标对于训练,
it will report that metric for training,
96
00:03:45,360 --> 00:03:47,643
验证和预测。
validation, and predictions.
97
00:03:48,480 --> 00:03:50,820
假设有标签传递给预测。
Assuming there are labels passed to the predictions.
98
00:03:50,820 --> 00:03:53,400
你甚至可以编写自己的度量类。
You can even write your own metric classes.
99
00:03:53,400 --> 00:03:55,920
虽然这有点超出本课程的范围,
Although this is a bit beyond the scope of this course,
100
00:03:55,920 --> 00:03:58,200
我将链接到下面的相关 TF 文档
I'll link to the relevant TF docs below
101
00:03:58,200 --> 00:03:59,580
因为它可以非常方便
because it can be very handy
102
00:03:59,580 --> 00:04:01,320
如果你想要一个不受支持的指标
if you want a metric that isn't supported
103
00:04:01,320 --> 00:04:02,850
默认情况下,在 Keras 中,
by default in Keras,
104
00:04:02,850 --> 00:04:04,473
比如 F1 score.
such as the F1 score.
105
00:04:06,076 --> 00:04:08,743
(空气呼啸)
(air whooshing)