1
00:00:00,367 --> 00:00:02,950
(subtle blast)
2
00:00:05,850 --> 00:00:08,913
- The post-processing step
in a question-answering task.
3
00:00:10,830 --> 00:00:11,790
When doing question answering,
4
00:00:11,790 --> 00:00:14,670
the processing of the initial dataset
5
00:00:14,670 --> 00:00:18,090
involves splitting examples
into several features,
6
00:00:18,090 --> 00:00:20,850
which may or may not contain the answer.
7
00:00:20,850 --> 00:00:22,530
Passing those features through the model
8
00:00:22,530 --> 00:00:25,860
will give us logits for the
start and end positions,
9
00:00:25,860 --> 00:00:28,620
since our labels are the
indices of the tokens
10
00:00:28,620 --> 00:00:31,020
that correspond to the
start and end of the answer.
11
00:00:31,860 --> 00:00:34,740
We must then somehow convert
those logits into an answer,
12
00:00:34,740 --> 00:00:38,070
and then pick one of the various
answers each feature gives
13
00:00:38,070 --> 00:00:40,473
to be the answer for a given example.
14
00:00:41,683 --> 00:00:43,200
For the processing step,
15
00:00:43,200 --> 00:00:45,450
you should refer to
the video linked below.
16
00:00:45,450 --> 00:00:47,310
It's not very different for validation,
17
00:00:47,310 --> 00:00:50,053
we just need to add a few lines
to keep track of two things:
18
00:00:50,053 --> 00:00:52,620
instead of discarding the offset mappings,
19
00:00:52,620 --> 00:00:55,380
we keep them, and also include
in them the information
20
00:00:55,380 --> 00:00:58,410
of where the context is
by setting the offsets
21
00:00:58,410 --> 00:01:01,821
of the special tokens
and the question to None.
22
00:01:01,821 --> 00:01:05,370
Then we also keep track of the
example ID for each feature,
23
00:01:05,370 --> 00:01:07,020
to be able to map each feature
24
00:01:07,020 --> 00:01:09,243
back to the example it originated from.
25
00:01:10,470 --> 00:01:12,660
If you don't want to
compute the validation loss,
26
00:01:12,660 --> 00:01:14,610
you won't need to include
all the special code
27
00:01:14,610 --> 00:01:17,010
that we used to create the labels.
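As a rough sketch (not necessarily the exact code from the video), such a validation preprocessing function could look like this, assuming a tokenizer is already loaded and using max_length=384 and stride=128 as illustrative values:

def preprocess_validation_examples(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=384,  # illustrative value
        truncation="only_second",
        stride=128,  # illustrative value
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

    sample_map = inputs.pop("overflow_to_sample_mapping")
    example_ids = []
    for i in range(len(inputs["input_ids"])):
        # Keep track of the example each feature came from
        example_ids.append(examples["id"][sample_map[i]])
        # Set the offsets of everything that is not context to None
        sequence_ids = inputs.sequence_ids(i)
        offsets = inputs["offset_mapping"][i]
        inputs["offset_mapping"][i] = [
            o if sequence_ids[k] == 1 else None for k, o in enumerate(offsets)
        ]
    inputs["example_id"] = example_ids
    return inputs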
28
00:01:17,010 --> 00:01:19,650
With this done, we can apply
that preprocessing function
29
00:01:19,650 --> 00:01:21,480
using the map method.
30
00:01:21,480 --> 00:01:23,610
We take the SQuAD dataset
like in the preprocessing
31
00:01:23,610 --> 00:01:25,060
for question-answering video.
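A minimal sketch of that call, assuming the datasets library and the preprocess_validation_examples function sketched above:

from datasets import load_dataset

raw_datasets = load_dataset("squad")
validation_dataset = raw_datasets["validation"].map(
    preprocess_validation_examples,
    batched=True,
    remove_columns=raw_datasets["validation"].column_names,
)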
32
00:01:26,400 --> 00:01:29,310
Once this is done, the next
step is to create our model.
33
00:01:29,310 --> 00:01:30,570
We use the default model behind
34
00:01:30,570 --> 00:01:32,640
the question-answering pipeline here,
35
00:01:32,640 --> 00:01:35,880
but you should use any
model you want to evaluate.
36
00:01:35,880 --> 00:01:37,680
With the to_tf_dataset method,
37
00:01:37,680 --> 00:01:41,370
we can just send our processed
dataset to model.predict,
38
00:01:41,370 --> 00:01:43,350
and we directly get our
start and end logits
39
00:01:43,350 --> 00:01:45,930
for the whole dataset as NumPy arrays.
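For example, a sketch along these lines, using distilbert-base-cased-distilled-squad (the default checkpoint behind the question-answering pipeline); the batch size is an illustrative choice, and older versions of the datasets library may also require passing a collate_fn to to_tf_dataset:

from transformers import TFAutoModelForQuestionAnswering

model = TFAutoModelForQuestionAnswering.from_pretrained(
    "distilbert-base-cased-distilled-squad"
)

# Convert the processed dataset to a tf.data.Dataset for prediction
tf_eval_dataset = validation_dataset.to_tf_dataset(
    columns=["input_ids", "attention_mask"],
    shuffle=False,
    batch_size=16,  # illustrative value
)

outputs = model.predict(tf_eval_dataset)
start_logits = outputs.start_logits
end_logits = outputs.end_logits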
40
00:01:45,930 --> 00:01:49,230
With this done, we can really
dive into the post-processing.
41
00:01:49,230 --> 00:01:52,380
First, we'll need a map
from example to features,
42
00:01:52,380 --> 00:01:53,883
which we can create like this.
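For instance, with a defaultdict mapping each example ID to the indices of its features:

import collections

example_to_features = collections.defaultdict(list)
for idx, feature in enumerate(validation_dataset):
    example_to_features[feature["example_id"]].append(idx)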
43
00:01:54,780 --> 00:01:56,700
Now, for the main part
of the post-processing,
44
00:01:56,700 --> 00:02:00,270
let's see how to extract
an answer from the logits.
45
00:02:00,270 --> 00:02:01,650
We could just take the best index
46
00:02:01,650 --> 00:02:03,690
for the start and end logits and be done,
47
00:02:03,690 --> 00:02:06,180
but if our model predicts
something impossible,
48
00:02:06,180 --> 00:02:07,920
like tokens in the question,
49
00:02:07,920 --> 00:02:09,670
so we will look at more of the logits.
50
00:02:10,800 --> 00:02:12,570
Note that in the
question-answering pipeline,
51
00:02:12,570 --> 00:02:14,160
we attributed a score to each answer
52
00:02:14,160 --> 00:02:17,880
based on the probabilities,
which we did not compute here.
53
00:02:17,880 --> 00:02:19,860
In terms of logits, the
multiplication we had
54
00:02:19,860 --> 00:02:21,663
in the scores becomes an addition.
55
00:02:22,650 --> 00:02:23,910
To go fast, we don't look
56
00:02:23,910 --> 00:02:25,343
at all possible start and end logits,
57
00:02:25,343 --> 00:02:26,973
but only at the 20 best ones.
58
00:02:27,810 --> 00:02:30,386
We ignore the logits that
would give impossible answers
59
00:02:30,386 --> 00:02:32,370
or answers that are too long.
60
00:02:32,370 --> 00:02:33,720
As we saw in the preprocessing,
61
00:02:33,720 --> 00:02:36,240
the label "0, 0" correspond to no answer,
62
00:02:36,240 --> 00:02:37,440
otherwise, we use the offsets
63
00:02:37,440 --> 00:02:39,290
to get the answer inside the context.
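Putting those rules together for the first feature, a sketch of the extraction (n_best = 20 as in the video, max_answer_length = 30 as an illustrative cap):

import numpy as np

n_best = 20
max_answer_length = 30  # illustrative value

# Logits, offsets, and context for the first feature of the first example
start_logit = start_logits[0]
end_logit = end_logits[0]
offsets = validation_dataset[0]["offset_mapping"]
context = raw_datasets["validation"][0]["context"]

# Indices of the 20 best start and end logits, in decreasing order
start_indexes = np.argsort(start_logit)[-1 : -n_best - 1 : -1].tolist()
end_indexes = np.argsort(end_logit)[-1 : -n_best - 1 : -1].tolist()

answers = []
for start_index in start_indexes:
    for end_index in end_indexes:
        # Skip answers that are not fully inside the context
        if offsets[start_index] is None or offsets[end_index] is None:
            continue
        # Skip answers with a negative length or that are too long
        if end_index < start_index or end_index - start_index + 1 > max_answer_length:
            continue
        answers.append(
            {
                "text": context[offsets[start_index][0] : offsets[end_index][1]],
                "logit_score": start_logit[start_index] + end_logit[end_index],
            }
        )

best_answer = max(answers, key=lambda a: a["logit_score"])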
64
00:02:40,260 --> 00:02:41,580
Let's have a look at the predicted answer
65
00:02:41,580 --> 00:02:43,200
for the first feature,
66
00:02:43,200 --> 00:02:44,790
which is the answer with the best score,
67
00:02:44,790 --> 00:02:46,860
or the best logit score, since the softmax
68
00:02:46,860 --> 00:02:48,810
is an increasing function.
69
00:02:48,810 --> 00:02:49,960
The model got it right.
70
00:02:51,210 --> 00:02:54,180
Next, we just have to repeat
this for every example,
71
00:02:54,180 --> 00:02:56,700
picking for each the answer
with the best logit score
72
00:02:56,700 --> 00:02:59,133
in all the features the example generated.
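A sketch of that final loop, reusing the names introduced in the snippets above:

predicted_answers = []
for example in raw_datasets["validation"]:
    example_id = example["id"]
    context = example["context"]
    answers = []
    # Gather candidate answers from every feature this example generated
    for feature_index in example_to_features[example_id]:
        start_logit = start_logits[feature_index]
        end_logit = end_logits[feature_index]
        offsets = validation_dataset[feature_index]["offset_mapping"]
        start_indexes = np.argsort(start_logit)[-1 : -n_best - 1 : -1].tolist()
        end_indexes = np.argsort(end_logit)[-1 : -n_best - 1 : -1].tolist()
        for start_index in start_indexes:
            for end_index in end_indexes:
                if offsets[start_index] is None or offsets[end_index] is None:
                    continue
                if end_index < start_index or end_index - start_index + 1 > max_answer_length:
                    continue
                answers.append(
                    {
                        "text": context[offsets[start_index][0] : offsets[end_index][1]],
                        "logit_score": start_logit[start_index] + end_logit[end_index],
                    }
                )
    # Keep the candidate with the best logit score as the example's answer
    best_answer = max(answers, key=lambda a: a["logit_score"])
    predicted_answers.append({"id": example_id, "prediction_text": best_answer["text"]})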
73
00:03:00,030 --> 00:03:03,030
Now you know how to get answers
from your model predictions.
74
00:03:04,214 --> 00:03:06,797
(subtle blast)