subtitles/en/01_the-pipeline-function.srt
1
00:00:00,069 --> 00:00:01,341
(screen whooshes)
2
00:00:01,341 --> 00:00:02,449
(face logo whooshes)
3
00:00:02,449 --> 00:00:05,880
(screen whooshes)
4
00:00:05,880 --> 00:00:07,080
- The pipeline function.
5
00:00:09,540 --> 00:00:12,020
The pipeline function is
the highest-level API
6
00:00:12,020 --> 00:00:14,010
of the Transformers library.
7
00:00:14,010 --> 00:00:16,050
It groups together all the steps
8
00:00:16,050 --> 00:00:18,873
to go from raw texts
to usable predictions.
9
00:00:20,228 --> 00:00:22,980
The model used is at
the core of a pipeline,
10
00:00:22,980 --> 00:00:24,390
but the pipeline also includes
11
00:00:24,390 --> 00:00:26,610
all the necessary pre-processing,
12
00:00:26,610 --> 00:00:30,240
since the model does not
expect texts, but numbers,
13
00:00:30,240 --> 00:00:32,040
as well as some post-processing,
14
00:00:32,040 --> 00:00:34,533
to make the output of
the model human-readable.
15
00:00:35,910 --> 00:00:37,593
Let's look at a first example
16
00:00:37,593 --> 00:00:39,693
with the sentiment analysis pipeline.
17
00:00:40,740 --> 00:00:44,670
This pipeline performs text
classification on a given input
18
00:00:44,670 --> 00:00:46,953
and determines if it's
positive or negative.
19
00:00:47,910 --> 00:00:51,750
Here, it attributed the positive
label to the given text,
20
00:00:51,750 --> 00:00:54,413
with a confidence of 95%.
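The call described here can be sketched as follows (a minimal example assuming the `transformers` library is installed; the input sentence is illustrative, and the default checkpoint is downloaded on first use):

```python
from transformers import pipeline

# Build a sentiment-analysis pipeline with the default checkpoint.
classifier = pipeline("sentiment-analysis")

# Classify a single input text as POSITIVE or NEGATIVE.
result = classifier("I've been waiting for a HuggingFace course my whole life.")
print(result)
```

The result is a list with one dictionary per input, each holding a `label` and a `score` between 0 and 1.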
21
00:00:55,650 --> 00:00:58,470
You can pass multiple
texts to the same pipeline,
22
00:00:58,470 --> 00:01:00,270
which will be processed and passed
23
00:01:00,270 --> 00:01:02,673
through the model together as a batch.
24
00:01:03,570 --> 00:01:05,970
The output is a list of individual results
25
00:01:05,970 --> 00:01:07,923
in the same order as the input texts.
26
00:01:08,790 --> 00:01:12,270
Here we find the same label
and score for the first text,
27
00:01:12,270 --> 00:01:14,443
and the second text is judged negative
28
00:01:14,443 --> 00:01:17,243
with a confidence of 99.9%.
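Batching several texts looks like this (a sketch with illustrative inputs, assuming the same default checkpoint):

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

# Passing a list sends the texts through the model together as a batch;
# the output is one result per input, in the same order.
results = classifier([
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
])
print(results)
```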
29
00:01:18,720 --> 00:01:20,700
The zero-shot classification pipeline
30
00:01:20,700 --> 00:01:23,610
is a more general
text-classification pipeline;
31
00:01:23,610 --> 00:01:26,370
it allows you to provide
the labels you want.
32
00:01:26,370 --> 00:01:29,850
Here we want to classify our
input text along the labels,
33
00:01:29,850 --> 00:01:32,643
education, politics, and business.
34
00:01:33,540 --> 00:01:35,580
The pipeline successfully recognizes
35
00:01:35,580 --> 00:01:38,280
it's more about education
than the other labels,
36
00:01:38,280 --> 00:01:40,643
with a confidence of 84%.
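A minimal sketch of the zero-shot call (the input sentence is illustrative; the candidate labels are the ones named in the video):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification")

# Supply your own candidate labels; the model scores each one.
result = classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)
print(result)
```

The output pairs the labels, sorted by likelihood, with their scores.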
37
00:01:41,670 --> 00:01:43,110
Moving on to other tasks,
38
00:01:43,110 --> 00:01:45,030
the text generation pipeline will
39
00:01:45,030 --> 00:01:46,533
auto-complete a given prompt.
40
00:01:47,460 --> 00:01:49,980
The output is generated
with a bit of randomness,
41
00:01:49,980 --> 00:01:52,800
so it changes each time you
call the generator object
42
00:01:52,800 --> 00:01:53,763
on a given prompt.
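Auto-completing a prompt can be sketched like this (assuming the default text-generation checkpoint; the prompt is illustrative):

```python
from transformers import pipeline

generator = pipeline("text-generation")

# Sampling introduces randomness, so repeated calls on the same
# prompt produce different completions.
result = generator("In this course, we will teach you how to")
print(result[0]["generated_text"])
```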
43
00:01:54,990 --> 00:01:57,123
Up until now, we've used
the pipeline API
44
00:01:57,123 --> 00:02:00,360
with the default model
associated with each task,
45
00:02:00,360 --> 00:02:02,880
but you can use it with any
model that has been pretrained
46
00:02:02,880 --> 00:02:04,263
or fine-tuned on this task.
47
00:02:06,540 --> 00:02:10,350
Going to the Model Hub,
huggingface.co/models,
48
00:02:10,350 --> 00:02:13,350
you can filter the
available models by task.
49
00:02:13,350 --> 00:02:17,190
The default model used in our
previous example was gpt2,
50
00:02:17,190 --> 00:02:19,290
but there are many more models available,
51
00:02:19,290 --> 00:02:20,523
and not just in English.
52
00:02:21,450 --> 00:02:23,670
Let's go back to the
text generation pipeline
53
00:02:23,670 --> 00:02:26,193
and load it with another
model, distilgpt2.
54
00:02:27,060 --> 00:02:28,950
This is a lighter version of gpt2
55
00:02:28,950 --> 00:02:30,603
created by the Hugging Face team.
56
00:02:31,740 --> 00:02:34,110
When applying the pipeline
to a given prompt,
57
00:02:34,110 --> 00:02:36,360
we can specify several arguments
58
00:02:36,360 --> 00:02:39,240
such as the maximum length
of the generated texts,
59
00:02:39,240 --> 00:02:41,700
or the number of sentences
we want to return,
60
00:02:41,700 --> 00:02:44,150
since there is some
randomness in the generation.
61
00:02:46,080 --> 00:02:48,750
Generating texts by guessing
the next word in a sentence
62
00:02:48,750 --> 00:02:51,450
was the pretraining objective of GPT-2.
63
00:02:51,450 --> 00:02:55,140
The fill-mask pipeline uses the
pretraining objective of BERT,
64
00:02:55,140 --> 00:02:57,363
which is to guess the
value of a masked word.
65
00:02:58,260 --> 00:03:01,020
In this case, we ask for the
two most likely values
66
00:03:01,020 --> 00:03:03,660
for the missing word,
according to the model,
67
00:03:03,660 --> 00:03:07,053
and get mathematical or
computational as possible answers.
68
00:03:08,280 --> 00:03:10,170
Another task Transformers
models can perform
69
00:03:10,170 --> 00:03:12,660
is to classify each word in the sentence
70
00:03:12,660 --> 00:03:14,970
instead of the sentence as a whole.
71
00:03:14,970 --> 00:03:18,390
One example of this is
Named Entity Recognition,
72
00:03:18,390 --> 00:03:20,820
which is the task of identifying entities,
73
00:03:20,820 --> 00:03:25,323
such as persons, organizations
or locations in a sentence.
74
00:03:26,400 --> 00:03:30,570
Here, the model correctly
finds the person, Sylvain,
75
00:03:30,570 --> 00:03:32,453
the organization, Hugging Face,
76
00:03:32,453 --> 00:03:35,010
as well as the location, Brooklyn,
77
00:03:35,010 --> 00:03:36,303
inside the input text.
78
00:03:37,661 --> 00:03:40,230
The grouped_entities=True argument used
79
00:03:40,230 --> 00:03:42,330
is to make the pipeline group together
80
00:03:42,330 --> 00:03:44,790
the different words
linked to the same entity,
81
00:03:44,790 --> 00:03:46,353
such as Hugging and Face here.
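The NER call with grouping can be sketched as follows (the input sentence is the one quoted in the video; the default NER checkpoint is assumed):

```python
from transformers import pipeline

# grouped_entities=True merges the sub-words belonging to one
# entity, e.g. "Hugging" and "Face" into a single organization.
ner = pipeline("ner", grouped_entities=True)
results = ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")
print(results)
```

Each result carries an `entity_group` (such as PER, ORG, or LOC), the matched `word`, and its position in the text.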
82
00:03:48,270 --> 00:03:50,670
Another task available
with the pipeline API
83
00:03:50,670 --> 00:03:52,920
is extractive question answering.
84
00:03:52,920 --> 00:03:55,380
Given a context and a question,
85
00:03:55,380 --> 00:03:58,290
the model will identify the
span of text in the context
86
00:03:58,290 --> 00:04:00,190
containing the answer to the question.
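Extractive question answering can be sketched like this (question and context are illustrative, matching the earlier NER sentence):

```python
from transformers import pipeline

question_answerer = pipeline("question-answering")

# The model extracts the answer span directly from the context.
result = question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn.",
)
print(result)
```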
87
00:04:01,650 --> 00:04:03,960
Getting short summaries
of very long articles
88
00:04:03,960 --> 00:04:06,540
is also something the Transformers
library can help with,
89
00:04:06,540 --> 00:04:08,140
with the summarization pipeline.
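A minimal summarization sketch (the passage below is just a stand-in for a long article; the default checkpoint is assumed):

```python
from transformers import pipeline

summarizer = pipeline("summarization")

long_text = (
    "America has changed dramatically during recent years. Not only has "
    "the number of graduates in traditional engineering disciplines "
    "declined, but in most of the premier American universities "
    "engineering curricula now concentrate on and encourage largely the "
    "study of engineering science. Rapidly developing economies such as "
    "China and India continue to encourage and advance the teaching of "
    "engineering."
)

# Condense the input into a short summary.
result = summarizer(long_text)
print(result[0]["summary_text"])
```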
90
00:04:09,480 --> 00:04:12,570
Finally, the last task
supported by the pipeline API
91
00:04:12,570 --> 00:04:14,130
is translation.
92
00:04:14,130 --> 00:04:16,170
Here we use a French/English model
93
00:04:16,170 --> 00:04:17,460
found on the model hub
94
00:04:17,460 --> 00:04:19,893
to get the English
version of our input text.
95
00:04:21,600 --> 00:04:23,490
Here is a brief summary of all the tasks
96
00:04:23,490 --> 00:04:25,500
we've looked into in this video.
97
00:04:25,500 --> 00:04:27,390
Try them out through the inference widgets
98
00:04:27,390 --> 00:04:28,327
on the Model Hub.
99
00:04:30,459 --> 00:04:33,475
(screen whooshes)
100
00:04:33,475 --> 00:04:35,175
(logo whooshes)