subtitles/en/01_the-pipeline-function.srt

1
00:00:00,069 --> 00:00:01,341
(screen whooshes)

2
00:00:01,341 --> 00:00:02,449
(face logo whooshes)

3
00:00:02,449 --> 00:00:05,880
(screen whooshes)

4
00:00:05,880 --> 00:00:07,080
- The pipeline function.

5
00:00:09,540 --> 00:00:12,020
The pipeline function is the highest-level API

6
00:00:12,020 --> 00:00:14,010
of the Transformers library.

7
00:00:14,010 --> 00:00:16,050
It groups together all the steps

8
00:00:16,050 --> 00:00:18,873
to go from raw texts to usable predictions.

9
00:00:20,228 --> 00:00:22,980
The model used is at the core of a pipeline,

10
00:00:22,980 --> 00:00:24,390
but the pipeline also includes

11
00:00:24,390 --> 00:00:26,610
all the necessary pre-processing,

12
00:00:26,610 --> 00:00:30,240
since the model does not expect texts, but numbers,

13
00:00:30,240 --> 00:00:32,040
as well as some post-processing,

14
00:00:32,040 --> 00:00:34,533
to make the output of the model human-readable.

15
00:00:35,910 --> 00:00:37,593
Let's look at a first example

16
00:00:37,593 --> 00:00:39,693
with the sentiment analysis pipeline.

17
00:00:40,740 --> 00:00:44,670
This pipeline performs text classification on a given input

18
00:00:44,670 --> 00:00:46,953
and determines if it's positive or negative.

19
00:00:47,910 --> 00:00:51,750
Here, it attributed the positive label to the given text,

20
00:00:51,750 --> 00:00:54,413
with a confidence of 95%.

21
00:00:55,650 --> 00:00:58,470
You can pass multiple texts to the same pipeline,

22
00:00:58,470 --> 00:01:00,270
which will be processed and passed

23
00:01:00,270 --> 00:01:02,673
through the model together as a batch.

24
00:01:03,570 --> 00:01:05,970
The output is a list of individual results

25
00:01:05,970 --> 00:01:07,923
in the same order as the input texts.

26
00:01:08,790 --> 00:01:12,270
Here we find the same label and score for the first text,

27
00:01:12,270 --> 00:01:14,443
and the second text is judged negative

28
00:01:14,443 --> 00:01:17,243
with a confidence of 99.9%.
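The usage the narration describes can be sketched with the `transformers` library as below. The input sentences are placeholders, not necessarily the exact texts shown on screen in the video:

```python
from transformers import pipeline

# With only a task name, pipeline() loads a default English
# sentiment model from the Hugging Face Hub.
classifier = pipeline("sentiment-analysis")

# A single text returns a list with one dict: a label and a score.
single = classifier("I love the Transformers library!")
print(single)

# Several texts are processed through the model together as a batch;
# the output list keeps the same order as the inputs.
batch = classifier([
    "I love the Transformers library!",
    "I hate this so much!",
])
print(batch)
```

Each result is a dict with a `label` key (`POSITIVE` or `NEGATIVE` for the default model) and a `score` between 0 and 1.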
29
00:01:18,720 --> 00:01:20,700
The zero-shot classification pipeline

30
00:01:20,700 --> 00:01:23,610
is a more general text-classification pipeline;

31
00:01:23,610 --> 00:01:26,370
it allows you to provide the labels you want.

32
00:01:26,370 --> 00:01:29,850
Here we want to classify our input text along the labels

33
00:01:29,850 --> 00:01:32,643
education, politics, and business.

34
00:01:33,540 --> 00:01:35,580
The pipeline successfully recognizes

35
00:01:35,580 --> 00:01:38,280
it's more about education than the other labels,

36
00:01:38,280 --> 00:01:40,643
with a confidence of 84%.

37
00:01:41,670 --> 00:01:43,110
Moving on to other tasks,

38
00:01:43,110 --> 00:01:45,030
the text generation pipeline will

39
00:01:45,030 --> 00:01:46,533
auto-complete a given prompt.

40
00:01:47,460 --> 00:01:49,980
The output is generated with a bit of randomness,

41
00:01:49,980 --> 00:01:52,800
so it changes each time you call the generator object

42
00:01:52,800 --> 00:01:53,763
on a given prompt.

43
00:01:54,990 --> 00:01:57,123
Up until now, we've used the pipeline API

44
00:01:57,123 --> 00:02:00,360
with the default model associated with each task,

45
00:02:00,360 --> 00:02:02,880
but you can use it with any model that has been pretrained

46
00:02:02,880 --> 00:02:04,263
or fine-tuned on this task.

47
00:02:06,540 --> 00:02:10,350
Going to the model hub, huggingface.co/models,

48
00:02:10,350 --> 00:02:13,350
you can filter the available models by task.

49
00:02:13,350 --> 00:02:17,190
The default model used in our previous example was gpt2,

50
00:02:17,190 --> 00:02:19,290
but there are many more models available,

51
00:02:19,290 --> 00:02:20,523
and not just in English.

52
00:02:21,450 --> 00:02:23,670
Let's go back to the text generation pipeline

53
00:02:23,670 --> 00:02:26,193
and load it with another model, distilgpt2.

54
00:02:27,060 --> 00:02:28,950
This is a lighter version of gpt2

55
00:02:28,950 --> 00:02:30,603
created by the Hugging Face team.
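A minimal sketch of the two patterns described above: zero-shot classification with caller-provided labels, and swapping the task default for another Hub checkpoint. The input sentence and prompt are illustrative placeholders:

```python
from transformers import pipeline

# Zero-shot classification: the candidate labels are supplied
# at call time, so no fine-tuning on those labels is needed.
classifier = pipeline("zero-shot-classification")
output = classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)
# Labels come back sorted from most to least likely.
print(output["labels"][0], output["scores"][0])

# Any suitable Hub model can replace the task default;
# here we pick distilgpt2 instead of the default gpt2.
generator = pipeline("text-generation", model="distilgpt2")
print(generator("In this course, we will teach you how to"))
```

Generation is sampled, so the completion changes from one call to the next on the same prompt.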
56
00:02:31,740 --> 00:02:34,110
When applying the pipeline to a given prompt,

57
00:02:34,110 --> 00:02:36,360
we can specify several arguments

58
00:02:36,360 --> 00:02:39,240
such as the maximum length of the generated texts,

59
00:02:39,240 --> 00:02:41,700
or the number of sentences we want to return,

60
00:02:41,700 --> 00:02:44,150
since there is some randomness in the generation.

61
00:02:46,080 --> 00:02:48,750
Generating texts by guessing the next word in a sentence

62
00:02:48,750 --> 00:02:51,450
was the pretraining objective of GPT-2.

63
00:02:51,450 --> 00:02:55,140
The fill mask pipeline corresponds to the pretraining objective of BERT,

64
00:02:55,140 --> 00:02:57,363
which is to guess the value of a masked word.

65
00:02:58,260 --> 00:03:01,020
In this case, we ask for the two most likely values

66
00:03:01,020 --> 00:03:03,660
for the missing word, according to the model,

67
00:03:03,660 --> 00:03:07,053
and get mathematical or computational as possible answers.

68
00:03:08,280 --> 00:03:10,170
Another task Transformers models can perform

69
00:03:10,170 --> 00:03:12,660
is to classify each word in the sentence

70
00:03:12,660 --> 00:03:14,970
instead of the sentence as a whole.

71
00:03:14,970 --> 00:03:18,390
One example of this is Named Entity Recognition,

72
00:03:18,390 --> 00:03:20,820
which is the task of identifying entities,

73
00:03:20,820 --> 00:03:25,323
such as persons, organizations or locations in a sentence.

74
00:03:26,400 --> 00:03:30,570
Here, the model correctly finds the person, Sylvain,

75
00:03:30,570 --> 00:03:32,453
the organization, Hugging Face,

76
00:03:32,453 --> 00:03:35,010
as well as the location, Brooklyn,

77
00:03:35,010 --> 00:03:36,303
inside the input text.
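The generation arguments and the fill-mask task mentioned above might look like this in code. The prompt and masked sentence are placeholders, and the `<mask>` token is the one expected by the default fill-mask model (BERT-style models use `[MASK]` instead):

```python
from transformers import pipeline

# max_length caps the total length of each generated text, and
# num_return_sequences samples several completions, which makes
# sense because generation is stochastic.
generator = pipeline("text-generation", model="distilgpt2")
generations = generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)

# Fill-mask: predict the masked token, BERT's pretraining
# objective. top_k controls how many candidates are returned.
unmasker = pipeline("fill-mask")
candidates = unmasker(
    "This course will teach you all about <mask> models.", top_k=2
)
for c in candidates:
    print(c["token_str"], c["score"])
```

Each fill-mask candidate is a dict with the predicted token (`token_str`), its probability (`score`), and the full reconstructed sentence (`sequence`).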
78
00:03:37,661 --> 00:03:40,230
The grouped_entities=True argument is used

79
00:03:40,230 --> 00:03:42,330
to make the pipeline group together

80
00:03:42,330 --> 00:03:44,790
the different words linked to the same entity,

81
00:03:44,790 --> 00:03:46,353
such as Hugging and Face here.

82
00:03:48,270 --> 00:03:50,670
Another task available with the pipeline API

83
00:03:50,670 --> 00:03:52,920
is extractive question answering.

84
00:03:52,920 --> 00:03:55,380
Given a context and a question,

85
00:03:55,380 --> 00:03:58,290
the model will identify the span of text in the context

86
00:03:58,290 --> 00:04:00,190
containing the answer to the question.

87
00:04:01,650 --> 00:04:03,960
Getting short summaries of very long articles

88
00:04:03,960 --> 00:04:06,540
is also something the Transformers library can help with,

89
00:04:06,540 --> 00:04:08,140
with the summarization pipeline.

90
00:04:09,480 --> 00:04:12,570
Finally, the last task supported by the pipeline API

91
00:04:12,570 --> 00:04:14,130
is translation.

92
00:04:14,130 --> 00:04:16,170
Here we use a French/English model

93
00:04:16,170 --> 00:04:17,460
found on the model hub

94
00:04:17,460 --> 00:04:19,893
to get the English version of our input text.

95
00:04:21,600 --> 00:04:23,490
Here is a brief summary of all the tasks

96
00:04:23,490 --> 00:04:25,500
we've looked into in this video.

97
00:04:25,500 --> 00:04:27,390
Try them out through the inference widgets

98
00:04:27,390 --> 00:04:28,327
on the model hub.

99
00:04:30,459 --> 00:04:33,475
(screen whooshes)

100
00:04:33,475 --> 00:04:35,175
(logo whooshes)
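The remaining tasks can be sketched as follows. The NER and question-answering sentences echo the narration's example (Sylvain, Hugging Face, Brooklyn); the Helsinki-NLP checkpoint is one French-to-English model available on the Hub, not necessarily the one used in the video:

```python
from transformers import pipeline

# grouped_entities=True merges sub-word pieces belonging to one
# entity, e.g. "Hugging" + "Face" into a single "Hugging Face" span.
ner = pipeline("ner", grouped_entities=True)
entities = ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")
print(entities)

# Extractive question answering: the answer is a span copied
# verbatim from the supplied context, not freshly generated text.
qa = pipeline("question-answering")
answer = qa(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn.",
)
print(answer["answer"])

# Summarization and translation follow the same pattern; here a
# French-to-English translation checkpoint from the Hub.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
print(translator("Ce cours est produit par Hugging Face."))
```

Because the QA answer is extractive, it is always a substring of the context you passed in.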