1
00:00:04,560 --> 00:00:06,640
Welcome to the Hugging Face tasks series.

2
00:00:07,280 --> 00:00:10,720
In this video, we will take a look
at the Text Summarization task.

3
00:00:13,680 --> 00:00:16,480
Summarization is the task of
producing a shorter version

4
00:00:16,480 --> 00:00:21,600
of a document while preserving the relevant
and important information in the document.

5
00:00:25,040 --> 00:00:29,840
Summarization models take a document to be
summarized and output the summarized text.

6
00:00:33,360 --> 00:00:40,240
This task is evaluated using the ROUGE score. It’s
based on the overlap between the produced sequence

7
00:00:40,240 --> 00:00:48,000
and the correct sequence. You might see this as ROUGE-1,

8
00:00:48,000 --> 00:00:55,600
which is the overlap of single tokens, and ROUGE-2,
the overlap of consecutive token pairs. ROUGE-N

9
00:00:55,600 --> 00:01:02,960
refers to the overlap of n consecutive tokens.
Here we see an example of how overlaps take place.

10
00:01:06,160 --> 00:01:11,280
An example dataset used for this task is
called Extreme Summarization, XSUM. This

11
00:01:11,280 --> 00:01:14,480
dataset contains texts and
their summarized versions.

12
00:01:17,680 --> 00:01:21,280
You can use summarization models
to summarize research papers, which

13
00:01:21,280 --> 00:01:25,680
would enable researchers to easily
pick papers for their reading list.

14
00:01:29,040 --> 00:01:39,520
For more information about the Summarization
task, check out the Hugging Face course.
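
The ROUGE-N overlap described in the video can be sketched in a few lines of Python. This is a minimal illustration, not the official ROUGE implementation: it assumes lowercased whitespace tokenization, and the helper names `ngrams` and `rouge_n` are made up for this example. It counts the n-grams shared by the produced and the correct sequence, then reports precision, recall, and F1.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N precision, recall, and F1 for two strings.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: an n-gram counts only as often as it occurs in both.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


produced = "the cat sat on the mat"
correct = "the cat lay on the mat"
print(rouge_n(produced, correct, n=1))  # 5 of 6 single tokens overlap
print(rouge_n(produced, correct, n=2))  # 3 of 5 token pairs overlap
```

In practice, a maintained ROUGE package is the safer choice, since published scores typically also involve stemming and the ROUGE-L variant, which this sketch leaves out.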