1
00:00:04,560 --> 00:00:06,640
Welcome to the Hugging Face tasks series.  

2
00:00:07,280 --> 00:00:10,720
In this video, we will take a look 
at the Text Summarization task. 

3
00:00:13,680 --> 00:00:16,480
Summarization is the task of 
producing a shorter version  

4
00:00:16,480 --> 00:00:21,600
of a document while preserving the relevant 
and important information in the document. 

5
00:00:25,040 --> 00:00:29,840
Summarization models take a document 
as input and output its summary. 

6
00:00:33,360 --> 00:00:40,240
This task is commonly evaluated with the ROUGE score. It’s 
based on the overlap between the produced sequence  

7
00:00:40,240 --> 00:00:48,000
and the reference sequence.
You might see this as ROUGE-1,  

8
00:00:48,000 --> 00:00:55,600
which is the overlap of single tokens, and ROUGE-2, 
the overlap of consecutive token pairs. ROUGE-N  

9
00:00:55,600 --> 00:01:02,960
refers to the overlap of n consecutive tokens. 
Here we see an example of how overlaps take place. 
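
To make the overlap idea concrete, here is a minimal Python sketch of an n-gram overlap score. It assumes plain whitespace tokenization and lowercasing, and the names `ngrams` and `rouge_n` are made up for this illustration; they are not a library API. Real ROUGE implementations usually add steps such as stemming, so their numbers may differ.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Illustrative ROUGE-N: clipped n-gram overlap between two texts.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated n-gram is only credited as often as it appears in both.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


produced = "the cat sat on the mat"
reference = "the cat lay on the mat"
print(rouge_n(produced, reference, n=1))  # single-token overlap
print(rouge_n(produced, reference, n=2))  # token-pair overlap
```

In this toy pair, 5 of the 6 single tokens overlap, and 3 of the 5 consecutive token pairs overlap, so the ROUGE-2 style score is lower than the ROUGE-1 style score.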

10
00:01:06,160 --> 00:01:11,280
An example dataset used for this task is 
called Extreme Summarization, or XSum. This  

11
00:01:11,280 --> 00:01:14,480
dataset contains texts and 
their summarized versions. 

12
00:01:17,680 --> 00:01:21,280
You can use summarization models 
to summarize research papers, which  

13
00:01:21,280 --> 00:01:25,680
would enable researchers to easily 
pick papers for their reading list. 

14
00:01:29,040 --> 00:01:39,520
For more information about the Summarization 
task, check out the Hugging Face course.