1
00:00:00,587 --> 00:00:02,670
(swoosh)
2
00:00:05,100 --> 00:00:07,080
- [Narrator] Hi, this
is going to be a video
3
00:00:07,080 --> 00:00:09,420
about the push_to_hub API
4
00:00:09,420 --> 00:00:10,670
for TensorFlow and Keras.
5
00:00:11,820 --> 00:00:14,850
So, to get started, we'll
open up our notebook.
6
00:00:14,850 --> 00:00:16,920
And the first thing you'll
need to do is log in to
7
00:00:16,920 --> 00:00:18,170
your Hugging Face account,
8
00:00:19,043 --> 00:00:20,663
for example with the
notebook_login function.
9
00:00:21,570 --> 00:00:24,630
So to use that, you
simply call the function,
10
00:00:24,630 --> 00:00:26,010
the popup will emerge.
11
00:00:26,010 --> 00:00:28,800
You will enter your username and password,
12
00:00:28,800 --> 00:00:31,425
which I'm going to pull out
of my password manager here,
13
00:00:31,425 --> 00:00:33,108
and you log in.
14
00:00:33,108 --> 00:00:35,670
The next two cells are just
15
00:00:35,670 --> 00:00:37,080
getting everything ready for training.
16
00:00:37,080 --> 00:00:38,940
So we're just going to load a dataset,
17
00:00:38,940 --> 00:00:41,100
we're going to tokenize that dataset,
18
00:00:41,100 --> 00:00:42,990
and then we're going to
load our model and compile
19
00:00:42,990 --> 00:00:45,660
it with the standard Adam optimizer.
20
00:00:45,660 --> 00:00:47,560
So I'm just going to run all of those.
21
00:00:49,830 --> 00:00:52,080
We'll wait a few seconds,
22
00:00:52,080 --> 00:00:54,280
and everything should
be ready for training.
23
00:00:57,983 --> 00:00:58,816
Okay.
24
00:00:58,816 --> 00:01:01,440
So now we're ready to train.
25
00:01:01,440 --> 00:01:03,030
I'm going to show you the two ways
26
00:01:03,030 --> 00:01:05,130
you can push your model to the Hub.
27
00:01:05,130 --> 00:01:08,190
So the first is with
the PushToHubCallback.
28
00:01:08,190 --> 00:01:10,107
So a callback in Keras
29
00:01:10,107 --> 00:01:13,710
is a function that's called
regularly during training.
30
00:01:13,710 --> 00:01:17,400
You can set it to be called
after a certain number of steps,
31
00:01:17,400 --> 00:01:21,427
or every epoch, or even just
once at the end of training.
32
00:01:21,427 --> 00:01:25,080
So a lot of callbacks
in Keras, for example,
33
00:01:25,080 --> 00:01:28,050
control learning rate decay on plateau,
34
00:01:28,050 --> 00:01:30,047
and things like that.
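To make the callback idea concrete, here is a minimal plain-Python sketch. It is not Keras itself: `EpochLogger` and `toy_fit` are hypothetical names, and the only thing borrowed from Keras is the `on_epoch_end(epoch, logs)` hook name. It just shows a training loop handing control to each callback once per epoch.

```python
# A plain-Python sketch of the callback idea; this is NOT Keras itself.
# All class and function names here are hypothetical.

class EpochLogger:
    """Records the epoch index each time the training loop finishes an epoch."""

    def __init__(self):
        self.seen_epochs = []

    def on_epoch_end(self, epoch, logs=None):
        self.seen_epochs.append(epoch)


def toy_fit(num_epochs, callbacks):
    """Stand-in for a training loop: does no real training, only calls hooks."""
    for epoch in range(num_epochs):
        # ... one epoch of training would happen here ...
        for callback in callbacks:
            callback.on_epoch_end(epoch, logs={"loss": None})


logger = EpochLogger()
toy_fit(num_epochs=3, callbacks=[logger])
print(logger.seen_epochs)
```

A callback that uploads a checkpoint would do its upload inside the same hook, instead of appending to a list.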
35
00:01:30,047 --> 00:01:32,520
So this callback, by default,
36
00:01:32,520 --> 00:01:35,760
will save your model to
the Hub once every epoch.
37
00:01:35,760 --> 00:01:37,080
And that's really helpful,
38
00:01:37,080 --> 00:01:38,790
especially if your training is very long,
39
00:01:38,790 --> 00:01:40,800
because that means you
can resume from that save,
40
00:01:40,800 --> 00:01:43,290
so you get this automatic
cloud-saving of your model.
41
00:01:43,290 --> 00:01:45,027
And you can even run inference
42
00:01:45,027 --> 00:01:47,730
with the checkpoints of your model
43
00:01:47,730 --> 00:01:50,208
that have been uploaded by this callback.
44
00:01:50,208 --> 00:01:52,260
And that means you can,
45
00:01:52,260 --> 00:01:54,150
y'know, run some test inputs
46
00:01:54,150 --> 00:01:56,100
and actually see how your model works
47
00:01:56,100 --> 00:01:57,990
at various stages during training,
48
00:01:57,990 --> 00:01:59,540
which is a really nice feature.
49
00:02:00,390 --> 00:02:03,960
So we're going to add
the PushToHubCallback,
50
00:02:03,960 --> 00:02:05,670
and it takes just a few arguments.
51
00:02:05,670 --> 00:02:08,250
So the first argument is
the temporary directory
52
00:02:08,250 --> 00:02:10,260
that files are going to be saved to
53
00:02:10,260 --> 00:02:12,150
before they're uploaded to the Hub.
54
00:02:12,150 --> 00:02:14,127
The second argument is the tokenizer,
55
00:02:14,127 --> 00:02:15,808
and the third argument here
56
00:02:15,808 --> 00:02:19,080
is the keyword argument hub_model_id.
57
00:02:19,080 --> 00:02:21,330
So that's the name it's
going to be saved under
58
00:02:21,330 --> 00:02:23,006
on the Hugging Face Hub.
59
00:02:23,006 --> 00:02:26,267
You can also upload to
an organization account
60
00:02:26,267 --> 00:02:29,370
just by adding the organization name
61
00:02:29,370 --> 00:02:32,460
before the repository name
with a slash, like this.
62
00:02:32,460 --> 00:02:34,020
So you probably don't have permissions
63
00:02:34,020 --> 00:02:36,000
to upload to the Hugging Face organization,
64
00:02:36,000 --> 00:02:37,170
if you do, please file a bug
65
00:02:37,170 --> 00:02:38,973
and let us know extremely urgently.
66
00:02:40,830 --> 00:02:42,960
But if you do have access
to your own organization,
67
00:02:42,960 --> 00:02:44,730
then you can use that same approach
68
00:02:44,730 --> 00:02:46,650
to upload models to their account
69
00:02:46,650 --> 00:02:50,760
instead of to your own
personal set of models.
70
00:02:50,760 --> 00:02:53,520
So, once you've made your callback,
71
00:02:53,520 --> 00:02:56,310
you simply add it to the callbacks list
72
00:02:56,310 --> 00:02:58,080
when you're calling model.fit.
73
00:02:58,080 --> 00:03:01,110
And everything is uploaded
for you from there,
74
00:03:01,110 --> 00:03:02,610
there's nothing else to worry about.
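The callback setup described above might look roughly like the sketch below. It assumes a transformers release that still ships TensorFlow support. The helper names `make_repo_id` and `build_push_callback`, the `./model_save` directory, and the `my-org` organization are illustrative assumptions, not part of the notebook in the video. The arguments passed to `PushToHubCallback` are the three mentioned in the narration: the output directory, the tokenizer, and `hub_model_id`.

```python
def make_repo_id(model_name, organization=None):
    """Return "organization/model_name", or just model_name for a personal repo."""
    return f"{organization}/{model_name}" if organization else model_name


def build_push_callback(tokenizer, model_name, organization=None,
                        output_dir="./model_save"):
    # Imported inside the function so the helper above can be used
    # without transformers/TensorFlow installed.
    from transformers import PushToHubCallback

    return PushToHubCallback(
        output_dir=output_dir,   # local directory files are saved to before upload
        tokenizer=tokenizer,     # uploaded alongside the model
        hub_model_id=make_repo_id(model_name, organization),
    )


# Typical use, assuming `model`, `tokenizer`, `train_dataset`, and
# `validation_dataset` already exist in the notebook:
#
# callback = build_push_callback(tokenizer, "bert-fine-tuned-cola")
# model.fit(train_dataset, validation_data=validation_dataset,
#           epochs=2, callbacks=[callback])

print(make_repo_id("bert-fine-tuned-cola", organization="my-org"))
```

Leaving `organization` unset keeps the repository under your personal account.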
75
00:03:02,610 --> 00:03:04,530
The second way to upload a model, though,
76
00:03:04,530 --> 00:03:07,020
is to call model.push_to_hub.
77
00:03:07,020 --> 00:03:09,086
So this is more of a once-off method.
78
00:03:09,086 --> 00:03:11,550
It's not called regularly during training.
79
00:03:11,550 --> 00:03:13,680
You can just call this
manually whenever you want to
80
00:03:13,680 --> 00:03:15,240
upload a model to the Hub.
81
00:03:15,240 --> 00:03:18,949
So we recommend running this
after the end of training,
82
00:03:18,949 --> 00:03:21,870
just to make sure that
you have a commit message
83
00:03:21,870 --> 00:03:24,060
to guarantee that this
was the final version
84
00:03:24,060 --> 00:03:26,143
of the model at the end of training.
85
00:03:26,143 --> 00:03:27,930
And it just makes sure that, you know,
86
00:03:27,930 --> 00:03:30,480
you're working with the
definitive end-of-training model
87
00:03:30,480 --> 00:03:32,190
and not accidentally using a checkpoint
88
00:03:32,190 --> 00:03:34,224
from somewhere along the way.
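As a rough sketch of that one-off upload: the helper below calls `push_to_hub` with a commit message that marks the final version. `push_final_model`, `FakeModel`, and the message text are assumptions made for illustration. With a real transformers model you would pass the model itself instead of the stand-in.

```python
def push_final_model(model, repo_id, message="End of training"):
    """Upload the model once, with a commit message marking the final version."""
    # For a transformers TensorFlow model, push_to_hub accepts the repo name
    # and an optional commit_message keyword.
    return model.push_to_hub(repo_id, commit_message=message)


class FakeModel:
    """Hypothetical stand-in so the sketch runs without a trained model."""

    def __init__(self):
        self.calls = []

    def push_to_hub(self, repo_id, commit_message=None):
        self.calls.append((repo_id, commit_message))


fake = FakeModel()
push_final_model(fake, "bert-fine-tuned-cola")
print(fake.calls)
```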
89
00:03:34,224 --> 00:03:37,173
So I'm going to run both of these cells.
90
00:03:39,299 --> 00:03:41,716
And then I'm going to cut the video here,
91
00:03:41,716 --> 00:03:43,080
just because training is going
to take a couple of minutes.
92
00:03:43,080 --> 00:03:44,580
So I'll skip forward to the end of that,
93
00:03:44,580 --> 00:03:46,320
when the models have all been uploaded,
94
00:03:46,320 --> 00:03:48,390
and I'm gonna show you how you can
95
00:03:48,390 --> 00:03:50,010
access the models in the Hub,
96
00:03:50,010 --> 00:03:52,713
and the other things you
can do with them from there.
97
00:03:55,440 --> 00:03:56,700
Okay, we're back,
98
00:03:56,700 --> 00:03:59,160
and our model was uploaded.
99
00:03:59,160 --> 00:04:00,750
Both by the PushToHubCallback
100
00:04:00,750 --> 00:04:04,251
and also by our call to
model.push_to_hub after training.
101
00:04:04,251 --> 00:04:05,910
So everything's looking good.
102
00:04:05,910 --> 00:04:09,960
So now if we drop over to
my profile on Hugging Face,
103
00:04:09,960 --> 00:04:12,630
and you can get there just by
clicking the profile button
104
00:04:12,630 --> 00:04:13,680
in the dropdown.
105
00:04:13,680 --> 00:04:16,860
We can see that the
bert-fine-tuned-cola model is here,
106
00:04:16,860 --> 00:04:18,369
and was updated 3 minutes ago.
107
00:04:18,369 --> 00:04:20,520
So it'll always be at
the top of your list,
108
00:04:20,520 --> 00:04:23,340
because they're sorted by how
recently they were updated.
109
00:04:23,340 --> 00:04:25,740
And we can start querying
our model immediately.
110
00:04:30,564 --> 00:04:32,939
So the dataset we were training on
111
00:04:32,939 --> 00:04:34,320
is the GLUE CoLA dataset,
112
00:04:34,320 --> 00:04:36,210
and CoLA is an acronym standing for
113
00:04:36,210 --> 00:04:39,420
the Corpus of Linguistic Acceptability.
114
00:04:39,420 --> 00:04:42,480
So what that means is the model
is being trained to decide
115
00:04:42,480 --> 00:04:46,350
if a sentence is grammatically
or linguistically okay,
116
00:04:46,350 --> 00:04:48,171
or if there's a problem with it.
117
00:04:48,171 --> 00:04:52,890
For example, we could say,
"This is a legitimate sentence."
118
00:04:52,890 --> 00:04:54,180
And hopefully it realizes that
119
00:04:54,180 --> 00:04:56,080
this is in fact a legitimate sentence.
120
00:04:57,630 --> 00:05:00,240
So it might take a couple of
seconds for the model to load
121
00:05:00,240 --> 00:05:03,060
when you call it for the first time.
122
00:05:03,060 --> 00:05:05,960
So I might cut a couple of
seconds out of this video here.
123
00:05:07,860 --> 00:05:09,060
Okay, we're back.
124
00:05:09,060 --> 00:05:12,407
So the model loaded and we got an output,
125
00:05:12,407 --> 00:05:14,340
but there's an obvious problem here.
126
00:05:14,340 --> 00:05:16,888
So these labels aren't really telling us
127
00:05:16,888 --> 00:05:19,740
what categories the model
has actually assigned
128
00:05:19,740 --> 00:05:21,655
to this input sentence.
129
00:05:21,655 --> 00:05:23,520
So if we want to fix that,
130
00:05:23,520 --> 00:05:26,010
we want to make sure the model config
131
00:05:26,010 --> 00:05:28,980
has the correct names for
each of the label classes,
132
00:05:28,980 --> 00:05:30,707
and then we want to upload that config.
133
00:05:30,707 --> 00:05:32,220
So we can do that down here.
134
00:05:32,220 --> 00:05:34,050
To get the label names,
135
00:05:34,050 --> 00:05:36,547
we can get that from
the dataset we loaded,
136
00:05:36,547 --> 00:05:39,627
from the features attribute it has.
137
00:05:39,627 --> 00:05:42,217
And then we can create dictionaries
138
00:05:42,217 --> 00:05:44,865
"id2label" and "label2id",
139
00:05:44,865 --> 00:05:47,452
and just assign them to the model config.
140
00:05:47,452 --> 00:05:50,790
And then we can just
push our updated config,
141
00:05:50,790 --> 00:05:54,690
and that'll override the
existing config in the Hub repo.
142
00:05:54,690 --> 00:05:56,368
So that's just been done.
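The label fix can be sketched as follows. The helper name `add_label_names` and the variable `raw_datasets` in the comments are assumptions, and a `SimpleNamespace` stands in for the real model config so the example runs without downloading anything. The two label names are assumed to match the order used by the CoLA dataset.

```python
from types import SimpleNamespace


def add_label_names(config, label_names):
    """Attach id2label and label2id mappings to a model config object."""
    config.id2label = {i: name for i, name in enumerate(label_names)}
    config.label2id = {name: i for i, name in enumerate(label_names)}
    return config


# In the notebook, the names would come from the dataset, for example:
#   label_names = raw_datasets["train"].features["label"].names
# and the config would be model.config, uploaded afterwards with
#   model.config.push_to_hub("bert-fine-tuned-cola")
# Here a SimpleNamespace stands in for the config, and the two names are
# assumed to match the CoLA label order.
config = add_label_names(SimpleNamespace(), ["unacceptable", "acceptable"])
print(config.id2label)
print(config.label2id)
```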
143
00:05:56,368 --> 00:05:58,320
So now, if we go back here,
144
00:05:58,320 --> 00:06:00,000
I'm going to use a
slightly different sentence
145
00:06:00,000 --> 00:06:03,540
because the outputs for
sentences are sometimes cached.
146
00:06:03,540 --> 00:06:06,030
And so, if we want to generate new results
147
00:06:06,030 --> 00:06:07,590
I'm going to use something
slightly different.
148
00:06:07,590 --> 00:06:09,783
So let's try an incorrect sentence.
149
00:06:10,830 --> 00:06:12,640
So this is not valid English grammar
150
00:06:13,538 --> 00:06:15,030
and hopefully the model will see that.
151
00:06:15,030 --> 00:06:16,958
It's going to reload here,
152
00:06:16,958 --> 00:06:18,630
so I'm going to cut a
couple of seconds here,
153
00:06:18,630 --> 00:06:20,933
and then we'll see what
the model is going to say.
154
00:06:22,860 --> 00:06:23,820
Okay.
155
00:06:23,820 --> 00:06:26,580
So the model, its
confidence isn't very good,
156
00:06:26,580 --> 00:06:28,830
because of course we
didn't really optimize
157
00:06:28,830 --> 00:06:30,630
our hyperparameters at all.
158
00:06:30,630 --> 00:06:32,190
But it has decided that this sentence
159
00:06:32,190 --> 00:06:35,094
is more likely to be
unacceptable than acceptable.
160
00:06:35,094 --> 00:06:38,160
Presumably if we tried a
bit harder with training
161
00:06:38,160 --> 00:06:40,080
we could get a much lower validation loss,
162
00:06:40,080 --> 00:06:43,830
and therefore the model's
predictions would be more precise.
163
00:06:43,830 --> 00:06:46,260
But let's try our original sentence again.
164
00:06:46,260 --> 00:06:49,140
Of course, because of the caching issue,
165
00:06:49,140 --> 00:06:52,740
we're seeing that the original
answers are unchanged.
166
00:06:52,740 --> 00:06:55,196
So let's try a different, valid sentence.
167
00:06:55,196 --> 00:06:58,767
So let's try, "This is a
valid English sentence".
168
00:07:00,150 --> 00:07:02,100
And we see that now the
model correctly decides
169
00:07:02,100 --> 00:07:04,290
that it has a very high
probability of being acceptable,
170
00:07:04,290 --> 00:07:06,900
and a very low probability
of being unacceptable.
171
00:07:06,900 --> 00:07:09,930
So you can use this inference API
172
00:07:09,930 --> 00:07:12,810
even with the checkpoints that
are uploaded during training,
173
00:07:12,810 --> 00:07:14,546
so it can be very interesting to see how
174
00:07:14,546 --> 00:07:17,690
the model's predictions
for sample inputs change
175
00:07:17,690 --> 00:07:20,579
with each epoch of training.
176
00:07:20,579 --> 00:07:23,370
Also, the model we've uploaded
177
00:07:23,370 --> 00:07:25,740
is going to be accessible to you and,
178
00:07:25,740 --> 00:07:28,046
if it's shared publicly, to anyone else.
179
00:07:28,046 --> 00:07:29,788
So if you want to load that model,
180
00:07:29,788 --> 00:07:32,500
all you or anyone else needs to do
181
00:07:34,290 --> 00:07:37,440
is just to load it in either a pipeline,
182
00:07:37,440 --> 00:07:40,925
or you can just load it with, for example,
183
00:07:40,925 --> 00:07:43,203
TFAutoModelForSequenceClassification.
184
00:07:46,920 --> 00:07:49,989
And then for the name you
would just simply pass
185
00:07:49,989 --> 00:07:53,325
the path to the repo you want to upload.
186
00:07:53,325 --> 00:07:55,890
Or to download, excuse me.
187
00:07:55,890 --> 00:07:58,710
So if I want to use this model again,
188
00:07:58,710 --> 00:08:00,667
if I want to load it from the Hub,
189
00:08:00,667 --> 00:08:01,763
I just run this one line of code.
190
00:08:02,813 --> 00:08:03,773
The model will be downloaded.
191
00:08:07,757 --> 00:08:10,080
And, with any luck, it'll be ready to
192
00:08:10,080 --> 00:08:12,450
fine-tune on a different
dataset, make predictions with,
193
00:08:12,450 --> 00:08:14,340
or do anything else you wanna do.
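Loading the model back might look like the sketch below, again assuming a transformers release with TensorFlow support. `load_classifier` and the `your-username` prefix are hypothetical. The optional `from_pretrained` argument is there only so the example can be dry-run offline with a stand-in loader. In practice you would leave it unset, or use `pipeline("text-classification", model=repo_id)` instead.

```python
def load_classifier(repo_id, from_pretrained=None):
    """Download a sequence-classification model from the Hub by repo name."""
    if from_pretrained is None:
        # Deferred import: needs transformers with TensorFlow support.
        from transformers import TFAutoModelForSequenceClassification

        from_pretrained = TFAutoModelForSequenceClassification.from_pretrained
    return from_pretrained(repo_id)


# Real use (downloads weights), with a hypothetical username:
#   model = load_classifier("your-username/bert-fine-tuned-cola")
#
# Dry run with a stand-in loader, so the sketch runs offline:
requested = []
load_classifier("your-username/bert-fine-tuned-cola",
                from_pretrained=requested.append)
print(requested)
```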
194
00:08:14,340 --> 00:08:17,700
So that was a quick overview of how,
195
00:08:17,700 --> 00:08:19,470
after your training or
during your training,
196
00:08:19,470 --> 00:08:21,420
you can upload models to the Hub,
197
00:08:21,420 --> 00:08:22,440
you can checkpoint there,
198
00:08:22,440 --> 00:08:24,240
you can resume training from there,
199
00:08:24,240 --> 00:08:26,790
and you can get inference results
200
00:08:26,790 --> 00:08:28,384
from the models you've uploaded.
201
00:08:28,384 --> 00:08:31,084
So thank you, and I hope to
see you in a future video.
202
00:08:32,852 --> 00:08:34,935
(swoosh)