1 00:00:00,430 --> 00:00:03,013 (欢快的音乐) (upbeat music) 2 00:00:05,160 --> 00:00:07,080 - 在这个视频中,我会给你 - In this video, I'm going to give you 3 00:00:07,080 --> 00:00:10,350 快速介绍我们的变压器模型 a very quick introduction to how our transformer models 4 00:00:10,350 --> 00:00:14,040 与 Tensorflow 和 Keras 一起工作。 work together with Tensorflow and Keras. 5 00:00:14,040 --> 00:00:15,510 非常简短的解释 The very short explanation 6 00:00:15,510 --> 00:00:17,310 是我们所有的 Tensorflow 模型 is that all of our Tensorflow models 7 00:00:17,310 --> 00:00:19,470 也是 Keras 模型对象, are also Keras model objects, 8 00:00:19,470 --> 00:00:22,950 因此他们拥有标准的 Keras 模型 API。 and so they have the standard Keras model API. 9 00:00:22,950 --> 00:00:24,960 如果你是一位经验丰富的机器学习工程师 If you're an experienced machine learning engineer 10 00:00:24,960 --> 00:00:28,230 谁经常使用 Keras,这可能就是你需要知道的全部 who's used Keras a lot, that's probably all you need to know 11 00:00:28,230 --> 00:00:29,610 开始与他们合作。 to start working with them. 12 00:00:29,610 --> 00:00:30,900 但对其他人来说, But for everyone else, 13 00:00:30,900 --> 00:00:34,170 包括那里的浪子 PyTorch 工程师 including the prodigal PyTorch engineers out there 14 00:00:34,170 --> 00:00:35,910 正在回归的人, who are returning to the fold, 15 00:00:35,910 --> 00:00:38,430 我将快速介绍 Keras 模型, I'm going to quickly introduce Keras models, 16 00:00:38,430 --> 00:00:40,440 以及我们如何与他们合作。 and how we work with them. 17 00:00:40,440 --> 00:00:43,080 在我将在下面链接的其他视频中, In other videos, which I'll link below, 18 00:00:43,080 --> 00:00:46,440 我将更详细地介绍 Keras 模型的训练。 I'll run through training with Keras models in more detail. 19 00:00:46,440 --> 00:00:50,820 但首先,从高层次上讲,什么是 Keras 模型? But first, at a high level, what is a Keras model? 20 00:00:50,820 --> 00:00:54,810 所以你的模型基本上包含了你的整个网络。 So your model basically contains your entire network. 21 00:00:54,810 --> 00:00:58,230 它包含层,以及这些层的权重, It contains the layers, and the weights for those layers, 22 00:00:58,230 --> 00:01:00,690 并告诉模型如何处理它们 and also tells the model what to do with them 23 00:01:00,690 --> 00:01:02,880 所以它一路定义了整个路径 so it defines the whole path all the way 24 00:01:02,880 --> 00:01:05,460 从你的输入到你的输出。 from your inputs to your outputs. 25 00:01:05,460 --> 00:01:07,380 如果你以前使用过 Keras, If you've used Keras before, 26 00:01:07,380 --> 00:01:09,480 你可能开始使用模型对象 you probably started using model objects 27 00:01:09,480 --> 00:01:11,850 通过手工构建它们, by building them out by hand, 28 00:01:11,850 --> 00:01:14,250 你一层又一层地添加 you added one layer after another 29 00:01:14,250 --> 00:01:18,690 并且可能使用 model.add () 或功能方法。 and maybe using the model.add () or the functional approach. 30 00:01:18,690 --> 00:01:20,490 这并没有错。 And there's nothing wrong with that. 31 00:01:21,390 --> 00:01:23,430 许多伟大的模型都是以这种方式构建的 Lots of great models are built that way 32 00:01:23,430 --> 00:01:26,970 但你也可以预加载整个模型、权重和所有内容。 but you can also pre-load an entire model, weights and all. 33 00:01:26,970 --> 00:01:29,994 这真的很有帮助,因为如果你, And this is really helpful, because if you, 34 00:01:29,994 --> 00:01:32,490 正如你在这里看到的,如果你尝试阅读这篇论文 as you can see here, if you try reading the paper 35 00:01:32,490 --> 00:01:34,110 或者如果你尝试查看代码, or if you try looking at the code, 36 00:01:34,110 --> 00:01:37,350 你会看到 Transformer 的内部非常复杂, you'll see the inside of a Transformer is pretty complex, 37 00:01:37,350 --> 00:01:40,110 从头开始把它写出来并把它做好 and writing it all out from scratch and getting it right 38 00:01:40,110 --> 00:01:41,850 即使对于有经验的人来说也很难 would be hard even for an experienced 39 00:01:41,850 --> 00:01:43,500 机器学习工程师。 machine learning engineer. 40 00:01:43,500 --> 00:01:45,870 但因为它都装在一个模型里, But because it's all packed inside a model, 41 00:01:45,870 --> 00:01:48,150 你不需要担心那个的复杂性 you don't need to worry about that complexity on that 42 00:01:48,150 --> 00:01:49,140 如果你不想。 if you don't want to. 43 00:01:49,140 --> 00:01:51,570 如果你是一名研究人员,如果你想真正深入研究 If you're a researcher, if you want to really dig in there 44 00:01:51,570 --> 00:01:55,650 你可以,但你也可以只加载一个预训练的, you can, but you can also just load a pre-trained, 45 00:01:55,650 --> 00:01:59,013 只需一行代码即可预配置变压器模型。 pre-configured transformer model in just one line of code. 46 00:02:00,150 --> 00:02:03,480 当我之前提到 Keras API 时, And when I mentioned earlier about the Keras API, 47 00:02:03,480 --> 00:02:04,560 它的优点是 the advantage of it is that 48 00:02:04,560 --> 00:02:06,690 你是否从头开始编写自己的模型 whether you write your own model from scratch 49 00:02:06,690 --> 00:02:09,510 或者加载一个预训练的,你与模型交互 or load a pre-trained one, you interact with the model 50 00:02:09,510 --> 00:02:11,850 通过相同的 API,所以你使用 through that same API, so you use exactly 51 00:02:11,850 --> 00:02:13,950 同样的几种方法,你会看到它们 the same few methods and you're gonna see them 52 00:02:13,950 --> 00:02:16,380 一次又一次,这些方法就像适合, again and again, these methods like fit, 53 00:02:16,380 --> 00:02:19,650 编译和预测,就像我提到的 compile and predict, and like I've mentioned 54 00:02:19,650 --> 00:02:22,530 我们将介绍如何使用这些方法的具体示例 we'll cover concrete examples of how to use those methods 55 00:02:22,530 --> 00:02:24,330 在我将在下面链接的视频中。 in the videos I'll link below. 56 00:02:24,330 --> 00:02:27,000 现在要从这个视频中拿走的关键是, For now the key thing to take away from this video, 57 00:02:27,000 --> 00:02:28,950 如果你以前从未见过 Keras, if you've never seen Keras before, 58 00:02:28,950 --> 00:02:30,870 是这种简洁的封装意味着 is that this neat encapsulation means 59 00:02:30,870 --> 00:02:33,090 一个巨大的神经网络的所有复杂性 that all the complexity of a huge neural net 60 00:02:33,090 --> 00:02:35,430 变得易于管理,因为你与它互动 becomes manageable, because you interact with it 61 00:02:35,430 --> 00:02:39,000 以完全相同的方式,使用完全相同的方法, in exactly the same way, using exactly the same methods, 62 00:02:39,000 --> 00:02:41,700 是否是一个庞大的预训练语言模型 whether it's a huge pre-trained language model 63 00:02:41,700 --> 00:02:43,950 或你手写的简单模型。 or a simple model that you wrote out by hand. 64 00:02:45,466 --> 00:02:48,049 (欢快的音乐) (upbeat music)