pai-python-sdk/huggingface/huggingface_bert/huggingface_bert.ipynb

{ "cells": [
 { "attachments": {}, "cell_type": "markdown", "id": "bb57c39e-16f6-4f84-b071-7751bd01b4c4", "metadata": { "ExecutionIndicator": { "show": true }, "tags": [] }, "source": [ "# Deploy and Fine-tune a HuggingFace BERT Model\n", "\n", "[HuggingFace](https://huggingface.co/) is an open AI community platform where users can share their AI projects, datasets, and models. It also provides a range of machine-learning tools, including `transformers`, `diffusers`, and `accelerate`. Through the HuggingFace community, users can easily build and train their own models and apply them to a variety of real-world scenarios.\n", "\n", "In this document, we use the HuggingFace [BERT base (uncased)](https://huggingface.co/bert-base-uncased) pretrained model as an example to show how to fine-tune and deploy a BERT model on PAI. The main steps are:\n", "\n", "1. Install and configure the SDK\n", "\n", "Install the required SDK and complete the PAI Python SDK configuration.\n", "\n", "2. Deploy the BERT model directly as an inference service\n", "\n", "Deploy the BERT model from HuggingFace directly to PAI-EAS to create an online inference service.\n", "\n", "3. Fine-tune the BERT model\n", "\n", "Fine-tune the BERT model on a public dataset to obtain a sentiment-classification model, then deploy the resulting model to PAI-EAS as an online inference service.\n" ] },
 { "cell_type": "markdown", "id": "5f5181b4", "metadata": {}, "source": [ "\n", "## Billing\n", "\n", "This example uses the following cloud services and will incur corresponding charges:\n", "\n", "- PAI-DLC: runs the training job; for details see [PAI-DLC billing](https://help.aliyun.com/zh/pai/product-overview/billing-of-dlc)\n", "- PAI-EAS: hosts the inference service; for details see [PAI-EAS billing](https://help.aliyun.com/zh/pai/product-overview/billing-of-eas)\n", "- OSS: stores the trained model, training code, TensorBoard logs, etc.; for details see [OSS billing overview](https://help.aliyun.com/zh/oss/product-overview/billing-overview)\n", "\n", "\n", "> By joining the free trial of these cloud products and submitting training jobs or deploying inference services on the **designated instance types**, you can try PAI for free; see [PAI free trial](https://help.aliyun.com/zh/pai/product-overview/free-quota-for-new-users).\n", "\n" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "73692608-6d3f-4551-9eeb-e169bfa93799", "metadata": {}, "source": [ "## Step 1: Install and configure the SDK\n", "\n", "We will use the PAI Python SDK to submit the training job and deploy the model. Install the PAI Python SDK and the other required dependencies, such as HuggingFace `datasets`, with the following commands." ] },
 { "cell_type": "code", "execution_count": null, "id": "c09a58a3-7cf9-43ac-b386-3bafffbf6321", "metadata": { "ExecutionIndicator": { "show": true }, "tags": [] }, "outputs": [], "source": [ "!python -m pip install --upgrade pai" ] },
 { "cell_type": "code", "execution_count": null, "id": "3bee87d5", "metadata": {}, "outputs": [], "source": [ "\n", "!python -m pip install datasets huggingface_hub" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "5212ae4f-cb05-45a0-82f1-3d1cd89be38b", "metadata": {}, "source": [ "\n", "The SDK needs the AccessKey used to access Alibaba Cloud services, as well as the workspace and OSS bucket to use. After installing the PAI Python SDK, run the following command in a **terminal** and follow the prompts to configure the credentials, workspace, and other settings.\n", "\n", "\n", "```shell\n", "\n", "# Run the following command in a terminal.\n", "\n", "python -m pai.toolkit.config\n", "\n", "```\n", "\n", "We can verify the configuration by running the following code." ] },
 { "cell_type": "code", "execution_count": null, "id": "55bcb9aa-58ee-47a0-9656-446c5bf67845", "metadata": { "ExecutionIndicator": { "show": true }, "tags": [] }, "outputs": [], "source": [ "import pai\n", "from pai.session import get_default_session\n", "\n", "print(pai.__version__)\n", "sess = get_default_session()\n", "\n", "assert sess.workspace_name is not None" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "5952d3b9", "metadata": {}, "source": [ "## Step 2: Deploy the BERT model as an inference service\n", "\n", "\n", "[PAI-EAS](https://www.aliyun.com/activity/bigdata/pai/eas) (Elastic Algorithm Service) is PAI's online model serving platform. It supports image-based deployment and provides inference images for common machine-learning frameworks. In the following example, we use an image provided by PAI-EAS to deploy the BERT model from HuggingFace directly to PAI and create an online inference service.\n", "\n", "[BERT](https://arxiv.org/abs/1810.04805) is a pretrained language model proposed by Google, trained on a large English corpus with self-supervised learning. It can be used directly for \"fill-mask\" tasks, or as a pretrained backbone that is fine-tuned for downstream tasks such as classification and question answering. The following code downloads the BERT model provided by HuggingFace, which we will use to create a fill-mask inference service.\n", "\n", "> For how to save and use HuggingFace models in offline mode, see the official documentation: [HuggingFace Offline Mode](https://huggingface.co/docs/transformers/installation#fetch-models-and-tokenizers-to-use-offline)" ] },
 { "cell_type": "code", "execution_count": null, "id": "25ae41ce", "metadata": {}, "outputs": [], "source": [ "from huggingface_hub import snapshot_download\n", "\n", "\n", "# Download the BERT model (PyTorch weights)\n", "model_dir = snapshot_download(\n", "    repo_id=\"bert-base-uncased\",\n", "    local_dir=\"./bert\",\n", "    allow_patterns=[\n", "        \"config.json\",\n", "        \"pytorch_model.bin\",\n", "        \"vocab.txt\",\n", "        \"tokenizer_config.json\",\n", "        \"tokenizer.json\",\n", "    ],\n", ")" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "bf489d6a", "metadata": {}, "source": [ "You can also save the model as follows (this requires `transformers`, `torch`, and related dependencies to be installed locally):\n", "\n", "```python\n", "\n", "from transformers import BertTokenizer, BertModel\n", "\n", "# Download the model\n", "tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')\n", "model = BertModel.from_pretrained(\"bert-base-uncased\")\n", "\n", "# Save the model to a local path\n", "model_dir = \"./bert/\"\n", "model.save_pretrained(model_dir)\n", "tokenizer.save_pretrained(model_dir)\n", "\n", "```\n", "\n", "The saved model can be loaded directly with the `transformers` library:\n", "\n", "```python\n", "\n", "from transformers import BertTokenizer, BertModel\n", "\n", "model = BertModel.from_pretrained(\"./bert/\")\n", "tokenizer = BertTokenizer.from_pretrained(\"./bert/\")\n", "\n", "```\n" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "14acee39", "metadata": {}, "source": [ "Upload the locally saved BERT model and tokenizer to an OSS bucket and obtain the model's OSS path." ] },
 { "cell_type": "code", "execution_count": null, "id": "25debdca", "metadata": {}, "outputs": [], "source": [ "from pai.common.oss_utils import upload\n", "\n", "# Upload the model\n", "bert_model_uri = upload(\n", "    source_path=model_dir, oss_path=\"huggingface/model/bert/\", bucket=sess.oss_bucket\n", ")\n", "print(bert_model_uri)" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "088cf89b", "metadata": {}, "source": [ "\n", "Before deploying the model, we need to prepare the inference service code that loads the model and serves HTTP requests. In the following example, we use [FastAPI](https://fastapi.tiangolo.com/) to write a simple HTTP server that loads the model and serves predictions." ] },
 { "cell_type": "code", "execution_count": null, "id": "a524a388", "metadata": {}, "outputs": [], "source": [ "# Code used by the inference service\n", "!mkdir -p serving_src" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "b804c39c", "metadata": {}, "source": [ "The complete inference service code is as follows:" ] },
 { "cell_type": "code", "execution_count": null, "id": "d8cd70ff", "metadata": {}, "outputs": [], "source": [ "%%writefile serving_src/run.py\n", "\n", "import os\n", "import logging\n", "\n", "import uvicorn\n", "from fastapi import FastAPI, Request\n", "from transformers import pipeline\n", "\n", "# The model specified by the user is mounted to this path by default\n", "MODEL_PATH = \"/eas/workspace/model/\"\n", "\n", "logging.basicConfig(level=logging.INFO)\n", "logger = logging.getLogger(\"model_server\")\n", "\n", "app = FastAPI()\n", "\n", "@app.post(\"/\")\n", "async def predict(request: Request):\n", "    global bert_pipeline\n", "    json_data = await request.json()\n", "    logger.info(\"Input data: %s\", json_data)\n", "    result = bert_pipeline(json_data[\"text\"])\n", "    logger.info(\"Prediction result: %s\", result)\n", "    return result\n", "\n", "\n", "if __name__ == '__main__':\n", "    task = os.environ.get(\"HF_TASK\", \"fill-mask\")\n", "    bert_pipeline = pipeline(task=task, model=MODEL_PATH, tokenizer=MODEL_PATH)\n", "\n", "    uvicorn.run(app, host='0.0.0.0', port=int(os.environ.get(\"LISTENING_PORT\", 8000)))" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "77a3568b", "metadata": {}, "source": [ "`pai.model.InferenceSpec`, provided by the SDK, describes how the model is loaded and how predictions are served. In the following code, we use the `pai.model.container_serving_spec` method to create an `InferenceSpec` object from a PAI inference image and the local code in `serving_src`. The local code is uploaded to the user's OSS bucket and then mounted into the serving container." ] },
 { "cell_type": "code", "execution_count": null, "id": "ab6c3828", "metadata": {}, "outputs": [], "source": [ "from pai.model import Model, container_serving_spec\n", "from pai.image import retrieve, ImageScope\n", "\n", "\n", "# Use the PyTorch CPU inference image provided by PAI\n", "image_uri = retrieve(\n", "    \"PyTorch\",\n", "    framework_version=\"latest\",\n", "    accelerator_type=\"CPU\",\n", "    image_scope=ImageScope.INFERENCE,\n", ").image_uri\n", "print(image_uri)\n", "\n", "\n", "# Build an image-based InferenceSpec that deploys the BERT model as an inference service.\n", "bert_inference_spec = container_serving_spec(\n", "    # Startup command of the model service\n", "    command=\"python run.py\",\n", "    # Code the model service depends on\n", "    source_dir=\"./serving_src\",\n", "    image_uri=image_uri,\n", "    requirements=[\n", "        \"transformers\",\n", "        \"fastapi\",\n", "        \"uvicorn\",\n", "        # Required when the inference pipeline uses device_map=\"auto\"\n", "        \"accelerate\",\n", "    ],\n", ")\n", "\n", "print(bert_inference_spec.to_dict())" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "debba5d4", "metadata": {}, "source": [ "### Deploy the model\n", "\n", "By building a Model and calling `Model.deploy`, we can deploy the model to PAI-EAS and create an online service.\n", "\n", "For details on deploying models with the SDK, see: [Deploy inference services with the PAI Python SDK](https://help.aliyun.com/document_detail/2261532.html)" ] },
 { "cell_type": "code", "execution_count": null, "id": "60b07fb7", "metadata": {}, "outputs": [], "source": [ "from pai.model import Model\n", "from pai.common.utils import random_str\n", "\n", "m = Model(\n", "    inference_spec=bert_inference_spec,\n", "    model_data=bert_model_uri,\n", ")\n", "\n", "p = m.deploy(\n", "    service_name=\"hf_bert_serving_{}\".format(random_str(6)),  # Name of the inference service.\n", "    instance_type=\"ecs.c6.xlarge\",  # Instance type used by the service: 4 vCPU, 8 GB\n", ")" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "5e64254b", "metadata": {}, "source": [ "The `deploy` method returns a Predictor object pointing to the newly created inference service. Its `.predict` method sends prediction requests to the service." ] },
 { "cell_type": "code", "execution_count": null, "id": "2df66b1f", "metadata": {}, "outputs": [], "source": [ "res = p.predict(data={\"text\": \"Hello, I'm a [MASK] model.\"})\n", "\n", "print(res)" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "57f86644", "metadata": {}, "source": [ "After testing, we can delete the inference service with `predictor.delete_service` to release the resources." ] },
 { "cell_type": "code", "execution_count": null, "id": "d78a2587", "metadata": {}, "outputs": [], "source": [ "# Delete the service after testing is complete\n", "\n", "p.delete_service()" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "66b601b9-030e-49c1-8534-b6a53ea7903d", "metadata": {}, "source": [ "## Step 3: Fine-tune the pretrained BERT model\n", "\n", "[BERT](https://arxiv.org/abs/1810.04805) is trained on a large English corpus with self-supervised learning. It learns an internal representation of the English language that can be fine-tuned for different downstream tasks to achieve better performance. In this example, we fine-tune BERT on the Yelp English review dataset [yelp_review_full](https://huggingface.co/datasets/yelp_review_full) from HuggingFace to obtain a sentiment-classification model.\n", "\n", "\n", "### Prepare the model and dataset\n", "\n", "In this step, we prepare the dataset used for fine-tuning and upload it to OSS for the training job.\n", "\n", "> With the HuggingFace `transformers` and `datasets` libraries, models and data can be read from local files (offline mode) or downloaded from the HuggingFace Hub. To speed up the training job, in this example we stage the model and dataset in OSS and mount them into the training environment so the job can load them directly.\n", "\n" ] },
 { "cell_type": "code", "execution_count": null, "id": "f760ddcb", "metadata": {}, "outputs": [], "source": [ "from datasets import load_dataset\n", "from pai.common.oss_utils import upload\n", "\n", "data_path = \"./train_data\"\n", "\n", "# Load the dataset from the HuggingFace Hub\n", "dataset = load_dataset(\"yelp_review_full\")\n", "\n", "# Save the dataset locally; it can later be loaded with `datasets.load_from_disk`\n", "dataset.save_to_disk(data_path)\n", "\n", "train_data_uri = upload(\n", "    source_path=data_path,\n", "    oss_path=\"huggingface/dataset/yelp_review_full/\",\n", "    bucket=sess.oss_bucket,\n", ")\n", "\n", "print(train_data_uri)" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "5e518592", "metadata": {}, "source": [ "\n", "### Prepare the training code\n", "Following HuggingFace's [fine-tuning documentation for masked language models](https://huggingface.co/course/chapter7/3?fw=tf), we wrote the following training script, which fine-tunes the model on the dataset we uploaded." ] },
 { "cell_type": "code", "execution_count": null, "id": "a2534222", "metadata": {}, "outputs": [], "source": [ "\n", "# Create the directory for the training code\n", "!mkdir -p train_src" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "0f2a7761", "metadata": {}, "source": [ "\n", "The training script reads the job hyperparameters, input data, and output model path from environment variables. For details on the environment variables provided by the PAI training service, see: [Preset environment variables for training jobs](https://help.aliyun.com/document_detail/2261505.html)\n", "\n", "The complete training code is as follows:\n" ] },
 { "cell_type": "code", "execution_count": null, "id": "26c62bb8-963e-4ffe-8843-715482896cd3", "metadata": { "ExecutionIndicator": { "show": true }, "tags": [] }, "outputs": [], "source": [ "%%writefile train_src/finetune.py\n", "\n", "import os\n", "\n", "from datasets import load_from_disk\n", "from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer, DataCollatorWithPadding\n", "import numpy as np\n", "import evaluate\n", "\n", "\n", "# Load the metric at module level so that compute_metrics can reference it\n", "metric = evaluate.load(\"accuracy\")\n", "\n", "\n", "def compute_metrics(eval_pred):\n", "    logits, labels = eval_pred\n", "    predictions = np.argmax(logits, axis=-1)\n", "    return metric.compute(predictions=predictions, references=labels)\n", "\n", "\n", "def train():\n", "    # Read the pretrained model path, training data path, and model output path from environment variables\n", "    model_name_or_path = os.environ.get(\"PAI_INPUT_MODEL\", \"bert-base-cased\")\n", "    input_train_data = os.environ.get(\"PAI_INPUT_TRAIN_DATA\")\n", "    output_dir = os.environ.get(\"PAI_OUTPUT_MODEL\", \"./output\")\n", "\n", "    # Read the job hyperparameters from environment variables\n", "    num_train_epochs = int(os.environ.get(\"PAI_HPS_EPOCHS\", 2))\n", "    save_strategy = os.environ.get(\"PAI_HPS_SAVE_STRATEGY\", \"epoch\")\n", "\n", "    print(\"Loading Model...\")\n", "    model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path, num_labels=5)\n", "    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)\n", "\n", "    print(\"Loading dataset from disk...\")\n", "    dataset = load_from_disk(input_train_data)\n", "    tokenized_datasets = dataset.map(\n", "        lambda examples: tokenizer(examples[\"text\"], padding=\"max_length\", truncation=True, max_length=512),\n", "        batched=True,\n", "    )\n", "\n", "    data_collator = DataCollatorWithPadding(tokenizer)\n", "    small_train_dataset = tokenized_datasets[\"train\"].shuffle(seed=42).select(range(1000))\n", "    small_eval_dataset = tokenized_datasets[\"test\"].shuffle(seed=42).select(range(1000))\n", "\n", "    training_args = TrainingArguments(\n", "        output_dir=output_dir,\n", "        # Hyperparameters read from environment variables above\n", "        num_train_epochs=num_train_epochs,\n", "        save_strategy=save_strategy,\n", "    )\n", "    print(\"TrainingArguments: {}\".format(training_args.to_json_string()))\n", "\n", "    print(\"Training...\")\n", "    trainer = Trainer(\n", "        model=model,\n", "        args=training_args,\n", "        train_dataset=small_train_dataset,\n", "        eval_dataset=small_eval_dataset,\n", "        data_collator=data_collator,\n", "        tokenizer=tokenizer,\n", "        compute_metrics=compute_metrics,\n", "    )\n", "\n", "    trainer.train()\n", "    print(\"Saving Model...\")\n", "    trainer.save_model()\n", "\n", "\n", "if __name__ == \"__main__\":\n", "    train()\n" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "6a603b63", "metadata": {}, "source": [ "Our training job runs in a PyTorch image provided by PAI, and the script requires the `transformers` and `evaluate` libraries to be installed in the image. By providing a `requirements.txt` file in the training code directory, the PAI training service automatically installs the listed third-party dependencies." ] },
 { "cell_type": "code", "execution_count": null, "id": "cfab739e", "metadata": {}, "outputs": [], "source": [ "%%writefile train_src/requirements.txt\n", "\n", "transformers\n", "datasets\n", "evaluate\n" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "b478975d-17bd-4f81-93b8-e3dd32b6b7f1", "metadata": {}, "source": [ "### Submit the training job\n", "\n", "With the training job API `pai.estimator.Estimator` provided by the PAI Python SDK, we can submit the training script to run on PAI. In the following code, we specify the training code in `train_src`, use a PyTorch GPU training image provided by PAI, and submit the fine-tuning job. For details on submitting training jobs with the SDK, see: [Submit training jobs with the PAI Python SDK](https://help.aliyun.com/document_detail/2261505.html)." ] },
 { "cell_type": "code", "execution_count": null, "id": "dda481a0-7c85-4b49-b3c5-cc30ca5d3a8c", "metadata": { "ExecutionIndicator": { "show": false }, "tags": [] }, "outputs": [], "source": [ "from pai.huggingface.estimator import HuggingFaceEstimator\n", "from pai.image import retrieve\n", "\n", "\n", "# PyTorch GPU training image provided by PAI (used by the commented-out Estimator example below)\n", "image_uri = retrieve(\n", "    \"PyTorch\", framework_version=\"latest\", accelerator_type=\"GPU\"\n", ").image_uri\n", "\n", "\n", "# Configure the training job\n", "est = HuggingFaceEstimator(\n", "    command=\"python finetune.py\",  # Startup command of the training job\n", "    source_dir=\"./train_src/\",  # Training code directory\n", "    instance_type=\"ecs.gn6i-c4g1.xlarge\",  # Instance type: 4 vCPU, 15 GB, 1 * T4 GPU\n", "    transformers_version=\"latest\",\n", "    hyperparameters={  # Job hyperparameters, passed to the script via PAI_HPS_{NAME} environment variables\n", "        \"save_strategy\": \"epoch\",\n", "        \"epochs\": \"1\",\n", "    },\n", "    base_job_name=\"hf-bert-training\",\n", ")\n", "\n", "\n", "# est = Estimator(\n", "#     image_uri=image_uri,  # Image used by the training job\n", "#     command=\"python finetune.py\",  # Startup command of the training job\n", "#     source_dir=\"./train_src/\",  # Training code directory\n", "#     instance_type=\"ecs.gn6i-c4g1.xlarge\",  # Instance type: 4 vCPU, 15 GB, 1 * T4 GPU\n", "#     hyperparameters={  # Job hyperparameters, passed to the script via PAI_HPS_{NAME} environment variables\n", "#         \"save_strategy\": \"epoch\",\n", "#         \"epochs\": \"1\",\n", "#     },\n", "#     base_job_name=\"hf-bert-training\",\n", "# )\n", "\n", "print(est)\n", "print(est.hyperparameters)\n", "\n", "# Submit the training job to PAI\n", "# After submission the SDK prints the job URL, where we can view the training logs,\n", "# the output model, and resource usage on the job detail page\n", "est.fit(\n", "    # The pretrained model and dataset used by the job are passed via inputs\n", "    # The OSS URIs are mounted into the job environment; the mounted paths are available\n", "    # via the PAI_INPUT_{ChannelNameUpperCase} environment variables\n", "    inputs={\n", "        \"model\": bert_model_uri,\n", "        \"train_data\": train_data_uri,\n", "    }\n", ")" ] },
 { "cell_type": "code", "execution_count": null, "id": "3c0354a7", "metadata": {}, "outputs": [], "source": [ "# OSS path of the model produced by the training job\n", "print(est.model_data())" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "27e83a0d-d02c-42f3-bbd2-c71c775fad82", "metadata": { "tags": [] }, "source": [ "### Deploy the fine-tuned model\n", "\n", "We reuse the inference service code above to deploy the fine-tuned model to PAI-EAS as an online inference service.\n", "\n", "> Note: The fine-tuned model is used for sentiment analysis, so we explicitly change the task of the HuggingFace pipeline. Here the task is passed in via an environment variable." ] },
 { "cell_type": "code", "execution_count": null, "id": "73a51721-11b9-4f24-b016-5c704de526b8", "metadata": { "ExecutionIndicator": { "show": false }, "tags": [] }, "outputs": [], "source": [ "from pai.model import Model, container_serving_spec\n", "from pai.image import retrieve, ImageScope\n", "\n", "\n", "# Use the PyTorch CPU inference image provided by PAI\n", "image_uri = retrieve(\n", "    \"PyTorch\",\n", "    framework_version=\"latest\",\n", "    accelerator_type=\"CPU\",\n", "    image_scope=ImageScope.INFERENCE,\n", ").image_uri\n", "\n", "\n", "# Build an image-based InferenceSpec that deploys the fine-tuned BERT model as an inference service.\n", "inference_spec = container_serving_spec(\n", "    # Startup command of the model service\n", "    command=\"python run.py\",\n", "    # Code the model service depends on\n", "    source_dir=\"./serving_src\",\n", "    image_uri=image_uri,\n", "    requirements=[\n", "        \"transformers\",\n", "        \"fastapi\",\n", "        \"uvicorn\",\n", "    ],\n", "    # Use the sentiment-analysis pipeline task, passed to the serving script via an environment variable.\n", "    environment_variables={\"HF_TASK\": \"sentiment-analysis\"},\n", ")\n", "\n", "print(inference_spec.to_dict())" ] },
 { "cell_type": "code", "execution_count": null, "id": "d57a41f3-a4fc-40d8-92d5-ca083f4de2ee", "metadata": { "ExecutionIndicator": { "show": false }, "tags": [] }, "outputs": [], "source": [ "from pai.model import Model\n", "from pai.common.utils import random_str\n", "\n", "# Use the model produced by the training job\n", "model_data = est.model_data()\n", "\n", "m = Model(\n", "    inference_spec=inference_spec,\n", "    model_data=model_data,\n", ")\n", "\n", "p = m.deploy(\n", "    service_name=\"hf_bert_ft_serving_{}\".format(random_str(6)),  # Name of the inference service\n", "    instance_type=\"ecs.c6.xlarge\",  # Instance type used by the service: 4 vCPU, 8 GB\n", ")" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "82e7586d", "metadata": {}, "source": [ "Send prediction requests to the newly created inference service through the Predictor to get model predictions." ] },
 { "cell_type": "code", "execution_count": null, "id": "af0b6e0b", "metadata": {}, "outputs": [], "source": [ "res = p.predict({\"text\": \"i am so happy today\"})\n", "print(res)\n", "\n", "res = p.predict({\"text\": \"i am so sad today\"})\n", "print(res)" ] },
 { "attachments": {}, "cell_type": "markdown", "id": "bc2bdce3", "metadata": {}, "source": [ "After testing, we delete the inference service with `predictor.delete_service` to release the resources." ] },
 { "cell_type": "code", "execution_count": null, "id": "4fdd724e-e557-4461-9cdb-93874a77c49a", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Delete the service after testing is complete\n", "\n", "p.delete_service()" ] }
], "metadata": { "execution": { "timeout": 1800 }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" }, "vscode": { "interpreter": { "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" } } }, "nbformat": 4, "nbformat_minor": 5 }