sdk/python/foundation-models/system/distillation/summarization/distillation

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Distillation Summarization with Large Language Models\n", " \n", "### Notebook details\n", " \n", "This sample demonstrates how to train the selected student model using the teacher model, resulting in the creation of the distilled model.\n", " \n", "We will use the Meta Llama 3.1 405B Instruct as the teacher model and the Meta Llama 3.1 8B Instruct as the student model.\n", " \n", "**Note :**\n", " \n", "- Distillation should only be used for single turn chat completion format as shown below\n", " ```json\n", " {\"messages\": [\n", " {\"role\": \"system\", \"content\": \"Instructions for summarization\"},\n", " {\"role\": \"user\", \"content\": \"Text to summarize\"} \n", " ]}\n", " ```\n", "- The Meta Llama 3.1 405B Instruct model can only be used as a teacher model.\n", "- Distillation of a Meta Llama 3.1 8B Instruct student (target) model is only available in **West US 3** regions.\n", "- Distillation of Phi3 or Phi3.5 student (target) models is only available in **East US 2** regions.\n", "\n", "**Prerequisites :**\n", "- Subscribe to the Meta Llama 3.1 405B Instruct and Meta Llama 3.1 8B Instruct, see [how to subscribe your project to the model offering in MS Learn](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-serverless?tabs=azure-ai-studio#subscribe-your-project-to-the-model-offering)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1. Connect to Azure Machine Learning Workspace\n", "\n", "The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.\n", "\n", "## 1.1. 
Install the SDK v2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "powershell" } }, "outputs": [], "source": [ "%pip install azure-ai-ml\n", "%pip install azure-identity\n", "%pip install azure-core\n", "%pip install azure-ai-inference\n", "\n", "%pip install mlflow\n", "%pip install azureml-mlflow\n", "%pip install datasets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.2. Import the required libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# import required libraries\n", "\n", "import json\n", "import uuid\n", "\n", "from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential\n", "\n", "from azure.ai.inference import ChatCompletionsClient\n", "from azure.ai.inference.models import SystemMessage, UserMessage\n", "from azure.ai.ml import MLClient, Input, Output\n", "from azure.ai.ml.constants import AssetTypes, DataGenerationTaskType, DataGenerationType\n", "from azure.ai.ml.model_customization import (\n", " distillation,\n", " EndpointRequestSettings,\n", " PromptSettings,\n", ")\n", "from azure.ai.ml.entities import Data, ServerlessConnection, ServerlessEndpoint\n", "from azure.core.credentials import AzureKeyCredential" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.3. Configure workspace details and get a handle to the workspace\n", "\n", "To connect to a workspace, we need identifier parameters - a subscription, resource group and workspace name. We will use these details in the `MLClient` from `azure.ai.ml` to get a handle to the required workspace. We use the [default azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. 
Check the [configuration notebook](../../configuration.ipynb) for more details on how to configure credentials and connect to a workspace.\n", "\n", "\n", "### 1.3.1 Prerequisites\n", "\n", "For distillation of a Meta Llama 3.1 8B student model, an Azure AI Foundry project in **West US 3** is required. Follow [this](https://learn.microsoft.com/azure/ai-studio/how-to/fine-tune-model-llama?tabs=llama-two%2Cchatcompletion#prerequisites) document to set up your Azure AI Foundry project.\n", "\n", "If you are using a Phi 3 or Phi 3.5 student model, an Azure AI Foundry project in **East US 2** is required. Follow [this](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/fine-tune-phi-3?tabs=phi-3-mini#prerequisites) document to set up your Azure AI Foundry project.\n", "\n", "### 1.3.2 Azure AI Foundry project settings\n", "\n", "Update the following cell with the details of the Azure AI Foundry project you just created." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SUBSCRIPTION_ID = \"<SUBSCRIPTION_ID>\"\n", "RESOURCE_GROUP = \"<RESOURCE_GROUP>\"\n", "AI_PROJECT_NAME = \"<AI_PROJECT_NAME>\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", " credential = DefaultAzureCredential()\n", " # Check whether the given credential can obtain a token.\n", " credential.get_token(\"https://management.azure.com/.default\")\n", "except Exception as ex:\n", " # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work\n", " credential = InteractiveBrowserCredential()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.3.3 Get handle to Azure AI Foundry project" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml_client = MLClient(credential, SUBSCRIPTION_ID, RESOURCE_GROUP, AI_PROJECT_NAME)\n", "\n", "ai_project = ml_client._workspaces.get(ml_client.workspace_name)\n", 
"ai_project._workspace_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Data\n", "\n", "### 2.1 Download the dataset from HuggingFace repo\n", "\n", "For this task we will use the [griffin/chain_of_density](https://huggingface.co/datasets/griffin/chain_of_density) dataset. This dataset consists of 1000 news articles which we will use to train and test our endpoints.\n", "\n", "We will begin by downloading the dataset and preparing the data in chat completion format. AzureML expects both train and validation datasets for distillation. We will reserve some samples to test the distilled model. Hence we will split the data into 3 parts:\n", "\n", "| Split | Size |\n", "| ---------- | ---- |\n", "| Train | 500 |\n", "| Validation | 400 |\n", "| Test | 100 |" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datasets import load_dataset\n", "\n", "from abc import ABC\n", "\n", "\n", "class InputDataset(ABC):\n", " def __init__(self):\n", " super().__init__()\n", " (\n", " self.train_data_file_name,\n", " self.test_data_file_name,\n", " self.eval_data_file_name,\n", " ) = (None, None, None)\n", "\n", "\n", "class SummarizationHuggingFaceInputDataset(InputDataset):\n", " \"\"\"\n", " Loads the HuggingFace dataset\n", " \"\"\"\n", "\n", " def __init__(self):\n", " super().__init__()\n", "\n", " def load_hf_dataset(\n", " self,\n", " dataset_name,\n", " train_sample_size=10,\n", " val_sample_size=10,\n", " test_sample_size=10,\n", " train_split_name=\"train\",\n", " val_split_name=\"validation\",\n", " test_split_name=\"test\",\n", " ):\n", " full_dataset = load_dataset(dataset_name, \"unannotated\")\n", "\n", " if val_split_name is not None:\n", " train_data = full_dataset[train_split_name].select(range(train_sample_size))\n", " val_data = full_dataset[val_split_name].select(range(val_sample_size))\n", " test_data = full_dataset[test_split_name].select(range(test_sample_size))\n", " else:\n", " 
shared_data = full_dataset[train_split_name]\n", "\n", " train_data = shared_data.select(range(train_sample_size))\n", " val_data = shared_data.select(\n", " range(train_sample_size, train_sample_size + val_sample_size)\n", " )\n", " test_data = shared_data.select(\n", " range(\n", " train_sample_size + val_sample_size,\n", " train_sample_size + val_sample_size + test_sample_size,\n", " )\n", " )\n", "\n", " return train_data, val_data, test_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.2 Partition Data\n", "Some datasets do not have a validation split. When `val_split_name` is `None`, the validation samples are taken from the training split instead: here the first 500 rows are used for training and the next 400 for validation. \n", "\n", "**NOTE:** For math distillation, training and validation must have at least 40 valid entries." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_sample_size = 500\n", "val_sample_size = 400\n", "\n", "# Sample notebook using the dataset: https://huggingface.co/datasets/griffin/chain_of_density\n", "dataset_name = \"griffin/chain_of_density\"\n", "input_dataset = SummarizationHuggingFaceInputDataset()\n", "\n", "# Note: train_split_name and test_split_name can vary by dataset. They are passed as arguments in load_hf_dataset.\n", "# If val_split_name is None, the function below takes the validation set of the specified size from the train split.\n", "train, val, _ = input_dataset.load_hf_dataset(\n", " dataset_name=dataset_name,\n", " train_sample_size=train_sample_size,\n", " val_sample_size=val_sample_size,\n", " train_split_name=\"train\",\n", " val_split_name=None,\n", ")\n", "\n", "print(\"Len of train data sample is \" + str(len(train)))\n", "print(\"Len of validation data sample is \" + str(len(val)))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! 
mkdir -p data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.3 Prepare data to submit for inferencing\n", "The data has now been downloaded and processed in the case that only training data was available and not validation data. In this section we will format the downloaded data to match what is expected in an inferencing request. We will also add a system prompt to instruct the teacher model what kind of labels to generate." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_data_path = \"data/train_griffin.jsonl\"\n", "valid_data_path = \"data/valid_griffin.jsonl\"\n", "\n", "SYSTEM_PROMPT = \"You will generate concise, entity-dense summary of the given article. Only generate the summary text. Do not exceed 80 words.\"\n", "user_prompt_template = \"Article: {article}\"\n", "\n", "for row in train:\n", " data = {\n", " \"messages\": [\n", " {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n", " {\n", " \"role\": \"user\",\n", " \"content\": user_prompt_template.format(article=row[\"article\"]),\n", " },\n", " ]\n", " }\n", "\n", " with open(train_data_path, \"a\") as f:\n", " f.write(json.dumps(data) + \"\\n\")\n", "\n", "for row in val:\n", " data = {\n", " \"messages\": [\n", " {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n", " {\n", " \"role\": \"user\",\n", " \"content\": user_prompt_template.format(article=row[\"article\"]),\n", " },\n", " ]\n", " }\n", "\n", " with open(valid_data_path, \"a\") as f:\n", " f.write(json.dumps(data) + \"\\n\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.4 Create Data Input" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Training data defined locally, with local data to be uploaded\n", "train_data = Input(type=AssetTypes.URI_FILE, path=train_data_path)\n", "\n", "# If training data was registered to workspace already, navigate to the Data tab, select the data to use and 
use the 'Named asset URI'\n", "# Example of the format is seen below\n", "# train_data = \"azureml:summarize_train_griffin:1\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Validation data defined locally, with local data to be uploaded\n", "valid_data = Input(type=AssetTypes.URI_FILE, path=valid_data_path)\n", "\n", "# If validation data was registered to workspace already, navigate to the Data tab, select the data to use and use the 'Named asset URI'\n", "# valid_data = \"azureml:summarize_valid_griffin:1\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Configure and Run the Distillation Job\n", "In this section we will configure and run a Distillation job.\n", "\n", "### 3.1 Configure the job through the distillation() factory function\n", "\n", "#### distillation() parameters:\n", "\n", "The `distillation()` factory function lets you configure Distillation for the label generation task in the most common scenarios, using the following properties.\n", "\n", "- `experiment_name` - The name of the Experiment. An Experiment is like a folder in the Azure ML Workspace that groups multiple runs belonging to the same logical machine learning experiment.\n", "- `data_generation_type` - The type of data generation to perform. The only valid option is 'label_generation'.\n", "- `data_generation_task_type` - The kind of data to generate. Valid options include 'NLI', 'NLI_QA', 'CONVERSATION', 'MATH', and 'SUMMARIZATION'.\n", "- `teacher_model_endpoint_connection` - A ServerlessConnection to a MaaS endpoint. Requires the name of the endpoint, the endpoint URL, and the API key for the endpoint.\n", "- `student_model` - The student model to train with the synthetic data generated from the teacher model.\n", "- `training_data` - The data to be used for training.\n", "- `validation_data` - The data to be used for validation.\n", "- `name` - The name of the Job/Run. 
This is an optional property. If not specified, a random name will be generated.\n", "\n", "\n", "##### Teacher Model Connection\n", "Select the teacher model to use. This requires a serverless (MaaS) endpoint. Supported teacher models:\n", "1. Meta-Llama-3.1-405B-Instruct\n", "\n", "Replace the following strings with your own serverless endpoint information\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "teacher_model_endpoint_name = \"Meta-Llama-3-1-405B-Instruct-vkn\"\n", "teacher_model_endpoint_url = \"https://Meta-Llama-3-1-405B-Instruct-vkn.westus3.models.ai.azure.com/chat/completions\"\n", "teacher_model_api_key = \"EXAMPLE_API_KEY\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Student Model\n", "Select the student model to use. Supported student models:\n", "1. Meta-Llama-3.1-8B-Instruct\n", "2. Phi-3-Mini-4k-Instruct\n", "3. Phi-3-Mini-128k-Instruct\n", "4. Phi-3.5-Mini-Instruct\n", "5. Phi-3.5-MoE-Instruct\n", "6. Phi-3-Medium-4k-Instruct\n", "7. 
Phi-3-Medium-128k-Instruct" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The model id\n", "student_model = (\n", " \"azureml://registries/azureml-meta/models/Meta-Llama-3.1-8B-Instruct/versions/3\"\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "distillation_job = distillation(\n", " experiment_name=\"llama-summarization-distillation\",\n", " data_generation_type=DataGenerationType.LABEL_GENERATION,\n", " data_generation_task_type=DataGenerationTaskType.SUMMARIZATION,\n", " teacher_model_endpoint_connection=ServerlessConnection(\n", " name=teacher_model_endpoint_name,\n", " endpoint=teacher_model_endpoint_url,\n", " api_key=teacher_model_api_key,\n", " ),\n", " student_model=student_model,\n", " training_data=train_data,\n", " validation_data=valid_data,\n", " outputs={\n", " \"registered_model\": Output(\n", " type=\"mlflow_model\", name=\"llama-summarization-distilled\"\n", " )\n", " },\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2 Configure the distillation settings\n", "\n", "#### set_teacher_model_settings() function parameters:\n", "This is an optional configuration method to configure the settings inference requests will have when submitted to the teacher model endpoint. \n", " \n", "- `inference_parameters` - Inference parameters that are applied to inferencing requests. These inference parameters are aligned with parameters allowed by vllm. Currently, the inference parameters that are used by distillation are 'max_tokens', 'temperature', 'top_p', 'frequency_penalty', 'presence_penalty', and 'stop'.\n", "\n", "- `endpoint_request_settings` - An EndpointRequestSettings object that adds settings for the inferencing requests sent to the endpoint. 
Valid endpoint settings include 'min_endpoint_success_ratio' and 'request_batch_size'.\n", " - `min_endpoint_success_ratio` - The minimum ratio of successful to total inferencing requests needed for data generation to be considered successful. The job will not proceed if the ratio of successful to total inferencing requests is below this value. Should be between 0 and 1, inclusive. Defaults to 0.7.\n", " - `request_batch_size` - The number of inferencing requests to send at once to the teacher model endpoint. Defaults to 10.\n", "\n", "\n", "#### set_prompt_settings() function parameters:\n", "This is an optional configuration method to configure the settings for the system prompt used for the teacher model.\n", "\n", "- `prompt_settings` - A PromptSettings object whose settings determine which system prompt to use for the teacher model. Valid prompt settings for the `SUMMARIZATION` task include 'enable_chain_of_density' and 'max_len_summary'.\n", " - `enable_chain_of_density` - The option to use Chain of Density (CoD) reasoning for distillation. CoD uses the step-by-step reasoning ability of the teacher model to generate more accurate labels.\n", " - `max_len_summary` - The maximum summary length to generate when 'enable_chain_of_density' is set to True.\n", "\n", "\n", "#### set_finetuning_settings() function parameters:\n", "This is an optional configuration method to configure the settings for finetuning the student model.\n", "\n", "- `hyperparameters` - The hyperparameters to use for finetuning." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional settings to use for inferencing requests\n", "distillation_job.set_teacher_model_settings(\n", " inference_parameters={\"max_tokens\": 200, \"temperature\": 0.8},\n", " endpoint_request_settings=EndpointRequestSettings(\n", " min_endpoint_success_ratio=0.7, request_batch_size=10\n", " ),\n", ")\n", "\n", "# Optional settings to use for the system prompt\n", "distillation_job.set_prompt_settings(\n", " prompt_settings=PromptSettings(enable_chain_of_density=True, max_len_summary=80)\n", ")\n", "\n", "# Optional settings to use for finetuning the student model\n", "distillation_job.set_finetuning_settings(\n", " hyperparameters={\n", " \"learning_rate\": \"0.00002\",\n", " \"per_device_train_batch_size\": \"1\",\n", " \"num_train_epochs\": \"3\",\n", " }\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.3 Submit the Job\n", "Using the `MLClient` created earlier, we will now submit the distillation job to the workspace." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "created_job = ml_client.jobs.create_or_update(distillation_job)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Wait Until the Distillation Job Finishes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(created_job.name)\n", "ml_client.jobs.stream(created_job.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. 
Distilled Model Consumption\n", "With the model distilled, we can now consume the model by creating an endpoint and sending inference requests.\n", "\n", "### 4.1 Create a Serverless Endpoint\n", "We first deploy the distilled model as a serverless endpoint (MaaS endpoint)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Read the registered model name from the job properties (available once the job has completed)\n", "registered_model_name = ml_client.jobs.get(created_job.name).properties[\n", " \"registered_ft_model_name\"\n", "]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Build the asset ID of the latest version of the registered model\n", "rg_model_vs = ml_client.models.get(registered_model_name, label=\"latest\")._version\n", "\n", "rg_model_asset_id = (\n", " \"azureml://locations/\"\n", " f\"{ai_project.location}\"\n", " \"/workspaces/\"\n", " f\"{ai_project._workspace_id}\"\n", " \"/models/\"\n", " f\"{registered_model_name}\"\n", " \"/versions/\"\n", " f\"{rg_model_vs}\"\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create serverless endpoint - names must be unique\n", "guid = uuid.uuid4()\n", "short_guid = str(guid)[:8]\n", "serverless_endpoint_name = \"my-endpoint-\" + short_guid\n", "\n", "serverless_endpoint = ServerlessEndpoint(\n", " name=serverless_endpoint_name,\n", " model_id=rg_model_asset_id,\n", ")\n", "\n", "created_endpoint = ml_client.serverless_endpoints.begin_create_or_update(\n", " serverless_endpoint\n", ").result()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.2. 
Endpoint Inferencing\n", "With the serverless endpoint running, we can now send inference requests to the endpoint.\n", "\n", "**Note:** If the student model selected was Phi-3-medium-4k-instruct or Phi-3-medium-128k-instruct, the following section will not work, because Phi-3-medium models do not accept system prompts. 
Instead skip to section 4.2.1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "url = created_endpoint.scoring_uri\n", "key = ml_client.serverless_endpoints.get_keys(created_endpoint.name).primary_key\n", "model = ChatCompletionsClient(\n", " endpoint=url,\n", " credential=AzureKeyCredential(key),\n", ")\n", "\n", "article = (\n", " \"A hiker was arrested and warned she could face jail after freeing an \"\n", " \"eagle from a trap and springing three more traps to protect other animals. \"\n", " \"Kathleen Adair, 39, was walking her three dogs up Davies Creek Trail in \"\n", " \"Alaska on Christmas Eve when she spotted the bird with each leg shut inside \"\n", " \"traps. She spent an hour freeing the creature before alerting a bird rescue \"\n", " \"firm. Heading home, she also sprung another trap which she spotted in the \"\n", " \"ground - prompting an investigation by Alaska Wildlife Troopers that landed \"\n", " \"her in court. Eventually tracked down by authorities she was charged and \"\n", " \"hauled to court facing a $500 fine and 30 days in jail. Arrested: Kathleen \"\n", " \"Adair, 39, was charged with hindering lawful trapping after snaring three \"\n", " \"traps in Alaska . The eagle was found and euthanized three days after she \"\n", " \"freed it. 'What we expect from the public is if they come upon an eagle in a \"\n", " \"trap, to notify us as soon as possible. That way we can go out there and see \"\n", " \"what's going on,' Alaska Wildlife Trooper Sgt Aaron Frenzel told the station. \"\n", " \"Defending her actions, Adair told the Juneau Empire she is not 'an ecoterrorist \"\n", " \"trying to ruin trappers' livelihood.' 'I grew up hunting and fishing here, I've \"\n", " \"got several animal skins on my walls,' she said. 'I don't personally trap, and I \"\n", " \"don't choose to, I don't want to, but I'm not going to stop someone else from \"\n", " \"doing it. 
I only object when the traps are on the trail where I think they are \"\n", " \"safety concerns.' Speaking to KTOO, she said: 'I knew at the time that the eagle \"\n", " \"didn't have a very good chance. I knew if I left it there all night, it would have \"\n", " \"had a worse chance of surviving. 'But even as it was, I could tell one of the legs \"\n", " \"was just dangling, just completely broken and I knew they wouldn't be able to fix \"\n", " \"that, but I was hoping they could at least fix the other and keep it as an \"\n", " \"educational bird.' 'I wanted to go back and tell the Raptor Center where it was. \"\n", " \"I knew that would be the best thing to do, but I also knew that it would be \"\n", " \"getting dark soon. Saved: The Bald Eagle caught in a leg-hold trap in Juneau on \"\n", " \"Christmas Eve, found and released by Adair . 'It was two miles from the road and it \"\n", " \"was all the way at the end of the road, so I knew that they wouldn't be able to get \"\n", " \"out there that day to it. 'I'm not against trapping per se. I am concerned about the \"\n", " \"traps when they're on the trail in such a way as these were,' Adair said. On Thursday, \"\n", " \"the case was dismissed by a judge who called Adair's work 'admirable'. 'Her actions in \"\n", " \"saving the eagle were laudable,' Juneau District Attorney James Scott said during \"\n", " \"Adair's arraignment on Thursday afternoon. 'She should not have to run the risk of a \"\n", " \"conviction on her record for this offense.' 
'When she's hiking and she comes across an \"\n", " \"eagle in a snare, I encourage her to rescue that eagle again, and I will screen that \"\n", " \"case out as well,' the district attorney added, according to the Empire.\"\n", ")\n", "response = model.complete(\n", " messages=[\n", " SystemMessage(content=SYSTEM_PROMPT),\n", " UserMessage(content=user_prompt_template.format(article=article)),\n", " ],\n", ")\n", "\n", "print(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.2.1 Endpoint Inferencing (Phi-3-medium models)\n", "With the serverless endpoint running, we now can send inference requests to the endpoint\n", "\n", "**Note:** Skip this section if the student model was **NOT** a Phi-3-medium model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "url = created_endpoint.scoring_uri\n", "key = ml_client.serverless_endpoints.get_keys(created_endpoint.name).primary_key\n", "model = ChatCompletionsClient(\n", " endpoint=url,\n", " credential=AzureKeyCredential(key),\n", ")\n", "\n", "article = (\n", " \"A hiker was arrested and warned she could face jail after freeing an \"\n", " \"eagle from a trap and springing three more traps to protect other animals. \"\n", " \"Kathleen Adair, 39, was walking her three dogs up Davies Creek Trail in \"\n", " \"Alaska on Christmas Eve when she spotted the bird with each leg shut inside \"\n", " \"traps. She spent an hour freeing the creature before alerting a bird rescue \"\n", " \"firm. Heading home, she also sprung another trap which she spotted in the \"\n", " \"ground - prompting an investigation by Alaska Wildlife Troopers that landed \"\n", " \"her in court. Eventually tracked down by authorities she was charged and \"\n", " \"hauled to court facing a $500 fine and 30 days in jail. Arrested: Kathleen \"\n", " \"Adair, 39, was charged with hindering lawful trapping after snaring three \"\n", " \"traps in Alaska . 
The eagle was found and euthanized three days after she \"\n", " \"freed it. 'What we expect from the public is if they come upon an eagle in a \"\n", " \"trap, to notify us as soon as possible. That way we can go out there and see \"\n", " \"what's going on,' Alaska Wildlife Trooper Sgt Aaron Frenzel told the station. \"\n", " \"Defending her actions, Adair told the Juneau Empire she is not 'an ecoterrorist \"\n", " \"trying to ruin trappers' livelihood.' 'I grew up hunting and fishing here, I've \"\n", " \"got several animal skins on my walls,' she said. 'I don't personally trap, and I \"\n", " \"don't choose to, I don't want to, but I'm not going to stop someone else from \"\n", " \"doing it. I only object when the traps are on the trail where I think they are \"\n", " \"safety concerns.' Speaking to KTOO, she said: 'I knew at the time that the eagle \"\n", " \"didn't have a very good chance. I knew if I left it there all night, it would have \"\n", " \"had a worse chance of surviving. 'But even as it was, I could tell one of the legs \"\n", " \"was just dangling, just completely broken and I knew they wouldn't be able to fix \"\n", " \"that, but I was hoping they could at least fix the other and keep it as an \"\n", " \"educational bird.' 'I wanted to go back and tell the Raptor Center where it was. \"\n", " \"I knew that would be the best thing to do, but I also knew that it would be \"\n", " \"getting dark soon. Saved: The Bald Eagle caught in a leg-hold trap in Juneau on \"\n", " \"Christmas Eve, found and released by Adair . 'It was two miles from the road and it \"\n", " \"was all the way at the end of the road, so I knew that they wouldn't be able to get \"\n", " \"out there that day to it. 'I'm not against trapping per se. I am concerned about the \"\n", " \"traps when they're on the trail in such a way as these were,' Adair said. On Thursday, \"\n", " \"the case was dismissed by a judge who called Adair's work 'admirable'. 
'Her actions in \"\n", " \"saving the eagle were laudable,' Juneau District Attorney James Scott said during \"\n", " \"Adair's arraignment on Thursday afternoon. 'She should not have to run the risk of a \"\n", " \"conviction on her record for this offense.' 'When she's hiking and she comes across an \"\n", " \"eagle in a snare, I encourage her to rescue that eagle again, and I will screen that \"\n", " \"case out as well,' the district attorney added, according to the Empire.\"\n", ")\n", "response = model.complete(\n", " messages=[\n", " UserMessage(\n", " content=SYSTEM_PROMPT + \" \" + user_prompt_template.format(article=article)\n", " ),\n", " ],\n", ")\n", "\n", "print(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Cleanup Endpoints\n", "\n", "Endpoint deployments are billable and incur costs on the subscription. Optionally, clean up the endpoints once you have finished experimenting." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "_ = ml_client.serverless_endpoints.begin_delete(teacher_model_endpoint_name).result()\n", "_ = ml_client.serverless_endpoints.begin_delete(serverless_endpoint_name).result()" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 2 }
