{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AutoML: Train \"the best\" Image Classification Multi-Class model for a 'Fridge items' dataset.\n",
"\n",
"**Requirements** - In order to benefit from this tutorial, you will need:\n",
"- A basic understanding of Machine Learning\n",
"- An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)\n",
"- An Azure ML workspace. [Check this notebook for creating a workspace](../../../resources/workspace/workspace.ipynb) \n",
"- A Compute Cluster. [Check this notebook to create a compute cluster](../../../resources/compute/compute.ipynb)\n",
"- A python environment\n",
"- Installed Azure Machine Learning Python SDK v2 - [install instructions](../../../README.md) - check the getting started section\n",
"\n",
"**Learning Objectives** - By the end of this tutorial, you should be able to:\n",
"- Connect to your AML workspace from the Python SDK\n",
"- Create an `AutoML Image Classification Multiclass Training Job` with the 'image_classification()' factory-function.\n",
"- Train the model using AmlCompute by submitting/running the AutoML training job\n",
"- Obtaing the model and score predictions with it\n",
"\n",
"**Motivations** - This notebook explains how to setup and run an AutoML image classification-multiclass job. This is one of the nine ML-tasks supported by AutoML. Other ML-tasks are 'forecasting', 'classification', 'image object detection', 'nlp text classification', etc.\n",
"\n",
"In this notebook, we go over how you can use AutoML for training an Image Classification Multi-Class model. We will use a small dataset to train the model, demonstrate how you can tune hyperparameters of the model to optimize model performance and deploy the model to use in inference scenarios. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 1. Connect to Azure Machine Learning Workspace\n",
"\n",
"The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.\n",
"\n",
"## 1.1. Import the required libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1634852261599
},
"name": "automl-import"
},
"outputs": [],
"source": [
"# Import required libraries\n",
"from azure.identity import DefaultAzureCredential\n",
"from azure.ai.ml import MLClient\n",
"\n",
"from azure.ai.ml.automl import SearchSpace, ClassificationPrimaryMetrics\n",
"from azure.ai.ml.sweep import (\n",
" Choice,\n",
" Choice,\n",
" Uniform,\n",
" BanditPolicy,\n",
")\n",
"\n",
"from azure.ai.ml import automl"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1.2. Configure workspace details and get a handle to the workspace\n",
"\n",
"To connect to a workspace, we need identifier parameters - a subscription, resource group and workspace name. We will use these details in the `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the default [default azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../configuration.ipynb) for more details on how to configure credentials and connect to a workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1634852261744
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"name": "mlclient-setup",
"nteract": {
"transient": {
"deleting": false
}
},
"tags": [
"validation-workspace"
]
},
"outputs": [],
"source": [
"credential = DefaultAzureCredential()\n",
"ml_client = None\n",
"try:\n",
" ml_client = MLClient.from_config(credential)\n",
"except Exception as ex:\n",
" print(ex)\n",
" # Enter details of your AML workspace\n",
" subscription_id = \"<SUBSCRIPTION_ID>\"\n",
" resource_group = \"<RESOURCE_GROUP>\"\n",
" workspace = \"<AML_WORKSPACE_NAME>\"\n",
" ml_client = MLClient(credential, subscription_id, resource_group, workspace)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. MLTable with input Training Data\n",
"\n",
"In order to generate models for computer vision tasks with automated machine learning, you need to bring labeled image data as input for model training in the form of an MLTable. You can create an MLTable from labeled training data in JSONL format. If your labeled training data is in a different format (like, pascal VOC or COCO), you can use a conversion script to first convert it to JSONL, and then create an MLTable. Alternatively, you can use Azure Machine Learning's [data labeling tool](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-image-labeling-projects) to manually label images, and export the labeled data to use for training your AutoML model.\n",
"\n",
"In this notebook, we use a toy dataset called Fridge Objects, which consists of 134 images of 4 classes of beverage container {`can`, `carton`, `milk bottle`, `water bottle`} photos taken on different backgrounds.\n",
"\n",
"All images in this notebook are hosted in [this repository](https://github.com/microsoft/computervision-recipes) and are made available under the [MIT license](https://github.com/microsoft/computervision-recipes/blob/master/LICENSE)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.1. Download the Data\n",
"We first download and unzip the data locally. By default, the data would be downloaded in `./data` folder in current directory. \n",
"If you prefer to download the data at a different location, update it in `dataset_parent_dir = ...` in the next cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import urllib\n",
"from zipfile import ZipFile\n",
"\n",
"# Change to a different location if you prefer\n",
"dataset_parent_dir = \"./data\"\n",
"\n",
"# create data folder if it doesnt exist.\n",
"os.makedirs(dataset_parent_dir, exist_ok=True)\n",
"\n",
"# download data\n",
"download_url = \"https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-classification/fridgeObjects.zip\"\n",
"\n",
"# Extract current dataset name from dataset url\n",
"dataset_name = os.path.split(download_url)[-1].split(\".\")[0]\n",
"# Get dataset path for later use\n",
"dataset_dir = os.path.join(dataset_parent_dir, dataset_name)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the data zip file path\n",
"data_file = os.path.join(dataset_parent_dir, f\"{dataset_name}.zip\")\n",
"\n",
"# Download the dataset\n",
"urllib.request.urlretrieve(download_url, filename=data_file)\n",
"\n",
"# extract files\n",
"with ZipFile(data_file, \"r\") as zip:\n",
" print(\"extracting files...\")\n",
" zip.extractall(path=dataset_parent_dir)\n",
" print(\"done\")\n",
"# delete zip file\n",
"os.remove(data_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a sample image from this dataset:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import Image\n",
"\n",
"sample_image = os.path.join(dataset_dir, \"milk_bottle\", \"99.jpg\")\n",
"Image(filename=sample_image)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.2. Upload the images to Datastore through an AML Data asset (URI Folder)\n",
"\n",
"In order to use the data for training in Azure ML, we upload it to our default Azure Blob Storage of our Azure ML Workspace.\n",
"\n",
"[Check this notebook for AML data asset example](../../../assets/data/data.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "data-upload"
},
"outputs": [],
"source": [
"# Uploading image files by creating a 'data asset URI FOLDER':\n",
"\n",
"from azure.ai.ml.entities import Data\n",
"from azure.ai.ml.constants import AssetTypes, InputOutputModes\n",
"from azure.ai.ml import Input\n",
"\n",
"my_data = Data(\n",
" path=dataset_dir,\n",
" type=AssetTypes.URI_FOLDER,\n",
" description=\"Fridge-items images\",\n",
" name=\"fridge-items-images\",\n",
")\n",
"\n",
"uri_folder_data_asset = ml_client.data.create_or_update(my_data)\n",
"\n",
"print(uri_folder_data_asset)\n",
"print(\"\")\n",
"print(\"Path to folder in Blob Storage:\")\n",
"print(uri_folder_data_asset.path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.3. Convert the downloaded data to JSONL\n",
"\n",
"In this example, the fridge object dataset is stored in a directory. There are four different folders inside:\n",
"\n",
"- /water_bottle\n",
"- /milk_bottle\n",
"- /carton\n",
"- /can\n",
"\n",
"This is the most common data format for multiclass image classification. Each folder title corresponds to the image label for the images contained inside. \n",
"\n",
"For documentation on preparing the datasets beyond this notebook, please refer to the [documentation on how to prepare datasets](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-prepare-datasets-for-automl-images).\n",
"\n",
"In order to use this data to create an AzureML MLTable, we first need to convert it to the required JSONL format. The following script is creating two `.jsonl` files (one for training and one for validation) in the corresponding MLTable folder. The train / validation ratio corresponds to 20% of the data going into the validation file. For further details on jsonl file used for image classification task in automated ml, please refer to the [data schema documentation for multi-class image classification task](https://learn.microsoft.com/en-us/azure/machine-learning/reference-automl-images-schema#image-classification-binarymulti-class)."
]
},
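{
"cell_type": "markdown",
"metadata": {},
"source": [
"For illustration, each line of the generated `.jsonl` files is a record with an `image_url` field pointing to the uploaded image on the datastore and a `label` field holding the class name. A record might look like the following (the actual path will reflect your own datastore):\n",
"\n",
"    {\"image_url\": \"azureml://subscriptions/.../datastores/workspaceblobstore/paths/fridgeObjects/milk_bottle/99.jpg\", \"label\": \"milk_bottle\"}"
]
},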
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## First generate the jsonl file"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../jsonl-conversion/\")\n",
"from base_jsonl_converter import write_json_lines\n",
"from classification_jsonl_converter import ClassificationJSONLConverter\n",
"\n",
"converter = ClassificationJSONLConverter(\n",
" uri_folder_data_asset.path, data_dir=dataset_dir\n",
")\n",
"jsonl_annotations = os.path.join(dataset_dir, \"annotations.jsonl\")\n",
"write_json_lines(converter, jsonl_annotations)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Now split the annotations into train and validation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"\n",
"\n",
"# We'll copy each JSONL file within its related MLTable folder\n",
"training_mltable_path = os.path.join(dataset_parent_dir, \"training-mltable-folder\")\n",
"validation_mltable_path = os.path.join(dataset_parent_dir, \"validation-mltable-folder\")\n",
"\n",
"# First, let's create the folders if they don't exist\n",
"os.makedirs(training_mltable_path, exist_ok=True)\n",
"os.makedirs(validation_mltable_path, exist_ok=True)\n",
"\n",
"train_validation_ratio = 5\n",
"\n",
"# Path to the training and validation files\n",
"train_annotations_file = os.path.join(training_mltable_path, \"train_annotations.jsonl\")\n",
"validation_annotations_file = os.path.join(\n",
" validation_mltable_path, \"validation_annotations.jsonl\"\n",
")\n",
"\n",
"\n",
"with open(jsonl_annotations, \"r\") as annot_f:\n",
" json_lines = annot_f.readlines()\n",
"\n",
"index = 0\n",
"with open(train_annotations_file, \"w\") as train_f:\n",
" with open(validation_annotations_file, \"w\") as validation_f:\n",
" for json_line in json_lines:\n",
" if index % train_validation_ratio == 0:\n",
" # validation annotation\n",
" validation_f.write(json_line)\n",
" else:\n",
" # train annotation\n",
" train_f.write(json_line)\n",
" index += 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.4. Create MLTable data input\n",
"\n",
"Create MLTable data input using the jsonl files created above.\n",
"\n",
"For documentation on creating your own MLTable assets for jobs beyond this notebook, please refer to below resources\n",
"- [MLTable YAML Schema](https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-mltable) - covers how to write MLTable YAML, which is required for each MLTable asset.\n",
"- [Create MLTable data asset](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-data-assets?tabs=Python-SDK#create-a-mltable-data-asset) - covers how to create MLTable data asset. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def create_ml_table_file(filename):\n",
" \"\"\"Create ML Table definition\"\"\"\n",
"\n",
" return (\n",
" \"paths:\\n\"\n",
" \" - file: ./{0}\\n\"\n",
" \"transformations:\\n\"\n",
" \" - read_json_lines:\\n\"\n",
" \" encoding: utf8\\n\"\n",
" \" invalid_lines: error\\n\"\n",
" \" include_path_column: false\\n\"\n",
" \" - convert_column_types:\\n\"\n",
" \" - columns: image_url\\n\"\n",
" \" column_type: stream_info\"\n",
" ).format(filename)\n",
"\n",
"\n",
"def save_ml_table_file(output_path, mltable_file_contents):\n",
" with open(os.path.join(output_path, \"MLTable\"), \"w\") as f:\n",
" f.write(mltable_file_contents)\n",
"\n",
"\n",
"# Create and save train mltable\n",
"train_mltable_file_contents = create_ml_table_file(\n",
" os.path.basename(train_annotations_file)\n",
")\n",
"save_ml_table_file(training_mltable_path, train_mltable_file_contents)\n",
"\n",
"# Create and save validation mltable\n",
"validation_mltable_file_contents = create_ml_table_file(\n",
" os.path.basename(validation_annotations_file)\n",
")\n",
"save_ml_table_file(validation_mltable_path, validation_mltable_file_contents)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "data-load"
},
"outputs": [],
"source": [
"# Training MLTable defined locally, with local data to be uploaded\n",
"my_training_data_input = Input(type=AssetTypes.MLTABLE, path=training_mltable_path)\n",
"\n",
"# Validation MLTable defined locally, with local data to be uploaded\n",
"my_validation_data_input = Input(type=AssetTypes.MLTABLE, path=validation_mltable_path)\n",
"\n",
"# WITH REMOTE PATH: If available already in the cloud/workspace-blob-store\n",
"# my_training_data_input = Input(type=AssetTypes.MLTABLE, path=\"azureml://datastores/workspaceblobstore/paths/vision-classification/train\")\n",
"# my_validation_data_input = Input(type=AssetTypes.MLTABLE, path=\"azureml://datastores/workspaceblobstore/paths/vision-classification/valid\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To create data input from TabularDataset created using V1 sdk, specify the `type` as `AssetTypes.MLTABLE`, `mode` as `InputOutputModes.DIRECT` and `path` in the following format `azureml:<tabulardataset_name>:<version>`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\"\"\"\n",
"# Training MLTable with v1 TabularDataset\n",
"my_training_data_input = Input(\n",
" type=AssetTypes.MLTABLE, path=\"azureml:fridgeObjectsTrainingDataset:1\",\n",
" mode=InputOutputModes.DIRECT\n",
")\n",
"\n",
"# Validation MLTable with v1 TabularDataset\n",
"my_validation_data_input = Input(\n",
" type=AssetTypes.MLTABLE, path=\"azureml:fridgeObjectsValidationDataset:1\",\n",
" mode=InputOutputModes.DIRECT\n",
")\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 3. Compute target setup\n",
"\n",
"You will need to provide a [Compute Target](https://docs.microsoft.com/en-us/azure/machine-learning/concept-azure-machine-learning-architecture#computes) that will be used for your AutoML model training. AutoML models for image tasks require [GPU SKUs](https://docs.microsoft.com/en-us/azure/virtual-machines/sizes-gpu) such as the ones from the NC, NCv2, NCv3, ND, NDv2 and NCasT4 series. We recommend using the NCsv3-series (with v100 GPUs) for faster training. Using a compute target with a multi-GPU VM SKU will leverage the multiple GPUs to speed up training. Additionally, setting up a compute target with multiple nodes will allow for faster model training by leveraging parallelism, when tuning hyperparameters for your model.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-compute"
]
},
"outputs": [],
"source": [
"from azure.ai.ml.entities import AmlCompute\n",
"from azure.core.exceptions import ResourceNotFoundError\n",
"\n",
"compute_name = \"gpu-cluster-nc6sv3\"\n",
"\n",
"try:\n",
" _ = ml_client.compute.get(compute_name)\n",
" print(\"Found existing compute target.\")\n",
"except ResourceNotFoundError:\n",
" print(\"Creating a new compute target...\")\n",
" compute_config = AmlCompute(\n",
" name=compute_name,\n",
" type=\"amlcompute\",\n",
" size=\"Standard_NC6s_v3\",\n",
" idle_time_before_scale_down=120,\n",
" min_instances=0,\n",
" max_instances=4,\n",
" )\n",
" ml_client.begin_create_or_update(compute_config).result()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 4. Configure and run the AutoML for Images Classification-MultiClass training job\n",
"\n",
"AutoML allows you to easily train models for Image Classification, Object Detection & Instance Segmentation on your image data. You can control the model algorithm and hyperparameters to be used, perform a sweep over a manually specified hyperparameter space, or the system can automatically perform a hyperparameter sweep for you.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"validation-remove"
]
},
"source": [
"## 4.1. Automatic hyperparameter sweeping for your models (AutoMode)\n",
"\n",
"When using AutoML for Images, we can perform an automatic hyperparameter sweep to find the optimal model (we call this functionality AutoMode). The system will choose a model architecture and values for the learning_rate, number_of_epochs, training_batch_size, etc. based on the number of runs. There is no need to specify the hyperparameter search space, sampling method or early termination policy. A number of runs between 10 and 20 will likely work well on many datasets.\n",
"\n",
"AutoMode is triggered by setting `max_trials` to a value greater than 1 in limits and by omitting the hyperparameter space, sampling method and termination policy.\n",
"\n",
"The following functions configure AutoML image jobs for automatic sweeps:\n",
"### image_classification() function parameters:\n",
"The `image_classification()` factory function allows user to configure the training job.\n",
"\n",
"- `compute` - The compute on which the AutoML job will run. In this example we are using a compute called 'gpu-cluster' present in the workspace. You can replace it any other compute in the workspace.\n",
"- `experiment_name` - The name of the experiment. An experiment is like a folder with multiple runs in Azure ML Workspace that should be related to the same logical machine learning experiment.\n",
"- `name` - The name of the Job/Run. This is an optional property. If not specified, a random name will be generated.\n",
"- `primary_metric` - The metric that AutoML will optimize for model selection.\n",
"- `target_column_name` - The name of the column to target for predictions. It must always be specified. This parameter is applicable to 'training_data' and 'validation_data'.\n",
"- `training_data` - The data to be used for training. It should contain both training feature columns and a target column. Optionally, this data can be split for segregating a validation or test dataset. \n",
"You can use a registered MLTable in the workspace using the format '<mltable_name>:<version>' OR you can use a local file or folder as a MLTable. For e.g Input(mltable='my_mltable:1') OR Input(mltable=MLTable(local_path=\"./data\"))\n",
"The parameter `training_data` must always be provided.\n",
"\n",
"### set_limits() function parameters:\n",
"This is an optional configuration method to configure limits parameters such as timeouts.\n",
"\n",
"- `max_trials` - Parameter for maximum number of configurations to sweep. Must be an integer between 1 and 1000. When exploring just the default hyperparameters for a given model algorithm, set this parameter to 1. Default value is 1.\n",
"- `max_concurrent_trials` - Maximum number of runs that can run concurrently. If not specified, all runs launch in parallel. If specified, must be an integer between 1 and 100. Default value is 1.\n",
" NOTE: The number of concurrent runs is gated on the resources available in the specified compute target. Ensure that the compute target has the available resources for the desired concurrency.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# set up experiment name\n",
"exp_name = \"dpv2-image-classification-experiment\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"# Create the AutoML job with the related factory-function.\n",
"\n",
"image_classification_job = automl.image_classification(\n",
" compute=compute_name,\n",
" # name=\"dpv2-image-classification-job-02\",\n",
" experiment_name=exp_name,\n",
" training_data=my_training_data_input,\n",
" validation_data=my_validation_data_input,\n",
" target_column_name=\"label\",\n",
" primary_metric=\"accuracy\",\n",
" tags={\"my_custom_tag\": \"My custom value\"},\n",
")\n",
"\n",
"image_classification_job.set_limits(\n",
" max_trials=10,\n",
" max_concurrent_trials=2,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"validation-remove"
]
},
"source": [
"### Submitting an AutoML job for Computer Vision tasks\n",
"Once you've configured your job, you can submit it as a job in the workspace in order to train a vision model using your training dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"# Submit the AutoML job\n",
"returned_job = ml_client.jobs.create_or_update(\n",
" image_classification_job\n",
") # submit the job to the backend\n",
"\n",
"print(f\"Created job: {returned_job}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"ml_client.jobs.stream(returned_job.name)"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"validation-remove"
]
},
"source": [
"## 4.2. Individual runs\n",
"\n",
"If AutoMode does not meet your needs, you can launch individual runs to explore model algorithms; we provide sensible default hyperparameters for each algorithm. You can also launch individual runs for the same model algorithm and different hyperparameter combinations. The model algorithm is specified using the model_name parameter. Please refer to the [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?tabs=CLI-v2#configure-model-algorithms-and-hyperparameters) for the list of supported model algorithms.\n",
"\n",
"The following function can be used to configure AutoML jobs for individual runs:\n",
"### set_training_parameters() function parameters:\n",
"This is an optional configuration method to configure fixed settings or parameters that don't change during the parameter space sweep. Some of the key parameters of this function are:\n",
"\n",
"- `model_name` - The name of the ML algorithm that we want to use in training job. Please refer to this [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?tabs=CLI-v2#supported-model-algorithms) for supported model algorithm.\n",
"- `number_of_epochs` - The number of training epochs. It must be positive integer (default value is 15).\n",
"- `layers_to_freeze` - The number of layers to freeze in model for transfer learning. It must be a positive integer (default value is 0).\n",
"- `early_stopping` - It enable early stopping logic during training, It must be boolean value (default is True). \n",
"- `optimizer` - Type of optimizer to use in training. It must be either sgd, adam, adamw (default is sgd).\n",
"- `distributed` - It enable distributed training if compute target contain multiple GPUs. It must be boolean value (default is True).\n",
" \n",
"If you wish to use the default hyperparameter values for a given algorithm (say `vitb16r224`), you can specify the job for your AutoML Image runs as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"# Create the AutoML job with the related factory-function.\n",
"\n",
"image_classification_job = automl.image_classification(\n",
" compute=compute_name,\n",
" # name=\"dpv2-image-classification-job-02\",\n",
" experiment_name=exp_name,\n",
" training_data=my_training_data_input,\n",
" validation_data=my_validation_data_input,\n",
" target_column_name=\"label\",\n",
")\n",
"\n",
"image_classification_job.set_limits(timeout_minutes=60)\n",
"\n",
"image_classification_job.set_training_parameters(model_name=\"vitb16r224\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"# Submit the AutoML job\n",
"returned_job = ml_client.jobs.create_or_update(image_classification_job)\n",
"\n",
"print(f\"Created job: {returned_job}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"ml_client.jobs.stream(returned_job.name)"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"validation-remove"
]
},
"source": [
"### 4.2.1 Individual runs with models from Hugging Face (Preview)\n",
"\n",
"In addition to the models supported natively by AutoML, you can launch individual runs to explore any model from HuggingFace transformers library that supports image classification. Please refer to this [documentation](https://huggingface.co/models?pipeline_tag=image-classification&library=transformers) for the list of models.\n",
"\n",
"While you can use any model from Hugging face to support this task, we have curated a set of models in our registry. We provide a set of sensible default hyperparameters for these models. You can fetch the list of curated models using code snippet below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"registry_ml_client = MLClient(credential, registry_name=\"azureml\")\n",
"\n",
"models = registry_ml_client.models.list()\n",
"classification_models = []\n",
"for model in models:\n",
" try:\n",
" model = registry_ml_client.models.get(model.name, label=\"latest\")\n",
" if model.tags.get(\"task\", \"\") == \"image-classification\":\n",
" classification_models.append(model.name)\n",
" except Exception as ex:\n",
" print(f\"Error while accessing registry model list: {ex}\")\n",
"\n",
"classification_models"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"validation-remove"
]
},
"source": [
"If you wish to try a model (say `microsoft/beit-base-patch16-224-pt22k-ft22k`), you can specify the job for your AutoML Image runs as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"# Create the AutoML job with the related factory-function.\n",
"\n",
"image_classification_job = automl.image_classification(\n",
" compute=compute_name,\n",
" experiment_name=exp_name,\n",
" training_data=my_training_data_input,\n",
" validation_data=my_validation_data_input,\n",
" target_column_name=\"label\",\n",
")\n",
"\n",
"image_classification_job.set_limits(timeout_minutes=60)\n",
"\n",
"image_classification_job.set_training_parameters(\n",
" model_name=\"microsoft/beit-base-patch16-224-pt22k-ft22k\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"# Submit the AutoML job\n",
"returned_job = ml_client.jobs.create_or_update(image_classification_job)\n",
"\n",
"print(f\"Created job: {returned_job}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"ml_client.jobs.stream(returned_job.name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4.3. Manual hyperparameter sweeping for your model\n",
"\n",
"When using AutoML for Images, we can perform a hyperparameter sweep over a defined parameter space to find the optimal model. In this example, we sweep over the hyperparameters for `seresnext`, `resnet50`, `vitb16r224`, and `vits16r224` models, choosing from a range of values for learning_rate, number_of_epochs, layers_to_freeze, etc., to generate a model with the optimal 'accuracy'. If hyperparameter values are not specified, then default values are used for the specified algorithm.\n",
"\n",
"set_sweep function is used to configure the sweep settings:\n",
"### set_sweep() parameters:\n",
"- `sampling_algorithm` - Sampling method to use for sweeping over the defined parameter space. Please refer to this [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?tabs=SDK-v2#sampling-methods-for-the-sweep) for list of supported sampling methods.\n",
"- `early_termination` - Early termination policy to end poorly performing runs. If no termination policy is specified, all configurations are run to completion. Please refer to this [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?tabs=SDK-v2#early-termination-policies) for supported early termination policies.\n",
"\n",
"We use Random Sampling to pick samples from this parameter space and try a total of 10 iterations with these different samples, running 2 iterations at a time on our compute target. Please note that the more parameters the space has, the more iterations you need to find optimal models.\n",
"\n",
"We leverage the Bandit early termination policy which will terminate poor performing configs (those that are not within 20% slack of the best performing config), thus significantly saving compute resources.\n",
"\n",
"For more details on model and hyperparameter sweeping, please refer to the [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1634852262026
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"name": "image-classification-configuration",
"nteract": {
"transient": {
"deleting": false
}
},
"tags": [
"validation-scenario",
"validation-trials"
]
},
"outputs": [],
"source": [
"# Create the AutoML job with the related factory-function.\n",
"\n",
"image_classification_job = automl.image_classification(\n",
" compute=compute_name,\n",
" # name=\"dpv2-image-classification-job-02\",\n",
" experiment_name=exp_name,\n",
" training_data=my_training_data_input,\n",
" validation_data=my_validation_data_input,\n",
" target_column_name=\"label\",\n",
" primary_metric=ClassificationPrimaryMetrics.ACCURACY,\n",
" tags={\"my_custom_tag\": \"My custom value\"},\n",
")\n",
"\n",
"image_classification_job.set_limits(\n",
" timeout_minutes=60,\n",
" max_trials=10,\n",
" max_concurrent_trials=2,\n",
")\n",
"\n",
"image_classification_job.extend_search_space(\n",
" [\n",
" SearchSpace(\n",
" model_name=Choice([\"vitb16r224\", \"vits16r224\"]),\n",
" learning_rate=Uniform(0.001, 0.01),\n",
" number_of_epochs=Choice([15, 30]),\n",
" ),\n",
" SearchSpace(\n",
" model_name=Choice([\"seresnext\", \"resnet50\"]),\n",
" learning_rate=Uniform(0.001, 0.01),\n",
" layers_to_freeze=Choice([0, 2]),\n",
" ),\n",
" ]\n",
")\n",
"\n",
"image_classification_job.set_sweep(\n",
" sampling_algorithm=\"Random\",\n",
" early_termination=BanditPolicy(\n",
" evaluation_interval=2, slack_factor=0.2, delay_evaluation=6\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1634852267930
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"name": "job-submit",
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"# Submit the AutoML job\n",
"returned_job = ml_client.jobs.create_or_update(\n",
" image_classification_job\n",
") # submit the job to the backend\n",
"\n",
"print(f\"Created job: {returned_job}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.jobs.stream(returned_job.name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When doing a hyperparameter sweep, it can be useful to visualize the different configurations that were tried using the HyperDrive UI. You can navigate to this UI by going to the 'Child jobs' tab in the UI of the main automl image job from above, which is the HyperDrive parent run. Then you can go into the 'Trials' tab of this HyperDrive parent run. Alternatively, here below you can see directly the HyperDrive parent run and navigate to its 'Trials' tab:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hd_job = ml_client.jobs.get(returned_job.name + \"_HD\")\n",
"hd_job"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"validation-remove"
]
},
"source": [
"### 4.3.1 Manual hyperparameter sweeping for models from Hugging Face (Preview)\n",
"\n",
"Similar to how you can use any model from Hugging face transformers library for individual runs, you can also include these models to perform a hyperparameter sweep. You can also choose a combination of models supported supported natively by [AutoML](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?tabs=CLI-v2#configure-model-algorithms-and-hyperparameters) and models from [Hugging Face](https://huggingface.co/models?pipeline_tag=image-classification&library=transformers).\n",
"\n",
"In this example, we sweep over `microsoft/beit-base-patch16-224-pt22k-ft22k`, `facebook/deit-base-patch16-224`, `seresnext`, and `resnet50` models choosing from a range of values for learning_rate, number_of_epochs, etc., to generate a model with the optimal 'accuracy'."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"# Create the AutoML job with the related factory-function.\n",
"\n",
"image_classification_job = automl.image_classification(\n",
" compute=compute_name,\n",
" experiment_name=exp_name,\n",
" training_data=my_training_data_input,\n",
" validation_data=my_validation_data_input,\n",
" target_column_name=\"label\",\n",
" primary_metric=ClassificationPrimaryMetrics.ACCURACY,\n",
" tags={\"my_custom_tag\": \"My custom value\"},\n",
")\n",
"\n",
"image_classification_job.set_limits(\n",
" timeout_minutes=240,\n",
" max_trials=10,\n",
" max_concurrent_trials=2,\n",
")\n",
"\n",
"image_classification_job.extend_search_space(\n",
" [\n",
" SearchSpace(\n",
" model_name=Choice(\n",
" [\n",
" \"microsoft/beit-base-patch16-224-pt22k-ft22k\",\n",
" \"facebook/deit-base-patch16-224\",\n",
" ]\n",
" ),\n",
" learning_rate=Uniform(0.00001, 0.0001),\n",
" number_of_epochs=Choice([10, 15]),\n",
" ),\n",
" SearchSpace(\n",
" model_name=Choice([\"seresnext\", \"resnet50\"]),\n",
" learning_rate=Uniform(0.001, 0.01),\n",
" ),\n",
" ]\n",
")\n",
"\n",
"image_classification_job.set_sweep(\n",
" sampling_algorithm=\"Random\",\n",
" early_termination=BanditPolicy(\n",
" evaluation_interval=2, slack_factor=0.2, delay_evaluation=6\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"# Submit the AutoML job\n",
"returned_job_hf = ml_client.jobs.create_or_update(\n",
" image_classification_job\n",
") # submit the job to the backend\n",
"\n",
"print(f\"Created job: {returned_job_hf}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-remove"
]
},
"outputs": [],
"source": [
"ml_client.jobs.stream(returned_job_hf.name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5. Retrieve the Best Trial (Best Model's trial/run)\n",
"Use the MLFLowClient to access the results (such as Models, Artifacts, Metrics) of a previously completed AutoML Trial."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize MLFlow Client\n",
"\n",
"The models and artifacts that are produced by AutoML can be accessed via the MLFlow interface.\n",
"Initialize the MLFlow client here, and set the backend as Azure ML, via. the MLFlow Client.\n",
"\n",
"IMPORTANT, you need to have installed the latest MLFlow packages with:\n",
"\n",
" pip install azureml-mlflow\n",
"\n",
" pip install mlflow"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import mlflow\n",
"\n",
"# Obtain the tracking URL from MLClient\n",
"MLFLOW_TRACKING_URI = ml_client.workspaces.get(\n",
" name=ml_client.workspace_name\n",
").mlflow_tracking_uri\n",
"\n",
"print(MLFLOW_TRACKING_URI)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set the MLFLOW TRACKING URI\n",
"mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)\n",
"print(f\"\\nCurrent tracking uri: {mlflow.get_tracking_uri()}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from mlflow.tracking.client import MlflowClient\n",
"\n",
"# Initialize MLFlow client\n",
"mlflow_client = MlflowClient()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Get the AutoML parent Job"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"job_name = returned_job.name\n",
"\n",
"# Example if providing an specific Job name/ID\n",
"# job_name = \"salmon_camel_5sdf05xvb3\"\n",
"\n",
"# Get the parent run\n",
"mlflow_parent_run = mlflow_client.get_run(job_name)\n",
"\n",
"print(\"Parent Run: \")\n",
"print(mlflow_parent_run)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print parent run tags. 'automl_best_child_run_id' tag should be there.\n",
"print(mlflow_parent_run.data.tags.keys())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Get the AutoML best child run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the best model's child run\n",
"\n",
"best_child_run_id = mlflow_parent_run.data.tags[\"automl_best_child_run_id\"]\n",
"print(f\"Found best child run id: {best_child_run_id}\")\n",
"\n",
"best_run = mlflow_client.get_run(best_child_run_id)\n",
"\n",
"print(\"Best child run: \")\n",
"print(best_run)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get best model run's metrics\n",
"Access the results (such as Models, Artifacts, Metrics) of a previously completed AutoML Run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"pd.DataFrame(best_run.data.metrics, index=[0]).T"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download the best model locally\n",
"Access the results (such as Models, Artifacts, Metrics) of a previously completed AutoML Run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create local folder\n",
"import os\n",
"\n",
"local_dir = \"./artifact_downloads\"\n",
"if not os.path.exists(local_dir):\n",
" os.mkdir(local_dir)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download run's artifacts/outputs\n",
"local_path = mlflow_client.download_artifacts(\n",
" best_run.info.run_id, \"outputs\", local_dir\n",
")\n",
"print(f\"Artifacts downloaded in: {local_path}\")\n",
"print(f\"Artifacts: {os.listdir(local_path)}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from mlflow.models import Model\n",
"\n",
"mlflow_model_dir = os.path.join(local_dir, \"outputs\", \"mlflow-model\")\n",
"\n",
"# Show the contents of the MLFlow model folder\n",
"os.listdir(mlflow_model_dir)\n",
"model_flavor = Model.load(mlflow_model_dir).flavors\n",
"is_oss_flavor = \"transformers\" in model_flavor\n",
"# You should see a list of files such as the following:\n",
"# ['artifacts', 'conda.yaml', 'MLmodel', 'python_env.yaml', 'python_model.pkl', 'requirements.txt']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 6. Register best model and deploy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6.1 Create managed online endpoint"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# import required libraries\n",
"from azure.ai.ml.entities import (\n",
" ManagedOnlineEndpoint,\n",
" ManagedOnlineDeployment,\n",
" Model,\n",
" Environment,\n",
" CodeConfiguration,\n",
" ProbeSettings,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Creating a unique endpoint name with current datetime to avoid conflicts\n",
"import datetime\n",
"\n",
"online_endpoint_name = \"ic-mc-fridge-items-\" + datetime.datetime.now().strftime(\n",
" \"%m%d%H%M\"\n",
")\n",
"\n",
"# create an online endpoint\n",
"endpoint = ManagedOnlineEndpoint(\n",
" name=online_endpoint_name,\n",
" description=\"this is a sample online endpoint for deploying model\",\n",
" auth_mode=\"key\",\n",
" tags={\"foo\": \"bar\"},\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.begin_create_or_update(endpoint).result()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6.2 Register best model and deploy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Register model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model_name = \"ic-mc-fridge-items-model\"\n",
"model = Model(\n",
" path=f\"azureml://jobs/{best_run.info.run_id}/outputs/artifacts/outputs/mlflow-model/\",\n",
" name=model_name,\n",
" description=\"my sample image classification multiclass model\",\n",
" type=AssetTypes.MLFLOW_MODEL,\n",
")\n",
"\n",
"# for downloaded file\n",
"# model = Model(\n",
"# path=mlflow_model_dir,\n",
"# name=model_name,\n",
"# description=\"my sample image classification multiclass model\",\n",
"# type=AssetTypes.MLFLOW_MODEL,\n",
"# )\n",
"\n",
"registered_model = ml_client.models.create_or_update(model)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"registered_model.id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deploy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.entities import OnlineRequestSettings\n",
"\n",
"# Setting the request timeout to 90 seconds.\n",
"req_timeout = OnlineRequestSettings(request_timeout_ms=90000)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"validation-deployment"
]
},
"outputs": [],
"source": [
"deployment = ManagedOnlineDeployment(\n",
" name=\"ic-mc-fridge-items-mlflow-deploy\",\n",
" endpoint_name=online_endpoint_name,\n",
" model=registered_model.id,\n",
" # use GPU instance type like Standard_NC6s_v3 for faster explanations\n",
" instance_type=\"Standard_DS4_V2\",\n",
" instance_count=1,\n",
" request_settings=req_timeout,\n",
" liveness_probe=ProbeSettings(\n",
" failure_threshold=30,\n",
" success_threshold=1,\n",
" timeout=2,\n",
" period=10,\n",
" initial_delay=2000,\n",
" ),\n",
" readiness_probe=ProbeSettings(\n",
" failure_threshold=10,\n",
" success_threshold=1,\n",
" timeout=10,\n",
" period=10,\n",
" initial_delay=2000,\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_deployments.begin_create_or_update(deployment).result()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# ic mc fridge items deployment to take 100% traffic\n",
"endpoint.traffic = {\"ic-mc-fridge-items-mlflow-deploy\": 100}\n",
"ml_client.begin_create_or_update(endpoint).result()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Get endpoint details"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the details for online endpoint\n",
"endpoint = ml_client.online_endpoints.get(name=online_endpoint_name)\n",
"\n",
"# existing traffic details\n",
"print(endpoint.traffic)\n",
"\n",
"# Get the scoring URI\n",
"print(endpoint.scoring_uri)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Online Inference"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create request json\n",
"import base64\n",
"\n",
"sample_image = os.path.join(dataset_dir, \"milk_bottle\", \"99.jpg\")\n",
"\n",
"\n",
"def read_image(image_path):\n",
" with open(image_path, \"rb\") as f:\n",
" return f.read()\n",
"\n",
"\n",
"if is_oss_flavor:\n",
" request_json = {\n",
" \"input_data\": [base64.b64encode(read_image(sample_image)).decode(\"utf-8\")],\n",
" }\n",
"else:\n",
" request_json = {\n",
" \"input_data\": {\n",
" \"columns\": [\"image\"],\n",
" \"data\": [base64.encodebytes(read_image(sample_image)).decode(\"utf-8\")],\n",
" }\n",
" }"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"request_file_name = \"sample_request_data.json\"\n",
"\n",
"with open(request_file_name, \"w\") as request_file:\n",
" json.dump(request_json, request_file)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"resp = ml_client.online_endpoints.invoke(\n",
" endpoint_name=online_endpoint_name,\n",
" deployment_name=deployment.name,\n",
" request_file=request_file_name,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize detections\n",
"Now that we have scored a test image, we can visualize the prediction for this image."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg\n",
"from PIL import Image\n",
"import numpy as np\n",
"import json\n",
"\n",
"IMAGE_SIZE = (18, 12)\n",
"plt.figure(figsize=IMAGE_SIZE)\n",
"img_np = mpimg.imread(sample_image)\n",
"img = Image.fromarray(img_np.astype(\"uint8\"), \"RGB\")\n",
"x, y = img.size\n",
"\n",
"fig, ax = plt.subplots(1, figsize=(15, 15))\n",
"# Display the image\n",
"ax.imshow(img_np)\n",
"\n",
"prediction = json.loads(resp)[0]\n",
"if is_oss_flavor:\n",
" label = prediction[\"0\"][\"label\"]\n",
" conf_score = prediction[\"0\"][\"score\"]\n",
"else:\n",
" label_index = np.argmax(prediction[\"probs\"])\n",
" label = prediction[\"labels\"][label_index]\n",
" conf_score = prediction[\"probs\"][label_index]\n",
"\n",
"display_text = f\"{label} ({round(conf_score, 3)})\"\n",
"print(display_text)\n",
"\n",
"color = \"red\"\n",
"plt.text(30, 30, display_text, color=color, fontsize=30)\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate the Scores and Explanations\n",
"- Explainability methods:\n",
" - [XRAI](https://arxiv.org/abs/1906.02825) (xrai)\n",
" - [Integrated Gradients](https://arxiv.org/abs/1703.01365) (integrated_gradients)\n",
" - [Guided GradCAM](https://arxiv.org/abs/1610.02391v4) (guided_gradcam)\n",
" - [Guided BackPropagation](https://arxiv.org/abs/1412.6806) (guided_backprop)\n",
"\n",
"For more details on explainability with AutoML for images, refer to the [generating explanations](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?tabs=python#generate-explanations-for-predictions) and [schema](https://learn.microsoft.com/en-us/azure/machine-learning/reference-automl-images-schema#data-format-for-online-scoring-and-explainability-xai) articles.\n",
"\n",
"**Note**: Please note that explainability is not supported for Hugging Face models."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Define explainability (XAI) parameters\n",
"model_explainability = True\n",
"xai_parameters = {\n",
" \"xai_algorithm\": \"xrai\",\n",
" \"visualizations\": True,\n",
" \"attributions\": False,\n",
"}\n",
"\n",
"# Create request json\n",
"request_json = {\n",
" \"input_data\": {\n",
" \"columns\": [\"image\"],\n",
" \"data\": [\n",
" json.dumps(\n",
" {\n",
" \"image_base64\": base64.encodebytes(read_image(sample_image)).decode(\n",
" \"utf-8\"\n",
" ),\n",
" \"model_explainability\": model_explainability,\n",
" \"xai_parameters\": xai_parameters,\n",
" }\n",
" )\n",
" ],\n",
" }\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"request_file_name = \"sample_request_data.json\"\n",
"\n",
"with open(request_file_name, \"w\") as request_file:\n",
" json.dump(request_json, request_file)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"resp = ml_client.online_endpoints.invoke(\n",
" endpoint_name=online_endpoint_name,\n",
" deployment_name=deployment.name,\n",
" request_file=request_file_name,\n",
")\n",
"predictions = json.loads(resp)\n",
"predictions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualize Explanations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from io import BytesIO\n",
"from PIL import Image\n",
"\n",
"\n",
"def base64_to_img(base64_img_str):\n",
" base64_img = base64_img_str.encode(\"utf-8\")\n",
" decoded_img = base64.b64decode(base64_img)\n",
" return BytesIO(decoded_img).getvalue()\n",
"\n",
"\n",
"# visualize explanations of the first image against one of the class\n",
"img_bytes = base64_to_img(predictions[0][\"visualizations\"])\n",
"image = Image.open(BytesIO(img_bytes))\n",
"plt.imshow(image)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Delete the deployment and endopoint"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_endpoints.begin_delete(name=online_endpoint_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Next Step: Load the best model and try predictions\n",
"\n",
"Loading the models locally assume that you are running the notebook in an environment compatible with the model. The list of dependencies that is expected by the model is specified in the MLFlow model produced by AutoML (in the 'conda.yaml' file within the mlflow-model folder).\n",
"\n",
"Since the AutoML model was trained remotelly in a different environment with different dependencies to your current local conda environment where you are running this notebook, if you want to load the model you have several options:\n",
"\n",
"1. A recommended way to locally load the model in memory and try predictions is to create a new/clean conda environment with the dependencies specified in the conda.yaml file within the MLFlow model's folder, then use MLFlow to load the model and call .predict() as explained in the notebook **mlflow-model-local-inference-test.ipynb** in this same folder.\n",
"\n",
"2. You can install all the packages/dependencies specified in conda.yaml into your current conda environment you used for using Azure ML SDK and AutoML. MLflow SDK also have a method to install the dependencies in the current environment. However, this option could have risks of package version conflicts depending on what's installed in your current environment.\n",
"\n",
"3. You can also use: mlflow models serve -m 'xxxxxxx'"
]
},
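{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of option 1, the cell below shows how the downloaded MLflow model could be loaded and used for a local prediction. It assumes the `mlflow-model` folder was downloaded to `mlflow_model_dir` earlier in this notebook and that the environment it runs in satisfies the dependencies in the model's `conda.yaml`. The base64-encoded `image` column is an assumption mirroring the request payload used for online inference above; uncomment and adapt it in a compatible environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal local-inference sketch (option 1). Uncomment and run in an environment\n",
"# that matches the dependencies listed in the model's conda.yaml.\n",
"\n",
"# import base64\n",
"# import pandas as pd\n",
"# import mlflow.pyfunc\n",
"\n",
"# local_model = mlflow.pyfunc.load_model(mlflow_model_dir)\n",
"\n",
"# with open(sample_image, \"rb\") as f:\n",
"#     image_b64 = base64.encodebytes(f.read()).decode(\"utf-8\")\n",
"\n",
"# # Assumption: the model expects a DataFrame with a base64-encoded 'image' column,\n",
"# # mirroring the request format used for the non-OSS flavor above.\n",
"# local_predictions = local_model.predict(pd.DataFrame({\"image\": [image_b64]}))\n",
"# print(local_predictions)"
]
},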
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Next Steps\n",
"You can see further examples of other AutoML tasks such as Regression, Image-Object-Detection, NLP-Text-Classification, Time-Series-Forcasting, etc."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernel_info": {
"name": "python3-azureml"
},
"kernelspec": {
"display_name": "Python 3.10 - SDK V2",
"language": "python",
"name": "python310-sdkv2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
},
"microsoft": {
"host": {
"AzureML": {
"notebookHasBeenCompleted": true
}
}
},
"nteract": {
"version": "nteract-front-end@1.0.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}