{ "cells": [ { "cell_type": "markdown", "id": "98605bcd", "metadata": {}, "source": [ "# AutoML Image Classification scenario with RAI Dashboard\n", "\n", "This example notebook demonstrates how to use an automl trained computer vision model on the dataset to evaluate the model in AzureML.\n", "\n", "First, we need to specify the version of the RAI components which are available in the workspace. This was specified when the components were uploaded." ] }, { "cell_type": "code", "execution_count": null, "id": "53b4eeac", "metadata": {}, "outputs": [], "source": [ "version_string = \"0.0.20\"" ] }, { "cell_type": "markdown", "id": "06008690", "metadata": {}, "source": [ "We can optionally provide the name of the compute cluster we want to use in AzureML. Later in this notebook, we will create it if it does not already exist as an example. AzureML can also run on serverless computes if a compute is not explicitly set. " ] }, { "cell_type": "code", "execution_count": null, "id": "f1ad79f9", "metadata": {}, "outputs": [], "source": [ "train_compute_name = \"gpu-cluster-nc6-v3\"\n", "\n", "rai_compute_name = \"cpucluster\"" ] }, { "cell_type": "markdown", "id": "9fc65dc7", "metadata": {}, "source": [ "Finally, we need to specify a version for the data and components we will create while running this notebook. This should be unique for the workspace, but the specific value doesn't matter:" ] }, { "cell_type": "code", "execution_count": null, "id": "78053935", "metadata": {}, "outputs": [], "source": [ "rai_example_version_string = \"63\"" ] }, { "cell_type": "markdown", "id": "f386a61d", "metadata": {}, "source": [ "# 1. Connect to Azure Machine Learning Workspace\n", "\n", "The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run." 
] }, { "cell_type": "code", "execution_count": null, "id": "1847bd7a", "metadata": {}, "outputs": [], "source": [ "from azure.ai.ml.entities import Data, Model\n", "from azure.ai.ml.constants import AssetTypes\n", "\n", "from azure.ai.ml import MLClient\n", "from azure.identity import DefaultAzureCredential\n", "\n", "from azure.ai.ml.automl import ClassificationPrimaryMetrics\n", "from azure.ai.ml import automl, Input, dsl" ] }, { "cell_type": "code", "execution_count": null, "id": "7d053cc6", "metadata": {}, "outputs": [], "source": [ "# Enter details of your AML workspace\n", "subscription_id = \"<SUBSCRIPTION_ID>\"\n", "resource_group = \"<RESOURCE_GROUP>\"\n", "workspace = \"<AML_WORKSPACE_NAME>\"" ] }, { "cell_type": "code", "execution_count": null, "id": "9d4b1c33", "metadata": {}, "outputs": [], "source": [ "# Handle to the workspace\n", "from azure.ai.ml import MLClient\n", "from azure.identity import DefaultAzureCredential\n", "\n", "try:\n", " credential = DefaultAzureCredential()\n", " ml_client = MLClient(\n", " credential=credential,\n", " subscription_id=subscription_id,\n", " resource_group_name=resource_group,\n", " workspace_name=workspace,\n", " )\n", "except Exception:\n", " # If in compute instance we can get the config automatically\n", " from azureml.core import Workspace\n", "\n", " workspace = Workspace.from_config()\n", " workspace.write_config()\n", " ml_client = MLClient.from_config(\n", " credential=DefaultAzureCredential(exclude_shared_token_cache_credential=True),\n", " logging_enable=True,\n", " )\n", "\n", "print(ml_client)" ] }, { "cell_type": "markdown", "id": "e83d5b18", "metadata": {}, "source": [ "#### Compute target setup\n", "\n", "There are two ways to submit a job - through a compute or a serverless job.\n", "\n", "##### Serverless Job:\n", "\n", "In a serverless job, there is no need to create a compute explicitly.\n", "Simply pass the desired instance type value to the `instance_type` parameter while creating a pipeline job.\n", "This allows for quick and convenient job submission without the need for managing a compute cluster.\n", "\n", "##### Compute Job:\n", "\n", "The following code below demonstrates how to create a gpu compute cluster.\n", "After creating the compute cluster, pass the name of the compute cluster to the `compute_name` parameter while submitting the pipeline job. This ensures that the job runs on the specified compute cluster, allowing for more control and customization.\n", "\n", "You will need to provide a [Compute Target](https://docs.microsoft.com/en-us/azure/machine-learning/concept-azure-machine-learning-architecture#computes) that will be used for your AutoML model training. AutoML models for image tasks require [GPU SKUs](https://docs.microsoft.com/en-us/azure/virtual-machines/sizes-gpu) such as the ones from the NCv3, ND, NDv2 and NCasT4 series. We recommend using the NCsv3-series (with v100 GPUs) for faster training. Using a compute target with a multi-GPU VM SKU will leverage the multiple GPUs to speed up training. 
] }, { "cell_type": "code", "execution_count": null, "id": "68b8e1e6", "metadata": {}, "outputs": [], "source": [ "from azure.ai.ml.entities import AmlCompute\n", "\n", "all_compute_names = [x.name for x in ml_client.compute.list()]\n", "\n", "if train_compute_name in all_compute_names:\n", " print(f\"Found existing compute: {train_compute_name}\")\n", "else:\n", " train_compute_config = AmlCompute(\n", " name=train_compute_name,\n", " type=\"amlcompute\",\n", " size=\"Standard_NC6s_v3\",\n", " min_instances=0,\n", " max_instances=4,\n", " idle_time_before_scale_down=120,\n", " )\n", " ml_client.compute.begin_create_or_update(train_compute_config).result()" ] }, { "cell_type": "code", "execution_count": null, "id": "ee6fa857", "metadata": {}, "outputs": [], "source": [ "from azure.ai.ml.entities import AmlCompute\n", "\n", "all_compute_names = [x.name for x in ml_client.compute.list()]\n", "\n", "if rai_compute_name in all_compute_names:\n", " print(f\"Found existing compute: {rai_compute_name}\")\n", "else:\n", " rai_compute_config = AmlCompute(\n", " name=rai_compute_name,\n", " size=\"STANDARD_DS3_V2\",\n", " min_instances=0,\n", " max_instances=4,\n", " idle_time_before_scale_down=3600,\n", " )\n", " # Wait for provisioning to finish, as above\n", " ml_client.compute.begin_create_or_update(rai_compute_config).result()" ] }, { "cell_type": "markdown", "id": "73be2b63", "metadata": {}, "source": [ "# 2. Accessing the Data\n", "\n", "The data is supplied as a zip archive of images. We download it, convert the annotations to JSONL, and build accompanying `MLTable` definitions for training and validation.\n", "\n", "The [fridge dataset](https://github.com/microsoft/computervision-recipes/tree/master/scenarios/classification) classifies images into four types of items commonly found in the Microsoft New England R&D office refrigerator - carton, water bottle, can and milk bottle. " ] }, { "cell_type": "code", "execution_count": null, "id": "5f875f18", "metadata": {}, "outputs": [], "source": [ "import os\n", "import pandas as pd\n", "\n", "from urllib.request import urlretrieve" ] }, { "cell_type": "markdown", "id": "8f9ce4ae", "metadata": {}, "source": [ "## 2.1 Download Data\n", "\n", "Download the 'fridge items' dataset and extract it locally.\n", "\n", "In this notebook, we use a toy dataset called Fridge Objects, which consists of 134 images of 4 classes of beverage containers {can, carton, milk bottle, water bottle}, photographed against different backgrounds.\n", "\n", "All images in this notebook are hosted in [this repository](https://github.com/microsoft/computervision-recipes) and are made available under the [MIT license](https://github.com/microsoft/computervision-recipes/blob/master/LICENSE)."
] }, { "cell_type": "code", "execution_count": null, "id": "0eac648c", "metadata": {}, "outputs": [], "source": [ "import os\n", "import urllib\n", "from zipfile import ZipFile\n", "\n", "# Change to a different location if you prefer\n", "dataset_parent_dir = \"./data\"\n", "\n", "# create data folder if it doesnt exist.\n", "os.makedirs(dataset_parent_dir, exist_ok=True)\n", "\n", "# download data\n", "download_url = (\n", " \"https://publictestdatasets.blob.core.windows.net/computervision/fridgeObjects.zip\"\n", ")\n", "\n", "# Extract current dataset name from dataset url\n", "dataset_name = os.path.split(download_url)[-1].split(\".\")[0]\n", "# Get dataset path for later use\n", "dataset_dir = os.path.join(dataset_parent_dir, dataset_name)\n", "\n", "# Get the data zip file path\n", "data_file = os.path.join(dataset_parent_dir, f\"{dataset_name}.zip\")\n", "\n", "# Download the dataset\n", "urllib.request.urlretrieve(download_url, filename=data_file)\n", "\n", "# extract files\n", "with ZipFile(data_file, \"r\") as zip:\n", " print(\"extracting files...\")\n", " zip.extractall(path=dataset_parent_dir)\n", " print(\"done\")\n", "# delete zip file\n", "os.remove(data_file)" ] }, { "cell_type": "markdown", "id": "65d0ee08", "metadata": {}, "source": [ "## 2.2. Upload the images to Datastore through an AML Data asset (URI Folder) for training an AutomatedML Model\n", "\n", "In order to use the data for training in Azure ML, we upload it to our default Azure Blob Storage of our Azure ML Workspace.\n", "\n", "Reference to URI FOLDER data asset example for further details: https://github.com/Azure/azureml-examples/blob/samuel100/data-samples/sdk/assets/data/data.ipynb" ] }, { "cell_type": "code", "execution_count": null, "id": "ba8507e5", "metadata": {}, "outputs": [], "source": [ "from azure.ai.ml.entities import Data\n", "from azure.ai.ml.constants import AssetTypes\n", "\n", "input_test_data = \"fridge-items-images\"\n", "\n", "try:\n", " uri_folder_data_asset = ml_client.data.get(\n", " name=input_test_data, version=rai_example_version_string\n", " )\n", "except Exception:\n", " my_data = Data(\n", " path=dataset_dir,\n", " type=AssetTypes.URI_FOLDER,\n", " description=\"Fridge-items images\",\n", " name=input_test_data,\n", " version=rai_example_version_string,\n", " )\n", " uri_folder_data_asset = ml_client.data.create_or_update(my_data)\n", "print(uri_folder_data_asset)\n", "print(\"\")\n", "print(\"Path to folder in Blob Storage:\")\n", "print(uri_folder_data_asset.path)" ] }, { "cell_type": "markdown", "id": "4ade415d", "metadata": {}, "source": [ "## 2.3. Convert the downloaded data to JSONL\n", "\n", "In this example, the fridge object dataset is stored in a directory. There are four different folders inside:\n", "\n", "- /water_bottle\n", "- /milk_bottle\n", "- /carton\n", "- /can\n", "\n", "This is the most common data format for multiclass image classification. Each folder title corresponds to the image label for the images contained inside. In order to use this data to create an AzureML MLTable, we first need to convert it to the required JSONL format. Please refer to the [documentation on how to prepare datasets](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-prepare-datasets-for-automl-images).\n", "\n", "\n", "The following script is creating two .jsonl files (one for training and one for validation) in the corresponding MLTable folder. The train / validation ratio corresponds to 20% of the data going into the validation file." 
] }, { "cell_type": "code", "execution_count": null, "id": "b3eef3b6", "metadata": {}, "outputs": [], "source": [ "import json\n", "import os\n", "\n", "\n", "# We'll copy each JSONL file within its related MLTable folder\n", "training_mltable_path = os.path.join(dataset_parent_dir, \"training-mltable-folder\")\n", "validation_mltable_path = os.path.join(dataset_parent_dir, \"validation-mltable-folder\")\n", "\n", "# First, let's create the folders if they don't exist\n", "os.makedirs(training_mltable_path, exist_ok=True)\n", "os.makedirs(validation_mltable_path, exist_ok=True)\n", "\n", "train_validation_ratio = 5\n", "\n", "# Path to the training and validation files\n", "train_annotations_file = os.path.join(training_mltable_path, \"train_annotations.jsonl\")\n", "validation_annotations_file = os.path.join(\n", " validation_mltable_path, \"validation_annotations.jsonl\"\n", ")\n", "\n", "# Baseline of json line dictionary\n", "json_line_sample = {\n", " \"image_url\": uri_folder_data_asset.path,\n", " \"label\": \"\",\n", "}\n", "\n", "index = 0\n", "# Scan each sub directary and generate a jsonl line per image, distributed on train and valid JSONL files\n", "with open(train_annotations_file, \"w\") as train_f:\n", " with open(validation_annotations_file, \"w\") as validation_f:\n", " for class_name in os.listdir(dataset_dir):\n", " sub_dir = os.path.join(dataset_dir, class_name)\n", " if not os.path.isdir(sub_dir):\n", " continue\n", "\n", " # Scan each sub directary\n", " print(f\"Parsing {sub_dir}\")\n", " for image in os.listdir(sub_dir):\n", " json_line = dict(json_line_sample)\n", " json_line[\"image_url\"] += f\"{class_name}/{image}\"\n", " json_line[\"label\"] = class_name\n", "\n", " if index % train_validation_ratio == 0:\n", " # validation annotation\n", " validation_f.write(json.dumps(json_line) + \"\\n\")\n", " else:\n", " # train annotation\n", " train_f.write(json.dumps(json_line) + \"\\n\")\n", " index += 1" ] }, { "cell_type": "markdown", "id": "17d53df4", "metadata": {}, "source": [ "## 2.4 Create MLTable data input for training an AutomatedML Model\n", "\n", "Create MLTable data input using the jsonl files created above.\n", "\n", "For documentation on creating your own MLTable assets for jobs beyond this notebook, please refer to below resources\n", "- [MLTable YAML Schema](https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-mltable) - covers how to write MLTable YAML, which is required for each MLTable asset.\n", "- [Create MLTable data asset](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-data-assets?tabs=Python-SDK#create-a-mltable-data-asset) - covers how to create mltable data asset. 
" ] }, { "cell_type": "code", "execution_count": null, "id": "f39d2ea8", "metadata": {}, "outputs": [], "source": [ "def create_ml_table_file(filename):\n", " return (\n", " \"$schema: https://azureml/sdk-2-0/MLTable.json\\n\"\n", " \"type: mltable\\n\"\n", " \"paths:\\n\"\n", " \" - file: ./{0}\\n\"\n", " \"transformations:\\n\"\n", " \" - read_json_lines:\\n\"\n", " \" encoding: utf8\\n\"\n", " \" invalid_lines: error\\n\"\n", " \" include_path_column: false\\n\"\n", " \" - convert_column_types:\\n\"\n", " \" - columns: image_url\\n\"\n", " \" column_type: stream_info\"\n", " ).format(filename)\n", "\n", "\n", "def save_ml_table_file(output_path, mltable_file_contents):\n", " with open(os.path.join(output_path, \"MLTable\"), \"w\") as f:\n", " f.write(mltable_file_contents)\n", "\n", "\n", "# Create and save train mltable\n", "train_mltable_file_contents = create_ml_table_file(\n", " os.path.basename(train_annotations_file)\n", ")\n", "save_ml_table_file(training_mltable_path, train_mltable_file_contents)\n", "\n", "# Create and save validation mltable\n", "validation_mltable_file_contents = create_ml_table_file(\n", " os.path.basename(validation_annotations_file)\n", ")\n", "save_ml_table_file(validation_mltable_path, validation_mltable_file_contents)" ] }, { "cell_type": "code", "execution_count": null, "id": "9d251a68", "metadata": {}, "outputs": [], "source": [ "# Training MLTable defined locally, with local data to be uploaded\n", "my_training_data_input = Input(type=AssetTypes.MLTABLE, path=training_mltable_path)\n", "\n", "# Validation MLTable defined locally, with local data to be uploaded\n", "my_validation_data_input = Input(type=AssetTypes.MLTABLE, path=validation_mltable_path)\n", "\n", "# WITH REMOTE PATH: If available already in the cloud/workspace-blob-store\n", "# my_training_data_input = Input(type=AssetTypes.MLTABLE, path=\"azureml://datastores/workspaceblobstore/paths/vision-classification/train\")\n", "# my_validation_data_input = Input(type=AssetTypes.MLTABLE, path=\"azureml://datastores/workspaceblobstore/paths/vision-classification/valid\")" ] }, { "cell_type": "markdown", "id": "1115ac59", "metadata": {}, "source": [ "The label column contains the classes:" ] }, { "cell_type": "code", "execution_count": null, "id": "5b42df3d", "metadata": {}, "outputs": [], "source": [ "target_column_name = \"label\"" ] }, { "cell_type": "markdown", "id": "055288ec", "metadata": {}, "source": [ "# 3. Configure and run the AutoML for Images Classification-MultiClass training job\n", "\n", "Here, we are using Automatic hyperparameter sweeping for your models (AutoMode). For details on individual runs or manual hyper parameter sweep, refer to [automl-image-classification-multiclass-task-fridge-items.ipynb notebook](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs/automl-image-classification-multiclass-task-fridge-items/automl-image-classification-multiclass-task-fridge-items.ipynb).\n", "\n", "When using AutoML for Images, we can perform an automatic hyperparameter sweep to find the optimal model (we call this functionality AutoMode). The system will choose a model architecture and values for the learning_rate, number_of_epochs, training_batch_size, etc. based on the number of runs. There is no need to specify the hyperparameter search space, sampling method or early termination policy. 
"\n", "AutoMode is triggered by setting `max_trials` to a value greater than 1 in limits and by omitting the hyperparameter space, sampling method and termination policy.\n", "\n", "The following functions configure AutoML image jobs for automatic sweeps:\n", "### image_classification() function parameters:\n", "The `image_classification()` factory function allows the user to configure the training job.\n", "\n", "- `compute` - The compute on which the AutoML job will run. In this example we are using a compute called 'gpu-cluster-nc6-v3' present in the workspace. You can replace it with any other compute in the workspace.\n", "- `experiment_name` - The name of the experiment. An experiment is like a folder with multiple runs in Azure ML Workspace that should be related to the same logical machine learning experiment.\n", "- `name` - The name of the Job/Run. This is an optional property. If not specified, a random name will be generated.\n", "- `primary_metric` - The metric that AutoML will optimize for model selection.\n", "- `target_column_name` - The name of the column to target for predictions. It must always be specified. This parameter is applicable to 'training_data' and 'validation_data'.\n", "- `training_data` - The data to be used for training. It should contain both training feature columns and a target column. Optionally, this data can be split for segregating a validation or test dataset. \n", "You can use a registered MLTable in the workspace using the format '<mltable_name>:<version>' OR you can use a local file or folder as an MLTable, e.g. Input(mltable='my_mltable:1') OR Input(mltable=MLTable(local_path=\"./data\")).\n", "The parameter `training_data` must always be provided.\n", "\n", "### set_limits() function parameters:\n", "This is an optional configuration method to configure limits parameters such as timeouts.\n", "\n", "- `max_trials` - Parameter for maximum number of configurations to sweep. Must be an integer between 1 and 1000. When exploring just the default hyperparameters for a given model algorithm, set this parameter to 1. Default value is 1.\n", "- `max_concurrent_trials` - Maximum number of runs that can run concurrently. If not specified, all runs launch in parallel. If specified, must be an integer between 1 and 100. Default value is 1.\n", " NOTE: The number of concurrent runs is gated on the resources available in the specified compute target. Ensure that the compute target has the available resources for the desired concurrency.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "1733179c", "metadata": {}, "outputs": [], "source": [ "# set up experiment name\n", "exp_name = \"dpv2-image-classification-experiment\"" ] }, { "cell_type": "markdown", "id": "005a1098", "metadata": {}, "source": [ "This job uses serverless compute by default; a commented sketch for pinning a serverless VM size follows. To run it on the compute cluster you created above instead, uncomment the `compute` parameter line when creating the job."
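] }, { "cell_type": "markdown", "id": "d3e4f5a6", "metadata": {}, "source": [ "The commented sketch below shows one way to pin a VM size for the serverless path. It is an assumption based on the serverless-compute pattern of attaching a `ResourceConfiguration` (from `azure.ai.ml.entities`) to the job, not part of the original flow, and it can only run after `image_classification_job` is created in the next cell." ] }, { "cell_type": "code", "execution_count": null, "id": "e7f8a9b0", "metadata": {}, "outputs": [], "source": [ "# Optional serverless sizing (a sketch, left commented out): after creating\n", "# the AutoML job in the next cell, pin the VM size instead of naming a\n", "# compute cluster.\n", "# from azure.ai.ml.entities import ResourceConfiguration\n", "#\n", "# image_classification_job.resources = ResourceConfiguration(\n", "# instance_type=\"Standard_NC6s_v3\", instance_count=1\n", "# )"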
] }, { "cell_type": "code", "execution_count": null, "id": "9427a2e5", "metadata": {}, "outputs": [], "source": [ "# Create the AutoML job with the related factory-function.\n", "\n", "import random\n", "import string\n", "\n", "allowed_chars = string.ascii_lowercase + string.digits\n", "suffix = \"\".join(random.choice(allowed_chars) for x in range(5))\n", "job_name = \"dpv2-image-classification-job-02\" + suffix\n", "\n", "image_classification_job = automl.image_classification(\n", " # compute=train_compute_name,\n", " name=job_name,\n", " experiment_name=exp_name,\n", " training_data=my_training_data_input,\n", " validation_data=my_validation_data_input,\n", " target_column_name=\"label\",\n", " primary_metric=ClassificationPrimaryMetrics.ACCURACY,\n", " tags={\"my_custom_tag\": \"My custom value\"},\n", ")\n", "\n", "image_classification_job.set_limits(\n", " max_trials=10,\n", " max_concurrent_trials=2,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "8f0735dd", "metadata": {}, "outputs": [], "source": [ "# Submit the AutoML job\n", "returned_job = ml_client.jobs.create_or_update(\n", " image_classification_job\n", ") # submit the job to the backend\n", "\n", "print(f\"Created job: {returned_job}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "f2f41ca4", "metadata": {}, "outputs": [], "source": [ "ml_client.jobs.stream(returned_job.name)" ] }, { "cell_type": "markdown", "id": "3220d145", "metadata": {}, "source": [ "# 4. Retrieve the Best Trial (Best Model's trial/run) and Register the Best Model\n", "Use the MLFLowClient to access the results (such as Models, Artifacts, Metrics) of a previously completed AutoML Trial." ] }, { "cell_type": "markdown", "id": "3ff0484c", "metadata": {}, "source": [ "### 4.1 Initialize MLFlow Client\n", "\n", "The models and artifacts that are produced by AutoML can be accessed via the MLFlow interface.\n", "Initialize the MLFlow client here, and set the backend as Azure ML, via. 
"\n", "IMPORTANT: you need to have the latest MLflow packages installed:\n", "\n", " pip install azureml-mlflow\n", "\n", " pip install mlflow" ] }, { "cell_type": "code", "execution_count": null, "id": "e3029324", "metadata": {}, "outputs": [], "source": [ "import mlflow\n", "\n", "# Obtain the tracking URI from the MLClient\n", "MLFLOW_TRACKING_URI = ml_client.workspaces.get(\n", " name=ml_client.workspace_name\n", ").mlflow_tracking_uri\n", "\n", "print(MLFLOW_TRACKING_URI)" ] }, { "cell_type": "code", "execution_count": null, "id": "868dd2ac", "metadata": {}, "outputs": [], "source": [ "# Set the MLFLOW TRACKING URI\n", "mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)\n", "print(f\"\\nCurrent tracking uri: {mlflow.get_tracking_uri()}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "320db110", "metadata": {}, "outputs": [], "source": [ "from mlflow.tracking.client import MlflowClient\n", "\n", "# Initialize MLflow client\n", "mlflow_client = MlflowClient()" ] }, { "cell_type": "code", "execution_count": null, "id": "a7f791cb", "metadata": {}, "outputs": [], "source": [ "job_name = returned_job.name\n", "\n", "# # Example if providing a specific Job name/ID\n", "# job_name = \"happy_yam_40fq53m7c2\" #\"ashy_net_gdd31zf2fq\"\n", "\n", "# Get the parent run\n", "mlflow_parent_run = mlflow_client.get_run(job_name)\n", "\n", "print(\"Parent Run: \")\n", "print(mlflow_parent_run)" ] }, { "cell_type": "code", "execution_count": null, "id": "7914c06e", "metadata": {}, "outputs": [], "source": [ "# Print parent run tags. 'automl_best_child_run_id' tag should be there.\n", "print(mlflow_parent_run.data.tags.keys())" ] }, { "cell_type": "markdown", "id": "e930dd0c", "metadata": {}, "source": [ "### 4.2 Get the AutoML best child run" ] }, { "cell_type": "code", "execution_count": null, "id": "6937627a", "metadata": {}, "outputs": [], "source": [ "# Get the best model's child run\n", "\n", "best_child_run_id = mlflow_parent_run.data.tags[\"automl_best_child_run_id\"]\n", "print(f\"Found best child run id: {best_child_run_id}\")\n", "\n", "best_run = mlflow_client.get_run(best_child_run_id)\n", "\n", "print(\"Best child run: \")\n", "print(best_run)" ] }, { "cell_type": "code", "execution_count": null, "id": "7afd65b8", "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "hyperparameter_tag_dict = json.loads(best_run.data.tags[\"hyperparameters\"])\n", "print(hyperparameter_tag_dict)" ] }, { "cell_type": "code", "execution_count": null, "id": "8886474a", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "# Access the results (such as Models, Artifacts, Metrics) of a previously completed AutoML Run.\n", "pd.DataFrame(best_run.data.metrics, index=[0]).T" ] }, { "cell_type": "markdown", "id": "024e5458", "metadata": {}, "source": [ "### 4.3 Download the best model locally\n", "Access the results (such as Models, Artifacts, Metrics) of a previously completed AutoML Run."
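] }, { "cell_type": "markdown", "id": "f1a2b3c4", "metadata": {}, "source": [ "The cells below use `MlflowClient.download_artifacts`. On newer MLflow versions, where that client method may be deprecated or removed, the module-level `mlflow.artifacts.download_artifacts` function is the equivalent call; a commented sketch is shown below." ] }, { "cell_type": "code", "execution_count": null, "id": "a5b6c7d8", "metadata": {}, "outputs": [], "source": [ "# Alternative for newer MLflow versions (a sketch; the cells below use the\n", "# MlflowClient method instead):\n", "# local_path = mlflow.artifacts.download_artifacts(\n", "# run_id=best_run.info.run_id, artifact_path=\"outputs\", dst_path=\"./artifact_downloads\"\n", "# )"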
] }, { "cell_type": "code", "execution_count": null, "id": "315bfb05", "metadata": {}, "outputs": [], "source": [ "# Create local folder\n", "import os\n", "\n", "local_dir = \"./artifact_downloads\"\n", "if not os.path.exists(local_dir):\n", " os.mkdir(local_dir)" ] }, { "cell_type": "code", "execution_count": null, "id": "2da37b17", "metadata": {}, "outputs": [], "source": [ "# Download run's artifacts/outputs\n", "local_path = mlflow_client.download_artifacts(\n", " best_run.info.run_id, \"outputs\", local_dir\n", ")\n", "print(f\"Artifacts downloaded in: {local_path}\")\n", "print(f\"Artifacts: {os.listdir(local_path)}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "41956fba", "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "mlflow_model_dir = os.path.join(local_dir, \"outputs\", \"mlflow-model\")\n", "\n", "# Show the contents of the MLFlow model folder\n", "os.listdir(mlflow_model_dir)\n", "\n", "# You should see a list of files such as the following:\n", "# ['artifacts', 'conda.yaml', 'MLmodel', 'python_env.yaml', 'python_model.pkl', 'requirements.txt']" ] }, { "cell_type": "markdown", "id": "940b5239", "metadata": {}, "source": [ "### 4.4 Register model" ] }, { "cell_type": "code", "execution_count": null, "id": "bf1c5a6b", "metadata": {}, "outputs": [], "source": [ "model_name = \"ic-mc-rai-fridge-items-model\" + suffix\n", "model = Model(\n", " path=f\"azureml://jobs/{best_run.info.run_id}/outputs/artifacts/outputs/mlflow-model/\",\n", " name=model_name,\n", " description=\"my sample image classification multiclass model\",\n", " type=AssetTypes.MLFLOW_MODEL,\n", ")\n", "\n", "# for downloaded file\n", "# model = Model(\n", "# path=mlflow_model_dir,\n", "# name=model_name,\n", "# description=\"my sample image classification multiclass model\",\n", "# type=AssetTypes.MLFLOW_MODEL,\n", "# )\n", "\n", "registered_model = ml_client.models.create_or_update(model)" ] }, { "cell_type": "code", "execution_count": null, "id": "5e9f147e", "metadata": {}, "outputs": [], "source": [ "registered_model.id" ] }, { "cell_type": "markdown", "id": "6d165e2b", "metadata": {}, "source": [ "We need a compute target on which to run our jobs. The following checks whether the compute specified above is present; if not, then the compute target is created." ] }, { "cell_type": "code", "execution_count": null, "id": "d64dff2b", "metadata": {}, "outputs": [], "source": [ "print(registered_model.name, registered_model.version)" ] }, { "cell_type": "markdown", "id": "0722395e", "metadata": {}, "source": [ "# 5. Creating the RAI Vision Insights\n", "\n", "Now that we have our model, we can generate RAI Vision insights for it. 
] }, { "cell_type": "code", "execution_count": null, "id": "7d3e6e6e", "metadata": {}, "outputs": [], "source": [ "expected_model_id = f\"{registered_model.name}:{registered_model.version}\"\n", "azureml_model_id = f\"azureml:{expected_model_id}\"" ] }, { "cell_type": "markdown", "id": "310aa659", "metadata": {}, "source": [ "Next, we load the RAI components, so that we can construct a pipeline:" ] }, { "cell_type": "code", "execution_count": null, "id": "d67b942e", "metadata": {}, "outputs": [], "source": [ "registry_name = \"azureml\"\n", "credential = DefaultAzureCredential()\n", "\n", "ml_client_registry = MLClient(\n", " credential=credential,\n", " subscription_id=ml_client.subscription_id,\n", " resource_group_name=ml_client.resource_group_name,\n", " # workspace_name=ml_client.workspace_name,\n", " registry_name=registry_name,\n", ")\n", "\n", "rai_vision_insights_component = ml_client_registry.components.get(\n", " name=\"rai_vision_insights\", label=\"latest\"\n", ")" ] }, { "cell_type": "markdown", "id": "c98cd2d9", "metadata": {}, "source": [ "## 5.1 Constructing the pipeline in the SDK\n", "We can now specify our pipeline. Complex objects (such as lists of column names) have to be converted to JSON strings before being passed to the components.\n", "\n", "\n", "Note:\n", "1. guided_gradcam doesn't work with transformer vision models\n", "2. shap isn't supported for AutoML image models\n", "\n", "For more details on XAI parameters, refer to this [page](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?tabs=cli#generate-explanations-for-predictions)." ] }, { "cell_type": "code", "execution_count": null, "id": "5d4c5514", "metadata": {}, "outputs": [], "source": [ "# Prepare XAI parameters\n", "xai_algorithm = (\n", " \"guided_gradcam\" # xrai, integrated_gradients, guided_gradcam, guided_backprop\n", ")\n", "n_steps = 50 # applicable for xrai, integrated_gradients\n", "xrai_fast = True # applicable for xrai\n", "approximation_method = \"gausslegendre\" # applicable for integrated_gradients\n", "confidence_score_threshold_multilabel = 0.5 # applicable for multilabel classification\n", "\n", "# Note: the latest AutoML wraps the model_name property in 'model'\n", "if \"model\" in hyperparameter_tag_dict:\n", " hyperparameter_tag_dict = hyperparameter_tag_dict[\"model\"]\n", "\n", "if \"model_name\" in hyperparameter_tag_dict:\n", " model_name = hyperparameter_tag_dict[\"model_name\"]\n", " if \"vit\" in model_name:\n", " # guided_gradcam doesn't work with transformer vision models\n", " # override defaults\n", " xai_algorithm = \"xrai\" # xrai, integrated_gradients, guided_backprop\n", " n_steps = 50 # applicable for xrai, integrated_gradients\n", " xrai_fast = True # applicable for xrai\n", " approximation_method = \"gausslegendre\" # applicable for integrated_gradients\n", " confidence_score_threshold_multilabel = (\n", " 0.5 # applicable for multilabel classification\n", " )" ] }, { "cell_type": "code", "execution_count": null, "id": "a62105a7", "metadata": {}, "outputs": [], "source": [ "import json\n", "from azure.ai.ml import Input\n", "from azure.ai.ml.constants import AssetTypes\n", "\n", "\n", "@dsl.pipeline(\n", " compute=rai_compute_name,\n", " description=\"Example RAI computation on Fridge data\",\n", " experiment_name=\"RAI_Fridge_Example_RAIInsights_Computation\",\n", ")\n", "def rai_fridge_image_classification_pipeline(target_column_name, test_data, classes):\n", " # Initiate the RAIInsights\n",
" rai_image_job = rai_vision_insights_component(\n", " model_input=Input(type=AssetTypes.MLFLOW_MODEL, path=azureml_model_id),\n", " test_dataset=test_data,\n", " task_type=\"image_classification\",\n", " model_info=expected_model_id,\n", " target_column_name=target_column_name,\n", " classes=classes,\n", " dataset_type=\"private\",\n", " model_type=\"pyfunc\",\n", " precompute_explanation=True,\n", " enable_error_analysis=True,\n", " xai_algorithm=xai_algorithm,\n", " n_steps=n_steps,\n", " xrai_fast=xrai_fast,\n", " approximation_method=approximation_method,\n", " confidence_score_threshold_multilabel=confidence_score_threshold_multilabel,\n", " )\n", " rai_image_job.set_limits(timeout=7200)\n", "\n", " rai_image_job.outputs.dashboard.mode = \"upload\"\n", " rai_image_job.outputs.ux_json.mode = \"upload\"\n", "\n", " return {\n", " \"dashboard\": rai_image_job.outputs.dashboard,\n", " \"ux_json\": rai_image_job.outputs.ux_json,\n", " }" ] }, { "cell_type": "markdown", "id": "6b5b14a9", "metadata": {}, "source": [ "Next, we define the pipeline object itself, and ensure that the outputs will be available for download:" ] }, { "cell_type": "code", "execution_count": null, "id": "e4d86ec2", "metadata": {}, "outputs": [], "source": [ "import uuid\n", "from azure.ai.ml import Output\n", "\n", "insights_pipeline_job = rai_fridge_image_classification_pipeline(\n", " target_column_name=target_column_name,\n", " test_data=my_validation_data_input,\n", " classes=\"[]\",\n", ")\n", "\n", "rand_path = str(uuid.uuid4())\n", "insights_pipeline_job.outputs.dashboard = Output(\n", " path=f\"azureml://datastores/workspaceblobstore/paths/{rand_path}/dashboard/\",\n", " mode=\"upload\",\n", " type=\"uri_folder\",\n", ")\n", "insights_pipeline_job.outputs.ux_json = Output(\n", " path=f\"azureml://datastores/workspaceblobstore/paths/{rand_path}/ux_json/\",\n", " mode=\"upload\",\n", " type=\"uri_folder\",\n", ")" ] }, { "cell_type": "markdown", "id": "25f34573", "metadata": {}, "source": [ "And submit the pipeline to AzureML for execution:" ] }, { "cell_type": "code", "execution_count": null, "id": "9b23ab85", "metadata": {}, "outputs": [], "source": [ "import time\n", "from azure.ai.ml.entities import PipelineJob\n", "\n", "\n", "def submit_and_wait(ml_client, pipeline_job) -> PipelineJob:\n", " created_job = ml_client.jobs.create_or_update(pipeline_job)\n", " assert created_job is not None\n", "\n", " while created_job.status not in [\n", " \"Completed\",\n", " \"Failed\",\n", " \"Canceled\",\n", " \"NotResponding\",\n", " ]:\n", " time.sleep(30)\n", " created_job = ml_client.jobs.get(created_job.name)\n", " print(\"Latest status: {0}\".format(created_job.status))\n", " assert created_job.status == \"Completed\"\n", " return created_job" ] }, { "cell_type": "code", "execution_count": null, "id": "2ca757f7", "metadata": {}, "outputs": [], "source": [ "insights_job = submit_and_wait(ml_client, insights_pipeline_job)" ] }, { "cell_type": "code", "execution_count": null, "id": "ce376147", "metadata": {}, "outputs": [], "source": [ "insights_job" ] }, { "cell_type": "markdown", "id": "1381768a", "metadata": {}, "source": [ "The dashboard should appear in the AzureML portal in the registered model view. The following cell computes the expected URI:"
] }, { "cell_type": "code", "execution_count": null, "id": "e86ab611", "metadata": {}, "outputs": [], "source": [ "sub_id = ml_client._operation_scope.subscription_id\n", "rg_name = ml_client._operation_scope.resource_group_name\n", "ws_name = ml_client.workspace_name\n", "\n", "expected_uri = f\"https://ml.azure.com/model/{expected_model_id}/model_analysis?wsid=/subscriptions/{sub_id}/resourcegroups/{rg_name}/workspaces/{ws_name}\"\n", "\n", "print(f\"Please visit {expected_uri} to see your analysis\")" ] }, { "cell_type": "markdown", "id": "93a8dff9", "metadata": {}, "source": [ "## 5.2 Constructing the pipeline in YAML\n", "\n", "It is also possible to specify the pipeline as a YAML file and submit it using the command line. We will now create a YAML specification of the above pipeline and submit it:" ] }, { "cell_type": "code", "execution_count": null, "id": "7bcb4471", "metadata": {}, "outputs": [], "source": [ "my_validation_data_input" ] }, { "cell_type": "code", "execution_count": null, "id": "624bb0cd", "metadata": {}, "outputs": [], "source": [ "yaml_contents = f\"\"\"\n", "$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json\n", "experiment_name: AML_RAI_Vision_Sample_{rai_example_version_string}_From_YAML\n", "type: pipeline\n", "\n", "compute: azureml:cpucluster\n", "\n", "inputs:\n", " registered_model_name: {registered_model.name}\n", " vision_model_info: {expected_model_id}\n", " dataset_type: private\n", " my_test_data:\n", " type: mltable\n", " path: {my_validation_data_input[\"path\"]}\n", " mode: download\n", "\n", "settings:\n", " default_datastore: azureml:workspaceblobstore\n", " default_compute: azureml:cpucluster\n", " continue_on_step_failure: false\n", "\n", "jobs:\n", " analyse_model:\n", " type: command\n", " component: azureml://registries/azureml-preview/components/rai_vision_insights/versions/{version_string}\n", " inputs:\n", " task_type: image_classification\n", " model_input:\n", " type: mlflow_model\n", " path: {azureml_model_id}\n", " model_info: ${{{{parent.inputs.vision_model_info}}}}\n", " test_dataset:\n", " type: mltable\n", " path: ${{{{parent.inputs.my_test_data}}}}\n", " dataset_type: ${{{{parent.inputs.dataset_type}}}}\n", " target_column_name: {target_column_name}\n", " maximum_rows_for_test_dataset: 5000\n", " classes: '[]'\n", " precompute_explanation: True\n", " model_type: pyfunc\n", " xai_algorithm: {xai_algorithm}\n", " n_steps: {n_steps}\n", " xrai_fast: {xrai_fast}\n", " approximation_method: {approximation_method}\n", " confidence_score_threshold_multilabel: {confidence_score_threshold_multilabel}\n", "\"\"\"\n", "\n", "yaml_pipeline_filename = \"rai_automl_vision_example.yaml\"\n", "\n", "with open(yaml_pipeline_filename, \"w\") as f:\n", " f.write(yaml_contents)" ] }, { "cell_type": "markdown", "id": "1fd5f2dd", "metadata": {}, "source": [ "The created file can then be submitted using the Azure CLI:" ] }, { "cell_type": "code", "execution_count": null, "id": "3bf9bb1c", "metadata": {}, "outputs": [], "source": [ "cmd_line = [\n", " \"az\",\n", " \"ml\",\n", " \"job\",\n", " \"create\",\n", " \"--resource-group\",\n", " rg_name,\n", " \"--workspace-name\",\n", " ws_name,\n", " \"--file\",\n", " yaml_pipeline_filename,\n", "]\n", "\n", "import subprocess\n", "\n", "try:\n", " # Pass the argument list directly (without shell=True) so each element\n", " # reaches the az CLI intact\n", " cmd = subprocess.run(cmd_line, check=True, capture_output=True)\n", "except subprocess.CalledProcessError as cpe:\n", " print(f\"Error invoking: {cpe.args}\")\n", " print(cpe.stdout)\n", " print(cpe.stderr)\n", " raise\n", "else:\n", " print(\"Azure CLI submission completed\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.10 - SDK V2", "language": "python", "name": "python310-sdkv2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.15" } }, "nbformat": 4, "nbformat_minor": 5 }