sdk/python/responsible-ai/vision/responsibleaidashboard-image-classification-fridge.ipynb (996 lines of code) (raw):
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "98605bcd",
"metadata": {},
"source": [
"# Image Classification scenario with RAI Dashboard\n",
"\n",
"The [fridge dataset](https://github.com/microsoft/computervision-recipes/tree/master/scenarios/classification) classifies images into four types of items commonly found in the Microsoft New England R&D office refrigerator - carton, water bottle, can and milk bottle. This example notebook demonstrates how to use a fine-tuned fastai computer vision model on the dataset to evaluate the model in AzureML.\n",
"\n",
"First, we need to specify the version of the RAI components which are available in the workspace. This was specified when the components were uploaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "53b4eeac",
"metadata": {},
"outputs": [],
"source": [
"version_string = \"0.0.21\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "06008690",
"metadata": {},
"source": [
"We also need to give the name of the compute cluster we want to use in AzureML. Later in this notebook, we will create it if it does not already exist:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f1ad79f9",
"metadata": {},
"outputs": [],
"source": [
"compute_name = \"cpucluster\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9fc65dc7",
"metadata": {},
"source": [
"Finally, we need to specify a version for the data and components we will create while running this notebook. This should be unique for the workspace, but the specific value doesn't matter:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "78053935",
"metadata": {},
"outputs": [],
"source": [
"rai_example_version_string = \"22\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "73be2b63",
"metadata": {},
"source": [
"## Accessing the Data\n",
"\n",
"We supply the data as a pair of parquet files and accompanying `MLTable` file. We can download them, preprocess them, and take a brief look:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f875f18",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import pandas as pd\n",
"\n",
"try:\n",
" from urllib import urlretrieve\n",
"except ImportError:\n",
" from urllib.request import urlretrieve"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "8f9ce4ae",
"metadata": {},
"source": [
"First we download the fridge dataset:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0eac648c",
"metadata": {},
"outputs": [],
"source": [
"def download_fridge_dataset(data_path, annotations_file):\n",
" os.makedirs(data_path, exist_ok=True)\n",
"\n",
" # download data\n",
" base_url = \"https://publictestdatasets.blob.core.windows.net/\"\n",
" fridge_folder = \"computervision/fridgeObjects/\"\n",
"\n",
" data_url = base_url + fridge_folder + annotations_file\n",
" data_output_path = os.path.join(data_path, annotations_file)\n",
" urlretrieve(data_url, filename=data_output_path)\n",
"\n",
"\n",
"test_annotations = \"test_annotations.jsonl\"\n",
"test_data_path = \"fridge_test_data\"\n",
"download_fridge_dataset(test_data_path, test_annotations)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "17d53df4",
"metadata": {},
"source": [
"Now create the mltable:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f39d2ea8",
"metadata": {},
"outputs": [],
"source": [
"def create_ml_table_file(filename):\n",
" return (\n",
" \"$schema: http://azureml/sdk-2-0/MLTable.json\\n\"\n",
" \"type: mltable\\n\"\n",
" \"paths:\\n\"\n",
" \" - file: ./{0}\\n\"\n",
" \"transformations:\\n\"\n",
" \" - read_json_lines:\\n\"\n",
" \" encoding: utf8\\n\"\n",
" \" invalid_lines: error\\n\"\n",
" \" include_path_column: false\\n\"\n",
" ).format(filename)\n",
"\n",
"\n",
"def save_ml_table_file(output_path, ml_table_data):\n",
" mltable_file_contents = create_ml_table_file(ml_table_data)\n",
" with open(os.path.join(output_path, \"MLTable\"), \"w\") as f:\n",
" f.write(mltable_file_contents)\n",
"\n",
"\n",
"save_ml_table_file(test_data_path, test_annotations)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a2c4ebb4",
"metadata": {},
"source": [
"Load some data for a quick view:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1027fa92",
"metadata": {},
"outputs": [],
"source": [
"import mltable\n",
"\n",
"tbl = mltable.load(test_data_path)\n",
"test_df: pd.DataFrame = tbl.to_pandas_dataframe()\n",
"\n",
"display(test_df)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1115ac59",
"metadata": {},
"source": [
"The label column contains the classes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b42df3d",
"metadata": {},
"outputs": [],
"source": [
"target_column_name = \"label\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "52e79b04",
"metadata": {},
"source": [
"First, we need to upload the datasets to our workspace. We start by creating an `MLClient` for interactions with AzureML:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "180ad111",
"metadata": {},
"outputs": [],
"source": [
"# Enter details of your AML workspace\n",
"subscription_id = \"<SUBSCRIPTION_ID>\"\n",
"resource_group = \"<RESOURCE_GROUP>\"\n",
"workspace = \"<AML_WORKSPACE_NAME>\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "395435fc",
"metadata": {},
"outputs": [],
"source": [
"# Handle to the workspace\n",
"from azure.ai.ml import MLClient\n",
"from azure.identity import DefaultAzureCredential\n",
"\n",
"try:\n",
" credential = DefaultAzureCredential()\n",
" ml_client = MLClient(\n",
" credential=credential,\n",
" subscription_id=subscription_id,\n",
" resource_group_name=resource_group,\n",
" workspace_name=workspace,\n",
" )\n",
"except Exception:\n",
" # If in compute instance we can get the config automatically\n",
" from azureml.core import Workspace\n",
"\n",
" workspace = Workspace.from_config()\n",
" workspace.write_config()\n",
" ml_client = MLClient.from_config(\n",
" credential=DefaultAzureCredential(exclude_shared_token_cache_credential=True),\n",
" logging_enable=True,\n",
" )\n",
"\n",
"print(ml_client)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7b501735",
"metadata": {},
"source": [
"We can now upload the data to AzureML:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62eb02a2",
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.entities import Data\n",
"from azure.ai.ml.constants import AssetTypes\n",
"\n",
"input_test_data = \"Fridge_Test_MLTable\"\n",
"\n",
"try:\n",
" test_data = ml_client.data.get(\n",
" name=input_test_data,\n",
" version=rai_example_version_string,\n",
" )\n",
"except Exception:\n",
" test_data = Data(\n",
" path=test_data_path,\n",
" type=AssetTypes.MLTABLE,\n",
" description=\"RAI Fridge test data\",\n",
" name=input_test_data,\n",
" version=rai_example_version_string,\n",
" )\n",
" ml_client.data.create_or_update(test_data)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6815ba75",
"metadata": {},
"source": [
"# Creating the Model\n",
"\n",
"To simplify the model creation process, we're going to use a pipeline.\n",
"\n",
"We create a directory for the training script:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e78d869b",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.makedirs(\"fridge_component_src\", exist_ok=True)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ea86e55d",
"metadata": {},
"source": [
"Next, we write out our script to retrieve the trained model:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a523f144",
"metadata": {},
"outputs": [],
"source": [
"%%writefile fridge_component_src/training_script.py\n",
"\n",
"import argparse\n",
"import logging\n",
"import json\n",
"import os\n",
"import time\n",
"\n",
"\n",
"import mlflow\n",
"import mlflow.pyfunc\n",
"\n",
"from azureml.core import Run\n",
"\n",
"from fastai.learner import load_learner\n",
"\n",
"from raiutils.common.retries import retry_function\n",
"\n",
"try:\n",
" from urllib import urlretrieve\n",
"except ImportError:\n",
" from urllib.request import urlretrieve\n",
"\n",
"_logger = logging.getLogger(__file__)\n",
"logging.basicConfig(level=logging.INFO)\n",
"\n",
"FRIDGE_MODEL_NAME = 'fridge_model'\n",
"\n",
"\n",
"def parse_args():\n",
" # setup arg parser\n",
" parser = argparse.ArgumentParser()\n",
"\n",
" # add arguments\n",
" parser.add_argument(\n",
" \"--model_output_path\", type=str, help=\"Path to write model info JSON\"\n",
" )\n",
" parser.add_argument(\n",
" \"--model_base_name\", type=str, help=\"Name of the registered model\"\n",
" )\n",
" parser.add_argument(\n",
" \"--model_name_suffix\", type=int, help=\"Set negative to use epoch_secs\"\n",
" )\n",
" parser.add_argument(\n",
" \"--device\", type=int, help=(\n",
" \"Device for CPU/GPU supports. Setting this to -1 will leverage \"\n",
" \"CPU, >=0 will run the model on the associated CUDA device id.\")\n",
" )\n",
"\n",
" # parse args\n",
" args = parser.parse_args()\n",
"\n",
" # return args\n",
" return args\n",
"\n",
"\n",
"class FetchModel(object):\n",
" def __init__(self):\n",
" pass\n",
"\n",
" def fetch(self):\n",
" url = ('https://publictestdatasets.blob.core.windows.net/models/' +\n",
" FRIDGE_MODEL_NAME)\n",
" urlretrieve(url, FRIDGE_MODEL_NAME)\n",
"\n",
"\n",
"def main(args):\n",
" current_experiment = Run.get_context().experiment\n",
" tracking_uri = current_experiment.workspace.get_mlflow_tracking_uri()\n",
" _logger.info(\"tracking_uri: {0}\".format(tracking_uri))\n",
" mlflow.set_tracking_uri(tracking_uri)\n",
" mlflow.set_experiment(current_experiment.name)\n",
"\n",
" _logger.info(\"Getting device\")\n",
" device = args.device\n",
"\n",
" _logger.info(\"Loading parquet input\")\n",
"\n",
" # Load the fridge fastai model\n",
" fetcher = FetchModel()\n",
" action_name = \"Model download\"\n",
" err_msg = \"Failed to download model\"\n",
" max_retries = 4\n",
" retry_delay = 60\n",
" retry_function(fetcher.fetch, action_name, err_msg,\n",
" max_retries=max_retries,\n",
" retry_delay=retry_delay)\n",
" model = load_learner(FRIDGE_MODEL_NAME)\n",
"\n",
" if device >= 0:\n",
" model = model.cuda()\n",
"\n",
" if args.model_name_suffix < 0:\n",
" suffix = int(time.time())\n",
" else:\n",
" suffix = args.model_name_suffix\n",
" registered_name = \"{0}_{1}\".format(args.model_base_name, suffix)\n",
" _logger.info(f\"Registering model as {registered_name}\")\n",
"\n",
" # Saving model with mlflow\n",
" _logger.info(\"Saving with mlflow\")\n",
"\n",
" mlflow.fastai.log_model(\n",
" model,\n",
" artifact_path=registered_name,\n",
" registered_model_name=registered_name\n",
" )\n",
"\n",
" _logger.info(\"Writing JSON\")\n",
" dict = {\"id\": \"{0}:1\".format(registered_name)}\n",
" output_path = os.path.join(args.model_output_path, \"model_info.json\")\n",
" with open(output_path, \"w\") as of:\n",
" json.dump(dict, fp=of)\n",
"\n",
"\n",
"# run script\n",
"if __name__ == \"__main__\":\n",
" # add space in logs\n",
" print(\"*\" * 60)\n",
" print(\"\\n\\n\")\n",
"\n",
" # parse args\n",
" args = parse_args()\n",
"\n",
" # run main function\n",
" main(args)\n",
"\n",
" # add space in logs\n",
" print(\"*\" * 60)\n",
" print(\"\\n\\n\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e115dd6e",
"metadata": {},
"source": [
"Now, we can build this into an AzureML component:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3d54e43f",
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml import load_component\n",
"\n",
"yaml_contents = f\"\"\"\n",
"$schema: http://azureml/sdk-2-0/CommandComponent.json\n",
"name: rai_fridge_training_component\n",
"display_name: Fridge training component for RAI example\n",
"version: {rai_example_version_string}\n",
"type: command\n",
"inputs:\n",
" model_base_name:\n",
" type: string\n",
" model_name_suffix: # Set negative to use epoch_secs\n",
" type: integer\n",
" default: -1\n",
" device: # set to >= 0 to use GPU\n",
" type: integer\n",
" default: 0\n",
"outputs:\n",
" model_output_path:\n",
" type: path\n",
"code: ./fridge_component_src/\n",
"environment: azureml://registries/azureml/environments/responsibleai-vision/versions/15\n",
"command: >-\n",
" python training_script.py\n",
" --model_base_name ${{{{inputs.model_base_name}}}}\n",
" --model_name_suffix ${{{{inputs.model_name_suffix}}}}\n",
" --device ${{{{inputs.device}}}}\n",
" --model_output_path ${{{{outputs.model_output_path}}}}\n",
"\"\"\"\n",
"\n",
"yaml_filename = \"FridgeVisionTrainingComp.yaml\"\n",
"\n",
"with open(yaml_filename, \"w\") as f:\n",
" f.write(yaml_contents)\n",
"\n",
"train_component_definition = load_component(source=yaml_filename)\n",
"\n",
"ml_client.components.create_or_update(train_component_definition)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6d165e2b",
"metadata": {},
"source": [
"We need a compute target on which to run our jobs. The following checks whether the compute specified above is present; if not, then the compute target is created."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1e40fc38",
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.entities import AmlCompute\n",
"\n",
"all_compute_names = [x.name for x in ml_client.compute.list()]\n",
"\n",
"if compute_name in all_compute_names:\n",
" print(f\"Found existing compute: {compute_name}\")\n",
"else:\n",
" my_compute = AmlCompute(\n",
" name=compute_name,\n",
" size=\"STANDARD_DS3_V2\",\n",
" min_instances=0,\n",
" max_instances=4,\n",
" idle_time_before_scale_down=3600,\n",
" )\n",
" ml_client.compute.begin_create_or_update(my_compute)\n",
" print(\"Initiated compute creation\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9d8eb868",
"metadata": {},
"source": [
"## Running a training pipeline\n",
"\n",
"Now that we have our training component, we can run it. We begin by generating a unique name for the model."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ad76242b",
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"model_base_name = \"fridge_model\"\n",
"model_name_suffix = \"12492\"\n",
"device = -1"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "d49615a7",
"metadata": {},
"source": [
"Next, we define our training pipeline. This has two components. The first is the training component which we defined above. The second is a component to register the model in AzureML:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb6c6cec",
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml import dsl, Input\n",
"\n",
"train_model_component = ml_client.components.get(\n",
" name=\"rai_fridge_training_component\", version=rai_example_version_string\n",
")\n",
"\n",
"\n",
"@dsl.pipeline(\n",
" compute=compute_name,\n",
" description=\"Register Model for RAI Fridge example\",\n",
" experiment_name=f\"RAI_Fridge_Example_Model_Training_{model_name_suffix}\",\n",
")\n",
"def my_training_pipeline(model_base_name, model_name_suffix, device):\n",
" trained_model = train_component_definition(\n",
" model_base_name=model_base_name,\n",
" model_name_suffix=model_name_suffix,\n",
" device=device,\n",
" )\n",
" trained_model.set_limits(timeout=3600)\n",
"\n",
" return {}\n",
"\n",
"\n",
"model_registration_pipeline_job = my_training_pipeline(\n",
" model_base_name, model_name_suffix, device\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "2fa66ea6",
"metadata": {},
"source": [
"With the training pipeline defined, we can submit it for execution in AzureML. We define a helper function to wait for the job to complete:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f854eef5",
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.entities import PipelineJob\n",
"\n",
"\n",
"def submit_and_wait(ml_client, pipeline_job) -> PipelineJob:\n",
" created_job = ml_client.jobs.create_or_update(pipeline_job)\n",
" assert created_job is not None\n",
"\n",
" while created_job.status not in [\n",
" \"Completed\",\n",
" \"Failed\",\n",
" \"Canceled\",\n",
" \"NotResponding\",\n",
" ]:\n",
" time.sleep(30)\n",
" created_job = ml_client.jobs.get(created_job.name)\n",
" print(\"Latest status : {0}\".format(created_job.status))\n",
" assert created_job.status == \"Completed\"\n",
" return created_job\n",
"\n",
"\n",
"# This is the actual submission\n",
"training_job = submit_and_wait(ml_client, model_registration_pipeline_job)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "0722395e",
"metadata": {},
"source": [
"## Creating the RAI Vision Insights\n",
"\n",
"Now that we have our model, we can generate RAI Vision insights for it. We will need the `id` of the registered model, which will be as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d3e6e6e",
"metadata": {},
"outputs": [],
"source": [
"expected_model_id = f\"{model_base_name}_{model_name_suffix}:1\"\n",
"azureml_model_id = f\"azureml:{expected_model_id}\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "310aa659",
"metadata": {},
"source": [
"Next, we load the RAI components, so that we can construct a pipeline:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d67b942e",
"metadata": {},
"outputs": [],
"source": [
"fridge_test_mltable = Input(\n",
" type=\"mltable\",\n",
" path=f\"{input_test_data}:{rai_example_version_string}\",\n",
" mode=\"download\",\n",
")\n",
"\n",
"registry_name = \"azureml\"\n",
"credential = DefaultAzureCredential()\n",
"\n",
"ml_client_registry = MLClient(\n",
" credential=credential,\n",
" subscription_id=ml_client.subscription_id,\n",
" resource_group_name=ml_client.resource_group_name,\n",
" registry_name=registry_name,\n",
")\n",
"\n",
"rai_vision_insights_component = ml_client_registry.components.get(\n",
" name=\"rai_vision_insights\", version=version_string\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "c98cd2d9",
"metadata": {},
"source": [
"We can now specify our pipeline. Complex objects (such as lists of column names) have to be converted to JSON strings before being passed to the components."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a62105a7",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"from azure.ai.ml import Input\n",
"from azure.ai.ml.constants import AssetTypes\n",
"\n",
"\n",
"@dsl.pipeline(\n",
" compute=compute_name,\n",
" description=\"Example RAI computation on Fridge data\",\n",
" experiment_name=f\"RAI_Fridge_Example_RAIInsights_Computation_{model_name_suffix}\",\n",
")\n",
"def rai_fridge_image_classification_pipeline(\n",
" target_column_name,\n",
" test_data,\n",
" classes,\n",
" use_model_dependency,\n",
"):\n",
" # Initiate the RAIInsights\n",
" rai_image_job = rai_vision_insights_component(\n",
" task_type=\"image_classification\",\n",
" model_info=expected_model_id,\n",
" model_input=Input(type=AssetTypes.MLFLOW_MODEL, path=azureml_model_id),\n",
" test_dataset=test_data,\n",
" target_column_name=target_column_name,\n",
" classes=classes,\n",
" model_type=\"fastai\",\n",
" use_model_dependency=use_model_dependency,\n",
" )\n",
" rai_image_job.set_limits(timeout=7200)\n",
"\n",
" rai_image_job.outputs.dashboard.mode = \"upload\"\n",
" rai_image_job.outputs.ux_json.mode = \"upload\"\n",
"\n",
" return {\n",
" \"dashboard\": rai_image_job.outputs.dashboard,\n",
" \"ux_json\": rai_image_job.outputs.ux_json,\n",
" }"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6b5b14a9",
"metadata": {},
"source": [
"Next, we define the pipeline object itself, and ensure that the outputs will be available for download:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e4d86ec2",
"metadata": {},
"outputs": [],
"source": [
"import uuid\n",
"from azure.ai.ml import Output\n",
"\n",
"insights_pipeline_job = rai_fridge_image_classification_pipeline(\n",
" target_column_name=target_column_name,\n",
" test_data=fridge_test_mltable,\n",
" classes=\"[]\",\n",
" use_model_dependency=True,\n",
")\n",
"\n",
"rand_path = str(uuid.uuid4())\n",
"insights_pipeline_job.outputs.dashboard = Output(\n",
" path=f\"azureml://datastores/workspaceblobstore/paths/{rand_path}/dashboard/\",\n",
" mode=\"upload\",\n",
" type=\"uri_folder\",\n",
")\n",
"insights_pipeline_job.outputs.ux_json = Output(\n",
" path=f\"azureml://datastores/workspaceblobstore/paths/{rand_path}/ux_json/\",\n",
" mode=\"upload\",\n",
" type=\"uri_folder\",\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "25f34573",
"metadata": {},
"source": [
"And submit the pipeline to AzureML for execution:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2ca757f7",
"metadata": {},
"outputs": [],
"source": [
"insights_job = submit_and_wait(ml_client, insights_pipeline_job)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1381768a",
"metadata": {},
"source": [
"The dashboard should appear in the AzureML portal in the registered model view. The following cell computes the expected URI:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e86ab611",
"metadata": {},
"outputs": [],
"source": [
"sub_id = ml_client._operation_scope.subscription_id\n",
"rg_name = ml_client._operation_scope.resource_group_name\n",
"ws_name = ml_client.workspace_name\n",
"\n",
"expected_uri = f\"https://ml.azure.com/model/{expected_model_id}/model_analysis?wsid=/subscriptions/{sub_id}/resourcegroups/{rg_name}/workspaces/{ws_name}\"\n",
"\n",
"print(f\"Please visit {expected_uri} to see your analysis\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "93a8dff9",
"metadata": {},
"source": [
"## Constructing the pipeline in YAML\n",
"\n",
"It is also possible to specify the pipeline as a YAML file, and submit that using the command line. We will now create a YAML specification of the above pipeline and submit that:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "624bb0cd",
"metadata": {},
"outputs": [],
"source": [
"yaml_contents = f\"\"\"\n",
"$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json\n",
"experiment_name: AML_RAI_Vision_Sample_{rai_example_version_string}\n",
"type: pipeline\n",
"\n",
"compute: azureml:cpucluster\n",
"\n",
"inputs:\n",
" registered_model_name: fridge_model\n",
" fridge_model_info: {expected_model_id}\n",
" my_test_data:\n",
" type: mltable\n",
" path: azureml:{input_test_data}:{rai_example_version_string}\n",
" mode: download\n",
"\n",
"settings:\n",
" default_datastore: azureml:workspaceblobstore\n",
" default_compute: azureml:cpucluster\n",
" continue_on_step_failure: false\n",
"\n",
"jobs:\n",
" analyse_model:\n",
" type: command\n",
" component: azureml://registries/azureml/components/rai_vision_insights/versions/{version_string}\n",
" inputs:\n",
" task_type: image_classification\n",
" model_input:\n",
" type: mlflow_model\n",
" path: {azureml_model_id}\n",
" model_info: ${{{{parent.inputs.fridge_model_info}}}}\n",
" test_dataset: ${{{{parent.inputs.my_test_data}}}}\n",
" target_column_name: {target_column_name}\n",
" maximum_rows_for_test_dataset: 5000\n",
" classes: '[]'\n",
"\"\"\"\n",
"\n",
"yaml_pipeline_filename = \"rai_vision_example.yaml\"\n",
"\n",
"with open(yaml_pipeline_filename, \"w\") as f:\n",
" f.write(yaml_contents)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1fd5f2dd",
"metadata": {},
"source": [
"The created file can then be submitted using the Azure CLI:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3bf9bb1c",
"metadata": {},
"outputs": [],
"source": [
"cmd_line = [\n",
" \"az\",\n",
" \"ml\",\n",
" \"job\",\n",
" \"create\",\n",
" \"--resource-group\",\n",
" rg_name,\n",
" \"--workspace\",\n",
" ws_name,\n",
" \"--file\",\n",
" yaml_pipeline_filename,\n",
"]\n",
"\n",
"import subprocess\n",
"\n",
"try:\n",
" cmd = subprocess.run(cmd_line, check=True, shell=True, capture_output=True)\n",
"except subprocess.CalledProcessError as cpe:\n",
" print(f\"Error invoking: {cpe.args}\")\n",
" print(cpe.stdout)\n",
" print(cpe.stderr)\n",
" raise\n",
"else:\n",
" print(\"Azure CLI submission completed\")"
]
}
],
"metadata": {
"celltoolbar": "Raw Cell Format",
"kernelspec": {
"display_name": "Python 3.10 - SDK V2",
"language": "python",
"name": "python310-sdkv2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.15"
},
"vscode": {
"interpreter": {
"hash": "8fd340b5477ca1a0b454d48a3973beff39fee032ada47a04f6f3725b469a8988"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}