sdk/python/foundation-models/system/finetune/image-instance-segmentation/mmdetection-fridgeobjects-instance-segmentation.ipynb (1,226 lines of code) (raw):
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Image instance-segmentation using MMDetection specific pipeline component\n",
"\n",
"This sample shows how to use `mmdetection_image_objectdetection_instancesegmentation_pipeline` component from the `azureml` system registry to fine tune a model for image instance-segmentation task using fridgeObjects Dataset. We then deploy the fine tuned model to an online endpoint for real time inference.\n",
"\n",
"### Training data\n",
"We will use the [odfridgeObjectsMask](https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-instance-segmentation/odFridgeObjectsMask.zip) dataset.\n",
"\n",
"### Model\n",
"We will use the `mask-rcnn_swin-t-p4-w7_fpn_1x_coco` model in this notebook. If you need to fine tune a model that is available on MMDetection model zoo, but not available in `azureml` system registry, you can either register the model and use the registered model or use the `model_name` parameter to instruct the components to pull the model directly from MMDetection model zoo.\n",
"\n",
"### Outline\n",
"1. Install dependencies\n",
"2. Setup pre-requisites such as compute\n",
"3. Pick a model to fine tune\n",
"4. Prepare dataset for finetuning the model\n",
"5. Submit the fine tuning job using MMDetection specific image instance-segmentation and instance-segmentation component\n",
"6. Review training and evaluation metrics\n",
"7. Register the fine tuned model\n",
"8. Deploy the fine tuned model for real time inference\n",
"9. Test deployed end point\n",
"9. Clean up resources"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1. Install dependencies\n",
"Before starting off, if you are running the notebook on Azure Machine Learning Studio or running first time locally, you will need the following packages"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install azure-ai-ml>=1.23.1\n",
"! pip install azure-identity==1.13.0"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Setup pre-requisites"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.1 Connect to Azure Machine Learning workspace\n",
"\n",
"Before we dive in the code, you'll need to connect to your workspace. The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning.\n",
"\n",
"We are using `DefaultAzureCredential` to get access to workspace. `DefaultAzureCredential` should be capable of handling most scenarios. If you want to learn more about other available credentials, go to [set up authentication doc](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk), [azure-identity reference doc](https://learn.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python).\n",
"\n",
"Replace `<AML_WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` with their respective values in the below cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml import MLClient\n",
"from azure.identity import DefaultAzureCredential\n",
"\n",
"\n",
"experiment_name = (\n",
" \"AzureML-Train-Finetune-Vision-IS-Samples\" # can rename to any valid name\n",
")\n",
"\n",
"credential = DefaultAzureCredential()\n",
"workspace_ml_client = None\n",
"try:\n",
" workspace_ml_client = MLClient.from_config(credential)\n",
" subscription_id = workspace_ml_client.subscription_id\n",
" resource_group = workspace_ml_client.resource_group_name\n",
" workspace_name = workspace_ml_client.workspace_name\n",
"except Exception as ex:\n",
" print(ex)\n",
" # Enter details of your AML workspace\n",
" subscription_id = \"<SUBSCRIPTION_ID>\"\n",
" resource_group = \"<RESOURCE_GROUP>\"\n",
" workspace_name = \"<AML_WORKSPACE_NAME>\"\n",
" workspace_ml_client = MLClient(\n",
" credential, subscription_id, resource_group, workspace_name\n",
" )\n",
"\n",
"registry_ml_client = MLClient(\n",
" credential,\n",
" subscription_id,\n",
" resource_group,\n",
" registry_name=\"azureml\",\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2 Create compute\n",
"\n",
"In order to finetune a model on Azure Machine Learning studio, you will need to create a compute resource first. **Creating a compute will take 3-4 minutes.** \n",
"\n",
"For additional references, see [Azure Machine Learning in a Day](https://github.com/Azure/azureml-examples/blob/main/tutorials/azureml-in-a-day/azureml-in-a-day.ipynb). "
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Create CPU compute for model selection component"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.entities import AmlCompute\n",
"from azure.core.exceptions import ResourceNotFoundError\n",
"\n",
"model_import_cluster_name = \"sample-model-import-cluster\"\n",
"try:\n",
" _ = workspace_ml_client.compute.get(model_import_cluster_name)\n",
" print(\"Found existing compute target.\")\n",
"except ResourceNotFoundError:\n",
" print(\"Creating a new compute target...\")\n",
" compute_config = AmlCompute(\n",
" name=model_import_cluster_name,\n",
" type=\"amlcompute\",\n",
" size=\"Standard_D12_v2\",\n",
" idle_time_before_scale_down=120,\n",
" min_instances=0,\n",
" max_instances=4,\n",
" )\n",
" workspace_ml_client.begin_create_or_update(compute_config).result()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Create GPU compute for finetune component\n",
"\n",
"The list of GPU machines can be found [here](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-gpu)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"finetune_cluster_name = \"sample-finetune-cluster-gpu\"\n",
"\n",
"try:\n",
" _ = workspace_ml_client.compute.get(finetune_cluster_name)\n",
" print(\"Found existing compute target.\")\n",
"except ResourceNotFoundError:\n",
" print(\"Creating a new compute target...\")\n",
" compute_config = AmlCompute(\n",
" name=finetune_cluster_name,\n",
" type=\"amlcompute\",\n",
" size=\"STANDARD_NC6s_v3\",\n",
" idle_time_before_scale_down=120,\n",
" min_instances=0,\n",
" max_instances=4,\n",
" )\n",
" workspace_ml_client.begin_create_or_update(compute_config).result()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Create GPU compute for model evaluation component\n",
"\n",
"The list of GPU machines can be found [here](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-gpu)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"import warnings\n",
"\n",
"# Using the same compute cluster for model evaluation as finetuning. If you want to use a different cluster, specify it below\n",
"model_eval_cluster_name = \"sample-finetune-cluster-gpu\"\n",
"\n",
"try:\n",
" model_evaluation_compute = workspace_ml_client.compute.get(model_eval_cluster_name)\n",
" print(\"Found existing compute target.\")\n",
"except ResourceNotFoundError:\n",
" print(\"Creating a new compute target...\")\n",
" model_evaluation_compute = AmlCompute(\n",
" name=model_eval_cluster_name,\n",
" type=\"amlcompute\",\n",
" size=\"Standard_NC6s_v3\",\n",
" idle_time_before_scale_down=120,\n",
" min_instances=0,\n",
" max_instances=4,\n",
" )\n",
" workspace_ml_client.begin_create_or_update(compute_config).result()\n",
"\n",
"model_evaluation_compute_instance_type = model_evaluation_compute.size\n",
"print(\n",
" f\"Model Evaluation compute's instance type: {model_evaluation_compute_instance_type}\"\n",
")\n",
"\n",
"if model_evaluation_compute_instance_type != \"STANDARD_NC6S_V3\":\n",
" # Print a warning message if compute type is not 'STANDARD_NC6S_V3', i.e. Single GPU V100\n",
" warning_message = (\n",
" \"Warning! Currently evaluation is only supported on STANDARD_NC6S_V3 compute type.\"\n",
" \" Please change the compute type to STANDARD_NC6S_V3 if you want to run evaluation.\"\n",
" )\n",
" warnings.warn(warning_message, category=Warning)\n",
"# generating a unique timestamp that can be used for names and versions that need to be unique\n",
"timestamp = str(int(time.time()))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Pick a foundation model to fine tune\n",
"\n",
"We will use the `mask-rcnn_swin-t-p4-w7_fpn_1x_coco` model in this notebook. If you need to fine tune a model that is available on MMDetection model zoo, but not available in `azureml` system registry, you can either register the model and use the registered model or use the `model_name` parameter to instruct the components to pull the model directly from MMDetection model zoo.\n",
"\n",
"Currently following models are supported:\n",
"\n",
"| Model Name | Source |\n",
"| :------------: | :-------: |\n",
"| [mmd-3x-mask-rcnn_swin-t-p4-w7_fpn_1x_coco](https://ml.azure.com/registries/azureml/models/mmd-3x-mask-rcnn_swin-t-p4-w7_fpn_1x_coco/version/14) | azureml registry |\n",
"| [Image instance-segmentation models from MMDetection](https://github.com/open-mmlab/mmdetection/blob/v3.1.0/docs/en/model_zoo.md) | MMDetection |"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mmdetection_model_name = \"mask-rcnn_swin-t-p4-w7_fpn_1x_coco\"\n",
"\n",
"aml_registry_model_name = \"mmd-3x-mask-rcnn_swin-t-p4-w7_fpn_1x_coco\"\n",
"foundation_models = registry_ml_client.models.list(name=aml_registry_model_name)\n",
"foundation_model = max(foundation_models, key=lambda x: int(x.version))\n",
"print(\n",
" f\"\\n\\nUsing model name: {foundation_model.name}, version: {foundation_model.version}, id: {foundation_model.id} for inferencing\"\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4. Prepare the dataset for fine-tuning the model\n",
"\n",
"We will use the [odfridgeObjectsMask](https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-instance-segmentation/odFridgeObjectsMask.zip), a toy dataset called Fridge Objects, which consists of 128 images of 4 labels of beverage container {`can`, `carton`, `milk bottle`, `water bottle`} photos taken on different backgrounds.\n",
"\n",
"All images in this notebook are hosted in [this repository](https://github.com/microsoft/computervision-recipes) and are made available under the [MIT license](https://github.com/microsoft/computervision-recipes/blob/master/LICENSE).\n",
"\n",
"#### 4.1 Download the Data\n",
"We first download and unzip the data locally. By default, the data would be downloaded in `./data` folder in current directory. \n",
"If you prefer to download the data at a different location, update it in `dataset_parent_dir = ...` in the following cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import urllib\n",
"from zipfile import ZipFile\n",
"\n",
"# Change to a different location if you prefer\n",
"dataset_parent_dir = \"./data\"\n",
"\n",
"# Create data folder if it doesnt exist.\n",
"os.makedirs(dataset_parent_dir, exist_ok=True)\n",
"\n",
"# Download data\n",
"download_url = \"https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-instance-segmentation/odFridgeObjectsMask.zip\"\n",
"\n",
"# Extract current dataset name from dataset url\n",
"dataset_name = os.path.split(download_url)[-1].split(\".\")[0]\n",
"# Get dataset path for later use\n",
"dataset_dir = os.path.join(dataset_parent_dir, dataset_name)\n",
"\n",
"# Get the data zip file path\n",
"data_file = os.path.join(dataset_parent_dir, f\"{dataset_name}.zip\")\n",
"\n",
"# Download the dataset\n",
"urllib.request.urlretrieve(download_url, filename=data_file)\n",
"\n",
"# Extract files\n",
"with ZipFile(data_file, \"r\") as zip:\n",
" print(\"extracting files...\")\n",
" zip.extractall(path=dataset_parent_dir)\n",
" print(\"done\")\n",
"# Delete zip file\n",
"os.remove(data_file)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import Image\n",
"\n",
"sample_image = os.path.join(dataset_dir, \"images\", \"31.jpg\")\n",
"Image(filename=sample_image)"
]
},
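{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Quick look at the dataset size: count the images that were just extracted.\n",
"# The images folder name matches the one used for the sample image above.\n",
"image_dir = os.path.join(dataset_dir, \"images\")\n",
"print(f\"Number of images: {len(os.listdir(image_dir))}\")"
]
},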
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4.2 Upload the images to Datastore through an AML Data asset (URI Folder)\n",
"\n",
"In order to use the data for training in Azure ML, we upload it to our default Azure Blob Storage of our Azure ML Workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Uploading image files by creating a 'data asset URI FOLDER':\n",
"\n",
"from azure.ai.ml.entities import Data\n",
"from azure.ai.ml.constants import AssetTypes\n",
"\n",
"my_data = Data(\n",
" path=dataset_dir,\n",
" type=AssetTypes.URI_FOLDER,\n",
" description=\"Fridge-items images instance segmentation\",\n",
" name=\"fridge-items-images-instance-segmentation\",\n",
")\n",
"\n",
"uri_folder_data_asset = workspace_ml_client.data.create_or_update(my_data)\n",
"\n",
"print(uri_folder_data_asset)\n",
"print(\"\")\n",
"print(\"Path to folder in Blob Storage:\")\n",
"print(uri_folder_data_asset.path)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4.3 Convert the downloaded data to JSONL\n",
"\n",
"In this example, the fridge object dataset is annotated in Pascal VOC format, where each image corresponds to an xml file. Each xml file contains information on where its corresponding image file is located and also contains information about the bounding boxes and the object labels. \n",
"\n",
"For documentation on preparing the datasets beyond this notebook, please refer to the [documentation on how to prepare datasets](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-prepare-datasets-for-automl-images).\n",
"\n",
"In order to use this data to create an AzureML MLTable, we first need to convert it to the required JSONL format. The following script is creating two `.jsonl` files (one for training and one for validation) in the corresponding MLTable folder. In this example, 20% of the data is kept for validation. For further details on jsonl file used for image classification task in automated ml, please refer to the [data schema documentation for image instance segmentation task](https://learn.microsoft.com/en-us/azure/machine-learning/reference-automl-images-schema#instance-segmentation)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# The jsonl_converter below relies on scikit-image and simplification.\n",
"# If you don't have them installed, install them before converting data by runing this cell.\n",
"%pip install \"scikit-image\" \"simplification\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from jsonl_converter import convert_mask_in_VOC_to_jsonl\n",
"\n",
"convert_mask_in_VOC_to_jsonl(dataset_dir, uri_folder_data_asset.path)"
]
},
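{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Inspect a sample JSONL record (optional)\n",
"\n",
"As a quick sanity check, you can look at the first record of the generated training JSONL file; each line is a JSON object with an `image_url` field and a `label` list containing the polygon annotations. The path below assumes `convert_mask_in_VOC_to_jsonl` writes `train_annotations.jsonl` into the `training-mltable-folder`, which is the location the MLTable cells later in this notebook expect."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Assumed output location of the converter; adjust if your converter writes the JSONL files elsewhere.\n",
"sample_jsonl = os.path.join(\n",
"    dataset_parent_dir, \"training-mltable-folder\", \"train_annotations.jsonl\"\n",
")\n",
"\n",
"if os.path.exists(sample_jsonl):\n",
"    with open(sample_jsonl) as f:\n",
"        first_record = json.loads(f.readline())\n",
"    # Print a truncated preview of the first annotation record\n",
"    print(json.dumps(first_record, indent=2)[:2000])\n",
"else:\n",
"    print(f\"{sample_jsonl} not found; check the converter output location.\")"
]
},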
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4.5 Create MLTable data input\n",
"\n",
"Create MLTable data input using the jsonl files created above.\n",
"\n",
"For documentation on creating your own MLTable assets for jobs beyond this notebook, please refer to below resources\n",
"- [MLTable YAML Schema](https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-mltable) - covers how to write MLTable YAML, which is required for each MLTable asset.\n",
"- [Create MLTable data asset](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-data-assets?tabs=Python-SDK#create-a-mltable-data-asset) - covers how to create MLTable data asset. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def create_ml_table_file(filename):\n",
" \"\"\"Create ML Table definition\"\"\"\n",
"\n",
" return (\n",
" \"paths:\\n\"\n",
" \" - file: ./{0}\\n\"\n",
" \"transformations:\\n\"\n",
" \" - read_json_lines:\\n\"\n",
" \" encoding: utf8\\n\"\n",
" \" invalid_lines: error\\n\"\n",
" \" include_path_column: false\\n\"\n",
" \" - convert_column_types:\\n\"\n",
" \" - columns: image_url\\n\"\n",
" \" column_type: stream_info\"\n",
" ).format(filename)\n",
"\n",
"\n",
"def save_ml_table_file(output_path, mltable_file_contents):\n",
" with open(os.path.join(output_path, \"MLTable\"), \"w\") as f:\n",
" f.write(mltable_file_contents)\n",
"\n",
"\n",
"# We will copy each JSONL file within its related MLTable folder\n",
"training_mltable_path = os.path.join(dataset_parent_dir, \"training-mltable-folder\")\n",
"validation_mltable_path = os.path.join(dataset_parent_dir, \"validation-mltable-folder\")\n",
"\n",
"# Create the folders if they don't exist\n",
"os.makedirs(training_mltable_path, exist_ok=True)\n",
"os.makedirs(validation_mltable_path, exist_ok=True)\n",
"\n",
"# Path to the training and validation files\n",
"train_annotations_file = os.path.join(training_mltable_path, \"train_annotations.jsonl\")\n",
"validation_annotations_file = os.path.join(\n",
" validation_mltable_path, \"validation_annotations.jsonl\"\n",
")\n",
"\n",
"# Create and save train mltable\n",
"train_mltable_file_contents = create_ml_table_file(\n",
" os.path.basename(train_annotations_file)\n",
")\n",
"save_ml_table_file(training_mltable_path, train_mltable_file_contents)\n",
"\n",
"# Save train and validation mltable\n",
"validation_mltable_file_contents = create_ml_table_file(\n",
" os.path.basename(validation_annotations_file)\n",
")\n",
"save_ml_table_file(validation_mltable_path, validation_mltable_file_contents)"
]
},
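{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, you can print the MLTable file that was just written for the training folder; it should reference `train_annotations.jsonl` and include the `convert_column_types` transformation defined above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print the generated MLTable definition for the training folder\n",
"with open(os.path.join(training_mltable_path, \"MLTable\")) as f:\n",
"    print(f.read())"
]
},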
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. Submit the fine tuning job using `mmdetection_image_objectdetection_instancesegmentation_pipeline` component\n",
" \n",
"Create the job that uses the `mmdetection_image_objectdetection_instancesegmentation_pipeline` component for image instance segmentation and instance segmentation tasks. Learn more in 5.2 about all the parameters supported for fine tuning."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 5.1 Create component"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"FINETUNE_PIPELINE_COMPONENT_NAME = (\n",
" \"mmdetection_image_objectdetection_instancesegmentation_pipeline\"\n",
")\n",
"pipeline_component_mmdetection_func = registry_ml_client.components.get(\n",
" name=FINETUNE_PIPELINE_COMPONENT_NAME, label=\"latest\"\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 5.2 Create arguments to be passed to `mmdetection_image_objectdetection_instancesegmentation_pipeline` component\n",
"\n",
"The `mmdetection_image_objectdetection_instancesegmentation_pipeline` component consists of model selection and finetuning components. The detailed arguments for each component can be found at following README files:\n",
"- [Model Import Component](../../docs/component_docs/image_finetune/mmd_model_import_component.md)\n",
"- [Finetune Component](../../docs/component_docs/image_finetune/mmd_finetune_component.md)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"deepspeed_config_path = \"./deepspeed_configs/zero1.json\"\n",
"if not os.path.exists(deepspeed_config_path):\n",
" print(\"DeepSpeed config file not found\")\n",
" deepspeed_config_path = None\n",
"\n",
"pipeline_component_args = {\n",
" # # Model import args\n",
" \"model_family\": \"MmDetectionImage\",\n",
" \"download_from_source\": False, # True for downloading a model directly from MMDetection\n",
" \"mlflow_model\": foundation_model.id, # foundation_model.id is provided, only foundation_model gives UserErrorException: only path input is supported now but get: ...\n",
" # \"model_name\": mmdetection_model_name, # specify the model_name instead of mlflow_model if you want to use a model from the mmdetection model zoo\n",
" # Finetune args\n",
" \"task_name\": \"image-instance-segmentation\",\n",
" \"apply_augmentations\": True,\n",
" \"number_of_workers\": 8,\n",
" \"apply_deepspeed\": False,\n",
" \"deepspeed_config\": deepspeed_config_path,\n",
" \"apply_ort\": False,\n",
" \"auto_find_batch_size\": False,\n",
" \"extra_optim_args\": \"\",\n",
" \"precision\": \"32\",\n",
" \"random_seed\": 42,\n",
" \"evaluation_strategy\": \"epoch\",\n",
" \"evaluation_steps\": 500,\n",
" \"logging_strategy\": \"epoch\",\n",
" \"logging_steps\": 500,\n",
" \"save_strategy\": \"epoch\",\n",
" \"save_steps\": 500,\n",
" \"save_total_limit\": -1,\n",
" \"early_stopping\": False,\n",
" \"early_stopping_patience\": 1,\n",
" \"resume_from_checkpoint\": False,\n",
" \"save_as_mlflow_model\": True,\n",
" # # Uncomment one or more lines below to provide specific values, if you wish you override the autoselected default values.\n",
" # \"image_min_size\": -1,\n",
" # \"image_max_size\": -1,\n",
" # \"metric_for_best_model\": \"mean_average_precision\",\n",
" # \"number_of_epochs\": 15,\n",
" # \"max_steps\": -1,\n",
" # \"training_batch_size\": 4,\n",
" # \"validation_batch_size\": 4,\n",
" # \"learning_rate\": 5e-5,\n",
" # \"learning_rate_scheduler\": \"warmup_linear\",\n",
" # \"warmup_steps\": 0,\n",
" # \"optimizer\": \"adamw_hf\",\n",
" # \"weight_decay\": 0.0,\n",
" # \"gradient_accumulation_step\": 1,\n",
" # \"max_grad_norm\": 1.0,\n",
" # \"iou_threshold\": 0.5,\n",
" # \"box_score_threshold\": 0.3,\n",
" # # Model evaluation args\n",
" # The following parameters map to the dataset fields\n",
" # Uncomment one or more lines below to provide specific values, if you wish you override the autoselected default values.\n",
" # \"label_column_name\": \"label\",\n",
" # \"input_column_names\": \"image_url\",\n",
"}\n",
"instance_count = 1\n",
"process_count_per_instance = 1\n",
"\n",
"# Ensure that the user provides only one of mlflow_model or model_name\n",
"if (\n",
" pipeline_component_args.get(\"mlflow_model\") is None\n",
" and pipeline_component_args.get(\"model_name\") is None\n",
"):\n",
" raise ValueError(\n",
" \"You must specify either mlflow_model or model_name for the model to finetune\"\n",
" )\n",
"if (\n",
" pipeline_component_args.get(\"mlflow_model\") is not None\n",
" and pipeline_component_args.get(\"model_name\") is not None\n",
"):\n",
" raise ValueError(\n",
" \"You must specify ONLY one of mlflow_model and model_name for the model to finetune\"\n",
" )\n",
"elif (\n",
" pipeline_component_args.get(\"mlflow_model\") is None\n",
" and pipeline_component_args.get(\"model_name\") is not None\n",
"):\n",
" use_model_name = mmdetection_model_name\n",
"elif (\n",
" pipeline_component_args.get(\"mlflow_model\") is not None\n",
" and pipeline_component_args.get(\"model_name\") is None\n",
"):\n",
" use_model_name = aml_registry_model_name\n",
"print(f\"Finetuning model {use_model_name}\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 5.3 Utility function to create pipeline using `mmdetection_image_objectdetection_instancesegmentation_pipeline` component"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.dsl import pipeline\n",
"from azure.ai.ml.entities import PipelineComponent\n",
"from azure.ai.ml import Input\n",
"from azure.ai.ml.constants import AssetTypes\n",
"\n",
"\n",
"@pipeline()\n",
"def create_pipeline_mmdetection():\n",
" \"\"\"Create pipeline.\"\"\"\n",
"\n",
" mmdetection_pipeline_component: PipelineComponent = pipeline_component_mmdetection_func(\n",
" compute_model_import=model_import_cluster_name,\n",
" compute_finetune=finetune_cluster_name,\n",
" compute_model_evaluation=model_eval_cluster_name,\n",
" training_data=Input(type=AssetTypes.MLTABLE, path=training_mltable_path),\n",
" validation_data=Input(type=AssetTypes.MLTABLE, path=validation_mltable_path),\n",
" # test data\n",
" # Using the same data for validation and test. If you want to use a different dataset for test, specify it below\n",
" test_data=Input(type=AssetTypes.MLTABLE, path=validation_mltable_path),\n",
" instance_count=instance_count,\n",
" process_count_per_instance=process_count_per_instance,\n",
" **pipeline_component_args,\n",
" )\n",
" return {\n",
" # Map the output of the fine tuning job to the output of pipeline job so that we can easily register the fine tuned model. Registering the model is required to deploy the model to an online or batch endpoint.\n",
" \"trained_model\": mmdetection_pipeline_component.outputs.mlflow_model_folder,\n",
" }"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 5.4 Run the fine tuning job using `mmdetection_image_objectdetection_instancesegmentation_pipeline` component"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mmdetection_pipeline_object = create_pipeline_mmdetection()\n",
"\n",
"# don't use cached results from previous jobs\n",
"mmdetection_pipeline_object.settings.force_rerun = True\n",
"\n",
"# set continue on step failure to False\n",
"mmdetection_pipeline_object.settings.continue_on_step_failure = False\n",
"\n",
"mmdetection_pipeline_object.display_name = (\n",
" use_model_name + \"_mmdetection_pipeline_component_run_\" + \"is\"\n",
")\n",
"# Don't use cached results from previous jobs\n",
"mmdetection_pipeline_object.settings.force_rerun = True\n",
"\n",
"print(\"Submitting pipeline\")\n",
"\n",
"mmdetection_pipeline_run = workspace_ml_client.jobs.create_or_update(\n",
" mmdetection_pipeline_object, experiment_name=experiment_name\n",
")\n",
"\n",
"print(f\"Pipeline created. URL: {mmdetection_pipeline_run.studio_url}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"workspace_ml_client.jobs.stream(mmdetection_pipeline_run.name)"
]
},
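{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Once the stream above completes, check the final status of the pipeline job\n",
"# before moving on to metrics and model registration.\n",
"print(workspace_ml_client.jobs.get(mmdetection_pipeline_run.name).status)"
]
},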
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 6. Get metrics from finetune component\n",
"\n",
"The model training happens as part of the finetune component. Please follow below steps to extract validation metrics from the run."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"##### 6.1 Initialize MLFlow Client\n",
"\n",
"The models and artifacts that are produced by AutoML can be accessed via the MLFlow interface.\n",
"Initialize the MLFlow client here, and set the backend as Azure ML, via. the MLFlow Client.\n",
"\n",
"IMPORTANT - You need to have installed the latest MLFlow packages with:\n",
"\n",
" pip install azureml-mlflow\n",
" pip install mlflow"
]
},
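{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional: install the MLflow packages mentioned above if they are not already\n",
"# present in your environment.\n",
"%pip install azureml-mlflow mlflow"
]
},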
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import mlflow\n",
"\n",
"# Obtain the tracking URL from MLClient\n",
"MLFLOW_TRACKING_URI = workspace_ml_client.workspaces.get(\n",
" name=workspace_ml_client.workspace_name\n",
").mlflow_tracking_uri\n",
"\n",
"print(MLFLOW_TRACKING_URI)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set the MLFLOW TRACKING URI\n",
"mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)\n",
"print(f\"\\nCurrent tracking uri: {mlflow.get_tracking_uri()}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from mlflow.tracking.client import MlflowClient\n",
"\n",
"# Initialize MLFlow client\n",
"mlflow_client = MlflowClient()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 6.2 Get the training and evaluation run\n",
"\n",
"Fetch the training and evaluation run ids from the above pipeline run. We will later use these run ids to fetch the metrics. We will use the training run id to register the model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Concat 'tags.mlflow.rootRunId=' and pipeline_job.name in single quotes as filter variable\n",
"filter = \"tags.mlflow.rootRunId='\" + mmdetection_pipeline_run.name + \"'\"\n",
"runs = mlflow.search_runs(\n",
" experiment_names=[experiment_name], filter_string=filter, output_format=\"list\"\n",
")\n",
"\n",
"# Get the training and evaluation runs.\n",
"# Using a hacky way till 'Bug 2320997: not able to show eval metrics in FT notebooks - mlflow client now showing display names' is fixed\n",
"for run in runs:\n",
" # Check if run.data.metrics.epoch exists\n",
" if \"epoch\" in run.data.metrics:\n",
" training_run = run\n",
" # Else, check if run.data.metrics.accuracy exists\n",
" elif \"mean_average_precision\" in run.data.metrics:\n",
" evaluation_run = run"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 6.3 Get training metrics\n",
"\n",
"Access the results (such as Models, Artifacts, Metrics) of a previously completed run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"pd.DataFrame(training_run.data.metrics, index=[0]).T"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7. Register the fine tuned model with the workspace\n",
"\n",
"We will register the model from the output of the fine tuning job. This will track lineage between the fine tuned model and the fine tuning job. The fine tuning job, further, tracks lineage to the foundation model, data and training code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"# Generating a unique timestamp that can be used for names and versions that need to be unique\n",
"timestamp = str(int(time.time()))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.entities import Model\n",
"from azure.ai.ml.constants import AssetTypes\n",
"\n",
"# Check if the `trained_model` output is available\n",
"print(\n",
" f\"Pipeline job outputs: {workspace_ml_client.jobs.get(mmdetection_pipeline_run.name).outputs}\"\n",
")\n",
"\n",
"# Fetch the model from pipeline job output - not working, hence fetching from fine tune child job\n",
"model_path_from_job = (\n",
" f\"azureml://jobs/{mmdetection_pipeline_run.name}/outputs/trained_model\"\n",
")\n",
"print(f\"Path to register model: {model_path_from_job}\")\n",
"\n",
"finetuned_model_name = f\"{use_model_name.replace('/', '-')}-fridge-objects-is\"\n",
"finetuned_model_description = f\"{use_model_name.replace('/', '-')} fine tuned model for fridge objects instance segmentation\"\n",
"prepare_to_register_model = Model(\n",
" path=model_path_from_job,\n",
" type=AssetTypes.MLFLOW_MODEL,\n",
" name=finetuned_model_name,\n",
" version=timestamp, # Use timestamp as version to avoid version conflict\n",
" description=finetuned_model_description,\n",
")\n",
"print(f\"Prepare to register model: \\n{prepare_to_register_model}\")\n",
"\n",
"# Register the model from pipeline job output\n",
"registered_model = workspace_ml_client.models.create_or_update(\n",
" prepare_to_register_model\n",
")\n",
"print(f\"Registered model: {registered_model}\")"
]
},
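{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sanity check: list the versions registered in the workspace under the\n",
"# fine tuned model name to confirm that the registration above succeeded.\n",
"for m in workspace_ml_client.models.list(name=finetuned_model_name):\n",
"    print(m.name, m.version)"
]
},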
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 8. Deploy the fine tuned model to an online endpoint\n",
"Online endpoints give a durable REST API that can be used to integrate with applications that need to use the model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import datetime\n",
"from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment\n",
"\n",
"# Endpoint names need to be unique in a region, hence using timestamp to create unique endpoint name\n",
"online_endpoint_name = \"mmd-is-fridge-items-\" + datetime.datetime.now().strftime(\n",
" \"%m%d%H%M\"\n",
")\n",
"online_endpoint_description = f\"Online endpoint for {registered_model.name}, finetuned for fridge objects instance segmentation\"\n",
"# Create an online endpoint\n",
"endpoint = ManagedOnlineEndpoint(\n",
" name=online_endpoint_name,\n",
" description=online_endpoint_description,\n",
" auth_mode=\"key\",\n",
" tags={\"foo\": \"bar\"},\n",
")\n",
"workspace_ml_client.begin_create_or_update(endpoint).result()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.entities import OnlineRequestSettings, ProbeSettings\n",
"\n",
"deployment_name = \"mmd-is-fridge-mlflow-deploy\"\n",
"print(registered_model.id)\n",
"print(online_endpoint_name)\n",
"print(deployment_name)\n",
"\n",
"# Create a deployment\n",
"demo_deployment = ManagedOnlineDeployment(\n",
" name=deployment_name,\n",
" endpoint_name=online_endpoint_name,\n",
" model=registered_model.id,\n",
" instance_type=\"Standard_DS3_V2\",\n",
" instance_count=1,\n",
" request_settings=OnlineRequestSettings(\n",
" max_concurrent_requests_per_instance=1,\n",
" request_timeout_ms=90000,\n",
" max_queue_wait_ms=500,\n",
" ),\n",
" liveness_probe=ProbeSettings(\n",
" failure_threshold=49,\n",
" success_threshold=1,\n",
" timeout=299,\n",
" period=180,\n",
" initial_delay=180,\n",
" ),\n",
" readiness_probe=ProbeSettings(\n",
" failure_threshold=10,\n",
" success_threshold=1,\n",
" timeout=10,\n",
" period=10,\n",
" initial_delay=10,\n",
" ),\n",
")\n",
"workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()\n",
"endpoint.traffic = {deployment_name: 100}\n",
"workspace_ml_client.begin_create_or_update(endpoint).result()"
]
},
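{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# If you plan to call the endpoint outside the SDK (for example with curl or requests),\n",
"# you can fetch its auth keys. Key-based auth is assumed here, as configured above.\n",
"keys = workspace_ml_client.online_endpoints.get_keys(name=online_endpoint_name)\n",
"print(\"Primary key retrieved:\", bool(keys.primary_key))"
]
},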
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 9. Test the endpoint with sample data\n",
"\n",
"We will fetch some sample data from the test dataset and submit to online endpoint for inference. We will then display the scored labels alongside the ground truth labels."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"demo_deployment = workspace_ml_client.online_deployments.get(\n",
" name=deployment_name,\n",
" endpoint_name=online_endpoint_name,\n",
")\n",
"\n",
"# Get the details for online endpoint\n",
"endpoint = workspace_ml_client.online_endpoints.get(name=online_endpoint_name)\n",
"\n",
"# Existing traffic details\n",
"print(endpoint.traffic)\n",
"\n",
"# Get the scoring URI\n",
"print(endpoint.scoring_uri)\n",
"print(demo_deployment)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create request json\n",
"import base64\n",
"import json\n",
"\n",
"sample_image = os.path.join(dataset_dir, \"images\", \"99.jpg\")\n",
"\n",
"\n",
"def read_image(image_path):\n",
" with open(image_path, \"rb\") as f:\n",
" return f.read()\n",
"\n",
"\n",
"request_json = {\n",
" \"input_data\": {\n",
" \"columns\": [\"image\"],\n",
" \"data\": [base64.encodebytes(read_image(sample_image)).decode(\"utf-8\")],\n",
" }\n",
"}\n",
"\n",
"request_file_name = \"sample_request_data.json\"\n",
"with open(request_file_name, \"w\") as request_file:\n",
" json.dump(request_json, request_file)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"resp = workspace_ml_client.online_endpoints.invoke(\n",
" endpoint_name=online_endpoint_name,\n",
" deployment_name=demo_deployment.name,\n",
" request_file=request_file_name,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"resp"
]
},
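{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Before plotting, you can parse the raw response and print a compact summary of the predicted labels and confidence scores. This assumes the response is a JSON string containing one result per input image, each with a `boxes` list of detections, as consumed by the visualization cell below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Parse the prediction for the single submitted image and summarize the detections\n",
"predictions = json.loads(resp)[0]\n",
"for detection in predictions[\"boxes\"]:\n",
"    print(f\"label: {detection['label']:<15} score: {detection['score']:.3f}\")"
]
},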
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Visualize detections\n",
"Now that we have scored a test image, we can visualize the bounding boxes and masks for this image."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import json\n",
"import numpy as np\n",
"from PIL import Image\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg\n",
"import matplotlib.patches as patches\n",
"from matplotlib.lines import Line2D\n",
"\n",
"\n",
"img_np = mpimg.imread(sample_image)\n",
"img = Image.fromarray(img_np.astype(\"uint8\"), \"RGB\")\n",
"x, y = img.size\n",
"conf_threshold = 0.6 # display top objects with confidence score > 0.6\n",
"\n",
"# Set a compact figure size\n",
"fig_width = 12\n",
"fig_height = 12\n",
"\n",
"# Initialize figure and axes\n",
"fig = plt.figure(figsize=(fig_width, fig_height))\n",
"gs = fig.add_gridspec(2, 1, height_ratios=[4, 1], hspace=0.2)\n",
"ax1 = fig.add_subplot(gs[0])\n",
"ax2 = fig.add_subplot(gs[1])\n",
"\n",
"# Display the image with bounding boxes and segmentation maps\n",
"ax1.imshow(img_np)\n",
"ax1.axis(\"off\")\n",
"\n",
"# Draw bounding boxes and segmentation maps for each detection\n",
"detections = json.loads(resp)[0]\n",
"sorted_data = sorted(detections[\"boxes\"], key=lambda x: x[\"score\"], reverse=True)\n",
"sorted_scores = []\n",
"sorted_colors = []\n",
"unique_labels = []\n",
"label_counter = {}\n",
"\n",
"for i, detect in enumerate(sorted_data):\n",
" label = detect[\"label\"]\n",
" box = detect[\"box\"]\n",
" polygon = detect[\"polygon\"]\n",
" conf_score = detect[\"score\"]\n",
"\n",
" if conf_score > conf_threshold:\n",
" # Modify labels to make them unique with numbering\n",
" if label not in label_counter:\n",
" label_counter[label] = 1\n",
" unique_labels.append(f\"{label} {label_counter[label]}\")\n",
" else:\n",
" label_counter[label] += 1\n",
" unique_labels.append(f\"{label} {label_counter[label]}\")\n",
"\n",
" current_label = unique_labels[-1]\n",
"\n",
" ymin, xmin, ymax, xmax = (\n",
" box[\"topY\"],\n",
" box[\"topX\"],\n",
" box[\"bottomY\"],\n",
" box[\"bottomX\"],\n",
" )\n",
" topleft_x, topleft_y = x * xmin, y * ymin\n",
" width, height = x * (xmax - xmin), y * (ymax - ymin)\n",
"\n",
" color = np.random.rand(3)\n",
" rect = patches.Rectangle(\n",
" (topleft_x, topleft_y),\n",
" width,\n",
" height,\n",
" linewidth=2,\n",
" edgecolor=color,\n",
" facecolor=\"none\",\n",
" )\n",
"\n",
" ax1.add_patch(rect)\n",
" ax1.text(topleft_x, topleft_y - 10, current_label, color=color, fontsize=20)\n",
"\n",
" polygon_np = np.array(polygon[0])\n",
" polygon_np = polygon_np.reshape(-1, 2)\n",
" polygon_np[:, 0] *= x\n",
" polygon_np[:, 1] *= y\n",
" poly = plt.Polygon(polygon_np, True, facecolor=color, alpha=0.4)\n",
" ax1.add_patch(poly)\n",
" # Draw polyline\n",
" poly_line = Line2D(\n",
" polygon_np[:, 0],\n",
" polygon_np[:, 1],\n",
" linewidth=0.4, # Adjust the line width for the polyline\n",
" color=color, # Set polyline color to match bounding box\n",
" marker=\"o\",\n",
" markersize=2, # Smaller markers for the polyline\n",
" markerfacecolor=color,\n",
" )\n",
" ax1.add_line(poly_line)\n",
" sorted_scores.append(conf_score)\n",
" sorted_colors.append(color)\n",
"\n",
"# Set a stylish color palette\n",
"sns.set_palette(\"pastel\")\n",
"# Create the bar plot without x-axis and y-axis markings\n",
"barplot = sns.barplot(x=sorted_scores, y=unique_labels, palette=sorted_colors, ax=ax2)\n",
"ax2.set_xlabel(\"\") # Remove x-axis label\n",
"ax2.set_ylabel(\"\") # Remove y-axis label\n",
"ax2.set_title(f\"Top {len(sorted_scores)} Object Scores\", fontsize=12)\n",
"\n",
"# Add scores in front of the bars\n",
"for index, value in enumerate(sorted_scores):\n",
" barplot.text(\n",
" value + 0.01, index, f\"{value:.2f}\", va=\"center\", color=\"black\", fontsize=10\n",
" )\n",
"\n",
"# Remove spines and ticks from the bar plot\n",
"barplot.spines[\"left\"].set_visible(False)\n",
"barplot.spines[\"top\"].set_visible(False)\n",
"barplot.spines[\"right\"].set_visible(False)\n",
"barplot.spines[\"bottom\"].set_visible(False)\n",
"barplot.tick_params(left=False, top=False, right=False, bottom=False)\n",
"barplot.xaxis.set_visible(False) # Remove x-axis\n",
"barplot.yaxis.grid(False) # Remove y-axis grid\n",
"\n",
"# Set plot background color\n",
"fig.patch.set_facecolor(\"#F7F7F7\") # Light gray\n",
"\n",
"plt.tight_layout()\n",
"# fig.savefig(\"plot.png\", bbox_inches=\"tight\")\n",
"plt.show()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### 10. Clean up resources - delete the online endpoint\n",
"Don't forget to delete the online endpoint, else you will leave the billing meter running for the compute used by the endpoint."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}