sdk/python/foundation-models/system/evaluation/image-object-detection/image-object-detection.ipynb

{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Image Object-Detection Evaluation\n", "\n", "This sample shows how use the evaluate a group of models against a given set of metrics for the `image-object-detection` task. \n", "\n", "### Evaluation dataset\n", "We will use the [odfridgeObjects](https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-object-detection/odFridgeObjects.zip) dataset.\n", "\n", "### Model\n", "The goal of evaluating models is to compare their performance on a variety of metrics. `image-object-detection` is a generic task type. As such, the models you pick to compare must be finetuned for the same scenario. Given that we have the dataset, we would like to look for models finetuned for this specific scenario. We will compare `mmd-3x-yolof_r50_c5_8x8_1x_coco` and `mmd-3x-sparse-rcnn_r50_fpn_300-proposals_crop-ms-480-800-3x_coco` in this sample, which are available in the `azureml` system registry.\n", "\n", "If you'd like to evaluate models that are not in the system registry, you can import those models to your workspace or organization registry and then evaluate them using the approach outlined in this sample. Review the sample notebook for [importing models](../../../import/import_model_into_registry.ipynb).\n", "\n", "### Outline\n", "1. Install dependencies\n", "2. Setup pre-requisites\n", "3. Pick the models to evaluate\n", "4. Prepare the dataset for fine-tuning the model\n", "5. Submit the evaluation jobs using the model and data as inputs\n", "6. Review evaluation metrics" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 1. Install dependencies\n", "Before starting off, if you are running the notebook on Azure Machine Learning Studio or running first time locally, you will need the following packages" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install azure-ai-ml\n", "%pip install azure-identity" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Setup pre-requisites" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### 2.1 Connect to Azure Machine Learning workspace\n", "\n", "Before we dive in the code, you'll need to connect to your workspace. The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning.\n", "\n", "We are using `DefaultAzureCredential` to get access to workspace. `DefaultAzureCredential` should be capable of handling most scenarios. If you want to learn more about other available credentials, go to [set up authentication doc](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk), [azure-identity reference doc](https://learn.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python).\n", "\n", "Replace `AML_WORKSPACE_NAME`, `RESOURCE_GROUP` and `SUBSCRIPTION_ID` with their respective values in the below cell." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azure.ai.ml import MLClient\n", "from azure.identity import DefaultAzureCredential\n", "\n", "\n", "experiment_name = (\n", " \"AzureML-Train-Finetune-Vision-OD-Samples\" # can rename to any valid name\n", ")\n", "\n", "credential = DefaultAzureCredential()\n", "workspace_ml_client = None\n", "try:\n", " workspace_ml_client = MLClient.from_config(credential)\n", " subscription_id = workspace_ml_client.subscription_id\n", " resource_group = workspace_ml_client.resource_group_name\n", " workspace_name = workspace_ml_client.workspace_name\n", "except Exception as ex:\n", " print(ex)\n", " # Enter details of your AML workspace\n", " subscription_id = \"SUBSCRIPTION_ID\"\n", " resource_group = \"RESOURCE_GROUP\"\n", " workspace_name = \"AML_WORKSPACE_NAME\"\n", "\n", " workspace_ml_client = MLClient(\n", " credential, subscription_id, resource_group, workspace_name\n", " )\n", "\n", "registry_ml_client = MLClient(\n", " credential,\n", " subscription_id,\n", " resource_group,\n", " registry_name=\"azureml\",\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### 2.2 Create compute\n", "\n", "In order to finetune a model on Azure Machine Learning studio, you will need to create a compute resource first. **Creating a compute will take 3-4 minutes.** \n", "\n", "For additional references, see [Azure Machine Learning in a Day](https://github.com/Azure/azureml-examples/blob/main/tutorials/azureml-in-a-day/azureml-in-a-day.ipynb). " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "import warnings\n", "\n", "from azure.ai.ml.entities import AmlCompute\n", "from azure.core.exceptions import ResourceNotFoundError\n", "\n", "model_evaluation_cluster_name = \"sample-model-evaluation-compute\"\n", "\n", "try:\n", " model_evaluation_compute = workspace_ml_client.compute.get(\n", " model_evaluation_cluster_name\n", " )\n", " print(\"Found existing compute target.\")\n", "except ResourceNotFoundError:\n", " print(\"Creating a new compute target...\")\n", " model_evaluation_compute = AmlCompute(\n", " name=model_evaluation_cluster_name,\n", " type=\"amlcompute\",\n", " size=\"Standard_NC6s_v3\",\n", " idle_time_before_scale_down=120,\n", " min_instances=0,\n", " max_instances=4,\n", " )\n", " workspace_ml_client.begin_create_or_update(model_evaluation_compute).result()\n", "\n", "compute_instance_type = model_evaluation_compute.size\n", "print(f\"Instance type: {compute_instance_type}\")\n", "\n", "if compute_instance_type != \"STANDARD_NC6S_V3\":\n", " # Print a warning message if compute type is not 'STANDARD_NC6S_V3', i.e. Single GPU V100\n", " warning_message = (\n", " \"Warning! Currently evaluation is only supported on STANDARD_NC6S_V3 compute type.\"\n", " \" Please change the compute type to STANDARD_NC6S_V3 if you want to run evaluation.\"\n", " )\n", " warnings.warn(warning_message, category=Warning)\n", "# generating a unique timestamp that can be used for names and versions that need to be unique\n", "timestamp = str(int(time.time()))" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Below snippet will allow us to query number of GPU's present on the compute. We can use it to set `gpu_per_node` to ensure utilization of all GPUs in the node." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This is the number of GPUs in a single node of the selected 'vm_size' compute.\n", "# Setting this to less than the number of GPUs will result in underutilized GPUs, taking longer to train.\n", "# Setting this to more than the number of GPUs will result in an error.\n", "gpus_per_node = 1 # default value\n", "gpu_count_found = False\n", "ws_computes = workspace_ml_client.compute.list_sizes()\n", "for ws_compute in ws_computes:\n", " if ws_compute.name.lower() == model_evaluation_compute.size.lower():\n", " gpus_per_node = ws_compute.gpus\n", " print(f\"Number of GPUs in compute {ws_compute.name} are {ws_compute.gpus}\")\n", "# if gpu_count_found not found, then print an error\n", "if gpus_per_node > 0:\n", " gpu_count_found = True\n", "else:\n", " gpu_count_found = False\n", " print(\n", " f\"No GPUs found in compute. Number of GPUs in compute {model_evaluation_compute.size} 0.\"\n", " )" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 3. Pick the models to evaluate\n", "\n", "You can evaluate the pretrained models on the repective datasets. Verify that the models selected for evaluation are available in system registry using the below code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "registry_models = [\n", " {\"name\": \"mmd-3x-yolof_r50_c5_8x8_1x_coco\"},\n", " {\"name\": \"mmd-3x-sparse-rcnn_r50_fpn_300-proposals_crop-ms-480-800-3x_coco\"},\n", "]\n", "for model in registry_models:\n", " all_models = registry_ml_client.models.list(model[\"name\"])\n", " latest_model = max(all_models, key=lambda x: int(x.version))\n", " print(latest_model.id)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "In this demo notebook, we are using fridge objects dataset. Due to the differences in the labels of the dataset used for pretrained models and that of fridge object dataset, the pretrained models can't be evalauted on the fridge dataset.\n", "\n", "Therefore, for the scope of this notebook, we request you to finetune model(s) for fridge objects dataset using [mmdetection-fridgeobjects-object-detection.ipynb](../../finetune/image-object-detection/mmdetection-fridgeobjects-object-detection.ipynb). To finetune and register the model(s), you need to run the above notebook upto \"7. Register the fine tuned model with the workspace\" Once you have finetuned and registered the model(s) using above notebook, you can proceed further. \n", "- Replace `REGISTERED_MODEL_NAME_1/REGISTERED_MODEL_NAME_2` and `REGISTERED_MODEL_VERSION_1/REGISTERED_MODEL_VERSION_2` with that of the registered models from above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "finetuned_registered_models = [\n", " {\"name\": \"REGISTERED_MODEL_NAME_1\", \"version\": \"REGISTERED_MODEL_VERSION_1\"},\n", " {\"name\": \"REGISTERED_MODEL_NAME_2\", \"version\": \"REGISTERED_MODEL_VERSION_2\"},\n", "]\n", "\n", "for model in finetuned_registered_models:\n", " model = workspace_ml_client.models.get(model[\"name\"], version=model[\"version\"])\n", " print(model.id)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 4. 
Prepare the dataset for evaluating the model\n", "\n", "We will use the [odfridgeObjects](https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-object-detection/odFridgeObjects.zip), a toy dataset called Fridge Objects, which consists of 128 images of 4 classes of beverage containers {`can`, `carton`, `milk bottle`, `water bottle`}, with photos taken on different backgrounds.\n", "\n", "All images in this notebook are hosted in [this repository](https://github.com/microsoft/computervision-recipes) and are made available under the [MIT license](https://github.com/microsoft/computervision-recipes/blob/master/LICENSE).\n", "\n", "#### 4.1 Download the Data\n", "We first download and unzip the data locally. By default, the data is downloaded into the `./data` folder in the current directory. \n", "If you prefer to download the data to a different location, update `dataset_parent_dir = ...` in the following cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import urllib\n", "from zipfile import ZipFile\n", "\n", "# Change to a different location if you prefer\n", "dataset_parent_dir = \"./data\"\n", "\n", "# Create data folder if it doesn't exist.\n", "os.makedirs(dataset_parent_dir, exist_ok=True)\n", "\n", "# Download data\n", "download_url = \"https://automlsamplenotebookdata-adcuc7f7bqhhh8a4.b02.azurefd.net/image-object-detection/odFridgeObjects.zip\"\n", "\n", "# Extract current dataset name from dataset url\n", "dataset_name = os.path.split(download_url)[-1].split(\".\")[0]\n", "\n", "# Get dataset path for later use\n", "dataset_dir = os.path.join(dataset_parent_dir, dataset_name)\n", "\n", "# Get the data zip file path\n", "data_file = os.path.join(dataset_parent_dir, f\"{dataset_name}.zip\")\n", "\n", "# Download the dataset\n", "urllib.request.urlretrieve(download_url, filename=data_file)\n", "\n", "# Extract files\n", "with ZipFile(data_file, \"r\") as zip_file:\n", " print(\"extracting files...\")\n", " zip_file.extractall(path=dataset_parent_dir)\n", " print(\"done\")\n", "\n", "# Delete zip file\n", "os.remove(data_file)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.display import Image\n", "\n", "sample_image = os.path.join(dataset_dir, \"images\", \"31.jpg\")\n", "Image(filename=sample_image)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### 4.2 Upload the images to Datastore through an AML Data asset (URI Folder)\n", "\n", "In order to use the data for evaluation in Azure ML, we upload it to the default Azure Blob Storage of our Azure ML Workspace." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Uploading image files by creating a 'data asset URI FOLDER':\n", "\n", "from azure.ai.ml.entities import Data\n", "from azure.ai.ml.constants import AssetTypes\n", "\n", "my_data = Data(\n", " path=dataset_dir,\n", " type=AssetTypes.URI_FOLDER,\n", " description=\"Fridge-items images Object detection\",\n", " name=\"fridge-items-images-object-detection\",\n", ")\n", "\n", "uri_folder_data_asset = workspace_ml_client.data.create_or_update(my_data)\n", "\n", "print(uri_folder_data_asset)\n", "print(\"\")\n", "print(\"Path to folder in Blob Storage:\")\n", "print(uri_folder_data_asset.path)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### 4.3 Convert the downloaded data to JSONL\n", "\n", "In this example, the fridge object dataset is annotated in Pascal VOC format, where each image corresponds to an xml file. Each xml file contains information on where its corresponding image file is located and also contains information about the bounding boxes and the object labels. \n", "\n", "For documentation on preparing the datasets beyond this notebook, please refer to the [documentation on how to prepare datasets](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-prepare-datasets-for-automl-images).\n", "\n", "\n", "In order to use this data to create an AzureML MLTable, we first need to convert it to the required JSONL format. The following script is creating two `.jsonl` files (one for training and one for validation) in the corresponding MLTable folder. In this example, 20% of the data is kept for validation. For further details on jsonl file used for image object detection task in automated ml, please refer to the [data schema documentation for image object-detection task](https://learn.microsoft.com/en-us/azure/machine-learning/reference-automl-images-schema#object-detection)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install pycocotools\n", "!pip install simplification\n", "!pip install scikit-image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import sys\n", "\n", "sys.path.insert(0, \"../../../../jobs/automl-standalone-jobs/jsonl-conversion/\")\n", "from base_jsonl_converter import write_json_lines\n", "from voc_jsonl_converter import VOCJSONLConverter\n", "\n", "base_url = os.path.join(uri_folder_data_asset.path, \"images/\")\n", "converter = VOCJSONLConverter(base_url, os.path.join(dataset_dir, \"annotations\"))\n", "jsonl_annotations = os.path.join(dataset_dir, \"annotations_voc.jsonl\")\n", "write_json_lines(converter, jsonl_annotations)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If you want to try with a dataset in COCO format, the scripts below shows how to convert it to `jsonl` format. 
The file \"odFridgeObjects_coco.json\" consists of annotation information for the `odFridgeObjects` dataset.\n", "\n", "# import sys\n", "# sys.path.insert(0, \"../../../../jobs/automl-standalone-jobs/jsonl-conversion/\")\n", "# from base_jsonl_converter import write_json_lines\n", "# from coco_jsonl_converter import COCOJSONLConverter\n", "\n", "# base_url = os.path.join(uri_folder_data_asset.path, \"images/\")\n", "# print(base_url)\n", "# converter = COCOJSONLConverter(base_url, \"./odFridgeObjects_coco.json\")\n", "# jsonl_annotations = os.path.join(dataset_dir, \"annotations_coco.jsonl\")\n", "# write_json_lines(converter, jsonl_annotations)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Now split the annotations into train and validation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# We'll copy each JSONL file within its related MLTable folder\n", "validation_mltable_path = os.path.join(dataset_parent_dir, \"validation-mltable-folder\")\n", "\n", "# First, let's create the folders if they don't exist\n", "os.makedirs(validation_mltable_path, exist_ok=True)\n", "\n", "train_validation_ratio = 5\n", "\n", "# Path to the validation files\n", "validation_annotations_file = os.path.join(\n", " validation_mltable_path, \"validation_annotations.jsonl\"\n", ")\n", "\n", "with open(jsonl_annotations, \"r\") as annot_f:\n", " json_lines = annot_f.readlines()\n", "\n", "index = 0\n", "with open(validation_annotations_file, \"w\") as validation_f:\n", " for json_line in json_lines:\n", " if index % train_validation_ratio == 0:\n", " # validation annotation\n", " validation_f.write(json_line)\n", " else:\n", " # train annotation\n", " pass\n", " index += 1" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### 4.4 Create MLTable data input\n", "\n", "Create MLTable data input using the jsonl files created above.\n", "\n", "For documentation on creating your own MLTable assets for jobs beyond this notebook, please refer to below resources\n", "- [MLTable YAML Schema](https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-mltable) - covers how to write MLTable YAML, which is required for each MLTable asset.\n", "- [Create MLTable data asset](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-data-assets?tabs=Python-SDK#create-a-mltable-data-asset) - covers how to create MLTable data asset. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def create_ml_table_file(filename):\n", " \"\"\"Create ML Table definition\"\"\"\n", "\n", " return (\n", " \"paths:\\n\"\n", " \" - file: ./{0}\\n\"\n", " \"transformations:\\n\"\n", " \" - read_json_lines:\\n\"\n", " \" encoding: utf8\\n\"\n", " \" invalid_lines: error\\n\"\n", " \" include_path_column: false\\n\"\n", " \" - convert_column_types:\\n\"\n", " \" - columns: image_url\\n\"\n", " \" column_type: stream_info\"\n", " ).format(filename)\n", "\n", "\n", "def save_ml_table_file(output_path, mltable_file_contents):\n", " with open(os.path.join(output_path, \"MLTable\"), \"w\") as f:\n", " f.write(mltable_file_contents)\n", "\n", "\n", "# Create and save validation mltable\n", "validation_mltable_file_contents = create_ml_table_file(\n", " os.path.basename(validation_annotations_file)\n", ")\n", "save_ml_table_file(validation_mltable_path, validation_mltable_file_contents)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 5. Submit the evaluation jobs using the model and data as inputs\n", " \n", "Create the job that uses the `model_evaluation_pipeline` component. We will submit one job per model. \n", "\n", "Note that the metrics that the evaluation jobs need to calculate are specified in the [fridge-eval-config.json](./fridge-eval-config.json) file.\n", "\n", "All supported evaluation configurations for `image-object-detection` can be found in [README](./README.md)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azure.ai.ml.dsl import pipeline\n", "from azure.ai.ml import Input\n", "from azure.ai.ml.constants import AssetTypes\n", "\n", "# fetch the pipeline component\n", "pipeline_component_func = registry_ml_client.components.get(\n", " name=\"model_evaluation_pipeline\", label=\"latest\"\n", ")\n", "\n", "# define the pipeline job\n", "@pipeline()\n", "def evaluation_pipeline(mlflow_model):\n", " evaluation_job = pipeline_component_func(\n", " # specify the foundation model available in the azureml system registry or a model from the workspace\n", " # mlflow_model = Input(type=AssetTypes.MLFLOW_MODEL, path=f\"{mlflow_model_path}\"),\n", " mlflow_model=mlflow_model,\n", " # test data\n", " test_data=Input(type=AssetTypes.MLTABLE, path=validation_mltable_path),\n", " # The following parameters map to the dataset fields\n", " label_column_name=\"label\",\n", " input_column_names=\"image_url\",\n", " # Evaluation settings\n", " task=\"image-object-detection\",\n", " # config file containing the details of evaluation metrics to calculate\n", " evaluation_config=Input(\n", " type=AssetTypes.URI_FILE, path=\"./fridge-eval-config.json\"\n", " ),\n", " # config cluster/device job is running on\n", " # set device to GPU/CPU on basis if GPU count was found\n", " compute_name=model_evaluation_cluster_name,\n", " instance_type=compute_instance_type,\n", " device=\"gpu\" if gpu_count_found else \"cpu\",\n", " )\n", " return {\"evaluation_result\": evaluation_job.outputs.evaluation_result}" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Submit the jobs, passing the model as a parameter to the pipeline created in the above step." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# submit the pipeline job for each model that we want to evaluate\n", "# you could consider submitting the pipeline jobs in parallel, provided your cluster has multiple nodes\n", "pipeline_jobs = []\n", "\n", "for model in finetuned_registered_models:\n", "\n", " # # For each model, fetch the model object from the registry\n", " # model_object = registry_ml_client.models.get(\n", " # model[\"name\"], version=model[\"version\"]\n", " # )\n", "\n", " # Fetch the model from workspace\n", " model_object = workspace_ml_client.models.get(\n", " model[\"name\"], version=model[\"version\"]\n", " )\n", "\n", " pipeline_object = evaluation_pipeline(\n", " mlflow_model=Input(type=AssetTypes.MLFLOW_MODEL, path=f\"{model_object.id}\"),\n", " )\n", " # don't reuse cached results from previous jobs\n", " pipeline_object.settings.force_rerun = True\n", " pipeline_object.settings.default_compute = model_evaluation_cluster_name\n", " pipeline_object.display_name = f\"eval-{model['name']}-{timestamp}\"\n", " pipeline_job = workspace_ml_client.jobs.create_or_update(\n", " pipeline_object, experiment_name=experiment_name\n", " )\n", " # add model['name'] and pipeline_job.name as key value pairs to a dictionary\n", " pipeline_jobs.append({\"model_name\": model[\"name\"], \"job_name\": pipeline_job.name})\n", " # wait for the pipeline job to complete\n", " workspace_ml_client.jobs.stream(pipeline_job.name)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 6. Review evaluation metrics\n", "Viewing the job in AzureML studio is the best way to analyze logs, metrics and outputs of jobs. You can create custom charts and compare metics across different jobs. See https://learn.microsoft.com/en-us/azure/machine-learning/how-to-log-view-metrics?tabs=interactive#view-jobsruns-information-in-the-studio to learn more. \n", "\n", "However, we may need to access and review metrics programmatically for which we will use MLflow, which is the recommended client for logging and querying metrics." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### 6.1 Initialize MLFlow Client\n", "\n", "The models and artifacts that are produced by Azure ML can be accessed via the MLFlow interface.\n", "Initialize the MLFlow client here, and set the backend as Azure ML, via. 
the MLFlow Client.\n", "\n", "IMPORTANT - You need to have installed the latest MLFlow packages with:\n", "\n", " pip install azureml-mlflow\n", " pip install mlflow" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlflow\n", "\n", "# Obtain the tracking URI from MLClient\n", "MLFLOW_TRACKING_URI = workspace_ml_client.workspaces.get(\n", " name=workspace_ml_client.workspace_name\n", ").mlflow_tracking_uri\n", "\n", "print(MLFLOW_TRACKING_URI)\n", "\n", "# Set the MLFLOW TRACKING URI\n", "mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)\n", "print(f\"\\nCurrent tracking uri: {mlflow.get_tracking_uri()}\")\n", "\n", "from mlflow.tracking.client import MlflowClient\n", "\n", "# Initialize MLFlow client\n", "mlflow_client = MlflowClient()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### 6.2 Get the evaluation metrics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "metrics_df = pd.DataFrame()\n", "\n", "for job in pipeline_jobs:\n", " # Build the filter string: tags.mlflow.rootRunId='<job name>'\n", " filter = \"tags.mlflow.rootRunId='\" + job[\"job_name\"] + \"'\"\n", " runs = mlflow.search_runs(\n", " experiment_names=[experiment_name], filter_string=filter, output_format=\"list\"\n", " )\n", "\n", " # Get the evaluation runs.\n", " for run in runs:\n", " # check if the mean_average_precision metric exists in the run\n", " if \"mean_average_precision\" in run.data.metrics:\n", " # get the metrics from the mlflow run\n", " run_metric = run.data.metrics\n", " # add the model name to the run_metric dictionary\n", " run_metric[\"model_name\"] = job[\"model_name\"]\n", " # convert the run_metric dictionary to a pandas dataframe\n", " temp_df = pd.DataFrame(run_metric, index=[0])\n", " # concat the temp_df to the metrics_df\n", " metrics_df = pd.concat([metrics_df, temp_df], ignore_index=True)\n", "\n", "# move the model_name column to the front\n", "cols = metrics_df.columns.tolist()\n", "cols = cols[-1:] + cols[:-1]\n", "metrics_df = metrics_df[cols]\n", "metrics_df.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 2 }