notebooks/official/model_evaluation/automl_video_classification_model_evaluation.ipynb (952 lines of code) (raw):
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "copyright"
},
"outputs": [],
"source": [
"# Copyright 2022 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "title"
},
"source": [
"# Vertex AI Pipelines: Evaluating batch prediction results from AutoML video classification model\n",
"\n",
"<table align=\"left\">\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_evaluation/automl_video_classification_model_evaluation.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Fmodel_evaluation%2Fautoml_video_classification_model_evaluation.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_evaluation/automl_video_classification_model_evaluation.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
"<a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/model_evaluation/automl_video_classification_model_evaluation.ipynb\" target='_blank'>\n",
" <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Vertex AI Workbench\n",
" </a>\n",
" </td>\n",
"</table>\n",
"<br/><br/><br/>\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "098dd9090e65"
},
"source": [
"## Overview\n",
"\n",
"This notebook demonstrates how to use the Vertex AI classification model evaluation component to evaluate an AutoML video classification model. Model evaluation helps you determine your model performance based on the evaluation metrics and improve the model if necessary. \n",
"\n",
"Learn more about [Vertex AI Model Evaluation](https://cloud.google.com/vertex-ai/docs/evaluation/introduction) and [Classification for video data](https://cloud.google.com/vertex-ai/docs/training-overview#classification_for_videos)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "objective:automl,training,batch_prediction"
},
"source": [
"### Objective\n",
"\n",
"In this tutorial, you learn how to train a Vertex AI AutoML video classification model and learn how to evaluate it through a Vertex AI pipeline job using google_cloud_pipeline_components:\n",
"\n",
"This tutorial uses the following Google Cloud ML services and resources:\n",
"\n",
"- Vertex AI dataset\n",
"- Vertex AI Training(AutoML video Classification) \n",
"- Vertex AI Model Registry\n",
"- Vertex AI Pipelines\n",
"- Vertex AI batch prediction\n",
"\n",
"\n",
"\n",
"The steps performed include:\n",
"\n",
"- Create a Vertex AI dataset.\n",
"- Train a Automl video Classification model on the Vertex AI dataset resource.\n",
"- Import the trained AutoML Vertex AI Model resource into the pipeline.\n",
"- Run a batch prediction job inside the pipeline.\n",
"- Evaluate the AutoML model using the classification evaluation component.\n",
"- Import the classification metrics to the AutoML Vertex AI Model resource."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dataset:hmdb,vcn"
},
"source": [
"### Dataset\n",
"\n",
"The dataset used for this tutorial is the golf swing recognition portion of the [Human Motion dataset from MIT](http://cbcl.mit.edu/publications/ps/Kuehne_etal_iccv11.pdf). The version of the dataset you use in this tutorial is stored in a public Cloud Storage bucket. The trained model predicts the start frame where a golf swing begins.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "costs"
},
"source": [
"### Costs\n",
"\n",
"This tutorial uses billable components of Google Cloud:\n",
"\n",
"* Vertex AI\n",
"* Cloud Storage\n",
"\n",
"Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and \n",
"[Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the \n",
"[Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "install_aip:mbsdk"
},
"source": [
"## Get started\n",
"Install Vertex AI SDK for Python and other required packages"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2abdd254e90f"
},
"outputs": [],
"source": [
"! pip3 install --upgrade --quiet google-cloud-aiplatform \\\n",
" google-cloud-pipeline-components \\\n",
" google-cloud-storage \\\n",
" matplotlib"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5eec42e37bcf"
},
"source": [
"### Restart runtime (Colab only)\n",
"To use the newly installed packages, you must restart the runtime on Google Colab."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dcc98768955f"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
"\n",
" import IPython\n",
"\n",
" app = IPython.Application.instance()\n",
" app.kernel.do_shutdown(True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4de1bd77992b"
},
"source": [
"<div class=\"alert alert-block alert-warning\">,\n",
"<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>,\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "56e219dbcb9a"
},
"source": [
"### Authenticate your notebook environment (Colab only)\n",
"Authenticate your environment on Google Colab."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "c97be6a73155"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
"\n",
" from google.colab import auth\n",
"\n",
" auth.authenticate_user()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fdaaecbb2a27"
},
"source": [
"### Set Google Cloud project information\n",
"To get started using Vertex AI, you must have an existing Google Cloud project. Learn more about [setting up a project and a development environment.](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "e33244c6e6b5"
},
"outputs": [],
"source": [
"PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
"LOCATION = \"us-central1\" # @param {type:\"string\"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bucket:mbsdk"
},
"source": [
"### Create a Cloud Storage bucket\n",
"\n",
"Create a storage bucket to store intermediate artifacts such as datasets."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bucket"
},
"outputs": [],
"source": [
"BUCKET_URI = f\"gs://your-bucket-name-{PROJECT_ID}-unique\" # @param {type:\"string\"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "autoset_bucket"
},
"source": [
"**If your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "91c46850b49b"
},
"outputs": [],
"source": [
"! gsutil mb -l $LOCATION -p $PROJECT_ID $BUCKET_URI"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8debaa04cb14"
},
"source": [
"#### Service Account\n",
"\n",
"You use a service account to create Vertex AI Pipeline jobs. If you don't want to use your project's Compute Engine service account, set `SERVICE_ACCOUNT` to another service account ID."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "77b01a1fdbb4"
},
"outputs": [],
"source": [
"SERVICE_ACCOUNT = \"[your-service-account]\" # @param {type:\"string\"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "f936bebda2d4"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"IS_COLAB = \"google.colab\" in sys.modules\n",
"\n",
"if (\n",
" SERVICE_ACCOUNT == \"\"\n",
" or SERVICE_ACCOUNT is None\n",
" or SERVICE_ACCOUNT == \"[your-service-account]\"\n",
"):\n",
" # Get your service account from gcloud\n",
" if not IS_COLAB:\n",
" shell_output = !gcloud auth list 2>/dev/null\n",
" SERVICE_ACCOUNT = shell_output[2].replace(\"*\", \"\").strip()\n",
"\n",
" else: # IS_COLAB:\n",
" shell_output = ! gcloud projects describe $PROJECT_ID\n",
" project_number = shell_output[-1].split(\":\")[1].strip().replace(\"'\", \"\")\n",
" SERVICE_ACCOUNT = f\"{project_number}-compute@developer.gserviceaccount.com\"\n",
"\n",
" print(\"Service Account:\", SERVICE_ACCOUNT)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "b70f72422518"
},
"source": [
"#### Set service account access for Vertex AI Pipelines\n",
"Run the following commands to grant your service account access to read and write pipeline artifacts in the bucket that you created in the previous step. You only need to run this step once per service account."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "a533af977189"
},
"outputs": [],
"source": [
"! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.objectCreator $BUCKET_URI\n",
"\n",
"! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.objectViewer $BUCKET_URI"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "setup_vars"
},
"source": [
"### Import libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "import_aip:mbsdk"
},
"outputs": [],
"source": [
"import json\n",
"\n",
"import google.cloud.aiplatform as aiplatform\n",
"import matplotlib.pyplot as plt\n",
"from google.cloud import aiplatform_v1, storage"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "init_aip:mbsdk"
},
"source": [
"### Initialize Vertex AI SDK for Python\n",
"\n",
"Initialize the Vertex AI SDK for Python for your project and corresponding bucket."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "init_aip:mbsdk"
},
"outputs": [],
"source": [
"aiplatform.init(project=PROJECT_ID, staging_bucket=BUCKET_URI)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "import_file:u_dataset,csv"
},
"source": [
"### Location of training data\n",
"\n",
"Now set the variable IMPORT_FILE to the location of the CSV index file in Cloud Storage."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "import_file:hmdb,csv,vcn"
},
"outputs": [],
"source": [
"IMPORT_FILE = (\n",
" \"gs://cloud-samples-data/video/automl_classification/hmdb_split_40_mp4_step2.csv\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "quick_peek:csv"
},
"source": [
"### Quick peek at your data\n",
"\n",
"This tutorial uses a version of the MIT Human Motion dataset that is stored in a public Cloud Storage bucket, using a CSV index file.\n",
"\n",
"Start by doing a quick peek at the data. Count the number of examples by counting the number of rows in the CSV index file (wc -l) and then peek at the first few rows."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "quick_peek:csv"
},
"outputs": [],
"source": [
"count = ! gsutil cat $IMPORT_FILE | wc -l\n",
"print(\"Number of Examples\", int(count[0]))\n",
"\n",
"print(\"First 10 rows\")\n",
"! gsutil cat $IMPORT_FILE | head"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "create_dataset:video,vcn"
},
"source": [
"### Create the Dataset\n",
"\n",
"Next, create the Vertex AI dataset resource using the `create` method for the VideoDataset class, which takes the following parameters:\n",
"\n",
"- display_name: The human readable name for the Vertex AI dataset resource.\n",
"- gcs_source: A list of one or more dataset index files to import the data items into the Vertex AI Dataset resource.\n",
"\n",
"This operation may take several minutes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "create_dataset:video,vcn"
},
"outputs": [],
"source": [
"dataset = aiplatform.VideoDataset.create(\n",
" display_name=\"MIT Human Motion\",\n",
" gcs_source=[IMPORT_FILE],\n",
" import_schema_uri=aiplatform.schema.dataset.ioformat.video.classification,\n",
")\n",
"\n",
"print(dataset.resource_name)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "create_automl_pipeline:video,vcn"
},
"source": [
"### Create and run training pipeline\n",
"\n",
"To train an AutoML model, you perform two steps:\n",
"\n",
"1. Create a training pipeline.\n",
"2. Run the pipeline.\n",
"\n",
"\n",
"\n",
"#### Create the training pipeline\n",
"\n",
"An AutoML training pipeline is created with the AutoMLVideoTrainingJob class, with the following parameters:\n",
"\n",
"- display_name: The human readable name for the TrainingJob resource.\n",
"- prediction_type: The type task to train the model for.\n",
" - classification: A video classification model.\n",
" - object_tracking: A video object tracking model.\n",
" - action_recognition: A video action recognition model.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "create_automl_pipeline:video,vcn"
},
"outputs": [],
"source": [
"training_job = aiplatform.AutoMLVideoTrainingJob(\n",
" display_name=\"hmdb\",\n",
" prediction_type=\"classification\",\n",
")\n",
"\n",
"print(training_job)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "run_automl_pipeline:video"
},
"source": [
"#### Run the training pipeline\n",
"\n",
"Next,run the job to start the training job by invoking the `run` method with the following parameters:\n",
"\n",
"- dataset: The Vertex AI dataset resource to train the model.\n",
"- model_display_name: The human readable name for the trained model.\n",
"- training_fraction_split: The percentage of the dataset to use for training.\n",
"- test_fraction_split: The percentage of the dataset to use for testing (holdout data).\n",
"\n",
"The `run` method when completed returns the Model resource.\n",
"\n",
"The execution of the training pipeline can take over 24 hours to complete."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3eaba926cdfa"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"if os.getenv(\"IS_TESTING\"):\n",
" sys.exit(0)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "run_automl_pipeline:video"
},
"outputs": [],
"source": [
"model = training_job.run(\n",
" dataset=dataset,\n",
" model_display_name=\"hmdb\",\n",
" training_fraction_split=0.8,\n",
" test_fraction_split=0.2,\n",
")\n",
"\n",
"print(model)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "evaluate_the_model:mbsdk"
},
"source": [
"## List model evaluations from training \n",
"After your model has finished training, you can review the evaluation scores for it.\n",
"\n",
"You can check the model's evaluation results using the `get_model_evaluation` method of the Vertex AI Model resource.\n",
"\n",
"Just like Vertex AI datasets, you can either use the reference to the model variable you created when you trained the model or you can filter from the list of all of the models in your project using the model's display name as given below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "evaluate_the_model:mbsdk"
},
"outputs": [],
"source": [
"# Get Vertex AI Model resource ID using the display_name\n",
"models = aiplatform.Model.list(filter=\"display_name=hmdb\")\n",
"\n",
"if len(models) != 0:\n",
" # Get the model object\n",
" MODEL_RSC_NAME = models[0].resource_name\n",
" print(\"Vertex AI Model resource name:\", MODEL_RSC_NAME)\n",
" model = aiplatform.Model(MODEL_RSC_NAME)\n",
"\n",
" # Print the evaluation metrics\n",
" model_eval = model.get_model_evaluation()\n",
" evaluation = model_eval.to_dict()\n",
" print(\"Model's evaluation metrics from Training:\\n\")\n",
" metrics = evaluation[\"metrics\"]\n",
" for metric in metrics.keys():\n",
" print(f\"metric: {metric}, value: {metrics[metric]}\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "get_test_items:batch_prediction"
},
"source": [
"### Get test item(s)\n",
"\n",
"Inside the pipeline, you need some data samples for creating a batch prediction job. So, you use some arbitrary examples from the dataset as test items."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "get_test_items:automl,vcn,csv"
},
"outputs": [],
"source": [
"test_items = ! gsutil cat $IMPORT_FILE | head -n2\n",
"\n",
"if len(test_items[0]) == 5:\n",
" _, test_item_1, test_label_1, _, _ = str(test_items[0]).split(\",\")\n",
" _, test_item_2, test_label_2, _, _ = str(test_items[1]).split(\",\")\n",
"else:\n",
" test_item_1, test_label_1, _, _ = str(test_items[0]).split(\",\")\n",
" test_item_2, test_label_2, _, _ = str(test_items[1]).split(\",\")\n",
"\n",
"\n",
"print(test_item_1, test_label_1)\n",
"print(test_item_2, test_label_2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dcb49010e512"
},
"source": [
"### Copy test item(s)\n",
"For the batch prediction, copy the test items over to your Cloud Storage bucket."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "874f9ab72908"
},
"outputs": [],
"source": [
"file_1 = test_item_1.split(\"/\")[-1]\n",
"file_2 = test_item_2.split(\"/\")[-1]\n",
"\n",
"! gsutil cp $test_item_1 $BUCKET_URI/$file_1\n",
"! gsutil cp $test_item_2 $BUCKET_URI/$file_2\n",
"\n",
"test_item_1 = BUCKET_URI + \"/\" + file_1\n",
"test_item_2 = BUCKET_URI + \"/\" + file_2"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "make_batch_file:automl,video"
},
"source": [
"### Make a Pipeline input file\n",
"\n",
"Now, make an input file for your evaluation pipeline and store it in the Cloud Storage bucket. The input file is stored in JSONL format for this tutorial. In the JSONL file, you make one dictionary entry per line for each video file. The dictionary contains the following key-value pairs:\n",
"\n",
"- content: The Cloud Storage path to the video.\n",
"- mimeType: The content type. In our example, it's an avi file.\n",
"- timeSegmentStart: The start timestamp in the video to do prediction on. *Note*, the timestamp must be specified as a string and followed by s (second), m (minute) or h (hour).\n",
"- timeSegmentEnd: The end timestamp in the video to do prediction on.\n",
"- outputLabel: The batch prediction labels."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "116377999a9f"
},
"outputs": [],
"source": [
"test_filename = \"ground_truth.jsonl\"\n",
"gcs_ground_truth_uri = BUCKET_URI + \"/\" + test_filename\n",
"\n",
"data_1 = {\n",
" \"content\": test_item_1,\n",
" \"mimeType\": \"video/mp4\",\n",
" \"timeSegmentStart\": \"0.0s\",\n",
" \"timeSegmentEnd\": \"5.0s\",\n",
" \"outputLabel\": test_label_1,\n",
"}\n",
"data_2 = {\n",
" \"content\": test_item_2,\n",
" \"mimeType\": \"video/mp4\",\n",
" \"timeSegmentStart\": \"0.0s\",\n",
" \"timeSegmentEnd\": \"5.0s\",\n",
" \"outputLabel\": test_label_2,\n",
"}\n",
"\n",
"\n",
"bucket = storage.Client(project=PROJECT_ID).bucket(BUCKET_URI[5:])\n",
"blob = bucket.blob(blob_name=test_filename)\n",
"data = json.dumps(data_1) + \"\\n\" + json.dumps(data_2) + \"\\n\"\n",
"blob.upload_from_string(data)\n",
"print(gcs_ground_truth_uri)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "d56366168ec5"
},
"source": [
"### Check input content\n",
"Check the contents of the ground_truth.jsonl."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "48889e396969"
},
"outputs": [],
"source": [
"! gsutil cat $gcs_ground_truth_uri"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "f55310caafeb"
},
"source": [
"## Run a pipeline for model evaluation\n",
"\n",
"Now, you run a Vertex AI batch prediction job and generate evaluations and feature attributions on its results using a pipeline. \n",
"\n",
"To do so, create a Vertex AI pipeline by calling the `evaluate` function. Learn more about [evaluate function](https://github.com/googleapis/python-aiplatform/blob/main/google/cloud/aiplatform/models.py#L5127)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "269a8f6f50ed"
},
"source": [
"### Define parameters to run the evaluate function\n",
"\n",
"Specify the required parameters to run the `evaluate` function. \n",
"\n",
"The `evaluate` function parameters are as follows:\n",
"\n",
"- prediction_type: The problem type being addressed by this evaluation run. 'classification' and 'regression' are the currently supported problem types.\n",
"- target_field_name: Name of the column to be used as the target for classification.\n",
"- gcs_source_uris: List of the Cloud Storage bucket uris of input instances for batch prediction.\n",
"- class_labels: List of class labels in the target column.\n",
"- generate_feature_attributions: Optional. Whether the model evaluation job should generate feature attributions. Defaults to False ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "59be2edbff8b"
},
"outputs": [],
"source": [
"LABEL_COLUMN = \"outputLabel\"\n",
"CLASS_LABELS = [\"brush_hair\", \"cartwheel\"]\n",
"\n",
"job = model.evaluate(\n",
" prediction_type=\"classification\",\n",
" target_field_name=LABEL_COLUMN,\n",
" gcs_source_uris=[gcs_ground_truth_uri],\n",
" class_labels=CLASS_LABELS,\n",
" generate_feature_attributions=False,\n",
")\n",
"\n",
"print(\"Waiting model evaluation is in process\")\n",
"job.wait()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "batch_request:mbsdk"
},
"source": [
"### Get the Model Evaluation Results\n",
"After the evalution pipeline is finished, run the below cell to print the evaluation metrics."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "a4f83f2da03f"
},
"outputs": [],
"source": [
"model_evaluation = job.get_model_evaluation()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "batch_request:mbsdk"
},
"outputs": [],
"source": [
"# Iterate over the pipeline tasks\n",
"for (\n",
" task\n",
") in model_evaluation._backing_pipeline_job._gca_resource.job_detail.task_details:\n",
" # Obtain the artifacts from the evaluation task\n",
" if (\n",
" (\"model-evaluation\" in task.task_name)\n",
" and (\"model-evaluation-import\" not in task.task_name)\n",
" and (\n",
" task.state == aiplatform_v1.types.PipelineTaskDetail.State.SUCCEEDED\n",
" or task.state == aiplatform_v1.types.PipelineTaskDetail.State.SKIPPED\n",
" )\n",
" ):\n",
" evaluation_metrics = task.outputs.get(\"evaluation_metrics\").artifacts[0]\n",
" evaluation_metrics_gcs_uri = evaluation_metrics.uri\n",
"\n",
"print(evaluation_metrics)\n",
"print(evaluation_metrics_gcs_uri)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "16d4bdbd9f4a"
},
"source": [
"## Visualize the metrics\n",
"Visualize the available metrics like auRoc and logLoss using a bar-chart."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "batch_request_wait:mbsdk"
},
"outputs": [],
"source": [
"metrics = []\n",
"values = []\n",
"for i in evaluation_metrics.metadata.items():\n",
" metrics.append(i[0])\n",
" values.append(i[1])\n",
"plt.figure(figsize=(5, 3))\n",
"plt.bar(x=metrics, height=values)\n",
"plt.title(\"Evaluation Metrics\")\n",
"plt.ylabel(\"Value\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cleanup:mbsdk"
},
"source": [
"## Clean up\n",
"\n",
"To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud\n",
"project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.\n",
"\n",
"Otherwise, you can delete the individual resources you created in this tutorial:\n",
"\n",
"- Dataset\n",
"- Model\n",
"- AutoML Training Job\n",
"- Cloud Storage Bucket\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cleanup:mbsdk"
},
"outputs": [],
"source": [
"# If the bucket needs to be deleted too, please set \"delete_bucket\" to True\n",
"delete_bucket = False\n",
"\n",
"# Delete the dataset using the Vertex dataset object\n",
"dataset.delete()\n",
"\n",
"# Delete the model using the Vertex model object\n",
"model.delete()\n",
"\n",
"# Delete the training job\n",
"training_job.delete()\n",
"\n",
"# Delete the evaluation pipeline\n",
"job.delete()\n",
"\n",
"# Delete the Cloud storage bucket\n",
"if delete_bucket:\n",
" ! gsutil rm -r $BUCKET_URI"
]
}
],
"metadata": {
"colab": {
"name": "automl_video_classification_model_evaluation.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}