notebooks/community/gapic/automl/showcase_automl_video_classification_batch.ipynb (1,764 lines of code) (raw):
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "copyright"
},
"outputs": [],
"source": [
"# Copyright 2020 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "title"
},
"source": [
"# Vertex client library: AutoML video classification model for batch prediction\n",
"\n",
"<table align=\"left\">\n",
" <td>\n",
" <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/master/notebooks/community/gapic/automl/showcase_automl_video_classification_batch.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Colab logo\"> Run in Colab\n",
" </a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/master/notebooks/community/gapic/automl/showcase_automl_video_classification_batch.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\">\n",
" View on GitHub\n",
" </a>\n",
" </td>\n",
"</table>\n",
"<br/><br/><br/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "overview:automl"
},
"source": [
"## Overview\n",
"\n",
"\n",
"This tutorial demonstrates how to use the Vertex client library for Python to create video classification models and do batch prediction using Google Cloud's [AutoML](https://cloud.google.com/vertex-ai/docs/start/automl-users)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dataset:hmdb,vcn"
},
"source": [
"### Dataset\n",
"\n",
"The dataset used for this tutorial is the [Human Motion dataset](https://TODO) from [MIT](http://cbcl.mit.edu/publications/ps/Kuehne_etal_iccv11.pdf). The version of the dataset you will use in this tutorial is stored in a public Cloud Storage bucket."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "objective:automl,training,batch_prediction"
},
"source": [
"### Objective\n",
"\n",
"In this tutorial, you create an AutoML video classification model from a Python script, and then do a batch prediction using the Vertex client library. You can alternatively create and deploy models using the `gcloud` command-line tool or online using the Google Cloud Console.\n",
"\n",
"The steps performed include:\n",
"\n",
"- Create a Vertex `Dataset` resource.\n",
"- Train the model.\n",
"- View the model evaluation.\n",
"- Make a batch prediction.\n",
"\n",
"There is one key difference between using batch prediction and using online prediction:\n",
"\n",
"* Prediction Service: Does an on-demand prediction for the entire set of instances (i.e., one or more data items) and returns the results in real-time.\n",
"\n",
"* Batch Prediction Service: Does a queued (batch) prediction for the entire set of instances in the background and stores the results in a Cloud Storage bucket when ready."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "costs"
},
"source": [
"### Costs\n",
"\n",
"This tutorial uses billable components of Google Cloud (GCP):\n",
"\n",
"* Vertex AI\n",
"* Cloud Storage\n",
"\n",
"Learn about [Vertex AI\n",
"pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage\n",
"pricing](https://cloud.google.com/storage/pricing), and use the [Pricing\n",
"Calculator](https://cloud.google.com/products/calculator/)\n",
"to generate a cost estimate based on your projected usage."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "install_aip"
},
"source": [
"## Installation\n",
"\n",
"Install the latest version of Vertex client library."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "install_aip"
},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"\n",
"# Google Cloud Notebook\n",
"if os.path.exists(\"/opt/deeplearning/metadata/env_version\"):\n",
" USER_FLAG = '--user'\n",
"else:\n",
" USER_FLAG = ''\n",
"\n",
"! pip3 install -U google-cloud-aiplatform $USER_FLAG"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "install_storage"
},
"source": [
"Install the latest GA version of *google-cloud-storage* library as well."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "install_storage"
},
"outputs": [],
"source": [
"! pip3 install -U google-cloud-storage $USER_FLAG"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "restart"
},
"source": [
"### Restart the kernel\n",
"\n",
"Once you've installed the Vertex client library and Google *cloud-storage*, you need to restart the notebook kernel so it can find the packages."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "restart"
},
"outputs": [],
"source": [
"if not os.getenv(\"IS_TESTING\"):\n",
" # Automatically restart kernel after installs\n",
" import IPython\n",
" app = IPython.Application.instance()\n",
" app.kernel.do_shutdown(True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "before_you_begin"
},
"source": [
"## Before you begin\n",
"\n",
"### GPU runtime\n",
"\n",
"*Make sure you're running this notebook in a GPU runtime if you have that option. In Colab, select* **Runtime > Change Runtime Type > GPU**\n",
"\n",
"### Set up your Google Cloud project\n",
"\n",
"**The following steps are required, regardless of your notebook environment.**\n",
"\n",
"1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.\n",
"\n",
"2. [Make sure that billing is enabled for your project.](https://cloud.google.com/billing/docs/how-to/modify-project)\n",
"\n",
"3. [Enable the Vertex APIs and Compute Engine APIs.](https://console.cloud.google.com/flows/enableapi?apiid=ml.googleapis.com,compute_component)\n",
"\n",
"4. [The Google Cloud SDK](https://cloud.google.com/sdk) is already installed in Google Cloud Notebook.\n",
"\n",
"5. Enter your project ID in the cell below. Then run the cell to make sure the\n",
"Cloud SDK uses the right project for all the commands in this notebook.\n",
"\n",
"**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$` into these commands."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "set_project_id"
},
"outputs": [],
"source": [
"PROJECT_ID = \"[your-project-id]\" #@param {type:\"string\"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "autoset_project_id"
},
"outputs": [],
"source": [
"if PROJECT_ID == \"\" or PROJECT_ID is None or PROJECT_ID == \"[your-project-id]\":\n",
" # Get your GCP project id from gcloud\n",
" shell_output = !gcloud config list --format 'value(core.project)' 2>/dev/null\n",
" PROJECT_ID = shell_output[0]\n",
" print(\"Project ID:\", PROJECT_ID)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "set_gcloud_project_id"
},
"outputs": [],
"source": [
"! gcloud config set project $PROJECT_ID"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "region"
},
"source": [
"#### Region\n",
"\n",
"You can also change the `REGION` variable, which is used for operations\n",
"throughout the rest of this notebook. Below are regions supported for Vertex. We recommend that you choose the region closest to you.\n",
"\n",
"- Americas: `us-central1`\n",
"- Europe: `europe-west4`\n",
"- Asia Pacific: `asia-east1`\n",
"\n",
"You may not use a multi-regional bucket for training with Vertex. Not all regions provide support for all Vertex services. For the latest support per region, see the [Vertex locations documentation](https://cloud.google.com/vertex-ai/docs/general/locations)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "region"
},
"outputs": [],
"source": [
"REGION = 'us-central1' #@param {type: \"string\"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "timestamp"
},
"source": [
"#### Timestamp\n",
"\n",
"If you are in a live tutorial session, you might be using a shared test account or project. To avoid name collisions between users on resources created, you create a timestamp for each instance session, and append onto the name of resources which will be created in this tutorial."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "timestamp"
},
"outputs": [],
"source": [
"from datetime import datetime\n",
"\n",
"TIMESTAMP = datetime.now().strftime(\"%Y%m%d%H%M%S\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gcp_authenticate"
},
"source": [
"### Authenticate your Google Cloud account\n",
"\n",
"**If you are using Google Cloud Notebook**, your environment is already authenticated. Skip this step.\n",
"\n",
"**If you are using Colab**, run the cell below and follow the instructions when prompted to authenticate your account via oAuth.\n",
"\n",
"**Otherwise**, follow these steps:\n",
"\n",
"In the Cloud Console, go to the [Create service account key](https://console.cloud.google.com/apis/credentials/serviceaccountkey) page.\n",
"\n",
"**Click Create service account**.\n",
"\n",
"In the **Service account name** field, enter a name, and click **Create**.\n",
"\n",
"In the **Grant this service account access to project** section, click the Role drop-down list. Type \"Vertex\" into the filter box, and select **Vertex Administrator**. Type \"Storage Object Admin\" into the filter box, and select **Storage Object Admin**.\n",
"\n",
"Click Create. A JSON file that contains your key downloads to your local environment.\n",
"\n",
"Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gcp_authenticate"
},
"outputs": [],
"source": [
"# If you are running this notebook in Colab, run this cell and follow the\n",
"# instructions to authenticate your GCP account. This provides access to your\n",
"# Cloud Storage bucket and lets you submit training jobs and prediction\n",
"# requests.\n",
"\n",
"# If on Google Cloud Notebook, then don't execute this code\n",
"if not os.path.exists(\"/opt/deeplearning/metadata/env_version\"):\n",
" if \"google.colab\" in sys.modules:\n",
" from google.colab import auth as google_auth\n",
"\n",
" google_auth.authenticate_user()\n",
"\n",
" # If you are running this notebook locally, replace the string below with the\n",
" # path to your service account key and run this cell to authenticate your GCP\n",
" # account.\n",
" elif not os.getenv(\"IS_TESTING\"):\n",
" %env GOOGLE_APPLICATION_CREDENTIALS ''"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bucket:batch_prediction"
},
"source": [
"### Create a Cloud Storage bucket\n",
"\n",
"**The following steps are required, regardless of your notebook environment.**\n",
"\n",
"This tutorial is designed to use training data that is in a public Cloud Storage bucket and a local Cloud Storage bucket for your batch predictions. You may alternatively use your own training data that you have stored in a local Cloud Storage bucket.\n",
"\n",
"Set the name of your Cloud Storage bucket below. Bucket names must be globally unique across all Google Cloud projects, including those outside of your organization."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bucket"
},
"outputs": [],
"source": [
"BUCKET_NAME = \"gs://[your-bucket-name]\" #@param {type:\"string\"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "autoset_bucket"
},
"outputs": [],
"source": [
"if BUCKET_NAME == \"\" or BUCKET_NAME is None or BUCKET_NAME == \"gs://[your-bucket-name]\":\n",
" BUCKET_NAME = \"gs://\" + PROJECT_ID + \"aip-\" + TIMESTAMP"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "create_bucket"
},
"source": [
"**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "create_bucket"
},
"outputs": [],
"source": [
"! gsutil mb -l $REGION $BUCKET_NAME"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "validate_bucket"
},
"source": [
"Finally, validate access to your Cloud Storage bucket by examining its contents:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "validate_bucket"
},
"outputs": [],
"source": [
"! gsutil ls -al $BUCKET_NAME"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "setup_vars"
},
"source": [
"### Set up variables\n",
"\n",
"Next, set up some variables used throughout the tutorial.\n",
"### Import libraries and define constants"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "import_aip:protobuf"
},
"source": [
"#### Import Vertex client library\n",
"\n",
"Import the Vertex client library into our Python environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "import_aip:protobuf"
},
"outputs": [],
"source": [
"import time\n",
"\n",
"from google.cloud.aiplatform import gapic as aip\n",
"from google.protobuf import json_format\n",
"from google.protobuf.json_format import MessageToJson, ParseDict\n",
"from google.protobuf.struct_pb2 import Struct, Value"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aip_constants"
},
"source": [
"#### Vertex constants\n",
"\n",
"Setup up the following constants for Vertex:\n",
"\n",
"- `API_ENDPOINT`: The Vertex API service endpoint for dataset, model, job, pipeline and endpoint services.\n",
"- `PARENT`: The Vertex location root path for dataset, model, job, pipeline and endpoint resources."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "aip_constants"
},
"outputs": [],
"source": [
"# API service endpoint\n",
"API_ENDPOINT = \"{}-aiplatform.googleapis.com\".format(REGION)\n",
"\n",
"# Vertex location root path for your dataset, model and endpoint resources\n",
"PARENT = \"projects/\" + PROJECT_ID + \"/locations/\" + REGION"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "automl_constants"
},
"source": [
"#### AutoML constants\n",
"\n",
"Set constants unique to AutoML datasets and training:\n",
"\n",
"- Dataset Schemas: Tells the `Dataset` resource service which type of dataset it is.\n",
"- Data Labeling (Annotations) Schemas: Tells the `Dataset` resource service how the data is labeled (annotated).\n",
"- Dataset Training Schemas: Tells the `Pipeline` resource service the task (e.g., classification) to train the model for."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "automl_constants:vcn"
},
"outputs": [],
"source": [
"# Video Dataset type\n",
"DATA_SCHEMA = 'gs://google-cloud-aiplatform/schema/dataset/metadata/video_1.0.0.yaml'\n",
"# Video Labeling type\n",
"LABEL_SCHEMA = \"gs://google-cloud-aiplatform/schema/dataset/ioformat/video_classification_io_format_1.0.0.yaml\"\n",
"# Video Training task\n",
"TRAINING_SCHEMA = \"gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_video_classification_1.0.0.yaml\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "accelerators:prediction"
},
"source": [
"#### Hardware Accelerators\n",
"\n",
"Set the hardware accelerators (e.g., GPU), if any, for prediction.\n",
"\n",
"Set the variable `DEPLOY_GPU/DEPLOY_NGPU` to use a container image supporting a GPU and the number of GPUs allocated to the virtual machine (VM) instance. For example, to use a GPU container image with 4 Nvidia Telsa K80 GPUs allocated to each VM, you would specify:\n",
"\n",
" (aip.AcceleratorType.NVIDIA_TESLA_K80, 4)\n",
"\n",
"For GPU, available accelerators include:\n",
" - aip.AcceleratorType.NVIDIA_TESLA_K80\n",
" - aip.AcceleratorType.NVIDIA_TESLA_P100\n",
" - aip.AcceleratorType.NVIDIA_TESLA_P4\n",
" - aip.AcceleratorType.NVIDIA_TESLA_T4\n",
" - aip.AcceleratorType.NVIDIA_TESLA_V100\n",
"\n",
"Otherwise specify `(None, None)` to use a container image to run on a CPU."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "accelerators:prediction"
},
"outputs": [],
"source": [
"if os.getenv(\"IS_TESTING_DEPOLY_GPU\"):\n",
" DEPLOY_GPU, DEPLOY_NGPU = (aip.AcceleratorType.NVIDIA_TESLA_K80, int(os.getenv(\"IS_TESTING_DEPOLY_GPU\")))\n",
"else:\n",
" DEPLOY_GPU, DEPLOY_NGPU = (aip.AcceleratorType.NVIDIA_TESLA_K80, 1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "container:automl"
},
"source": [
"#### Container (Docker) image\n",
"\n",
"For AutoML batch prediction, the container image for the serving binary is pre-determined by the Vertex prediction service. More specifically, the service will pick the appropriate container for the model depending on the hardware accelerator you selected."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "machine:prediction"
},
"source": [
"#### Machine Type\n",
"\n",
"Next, set the machine type to use for prediction.\n",
"\n",
"- Set the variable `DEPLOY_COMPUTE` to configure the compute resources for the VM you will use for prediction.\n",
" - `machine type`\n",
" - `n1-standard`: 3.75GB of memory per vCPU.\n",
" - `n1-highmem`: 6.5GB of memory per vCPU\n",
" - `n1-highcpu`: 0.9 GB of memory per vCPU\n",
" - `vCPUs`: number of \\[2, 4, 8, 16, 32, 64, 96 \\]\n",
"\n",
"*Note: You may also use n2 and e2 machine types for training and deployment, but they do not support GPUs*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "machine:prediction"
},
"outputs": [],
"source": [
"if os.getenv(\"IS_TESTING_DEPLOY_MACHINE\"):\n",
" MACHINE_TYPE = os.getenv(\"IS_TESTING_DEPLOY_MACHINE\")\n",
"else:\n",
" MACHINE_TYPE = 'n1-standard'\n",
"\n",
"VCPU = '4'\n",
"DEPLOY_COMPUTE = MACHINE_TYPE + '-' + VCPU\n",
"print('Deploy machine type', DEPLOY_COMPUTE)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tutorial_start:automl"
},
"source": [
"# Tutorial\n",
"\n",
"Now you are ready to start creating your own AutoML video classification model."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "clients:automl,batch_prediction"
},
"source": [
"## Set up clients\n",
"\n",
"The Vertex client library works as a client/server model. On your side (the Python script) you will create a client that sends requests and receives responses from the Vertex server.\n",
"\n",
"You will use different clients in this tutorial for different steps in the workflow. So set them all up upfront.\n",
"\n",
"- Dataset Service for `Dataset` resources.\n",
"- Model Service for `Model` resources.\n",
"- Pipeline Service for training.\n",
"- Job Service for batch prediction and custom training."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "clients:automl,batch_prediction"
},
"outputs": [],
"source": [
"# client options same for all services\n",
"client_options = {\"api_endpoint\": API_ENDPOINT}\n",
"\n",
"\n",
"def create_dataset_client():\n",
" client = aip.DatasetServiceClient(\n",
" client_options=client_options\n",
" )\n",
" return client\n",
"\n",
"\n",
"def create_model_client():\n",
" client = aip.ModelServiceClient(\n",
" client_options=client_options\n",
" )\n",
" return client\n",
"\n",
"\n",
"def create_pipeline_client():\n",
" client = aip.PipelineServiceClient(\n",
" client_options=client_options\n",
" )\n",
" return client\n",
"\n",
"\n",
"def create_job_client():\n",
" client = aip.JobServiceClient(\n",
" client_options=client_options\n",
" )\n",
" return client\n",
"\n",
"\n",
"clients = {}\n",
"clients['dataset'] = create_dataset_client()\n",
"clients['model'] = create_model_client()\n",
"clients['pipeline'] = create_pipeline_client()\n",
"clients['job'] = create_job_client()\n",
"\n",
"for client in clients.items():\n",
" print(client)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "create_aip_dataset"
},
"source": [
"## Dataset\n",
"\n",
"Now that your clients are ready, your first step in training a model is to create a managed dataset instance, and then upload your labeled data to it.\n",
"\n",
"### Create `Dataset` resource instance\n",
"\n",
"Use the helper function `create_dataset` to create the instance of a `Dataset` resource. This function does the following:\n",
"\n",
"1. Uses the dataset client service.\n",
"2. Creates an Vertex `Dataset` resource (`aip.Dataset`), with the following parameters:\n",
" - `display_name`: The human-readable name you choose to give it.\n",
" - `metadata_schema_uri`: The schema for the dataset type.\n",
"3. Calls the client dataset service method `create_dataset`, with the following parameters:\n",
" - `parent`: The Vertex location root path for your `Database`, `Model` and `Endpoint` resources.\n",
" - `dataset`: The Vertex dataset object instance you created.\n",
"4. The method returns an `operation` object.\n",
"\n",
"An `operation` object is how Vertex handles asynchronous calls for long running operations. While this step usually goes fast, when you first use it in your project, there is a longer delay due to provisioning.\n",
"\n",
"You can use the `operation` object to get status on the operation (e.g., create `Dataset` resource) or to cancel the operation, by invoking an operation method:\n",
"\n",
"| Method | Description |\n",
"| ----------- | ----------- |\n",
"| result() | Waits for the operation to complete and returns a result object in JSON format. |\n",
"| running() | Returns True/False on whether the operation is still running. |\n",
"| done() | Returns True/False on whether the operation is completed. |\n",
"| canceled() | Returns True/False on whether the operation was canceled. |\n",
"| cancel() | Cancels the operation (this may take up to 30 seconds). |"
]
},
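{
"cell_type": "markdown",
"metadata": {
"id": "operation_polling_sketch"
},
"source": [
"The helper function below simply blocks on `operation.result()`. As an aside, the following is a minimal sketch (not part of the original workflow) of how you could poll a long-running operation yourself using the `done()`, `running()` and `cancelled()` methods listed above. The names `wait_for_operation` and `poll_secs` are illustrative, not part of the Vertex client library."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "operation_polling_sketch"
},
"outputs": [],
"source": [
"# Minimal sketch (assumption): manually poll a long-running operation instead of\n",
"# blocking on result(). `operation` is any operation object returned by a client\n",
"# call such as clients['dataset'].create_dataset(...).\n",
"import time\n",
"\n",
"\n",
"def wait_for_operation(operation, poll_secs=10):\n",
"    # Loop until the service reports the operation as done.\n",
"    while not operation.done():\n",
"        print(\"Still running:\", operation.running())\n",
"        time.sleep(poll_secs)\n",
"    if operation.cancelled():\n",
"        print(\"Operation was cancelled\")\n",
"        return None\n",
"    # The operation has completed, so result() returns immediately.\n",
"    return operation.result()"
]
},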
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "create_aip_dataset"
},
"outputs": [],
"source": [
"TIMEOUT = 90\n",
"\n",
"def create_dataset(name, schema, labels=None, timeout=TIMEOUT):\n",
" start_time = time.time()\n",
" try:\n",
" dataset = aip.Dataset(display_name=name, metadata_schema_uri=schema, labels=labels)\n",
"\n",
" operation = clients['dataset'].create_dataset(parent=PARENT, dataset=dataset)\n",
" print(\"Long running operation:\", operation.operation.name)\n",
" result = operation.result(timeout=TIMEOUT)\n",
" print(\"time:\", time.time() - start_time)\n",
" print(\"response\")\n",
" print(\" name:\", result.name)\n",
" print(\" display_name:\", result.display_name)\n",
" print(\" metadata_schema_uri:\", result.metadata_schema_uri)\n",
" print(\" metadata:\", dict(result.metadata))\n",
" print(\" create_time:\", result.create_time)\n",
" print(\" update_time:\", result.update_time)\n",
" print(\" etag:\", result.etag)\n",
" print(\" labels:\", dict(result.labels))\n",
" return result\n",
" except Exception as e:\n",
" print(\"exception:\", e)\n",
" return None\n",
"\n",
"\n",
"result = create_dataset(\"hmdb,tst-\" + TIMESTAMP, DATA_SCHEMA)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dataset_id:result"
},
"source": [
"Now save the unique dataset identifier for the `Dataset` resource instance you created."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dataset_id:result"
},
"outputs": [],
"source": [
"# The full unique ID for the dataset\n",
"dataset_id = result.name\n",
"# The short numeric ID for the dataset\n",
"dataset_short_id = dataset_id.split('/')[-1]\n",
"\n",
"print(dataset_id)"
]
},
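{
"cell_type": "markdown",
"metadata": {
"id": "get_dataset_sketch"
},
"source": [
"As a quick check (a minimal sketch, not required for the rest of the tutorial), you can retrieve the `Dataset` resource again by its identifier using the dataset client's `get_dataset` method."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "get_dataset_sketch"
},
"outputs": [],
"source": [
"# Minimal sketch (assumption): fetch the Dataset resource back by its fully\n",
"# qualified identifier and print a few of its fields.\n",
"dataset = clients['dataset'].get_dataset(name=dataset_id)\n",
"print(\"display_name:\", dataset.display_name)\n",
"print(\"metadata_schema_uri:\", dataset.metadata_schema_uri)"
]
},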
{
"cell_type": "markdown",
"metadata": {
"id": "data_preparation:video,u_dataset"
},
"source": [
"### Data preparation\n",
"\n",
"The Vertex `Dataset` resource for video has some requirements for your data.\n",
"\n",
"- Videos must be stored in a Cloud Storage bucket.\n",
"- Each video file must be in a video format (MPG, AVI, ...).\n",
"- There must be an index file stored in your Cloud Storage bucket that contains the path and label for each video.\n",
"- The index file must be either CSV or JSONL."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "data_import_format:vcn,u_dataset,csv"
},
"source": [
"#### CSV\n",
"\n",
"For video classification, the CSV index file has a few requirements:\n",
"\n",
"- No heading.\n",
"- First column is the Cloud Storage path to the video.\n",
"- Second column is the label."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "import_file:u_dataset,csv"
},
"source": [
"#### Location of Cloud Storage training data.\n",
"\n",
"Now set the variable `IMPORT_FILE` to the location of the CSV index file in Cloud Storage."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "import_file:hmdb,csv,vcn"
},
"outputs": [],
"source": [
"IMPORT_FILE = 'gs://automl-video-demo-data/hmdb_split1_5classes_train_inf.csv'"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "quick_peek:csv"
},
"source": [
"#### Quick peek at your data\n",
"\n",
"You will use a version of the MIT Human Motion dataset that is stored in a public Cloud Storage bucket, using a CSV index file.\n",
"\n",
"Start by doing a quick peek at the data. You count the number of examples by counting the number of rows in the CSV index file (`wc -l`) and then peek at the first few rows."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "quick_peek:csv"
},
"outputs": [],
"source": [
"if 'IMPORT_FILES' in globals():\n",
" FILE = IMPORT_FILES[0]\n",
"else:\n",
" FILE = IMPORT_FILE\n",
"\n",
"count = ! gsutil cat $FILE | wc -l\n",
"print(\"Number of Examples\", int(count[0]))\n",
"\n",
"print(\"First 10 rows\")\n",
"! gsutil cat $FILE | head"
]
},
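{
"cell_type": "markdown",
"metadata": {
"id": "spot_check_sketch"
},
"source": [
"As an optional sanity check (a minimal sketch, not part of the original workflow), you could also verify that the video paths referenced in the first few rows of the index file exist in Cloud Storage. This assumes each row contains one `gs://` path among its comma-separated fields; `spot_check_index` is an illustrative name."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "spot_check_sketch"
},
"outputs": [],
"source": [
"# Minimal sketch (assumption): confirm the first few videos referenced by the\n",
"# index file exist in Cloud Storage. tf.io.gfile is used because it understands\n",
"# gs:// paths; TensorFlow is also used later in this tutorial.\n",
"import tensorflow as tf\n",
"\n",
"\n",
"def spot_check_index(index_uri, num_rows=3):\n",
"    with tf.io.gfile.GFile(index_uri, 'r') as f:\n",
"        for _ in range(num_rows):\n",
"            row = f.readline().strip()\n",
"            if not row:\n",
"                break\n",
"            # Pick the field that looks like a Cloud Storage path.\n",
"            video_uri = next((field for field in row.split(',') if field.startswith('gs://')), None)\n",
"            if video_uri:\n",
"                print(video_uri, 'exists:', tf.io.gfile.exists(video_uri))\n",
"\n",
"\n",
"spot_check_index(FILE)"
]
},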
{
"cell_type": "markdown",
"metadata": {
"id": "import_data"
},
"source": [
"### Import data\n",
"\n",
"Now, import the data into your Vertex Dataset resource. Use this helper function `import_data` to import the data. The function does the following:\n",
"\n",
"- Uses the `Dataset` client.\n",
"- Calls the client method `import_data`, with the following parameters:\n",
" - `name`: The human readable name you give to the `Dataset` resource (e.g., hmdb,tst).\n",
" - `import_configs`: The import configuration.\n",
"\n",
"- `import_configs`: A Python list containing a dictionary, with the key/value entries:\n",
" - `gcs_sources`: A list of URIs to the paths of the one or more index files.\n",
" - `import_schema_uri`: The schema identifying the labeling type.\n",
"\n",
"The `import_data()` method returns a long running `operation` object. This will take a few minutes to complete. If you are in a live tutorial, this would be a good time to ask questions, or take a personal break."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "import_data"
},
"outputs": [],
"source": [
"def import_data(dataset, gcs_sources, schema):\n",
" config = [{\n",
" 'gcs_source': {'uris': gcs_sources},\n",
" 'import_schema_uri': schema\n",
" }]\n",
" print(\"dataset:\", dataset_id)\n",
" start_time = time.time()\n",
" try:\n",
" operation = clients['dataset'].import_data(name=dataset_id, import_configs=config)\n",
" print(\"Long running operation:\", operation.operation.name)\n",
"\n",
" result = operation.result()\n",
" print(\"result:\", result)\n",
" print(\"time:\", int(time.time() - start_time), \"secs\")\n",
" print(\"error:\", operation.exception())\n",
" print(\"meta :\", operation.metadata)\n",
" print(\"after: running:\", operation.running(), \"done:\", operation.done(), \"cancelled:\", operation.cancelled())\n",
"\n",
" return operation\n",
" except Exception as e:\n",
" print(\"exception:\", e)\n",
" return None\n",
"\n",
"\n",
"import_data(dataset_id, [IMPORT_FILE], LABEL_SCHEMA)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "train_automl_model"
},
"source": [
"## Train the model\n",
"\n",
"Now train an AutoML video classification model using your Vertex `Dataset` resource. To train the model, do the following steps:\n",
"\n",
"1. Create an Vertex training pipeline for the `Dataset` resource.\n",
"2. Execute the pipeline to start the training."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "create_pipeline:automl,video"
},
"source": [
"### Create a training pipeline\n",
"\n",
"You may ask, what do we use a pipeline for? You typically use pipelines when the job (such as training) has multiple steps, generally in sequential order: do step A, do step B, etc. By putting the steps into a pipeline, we gain the benefits of:\n",
"\n",
"1. Being reusable for subsequent training jobs.\n",
"2. Can be containerized and ran as a batch job.\n",
"3. Can be distributed.\n",
"4. All the steps are associated with the same pipeline job for tracking progress.\n",
"\n",
"Use this helper function `create_pipeline`, which takes the following parameters:\n",
"\n",
"- `pipeline_name`: A human readable name for the pipeline job.\n",
"- `model_name`: A human readable name for the model.\n",
"- `dataset`: The Vertex fully qualified dataset identifier.\n",
"- `schema`: The dataset labeling (annotation) training schema.\n",
"- `task`: A dictionary describing the requirements for the training job.\n",
"\n",
"The helper function calls the `Pipeline` client service'smethod `create_pipeline`, which takes the following parameters:\n",
"\n",
"- `parent`: The Vertex location root path for your `Dataset`, `Model` and `Endpoint` resources.\n",
"- `training_pipeline`: the full specification for the pipeline training job.\n",
"\n",
"Let's look now deeper into the *minimal* requirements for constructing a `training_pipeline` specification:\n",
"\n",
"- `display_name`: A human readable name for the pipeline job.\n",
"- `training_task_definition`: The dataset labeling (annotation) training schema.\n",
"- `training_task_inputs`: A dictionary describing the requirements for the training job.\n",
"- `model_to_upload`: A human readable name for the model.\n",
"- `input_data_config`: The dataset specification.\n",
" - `dataset_id`: The Vertex dataset identifier only (non-fully qualified) -- this is the last part of the fully-qualified identifier.\n",
" - `fraction_split`: If specified, the percentages of the dataset to use for training, test and validation. Otherwise, the percentages are automatically selected by AutoML.\n",
" - Note for video, validation split is not supported -- only training and test."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "create_pipeline:automl,video"
},
"outputs": [],
"source": [
" def create_pipeline(pipeline_name, model_name, dataset, schema, task):\n",
"\n",
" dataset_id = dataset.split('/')[-1]\n",
"\n",
" input_config = {'dataset_id': dataset_id,\n",
" 'fraction_split': {\n",
" 'training_fraction': 0.8,\n",
" 'test_fraction': 0.2\n",
" }}\n",
"\n",
" training_pipeline = {\n",
" \"display_name\": pipeline_name,\n",
" \"training_task_definition\": schema,\n",
" \"training_task_inputs\": task,\n",
" \"input_data_config\": input_config,\n",
" \"model_to_upload\": {\"display_name\": model_name},\n",
" }\n",
"\n",
" try:\n",
" pipeline = clients['pipeline'].create_training_pipeline(parent=PARENT, training_pipeline=training_pipeline)\n",
" print(pipeline)\n",
" except Exception as e:\n",
" print(\"exception:\", e)\n",
" return None\n",
" return pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "task_requirements:automl,vcn"
},
"source": [
"### Construct the task requirements\n",
"\n",
"Next, construct the task requirements. Unlike other parameters which take a Python (JSON-like) dictionary, the `task` field takes a Google protobuf Struct, which is very similar to a Python dictionary. Use the `json_format.ParseDict` method for the conversion.\n",
"\n",
"For video classification, there are no required minimal fields to specify.\n",
"\n",
"Finally, you create the pipeline by calling the helper function `create_pipeline`, which returns an instance of a training pipeline object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "task_requirements:automl,vcn"
},
"outputs": [],
"source": [
"PIPE_NAME = \"hmdb,tst_pipe-\" + TIMESTAMP\n",
"MODEL_NAME = \"hmdb,tst_model-\" + TIMESTAMP\n",
"\n",
"task = json_format.ParseDict({}, Value())\n",
"\n",
"response = create_pipeline(PIPE_NAME, MODEL_NAME, dataset_id, TRAINING_SCHEMA, task)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pipeline_id:response"
},
"source": [
"Now save the unique identifier of the training pipeline you created."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "pipeline_id:response"
},
"outputs": [],
"source": [
"# The full unique ID for the pipeline\n",
"pipeline_id = response.name\n",
"# The short numeric ID for the pipeline\n",
"pipeline_short_id = pipeline_id.split('/')[-1]\n",
"\n",
"print(pipeline_id)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "get_training_pipeline"
},
"source": [
"### Get information on a training pipeline\n",
"\n",
"Now get pipeline information for just this training pipeline instance. The helper function gets the job information for just this job by calling the the job client service's `get_training_pipeline` method, with the following parameter:\n",
"\n",
"- `name`: The Vertex fully qualified pipeline identifier.\n",
"\n",
"When the model is done training, the pipeline state will be `PIPELINE_STATE_SUCCEEDED`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "get_training_pipeline"
},
"outputs": [],
"source": [
"def get_training_pipeline(name, silent=False):\n",
" response = clients['pipeline'].get_training_pipeline(name=name)\n",
" if silent:\n",
" return response\n",
"\n",
" print(\"pipeline\")\n",
" print(\" name:\", response.name)\n",
" print(\" display_name:\", response.display_name)\n",
" print(\" state:\", response.state)\n",
" print(\" training_task_definition:\", response.training_task_definition)\n",
" print(\" training_task_inputs:\", dict(response.training_task_inputs))\n",
" print(\" create_time:\", response.create_time)\n",
" print(\" start_time:\", response.start_time)\n",
" print(\" end_time:\", response.end_time)\n",
" print(\" update_time:\", response.update_time)\n",
" print(\" labels:\", dict(response.labels))\n",
" return response\n",
"\n",
"\n",
"response = get_training_pipeline(pipeline_id)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wait_training_complete"
},
"source": [
"# Deployment\n",
"\n",
"Training the above model may take upwards of 20 minutes time.\n",
"\n",
"Once your model is done training, you can calculate the actual time it took to train the model by subtracting `end_time` from `start_time`. For your model, you will need to know the fully qualified Vertex Model resource identifier, which the pipeline service assigned to it. You can get this from the returned pipeline instance as the field `model_to_deploy.name`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "wait_training_complete"
},
"outputs": [],
"source": [
"while True:\n",
" response = get_training_pipeline(pipeline_id, True)\n",
" if response.state != aip.PipelineState.PIPELINE_STATE_SUCCEEDED:\n",
" print(\"Training job has not completed:\", response.state)\n",
" model_to_deploy_id = None\n",
" if response.state == aip.PipelineState.PIPELINE_STATE_FAILED:\n",
" raise Exception(\"Training Job Failed\")\n",
" else:\n",
" model_to_deploy = response.model_to_upload\n",
" model_to_deploy_id = model_to_deploy.name\n",
" print(\"Training Time:\", response.end_time - response.start_time)\n",
" break\n",
" time.sleep(60)\n",
"\n",
"print(\"model to deploy:\", model_to_deploy_id)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "model_information"
},
"source": [
"## Model information\n",
"\n",
"Now that your model is trained, you can get some information on your model."
]
},
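{
"cell_type": "markdown",
"metadata": {
"id": "get_model_sketch"
},
"source": [
"For example (a minimal sketch, assuming `model_to_deploy_id` was set by the training loop above), you can fetch the trained `Model` resource with the model client's `get_model` method and print a few of its fields."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "get_model_sketch"
},
"outputs": [],
"source": [
"# Minimal sketch (assumption): retrieve the trained Model resource and print\n",
"# a few basic fields.\n",
"model = clients['model'].get_model(name=model_to_deploy_id)\n",
"print(\"name:\", model.name)\n",
"print(\"display_name:\", model.display_name)\n",
"print(\"create_time:\", model.create_time)"
]
},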
{
"cell_type": "markdown",
"metadata": {
"id": "evaluate_the_model:automl"
},
"source": [
"## Evaluate the Model resource\n",
"\n",
"Now find out how good the model service believes your model is. As part of training, some portion of the dataset was set aside as the test (holdout) data, which is used by the pipeline service to evaluate the model."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "list_model_evaluations:automl,vcn"
},
"source": [
"### List evaluations for all slices\n",
"\n",
"Use this helper function `list_model_evaluations`, which takes the following parameter:\n",
"\n",
"- `name`: The Vertex fully qualified model identifier for the `Model` resource.\n",
"\n",
"This helper function uses the model client service's `list_model_evaluations` method, which takes the same parameter. The response object from the call is a list, where each element is an evaluation metric.\n",
"\n",
"For each evaluation (you probably only have one) we then print all the key names for each metric in the evaluation, and for a small set (`auPrc`) you will print the result."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "list_model_evaluations:automl,vcn"
},
"outputs": [],
"source": [
"def list_model_evaluations(name):\n",
" response = clients['model'].list_model_evaluations(parent=name)\n",
" for evaluation in response:\n",
" print(\"model_evaluation\")\n",
" print(\" name:\", evaluation.name)\n",
" print(\" metrics_schema_uri:\", evaluation.metrics_schema_uri)\n",
" metrics = json_format.MessageToDict(evaluation._pb.metrics)\n",
" for metric in metrics.keys():\n",
" print(metric)\n",
" print('auPrc', metrics['auPrc'])\n",
"\n",
"\n",
" return evaluation.name\n",
"\n",
"\n",
"last_evaluation = list_model_evaluations(model_to_deploy_id)"
]
},
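{
"cell_type": "markdown",
"metadata": {
"id": "evaluation_slices_sketch"
},
"source": [
"If you also want the per-slice evaluations mentioned in the heading above, a minimal sketch (assuming `last_evaluation` holds the evaluation name returned above) is to call the model client's `list_model_evaluation_slices` method and print each slice's name and metrics schema."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "evaluation_slices_sketch"
},
"outputs": [],
"source": [
"# Minimal sketch (assumption): list the evaluation slices (e.g., per-label)\n",
"# under the evaluation returned above.\n",
"slices_response = clients['model'].list_model_evaluation_slices(parent=last_evaluation)\n",
"for evaluation_slice in slices_response:\n",
"    print(\"slice name:\", evaluation_slice.name)\n",
"    print(\"  metrics_schema_uri:\", evaluation_slice.metrics_schema_uri)"
]
},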
{
"cell_type": "markdown",
"metadata": {
"id": "deploy:batch_prediction"
},
"source": [
"## Model deployment for batch prediction\n",
"\n",
"Now deploy the trained Vertex `Model` resource you created for batch prediction. This differs from deploying a `Model` resource for on-demand prediction.\n",
"\n",
"For online prediction, you:\n",
"\n",
"1. Create an `Endpoint` resource for deploying the `Model` resource to.\n",
"\n",
"2. Deploy the `Model` resource to the `Endpoint` resource.\n",
"\n",
"3. Make online prediction requests to the `Endpoint` resource.\n",
"\n",
"For batch-prediction, you:\n",
"\n",
"1. Create a batch prediction job.\n",
"\n",
"2. The job service will provision resources for the batch prediction request.\n",
"\n",
"3. The results of the batch prediction request are returned to the caller.\n",
"\n",
"4. The job service will unprovision the resoures for the batch prediction request."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "make_prediction"
},
"source": [
"## Make a batch prediction request\n",
"\n",
"Now do a batch prediction to your deployed model."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "get_test_items:batch_prediction"
},
"source": [
"### Get test item(s)\n",
"\n",
"Now do a batch prediction to your Vertex model. You will use arbitrary examples out of the dataset as a test items. Don't be concerned that the examples were likely used in training the model -- we just want to demonstrate how to make a prediction."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "get_test_items:automl,vcn,csv"
},
"outputs": [],
"source": [
"test_items = ! gsutil cat $IMPORT_FILE | head -n2\n",
"\n",
"if len(test_items[0]) == 5:\n",
" _, test_item_1, test_label_1, _, _ = str(test_items[0]).split(',')\n",
" _, test_item_2, test_label_2, _, _ = str(test_items[1]).split(',')\n",
"else:\n",
" test_item_1, test_label_1, _, _ = str(test_items[0]).split(',')\n",
" test_item_2, test_label_2, _, _ = str(test_items[1]).split(',')\n",
"\n",
"\n",
"print(test_item_1, test_label_1)\n",
"print(test_item_2, test_label_2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "make_batch_file:automl,video"
},
"source": [
"### Make a batch input file\n",
"\n",
"Now make a batch input file, which you store in your local Cloud Storage bucket. The batch input file can be either CSV or JSONL. You will use JSONL in this tutorial. For JSONL file, you make one dictionary entry per line for each video. The dictionary contains the key/value pairs:\n",
"\n",
"- `content`: The Cloud Storage path to the video.\n",
"- `mimeType`: The content type. In our example, it is an `avi` file.\n",
"- `timeSegmentStart`: The start timestamp in the video to do prediction on. *Note*, the timestamp must be specified as a string and followed by s (second), m (minute) or h (hour).\n",
"- `timeSegmentEnd`: The end timestamp in the video to do prediction on."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "make_batch_file:automl,video"
},
"outputs": [],
"source": [
"import json\n",
"\n",
"import tensorflow as tf\n",
"\n",
"gcs_input_uri = BUCKET_NAME + '/test.jsonl'\n",
"with tf.io.gfile.GFile(gcs_input_uri, 'w') as f:\n",
" data = { \"content\": test_item_1, \"mimeType\": \"video/avi\", \"timeSegmentStart\": \"0.0s\", 'timeSegmentEnd': '5.0s' }\n",
" f.write(json.dumps(data) + '\\n')\n",
" data = { \"content\": test_item_2, \"mimeType\": \"video/avi\", \"timeSegmentStart\": \"0.0s\", 'timeSegmentEnd': '5.0s' }\n",
" f.write(json.dumps(data) + '\\n')\n",
"\n",
"print(gcs_input_uri)\n",
"! gsutil cat $gcs_input_uri"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "instance_scaling"
},
"source": [
"### Compute instance scaling\n",
"\n",
"You have several choices on scaling the compute instances for handling your batch prediction requests:\n",
"\n",
"- Single Instance: The batch prediction requests are processed on a single compute instance.\n",
" - Set the minimum (`MIN_NODES`) and maximum (`MAX_NODES`) number of compute instances to one.\n",
"\n",
"- Manual Scaling: The batch prediction requests are split across a fixed number of compute instances that you manually specified.\n",
" - Set the minimum (`MIN_NODES`) and maximum (`MAX_NODES`) number of compute instances to the same number of nodes. When a model is first deployed to the instance, the fixed number of compute instances are provisioned and batch prediction requests are evenly distributed across them.\n",
"\n",
"- Auto Scaling: The batch prediction requests are split across a scaleable number of compute instances.\n",
" - Set the minimum (`MIN_NODES`) number of compute instances to provision when a model is first deployed and to de-provision, and set the maximum (`MAX_NODES) number of compute instances to provision, depending on load conditions.\n",
"\n",
"The minimum number of compute instances corresponds to the field `min_replica_count` and the maximum number of compute instances corresponds to the field `max_replica_count`, in your subsequent deployment request."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "instance_scaling"
},
"outputs": [],
"source": [
"MIN_NODES = 1\n",
"MAX_NODES = 1"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "make_batch_request:automl,vcn"
},
"source": [
"### Make batch prediction request\n",
"\n",
"Now that your batch of two test items is ready, let's do the batch request. Use this helper function `create_batch_prediction_job`, with the following parameters:\n",
"\n",
"- `display_name`: The human readable name for the prediction job.\n",
"- `model_name`: The Vertex fully qualified identifier for the `Model` resource.\n",
"- `gcs_source_uri`: The Cloud Storage path to the input file -- which you created above.\n",
"- `gcs_destination_output_uri_prefix`: The Cloud Storage path that the service will write the predictions to.\n",
"- `parameters`: Additional filtering parameters for serving prediction results.\n",
"\n",
"The helper function calls the job client service's `create_batch_prediction_job` metho, with the following parameters:\n",
"\n",
"- `parent`: The Vertex location root path for Dataset, Model and Pipeline resources.\n",
"- `batch_prediction_job`: The specification for the batch prediction job.\n",
"\n",
"Let's now dive into the specification for the `batch_prediction_job`:\n",
"\n",
"- `display_name`: The human readable name for the prediction batch job.\n",
"- `model`: The Vertex fully qualified identifier for the `Model` resource.\n",
"- `dedicated_resources`: The compute resources to provision for the batch prediction job.\n",
" - `machine_spec`: The compute instance to provision. Use the variable you set earlier `DEPLOY_GPU != None` to use a GPU; otherwise only a CPU is allocated.\n",
" - `starting_replica_count`: The number of compute instances to initially provision, which you set earlier as the variable `MIN_NODES`.\n",
" - `max_replica_count`: The maximum number of compute instances to scale to, which you set earlier as the variable `MAX_NODES`.\n",
"- `model_parameters`: Additional filtering parameters for serving prediction results.\n",
" - `confidenceThreshold`: The minimum confidence threshold on doing a prediction.\n",
" - `maxPredictions`: The maximum number of predictions to return per classification, sorted by confidence.\n",
" - `oneSecIntervalClassification`: If `True`, predictions are made on one second intervals.\n",
" - `shotClassification`: If `True`, predictions are made on each camera shot boundary.\n",
" - `segmentClassification`: If `True`, predictions are made on each time segment; otherwise prediction is made for the entire time segment.\n",
"- `input_config`: The input source and format type for the instances to predict.\n",
" - `instances_format`: The format of the batch prediction request file: `csv` or `jsonl`.\n",
" - `gcs_source`: A list of one or more Cloud Storage paths to your batch prediction requests.\n",
"- `output_config`: The output destination and format for the predictions.\n",
" - `prediction_format`: The format of the batch prediction response file: `jsonl` only.\n",
" - `gcs_destination`: The output destination for the predictions.\n",
"\n",
"You might ask, how does confidence_threshold affect the model accuracy? The threshold won't change the accuracy. What it changes is recall and precision.\n",
"\n",
" - Precision: The higher the precision the more likely what is predicted is the correct prediction, but return fewer predictions. Increasing the confidence threshold increases precision.\n",
" - Recall: The higher the recall the more likely a correct prediction is returned in the result, but return more prediction with incorrect prediction. Decreasing the confidence threshold increases recall.\n",
"\n",
"In this example, you will predict for precision. You set the confidence threshold to 0.5 and the maximum number of predictions for an action to two. Since, all the confidence values across the classes must add up to one, there are only two possible outcomes:\n",
"\n",
" 1. There is a tie, both 0.5, and returns two predictions.\n",
" 2. One value is above 0.5 and the rest are below 0.5, and returns one prediction.\n",
"\n",
"This call is an asychronous operation. You will print from the response object a few select fields, including:\n",
"\n",
"- `name`: The Vertex fully qualified identifier assigned to the batch prediction job.\n",
"- `display_name`: The human readable name for the prediction batch job.\n",
"- `model`: The Vertex fully qualified identifier for the Model resource.\n",
"- `generate_explanations`: Whether True/False explanations were provided with the predictions (explainability).\n",
"- `state`: The state of the prediction job (pending, running, etc).\n",
"\n",
"Since this call will take a few moments to execute, you will likely get `JobState.JOB_STATE_PENDING` for `state`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "make_batch_request:automl,vcn"
},
"outputs": [],
"source": [
"BATCH_MODEL = \"hmdb,tst_batch-\" + TIMESTAMP\n",
"\n",
"\n",
"def create_batch_prediction_job(display_name, model_name, gcs_source_uri, gcs_destination_output_uri_prefix, parameters=None):\n",
"\n",
" if DEPLOY_GPU:\n",
" machine_spec = {\n",
" \"machine_type\": DEPLOY_COMPUTE,\n",
" \"accelerator_type\": DEPLOY_GPU,\n",
" \"accelerator_count\": DEPLOY_NGPU,\n",
" }\n",
" else:\n",
" machine_spec = {\n",
" \"machine_type\": DEPLOY_COMPUTE,\n",
" \"accelerator_count\": 0,\n",
" }\n",
"\n",
" batch_prediction_job = {\n",
" \"display_name\": display_name,\n",
" # Format: 'projects/{project}/locations/{location}/models/{model_id}'\n",
" \"model\": model_name,\n",
" \"model_parameters\": json_format.ParseDict(parameters, Value()),\n",
" \"input_config\": {\n",
" \"instances_format\": IN_FORMAT,\n",
" \"gcs_source\": {\"uris\": [gcs_source_uri]},\n",
" },\n",
" \"output_config\": {\n",
" \"predictions_format\": OUT_FORMAT,\n",
" \"gcs_destination\": {\"output_uri_prefix\": gcs_destination_output_uri_prefix},\n",
" },\n",
" \"dedicated_resources\": {\n",
" \"machine_spec\": machine_spec,\n",
" \"starting_replica_count\": MIN_NODES,\n",
" \"max_replica_count\": MAX_NODES\n",
" }\n",
"\n",
" }\n",
" response = clients['job'].create_batch_prediction_job(\n",
" parent=PARENT, batch_prediction_job=batch_prediction_job\n",
" )\n",
" print(\"response\")\n",
" print(\" name:\", response.name)\n",
" print(\" display_name:\", response.display_name)\n",
" print(\" model:\", response.model)\n",
" try:\n",
" print(\" generate_explanation:\", response.generate_explanation)\n",
" except:\n",
" pass\n",
" print(\" state:\", response.state)\n",
" print(\" create_time:\", response.create_time)\n",
" print(\" start_time:\", response.start_time)\n",
" print(\" end_time:\", response.end_time)\n",
" print(\" update_time:\", response.update_time)\n",
" print(\" labels:\", response.labels)\n",
" return response\n",
"\n",
"\n",
"IN_FORMAT = 'jsonl'\n",
"OUT_FORMAT = 'jsonl' # [jsonl]\n",
"\n",
"response = create_batch_prediction_job(BATCH_MODEL, model_to_deploy_id, gcs_input_uri, BUCKET_NAME, None)"
]
},
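{
"cell_type": "markdown",
"metadata": {
"id": "alt_parameters_sketch"
},
"source": [
"If you instead wanted shot-level or one-second-interval predictions, you could pass a different `parameters` dictionary to the same helper function. The following is an illustrative sketch using the parameter names described above; the values are examples only, and the call is left commented out so it does not launch a second batch job."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "alt_parameters_sketch"
},
"outputs": [],
"source": [
"# Illustrative sketch (assumption): alternative model_parameters for the same\n",
"# helper function, using the parameter names described above.\n",
"alt_parameters = {\n",
"    \"confidenceThreshold\": 0.5,\n",
"    \"maxPredictions\": 2,\n",
"    \"shotClassification\": True,\n",
"    \"oneSecIntervalClassification\": False,\n",
"}\n",
"\n",
"# response = create_batch_prediction_job(BATCH_MODEL, model_to_deploy_id, gcs_input_uri, BUCKET_NAME, alt_parameters)"
]
},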
{
"cell_type": "markdown",
"metadata": {
"id": "batch_job_id:response"
},
"source": [
"Now get the unique identifier for the batch prediction job you created."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "batch_job_id:response"
},
"outputs": [],
"source": [
"# The full unique ID for the batch job\n",
"batch_job_id = response.name\n",
"# The short numeric ID for the batch job\n",
"batch_job_short_id = batch_job_id.split('/')[-1]\n",
"\n",
"print(batch_job_id)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "get_batch_prediction_job"
},
"source": [
"### Get information on a batch prediction job\n",
"\n",
"Use this helper function `get_batch_prediction_job`, with the following paramter:\n",
"\n",
"- `job_name`: The Vertex fully qualified identifier for the batch prediction job.\n",
"\n",
"The helper function calls the job client service's `get_batch_prediction_job` method, with the following paramter:\n",
"\n",
"- `name`: The Vertex fully qualified identifier for the batch prediction job. In this tutorial, you will pass it the Vertex fully qualified identifier for your batch prediction job -- `batch_job_id`\n",
"\n",
"The helper function will return the Cloud Storage path to where the predictions are stored -- `gcs_destination`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "get_batch_prediction_job"
},
"outputs": [],
"source": [
"def get_batch_prediction_job(job_name, silent=False):\n",
" response = clients['job'].get_batch_prediction_job(name=job_name)\n",
" if silent:\n",
" return response.output_config.gcs_destination.output_uri_prefix, response.state\n",
"\n",
" print(\"response\")\n",
" print(\" name:\", response.name)\n",
" print(\" display_name:\", response.display_name)\n",
" print(\" model:\", response.model)\n",
" try: # not all data types support explanations\n",
" print(\" generate_explanation:\", response.generate_explanation)\n",
" except:\n",
" pass\n",
" print(\" state:\", response.state)\n",
" print(\" error:\", response.error)\n",
" gcs_destination = response.output_config.gcs_destination\n",
" print(\" gcs_destination\")\n",
" print(\" output_uri_prefix:\", gcs_destination.output_uri_prefix)\n",
" return gcs_destination.output_uri_prefix, response.state\n",
"\n",
"\n",
"predictions, state = get_batch_prediction_job(batch_job_id)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "get_the_predictions:automl,vcn"
},
"source": [
"### Get the predictions\n",
"\n",
"When the batch prediction is done processing, the job state will be `JOB_STATE_SUCCEEDED`.\n",
"\n",
"Finally you view the predictions stored at the Cloud Storage path you set as output. The predictions will be in a JSONL format, which you indicated at the time you made the batch prediction job, under a subfolder starting with the name `prediction`, and under that folder will be a file called `predictions*.jsonl`.\n",
"\n",
"Now display (cat) the contents. You will see multiple JSON objects, one for each prediction.\n",
"\n",
"For each prediction:\n",
"\n",
"- `content`: The video that was input for the prediction request.\n",
"- `displayName`: The prediction action.\n",
"- `confidence`: The confidence in the prediction between 0 and 1.\n",
"- `timeSegmentStart/timeSegmentEnd`: The time offset of the start and end of the predicted action."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "get_the_predictions:automl,video"
},
"outputs": [],
"source": [
"def get_latest_predictions(gcs_out_dir):\n",
" ''' Get the latest prediction subfolder using the timestamp in the subfolder name'''\n",
" folders = !gsutil ls $gcs_out_dir\n",
" latest = \"\"\n",
" for folder in folders:\n",
" subfolder = folder.split('/')[-2]\n",
" if subfolder.startswith('prediction-'):\n",
" if subfolder > latest:\n",
" latest = folder[:-1]\n",
" return latest\n",
"\n",
"\n",
"while True:\n",
" predictions, state = get_batch_prediction_job(batch_job_id, True)\n",
" if state != aip.JobState.JOB_STATE_SUCCEEDED:\n",
" print(\"The job has not completed:\", state)\n",
" if state == aip.JobState.JOB_STATE_FAILED:\n",
" raise Exception(\"Batch Job Failed\")\n",
" else:\n",
" folder = get_latest_predictions(predictions)\n",
" ! gsutil ls $folder/prediction*.jsonl\n",
"\n",
" ! gsutil cat $folder/prediction*.jsonl\n",
" break\n",
" time.sleep(60)"
]
},
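{
"cell_type": "markdown",
"metadata": {
"id": "parse_predictions_sketch"
},
"source": [
"As an optional final step (a minimal sketch, assuming the loop above succeeded and `folder` points at the latest prediction subfolder), you could load the prediction file(s) into Python rather than just displaying them with `gsutil cat`. The keys in each JSON object follow the output fields described above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "parse_predictions_sketch"
},
"outputs": [],
"source": [
"# Minimal sketch (assumption): parse the prediction JSONL file(s) into Python\n",
"# dictionaries. Assumes `folder` was set by the loop above after the job succeeded.\n",
"import json\n",
"\n",
"import tensorflow as tf\n",
"\n",
"prediction_files = tf.io.gfile.glob(folder + '/prediction*.jsonl')\n",
"for prediction_file in prediction_files:\n",
"    with tf.io.gfile.GFile(prediction_file, 'r') as f:\n",
"        for line in f:\n",
"            record = json.loads(line)\n",
"            # Print the raw record; inspect the keys to pull out displayName,\n",
"            # confidence, and the time segment for each prediction.\n",
"            print(json.dumps(record, indent=2))"
]
},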
{
"cell_type": "markdown",
"metadata": {
"id": "cleanup"
},
"source": [
"# Cleaning up\n",
"\n",
"To clean up all GCP resources used in this project, you can [delete the GCP\n",
"project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.\n",
"\n",
"Otherwise, you can delete the individual resources you created in this tutorial:\n",
"\n",
"- Dataset\n",
"- Pipeline\n",
"- Model\n",
"- Endpoint\n",
"- Batch Job\n",
"- Custom Job\n",
"- Hyperparameter Tuning Job\n",
"- Cloud Storage Bucket"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cleanup"
},
"outputs": [],
"source": [
"delete_dataset = True\n",
"delete_pipeline = True\n",
"delete_model = True\n",
"delete_endpoint = True\n",
"delete_batchjob = True\n",
"delete_customjob = True\n",
"delete_hptjob = True\n",
"delete_bucket = True\n",
"\n",
"# Delete the dataset using the Vertex fully qualified identifier for the dataset\n",
"try:\n",
" if delete_dataset and 'dataset_id' in globals():\n",
" clients['dataset'].delete_dataset(name=dataset_id)\n",
"except Exception as e:\n",
" print(e)\n",
"\n",
"# Delete the training pipeline using the Vertex fully qualified identifier for the pipeline\n",
"try:\n",
" if delete_pipeline and 'pipeline_id' in globals():\n",
" clients['pipeline'].delete_training_pipeline(name=pipeline_id)\n",
"except Exception as e:\n",
" print(e)\n",
"\n",
"# Delete the model using the Vertex fully qualified identifier for the model\n",
"try:\n",
" if delete_model and 'model_to_deploy_id' in globals():\n",
" clients['model'].delete_model(name=model_to_deploy_id)\n",
"except Exception as e:\n",
" print(e)\n",
"\n",
"# Delete the endpoint using the Vertex fully qualified identifier for the endpoint\n",
"try:\n",
" if delete_endpoint and 'endpoint_id' in globals():\n",
" clients['endpoint'].delete_endpoint(name=endpoint_id)\n",
"except Exception as e:\n",
" print(e)\n",
"\n",
"# Delete the batch job using the Vertex fully qualified identifier for the batch job\n",
"try:\n",
" if delete_batchjob and 'batch_job_id' in globals():\n",
" clients['job'].delete_batch_prediction_job(name=batch_job_id)\n",
"except Exception as e:\n",
" print(e)\n",
"\n",
"# Delete the custom job using the Vertex fully qualified identifier for the custom job\n",
"try:\n",
" if delete_customjob and 'job_id' in globals():\n",
" clients['job'].delete_custom_job(name=job_id)\n",
"except Exception as e:\n",
" print(e)\n",
"\n",
"# Delete the hyperparameter tuning job using the Vertex fully qualified identifier for the hyperparameter tuning job\n",
"try:\n",
" if delete_hptjob and 'hpt_job_id' in globals():\n",
" clients['job'].delete_hyperparameter_tuning_job(name=hpt_job_id)\n",
"except Exception as e:\n",
" print(e)\n",
"\n",
"if delete_bucket and 'BUCKET_NAME' in globals():\n",
" ! gsutil rm -r $BUCKET_NAME"
]
}
],
"metadata": {
"colab": {
"name": "showcase_automl_video_classification_batch.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}