notebooks/official/model_evaluation/custom_tabular_regression_model_evaluation.ipynb

{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "id": "copyright" }, "outputs": [], "source": [ "# Copyright 2022 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "title" }, "source": [ "# Vertex AI Pipelines: Evaluating batch prediction results from custom tabular regression model\n", "\n", "<table align=\"left\">\n", " <td style=\"text-align: center\">\n", " <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_evaluation/custom_tabular_regression_model_evaluation.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Fmodel_evaluation%2Fcustom_tabular_regression_model_evaluation.ipynb\">\n", " <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n", " </a>\n", " </td> \n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/model_evaluation/custom_tabular_regression_model_evaluation.ipynb\">\n", " <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/model_evaluation/custom_tabular_regression_model_evaluation.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n", " </a>\n", " </td>\n", "</table>" ] }, { "cell_type": "markdown", "metadata": { "id": "overview:custom" }, "source": [ "## Overview\n", "\n", "This notebook demonstrates how to use the Vertex AI regression model evaluation component to evaluate a custom regression model. Model evaluation helps you determine your model performance based on the evaluation metrics and improve the model if necessary. \n", "\n", "Learn more about [Vertex AI Model Evaluation](https://cloud.google.com/vertex-ai/docs/evaluation/introduction) and [Custom training](https://cloud.google.com/vertex-ai/docs/training/custom-training)." 
] }, { "cell_type": "markdown", "metadata": { "id": "dataset:custom,boston,lrg" }, "source": [ "### Objective\n", "\n", "In this tutorial, you learn how to evaluate a Vertex AI model resource through a Vertex AI pipeline job using google cloud pipeline components.\n", "\n", "This tutorial uses the following Google Cloud ML services and resources:\n", "\n", "- Vertex AI Training (Custom Training)\n", "- Vertex AI Batch Predictions\n", "- Vertex AI Pipelines\n", "- Vertex AI Model Registry\n", "\n", "\n", "The steps performed include:\n", "\n", "- Create a Vertex AI Custom Training Job to train a TensorFlow model.\n", "- Run the custom training job. \n", "- Retrieve and load the model artifacts.\n", "- View the model evaluation.\n", "- Upload the model as a Vertex AI model resource.\n", "- Import a pre-trained Vertex AI model resource into the pipeline.\n", "- Run a batch prediction job in the pipeline.\n", "- Evaluate the model using the regression evaluation component.\n", "- Import the Regression Metrics to the Vertex AI model resource." ] }, { "cell_type": "markdown", "metadata": { "id": "objective:custom,training,batch_prediction" }, "source": [ "### Dataset\n", "\n", "The dataset used for this tutorial is the [Boston Housing Prices dataset](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html). The version of the dataset you use in this tutorial is the one that's available from TensorFlow SDK. The trained model predicts the median price of a house in units of 1K USD." ] }, { "cell_type": "markdown", "metadata": { "id": "costs" }, "source": [ "### Costs\n", "\n", "This tutorial uses billable components of Google Cloud:\n", "\n", "* Vertex AI\n", "* Cloud Storage\n", "\n", "Learn about [Vertex AI\n", "pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage\n", "pricing](https://cloud.google.com/storage/pricing), and use the [Pricing\n", "Calculator](https://cloud.google.com/products/calculator/)\n", "to generate a cost estimate based on your projected usage." ] }, { "cell_type": "markdown", "metadata": { "id": "61RBz8LLbxCR" }, "source": [ "## Get started" ] }, { "cell_type": "markdown", "metadata": { "id": "No17Cw5hgx12" }, "source": [ "### Install Vertex AI SDK for Python and other required packages\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "HwrBbTwGw0aC" }, "outputs": [], "source": [ "! pip3 install --upgrade --quiet google-cloud-aiplatform \\\n", " tensorflow==2.15.1 \\\n", " google-cloud-pipeline-components==1.0.26 \\\n", " matplotlib \\\n", " google-cloud-storage " ] }, { "cell_type": "markdown", "metadata": { "id": "R5Xep4W9lq-Z" }, "source": [ "### Restart runtime (Colab only)\n", "\n", "To use the newly installed packages, you must restart the runtime on Google Colab." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "XRvKdaPDTznN" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "\n", " import IPython\n", "\n", " app = IPython.Application.instance()\n", " app.kernel.do_shutdown(True)" ] }, { "cell_type": "markdown", "metadata": { "id": "SbmM4z7FOBpM" }, "source": [ "<div class=\"alert alert-block alert-warning\">\n", "<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. 
⚠️</b>\n", "</div>\n" ] }, { "cell_type": "markdown", "metadata": { "id": "dmWOrTJ3gx13" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "\n", "Authenticate your environment on Google Colab.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NyKGtVQjgx13" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "\n", " from google.colab import auth\n", "\n", " auth.authenticate_user()" ] }, { "cell_type": "markdown", "metadata": { "id": "DF4l8DTdWgPY" }, "source": [ "### Set Google Cloud project information\n", "\n", "To get started using Vertex AI, you must have an existing Google Cloud project. Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Nqwi-5ufWp_B" }, "outputs": [], "source": [ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", "LOCATION = \"us-central1\" # @param {type:\"string\"}" ] }, { "cell_type": "markdown", "metadata": { "id": "bucket:mbsdk" }, "source": [ "### Create a Cloud Storage bucket\n", "\n", "Create a storage bucket to store intermediate artifacts such as datasets." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bucket" }, "outputs": [], "source": [ "BUCKET_URI = f\"gs://your-bucket-name-{PROJECT_ID}-unique\" # @param {type:\"string\"}" ] }, { "cell_type": "markdown", "metadata": { "id": "autoset_bucket" }, "source": [ "**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "91c46850b49b" }, "outputs": [], "source": [ "! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}" ] }, { "cell_type": "markdown", "metadata": { "id": "Bw2_iughbsd7" }, "source": [ "#### Service Account\n", "\n", "You use a service account to create Vertex AI Pipeline jobs. If you don't want to use your project's Compute Engine service account, set `SERVICE_ACCOUNT` to another service account ID." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bPW2wdPNw0aJ" }, "outputs": [], "source": [ "SERVICE_ACCOUNT = \"[your-service-account]\" # @param {type:\"string\"}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BUT1yahzw0aK" }, "outputs": [], "source": [ "import sys\n", "\n", "IS_COLAB = \"google.colab\" in sys.modules\n", "\n", "if (\n", " SERVICE_ACCOUNT == \"\"\n", " or SERVICE_ACCOUNT is None\n", " or SERVICE_ACCOUNT == \"[your-service-account]\"\n", "):\n", " # Get your service account from gcloud\n", " if not IS_COLAB:\n", " shell_output = !gcloud auth list 2>/dev/null\n", " SERVICE_ACCOUNT = shell_output[2].replace(\"*\", \"\").strip()\n", "\n", " else: # IS_COLAB:\n", " shell_output = ! gcloud projects describe $PROJECT_ID\n", " project_number = shell_output[-1].split(\":\")[1].strip().replace(\"'\", \"\")\n", " SERVICE_ACCOUNT = f\"{project_number}-compute@developer.gserviceaccount.com\"\n", "\n", " print(\"Service Account:\", SERVICE_ACCOUNT)" ] }, { "cell_type": "markdown", "metadata": { "id": "EOOmh89cbsd8" }, "source": [ "#### Set service account access for Vertex AI Pipelines\n", "\n", "Run the following commands to grant your service account access to read and write pipeline artifacts in the bucket that you created in the previous step. You only need to run this step once per service account." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NjAu9sOjbsd8" }, "outputs": [], "source": [ "! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.objectCreator $BUCKET_URI\n", "\n", "! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.objectViewer $BUCKET_URI" ] }, { "cell_type": "markdown", "metadata": { "id": "setup_vars" }, "source": [ "### Import libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "import_aip:mbsdk" }, "outputs": [], "source": [ "import json\n", "import os\n", "\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import tensorflow as tf\n", "from google.cloud import aiplatform, aiplatform_v1\n", "from tensorflow.keras.datasets import boston_housing" ] }, { "cell_type": "markdown", "metadata": { "id": "init_aip:mbsdk" }, "source": [ "### Initialize Vertex AI SDK for Python\n", "\n", "Initialize the Vertex AI SDK for Python for your project and corresponding bucket." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "J_vhdLRDw0aL" }, "outputs": [], "source": [ "aiplatform.init(project=PROJECT_ID, staging_bucket=BUCKET_URI, location=LOCATION)" ] }, { "cell_type": "markdown", "metadata": { "id": "accelerators:training,cpu,prediction,cpu,mbsdk" }, "source": [ "#### Set hardware accelerators\n", "\n", "You can set hardware accelerators for training and prediction.\n", "\n", "Set the variables `TRAIN_GPU/TRAIN_NGPU` and `DEPLOY_GPU/DEPLOY_NGPU` to use a container image supporting a GPU and the number of GPUs allocated to the virtual machine (VM) instance. For example, to use a GPU container image with 4 Nvidia Telsa T4 GPUs allocated to each VM, you'd specify:\n", "\n", " (aiplatform.gapic.AcceleratorType.NVIDIA_TESLA_T4, 4)\n", "\n", "\n", "Otherwise specify `(None, None)` to use a container image to run on a CPU.\n", "\n", "Learn more about [hardware accelerator support for your region](https://cloud.google.com/vertex-ai/docs/general/locations#accelerators) \n", "\n", "*Note*: TF releases before 2.3 for GPU support fail to load the custom model in this tutorial. It's a known issue and fixed in TF 2.3 which is caused by static graph ops that are generated in the serving function. If you encounter this issue on your own custom models, use a container image for TF 2.3 with GPU support." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "aex4cf68w0aL" }, "outputs": [], "source": [ "if os.getenv(\"IS_TESTING_TRAIN_GPU\"):\n", " TRAIN_GPU, TRAIN_NGPU = (\n", " aiplatform.gapic.AcceleratorType.NVIDIA_TESLA_T4,\n", " int(os.getenv(\"IS_TESTING_TRAIN_GPU\")),\n", " )\n", "else:\n", " TRAIN_GPU, TRAIN_NGPU = (None, None)\n", "\n", "if os.getenv(\"IS_TESTING_DEPLOY_GPU\"):\n", " DEPLOY_GPU, DEPLOY_NGPU = (\n", " aiplatform.gapic.AcceleratorType.NVIDIA_TESLA_T4,\n", " int(os.getenv(\"IS_TESTING_DEPLOY_GPU\")),\n", " )\n", "else:\n", " DEPLOY_GPU, DEPLOY_NGPU = (None, None)" ] }, { "cell_type": "markdown", "metadata": { "id": "container:training,prediction" }, "source": [ "#### Set pre-built containers\n", "\n", "Set the pre-built Docker container image for training and prediction.\n", "\n", "\n", "For the latest list, see [Pre-built containers for training](https://cloud.google.com/ai-platform-unified/docs/training/pre-built-containers).\n", "\n", "\n", "For the latest list, see [Pre-built containers for prediction](https://cloud.google.com/ai-platform-unified/docs/predictions/pre-built-containers)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tJ8sDj_7w0aL" }, "outputs": [], "source": [ "if os.getenv(\"IS_TESTING_TF\"):\n", " TF = os.getenv(\"IS_TESTING_TF\")\n", "else:\n", " TF = \"2-9\"\n", "\n", "if TF[0] == \"2\":\n", " if TRAIN_GPU:\n", " TRAIN_VERSION = \"tf-gpu.{}\".format(TF)\n", " else:\n", " TRAIN_VERSION = \"tf-cpu.{}\".format(TF)\n", " if DEPLOY_GPU:\n", " DEPLOY_VERSION = \"tf2-gpu.{}\".format(TF)\n", " else:\n", " DEPLOY_VERSION = \"tf2-cpu.{}\".format(TF)\n", "else:\n", " if TRAIN_GPU:\n", " TRAIN_VERSION = \"tf-gpu.{}\".format(TF)\n", " else:\n", " TRAIN_VERSION = \"tf-cpu.{}\".format(TF)\n", " if DEPLOY_GPU:\n", " DEPLOY_VERSION = \"tf-gpu.{}\".format(TF)\n", " else:\n", " DEPLOY_VERSION = \"tf-cpu.{}\".format(TF)\n", "\n", "TRAIN_IMAGE = \"us-docker.pkg.dev/vertex-ai/training/{}:latest\".format(TRAIN_VERSION)\n", "DEPLOY_IMAGE = \"us-docker.pkg.dev/vertex-ai/prediction/{}:latest\".format(DEPLOY_VERSION)\n", "\n", "print(\"Training:\", TRAIN_IMAGE, TRAIN_GPU, TRAIN_NGPU)\n", "print(\"Deployment:\", DEPLOY_IMAGE, DEPLOY_GPU, DEPLOY_NGPU)" ] }, { "cell_type": "markdown", "metadata": { "id": "machine:training,prediction" }, "source": [ "#### Set machine type\n", "\n", "Next, set the machine type to use for training and prediction.\n", "\n", "- Set the variables `TRAIN_COMPUTE` and `DEPLOY_COMPUTE` to configure the compute resources for the VMs you use for training and prediction.\n", " - `machine type`\n", " - `n1-standard`: 3.75GB of memory per vCPU.\n", " - `n1-highmem`: 6.5GB of memory per vCPU\n", " - `n1-highcpu`: 0.9 GB of memory per vCPU\n", " - `vCPUs`: number of \\[2, 4, 8, 16, 32, 64, 96 \\]\n", "\n", "**Note**: The following isn't supported for training:\n", "\n", " - `standard`: 2 vCPUs\n", " - `highcpu`: 2, 4 and 8 vCPUs\n", "\n", "**Note**: You may also use n2 and e2 machine types for training and deployment, but they don't support GPUs." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "uUhWX8dXw0aM" }, "outputs": [], "source": [ "if os.getenv(\"IS_TESTING_TRAIN_MACHINE\"):\n", " MACHINE_TYPE = os.getenv(\"IS_TESTING_TRAIN_MACHINE\")\n", "else:\n", " MACHINE_TYPE = \"n1-standard\"\n", "\n", "VCPU = \"4\"\n", "TRAIN_COMPUTE = MACHINE_TYPE + \"-\" + VCPU\n", "print(\"Train machine type\", TRAIN_COMPUTE)\n", "\n", "if os.getenv(\"IS_TESTING_DEPLOY_MACHINE\"):\n", " MACHINE_TYPE = os.getenv(\"IS_TESTING_DEPLOY_MACHINE\")\n", "else:\n", " MACHINE_TYPE = \"n1-standard\"\n", "\n", "VCPU = \"4\"\n", "DEPLOY_COMPUTE = MACHINE_TYPE + \"-\" + VCPU\n", "print(\"Deploy machine type\", DEPLOY_COMPUTE)" ] }, { "cell_type": "markdown", "metadata": { "id": "examine_training_package" }, "source": [ "## Training a custom model\n", "\n", "Now you're ready to start creating your own custom model and training for Boston Housing. \n", "\n", "Learn more about [custom model training on Vertex AI](https://cloud.google.com/vertex-ai/docs/training/custom-training)\n", "\n", "### Examine the training package\n", "\n", "#### Package layout\n", "\n", "Before you start the training, look at how a Python package is assembled for a custom training job. 
When unarchived, the package contains the following directory/file layout.\n", "\n", "- PKG-INFO\n", "- README.md\n", "- setup.cfg\n", "- setup.py\n", "- trainer\n", " - \\_\\_init\\_\\_.py\n", " - task.py\n", "\n", "The files `setup.cfg` and `setup.py` are the instructions for installing the package into the operating environment of the Docker image.\n", "\n", "The file `trainer/task.py` is the Python script for executing the custom training job. \n", "\n", "**Note:** when `trainer/task.py` is referred to in the worker pool specification, the directory slash is replaced with a dot and the file suffix (.py) is dropped (trainer.task).\n", "\n", "#### Package Assembly\n", "\n", "In the following cells, you assemble the training package." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "MsQjrHq1w0aM" }, "outputs": [], "source": [ "# Make folder for Python training script\n", "! rm -rf custom\n", "! mkdir custom\n", "\n", "# Add package information\n", "! touch custom/README.md\n", "\n", "setup_cfg = \"[egg_info]\\n\\ntag_build =\\n\\ntag_date = 0\"\n", "! echo \"$setup_cfg\" > custom/setup.cfg\n", "\n", "setup_py = \"import setuptools\\n\\nsetuptools.setup(\\n\\n install_requires=[\\n\\n 'tensorflow_datasets==1.3.0',\\n\\n ],\\n\\n packages=setuptools.find_packages())\"\n", "! echo \"$setup_py\" > custom/setup.py\n", "\n", "pkg_info = \"Metadata-Version: 1.0\\n\\nName: Boston Housing tabular regression\\n\\nVersion: 0.0.0\\n\\nSummary: Demostration training script\\n\\nHome-page: www.google.com\\n\\nAuthor: Google\\n\\nAuthor-email: aferlitsch@google.com\\n\\nLicense: Public\\n\\nDescription: Demo\\n\\nPlatform: Vertex\"\n", "! echo \"$pkg_info\" > custom/PKG-INFO\n", "\n", "# Make the training subfolder\n", "! mkdir custom/trainer\n", "! touch custom/trainer/__init__.py" ] }, { "cell_type": "markdown", "metadata": { "id": "taskpy_contents:boston" }, "source": [ "#### Create task.py\n", "\n", "In the next cell, write the contents of the training script *task.py*.\n", "\n", "To summarize, the script performs the following steps:\n", "\n", "- Gets the directory for where to save the model artifacts from the command line (`--model_dir`), and if not specified, then from the environment variable `AIP_MODEL_DIR`.\n", "- Loads Boston Housing dataset from TF.Keras built-in datasets.\n", "- Builds a simple deep neural network model using TF.Keras model API.\n", "- Compiles the model (`compile()`).\n", "- Sets a training distribution strategy according to the argument `args.distribute`.\n", "- Trains the model (`fit()`) with epochs specified by `args.epochs`.\n", "- Saves the trained model (`save(args.model_dir)`) to the specified model directory.\n", "- Saves the maximum value for each feature `f.write(str(params))` to the specified parameters file." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5mN6Ky8mw0aN" }, "outputs": [], "source": [ "%%writefile custom/trainer/task.py\n", "# Single, Mirror and Multi-Machine Distributed Training for Boston Housing\n", "\n", "import tensorflow_datasets as tfds\n", "import tensorflow as tf\n", "from tensorflow.python.client import device_lib\n", "import numpy as np\n", "import argparse\n", "import os\n", "import sys\n", "tfds.disable_progress_bar()\n", "\n", "parser = argparse.ArgumentParser()\n", "parser.add_argument('--model-dir', dest='model_dir',\n", " default=os.getenv('AIP_MODEL_DIR'), type=str, help='Model dir.')\n", "parser.add_argument('--lr', dest='lr',\n", " default=0.001, type=float,\n", " help='Learning rate.')\n", "parser.add_argument('--epochs', dest='epochs',\n", " default=20, type=int,\n", " help='Number of epochs.')\n", "parser.add_argument('--steps', dest='steps',\n", " default=100, type=int,\n", " help='Number of steps per epoch.')\n", "parser.add_argument('--distribute', dest='distribute', type=str, default='single',\n", " help='distributed training strategy')\n", "parser.add_argument('--param-file', dest='param_file',\n", " default='/tmp/param.txt', type=str,\n", " help='Output file for parameters')\n", "args = parser.parse_args()\n", "\n", "print('Python Version = {}'.format(sys.version))\n", "print('TensorFlow Version = {}'.format(tf.__version__))\n", "print('TF_CONFIG = {}'.format(os.environ.get('TF_CONFIG', 'Not found')))\n", "\n", "# Single Machine, single compute device\n", "if args.distribute == 'single':\n", " if tf.test.is_gpu_available():\n", " strategy = tf.distribute.OneDeviceStrategy(device=\"/gpu:0\")\n", " else:\n", " strategy = tf.distribute.OneDeviceStrategy(device=\"/cpu:0\")\n", "# Single Machine, multiple compute device\n", "elif args.distribute == 'mirror':\n", " strategy = tf.distribute.MirroredStrategy()\n", "# Multiple Machine, multiple compute device\n", "elif args.distribute == 'multi':\n", " strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()\n", "\n", "# Multi-worker configuration\n", "print('num_replicas_in_sync = {}'.format(strategy.num_replicas_in_sync))\n", "\n", "\n", "def make_dataset():\n", "\n", " \n", " (x_train, y_train), (x_test, y_test) = tf.keras.datasets.boston_housing.load_data(\n", " path=\"boston_housing.npz\", test_split=0.2, seed=113\n", " )\n", " \n", " \n", " \n", " #Get maximum value of each column\n", " max_value_in_each_column_array_x_train=np.max(x_train,axis=0)\n", "\n", " #dividing each value by the maximum value of that column \n", " \n", " x_train=x_train/max_value_in_each_column_array_x_train\n", "\n", " max_value_in_each_column_array_x_test=np.max(x_test,axis=0)\n", "\n", " #dividing each value by the maximum value of that column \n", " \n", " x_test=x_test/max_value_in_each_column_array_x_test\n", "\n", " params=max_value_in_each_column_array_x_train\n", "\n", " \n", "\n", " # store the normalization (max) value for each feature\n", " with tf.io.gfile.GFile(args.param_file, 'w') as f:\n", " f.write(str(params))\n", " return (x_train, y_train), (x_test, y_test)\n", "\n", "\n", "# Build the Keras model\n", "def build_and_compile_dnn_model():\n", " model = tf.keras.Sequential([\n", " tf.keras.layers.Dense(128, activation='relu', input_shape=(13,)),\n", " tf.keras.layers.Dense(128, activation='relu'),\n", " tf.keras.layers.Dense(1, activation='linear')\n", " ])\n", " model.compile(\n", " loss='mse',\n", " optimizer=tf.keras.optimizers.RMSprop(learning_rate=args.lr))\n", " return 
model\n", "\n", "NUM_WORKERS = strategy.num_replicas_in_sync\n", "# Here the batch size scales up by number of workers since\n", "# `tf.data.Dataset.batch` expects the global batch size.\n", "BATCH_SIZE = 16\n", "GLOBAL_BATCH_SIZE = BATCH_SIZE * NUM_WORKERS\n", "\n", "with strategy.scope():\n", " # Creation of dataset, and model building/compiling need to be within\n", " # `strategy.scope()`.\n", " model = build_and_compile_dnn_model()\n", "\n", "# Train the model\n", "(x_train, y_train), (x_test, y_test) = make_dataset()\n", "model.fit(x_train, y_train, epochs=args.epochs, batch_size=GLOBAL_BATCH_SIZE)\n", "model.save(args.model_dir)" ] }, { "cell_type": "markdown", "metadata": { "id": "tarball_training_script" }, "source": [ "### Store the training script in your Cloud Storage bucket.\n", "\n", "Next, package the training folder into a compressed tar ball, and then store the folder in your Cloud Storage bucket." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "_gJHD3fow0aN" }, "outputs": [], "source": [ "! rm -f custom.tar custom.tar.gz\n", "! tar cvf custom.tar custom\n", "! gzip custom.tar\n", "! gsutil cp custom.tar.gz $BUCKET_URI/trainer_boston.tar.gz" ] }, { "cell_type": "markdown", "metadata": { "id": "create_custom_training_job:mbsdk,no_model" }, "source": [ "### Create and run custom training job\n", "\n", "\n", "To train a custom model, you perform two steps:\n", "\n", "1) Create a custom training job\n", "\n", "2) Specify your training parameters and run the job.\n", "\n", "#### Create a custom training job\n", "\n", "A custom training job is created using the `CustomTrainingJob` class, with the following parameters:\n", "\n", "- `display_name`: The human readable name for the custom training job\n", "- `container_uri`: The training container image\n", "- `requirements`: Package requirements for the training container image (e.g., pandas)\n", "- `script_path`: The relative path to the training script" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "iVbjlrdgw0aO" }, "outputs": [], "source": [ "train_job = aiplatform.CustomTrainingJob(\n", " display_name=\"boston\",\n", " script_path=\"custom/trainer/task.py\",\n", " container_uri=TRAIN_IMAGE,\n", " requirements=[\"gcsfs==0.7.1\", \"tensorflow-datasets==4.4\"],\n", ")\n", "\n", "print(train_job)" ] }, { "cell_type": "markdown", "metadata": { "id": "prepare_custom_cmdargs" }, "source": [ "#### Prepare your training parameters\n", "\n", "Now define the command-line arguments for your custom training container:\n", "\n", "- `args`: The command-line arguments to pass to the executable that's set as the entry point into the container.\n", " - `--model-dir`: Command-line argument to specify where to store the model artifacts. You can use either of the following methods to specify the storage location for artifacts.\n", " - **method-1**(set `DIRECT` to `True`): You pass the Cloud Storage location as a command line argument to your training script.\n", " - **method-2**(set `DIRECT` to `False`): The service passes the Cloud Storage location as the environment variable AIP_MODEL_DIR to your training script. In this case, you tell the service the model artifact location in the job specification.\n", " - `--epochs`: The number of epochs for training.\n", " - `--steps`: The number of steps per epoch." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7zKSOPvvw0aO" }, "outputs": [], "source": [ "MODEL_DIR = \"{}/{}\".format(BUCKET_URI, \"model\")\n", "\n", "EPOCHS = 20\n", "STEPS = 100\n", "\n", "DIRECT = True\n", "if DIRECT:\n", " CMDARGS = [\n", " \"--model-dir=\" + MODEL_DIR,\n", " \"--epochs=\" + str(EPOCHS),\n", " \"--steps=\" + str(STEPS),\n", " ]\n", "else:\n", " CMDARGS = [\n", " \"--epochs=\" + str(EPOCHS),\n", " \"--steps=\" + str(STEPS),\n", " ]" ] }, { "cell_type": "markdown", "metadata": { "id": "run_custom_job:mbsdk,no_model" }, "source": [ "#### Run the custom training job\n", "\n", "Next, you run the custom job to start the training job by invoking the `run()` method, with the following parameters:\n", "\n", "- `args`: The command-line arguments to pass to the training script.\n", "- `replica_count`: The number of compute instances for training (replica_count = 1 is single node training).\n", "- `machine_type`: The machine type for the compute instances.\n", "- `accelerator_type`: The hardware accelerator type.\n", "- `accelerator_count`: The number of accelerators to attach to a worker replica.\n", "- `base_output_dir`: The Cloud Storage location to write the model artifacts.\n", "- `sync`: Set **True** to wait until the completion of the job." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ScNW3L-Xw0aO" }, "outputs": [], "source": [ "if TRAIN_GPU:\n", " train_job.run(\n", " args=CMDARGS,\n", " replica_count=1,\n", " machine_type=TRAIN_COMPUTE,\n", " accelerator_type=TRAIN_GPU.name,\n", " accelerator_count=TRAIN_NGPU,\n", " base_output_dir=MODEL_DIR,\n", " sync=True,\n", " )\n", "else:\n", " train_job.run(\n", " args=CMDARGS,\n", " replica_count=1,\n", " machine_type=TRAIN_COMPUTE,\n", " base_output_dir=MODEL_DIR,\n", " sync=True,\n", " )\n", "\n", "model_path_to_deploy = MODEL_DIR" ] }, { "cell_type": "markdown", "metadata": { "id": "ab954a846b61" }, "source": [ "#### Load the saved model\n", "\n", "Your model is stored in a TensorFlow SavedModel format in a Cloud Storage bucket. Once you load the model from the Cloud Storage bucket you can run model evaluation and prepare it for prediction requests.\n", "\n", "To load the model, pass the Cloud Storage path \"MODEL_DIR\" to the `tf.saved_model.load()` method." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dad74e9bc8b3" }, "outputs": [], "source": [ "loaded = tf.saved_model.load(model_path_to_deploy)" ] }, { "cell_type": "markdown", "metadata": { "id": "serving_function_signature" }, "source": [ "#### Get the serving function signature\n", "\n", "You can get the signatures of your model's input and output layers by reloading the model into memory, and querying it for the signatures corresponding to each layer.\n", "\n", "When making a prediction request, you need to route the request to the serving function instead of the model, so you need to know the input layer name of the serving function which you use later when you make a prediction request.\n", "\n", "You also need to know the name of the serving function's input and output layer for constructing the explanation metadata **during a later step**." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "n_SWvewPw0aQ" }, "outputs": [], "source": [ "serving_input = list(\n", " loaded.signatures[\"serving_default\"].structured_input_signature[1].keys()\n", ")[0]\n", "print(\"Serving function input:\", serving_input)\n", "serving_output = list(loaded.signatures[\"serving_default\"].structured_outputs.keys())[0]\n", "print(\"Serving function output:\", serving_output)" ] }, { "cell_type": "markdown", "metadata": { "id": "69d4859c7196" }, "source": [ "## Configure feature-based explanations (Optional) \n", "\n", "**For configuring explanations to the model, follow this step. This step is optional.**\n", "\n", "To use Vertex Explainable AI with a custom-trained model, you must configure certain options when you create the Model resource that you plan to request explanations from, or when you deploy the model, or when you submit a batch explanation job.\n", "\n", "If you want to use Vertex Explainable AI with an AutoML tabular model, then you don't need to perform any configuration. Vertex AI automatically configures the model for Vertex Explainable AI.\n", "\n", "### Explanation Specification\n", "\n", "To get explanations for the predictions, you must enable the explanation feature and set corresponding settings when you upload your custom model to Vertex AI Model Registry. These settings are referred to as the explanation metadata, which consists of:\n", "\n", "- `parameters`: Specification for the explainability algorithm to use for explanations on your model. You can choose between:\n", " - Shapley (**Note**: not recommended for image data since can involve a long-running operation)\n", " - XRAI\n", " - Integrated Gradients\n", "- `metadata`: Specification for how the algoithm is applied on your custom model\n", "\n", "Learn more about [explanation specification](https://cloud.google.com/vertex-ai/docs/explainable-ai/configuring-explanations-feature-based#when-creating-or-importing-model).\n", "\n", "\n", "\n", "In the next code cell, set the variable `XAI` to the explainabilty algorithm that you use on your custom model." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "p_jRpSlCeJIY" }, "outputs": [], "source": [ "XAI = \"ig\" # [ shapley, ig, xrai ]\n", "\n", "if XAI == \"shapley\":\n", " PARAMETERS = {\"sampled_shapley_attribution\": {\"path_count\": 10}}\n", "elif XAI == \"ig\":\n", " PARAMETERS = {\"integrated_gradients_attribution\": {\"step_count\": 50}}\n", "elif XAI == \"xrai\":\n", " PARAMETERS = {\"xrai_attribution\": {\"step_count\": 50}}\n", "\n", "parameters = aiplatform.explain.ExplanationParameters(PARAMETERS)" ] }, { "cell_type": "markdown", "metadata": { "id": "781989a46a3b" }, "source": [ "In the next code cell, define the metadata." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "A9Rc-_HUeMnh" }, "outputs": [], "source": [ "INPUT_METADATA = {\n", " \"input_tensor_name\": serving_input,\n", " \"encoding\": \"BAG_OF_FEATURES\",\n", " \"modality\": \"numeric\",\n", " \"index_feature_mapping\": [\n", " \"crim\",\n", " \"zn\",\n", " \"indus\",\n", " \"chas\",\n", " \"nox\",\n", " \"rm\",\n", " \"age\",\n", " \"dis\",\n", " \"rad\",\n", " \"tax\",\n", " \"ptratio\",\n", " \"b\",\n", " \"lstat\",\n", " ],\n", "}\n", "\n", "OUTPUT_METADATA = {\"output_tensor_name\": serving_output}\n", "\n", "input_metadata = aiplatform.explain.ExplanationMetadata.InputMetadata(INPUT_METADATA)\n", "output_metadata = aiplatform.explain.ExplanationMetadata.OutputMetadata(OUTPUT_METADATA)\n", "\n", "metadata = aiplatform.explain.ExplanationMetadata(\n", " inputs={\"features\": input_metadata}, outputs={\"medv\": output_metadata}\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "ed414a2f945a" }, "source": [ "### Create instance schema and prediction schema yaml files\n", "\n", "In next cells, write the contents of *instance_schema.yaml* and *prediction_schema.yaml* files. Content structure is the same for both files.\n", "\n", "\n", "#### Create instance schema yaml file\n", "\n", "The *instance_schema.yaml* file defines the structure of the prediction instances you provide to your batch predictions. \n", "\n", "- Specify the title and description.\n", "- Specify type of the input. In your case input layer to batch prediction is \n", "**{\"dense_input\": [0.02715405449271202, 0.0, 0.027177177369594574, 0.0, 0.0010195195209234953, 0.009660660289227962, 0.1501501500606537, 0.0027548049110919237, 0.036036036908626556, 1.0, 0.03033033013343811, 0.04091591760516167, 0.043618619441986084]}**\n", "which is an object. Inside object, there are properties like dense_input.\n", "- For each property, provide description and mention its type. \n", "- If type of the property is an array, mention the information about array items in `items` key.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "096c0ad95247" }, "outputs": [], "source": [ "%%writefile instance_schema.yaml\n", "title: TabularRegression\n", "description: 'Regression Instances.'\n", "\n", "type: object\n", "properties:\n", " dense_input:\n", " type: array\n", " items:\n", " type: float\n", " minimum: 0.0\n", " maximum: 1.0\n", " description: 'Input values to model'\n" ] }, { "cell_type": "markdown", "metadata": { "id": "53f324aaf19a" }, "source": [ "#### Create prediction schema yaml file\n", "\n", "The *prediction_schema.yaml* file defines the structure of the prediction output you get from your batch prediction job. \n", "\n", "Output of batch prediction job is \"prediction\": [value], which is of type array." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "81e3c06bf8d7" }, "outputs": [], "source": [ "%%writefile prediction_schema.yaml\n", "title: TabularRegression\n", "description: 'Regression results.'\n", "\n", "type: array" ] }, { "cell_type": "markdown", "metadata": { "id": "ff29b80d8b9c" }, "source": [ "Upload both the files to your Cloud Storage bucket." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "d22179020f2d" }, "outputs": [], "source": [ "!gsutil cp instance_schema.yaml {BUCKET_URI}/instance_schema.yaml\n", "!gsutil cp prediction_schema.yaml {BUCKET_URI}/prediction_schema.yaml" ] }, { "cell_type": "markdown", "metadata": { "id": "upload_model:mbsdk" }, "source": [ "## Upload the model\n", "\n", "Next, upload your model to Vertex AI Model Registry using `Model.upload()` method, with the following parameters:\n", "\n", "- `display_name`: The human readable name for the model resource.\n", "- `artifact`: The Cloud Storage location of the trained model artifacts.\n", "- `serving_container_image_uri`: The serving container image.\n", "- `instance_schema_uri`: Points to a YAML file stored on Google Cloud Storage describing the format of a single instance.\n", "- `prediction_schema_uri`: Points to a YAML file stored on Google Cloud Storage describing the format of a single prediction produced by this model.\n", "- `sync`: Whether to execute the upload asynchronously or synchronously.\n", "- `explanation_parameters`: Parameters to configure explaining for model's predictions.\n", "- `explanation_metadata`: Metadata describing the model's input and output for explanation.\n", "\n", "If the `upload()` method is run asynchronously, you can subsequently block until completion with the `wait()` method.\n", "\n", "**Note:** If you want to configure explanations for the model, set `explanation_parameters`, `explanation_metadata` parameters. Otherwise don't set them." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "JF-2-yxtw0aQ" }, "outputs": [], "source": [ "model = aiplatform.Model.upload(\n", " display_name=\"boston_new_model\",\n", " artifact_uri=MODEL_DIR,\n", " serving_container_image_uri=DEPLOY_IMAGE,\n", " instance_schema_uri=f\"{BUCKET_URI}/instance_schema.yaml\",\n", " prediction_schema_uri=f\"{BUCKET_URI}/prediction_schema.yaml\",\n", " explanation_parameters=parameters,\n", " explanation_metadata=metadata,\n", " sync=False,\n", ")\n", "\n", "model.wait()" ] }, { "cell_type": "markdown", "metadata": { "id": "d9b7a236f0f2" }, "source": [ "### Load data for the pipeline\n", "\n", "Load the Boston Housing test (holdout) data from `tf.keras.datasets`, using the method `load_data()`. This returns the dataset as a tuple of two elements. The first element is the training data and the second is the test data. Each element is also a tuple of two elements: the feature data, and the corresponding labels (median value of owner-occupied home).\n", "\n", "You don't need the training data, and therefore you load it as `(_, _)`.\n", "\n", "Before you can run the data through the pipeline, you need to preprocess it. Normalize (rescale) the data in each column by dividing each value by the maximum value of that column. This replaces each single value with a 32-bit floating point number between 0 and 1." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "4790dcb57dff" }, "outputs": [], "source": [ "(_, _), (x_test, y_test) = boston_housing.load_data(\n", " path=\"boston_housing.npz\", test_split=0.2, seed=113\n", ")\n", "\n", "max_value_in_each_column_array = np.max(x_test, axis=0)\n", "\n", "\n", "# dividing each value by the maximum value of that column\n", "x_test = x_test / max_value_in_each_column_array\n", "\n", "\n", "x_test = x_test.astype(np.float32)\n", "\n", "print(x_test.shape, x_test.dtype, y_test.shape)\n", "print(\"scaled\", x_test[0])" ] }, { "cell_type": "markdown", "metadata": { "id": "make_batch_file:custom,tabular" }, "source": [ "### Prepare the input file for the pipeline\n", "\n", "Prepare an input file and store it in your Cloud Storage bucket. Each instance in the file is a dictionary entry of the form:\n", "\n", " {serving_input: content, grount_truth_column:value}\n", "\n", "- `serving_input`: The name of the input layer of the underlying model.\n", "- `content`: The feature values of the test item as a list.\n", "- `ground_truth_column`: Give any name to this key. Use the same name in target_field_name in the below pipeline parameters.\n", "- `value`: Ground truth value of this instance.\n", "\n", " " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Tmf1yEirnVQZ" }, "outputs": [], "source": [ "gcs_input_uri = BUCKET_URI + \"/\" + \"test_file_with_ground_truth.jsonl\"\n", "with tf.io.gfile.GFile(gcs_input_uri, \"w\") as f:\n", " for i in range(10):\n", " data = {serving_input: x_test[i].tolist(), \"MEDV\": y_test[i]}\n", " f.write(json.dumps(data) + \"\\n\")" ] }, { "cell_type": "markdown", "metadata": { "id": "dAYyBa_qw0aT" }, "source": [ "## Model Evaluation\n", "\n", "Now, run a Vertex AI Batch Prediction job and generate evaluations and feature-attributions on its results by creating a Vertex AI pipeline using `evaluate` function. Learn more about [evaluate function](https://github.com/googleapis/python-aiplatform/blob/main/google/cloud/aiplatform/models.py#L5127)." ] }, { "cell_type": "markdown", "metadata": { "id": "84d660722fef" }, "source": [ "### Define parameters to run the evaluate function\n", "\n", "Specify the required parameters to run `evaluate` function. \n", "\n", "The following is the instruction of `evaluate` function paramters:\n", "\n", "- `prediction_type`: The problem type being addressed by this evaluation run. 'classification' and 'regression' are the currently supported problem types.\n", "- `target_field_name`: Name of the column to be used as the target for regression.\n", "- `gcs_source_uris`: List of the Cloud Storage bucket uris of input instances for batch prediction.\n", "- `generate_feature_attributions`: (**Optional**) Whether the model evaluation job should generate feature attributions. 
Defaults to False if not specified.\n", "\n", "**The pipeline takes about 1 hour to complete.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RqcRr7USbseH" }, "outputs": [], "source": [ "job = model.evaluate(\n", " prediction_type=\"regression\",\n", " target_field_name=\"MEDV\",\n", " gcs_source_uris=[BUCKET_URI + \"/\" + \"test_file_with_ground_truth.jsonl\"],\n", " generate_feature_attributions=True,\n", ")\n", "\n", "print(\"Waiting for the model evaluation to complete\")\n", "job.wait()" ] }, { "cell_type": "markdown", "metadata": { "id": "7651764dfa66" }, "source": [ "In the output from the last step, click on the generated link to see your run in the Cloud Console.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "l7DHzescbseI" }, "source": [ "### Runtime graph of the model evaluation pipeline\n", "\n", "In the UI, you can click on the DAG nodes to expand or collapse them. Here's a partially expanded view of the DAG (click the image to see a larger version).\n", "\n", "<img src=\"images/custom_tabular_regression_evaluation_pipeline.PNG\" style=\"height:622px;width:726px\"></img>\n", "\n", "## Get the model evaluation results\n", "\n", "After the evaluation pipeline finishes, run the cells below to print the evaluation metrics." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "11846624ae51" }, "outputs": [], "source": [ "model_evaluation = job.get_model_evaluation()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3wYGeI1abseJ" }, "outputs": [], "source": [ "# Iterate over the pipeline tasks\n", "for (\n", " task\n", ") in model_evaluation._backing_pipeline_job._gca_resource.job_detail.task_details:\n", " # Obtain the artifacts from the evaluation task\n", " if (\n", " (\"model-evaluation\" in task.task_name)\n", " and (\"model-evaluation-import\" not in task.task_name)\n", " and (\n", " task.state == aiplatform_v1.types.PipelineTaskDetail.State.SUCCEEDED\n", " or task.state == aiplatform_v1.types.PipelineTaskDetail.State.SKIPPED\n", " )\n", " ):\n", " evaluation_metrics = task.outputs.get(\"evaluation_metrics\").artifacts[\n", " 0\n", " ]\n", " evaluation_metrics_gcs_uri = evaluation_metrics.uri\n", "\n", "print(evaluation_metrics)\n", "print(evaluation_metrics_gcs_uri)" ] }, { "cell_type": "markdown", "metadata": { "id": "1-oX7xI6bseJ" }, "source": [ "### Visualize the metrics\n", "\n", "After the evaluation pipeline finishes, run the cell below to visualize the evaluation metrics." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "WTweUK_kbseJ" }, "outputs": [], "source": [ "metrics = []\n", "values = []\n", "for i in evaluation_metrics.metadata.items():\n", " # if (\n", " # i[0] == \"meanAbsolutePercentageError\"\n", " # ): # Skip MAPE, which is infinite when the ground truth is 0 for some instances.\n", " # continue\n", " metrics.append(i[0])\n", " values.append(i[1])\n", "plt.figure(figsize=(15, 5))\n", "plt.bar(x=metrics, height=values)\n", "plt.title(\"Evaluation Metrics\")\n", "plt.ylabel(\"Value\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "NUhcQPptbseJ" }, "source": [ "### Get the Feature Attributions (Optional)\n", "\n", "**If you have configured explanations for the model, run the following cell. 
Otherwise, skip it.**\n", "\n", "Feature attributions indicate how much each feature in your model contributed to the predictions for each given instance.\n", "\n", "Learn more about [Feature attributions](https://cloud.google.com/vertex-ai/docs/explainable-ai/overview#feature_attributions).\n", "\n", "Run the cell below to get the feature attributions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6-BO8VOabseJ" }, "outputs": [], "source": [ "# Iterate over the pipeline tasks\n", "for (\n", " task\n", ") in model_evaluation._backing_pipeline_job._gca_resource.job_detail.task_details:\n", " # Obtain the artifacts from the feature-attribution task\n", " if (task.task_name == \"feature-attribution\") and (\n", " task.state == aiplatform_v1.types.PipelineTaskDetail.State.SUCCEEDED\n", " or task.state == aiplatform_v1.types.PipelineTaskDetail.State.SKIPPED\n", " ):\n", " feat_attrs = task.outputs.get(\"feature_attributions\").artifacts[0]\n", " feat_attrs_gcs_uri = feat_attrs.uri\n", "\n", "print(feat_attrs)\n", "print(feat_attrs_gcs_uri)" ] }, { "cell_type": "markdown", "metadata": { "id": "tIko1hOYbseK" }, "source": [ "From the obtained Cloud Storage URI for the feature attributions, get the attribution values." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "L7EGZUAAbseK" }, "outputs": [], "source": [ "# Load the results\n", "attributions = !gsutil cat $feat_attrs_gcs_uri\n", "\n", "# Parse and print the results\n", "attributions = json.loads(attributions[0])\n", "print(attributions)" ] }, { "cell_type": "markdown", "metadata": { "id": "TILgUqq9bseK" }, "source": [ "### Visualize the Feature Attributions\n", "\n", "Visualize the obtained attributions for each feature using a bar chart." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "VOc6L7VCbseK" }, "outputs": [], "source": [ "data = attributions[\"explanation\"][\"attributions\"][0][\"featureAttributions\"]\n", "features = []\n", "attr_values = []\n", "for key, value in data.items():\n", " features.append(key)\n", " attr_values.append(value[0])\n", "\n", "plt.figure(figsize=(5, 3))\n", "plt.bar(x=features, height=attr_values)\n", "plt.title(\"Feature Attributions\")\n", "plt.xticks(rotation=90)\n", "plt.ylabel(\"Attribution value\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "cleanup:mbsdk" }, "source": [ "## Cleaning up\n", "\n", "To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud\n", "project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.\n", "\n", "Otherwise, you can delete the individual resources you created in this tutorial.\n", "\n", "Set `delete_bucket` to **True** to delete the Cloud Storage bucket created in this notebook."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6NVBNlVdw0aV" }, "outputs": [], "source": [ "# Delete model resource\n", "model.delete()\n", "\n", "# Delete the training job\n", "train_job.delete()\n", "\n", "# Delete the evaluation pipeline\n", "job.delete()\n", "\n", "# Delete the batch prediction jobs\n", "batch_prediction_jobs = aiplatform.BatchPredictionJob.list()\n", "for batch_prediction_job in batch_prediction_jobs:\n", " if any(\n", " keyword in batch_prediction_job.display_name\n", " for keyword in [\n", " \"model-registry-batch-predict-evaluation\",\n", " \"model-registry-batch-explain-evaluation\",\n", " ]\n", " ):\n", " batch_prediction_job.delete()\n", "\n", "# Delete locally generated files\n", "! rm -rf custom custom.tar.gz instance_schema.yaml prediction_schema.yaml\n", "\n", "# Delete Cloud Storage objects\n", "delete_bucket = False\n", "if delete_bucket:\n", " ! gsutil -m rm -r $BUCKET_URI" ] } ], "metadata": { "colab": { "name": "custom_tabular_regression_model_evaluation.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }