notebooks/official/custom/custom-tabular-bq-managed-dataset.ipynb (952 lines of code) (raw):

{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "copyright" }, "outputs": [], "source": [ "# Copyright 2022 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "title" }, "source": [ "# Training a TensorFlow model on BigQuery data\n", "\n", "<table align=\"left\">\n", " <td style=\"text-align: center\">\n", " <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/custom/custom-tabular-bq-managed-dataset.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Fcustom%2Fcustom-tabular-bq-managed-dataset.ipynb\">\n", " <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n", " </a>\n", " </td> \n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/custom/custom-tabular-bq-managed-dataset.ipynb\">\n", " <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/custom/custom-tabular-bq-managed-dataset.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n", " </a>\n", " </td>\n", "</table>" ] }, { "cell_type": "markdown", "metadata": { "id": "overview:custom" }, "source": [ "## Overview\n", "\n", "\n", "This tutorial demonstrates how to use the Vertex AI SDK for Python to train and deploy a custom tabular classification model for online prediction.\n", "\n", "Learn more about [Vertex AI Training](https://cloud.google.com/vertex-ai/docs/training/custom-training)." ] }, { "cell_type": "markdown", "metadata": { "id": "objective:custom,training,online_prediction" }, "source": [ "### Objective\n", "\n", "In this notebook, you learn how to create a custom-trained model from a Python script in a Docker container using the Vertex AI SDK for Python, and then get a prediction from the deployed model by sending data. 
Alternatively, you can create custom-trained models using the `gcloud` command-line tool, or online using the Cloud Console.\n", "\n", "This tutorial uses the following Google Cloud ML services and resources:\n", "\n", "- BigQuery\n", "- Cloud Storage\n", "- Vertex AI managed Datasets\n", "- Vertex AI Training\n", "- Vertex AI Endpoints\n", "\n", "The steps performed include:\n", "\n", "- Create a Vertex AI custom `TrainingPipeline` for training a model.\n", "- Train a TensorFlow model.\n", "- Deploy the `Model` resource to a serving `Endpoint` resource.\n", "- Make a prediction.\n", "- Undeploy the `Model` resource." ] }, { "cell_type": "markdown", "metadata": { "id": "dataset:custom,cifar10,icn" }, "source": [ "### Dataset\n", "\n", "The dataset used for this tutorial is the penguins dataset from [BigQuery public datasets](https://cloud.google.com/bigquery/public-data). For this tutorial, you use the fields `island`, `culmen_length_mm`, `culmen_depth_mm`, `flipper_length_mm`, `body_mass_g`, and `sex` from the dataset to predict the penguin species (`species`)." ] }, { "cell_type": "markdown", "metadata": { "id": "costs" }, "source": [ "### Costs\n", "\n", "This tutorial uses billable components of Google Cloud:\n", "\n", "* Vertex AI\n", "* Cloud Storage\n", "* BigQuery\n", "\n", "Learn about [Vertex AI\n", "pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage\n", "pricing](https://cloud.google.com/storage/pricing), [BigQuery pricing](https://cloud.google.com/bigquery/pricing) and use the [Pricing\n", "Calculator](https://cloud.google.com/products/calculator/)\n", "to generate a cost estimate based on your projected usage." ] }, { "cell_type": "markdown", "metadata": { "id": "3b1ffd5ab768" }, "source": [ "## Get Started" ] }, { "cell_type": "markdown", "metadata": { "id": "dc848186ab0e" }, "source": [ "### Install Vertex AI SDK for Python and other required packages" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1fd00fa70a2a" }, "outputs": [], "source": [ "# Install the packages\n", "! pip3 install --upgrade --quiet google-cloud-aiplatform \\\n", " google-cloud-storage \\\n", " 'google-cloud-bigquery[pandas]'" ] }, { "cell_type": "markdown", "metadata": { "id": "ff555b32bab8" }, "source": [ "### Restart runtime (Colab only)\n", "\n", "To use the newly installed packages, you must restart the runtime on Google Colab." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "f09b4dff629a" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "\n", " import IPython\n", "\n", " app = IPython.Application.instance()\n", " app.kernel.do_shutdown(True)" ] }, { "cell_type": "markdown", "metadata": { "id": "54c5ef8a8f43" }, "source": [ "<div class=\"alert alert-block alert-warning\">\n", "<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>\n", "</div>\n" ] }, { "cell_type": "markdown", "metadata": { "id": "f82e28c631cc" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "\n", "Authenticate your environment on Google Colab."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "46604f70e831" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "\n", " from google.colab import auth\n", "\n", " auth.authenticate_user()" ] }, { "cell_type": "markdown", "metadata": { "id": "107c51893a64" }, "source": [ "### Set Google Cloud project information and initialize Vertex AI SDK for Python\n", "\n", "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com). Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3c8049930470" }, "outputs": [], "source": [ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", "LOCATION = \"us-central1\" # @param {type:\"string\"}" ] }, { "cell_type": "markdown", "metadata": { "id": "bucket:custom" }, "source": [ "### Create a Cloud Storage bucket\n", "\n", "Create a storage bucket to store intermediate artifacts such as datasets." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bucket" }, "outputs": [], "source": [ "BUCKET_URI = \"gs://your-bucket-name-unique\" # @param {type:\"string\"}" ] }, { "cell_type": "markdown", "metadata": { "id": "create_bucket" }, "source": [ "**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Oz8J0vmSlugt" }, "outputs": [], "source": [ "! gsutil mb -l $LOCATION -p $PROJECT_ID $BUCKET_URI" ] }, { "cell_type": "markdown", "metadata": { "id": "4f1319830b6e" }, "source": [ "### Import libraries" ] }, { "cell_type": "markdown", "metadata": { "id": "poijnGfZCFYi" }, "source": [ "Import the Vertex AI Python SDK and other required Python libraries." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "b9a0a5a74fa6" }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from google.cloud import aiplatform, bigquery" ] }, { "cell_type": "markdown", "metadata": { "id": "750d53e37094" }, "source": [ "### Initialize Vertex AI SDK for Python\n", "\n", "Initialize the Vertex AI SDK for Python for your project and corresponding bucket." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "c9d3ac73dfbc" }, "outputs": [], "source": [ "# Initialize the Vertex AI SDK for Python\n", "aiplatform.init(project=PROJECT_ID, location=LOCATION, staging_bucket=BUCKET_URI)" ] }, { "cell_type": "markdown", "metadata": { "id": "7c163842eabd" }, "source": [ "### Initialize BigQuery Client\n", "\n", "Initialize the BigQuery Python client for your project.\n", "\n", "To use BigQuery, make sure your account has the \"BigQuery User\" role." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fad2ba1ad7c3" }, "outputs": [], "source": [ "# Set up BigQuery client\n", "bq_client = bigquery.Client(project=PROJECT_ID)" ] }, { "cell_type": "markdown", "metadata": { "id": "0a2c41bc91a6" }, "source": [ "### Preprocess data and split data\n", "First you should download and preprocess your data for training and testing.\n", "\n", "- Convert categorical features to numeric\n", "- Remove unused columns\n", "- Remove unusable rows\n", "- Split train and test data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "e3a2449cfcf1" }, "outputs": [], "source": [ "LABEL_COLUMN = \"species\"\n", "\n", "# Define the BigQuery source dataset\n", "BQ_SOURCE = \"bigquery-public-data.ml_datasets.penguins\"\n", "\n", "# Define NA values\n", "NA_VALUES = [\"NA\", \".\"]\n", "\n", "# Download a table\n", "table = bq_client.get_table(BQ_SOURCE)\n", "df = bq_client.list_rows(table).to_dataframe()\n", "\n", "# Drop unusable rows\n", "df = df.replace(to_replace=NA_VALUES, value=np.nan).dropna()\n", "\n", "# Convert categorical columns to numeric\n", "df[\"island\"], _ = pd.factorize(df[\"island\"])\n", "df[\"species\"], _ = pd.factorize(df[\"species\"])\n", "df[\"sex\"], _ = pd.factorize(df[\"sex\"])\n", "\n", "# Split into a training and holdout dataset\n", "df_train = df.sample(frac=0.8, random_state=100)\n", "df_holdout = df[~df.index.isin(df_train.index)]" ] }, { "cell_type": "markdown", "metadata": { "id": "a39df4692a70" }, "source": [ "### Create a Vertex AI Tabular dataset from BigQuery dataset\n", "\n", "Create a Vertex AI tabular dataset resource from your BigQuery training data.\n", "\n", "See more info here: https://cloud.google.com/vertex-ai/docs/training/using-managed-datasets" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7fa452ee5c75" }, "outputs": [], "source": [ "# Create BigQuery dataset\n", "bq_dataset_id = f\"{PROJECT_ID}.dataset_id_unique\"\n", "bq_dataset = bigquery.Dataset(bq_dataset_id)\n", "bq_client.create_dataset(bq_dataset, exists_ok=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1d26b159106f" }, "outputs": [], "source": [ "dataset = aiplatform.TabularDataset.create_from_dataframe(\n", " df_source=df_train,\n", " staging_path=f\"bq://{bq_dataset_id}.table-unique\",\n", " display_name=\"sample-penguins\",\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "train_custom_model" }, "source": [ "### Train a model\n", "\n", "There are two ways you can train a model using a container image:\n", "\n", "- **Use a Vertex AI pre-built container**. If you use a pre-built training container, you must additionally specify a Python package to install into the container image. This Python package contains your training code.\n", "\n", "- **Use your own custom container image**. If you use your own container, the container image must contain your training code.\n", "\n", "You will use a pre-built container for this demo." ] }, { "cell_type": "markdown", "metadata": { "id": "train_custom_job_args" }, "source": [ "### Define the command args for the training script\n", "\n", "Prepare the command-line arguments to pass to your training script.\n", "- `args`: The command line arguments to pass to the corresponding Python module. In this example, they are:\n", " - `label_column`: The label column in your data to predict.\n", " - `epochs`: The number of epochs for training.\n", " - `batch_size`: The number of batch size for training." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1npiDcUtlugw" }, "outputs": [], "source": [ "JOB_NAME = \"custom_job_unique\"\n", "\n", "EPOCHS = 20\n", "BATCH_SIZE = 10\n", "\n", "CMDARGS = [\n", " \"--label_column=\" + LABEL_COLUMN,\n", " \"--epochs=\" + str(EPOCHS),\n", " \"--batch_size=\" + str(BATCH_SIZE),\n", "]" ] }, { "cell_type": "markdown", "metadata": { "id": "taskpy_contents" }, "source": [ "#### Training script\n", "\n", "In the next cell, write the contents of the training script, `task.py`. In summary, the script does the following:\n", "\n", "- Loads the data from the BigQuery table using the BigQuery Python client library.\n", "- Builds a model using TF.Keras model API.\n", "- Compiles the model (`compile()`).\n", "- Sets a training distribution strategy according to the argument `args.distribute`.\n", "- Trains the model (`fit()`) with epochs and batch size according to the arguments `args.epochs` and `args.batch_size`\n", "- Gets the directory where to save the model artifacts from the environment variable `AIP_MODEL_DIR`. This variable is [set by the training service](https://cloud.google.com/vertex-ai/docs/training/code-requirements#environment-variables).\n", "- Saves the trained model to the model directory.\n", "\n", "> **_NOTE:_** To improve model performance, it's recommended to normalize your inputs to the model before training. See the TensorFlow tutorial at https://www.tensorflow.org/tutorials/structured_data/preprocessing_layers#numerical_columns for details.\n", "\n", "> **_NOTE:_** The following training code requires you to grant the training account the \"BigQuery Read Session User\" role. See \"https://cloud.google.com/vertex-ai/docs/general/access-control#service-agents for details on how to find this account." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "72rUqXNFlugx" }, "outputs": [], "source": [ "%%writefile task.py\n", "\n", "import argparse\n", "import numpy as np\n", "import os\n", "\n", "import pandas as pd\n", "import tensorflow as tf\n", "\n", "from google.cloud import bigquery\n", "from google.cloud import storage\n", "\n", "# Read environmental variables\n", "training_data_uri = os.getenv(\"AIP_TRAINING_DATA_URI\")\n", "validation_data_uri = os.getenv(\"AIP_VALIDATION_DATA_URI\")\n", "test_data_uri = os.getenv(\"AIP_TEST_DATA_URI\")\n", "\n", "# Read args\n", "parser = argparse.ArgumentParser()\n", "parser.add_argument('--label_column', required=True, type=str)\n", "parser.add_argument('--epochs', default=10, type=int)\n", "parser.add_argument('--batch_size', default=10, type=int)\n", "args = parser.parse_args()\n", "\n", "# Set up training variables\n", "LABEL_COLUMN = args.label_column\n", "\n", "# See https://cloud.google.com/vertex-ai/docs/workbench/managed/executor#explicit-project-selection for issues regarding permissions.\n", "PROJECT_NUMBER = os.environ[\"CLOUD_ML_PROJECT_ID\"]\n", "bq_client = bigquery.Client(project=PROJECT_NUMBER)\n", "\n", "\n", "# Download a table\n", "def download_table(bq_table_uri: str):\n", " # Remove bq:// prefix if present\n", " prefix = \"bq://\"\n", " if bq_table_uri.startswith(prefix):\n", " bq_table_uri = bq_table_uri[len(prefix) :]\n", " \n", " # Download the BigQuery table as a dataframe\n", " # This requires the \"BigQuery Read Session User\" role on the custom training service account.\n", " table = bq_client.get_table(bq_table_uri)\n", " return bq_client.list_rows(table).to_dataframe()\n", "\n", "# Download dataset splits\n", "df_train = download_table(training_data_uri)\n", "df_validation = download_table(validation_data_uri)\n", "df_test = download_table(test_data_uri)\n", "\n", "def convert_dataframe_to_dataset(\n", " df_train: pd.DataFrame,\n", " df_validation: pd.DataFrame,\n", "):\n", " df_train_x, df_train_y = df_train, df_train.pop(LABEL_COLUMN)\n", " df_validation_x, df_validation_y = df_validation, df_validation.pop(LABEL_COLUMN)\n", "\n", " y_train = tf.convert_to_tensor(np.asarray(df_train_y).astype(\"float32\"))\n", " y_validation = tf.convert_to_tensor(np.asarray(df_validation_y).astype(\"float32\"))\n", "\n", " # Convert to numpy representation\n", " x_train = tf.convert_to_tensor(np.asarray(df_train_x).astype(\"float32\"))\n", " x_test = tf.convert_to_tensor(np.asarray(df_validation_x).astype(\"float32\"))\n", "\n", " # Convert to one-hot representation\n", " num_species = len(df_train_y.unique())\n", " y_train = tf.keras.utils.to_categorical(y_train, num_classes=num_species)\n", " y_validation = tf.keras.utils.to_categorical(y_validation, num_classes=num_species)\n", "\n", " dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))\n", " dataset_validation = tf.data.Dataset.from_tensor_slices((x_test, y_validation))\n", " return (dataset_train, dataset_validation)\n", "\n", "# Create datasets\n", "dataset_train, dataset_validation = convert_dataframe_to_dataset(df_train, df_validation)\n", "\n", "# Shuffle train set\n", "dataset_train = dataset_train.shuffle(len(df_train))\n", "\n", "def create_model(num_features):\n", " # Create model\n", " Dense = tf.keras.layers.Dense\n", " model = tf.keras.Sequential(\n", " [\n", " Dense(\n", " 100,\n", " activation=tf.nn.relu,\n", " kernel_initializer=\"uniform\",\n", " input_dim=num_features,\n", " ),\n", " Dense(75, 
activation=tf.nn.relu),\n", " Dense(50, activation=tf.nn.relu), \n", " Dense(25, activation=tf.nn.relu),\n", " Dense(3, activation=tf.nn.softmax),\n", " ]\n", " )\n", " \n", " # Compile Keras model\n", " optimizer = tf.keras.optimizers.RMSprop(lr=0.001)\n", " model.compile(\n", " loss=\"categorical_crossentropy\", metrics=[\"accuracy\"], optimizer=optimizer\n", " )\n", " \n", " return model\n", "\n", "# Create the model\n", "model = create_model(num_features=dataset_train._flat_shapes[0].dims[0].value)\n", "\n", "# Set up datasets\n", "dataset_train = dataset_train.batch(args.batch_size)\n", "dataset_validation = dataset_validation.batch(args.batch_size)\n", "\n", "# Train the model\n", "model.fit(dataset_train, epochs=args.epochs, validation_data=dataset_validation)\n", "\n", "tf.saved_model.save(model, os.getenv(\"AIP_MODEL_DIR\"))" ] }, { "cell_type": "markdown", "metadata": { "id": "train_custom_job" }, "source": [ "### Train the model\n", "\n", "Define your custom `TrainingPipeline` on Vertex AI.\n", "\n", "Use the `CustomTrainingJob` class to define the `TrainingPipeline`. The class takes the following parameters:\n", "\n", "- `display_name`: The user-defined name of this training pipeline.\n", "- `script_path`: The local path to the training script.\n", "- `container_uri`: The URI of the training container image.\n", "- `requirements`: The list of Python package dependencies of the script.\n", "- `model_serving_container_image_uri`: The URI of a container that can serve predictions for your model — either a pre-built container or a custom container.\n", "\n", "Use the `run` function to start training. The function takes the following parameters:\n", "\n", "- `args`: The command line arguments to be passed to the Python script.\n", "- `replica_count`: The number of worker replicas.\n", "- `model_display_name`: The display name of the `Model` if the script produces a managed `Model`.\n", "- `machine_type`: The type of machine to use for training.\n", "- `accelerator_type`: The hardware accelerator type.\n", "- `accelerator_count`: The number of accelerators to attach to a worker replica.\n", "\n", "The `run` function creates a training pipeline that trains and creates a `Model` object. After the training pipeline completes, the `run` function returns the `Model` object." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mxIxvDdglugx" }, "outputs": [], "source": [ "job = aiplatform.CustomTrainingJob(\n", " display_name=JOB_NAME,\n", " script_path=\"task.py\",\n", " container_uri=\"us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest\",\n", " requirements=[\"google-cloud-bigquery[pandas]\", \"protobuf<3.20.0\"],\n", " model_serving_container_image_uri=\"us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest\",\n", ")\n", "\n", "MODEL_DISPLAY_NAME = \"penguins_model_unique\"\n", "\n", "# Start the training\n", "model = job.run(\n", " dataset=dataset,\n", " model_display_name=MODEL_DISPLAY_NAME,\n", " bigquery_destination=f\"bq://{PROJECT_ID}\",\n", " args=CMDARGS,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "deploy_model:dedicated" }, "source": [ "### Deploy the model\n", "\n", "Before you use your model to make predictions, you must deploy it to an `Endpoint`. You can do this by calling the `deploy` function on the `Model` resource. This does two things:\n", "\n", "1. Create an `Endpoint` resource for deploying the `Model` resource to.\n", "2. 
Deploy the `Model` resource to the `Endpoint` resource.\n", "\n", "\n", "The function takes the following parameters:\n", "\n", "- `deployed_model_display_name`: A human-readable name for the deployed model.\n", "- `traffic_split`: Percent of traffic at the endpoint that goes to this model, which is specified as a dictionary of one or more key/value pairs.\n", " - If only one model, then specify `{ \"0\": 100 }`, where \"0\" refers to this model being uploaded and 100 means 100% of the traffic.\n", " - If there are existing models on the endpoint, for which the traffic is split, then use `model_id` to specify `{ \"0\": percent, model_id: percent, ... }`, where `model_id` is the ID of an existing `DeployedModel` on the endpoint. The percentages must add up to 100.\n", "- `machine_type`: The type of machine to use for serving predictions.\n", "- `accelerator_type`: The hardware accelerator type.\n", "- `accelerator_count`: The number of accelerators to attach to each serving replica.\n", "- `min_replica_count`: The minimum number of compute instances to provision.\n", "- `max_replica_count`: The maximum number of compute instances to scale to. In this tutorial, only one instance is provisioned.\n", "\n", "### Traffic split\n", "\n", "The `traffic_split` parameter is specified as a Python dictionary. You can deploy more than one instance of your model to an endpoint, and then set the percentage of traffic that goes to each instance.\n", "\n", "You can use a traffic split to introduce a new model gradually into production. For example, if you had one existing model in production with 100% of the traffic, you could deploy a new model to the same endpoint, direct 10% of traffic to it, and reduce the original model's traffic to 90%. This allows you to monitor the new model's performance while minimizing the disruption to the majority of users. A hedged example of a `deploy` call with an explicit traffic split is sketched after the deploy step below.\n", "\n", "### Compute instance scaling\n", "\n", "You can specify a single instance (or node) to serve your online prediction requests. This tutorial uses a single node, so the deploy call below keeps the defaults of `min_replica_count=1` and `max_replica_count=1`.\n", "\n", "If you want to use multiple nodes to serve your online prediction requests, set `max_replica_count` to the maximum number of nodes you want to use. Vertex AI autoscales the number of nodes used to serve your predictions, up to the maximum number you set. Refer to the [pricing page](https://cloud.google.com/vertex-ai/pricing#prediction-prices) to understand the costs of autoscaling with multiple nodes.\n", "\n", "### Endpoint\n", "\n", "The `deploy` method waits until the model is deployed and eventually returns an `Endpoint` object. If this is the first time a model is deployed to the endpoint, it may take a few additional minutes to complete provisioning of resources." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "WMH7GrYMlugy" }, "outputs": [], "source": [ "DEPLOYED_NAME = \"penguins_deployed_unique\"\n", "\n", "endpoint = model.deploy(deployed_model_display_name=DEPLOYED_NAME)" ] }, { "cell_type": "markdown", "metadata": { "id": "make_prediction" }, "source": [ "### Make an online prediction request\n", "\n", "Send an online prediction request to your deployed model."
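, "\n", "Before sending requests, here is the hedged sketch mentioned earlier: how the `deploy` call above could be written with an explicit traffic split, machine type, and replica counts. It is not executed in this tutorial, and all values are illustrative.\n", "\n", "```python\n", "endpoint = model.deploy(\n", "    deployed_model_display_name=DEPLOYED_NAME,\n", "    traffic_split={\"0\": 100},      # send all traffic to this newly deployed model\n", "    machine_type=\"n1-standard-4\",  # illustrative machine type\n", "    min_replica_count=1,           # provision a single node\n", "    max_replica_count=1,           # don't autoscale beyond one node\n", ")\n", "```"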
] }, { "cell_type": "markdown", "metadata": { "id": "get_test_item:test" }, "source": [ "### Prepare test data\n", "\n", "Prepare test data by convert it to a Python list" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "67aeea91384a" }, "outputs": [], "source": [ "df_holdout_y = df_holdout.pop(LABEL_COLUMN)\n", "df_holdout_x = df_holdout\n", "\n", "# Convert to list representation\n", "holdout_x = np.array(df_holdout_x).tolist()\n", "holdout_y = np.array(df_holdout_y).astype(\"float32\").tolist()" ] }, { "cell_type": "markdown", "metadata": { "id": "send_prediction_request:image" }, "source": [ "### Send the prediction request\n", "\n", "Now that you have test data, you can use it to send a prediction request. Use the `Endpoint` object's `predict` function, which takes the following parameters:\n", "\n", "- `instances`: A list of penguin measurement instances. According to your custom model, each instance should be an array of numbers. You prepared this list in the previous step.\n", "\n", "The `predict` function returns a list, where each element in the list corresponds to the an instance in the request. In the output for each prediction, you see the following:\n", "\n", "- Confidence level for the prediction (`predictions`), between 0 and 1, for each of the ten classes.\n", "\n", "You can then run a quick evaluation on the prediction results:\n", "1. `np.argmax`: Convert each list of confidence levels to a label\n", "2. Print predictions" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6e20473b09f5" }, "outputs": [], "source": [ "predictions = endpoint.predict(instances=holdout_x)\n", "y_predicted = np.argmax(predictions.predictions, axis=1)\n", "\n", "y_predicted" ] }, { "cell_type": "markdown", "metadata": { "id": "undeploy_model" }, "source": [ "### Undeploy models\n", "\n", "To undeploy all `Model` resources from the serving `Endpoint` resource, use the endpoint's `undeploy_all` method." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "khPSAO1tlug0" }, "outputs": [], "source": [ "endpoint.undeploy_all()" ] }, { "cell_type": "markdown", "metadata": { "id": "cleanup:custom" }, "source": [ "## Cleaning up\n", "\n", "To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud\n", "project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.\n", "\n", "Otherwise, you can delete the individual resources you created in this tutorial:\n", "\n", "- Training Job\n", "- Model\n", "- Endpoint\n", "- Cloud Storage Bucket" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NNmebHf7lug0" }, "outputs": [], "source": [ "# Delete the training job\n", "job.delete()\n", "\n", "# Delete the model\n", "model.delete()\n", "\n", "# Delete the endpoint\n", "endpoint.delete()\n", "\n", "# Warning: Setting this to true deletes everything in your bucket\n", "delete_bucket = True\n", "\n", "if delete_bucket:\n", " ! 
gsutil rm -r $BUCKET_URI" ] } ], "metadata": { "colab": { "collapsed_sections": [ "overview:custom", "objective:custom,training,online_prediction", "dataset:custom,cifar10,icn", "costs", "7c163842eabd", "accelerators:training,prediction", "container:training,prediction", "machine:training,prediction", "59f24e7d2269", "5c7732822757", "train_custom_model", "train_custom_job_args", "taskpy_contents", "train_custom_job", "deploy_model:dedicated", "make_prediction", "get_test_item:test", "send_prediction_request:image", "undeploy_model", "cleanup:custom" ], "name": "custom-tabular-bq-managed-dataset.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }