{
"cells": [
{
"cell_type": "markdown",
"id": "9d254d99",
"metadata": {
"id": "9d254d99"
},
"source": [
"This notebook shows how to deploy a vision model from 🤗 Transformers (written in TensorFlow) to [Vertex AI](https://cloud.google.com/vertex-ai). This is beneficial in many ways:\n",
"\n",
"* Vertex AI provides support for autoscaling, authorization, and authentication out of the box.\n",
"* One can maintain multiple versions of a model and can control the traffic split very easily. \n",
"* Purely serverless. \n",
"\n",
"This notebook uses code from [this official GCP example](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/vertex_endpoints/optimized_tensorflow_runtime/bert_optimized_online_prediction.ipynb).\n",
"\n",
"This tutorial uses the following billable components of Google Cloud:\n",
"\n",
"* Vertex AI\n",
"* Cloud Storage\n",
"\n",
"Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
]
},
{
"cell_type": "markdown",
"id": "04188740",
"metadata": {
"id": "04188740"
},
"source": [
"## Initial setup"
]
},
{
"cell_type": "markdown",
"id": "ILO4xrYhsjqG",
"metadata": {
"id": "ILO4xrYhsjqG"
},
"source": [
"First authenticate yourself to provide Colab access to your GCP resources. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "LRsR-9UmsotL",
"metadata": {
"id": "LRsR-9UmsotL"
},
"outputs": [],
"source": [
"from google.colab import auth\n",
"auth.authenticate_user()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a5b62bd-68a3-429d-b4c9-c70588ebb7a5",
"metadata": {
"id": "5a5b62bd-68a3-429d-b4c9-c70588ebb7a5"
},
"outputs": [],
"source": [
"# Storage bucket\n",
"GCS_BUCKET = \"gs://[GCS-BUCKET-NAME]\"\n",
"REGION = \"us-central1\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9309446-1c97-4b2e-9b1c-ca0f4ead7c34",
"metadata": {
"id": "c9309446-1c97-4b2e-9b1c-ca0f4ead7c34",
"outputId": "b98fe6fe-01cc-4518-baa4-421c61560e8b"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Creating gs://hf-tf-vision/...\n"
]
}
],
"source": [
"!gsutil mb -l $REGION $GCS_BUCKET"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "157f1ffe",
"metadata": {
"id": "157f1ffe"
},
"outputs": [],
"source": [
"# Install Vertex AI SDK and transformers\n",
"!pip install --upgrade google-cloud-aiplatform transformers -q"
]
},
{
"cell_type": "markdown",
"id": "aba67801",
"metadata": {
"id": "aba67801"
},
"source": [
"## Initial imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "354874fb-98e8-4fd7-891e-4c9ce1e6f75f",
"metadata": {
"id": "354874fb-98e8-4fd7-891e-4c9ce1e6f75f",
"outputId": "7cf77f4b-fed9-4fdf-86c3-fa3295951142"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2022-07-17 05:01:47.593465: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.\n"
]
}
],
"source": [
"from transformers import ViTImageProcessor, TFViTForImageClassification\n",
"import tensorflow as tf\n",
"import tempfile\n",
"import requests\n",
"import base64\n",
"import json\n",
"import os"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "57bd7259-5a06-40d3-9ef6-37e7380ca42e",
"metadata": {
"id": "57bd7259-5a06-40d3-9ef6-37e7380ca42e",
"outputId": "870b492b-6096-462f-8a41-2b4566b551ff"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.9.0-rc2\n",
"4.20.1\n"
]
}
],
"source": [
"import transformers\n",
"\n",
"print(tf.__version__)\n",
"print(transformers.__version__)"
]
},
{
"cell_type": "markdown",
"id": "dad72899",
"metadata": {
"id": "dad72899"
},
"source": [
"## Save the model locally\n",
"\n",
"We will work with a [Vision Transformer B-16 model provided by 🤗 Transformers](https://huggingface.co/docs/transformers/main/en/model_doc/vit). We will first initialize it, load the model weights, and then save it locally as a [SavedModel](https://www.tensorflow.org/guide/saved_model) resource. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a8f1ba75-a1f7-4ada-91bc-c3d07ec146f0",
"metadata": {
"id": "a8f1ba75-a1f7-4ada-91bc-c3d07ec146f0",
"outputId": "b9dd7399-f575-48a5-c1ab-fb6d916a8b5f"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"All model checkpoint layers were used when initializing TFViTForImageClassification.\n",
"\n",
"All the layers of TFViTForImageClassification were initialized from the model checkpoint at google/vit-base-patch16-224.\n",
"If your task is similar to the task the model of the checkpoint was trained on, you can already use TFViTForImageClassification for predictions without further training.\n",
"WARNING:absl:Found untraced functions such as embeddings_layer_call_fn, embeddings_layer_call_and_return_conditional_losses, encoder_layer_call_fn, encoder_layer_call_and_return_conditional_losses, layernorm_layer_call_fn while saving (showing 5 of 421). These functions will not be directly callable after loading.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO:tensorflow:Assets written to: vit/saved_model/1/assets\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:tensorflow:Assets written to: vit/saved_model/1/assets\n"
]
}
],
"source": [
"# the saved_model parameter is a flag to create a saved model version of the model\n",
"LOCAL_MODEL_DIR = \"vit\"\n",
"model = TFViTForImageClassification.from_pretrained(\"google/vit-base-patch16-224\")\n",
"model.save_pretrained(LOCAL_MODEL_DIR, saved_model=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "215a0d0f-45e6-438c-b02f-152427da958b",
"metadata": {
"id": "215a0d0f-45e6-438c-b02f-152427da958b",
"outputId": "0830d1bf-f4cb-407a-8a59-3b234815d893"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:\n",
"\n",
"signature_def['__saved_model_init_op']:\n",
" The given SavedModel SignatureDef contains the following input(s):\n",
" The given SavedModel SignatureDef contains the following output(s):\n",
" outputs['__saved_model_init_op'] tensor_info:\n",
" dtype: DT_INVALID\n",
" shape: unknown_rank\n",
" name: NoOp\n",
" Method name is: \n",
"\n",
"signature_def['serving_default']:\n",
" The given SavedModel SignatureDef contains the following input(s):\n",
" inputs['pixel_values'] tensor_info:\n",
" dtype: DT_FLOAT\n",
" shape: (-1, -1, -1, -1)\n",
" name: serving_default_pixel_values:0\n",
" The given SavedModel SignatureDef contains the following output(s):\n",
" outputs['logits'] tensor_info:\n",
" dtype: DT_FLOAT\n",
" shape: (-1, 1000)\n",
" name: StatefulPartitionedCall:0\n",
" Method name is: tensorflow/serving/predict\n",
"\n",
"Concrete Functions:\n",
" Function Name: '__call__'\n",
" Option #1\n",
" Callable with:\n",
" Argument #1\n",
" DType: dict\n",
" Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values/pixel_values')}\n",
" Argument #2\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #3\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #4\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #5\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #6\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #7\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #8\n",
" DType: bool\n",
" Value: True\n",
" Option #2\n",
" Callable with:\n",
" Argument #1\n",
" DType: dict\n",
" Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values/pixel_values')}\n",
" Argument #2\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #3\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #4\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #5\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #6\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #7\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #8\n",
" DType: bool\n",
" Value: False\n",
"\n",
" Function Name: '_default_save_signature'\n",
" Option #1\n",
" Callable with:\n",
" Argument #1\n",
" DType: dict\n",
" Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values')}\n",
"\n",
" Function Name: 'call_and_return_all_conditional_losses'\n",
" Option #1\n",
" Callable with:\n",
" Argument #1\n",
" DType: dict\n",
" Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values/pixel_values')}\n",
" Argument #2\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #3\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #4\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #5\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #6\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #7\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #8\n",
" DType: bool\n",
" Value: False\n",
" Option #2\n",
" Callable with:\n",
" Argument #1\n",
" DType: dict\n",
" Value: {'pixel_values': TensorSpec(shape=(None, 3, 224, 224), dtype=tf.float32, name='pixel_values/pixel_values')}\n",
" Argument #2\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #3\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #4\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #5\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #6\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #7\n",
" DType: NoneType\n",
" Value: None\n",
" Argument #8\n",
" DType: bool\n",
" Value: True\n",
"\n",
" Function Name: 'serving'\n",
" Option #1\n",
" Callable with:\n",
" Argument #1\n",
" DType: dict\n",
" Value: {'pixel_values': TensorSpec(shape=(None, None, None, None), dtype=tf.float32, name='pixel_values')}\n"
]
}
],
"source": [
"# Inspect the input and output signatures of the model\n",
"!saved_model_cli show --dir {LOCAL_MODEL_DIR}/saved_model/1 --all"
]
},
{
"cell_type": "markdown",
"id": "e7ae5a29",
"metadata": {
"id": "e7ae5a29"
},
"source": [
"## Embedding pre and post processing ops inside the model\n",
"\n",
"ML models usually require some pre and post processing of the input data and predicted results. So, it's a good idea to ship an ML model that already has these supports. It also helps in reducing training/serving skew. \n",
"\n",
"For our model we need:\n",
"\n",
"* Data normalization, resizing, and transposition as the preprocessing ops.\n",
"* Mapping the predicted logits to ImageNet-1k classes as the post-processing ops. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4862e37-9701-4782-9da1-e4fac4dd9d32",
"metadata": {
"id": "c4862e37-9701-4782-9da1-e4fac4dd9d32",
"outputId": "7d4426a1-c942-45af-8e14-1dfc8a2d9eb1"
},
"outputs": [
{
"data": {
"text/plain": [
"ViTFeatureExtractor {\n",
" \"do_normalize\": true,\n",
" \"do_resize\": true,\n",
" \"feature_extractor_type\": \"ViTFeatureExtractor\",\n",
" \"image_mean\": [\n",
" 0.5,\n",
" 0.5,\n",
" 0.5\n",
" ],\n",
" \"image_std\": [\n",
" 0.5,\n",
" 0.5,\n",
" 0.5\n",
" ],\n",
" \"resample\": 2,\n",
" \"size\": 224\n",
"}"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"processor = ViTImageProcessor()\n",
"processor"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0bfc1d8f-94f2-4b4e-b117-9918fa25a233",
"metadata": {
"id": "0bfc1d8f-94f2-4b4e-b117-9918fa25a233"
},
"outputs": [],
"source": [
"CONCRETE_INPUT = \"pixel_values\"\n",
"SIZE = processor.size[\"height\"]\n",
"INPUT_SHAPE = (SIZE, SIZE, 3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "06843d8c-db3e-4650-8358-f17b265818f9",
"metadata": {
"id": "06843d8c-db3e-4650-8358-f17b265818f9"
},
"outputs": [],
"source": [
"def normalize_img(img, mean=processor.image_mean, std=processor.image_std):\n",
" # Scale to the value range of [0, 1] first and then normalize.\n",
" img = img / 255\n",
" mean = tf.constant(mean)\n",
" std = tf.constant(std)\n",
" return (img - mean) / std\n",
"\n",
"def preprocess(string_input):\n",
" decoded = tf.io.decode_jpeg(string_input, channels=3)\n",
" resized = tf.image.resize(decoded, size=(SIZE, SIZE))\n",
" normalized = normalize_img(resized)\n",
" normalized = tf.transpose(normalized, (2, 0, 1)) # Since HF models are channel-first.\n",
" return normalized\n",
"\n",
"\n",
"@tf.function(input_signature=[tf.TensorSpec([None], tf.string)])\n",
"def preprocess_fn(string_input):\n",
" decoded_images = tf.map_fn(\n",
" preprocess, string_input, fn_output_signature=tf.float32,\n",
" )\n",
" return {CONCRETE_INPUT: decoded_images}\n",
"\n",
"\n",
"def model_exporter(model: tf.keras.Model):\n",
" m_call = tf.function(model.call).get_concrete_function(\n",
" tf.TensorSpec(\n",
" shape=[None, 3, SIZE, SIZE], dtype=tf.float32, name=CONCRETE_INPUT\n",
" )\n",
" )\n",
"\n",
" @tf.function(input_signature=[tf.TensorSpec([None], tf.string)])\n",
" def serving_fn(string_input):\n",
" labels = tf.constant(\n",
" list(model.config.id2label.values()), dtype=tf.string\n",
" )\n",
" images = preprocess_fn(string_input)\n",
"\n",
" predictions = m_call(**images)\n",
" indices = tf.argmax(predictions.logits, axis=1)\n",
" pred_source = tf.gather(params=labels, indices=indices)\n",
" probs = tf.nn.softmax(predictions.logits, axis=1)\n",
" pred_confidence = tf.reduce_max(probs, axis=1)\n",
" return {\"label\": pred_source, \"confidence\": pred_confidence}\n",
"\n",
" return serving_fn"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "359fca61-995f-451b-a89c-d2fbf493e6cf",
"metadata": {
"id": "359fca61-995f-451b-a89c-d2fbf493e6cf",
"outputId": "cb03ebfb-509d-464c-f7d2-3e504fb72444"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py:458: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with back_prop=False is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"back_prop=False is deprecated. Consider using tf.stop_gradient instead.\n",
"Instead of:\n",
"results = tf.map_fn(fn, elems, back_prop=False)\n",
"Use:\n",
"results = tf.nest.map_structure(tf.stop_gradient, tf.map_fn(fn, elems))\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py:458: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with back_prop=False is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"back_prop=False is deprecated. Consider using tf.stop_gradient instead.\n",
"Instead of:\n",
"results = tf.map_fn(fn, elems, back_prop=False)\n",
"Use:\n",
"results = tf.nest.map_structure(tf.stop_gradient, tf.map_fn(fn, elems))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py:629: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Use fn_output_signature instead\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py:629: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Use fn_output_signature instead\n",
"WARNING:absl:Found untraced functions such as embeddings_layer_call_fn, embeddings_layer_call_and_return_conditional_losses, encoder_layer_call_fn, encoder_layer_call_and_return_conditional_losses, layernorm_layer_call_fn while saving (showing 5 of 421). These functions will not be directly callable after loading.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO:tensorflow:Assets written to: gs://hf-tf-vision/vit/assets\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:tensorflow:Assets written to: gs://hf-tf-vision/vit/assets\n"
]
}
],
"source": [
"# To deploy the model on Vertex AI we must have the model in a storage bucket.\n",
"tf.saved_model.save(\n",
" model,\n",
" os.path.join(GCS_BUCKET, LOCAL_MODEL_DIR),\n",
" signatures={\"serving_default\": model_exporter(model)},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "a60d5bb7",
"metadata": {
"id": "a60d5bb7"
},
"source": [
"**Notes on making the model accept string inputs**:\n",
"\n",
"When dealing with images via REST or gRPC requests the size of the request payload can easily spiral up depending on the resolution of the images being passed. This is why, it is good practice to compress them reliably and then prepare the request payload."
]
},
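{
"cell_type": "markdown",
"id": "c0ffee01",
"metadata": {
"id": "c0ffee01"
},
"source": [
"As an optional sanity check, the cell below sketches how the string-input serving function can be exercised locally before relying on the deployed endpoint. It reuses the sample COCO image that also appears later in the prediction section; any JPEG would do."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0ffee02",
"metadata": {
"id": "c0ffee02"
},
"outputs": [],
"source": [
"# Optional: exercise the string-input serving function locally (sketch).\n",
"# The image URL is the same sample used later in the prediction section.\n",
"sample_path = tf.keras.utils.get_file(\n",
"    \"sample.jpg\", \"http://images.cocodataset.org/val2017/000000039769.jpg\"\n",
")\n",
"sample_bytes = tf.io.read_file(sample_path)  # raw JPEG bytes as a scalar string tensor\n",
"\n",
"local_serving_fn = model_exporter(model)\n",
"# The serving signature expects a batch of strings, hence the expand_dims.\n",
"print(local_serving_fn(tf.expand_dims(sample_bytes, 0)))"
]
},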
{
"cell_type": "markdown",
"id": "d5a46991",
"metadata": {
"id": "d5a46991"
},
"source": [
"## Deployment on Vertex AI\n",
"\n",
"[This resource](https://cloud.google.com/vertex-ai/docs/general/general-concepts) shows some relevant concepts on Vertex AI. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5edb508a-80f4-4652-947b-f0d95b9a8ce3",
"metadata": {
"id": "5edb508a-80f4-4652-947b-f0d95b9a8ce3"
},
"outputs": [],
"source": [
"from google.cloud.aiplatform import gapic as aip"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f8edaee9-1068-4ffb-bfc1-9e63a39c8bc5",
"metadata": {
"id": "f8edaee9-1068-4ffb-bfc1-9e63a39c8bc5"
},
"outputs": [],
"source": [
"# Deployment hardware\n",
"DEPLOY_COMPUTE = \"n1-standard-8\"\n",
"DEPLOY_GPU = aip.AcceleratorType.NVIDIA_TESLA_T4\n",
"PROJECT_ID = \"GCP-PROJECT-ID\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4e8d6fcd-2e63-4e39-9c2e-2047308be3f0",
"metadata": {
"id": "4e8d6fcd-2e63-4e39-9c2e-2047308be3f0"
},
"outputs": [],
"source": [
"# Initialize clients.\n",
"API_ENDPOINT = f\"{REGION}-aiplatform.googleapis.com\"\n",
"PARENT = f\"projects/{PROJECT_ID}/locations/{REGION}\"\n",
"\n",
"client_options = {\"api_endpoint\": API_ENDPOINT}\n",
"model_service_client = aip.ModelServiceClient(client_options=client_options)\n",
"endpoint_service_client = aip.EndpointServiceClient(client_options=client_options)\n",
"prediction_service_client = aip.PredictionServiceClient(client_options=client_options)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d5fb346f-bfce-43e7-9c0c-5463709d8c6f",
"metadata": {
"id": "d5fb346f-bfce-43e7-9c0c-5463709d8c6f",
"outputId": "0c731116-f6ea-4a24-b04b-5fd5c34f083b"
},
"outputs": [
{
"data": {
"text/plain": [
"'projects/29880397572/locations/us-central1/models/7235960789184544768'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Upload the model to Vertex AI. \n",
"tf28_gpu_model_dict = {\n",
" \"display_name\": \"ViT Base TF2.8 GPU model\",\n",
" \"artifact_uri\": f\"{GCS_BUCKET}/{LOCAL_MODEL_DIR}\",\n",
" \"container_spec\": {\n",
" \"image_uri\": \"us-docker.pkg.dev/vertex-ai/prediction/tf2-gpu.2-8:latest\",\n",
" },\n",
"}\n",
"tf28_gpu_model = (\n",
" model_service_client.upload_model(parent=PARENT, model=tf28_gpu_model_dict)\n",
" .result(timeout=180)\n",
" .model\n",
")\n",
"tf28_gpu_model"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a0dc3ef-6609-4374-b934-75c382f7fc6d",
"metadata": {
"id": "5a0dc3ef-6609-4374-b934-75c382f7fc6d",
"outputId": "e15119e4-1939-4ee9-d618-2d2ad381b21c"
},
"outputs": [
{
"data": {
"text/plain": [
"'projects/29880397572/locations/us-central1/endpoints/7116769330687115264'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create an Endpoint for the model.\n",
"tf28_gpu_endpoint_dict = {\n",
" \"display_name\": \"ViT Base TF2.8 GPU endpoint\",\n",
"}\n",
"tf28_gpu_endpoint = (\n",
" endpoint_service_client.create_endpoint(\n",
" parent=PARENT, endpoint=tf28_gpu_endpoint_dict\n",
" )\n",
" .result(timeout=300)\n",
" .name\n",
")\n",
"tf28_gpu_endpoint"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6a5a98af-720e-4854-a97e-76cd02ed5f7a",
"metadata": {
"id": "6a5a98af-720e-4854-a97e-76cd02ed5f7a",
"outputId": "12eda73f-ce65-4981-ee24-f086600164d7"
},
"outputs": [
{
"data": {
"text/plain": [
"deployed_model {\n",
" id: \"5163311002082607104\"\n",
"}"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Deploy the Endpoint. \n",
"tf28_gpu_deployed_model_dict = {\n",
" \"model\": tf28_gpu_model,\n",
" \"display_name\": \"ViT Base TF2.8 GPU deployed model\",\n",
" \"dedicated_resources\": {\n",
" \"min_replica_count\": 1,\n",
" \"max_replica_count\": 1,\n",
" \"machine_spec\": {\n",
" \"machine_type\": DEPLOY_COMPUTE,\n",
" \"accelerator_type\": DEPLOY_GPU,\n",
" \"accelerator_count\": 1,\n",
" },\n",
" },\n",
"}\n",
"\n",
"tf28_gpu_deployed_model = endpoint_service_client.deploy_model(\n",
" endpoint=tf28_gpu_endpoint,\n",
" deployed_model=tf28_gpu_deployed_model_dict,\n",
" traffic_split={\"0\": 100},\n",
").result()\n",
"tf28_gpu_deployed_model"
]
},
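{
"cell_type": "markdown",
"id": "c0ffee03",
"metadata": {
"id": "c0ffee03"
},
"source": [
"The `traffic_split={\"0\": 100}` argument above routes all traffic to the newly deployed model; the key `\"0\"` refers to the model being deployed in the request. If a second model version were later deployed to the same Endpoint, traffic could be shared between the two. The next cell is only a sketch of such a call: `deployed_model_v2_dict` is a hypothetical spec for that second version and is not defined in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0ffee04",
"metadata": {
"id": "c0ffee04"
},
"outputs": [],
"source": [
"# Sketch: how traffic could be shared if a second model version were deployed\n",
"# to the same Endpoint. `deployed_model_v2_dict` is hypothetical and not defined\n",
"# in this notebook; defining this function has no side effects.\n",
"def deploy_second_version_sketch(deployed_model_v2_dict):\n",
"    return endpoint_service_client.deploy_model(\n",
"        endpoint=tf28_gpu_endpoint,\n",
"        deployed_model=deployed_model_v2_dict,\n",
"        traffic_split={\n",
"            \"0\": 20,  # \"0\" refers to the model being deployed in this request\n",
"            tf28_gpu_deployed_model.deployed_model.id: 80,  # the existing deployment\n",
"        },\n",
"    ).result()"
]
},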
{
"cell_type": "markdown",
"id": "3627b87c",
"metadata": {
"id": "3627b87c"
},
"source": [
"## Make a prediction request"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "03866d77-27b6-4357-9912-72ea37937828",
"metadata": {
"id": "03866d77-27b6-4357-9912-72ea37937828"
},
"outputs": [],
"source": [
"# Generate sample data. \n",
"import base64\n",
"\n",
"image_path = tf.keras.utils.get_file(\n",
" \"image.jpg\", \"http://images.cocodataset.org/val2017/000000039769.jpg\"\n",
")\n",
"bytes = tf.io.read_file(image_path)\n",
"b64str = base64.b64encode(bytes.numpy()).decode(\"utf-8\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "698c6d9e-9373-48e8-b6ce-ed5a20910a9c",
"metadata": {
"id": "698c6d9e-9373-48e8-b6ce-ed5a20910a9c",
"outputId": "e7e52f8d-4927-49b3-d099-8c26267a3690"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Serving function input: string_input\n"
]
}
],
"source": [
"# Model input signature key name.\n",
"pushed_model_location = os.path.join(GCS_BUCKET, LOCAL_MODEL_DIR)\n",
"loaded = tf.saved_model.load(pushed_model_location)\n",
"serving_input = list(\n",
" loaded.signatures[\"serving_default\"].structured_input_signature[1].keys()\n",
")[0]\n",
"print(\"Serving function input:\", serving_input)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7479eba-7e2f-4dd9-a91f-64ec3f0c7341",
"metadata": {
"id": "b7479eba-7e2f-4dd9-a91f-64ec3f0c7341",
"outputId": "93ae9750-ed6d-4071-cd94-2bd33eabe34d"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"predictions {\n",
" struct_value {\n",
" fields {\n",
" key: \"confidence\"\n",
" value {\n",
" number_value: 0.896659553\n",
" }\n",
" }\n",
" fields {\n",
" key: \"label\"\n",
" value {\n",
" string_value: \"Egyptian cat\"\n",
" }\n",
" }\n",
" }\n",
"}\n",
"deployed_model_id: \"5163311002082607104\"\n",
"model: \"projects/29880397572/locations/us-central1/models/7235960789184544768\"\n",
"model_display_name: \"ViT Base TF2.8 GPU model\"\n",
"\n"
]
}
],
"source": [
"from google.protobuf import json_format\n",
"from google.protobuf.struct_pb2 import Value\n",
"\n",
"\n",
"def predict_image(image, endpoint, serving_input):\n",
" # The format of each instance should conform to the deployed model's prediction input schema.\n",
" instances_list = [{serving_input: {\"b64\": image}}]\n",
" instances = [json_format.ParseDict(s, Value()) for s in instances_list]\n",
"\n",
" print(\n",
" prediction_service_client.predict(\n",
" endpoint=endpoint,\n",
" instances=instances,\n",
" )\n",
" )\n",
"\n",
"\n",
"predict_image(b64str, tf28_gpu_endpoint, serving_input)"
]
},
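{
"cell_type": "markdown",
"id": "c0ffee05",
"metadata": {
"id": "c0ffee05"
},
"source": [
"The same prediction can also be made over plain REST, which is useful for clients that don't use the Python SDK. The cell below is a sketch: it reuses the `requests` and `json` imports from earlier and assumes `gcloud` is authenticated in this environment so an access token can be fetched."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0ffee06",
"metadata": {
"id": "c0ffee06"
},
"outputs": [],
"source": [
"# Sketch: the same prediction via the Vertex AI REST API.\n",
"# Assumes gcloud is configured and authenticated for this project.\n",
"access_token = !gcloud auth print-access-token\n",
"\n",
"rest_url = f\"https://{API_ENDPOINT}/v1/{tf28_gpu_endpoint}:predict\"\n",
"headers = {\"Authorization\": f\"Bearer {access_token[0]}\"}\n",
"payload = {\"instances\": [{serving_input: {\"b64\": b64str}}]}\n",
"\n",
"response = requests.post(rest_url, headers=headers, json=payload)\n",
"print(json.dumps(response.json(), indent=2))"
]
},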
{
"cell_type": "markdown",
"id": "791fcd11",
"metadata": {
"id": "791fcd11"
},
"source": [
"## Cleaning up of resources"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bf18cd91-29af-4bc3-8cb9-ded65aba40e6",
"metadata": {
"id": "bf18cd91-29af-4bc3-8cb9-ded65aba40e6",
"outputId": "07ee752c-df18-4861-a2cf-3be50f9f1629"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"running undeploy_model operation: projects/29880397572/locations/us-central1/endpoints/7116769330687115264/operations/6837774371172384768\n",
"\n",
"running delete_endpoint operation: projects/29880397572/locations/us-central1/operations/7182299742666227712\n",
"\n",
"running delete_model operation: projects/29880397572/locations/us-central1/operations/1269073431928766464\n",
"\n"
]
}
],
"source": [
"def cleanup(endpoint, model_name, deployed_model_id):\n",
" response = endpoint_service_client.undeploy_model(\n",
" endpoint=endpoint, deployed_model_id=deployed_model_id\n",
" )\n",
" print(\"running undeploy_model operation:\", response.operation.name)\n",
" print(response.result())\n",
"\n",
" response = endpoint_service_client.delete_endpoint(name=endpoint)\n",
" print(\"running delete_endpoint operation:\", response.operation.name)\n",
" print(response.result())\n",
"\n",
" response = model_service_client.delete_model(name=model_name)\n",
" print(\"running delete_model operation:\", response.operation.name)\n",
" print(response.result())\n",
"\n",
"\n",
"cleanup(tf28_gpu_endpoint, tf28_gpu_model, tf28_gpu_deployed_model.deployed_model.id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d33a0d9f-4f56-408f-8fb5-d7e327b66029",
"metadata": {
"id": "d33a0d9f-4f56-408f-8fb5-d7e327b66029",
"outputId": "2c5358d4-61ab-4b4f-8fdc-80d0b09f220c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Removing gs://hf-tf-vision/vit/#1658034189039614...\n",
"Removing gs://hf-tf-vision/vit/assets/#1658034196731689... \n",
"Removing gs://hf-tf-vision/vit/saved_model.pb#1658034197598270... \n",
"Removing gs://hf-tf-vision/vit/variables/#1658034189325867... \n",
"/ [4 objects] \n",
"==> NOTE: You are performing a sequence of gsutil operations that may\n",
"run significantly faster if you instead use gsutil -m rm ... Please\n",
"see the -m section under \"gsutil help options\" for further information\n",
"about when gsutil -m can be advantageous.\n",
"\n",
"Removing gs://hf-tf-vision/vit/variables/variables.data-00000-of-00001#1658034195624888...\n",
"Removing gs://hf-tf-vision/vit/variables/variables.index#1658034195904828... \n",
"/ [6 objects] \n",
"Operation completed over 6 objects. \n",
"Removing gs://hf-tf-vision/...\n"
]
}
],
"source": [
"!gsutil rm -r $GCS_BUCKET"
]
}
],
"metadata": {
"colab": {
"name": "112_vertex_ai_vision.ipynb",
"provenance": []
},
"environment": {
"kernel": "python3",
"name": "tf2-gpu.2-9.m94",
"type": "gcloud",
"uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-9:m94"
},
"kernelspec": {
"display_name": "Python 3.8.2 ('tf')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
},
"vscode": {
"interpreter": {
"hash": "d034a732bc78a5fefd0ed32463b4bbf75d8bf7890a983f59eb8bc475a23d68ab"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}