notebooks/official/generative_ai/qodo_intro.ipynb (984 lines of code) (raw):

{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "9A9NkTRTfo2I" }, "outputs": [], "source": [ "# Copyright 2024 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "b9f8293b0643" }, "source": [ "# Getting Started with qodo Models\n", "\n", "\n", "<table align=\"left\">\n", " <td style=\"text-align: center\">\n", " <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/generative_ai/Qodo_intro.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fnotebook_template.ipynb\">\n", " <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n", " </a>\n", " </td> \n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/generative_ai/Qodo_intro.ipynb\">\n", " <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/generative_ai/Qodo_intro.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n", " </a>\n", " </td>\n", "</table>" ] }, { "cell_type": "markdown", "metadata": { "id": "8fK_rdvvx1iZ" }, "source": [ "## Overview\n", "\n", "This notebook demonstrates how to deploy and use Qodo's state-of-the-art code embedding models on Google Cloud's Vertex AI platform. You'll learn how to set up, deploy, and make predictions with these specialized embedding models that enhance code retrieval and search capabilities.\n", "In this notebook, you will:\n", "\n", "Set up your Google Cloud environment and initialize the Vertex AI SDK\n", "Upload a Qodo model to your Vertex AI Model Registry\n", "Create a Vertex AI endpoint for model deployment\n", "Deploy the Qodo model to your endpoint with appropriate compute resources\n", "Make predictions using the deployed model.\n", "\n", "### Qodo on Vertex AI\n", "\n", "You can deploy the Qodo models in your own endpoint.\n", "\n", "\n", "\n", "### Available Qodo models\n", "\n", "#### Qodo-Embed-1-7B\n", "Qodo-Embed-1-7B is a state-of-the-art code embedding model for efficient code & text retrieval, enhancing the search accuracy of RAG methods.\n", "\n", "\n", "\n", "## Objective\n", "\n", "This notebook shows how to use **Vertex AI API** to deploy the qodo models.\n", "\n", "For more information, see the [qodo website](https://www.qodo.ai/blog/qodo-embed-1-code-embedding-code-retreival/).\n" ] }, { "cell_type": "markdown", "metadata": { "id": "nwYvaaW25jYS" }, "source": [ "## Get Started\n" ] }, { "cell_type": "markdown", "metadata": { "id": "d10e8895d2d4" }, "source": [ "### Install Vertex AI SDK for Python or other required packages\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "08dd6d2ac629" }, "outputs": [], "source": [ "! pip3 install --upgrade --quiet google-cloud-aiplatform" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "754611260f53" }, "outputs": [], "source": [ "! pip3 install -U -q httpx" ] }, { "cell_type": "markdown", "metadata": { "id": "b9f4c57a43f6" }, "source": [ "### Restart runtime (Colab only)\n", "\n", "To use the newly installed packages, you must restart the runtime on Google Colab." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3b9119a60525" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "\n", " import IPython\n", "\n", " app = IPython.Application.instance()\n", " app.kernel.do_shutdown(True)" ] }, { "cell_type": "markdown", "metadata": { "id": "6a5bea26f60f" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "\n", "Authenticate your environment on Google Colab.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "c97be6a73155" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "\n", " from google.colab import auth\n", "\n", " auth.authenticate_user()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Y8X70FTSbx7U" }, "outputs": [], "source": [ "PUBLISHER_NAME = \"qodo\" # @param {type:\"string\"}\n", "PUBLISHER_MODEL_NAME = \"qodo-embed-1-7b-v1\" # @param [\"publisher-model-name-1\", \"publisher-model-name-2\", \"test-marketplace-publisher-model-e2e-01\"]\n", "\n", "available_regions = [\"us-central1\", \"europe-west4\"]" ] }, { "cell_type": "markdown", "metadata": { "id": "bpuX3sKtexlK" }, "source": [ "### Select a location and a version from the dropdown" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dHl8xW45ex_O" }, "outputs": [], "source": [ "import ipywidgets as widgets\n", "from IPython.display import display\n", "\n", "dropdown_loc = widgets.Dropdown(\n", " options=available_regions,\n", " description=\"Select a location:\",\n", " font_weight=\"bold\",\n", " style={\"description_width\": \"initial\"},\n", ")\n", "\n", "\n", "def dropdown_loc_eventhandler(change):\n", " global LOCATION\n", " if change[\"type\"] == \"change\" and change[\"name\"] == \"value\":\n", " LOCATION = change.new\n", " print(\"Selected:\", change.new)\n", "\n", "\n", "LOCATION = dropdown_loc.value\n", "dropdown_loc.observe(dropdown_loc_eventhandler, names=\"value\")\n", "display(dropdown_loc)" ] }, { "cell_type": "markdown", "metadata": { "id": "4f872cd812d0" }, "source": [ "### Set Google Cloud project information and initialize Vertex AI SDK for Python\n", "\n", "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com). Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ce2e2765bc2d" }, "outputs": [], "source": [ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", "ENDPOINT = f\"https://{LOCATION}-aiplatform.googleapis.com\"\n", "\n", "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n", " raise ValueError(\"Please set your PROJECT_ID\")" ] }, { "cell_type": "markdown", "metadata": { "id": "4NAstKRFBt4N" }, "source": [ "### Import required libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QZEFLE6a6bqy" }, "outputs": [], "source": [ "import json\n", "import time" ] }, { "cell_type": "markdown", "metadata": { "id": "4fa6f083d253" }, "source": [ "## Using Vertex AI API" ] }, { "cell_type": "markdown", "metadata": { "id": "qjsDpa8jlTRu" }, "source": [ "### Upload Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "y1R2BRsBlu-k" }, "outputs": [], "source": [ "UPLOAD_MODEL_PAYLOAD = {\n", " \"model\": {\n", " \"displayName\": \"ModelGarden_LaunchPad_Model_\" + time.strftime(\"%Y%m%d-%H%M%S\"),\n", " \"baseModelSource\": {\n", " \"modelGardenSource\": {\n", " \"publicModelName\": f\"publishers/{PUBLISHER_NAME}/models/{PUBLISHER_MODEL_NAME}\",\n", " }\n", " },\n", " }\n", "}\n", "\n", "request = json.dumps(UPLOAD_MODEL_PAYLOAD)\n", "\n", "! curl -X POST -H \"Authorization: Bearer $(gcloud auth print-access-token)\" -H \"Content-Type: application/json\" {ENDPOINT}/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/models:upload -d '{request}'" ] }, { "cell_type": "markdown", "metadata": { "id": "6afd1e782a3f" }, "source": [ "## Extract the Model ID\n", "\n", "After uploading your model to Vertex AI, you'll need to extract the model ID from the response for use in subsequent steps.\n", "\n", "The response from the upload command will look similar to this:\n", "\n", "```json\n", "{\n", " \"name\": \"projects/123456789/locations/us-central1/models/9876543210/operations/1122334455\",\n", " \"metadata\": {\n", " \"@type\": \"type.googleapis.com/google.cloud.aiplatform.v1beta1.UploadModelOperationMetadata\",\n", " \"genericMetadata\": {\n", " \"createTime\": \"2025-04-07T16:47:27.076450Z\",\n", " \"updateTime\": \"2025-04-07T16:47:27.076450Z\"\n", " }\n", " }\n", "}\n", "```\n", "\n", "Your **model ID** is the number between `models/` and `/operations` in the \"name\" field.\n", "\n", "In the example above, the model ID would be `9876543210`." ] }, { "cell_type": "markdown", "metadata": { "id": "76cba0adc39c" }, "source": "Extract the model ID from the response" }, { "cell_type": "markdown", "metadata": { "id": "V2j0nVGwlf9b" }, "source": [ "### Verify Your Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bxwM0GXTmQhh" }, "outputs": [], "source": [ "MODEL_ID = \"[extracted_model_id]\" # @param {type: \"number\"}\n", "\n", "! curl -X GET -H \"Authorization: Bearer $(gcloud auth print-access-token)\" -H \"Content-Type: application/json\" {ENDPOINT}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/models/{MODEL_ID}" ] }, { "cell_type": "markdown", "metadata": { "id": "3q3ygq8VlZAp" }, "source": [ "### Create an Endpoint and Extract the Endpoint ID" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "O1ChDOt7mPBQ" }, "outputs": [], "source": [ "CREATE_ENDPOINT_PAYLOAD = {\n", " \"displayName\": \"ModelGarden_LaunchPad_Endpoint_\" + time.strftime(\"%Y%m%d-%H%M%S\"),\n", "}\n", "\n", "request = json.dumps(CREATE_ENDPOINT_PAYLOAD)\n", "\n", "! curl -X POST -H \"Authorization: Bearer $(gcloud auth print-access-token)\" -H \"Content-Type: application/json\" {ENDPOINT}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints -d '{request}'" ] }, { "cell_type": "markdown", "metadata": { "id": "400ec43e8a5c" }, "source": [ "### Extracting the Endpoint ID\n", "\n", " After running the command above, you'll see a JSON response similar to:\n", "```json\n", "{\n", " \"name\": \"projects/PROJECT_NUMBER/locations/LOCATION/endpoints/ENDPOINT_ID/operations/OPERATION_ID\",\n", " \"metadata\": {\n", " \"@type\": \"type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata\",\n", " \"genericMetadata\": {\n", " \"createTime\": \"2025-04-07T16:55:27.076450Z\",\n", " \"updateTime\": \"2025-04-07T16:55:27.076450Z\"\n", " }\n", " }\n", "}\n", "```\n", "\n", "Your endpoint ID is the number that appears after \"endpoints/\" and before \"/operations\" in the \"name\" field.\n", "\n", "For example, if the \"name\" field shows:\n", "\"projects/123456789/locations/us-central1/endpoints/9876543210/operations/1122334455\"\n", "\n", "Then your endpoint ID is: 9876543210" ] }, { "cell_type": "markdown", "metadata": { "id": "GuMZCdhmlpCE" }, "source": [ "### Verify Your Endpoint" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tHq_cLT6mPp_" }, "outputs": [], "source": [ "ENDPOINT_ID = \"[extracted_endpoint_id]\" # @param {type: \"number\"}\n", "\n", "! curl -X GET -H \"Authorization: Bearer $(gcloud auth print-access-token)\" -H \"Content-Type: application/json\" {ENDPOINT}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{ENDPOINT_ID}" ] }, { "cell_type": "markdown", "metadata": { "id": "G0amEPXolbP7" }, "source": [ "### Deploy Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Ucj-Xa-fpGrg" }, "outputs": [], "source": [ "import ipywidgets as widgets\n", "from IPython.display import display\n", "\n", "# Initial data\n", "PUBLISHER_NAME = \"qodo\" # @param {type:\"string\"}\n", "PUBLISHER_MODEL_NAME = \"qodo-embed-1-7b-v1\"\n", "available_regions = [\"us-central1\", \"europe-west4\"]\n", "compatible_machines = [\n", " \"a2-highgpu-1g\",\n", " \"a2-highgpu-4g\",\n", " \"a2-ultragpu-1g\",\n", " \"a2-ultragpu-2g\",\n", " \"a3-highgpu-2g\",\n", "]\n", "\n", "# Machine type to accelerator mapping (you can customize this based on your needs)\n", "machine_config = {\n", " \"a2-highgpu-1g\": {\"type\": \"NVIDIA_A100\", \"count\": 1},\n", " \"a2-highgpu-4g\": {\"type\": \"NVIDIA_A100\", \"count\": 4},\n", " \"a2-ultragpu-1g\": {\"type\": \"NVIDIA_A100_80GB\", \"count\": 1},\n", " \"a2-ultragpu-2g\": {\"type\": \"NVIDIA_A100_80GB\", \"count\": 2},\n", " \"a3-highgpu-2g\": {\"type\": \"NVIDIA_H100\", \"count\": 2},\n", "}\n", "\n", "# Create widgets\n", "\n", "\n", "dropdown_machine = widgets.Dropdown(\n", " options=compatible_machines,\n", " description=\"Machine type:\",\n", " font_weight=\"bold\",\n", " style={\"description_width\": \"initial\"},\n", ")\n", "\n", "label_accelerator_type = widgets.HTML(\n", " value=f\"<b>Accelerator type:</b> {machine_config[compatible_machines[0]]['type']}\"\n", ")\n", "\n", "label_accelerator_count = widgets.HTML(\n", " value=f\"<b>Accelerator count:</b> {machine_config[compatible_machines[0]]['count']}\"\n", ")\n", "\n", "# Event handlers\n", "\n", "\n", "def dropdown_machine_eventhandler(change):\n", " global MACHINE_TYPE, ACCELERATOR_TYPE, ACCELERATOR_COUNT\n", " if change[\"type\"] == \"change\" and change[\"name\"] == \"value\":\n", " MACHINE_TYPE = change.new\n", " machine_info = machine_config.get(change.new, {\"type\": \"Unknown\", \"count\": 0})\n", " ACCELERATOR_TYPE = machine_info[\"type\"]\n", " ACCELERATOR_COUNT = machine_info[\"count\"]\n", "\n", " # Update the displayed information\n", " label_accelerator_type.value = f\"<b>Accelerator type:</b> {ACCELERATOR_TYPE}\"\n", " label_accelerator_count.value = f\"<b>Accelerator count:</b> {ACCELERATOR_COUNT}\"\n", "\n", "\n", "# Initialize global variables\n", "MACHINE_TYPE = dropdown_machine.value\n", "ACCELERATOR_TYPE = machine_config[MACHINE_TYPE][\"type\"]\n", "ACCELERATOR_COUNT = machine_config[MACHINE_TYPE][\"count\"]\n", "\n", "# Set up observers\n", "dropdown_machine.observe(dropdown_machine_eventhandler, names=\"value\")\n", "\n", "# Display widgets\n", "display(\n", " widgets.VBox([dropdown_machine, label_accelerator_type, label_accelerator_count])\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "VGTyCQQhlrAR" }, "outputs": [], "source": [ "DEPLOY_PAYLOAD = {\n", " \"deployedModel\": {\n", " \"model\": f\"projects/{PROJECT_ID}/locations/{LOCATION}/models/{MODEL_ID}\",\n", " \"displayName\": \"ModelGarden_LaunchPad_DeployedModel_\"\n", " + time.strftime(\"%Y%m%d-%H%M%S\"),\n", " \"dedicatedResources\": {\n", " \"machineSpec\": {\n", " \"machineType\": MACHINE_TYPE,\n", " \"acceleratorType\": ACCELERATOR_TYPE,\n", " \"acceleratorCount\": ACCELERATOR_COUNT,\n", " },\n", " \"minReplicaCount\": 1,\n", " \"maxReplicaCount\": 1,\n", " },\n", " },\n", " \"trafficSplit\": {\"0\": 100},\n", "}\n", "\n", "request = json.dumps(DEPLOY_PAYLOAD)\n", "\n", "! curl -X POST -H \"Authorization: Bearer $(gcloud auth print-access-token)\" -H \"Content-Type: application/json\" {ENDPOINT}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{ENDPOINT_ID}:deployModel -d '{request}'" ] }, { "cell_type": "markdown", "metadata": { "id": "21399a078364" }, "source": [ "### Wait for Deployment to Complete\n", "\n", "Deployment can take several minutes. This cell will check the status of the operation." ] }, { "cell_type": "markdown", "metadata": { "id": "d95cc714a451" }, "source": [ "### Extracting the operation ID\n", "\n", " After running the command above, you'll see a JSON response similar to:\n", "```json\n", "{\n", " \"name\": \"projects/513257720056/locations/us-central1/endpoints/3978337634014461952/operations/2501704616106786816\",\n", " \"metadata\": {\n", " \"@type\": \"type.googleapis.com/google.cloud.aiplatform.v1.DeployModelOperationMetadata\",\n", " \"genericMetadata\": {\n", " \"createTime\": \"2025-04-07T17:10:55.383719Z\",\n", " \"updateTime\": \"2025-04-07T17:10:55.383719Z\"\n", " }\n", " }\n", "}\n", "```\n", "\n", "Your operation ID is the number that appears after \"operation/\" in the \"name\" field.\n", "\n", "For example, if the \"name\" field shows:\n", "\"projects/123456789/locations/us-central1/endpoints/9876543210/operations/1122334455\"\n", "\n", "Then your endpoint ID is: 1122334455" ] }, { "cell_type": "markdown", "metadata": { "id": "7dc1861dbde8" }, "source": [ "### Check Operation Status\n", "\n", "Run this cell to check the current status of the deployment operation. You may need to run this cell multiple times until the operation is complete (`\"done\": true`).\n", "\n", "**Note:** Model deployment typically takes 5-20 minutes to complete." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "346c2fc14a21" }, "outputs": [], "source": [ "OPERATION_ID = \"[extracted_operation_id]\" # @param {type: \"number\"}\n", "# Check operation status\n", "print(\"Checking deployment status...\")\n", "!curl -X GET -H \"Authorization: Bearer $(gcloud auth print-access-token)\" -H \"Content-Type: application/json\" {ENDPOINT}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/operations/{OPERATION_ID}" ] }, { "cell_type": "markdown", "metadata": { "id": "5ahw-uFjCAbo" }, "source": [ "### Prediction" ] }, { "cell_type": "markdown", "metadata": { "id": "7cb14ea3b257" }, "source": [ "Sends a POST request to the specified API endpoint to get a response from the model for a joke using the provided payload." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "4zFz260B50oi" }, "outputs": [], "source": [ "PAYLOAD = {\n", " \"instances\":[\n", " {\n", " \"input\":[\n", " \"def hello_world(): \n", " print('hello_world')\"\n", " ]\n", " }\n", " ]\n", "}\n", "\n", "\n", "request = json.dumps(PAYLOAD)\n", "\n", "!curl -X POST \\\n", " -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n", " -H \"Content-Type: application/json\" {ENDPOINT}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{ENDPOINT_ID}:rawPredict \\\n", " -d '{request}'" ] }, { "cell_type": "markdown", "metadata": { "id": "90b684e180b8" }, "source": [ "## Using Vertex AI SDK for *Python*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "141fe83f051b" }, "outputs": [], "source": [ "from google.cloud import aiplatform" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "cc8edb9d1d3b" }, "outputs": [], "source": [ "aiplatform.init(project=PROJECT_ID, location=LOCATION)" ] }, { "cell_type": "markdown", "metadata": { "id": "6175fddd280b" }, "source": [ "### Upload Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3cf31f867d7f" }, "outputs": [], "source": [ "model = aiplatform.Model.upload(\n", " display_name=\"ModelGarden_LaunchPad_Endpoint_\" + time.strftime(\"%Y%m%d-%H%M%S\"),\n", " model_garden_source_model_name=f\"publishers/{PUBLISHER_NAME}/models/{PUBLISHER_MODEL_NAME}\",\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "e61cc3ae9860" }, "source": [ "### Create Endpoint" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "db0a82502964" }, "outputs": [], "source": [ "my_endpoint = aiplatform.Endpoint.create(\n", " display_name=\"ModelGarden_LaunchPad_Endpoint_\" + time.strftime(\"%Y%m%d-%H%M%S\")\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "050f96c8c2b8" }, "source": [ "### Deploy Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "16da63ba97a7" }, "outputs": [], "source": [ "import ipywidgets as widgets\n", "from IPython.display import display\n", "\n", "# Initial data\n", "PUBLISHER_NAME = \"qodo\" # @param {type:\"string\"}\n", "PUBLISHER_MODEL_NAME = \"qodo-embed-1-7b-v1\"\n", "available_regions = [\"us-central1\", \"europe-west4\"]\n", "compatible_machines = [\n", " \"a2-highgpu-1g\",\n", " \"a2-highgpu-4g\",\n", " \"a2-ultragpu-1g\",\n", " \"a2-ultragpu-2g\",\n", " \"a3-highgpu-2g\",\n", "]\n", "\n", "# Machine type to accelerator mapping (you can customize this based on your needs)\n", "machine_config = {\n", " \"a2-highgpu-1g\": {\"type\": \"NVIDIA_A100\", \"count\": 1},\n", " \"a2-highgpu-4g\": {\"type\": \"NVIDIA_A100\", \"count\": 4},\n", " \"a2-ultragpu-1g\": {\"type\": \"NVIDIA_A100_80GB\", \"count\": 1},\n", " \"a2-ultragpu-2g\": {\"type\": \"NVIDIA_A100_80GB\", \"count\": 2},\n", " \"a3-highgpu-2g\": {\"type\": \"NVIDIA_H100\", \"count\": 2},\n", "}\n", "\n", "# Create widgets\n", "\n", "\n", "dropdown_machine = widgets.Dropdown(\n", " options=compatible_machines,\n", " description=\"Machine type:\",\n", " font_weight=\"bold\",\n", " style={\"description_width\": \"initial\"},\n", ")\n", "\n", "label_accelerator_type = widgets.HTML(\n", " value=f\"<b>Accelerator type:</b> {machine_config[compatible_machines[0]]['type']}\"\n", ")\n", "\n", "label_accelerator_count = widgets.HTML(\n", " value=f\"<b>Accelerator count:</b> {machine_config[compatible_machines[0]]['count']}\"\n", ")\n", "\n", "# Event handlers\n", "\n", "\n", "def dropdown_machine_eventhandler(change):\n", " global MACHINE_TYPE, ACCELERATOR_TYPE, ACCELERATOR_COUNT\n", " if change[\"type\"] == \"change\" and change[\"name\"] == \"value\":\n", " MACHINE_TYPE = change.new\n", " machine_info = machine_config.get(change.new, {\"type\": \"Unknown\", \"count\": 0})\n", " ACCELERATOR_TYPE = machine_info[\"type\"]\n", " ACCELERATOR_COUNT = machine_info[\"count\"]\n", "\n", " # Update the displayed information\n", " label_accelerator_type.value = f\"<b>Accelerator type:</b> {ACCELERATOR_TYPE}\"\n", " label_accelerator_count.value = f\"<b>Accelerator count:</b> {ACCELERATOR_COUNT}\"\n", "\n", "\n", "# Initialize global variables\n", "MACHINE_TYPE = dropdown_machine.value\n", "ACCELERATOR_TYPE = machine_config[MACHINE_TYPE][\"type\"]\n", "ACCELERATOR_COUNT = machine_config[MACHINE_TYPE][\"count\"]\n", "\n", "# Set up observers\n", "dropdown_machine.observe(dropdown_machine_eventhandler, names=\"value\")\n", "\n", "# Display widgets\n", "display(\n", " widgets.VBox([dropdown_machine, label_accelerator_type, label_accelerator_count])\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "a4afefe566f6" }, "outputs": [], "source": [ "model.deploy(\n", " endpoint=my_endpoint,\n", " deployed_model_display_name=\"ModelGarden_LaunchPad_DeployedModel_\"\n", " + time.strftime(\"%Y%m%d-%H%M%S\"),\n", " traffic_split={\"0\": 100},\n", " machine_type=MACHINE_TYPE,\n", " accelerator_type=ACCELERATOR_TYPE,\n", " accelerator_count=ACCELERATOR_COUNT,\n", " min_replica_count=1,\n", " max_replica_count=1,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "e793c9b3e13b" }, "source": [ "### Prediction" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8cee72080193" }, "outputs": [], "source": [ "PAYLOAD = {\"instances\": [{\"input\": [\"def hello_world(): \\n print('hello_world')\"]}]}\n", "\n", "request = json.dumps(PAYLOAD)\n", "\n", "response = my_endpoint.raw_predict(\n", " body=request, headers={\"Content-Type\": \"application/json\"}\n", ")\n", "data = response.json()\n", "embedding = data[\"predictions\"][0][\"data\"][0][\"embedding\"]\n", "print(embedding)" ] }, { "cell_type": "markdown", "metadata": { "id": "d45c572a4b7d" }, "source": [ "## Cleaning up\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ec834cd282d4" }, "outputs": [], "source": [ "# Cleaning up\n", "\n", "print(\"Starting cleanup process...\\n\")\n", "\n", "# First undeploy the model from the endpoint\n", "print(\"Undeploying model from endpoint...\")\n", "try:\n", " my_endpoint.undeploy_all()\n", " print(\"✓ Model successfully undeployed from endpoint\")\n", "except Exception as e:\n", " print(f\"Error undeploying model: {e}\")\n", "\n", "# Delete the endpoint\n", "print(\"\\nDeleting endpoint...\")\n", "try:\n", " my_endpoint.delete()\n", " print(\"✓ Endpoint successfully deleted\")\n", "except Exception as e:\n", " print(f\"Error deleting endpoint: {e}\")\n", "\n", "# Delete the model\n", "print(\"\\nDeleting model...\")\n", "try:\n", " model.delete()\n", " print(\"✓ Model successfully deleted\")\n", "except Exception as e:\n", " print(f\"Error deleting model: {e}\")\n", "\n", "print(\"\\nCleanup complete! All resources have been removed.\")\n" ] } ], "metadata": { "colab": { "name": "qodo_intro.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }