
{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "ur8xi4C7S06n" }, "outputs": [], "source": [ "# Copyright 2025 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "JAPoU8Sm5E6e" }, "source": [ "# Vertex AI Model Garden - Get started with Llama 4 models\n", "\n", "<table align=\"left\">\n", " <td style=\"text-align: center\">\n", " <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_openai_api_llama4.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fcommunity%2Fmodel_garden%2Fmodel_garden_openai_api_llama4.ipynb\"\">\n", " <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n", " </a>\n", " </td> \n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_openai_api_llama4.ipynb\">\n", " <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_openai_api_llama4.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n", " </a>\n", " </td>\n", "</table>" ] }, { "cell_type": "markdown", "metadata": { "id": "tvgnzT1CKxrO" }, "source": [ "## Overview\n", "\n", "This notebook demonstrates how to get started with using the OpenAI library and demonstrates how to leverage multimodal capabilities of Llama 4 models as Model-as-service (MaaS).\n", "\n", "### Objective\n", "\n", "- Configure OpenAI SDK for the Llama 4 Completions API\n", "- Chat with Llama 4 models with different prompts and model parameters\n", "- Build and use Llama 4 GenAI powered application for Car Damage Assessment.\n", "\n", "### Costs\n", "\n", "This tutorial uses billable components of Google Cloud:\n", "\n", "* Vertex AI\n", "* Cloud Storage\n", "\n", "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the [Pricing 
Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage." ] }, { "cell_type": "markdown", "metadata": { "id": "61RBz8LLbxCR" }, "source": [ "## Get started" ] }, { "cell_type": "markdown", "metadata": { "id": "No17Cw5hgx12" }, "source": [ "### Install Vertex AI SDK for Python and other required packages\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "tFy3H3aPgx12" }, "outputs": [], "source": [ "! pip3 install --upgrade --quiet google-cloud-aiplatform openai gradio" ] }, { "cell_type": "markdown", "metadata": { "id": "R5Xep4W9lq-Z" }, "source": [ "### Restart runtime (Colab only)\n", "\n", "To use the newly installed packages, you must restart the runtime on Google Colab." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "XRvKdaPDTznN" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "\n", "    import IPython\n", "\n", "    app = IPython.Application.instance()\n", "    app.kernel.do_shutdown(True)" ] }, { "cell_type": "markdown", "metadata": { "id": "SbmM4z7FOBpM" }, "source": [ "<div class=\"alert alert-block alert-warning\">\n", "<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>\n", "</div>\n" ] }, { "cell_type": "markdown", "metadata": { "id": "dmWOrTJ3gx13" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "\n", "Authenticate your environment on Google Colab.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "NyKGtVQjgx13" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", "\n", "    from google.colab import auth\n", "\n", "    auth.authenticate_user()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "tQvSE1VQNWWs" }, "outputs": [], "source": [ "# @title End User Agreement\n", "# @markdown To use the Llama 4 Model-as-a-Service endpoints, you will need to\n", "# @markdown accept the end-user license agreement (EULA) on the model card.\n", "\n", "# @markdown [End-user License Agreement](https://console.cloud.google.com/vertex-ai/publishers/meta/model-garden/llama-4-maverick-17b-128e-instruct-maas).\n", "\n", "# fmt: off\n", "accept_eula = False  # @param {\"type\":\"boolean\", \"placeholder\":\"I have read and accepted the EULA\"}\n", "# fmt: on" ] }, { "cell_type": "markdown", "metadata": { "id": "DF4l8DTdWgPY" }, "source": [ "### Set Google Cloud project information\n", "\n", "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com). Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "Nqwi-5ufWp_B" }, "outputs": [], "source": [ "PROJECT_ID = \"<your-project-id>\"  # @param {type:\"string\"}\n", "\n", "# `us-east5` is the only supported region for Llama 4 models using Model-as-a-Service (MaaS).\n", "LOCATION = \"us-east5\"" ] }, { "cell_type": "markdown", "metadata": { "id": "zgPO1eR3CYjk" }, "source": [ "### Create a Cloud Storage bucket\n", "\n", "Create a storage bucket to store tutorial artifacts."
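, "\n", "\n", "Bucket names must be globally unique across Google Cloud. If you need a fresh name, the optional cell below sketches one way to derive it from your project ID; the `llama4` prefix and random suffix are just conventions for this tutorial, not a requirement." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "bucketNameHint01" }, "outputs": [], "source": [ "# Optional: derive a likely-unique bucket name from the project ID.\n", "# The naming scheme here is an arbitrary convention for this tutorial.\n", "import uuid\n", "\n", "print(f\"Suggested bucket name: {PROJECT_ID}-llama4-{uuid.uuid4().hex[:8]}\")" ] }, { "cell_type": "markdown", "metadata": { "id": "bucketNameHint02" }, "source": [ "Set the bucket name to use throughout the tutorial."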
] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "MzGDU7TWdts_" }, "outputs": [], "source": [ "BUCKET_NAME = \"<your-bucket-name>\" # @param {type:\"string\"}\n", "\n", "BUCKET_URI = f\"gs://{BUCKET_NAME}\"" ] }, { "cell_type": "markdown", "metadata": { "id": "-EcIXiGsCePi" }, "source": [ "**If your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "NIq7R4HZCfIc" }, "outputs": [], "source": [ "! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}" ] }, { "cell_type": "markdown", "metadata": { "id": "0Wn8ZkcV86KR" }, "source": [ "### Initialize Vertex AI SDK for Python" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "B8DawN9D9NLU" }, "outputs": [], "source": [ "import vertexai\n", "\n", "vertexai.init(project=PROJECT_ID, location=LOCATION, staging_bucket=BUCKET_URI)" ] }, { "cell_type": "markdown", "metadata": { "id": "jVYoyDl165EE" }, "source": [ "### Import libraries\n", "\n", "Import libraries to use in this tutorial." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "c1tEW-U968h8" }, "outputs": [], "source": [ "import json\n", "import re\n", "import uuid\n", "from io import BytesIO\n", "\n", "import gradio as gr\n", "import matplotlib.pyplot as plt\n", "# Chat completions API\n", "import openai\n", "from google.auth import default, transport\n", "from google.cloud import storage\n", "from PIL import Image" ] }, { "cell_type": "markdown", "metadata": { "id": "-ti5YGgSSG-7" }, "source": [ "### Helpers functions" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "do6pdqyLSJif" }, "outputs": [], "source": [ "def visualize_image_from_bucket(bucket_name: str, blob_name: str) -> None:\n", " \"\"\"Visualizes an image stored in a Google Cloud Storage bucket.\"\"\"\n", " try:\n", " # Create a client for interacting with Google Cloud Storage\n", " storage_client = storage.Client()\n", "\n", " # Get a reference to the bucket and blob\n", " bucket = storage_client.bucket(bucket_name)\n", " blob = bucket.blob(blob_name)\n", "\n", " # Download the image data into memory\n", " image_data = blob.download_as_bytes()\n", "\n", " # Open the image using PIL\n", " image = Image.open(BytesIO(image_data))\n", "\n", " # Display the image using matplotlib\n", " plt.figure(figsize=(10, 10)) # Set the figure size (adjust as needed)\n", " plt.imshow(image)\n", " plt.axis(\"off\") # Turn off axis labels\n", " plt.show()\n", "\n", " except Exception as e:\n", " print(f\"Error visualizing image: {e}\")" ] }, { "cell_type": "markdown", "metadata": { "id": "uqYCG2Fw7D3L" }, "source": [ "### Configure OpenAI SDK for the Llama 4 Chat Completions API\n", "\n", "To configure the OpenAI SDK for the Llama 4 Chat Completions API, you need to request the access token and initialize the client pointing to the Llama 4 endpoint.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "W0K6VSJRHhH2" }, "source": [ "#### Authentication\n", "\n", "You can request an access token from the default credentials for the current environment. 
Note that the access token lives for [1 hour by default](https://cloud.google.com/docs/authentication/token-types#at-lifetime); after expiration, it must be refreshed.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "i0qceuiQEPHv" }, "outputs": [], "source": [ "credentials, _ = default()\n", "auth_request = transport.requests.Request()\n", "credentials.refresh(auth_request)" ] }, { "cell_type": "markdown", "metadata": { "id": "Q04wJmA0HT6X" }, "source": [ "Then configure the OpenAI SDK to point to the Llama 4 Chat Completions API endpoint.\n", "\n", "Note that `us-east5` is the only supported region for Llama 4 models using Model-as-a-Service (MaaS)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "c-MRhsnlj6iw" }, "outputs": [], "source": [ "MODEL_LOCATION = \"us-east5\"\n", "MAAS_ENDPOINT = f\"{MODEL_LOCATION}-aiplatform.googleapis.com\"\n", "\n", "if not accept_eula:\n", "    raise ValueError(\"Accept the EULA to continue.\")\n", "\n", "client = openai.OpenAI(\n", "    base_url=f\"https://{MAAS_ENDPOINT}/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi\",\n", "    api_key=credentials.token,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "UGokrtdiIHrX" }, "source": [ "#### Llama 4 models\n", "\n", "You can experiment with various supported Llama 4 models.\n", "\n", "This tutorial uses the Llama 4 Scout 17B-16E Instruct model by default through Model-as-a-Service (MaaS). With MaaS, you can access Llama 4 models in just a few clicks without any setup or infrastructure hassles.\n", "\n", "You can also access Llama models for self-deployment in Vertex AI Model Garden, allowing you to choose your preferred infrastructure. [Check out the Llama 4 model card](https://console.cloud.google.com/vertex-ai/publishers/meta/model-garden/llama4) to learn how to deploy Llama 4 models on Vertex AI." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "r7OhyH46H2H5" }, "outputs": [], "source": [ "MODEL_ID = \"meta/llama-4-scout-17b-16e-instruct-maas\"  # @param [\"meta/llama-4-scout-17b-16e-instruct-maas\", \"meta/llama-4-maverick-17b-128e-instruct-maas\"]" ] }, { "cell_type": "markdown", "metadata": { "id": "1xD62NTpqHXd" }, "source": [ "### Chat with Llama 4\n", "\n", "Use the Chat Completions API to send a multimodal request to the Llama 4 model." ] }, { "cell_type": "markdown", "metadata": { "id": "tkyp9kZSuJGx" }, "source": [ "#### Hello, Llama 4" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "CKVOZ1HEqRbY" }, "outputs": [], "source": [ "max_tokens = 4096\n", "\n", "response = client.chat.completions.create(\n", "    model=MODEL_ID,\n", "    messages=[\n", "        {\n", "            \"role\": \"user\",\n", "            \"content\": [\n", "                {\n", "                    \"image_url\": {\n", "                        \"url\": \"gs://github-repo/img/gemini/intro/landmark1.jpg\"\n", "                    },\n", "                    \"type\": \"image_url\",\n", "                },\n", "                {\"text\": \"What’s in this image?\", \"type\": \"text\"},\n", "            ],\n", "        },\n", "        {\"role\": \"assistant\", \"content\": \"In this image, you have:\"},\n", "    ],\n", "    max_tokens=max_tokens,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "T3yz-Xuc9Nyf" }, "source": [ "The full response object contains the generated message along with metadata."
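, "\n", "\n", "For example, you can inspect the token accounting for the request. The `usage` field follows the OpenAI `CompletionUsage` schema and may be `None` if the endpoint does not report it." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "usageInspect01" }, "outputs": [], "source": [ "# Inspect token usage for the request, if the endpoint reports it.\n", "if response.usage is not None:\n", "    print(f\"Prompt tokens:     {response.usage.prompt_tokens}\")\n", "    print(f\"Completion tokens: {response.usage.completion_tokens}\")\n", "    print(f\"Total tokens:      {response.usage.total_tokens}\")" ] }, { "cell_type": "markdown", "metadata": { "id": "usageInspect02" }, "source": [ "Print the generated text."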
] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "LxpdxYCxH51u" }, "outputs": [], "source": [ "print(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": { "id": "ShR15nvo9Te4" }, "source": [ "You use the helper function to visualize the image." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "ECUxyzWmSbuX" }, "outputs": [], "source": [ "visualize_image_from_bucket(\"github-repo\", \"img/gemini/intro/landmark1.jpg\")" ] }, { "cell_type": "markdown", "metadata": { "id": "B1rKbHUQt605" }, "source": [ "#### Ask Llama 4 using different model configuration\n", "\n", "Use the following parameters to generate different answers:\n", "\n", "* `temperature` to control the randomness of the response\n", "* `top_p` to control the quality of the response\n", "* `stream` to stream the response back or not\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "owv-5Sz5rIEU" }, "outputs": [], "source": [ "temperature = 1.0 # @param {type:\"number\"}\n", "top_p = 1.0 # @param {type:\"number\"}\n", "stream = True # @param {type:\"boolean\"}" ] }, { "cell_type": "markdown", "metadata": { "id": "a-qBuhcK-G1V" }, "source": [ "Get the answer." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "O1YU8bSivH0B" }, "outputs": [], "source": [ "response = client.chat.completions.create(\n", " model=MODEL_ID,\n", " messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\n", " \"image_url\": {\n", " \"url\": \"gs://github-repo/img/gemini/intro/landmark2.jpg\"\n", " },\n", " \"type\": \"image_url\",\n", " },\n", " {\"text\": \"What’s in this image?\", \"type\": \"text\"},\n", " ],\n", " },\n", " {\"role\": \"assistant\", \"content\": \"In this image, you have:\"},\n", " ],\n", " temperature=temperature,\n", " max_tokens=max_tokens,\n", " top_p=top_p,\n", " stream=stream,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "-o9-gF0U-Kba" }, "source": [ "Depending if `stream` parameter is enabled or not, you can print the response entirely or chunk by chunk." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "CoDHLGhyyt8d" }, "outputs": [], "source": [ "if stream:\n", " for chunk in response:\n", " print(chunk.choices[0].delta.content, end=\"\")\n", "else:\n", " print(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": { "id": "nKBAMJuG9l_j" }, "source": [ "And again, let's check if the answer is correct." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "OqMthjHy9vXW" }, "outputs": [], "source": [ "visualize_image_from_bucket(\"github-repo\", \"img/gemini/intro/landmark2.jpg\")" ] }, { "cell_type": "markdown", "metadata": { "id": "BkoaelaKxm1r" }, "source": [ "#### Use Llama 4 with different multimodal tasks\n", "\n", "In this section, you will use Llama 4 to perform different multimodal tasks including image captioning and Visual Question Answering (VQA).\n", "\n", "For each task, you'll define a different prompt and submit a request to the model as you did before." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "MuMkl7De_DR9" }, "outputs": [], "source": [ "visualize_image_from_bucket(\"github-repo\", \"img/gemini/intro/landmark3.jpg\")" ] }, { "cell_type": "markdown", "metadata": { "id": "-en7AYQDyONt" }, "source": [ "##### Image captioning" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "QANInNvizWbi" }, "outputs": [], "source": [ "prompt = \"Imagine you're telling a friend about this photo. What would you say?\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "2x8ML1Y_yfom" }, "outputs": [], "source": [ "response = client.chat.completions.create(\n", " model=MODEL_ID,\n", " messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\n", " \"image_url\": {\n", " \"url\": \"gs://github-repo/img/gemini/intro/landmark3.jpg\"\n", " },\n", " \"type\": \"image_url\",\n", " },\n", " {\"text\": prompt, \"type\": \"text\"},\n", " ],\n", " },\n", " ],\n", " max_tokens=max_tokens,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "qQ6UUgpHztXZ" }, "outputs": [], "source": [ "print(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": { "id": "kBLESIw4zhto" }, "source": [ "##### Visual Question Answering (VQA)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "UrAybklfzhtz" }, "outputs": [], "source": [ "prompt = \"\"\"\n", "Analyze this image and answer the following questions:\n", "- What is the primary color in the image?\n", "- What is the overall mood or atmosphere conveyed in the scene?\n", "- Based on the visual clues, who might have taken the picture?\"\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "VJYIZbGyzhtz" }, "outputs": [], "source": [ "response = client.chat.completions.create(\n", " model=MODEL_ID,\n", " messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\n", " \"image_url\": {\n", " \"url\": \"gs://github-repo/img/gemini/intro/landmark3.jpg\"\n", " },\n", " \"type\": \"image_url\",\n", " },\n", " {\"text\": prompt, \"type\": \"text\"},\n", " ],\n", " },\n", " ],\n", " max_tokens=max_tokens,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "9vfWA2i9zwOZ" }, "outputs": [], "source": [ "print(response.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": { "id": "gnrXpv5Y3yFK" }, "source": [ "### Build with Llama 4 : Car Damage Assessment app using Gradio\n", "\n", "In this section, you use Llama 4 to build a simple GenAI powered application for Car Damage Assessment.\n", "\n", "In this scenario, the app has to cover the following tasks:\n", "\n", "* Classify the type of damage\n", "* Estimate the damage severity\n", "* Estimate the damage cost\n" ] }, { "cell_type": "markdown", "metadata": { "id": "kD-Fo_2WRSBt" }, "source": [ "#### Define the UI functions" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "tCbyAGF6ZpSN" }, "outputs": [], "source": [ "def upload_image_to_bucket(image_path: str) -> str:\n", " \"\"\"Uploads an image to a Google Cloud Storage bucket.\"\"\"\n", " try:\n", " # Create a client for interacting with Google Cloud Storage\n", " storage_client = storage.Client()\n", "\n", " # Get a reference to the bucket\n", " bucket = storage_client.bucket(BUCKET_NAME)\n", 
"\n", " # Generate a unique blob name based on the file extension\n", " file_extension = image_path.split(\".\")[-1].lower()\n", " if file_extension in [\"jpg\", \"jpeg\"]:\n", " blob_name = f\"car_damage_{uuid.uuid4()}.jpg\"\n", " else:\n", " blob_name = f\"car_damage_{uuid.uuid4()}.png\"\n", "\n", " # Get a reference to the blob and upload the image\n", " blob = bucket.blob(blob_name)\n", " blob.upload_from_filename(image_path)\n", "\n", " # Construct the URI of the uploaded image\n", " image_uri = f\"gs://{BUCKET_NAME}/{blob_name}\"\n", " return image_uri\n", "\n", " except Exception as e:\n", " print(f\"Error uploading image: {e}\")\n", "\n", "\n", "def parse_json_from_markdown(markdown_text: str) -> dict | None:\n", " \"\"\"Extracts and parses JSON content embedded within Markdown text.\"\"\"\n", " json_pattern = r\"```json\\n(.*?)\\n```\"\n", " match = re.search(json_pattern, markdown_text, re.DOTALL)\n", "\n", " if match:\n", " json_content = match.group(1)\n", " try:\n", " parsed_data = json.loads(json_content)\n", " return parsed_data\n", " except json.JSONDecodeError as e:\n", " print(f\"Error: Invalid JSON content found. {e}\")\n", " return None\n", " else:\n", " return None\n", "\n", "\n", "def process_image(image_uri):\n", " \"\"\"Processes a car damage image using a multimodal LLM.\"\"\"\n", "\n", " # Construct the prompt\n", " prompt = \"\"\"\n", " Analyze the provided image of a car and provide the following information:\n", "\n", " 1. Damage Type: Identify the primary type of damage visible in the image (e.g., dent, scratch, cracked windshield, etc.).\n", " 2. Severity: Estimate the severity of the damage on a scale of 1 to 5, where 1 is minor and 5 is severe.\n", " 3. Estimated Repair Cost: Provide an approximate range for the repair cost in USD.\n", "\n", " Return the results in JSON format with damagetype, severity, and cost fields.\n", " \"\"\"\n", "\n", " # Call Llama model\n", " credentials, _ = default()\n", " auth_request = transport.requests.Request()\n", " credentials.refresh(auth_request)\n", "\n", " client = openai.OpenAI(\n", " base_url=f\"https://{MAAS_ENDPOINT}/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi\",\n", " api_key=credentials.token,\n", " )\n", " response = client.chat.completions.create(\n", " model=MODEL_ID,\n", " messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\"image_url\": {\"url\": image_uri}, \"type\": \"image_url\"},\n", " {\"text\": prompt, \"type\": \"text\"},\n", " ],\n", " },\n", " ],\n", " max_tokens=max_tokens,\n", " )\n", "\n", " # Parse the response\n", " response = response.choices[0].message.content\n", " output = parse_json_from_markdown(response)\n", "\n", " output = {\"damagetype\": \"scratch\", \"severity\": 5, \"cost\": 1000}\n", " return output[\"damagetype\"], output[\"severity\"], output[\"cost\"]\n", "\n", "\n", "def demo_fn(image_path):\n", " \"\"\"\n", " Processes a car damage image using a multimodal LLM.\n", " \"\"\"\n", "\n", " # Upload the image\n", " image_uri = upload_image_to_bucket(image_path)\n", "\n", " # Process the image\n", " damagetype, severity, cost = process_image(image_uri)\n", "\n", " return damagetype, severity, cost" ] }, { "cell_type": "markdown", "metadata": { "id": "lKa_C0QaZqVN" }, "source": [ "#### Run the application" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "8CSczAFiBamY" }, "outputs": [], "source": [ "demo = gr.Interface(\n", " fn=demo_fn,\n", " inputs=gr.Image(type=\"filepath\"),\n", " 
outputs=[\n", " gr.Textbox(label=\"Damage Type\"),\n", " gr.Slider(label=\"Severity\", minimum=1, maximum=10, step=1),\n", " gr.Number(label=\"Cost\"),\n", " ],\n", " title=\"Car Damage Assessment\",\n", ")\n", "\n", "demo.launch(debug=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "UmFjQHeYd08T" }, "outputs": [], "source": [ "demo.close()" ] }, { "cell_type": "markdown", "metadata": { "id": "2a4e033321ad" }, "source": [ "## Cleaning up\n", "\n", "Clean up resources created in this notebook.\n", "\n", "To delete to the search engine in Vertex AI, check out the following [documentation](https://cloud.google.com/generative-ai-app-builder/docs/delete-engine)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "OC7Ypb05ccUE" }, "outputs": [], "source": [ "delete_bucket = False # @param {type:\"boolean\"}\n", "\n", "if delete_bucket:\n", " ! gsutil -m rm -r $BUCKET_NAME" ] } ], "metadata": { "colab": { "name": "model_garden_openai_api_llama4.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }