{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "ur8xi4C7S06n"
},
"outputs": [],
"source": [
"# Copyright 2025 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JAPoU8Sm5E6e"
},
"source": [
"# Vertex AI Model Garden - Get started with Llama 4 models\n",
"\n",
"<table align=\"left\">\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_openai_api_llama4.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fcommunity%2Fmodel_garden%2Fmodel_garden_openai_api_llama4.ipynb\"\">\n",
" <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n",
" </a>\n",
" </td> \n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_openai_api_llama4.ipynb\">\n",
" <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_openai_api_llama4.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n",
" </a>\n",
" </td>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tvgnzT1CKxrO"
},
"source": [
"## Overview\n",
"\n",
"This notebook demonstrates how to get started with using the OpenAI library and demonstrates how to leverage multimodal capabilities of Llama 4 models as Model-as-service (MaaS).\n",
"\n",
"### Objective\n",
"\n",
"- Configure OpenAI SDK for the Llama 4 Completions API\n",
"- Chat with Llama 4 models with different prompts and model parameters\n",
"- Build and use Llama 4 GenAI powered application for Car Damage Assessment.\n",
"\n",
"### Costs\n",
"\n",
"This tutorial uses billable components of Google Cloud:\n",
"\n",
"* Vertex AI\n",
"* Cloud Storage\n",
"\n",
"Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "61RBz8LLbxCR"
},
"source": [
"## Get started"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "No17Cw5hgx12"
},
"source": [
"### Install Vertex AI SDK for Python and other required packages\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "tFy3H3aPgx12"
},
"outputs": [],
"source": [
"! pip3 install --upgrade --quiet google-cloud-aiplatform openai gradio"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "R5Xep4W9lq-Z"
},
"source": [
"### Restart runtime (Colab only)\n",
"\n",
"To use the newly installed packages, you must restart the runtime on Google Colab."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "XRvKdaPDTznN"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
"\n",
" import IPython\n",
"\n",
" app = IPython.Application.instance()\n",
" app.kernel.do_shutdown(True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SbmM4z7FOBpM"
},
"source": [
"<div class=\"alert alert-block alert-warning\">\n",
"<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>\n",
"</div>\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dmWOrTJ3gx13"
},
"source": [
"### Authenticate your notebook environment (Colab only)\n",
"\n",
"Authenticate your environment on Google Colab.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "NyKGtVQjgx13"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
"\n",
" from google.colab import auth\n",
"\n",
" auth.authenticate_user()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "tQvSE1VQNWWs"
},
"outputs": [],
"source": [
"# @title End User Agreement\n",
"# @markdown To use the Llama 4 Model-as-a-service endpoints, you will need to\n",
"# @markdown accept the end-user license agreement (EULA) on the model card.\n",
"\n",
"# @markdown [End-user License Agreement](https://console.cloud.google.com/vertex-ai/publishers/meta/model-garden/llama-4-maverick-17b-128e-instruct-maas).\n",
"\n",
"# fmt: off\n",
"accept_eula = False # @param {\"type\":\"boolean\", \"placeholder\":\"I have read and accepted the EULA\"}\n",
"# fmt: on"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DF4l8DTdWgPY"
},
"source": [
"### Set Google Cloud project information\n",
"\n",
"To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com). Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "Nqwi-5ufWp_B"
},
"outputs": [],
"source": [
"PROJECT_ID = \"<your-project-id>\" # @param {type:\"string\"}\n",
"\n",
"# Only `us-eastt5` is supported region for Llama 4 models using Model-as-a-Service (MaaS).\n",
"LOCATION = \"us-east5\""
]
},
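{
"cell_type": "markdown",
"metadata": {
"id": "enable-api-note"
},
"source": [
"If the Vertex AI API isn't enabled in your project yet, you can enable it from the notebook. This assumes the `gcloud` CLI is available in your environment, as it is on Colab and Workbench."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "enable-api"
},
"outputs": [],
"source": [
"# Enable the Vertex AI API for the selected project (a no-op if it's already enabled).\n",
"! gcloud services enable aiplatform.googleapis.com --project {PROJECT_ID}"
]
},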
{
"cell_type": "markdown",
"metadata": {
"id": "zgPO1eR3CYjk"
},
"source": [
"### Create a Cloud Storage bucket\n",
"\n",
"Create a storage bucket to store tutorial artifacts."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "MzGDU7TWdts_"
},
"outputs": [],
"source": [
"BUCKET_NAME = \"<your-bucket-name>\" # @param {type:\"string\"}\n",
"\n",
"BUCKET_URI = f\"gs://{BUCKET_NAME}\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-EcIXiGsCePi"
},
"source": [
"**If your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "NIq7R4HZCfIc"
},
"outputs": [],
"source": [
"! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0Wn8ZkcV86KR"
},
"source": [
"### Initialize Vertex AI SDK for Python"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "B8DawN9D9NLU"
},
"outputs": [],
"source": [
"import vertexai\n",
"\n",
"vertexai.init(project=PROJECT_ID, location=LOCATION, staging_bucket=BUCKET_URI)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jVYoyDl165EE"
},
"source": [
"### Import libraries\n",
"\n",
"Import libraries to use in this tutorial."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "c1tEW-U968h8"
},
"outputs": [],
"source": [
"import json\n",
"import re\n",
"import uuid\n",
"from io import BytesIO\n",
"\n",
"import gradio as gr\n",
"import matplotlib.pyplot as plt\n",
"# Chat completions API\n",
"import openai\n",
"from google.auth import default, transport\n",
"from google.cloud import storage\n",
"from PIL import Image"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-ti5YGgSSG-7"
},
"source": [
"### Helpers functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "do6pdqyLSJif"
},
"outputs": [],
"source": [
"def visualize_image_from_bucket(bucket_name: str, blob_name: str) -> None:\n",
" \"\"\"Visualizes an image stored in a Google Cloud Storage bucket.\"\"\"\n",
" try:\n",
" # Create a client for interacting with Google Cloud Storage\n",
" storage_client = storage.Client()\n",
"\n",
" # Get a reference to the bucket and blob\n",
" bucket = storage_client.bucket(bucket_name)\n",
" blob = bucket.blob(blob_name)\n",
"\n",
" # Download the image data into memory\n",
" image_data = blob.download_as_bytes()\n",
"\n",
" # Open the image using PIL\n",
" image = Image.open(BytesIO(image_data))\n",
"\n",
" # Display the image using matplotlib\n",
" plt.figure(figsize=(10, 10)) # Set the figure size (adjust as needed)\n",
" plt.imshow(image)\n",
" plt.axis(\"off\") # Turn off axis labels\n",
" plt.show()\n",
"\n",
" except Exception as e:\n",
" print(f\"Error visualizing image: {e}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uqYCG2Fw7D3L"
},
"source": [
"### Configure OpenAI SDK for the Llama 4 Chat Completions API\n",
"\n",
"To configure the OpenAI SDK for the Llama 4 Chat Completions API, you need to request the access token and initialize the client pointing to the Llama 4 endpoint.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "W0K6VSJRHhH2"
},
"source": [
"#### Authentication\n",
"\n",
"You can request an access token from the default credentials for the current environment. Note that the access token lives for [1 hour by default](https://cloud.google.com/docs/authentication/token-types#at-lifetime); after expiration, it must be refreshed.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "i0qceuiQEPHv"
},
"outputs": [],
"source": [
"credentials, _ = default()\n",
"auth_request = transport.requests.Request()\n",
"credentials.refresh(auth_request)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Q04wJmA0HT6X"
},
"source": [
"Then configure the OpenAI SDK to point to the Llama 4 Chat Completions API endpoint.\n",
"\n",
"Note that only `us-east5` is supported region for Llama 4 models using Model-as-a-Service (MaaS)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "c-MRhsnlj6iw"
},
"outputs": [],
"source": [
"MODEL_LOCATION = \"us-east5\"\n",
"MAAS_ENDPOINT = f\"{MODEL_LOCATION}-aiplatform.googleapis.com\"\n",
"\n",
"if not accept_eula:\n",
" raise ValueError(\"Accept the EULA to continue.\")\n",
"\n",
"client = openai.OpenAI(\n",
" base_url=f\"https://{MAAS_ENDPOINT}/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi\",\n",
" api_key=credentials.token,\n",
")"
]
},
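{
"cell_type": "markdown",
"metadata": {
"id": "token-refresh-note"
},
"source": [
"Because the access token expires after about an hour, a long-running session may need to rebuild the client with a fresh token. The helper below is a minimal sketch of one way to do that; the name `get_openai_client` is introduced here for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "token-refresh-helper"
},
"outputs": [],
"source": [
"def get_openai_client() -> openai.OpenAI:\n",
"    \"\"\"Returns an OpenAI client with a freshly refreshed access token.\n",
"\n",
"    A minimal sketch: it refreshes the default credentials on every call so\n",
"    the returned client never holds an expired token.\n",
"    \"\"\"\n",
"    creds, _ = default()\n",
"    creds.refresh(transport.requests.Request())\n",
"    return openai.OpenAI(\n",
"        base_url=f\"https://{MAAS_ENDPOINT}/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi\",\n",
"        api_key=creds.token,\n",
"    )"
]
},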
{
"cell_type": "markdown",
"metadata": {
"id": "UGokrtdiIHrX"
},
"source": [
"#### Llama 4 models\n",
"\n",
"You can experiment with various supported Llama 4 models.\n",
"\n",
"This tutorial use Llama 4 90B Vision Instruct using Model-as-a-Service (MaaS). Using Model-as-a-Service (MaaS), you can access Llama 4 models in just a few clicks without any setup or infrastructure hassles.\n",
"\n",
"You can also access Llama models for self-service in Vertex AI Model Garden, allowing you to choose your preferred infrastructure. [Check out Llama 4 model card](https://console.cloud.google.com/vertex-ai/publishers/meta/model-garden/llama4?_ga=2.31261500.2048242469.1721714335-1107467625.1721655511) to learn how to deploy a Llama 4 models on Vertex AI."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "r7OhyH46H2H5"
},
"outputs": [],
"source": [
"MODEL_ID = \"meta/llama-4-scout-17b-16e-instruct-maas\" # @param [\"meta/llama-4-scout-17b-16e-instruct-maas\", \"meta/llama-4-maverick-17b-128e-instruct-maas\"]"
]
},
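{
"cell_type": "markdown",
"metadata": {
"id": "sanity-check-note"
},
"source": [
"Optionally, send a minimal text-only request as a quick sanity check that the client and `MODEL_ID` are wired up correctly. This is a sketch; the prompt is arbitrary."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sanity-check"
},
"outputs": [],
"source": [
"# A quick, text-only sanity check; the prompt is arbitrary.\n",
"sanity_response = client.chat.completions.create(\n",
"    model=MODEL_ID,\n",
"    messages=[{\"role\": \"user\", \"content\": \"Say hello in one short sentence.\"}],\n",
"    max_tokens=32,\n",
")\n",
"print(sanity_response.choices[0].message.content)"
]
},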
{
"cell_type": "markdown",
"metadata": {
"id": "1xD62NTpqHXd"
},
"source": [
"### Chat with Llama 4\n",
"\n",
"Use the Chat Completions API to send a multi-model request to the Llama 4 model."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tkyp9kZSuJGx"
},
"source": [
"#### Hello, Llama 4"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "CKVOZ1HEqRbY"
},
"outputs": [],
"source": [
"max_tokens = 4096\n",
"\n",
"response = client.chat.completions.create(\n",
" model=MODEL_ID,\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" {\n",
" \"image_url\": {\n",
" \"url\": \"gs://github-repo/img/gemini/intro/landmark1.jpg\"\n",
" },\n",
" \"type\": \"image_url\",\n",
" },\n",
" {\"text\": \"What’s in this image?\", \"type\": \"text\"},\n",
" ],\n",
" },\n",
" {\"role\": \"assistant\", \"content\": \"In this image, you have:\"},\n",
" ],\n",
" max_tokens=max_tokens,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "T3yz-Xuc9Nyf"
},
"source": [
"You get the response as shown below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "LxpdxYCxH51u"
},
"outputs": [],
"source": [
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ShR15nvo9Te4"
},
"source": [
"You use the helper function to visualize the image."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "ECUxyzWmSbuX"
},
"outputs": [],
"source": [
"visualize_image_from_bucket(\"github-repo\", \"img/gemini/intro/landmark1.jpg\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "B1rKbHUQt605"
},
"source": [
"#### Ask Llama 4 using different model configuration\n",
"\n",
"Use the following parameters to generate different answers:\n",
"\n",
"* `temperature` to control the randomness of the response\n",
"* `top_p` to control the quality of the response\n",
"* `stream` to stream the response back or not\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "owv-5Sz5rIEU"
},
"outputs": [],
"source": [
"temperature = 1.0 # @param {type:\"number\"}\n",
"top_p = 1.0 # @param {type:\"number\"}\n",
"stream = True # @param {type:\"boolean\"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "a-qBuhcK-G1V"
},
"source": [
"Get the answer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "O1YU8bSivH0B"
},
"outputs": [],
"source": [
"response = client.chat.completions.create(\n",
" model=MODEL_ID,\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" {\n",
" \"image_url\": {\n",
" \"url\": \"gs://github-repo/img/gemini/intro/landmark2.jpg\"\n",
" },\n",
" \"type\": \"image_url\",\n",
" },\n",
" {\"text\": \"What’s in this image?\", \"type\": \"text\"},\n",
" ],\n",
" },\n",
" {\"role\": \"assistant\", \"content\": \"In this image, you have:\"},\n",
" ],\n",
" temperature=temperature,\n",
" max_tokens=max_tokens,\n",
" top_p=top_p,\n",
" stream=stream,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-o9-gF0U-Kba"
},
"source": [
"Depending if `stream` parameter is enabled or not, you can print the response entirely or chunk by chunk."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "CoDHLGhyyt8d"
},
"outputs": [],
"source": [
"if stream:\n",
" for chunk in response:\n",
" print(chunk.choices[0].delta.content, end=\"\")\n",
"else:\n",
" print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nKBAMJuG9l_j"
},
"source": [
"And again, let's check if the answer is correct."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "OqMthjHy9vXW"
},
"outputs": [],
"source": [
"visualize_image_from_bucket(\"github-repo\", \"img/gemini/intro/landmark2.jpg\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BkoaelaKxm1r"
},
"source": [
"#### Use Llama 4 with different multimodal tasks\n",
"\n",
"In this section, you will use Llama 4 to perform different multimodal tasks including image captioning and Visual Question Answering (VQA).\n",
"\n",
"For each task, you'll define a different prompt and submit a request to the model as you did before."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "MuMkl7De_DR9"
},
"outputs": [],
"source": [
"visualize_image_from_bucket(\"github-repo\", \"img/gemini/intro/landmark3.jpg\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-en7AYQDyONt"
},
"source": [
"##### Image captioning"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "QANInNvizWbi"
},
"outputs": [],
"source": [
"prompt = \"Imagine you're telling a friend about this photo. What would you say?\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "2x8ML1Y_yfom"
},
"outputs": [],
"source": [
"response = client.chat.completions.create(\n",
" model=MODEL_ID,\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" {\n",
" \"image_url\": {\n",
" \"url\": \"gs://github-repo/img/gemini/intro/landmark3.jpg\"\n",
" },\n",
" \"type\": \"image_url\",\n",
" },\n",
" {\"text\": prompt, \"type\": \"text\"},\n",
" ],\n",
" },\n",
" ],\n",
" max_tokens=max_tokens,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "qQ6UUgpHztXZ"
},
"outputs": [],
"source": [
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kBLESIw4zhto"
},
"source": [
"##### Visual Question Answering (VQA)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "UrAybklfzhtz"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Analyze this image and answer the following questions:\n",
"- What is the primary color in the image?\n",
"- What is the overall mood or atmosphere conveyed in the scene?\n",
"- Based on the visual clues, who might have taken the picture?\"\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "VJYIZbGyzhtz"
},
"outputs": [],
"source": [
"response = client.chat.completions.create(\n",
" model=MODEL_ID,\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" {\n",
" \"image_url\": {\n",
" \"url\": \"gs://github-repo/img/gemini/intro/landmark3.jpg\"\n",
" },\n",
" \"type\": \"image_url\",\n",
" },\n",
" {\"text\": prompt, \"type\": \"text\"},\n",
" ],\n",
" },\n",
" ],\n",
" max_tokens=max_tokens,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "9vfWA2i9zwOZ"
},
"outputs": [],
"source": [
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gnrXpv5Y3yFK"
},
"source": [
"### Build with Llama 4 : Car Damage Assessment app using Gradio\n",
"\n",
"In this section, you use Llama 4 to build a simple GenAI powered application for Car Damage Assessment.\n",
"\n",
"In this scenario, the app has to cover the following tasks:\n",
"\n",
"* Classify the type of damage\n",
"* Estimate the damage severity\n",
"* Estimate the damage cost\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kD-Fo_2WRSBt"
},
"source": [
"#### Define the UI functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "tCbyAGF6ZpSN"
},
"outputs": [],
"source": [
"def upload_image_to_bucket(image_path: str) -> str:\n",
" \"\"\"Uploads an image to a Google Cloud Storage bucket.\"\"\"\n",
" try:\n",
" # Create a client for interacting with Google Cloud Storage\n",
" storage_client = storage.Client()\n",
"\n",
" # Get a reference to the bucket\n",
" bucket = storage_client.bucket(BUCKET_NAME)\n",
"\n",
" # Generate a unique blob name based on the file extension\n",
" file_extension = image_path.split(\".\")[-1].lower()\n",
" if file_extension in [\"jpg\", \"jpeg\"]:\n",
" blob_name = f\"car_damage_{uuid.uuid4()}.jpg\"\n",
" else:\n",
" blob_name = f\"car_damage_{uuid.uuid4()}.png\"\n",
"\n",
" # Get a reference to the blob and upload the image\n",
" blob = bucket.blob(blob_name)\n",
" blob.upload_from_filename(image_path)\n",
"\n",
" # Construct the URI of the uploaded image\n",
" image_uri = f\"gs://{BUCKET_NAME}/{blob_name}\"\n",
" return image_uri\n",
"\n",
" except Exception as e:\n",
" print(f\"Error uploading image: {e}\")\n",
"\n",
"\n",
"def parse_json_from_markdown(markdown_text: str) -> dict | None:\n",
" \"\"\"Extracts and parses JSON content embedded within Markdown text.\"\"\"\n",
" json_pattern = r\"```json\\n(.*?)\\n```\"\n",
" match = re.search(json_pattern, markdown_text, re.DOTALL)\n",
"\n",
" if match:\n",
" json_content = match.group(1)\n",
" try:\n",
" parsed_data = json.loads(json_content)\n",
" return parsed_data\n",
" except json.JSONDecodeError as e:\n",
" print(f\"Error: Invalid JSON content found. {e}\")\n",
" return None\n",
" else:\n",
" return None\n",
"\n",
"\n",
"def process_image(image_uri):\n",
" \"\"\"Processes a car damage image using a multimodal LLM.\"\"\"\n",
"\n",
" # Construct the prompt\n",
" prompt = \"\"\"\n",
" Analyze the provided image of a car and provide the following information:\n",
"\n",
" 1. Damage Type: Identify the primary type of damage visible in the image (e.g., dent, scratch, cracked windshield, etc.).\n",
" 2. Severity: Estimate the severity of the damage on a scale of 1 to 5, where 1 is minor and 5 is severe.\n",
" 3. Estimated Repair Cost: Provide an approximate range for the repair cost in USD.\n",
"\n",
" Return the results in JSON format with damagetype, severity, and cost fields.\n",
" \"\"\"\n",
"\n",
" # Call Llama model\n",
" credentials, _ = default()\n",
" auth_request = transport.requests.Request()\n",
" credentials.refresh(auth_request)\n",
"\n",
" client = openai.OpenAI(\n",
" base_url=f\"https://{MAAS_ENDPOINT}/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi\",\n",
" api_key=credentials.token,\n",
" )\n",
" response = client.chat.completions.create(\n",
" model=MODEL_ID,\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" {\"image_url\": {\"url\": image_uri}, \"type\": \"image_url\"},\n",
" {\"text\": prompt, \"type\": \"text\"},\n",
" ],\n",
" },\n",
" ],\n",
" max_tokens=max_tokens,\n",
" )\n",
"\n",
" # Parse the response\n",
" response = response.choices[0].message.content\n",
" output = parse_json_from_markdown(response)\n",
"\n",
" output = {\"damagetype\": \"scratch\", \"severity\": 5, \"cost\": 1000}\n",
" return output[\"damagetype\"], output[\"severity\"], output[\"cost\"]\n",
"\n",
"\n",
"def demo_fn(image_path):\n",
" \"\"\"\n",
" Processes a car damage image using a multimodal LLM.\n",
" \"\"\"\n",
"\n",
" # Upload the image\n",
" image_uri = upload_image_to_bucket(image_path)\n",
"\n",
" # Process the image\n",
" damagetype, severity, cost = process_image(image_uri)\n",
"\n",
" return damagetype, severity, cost"
]
},
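{
"cell_type": "markdown",
"metadata": {
"id": "parser-test-note"
},
"source": [
"Before launching the app, you can exercise `parse_json_from_markdown` on a hand-written sample response to confirm the parsing logic. The payload below is made up for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "parser-test"
},
"outputs": [],
"source": [
"# Exercise the JSON parser on a made-up sample response.\n",
"sample_markdown = \"\"\"```json\n",
"{\"damagetype\": \"dent\", \"severity\": 3, \"cost\": 750}\n",
"```\"\"\"\n",
"\n",
"print(parse_json_from_markdown(sample_markdown))  # {'damagetype': 'dent', 'severity': 3, 'cost': 750}"
]
},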
{
"cell_type": "markdown",
"metadata": {
"id": "lKa_C0QaZqVN"
},
"source": [
"#### Run the application"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "8CSczAFiBamY"
},
"outputs": [],
"source": [
"demo = gr.Interface(\n",
" fn=demo_fn,\n",
" inputs=gr.Image(type=\"filepath\"),\n",
" outputs=[\n",
" gr.Textbox(label=\"Damage Type\"),\n",
" gr.Slider(label=\"Severity\", minimum=1, maximum=10, step=1),\n",
" gr.Number(label=\"Cost\"),\n",
" ],\n",
" title=\"Car Damage Assessment\",\n",
")\n",
"\n",
"demo.launch(debug=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "UmFjQHeYd08T"
},
"outputs": [],
"source": [
"demo.close()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2a4e033321ad"
},
"source": [
"## Cleaning up\n",
"\n",
"Clean up resources created in this notebook.\n",
"\n",
"To delete to the search engine in Vertex AI, check out the following [documentation](https://cloud.google.com/generative-ai-app-builder/docs/delete-engine)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "OC7Ypb05ccUE"
},
"outputs": [],
"source": [
"delete_bucket = False # @param {type:\"boolean\"}\n",
"\n",
"if delete_bucket:\n",
" ! gsutil -m rm -r $BUCKET_NAME"
]
}
],
"metadata": {
"colab": {
"name": "model_garden_openai_api_llama4.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}