{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "KjTHAV8FgEza"
},
"outputs": [],
"source": [
"# Copyright 2025 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UdL4uvQQs76x"
},
"source": [
"# Retail Product Customization with Imagen and Veo\n",
"\n",
"<table align=\"left\">\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/vision/use-cases/retail_with_veo_and_imagen.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Run in Colab\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fvision%2Fuse-cases%2Fretail_with_veo_and_imagen.ipynb\">\n",
" <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Run in Colab Enterprise\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/vision/use-cases/retail_with_veo_and_imagen.ipynb\">\n",
" <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Vertex AI Workbench\n",
" </a>\n",
" </td> \n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/use-cases/retail_with_veo_and_imagen.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n",
" </a>\n",
" </td>\n",
"</table>\n",
"\n",
"<div style=\"clear: both;\"></div>\n",
"\n",
"<b>Share to:</b>\n",
"\n",
"<a href=\"https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/use-cases/retail_with_veo_and_imagen.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg\" alt=\"LinkedIn logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/use-cases/retail_with_veo_and_imagen.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg\" alt=\"Bluesky logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/use-cases/retail_with_veo_and_imagen.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/53/X_logo_2023_original.svg\" alt=\"X logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/use-cases/retail_with_veo_and_imagen.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png\" alt=\"Reddit logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/use-cases/retail_with_veo_and_imagen.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg\" alt=\"Facebook logo\">\n",
"</a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lUCUMoTmN_lJ"
},
"source": [
"| Author |\n",
"| --- |\n",
"| [Katie Nguyen](https://github.com/katiemn) |"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rDjAqcgigwdX"
},
"source": [
"## Overview\n",
"\n",
"### Veo 2\n",
"\n",
"[Veo 2 on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/video/generate-videos) brings Google's state-of-the-art video generation capabilities to application developers. It's capable of creating videos with astonishing detail that simulate real-world physics across a wide range of visual styles.\n",
"\n",
"In this tutorial, you will walk through a retail use case that illustrates how you can use Veo and Imagen to:\n",
"- Start with existing product images\n",
"- Edit product photo backgrounds\n",
"- Add motion to create assets that can be used for product catalogues, recommendation engines, or advertisements\n",
"\n",
"You will utilize the Google Gen AI SDK for Python to interact with Veo 2 and Imagen 3 to generate new product images and videos."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "v2_iOv5uhXVg"
},
"source": [
"## Get started"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "K4uerc9Xhf1f"
},
"source": [
"### Install Google Gen AI SDK for Python and other libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rJyFNKoQhiwF"
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet google-genai mediapy"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GWYnCW0-h6HI"
},
"source": [
"### Authenticate your notebook environment (Colab only)\n",
"\n",
"If you are running this notebook on Google Colab, run the following cell to authenticate your environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bqz5LUG6h8fA"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
" from google.colab import auth\n",
"\n",
" auth.authenticate_user()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LVrasKoriKZn"
},
"source": [
"### Import libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "oMQf_BkyiMgF"
},
"outputs": [],
"source": [
"import time\n",
"import urllib\n",
"\n",
"from PIL import Image as PIL_Image\n",
"from google import genai\n",
"from google.genai import types\n",
"import matplotlib.pyplot as plt\n",
"import mediapy as media"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yxBkUEqdiB1g"
},
"source": [
"### Set Google Cloud project information and create client\n",
"\n",
"To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
"\n",
"Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GtjPBmYHiEfx"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"PROJECT_ID = \"[your-project-id]\" # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n",
"if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n",
" PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
"\n",
"LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n",
"\n",
"client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qD_bwA9hiMzL"
},
"source": [
"### Define helper functions to display media"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GUrEwbvFiPhJ"
},
"outputs": [],
"source": [
"def display_images(image: PIL_Image.Image) -> None:\n",
" _, axis = plt.subplots(1, 1, figsize=(12, 6))\n",
" axis.imshow(image)\n",
" axis.set_title(\"Product Image\")\n",
" axis.axis(\"off\")\n",
" plt.show()\n",
"\n",
"\n",
"def show_video(gcs_uri: str) -> None:\n",
" file_name = gcs_uri.split(\"/\")[-1]\n",
" !gsutil cp {gcs_uri} {file_name}\n",
" media.show_video(media.read_video(file_name), height=500)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2jaSOOadiUj6"
},
"source": [
"### Load the generative media models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "APRfTklCiYR2"
},
"outputs": [],
"source": [
"VIDEO_MODEL_ID = \"veo-2.0-generate-001\"\n",
"IMAGE_EDIT_MODEL_ID = \"imagen-3.0-capability-001\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aDaTx8WCidRG"
},
"source": [
"### Load the original product image\n",
"\n",
"Start by loading the initial product image. In this case, you'll be loading an image of a suitcase saved in Google Cloud Storage.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "EBPdDCRe-tDn"
},
"outputs": [],
"source": [
"PRODUCT_IMAGE = (\n",
" \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/suitcase.png\"\n",
")\n",
"product_image = types.Image(gcs_uri=PRODUCT_IMAGE)\n",
"display_images(PIL_Image.open(urllib.request.urlopen(PRODUCT_IMAGE)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uFkUiApjYU3F"
},
"source": [
"### Edit the product background\n",
"\n",
"Next, create a background mask and supply a prompt to guide background replacement of the product image. To preserve the original product, set `edit_mode` to `EDIT_MODE_BGSWAP`.\n",
"\n",
"Finally, save the edited image locally to reference it later in your video generation prompt."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "09ckASrKNuqF"
},
"outputs": [],
"source": [
"raw_ref_image = types.RawReferenceImage(reference_image=product_image, reference_id=0)\n",
"mask_ref_image = types.MaskReferenceImage(\n",
" reference_id=1,\n",
" reference_image=None,\n",
" config=types.MaskReferenceConfig(\n",
" mask_mode=types.MaskReferenceMode.MASK_MODE_BACKGROUND\n",
" ),\n",
")\n",
"\n",
"prompt = \"a busy airport\"\n",
"edited_image = client.models.edit_image(\n",
" model=IMAGE_EDIT_MODEL_ID,\n",
" prompt=prompt,\n",
" reference_images=[raw_ref_image, mask_ref_image],\n",
" config=types.EditImageConfig(\n",
" edit_mode=types.EditMode.EDIT_MODE_BGSWAP,\n",
" number_of_images=1,\n",
" seed=1,\n",
" safety_filter_level=types.SafetyFilterLevel.BLOCK_MEDIUM_AND_ABOVE,\n",
" person_generation=types.PersonGeneration.ALLOW_ADULT,\n",
" ),\n",
")\n",
"\n",
"edited_product_image = edited_image.generated_images[0].image\n",
"display_images(edited_product_image._pil_image)\n",
"edited_product_image.save(\"edited_product_image.png\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YysYLyiVj8Zd"
},
"source": [
"### Generate videos from an image input\n",
"\n",
"Now, you'll take the edited image and add motion with Veo. To generate a video in the following sample, specify the following:\n",
"\n",
"- **File location:** The generated video will be displayed below using the previously defined helper function. It will also be stored in Cloud Storage once video generation is complete. Specify the Cloud Storage path where you'd like this video to be stored in the `output_gcs` field.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5gNxNj9N2R4o"
},
"outputs": [],
"source": [
"output_gcs = \"gs://\" # @param {type: 'string'}\n",
"img = types.Image.from_file(location=\"edited_product_image.png\")\n",
"\n",
"operation = client.models.generate_videos(\n",
" model=VIDEO_MODEL_ID,\n",
" image=types.Image(\n",
" image_bytes=img.image_bytes,\n",
" mime_type=\"image/png\",\n",
" ),\n",
" config=types.GenerateVideosConfig(\n",
" output_gcs_uri=output_gcs,\n",
" aspect_ratio=\"9:16\",\n",
" number_of_videos=1,\n",
" person_generation=\"allow_adult\",\n",
" ),\n",
")\n",
"\n",
"while not operation.done:\n",
" time.sleep(15)\n",
" operation = client.operations.get(operation)\n",
" print(operation)\n",
"\n",
"if operation.response:\n",
" show_video(operation.result.generated_videos[0].video.uri)"
]
}
],
"metadata": {
"colab": {
"name": "retail_with_veo_and_imagen.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}