
{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "id": "ijGzTHJJUCPY" }, "outputs": [], "source": [ "# Copyright 2024 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "9003470a8d3b" }, "source": [ "# Imagen 3 Image Editing\n", "\n", "<table align=\"left\">\n", " <td style=\"text-align: center\">\n", " <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/imagen3_editing.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Run in Colab\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fvision%2Fgetting-started%2Fimagen3_editing.ipynb\">\n", " <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Run in Colab Enterprise\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/vision/getting-started/imagen3_editing.ipynb\">\n", " <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Vertex AI Workbench\n", " </a>\n", " </td> \n", " <td style=\"text-align: center\">\n", " <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/imagen3_editing.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n", " </a>\n", " </td>\n", "</table>\n", "\n", "<div style=\"clear: both;\"></div>\n", "\n", "<b>Share to:</b>\n", "\n", "<a href=\"https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/imagen3_editing.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg\" alt=\"LinkedIn logo\">\n", "</a>\n", "\n", "<a href=\"https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/imagen3_editing.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg\" alt=\"Bluesky logo\">\n", "</a>\n", "\n", "<a href=\"https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/imagen3_editing.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/5a/X_icon_2.svg\" 
alt=\"X logo\">\n", "</a>\n", "\n", "<a href=\"https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/imagen3_editing.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png\" alt=\"Reddit logo\">\n", "</a>\n", "\n", "<a href=\"https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/imagen3_editing.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg\" alt=\"Facebook logo\">\n", "</a>" ] }, { "cell_type": "markdown", "metadata": { "id": "G1KDmM_PBAXz" }, "source": [ "| Author |\n", "| --- |\n", "| [Katie Nguyen](https://github.com/katiemn) |" ] }, { "cell_type": "markdown", "metadata": { "id": "CkHPv2myT2cx" }, "source": [ "## Overview\n", "\n", "### Imagen 3\n", "\n", "Imagen 3 on Vertex AI brings Google's state of the art generative AI capabilities to application developers. Imagen 3 is Google's highest quality text-to-image model to date. It's capable of creating images with astonishing detail. Thus, developers have more control when building next-generation AI products that transform their imagination into high quality visual assets. Learn more about [Imagen on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview).\n" ] }, { "cell_type": "markdown", "metadata": { "id": "DrkcqHrrwMAo" }, "source": [ "In this tutorial, you will learn how to use the Google Gen AI SDK for Python to interact with Imagen 3 and modify existing images with mask-based editing and mask-free editing in the following modes:\n", "\n", "- Inpainting\n", "- Product background editing\n", "- Outpainting\n", "- Mask-free" ] }, { "cell_type": "markdown", "metadata": { "id": "r11Gu7qNgx1p" }, "source": [ "## Get started\n" ] }, { "cell_type": "markdown", "metadata": { "id": "No17Cw5hgx12" }, "source": [ "### Install Google Gen AI SDK for Python\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tFy3H3aPgx12" }, "outputs": [], "source": [ "%pip install --upgrade --quiet google-genai" ] }, { "cell_type": "markdown", "metadata": { "id": "dmWOrTJ3gx13" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "\n", "If you are running this notebook on Google Colab, run the following cell to authenticate your environment.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NyKGtVQjgx13" }, "outputs": [], "source": [ "import sys\n", "\n", "if \"google.colab\" in sys.modules:\n", " from google.colab import auth\n", "\n", " auth.authenticate_user()" ] }, { "cell_type": "markdown", "metadata": { "id": "Ua6PDqB1iBSb" }, "source": [ "### Import libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "yTiDo0lRh6sc" }, "outputs": [], "source": [ "from google import genai\n", "from google.genai.types import (\n", " EditImageConfig,\n", " GenerateImagesConfig,\n", " Image,\n", " MaskReferenceConfig,\n", " MaskReferenceImage,\n", " RawReferenceImage,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "DF4l8DTdWgPY" }, "source": [ "### Set Google Cloud project information and create client\n", "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n", "\n", "Learn more about 
[setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Nqwi-5ufWp_B" }, "outputs": [], "source": [ "import os\n", "\n", "PROJECT_ID = \"[your-project-id]\" # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n", "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n", " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n", "\n", "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n", "\n", "client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)" ] }, { "cell_type": "markdown", "metadata": { "id": "Sr2Y3lFwKW1M" }, "source": [ "### Define helper functions" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "r_38e5rRKB6s" }, "outputs": [], "source": [ "import io\n", "import urllib.request\n", "\n", "from PIL import Image as PIL_Image\n", "import matplotlib.pyplot as plt\n", "\n", "\n", "# Gets the image bytes from a PIL Image object.\n", "def get_bytes_from_pil(image: PIL_Image.Image) -> bytes:\n", " byte_io_png = io.BytesIO()\n", " image.save(byte_io_png, \"PNG\")\n", " return byte_io_png.getvalue()\n", "\n", "\n", "# Pads an image for outpainting.\n", "def pad_to_target_size(\n", " source_image,\n", " target_size=(1536, 1536),\n", " mode=\"RGB\",\n", " vertical_offset_ratio=0,\n", " horizontal_offset_ratio=0,\n", " fill_val=255,\n", "):\n", " orig_image_size_w, orig_image_size_h = source_image.size\n", " target_size_w, target_size_h = target_size\n", "\n", " insert_pt_x = (target_size_w - orig_image_size_w) // 2 + int(\n", " horizontal_offset_ratio * target_size_w\n", " )\n", " insert_pt_y = (target_size_h - orig_image_size_h) // 2 + int(\n", " vertical_offset_ratio * target_size_h\n", " )\n", " insert_pt_x = min(insert_pt_x, target_size_w - orig_image_size_w)\n", " insert_pt_y = min(insert_pt_y, target_size_h - orig_image_size_h)\n", "\n", " if mode == \"RGB\":\n", " source_image_padded = PIL_Image.new(\n", " mode, target_size, color=(fill_val, fill_val, fill_val)\n", " )\n", " elif mode == \"L\":\n", " source_image_padded = PIL_Image.new(mode, target_size, color=(fill_val))\n", " else:\n", " raise ValueError(\"source image mode must be RGB or L.\")\n", "\n", " source_image_padded.paste(source_image, (insert_pt_x, insert_pt_y))\n", " return source_image_padded\n", "\n", "\n", "# Pads and resizes image and mask to the same target size.\n", "def pad_image_and_mask(\n", " image_vertex: PIL_Image.Image,\n", " mask_vertex: PIL_Image.Image,\n", " target_size,\n", " vertical_offset_ratio,\n", " horizontal_offset_ratio,\n", "):\n", " image_vertex.thumbnail(target_size)\n", " mask_vertex.thumbnail(target_size)\n", "\n", " image_vertex = pad_to_target_size(\n", " image_vertex,\n", " target_size=target_size,\n", " mode=\"RGB\",\n", " vertical_offset_ratio=vertical_offset_ratio,\n", " horizontal_offset_ratio=horizontal_offset_ratio,\n", " fill_val=0,\n", " )\n", " mask_vertex = pad_to_target_size(\n", " mask_vertex,\n", " target_size=target_size,\n", " mode=\"L\",\n", " vertical_offset_ratio=vertical_offset_ratio,\n", " horizontal_offset_ratio=horizontal_offset_ratio,\n", " fill_val=255,\n", " )\n", " return image_vertex, mask_vertex\n", "\n", "\n", "# Displays the original and edited images side by side.\n", "def display_images(original_image, modified_image) -> None:\n", " fig, axis = plt.subplots(1, 2, figsize=(12, 6))\n", " axis[0].imshow(original_image)\n", " axis[0].set_title(\"Original Image\")\n", " axis[1].imshow(modified_image)\n", " axis[1].set_title(\"Edited Image\")\n", " for ax in axis:\n", " ax.axis(\"off\")\n", " plt.show()" ] }, 
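{ "cell_type": "markdown", "metadata": {}, "source": [ "If you want to keep any of the images you generate or edit below, you can write an SDK ```Image``` to disk through its raw bytes. The helper below is a minimal sketch, not part of the SDK: ```save_image_bytes``` and the example filename are illustrative, and it assumes the ```Image``` carries inline bytes (an ```Image``` referenced only by ```gcs_uri``` may not)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hedged sketch: persist an SDK Image via its raw bytes.\n", "# Assumes image.image_bytes is populated (true for generated or edited\n", "# results; an Image built only from a gcs_uri may carry no bytes).\n", "def save_image_bytes(image: Image, filename: str) -> None:\n", " with open(filename, \"wb\") as f:\n", " f.write(image.image_bytes)\n", "\n", "\n", "# Example, after running any edit below:\n", "# save_image_bytes(edited_image.generated_images[0].image, \"edited.png\")" ] }, 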
axis[1].set_title(\"Edited Image\")\n", " for ax in axis:\n", " ax.axis(\"off\")\n", " plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "VLmwIj2RD0Fx" }, "source": [ "### Load the image models\n", "\n", "Imagen 3 Generation: `imagen-3.0-generate-002`\n", "\n", "Imagen 3 Editing: `imagen-3.0-capability-001`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "F-gd2ypQhh7K" }, "outputs": [], "source": [ "generation_model = \"imagen-3.0-generate-002\"\n", "\n", "edit_model = \"imagen-3.0-capability-001\"" ] }, { "cell_type": "markdown", "metadata": { "id": "f64d92aef6cb" }, "source": [ "### Inpainting insert\n", "\n", "In these examples you will specify a targeted area to apply edits to. In the case of inpainting insert, you'll use a mask area to add image content to an existing image. Start by generating an image using Imagen 3. Then create two ```ReferenceImage``` objects, one for your reference image and one for your mask. For the ```MaskReferenceImage``` set ```reference_image=None```, this will allow for automatic mask detection based on the specified ```mask_mode```.\n", "\n", "When generating images you can also set the `safety_filter_level` and `person_generation` parameters accordingly:\n", "* `person_generation`: DONT_ALLOW, ALLOW_ADULT, ALLOW_ALL\n", "* `safety_filter_level`: BLOCK_LOW_AND_ABOVE, BLOCK_MEDIUM_AND_ABOVE, BLOCK_ONLY_HIGH, BLOCK_NONE" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wwZBW0UW-PiW" }, "outputs": [], "source": [ "image_prompt = \"\"\"\n", "a small wooden bowl with grapes and apples on a marble kitchen counter, light brown cabinets blurred in the background\n", "\"\"\"\n", "generated_image = client.models.generate_images(\n", " model=generation_model,\n", " prompt=image_prompt,\n", " config=GenerateImagesConfig(\n", " number_of_images=1,\n", " aspect_ratio=\"1:1\",\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"DONT_ALLOW\",\n", " ),\n", ")\n", "\n", "edit_prompt = \"a small white ceramic bowl with lemons and limes\"\n", "raw_ref_image = RawReferenceImage(\n", " reference_image=generated_image.generated_images[0].image, reference_id=0\n", ")\n", "mask_ref_image = MaskReferenceImage(\n", " reference_id=1,\n", " reference_image=None,\n", " config=MaskReferenceConfig(\n", " mask_mode=\"MASK_MODE_FOREGROUND\",\n", " mask_dilation=0.1,\n", " ),\n", ")\n", "edited_image = client.models.edit_image(\n", " model=edit_model,\n", " prompt=edit_prompt,\n", " reference_images=[raw_ref_image, mask_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_INPAINT_INSERTION\",\n", " number_of_images=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "display_images(\n", " generated_image.generated_images[0].image._pil_image,\n", " edited_image.generated_images[0].image._pil_image,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "ec7135f4de3d" }, "source": [ "This next example demonstrates another instance of inpainting insert. However, you'll use the semantic mask mode. When using this mask mode, you'll need to specify the class ID of the object in the image that you wish to mask and replace. A list of possible instance types is shown at the end of this notebook. Once you've found the correct segmentation class ID, list it in ```segmentation_classes```.\n", "\n", "Within the ```MaskReferenceImage``` object you can also configure the dilation value. 
{ "cell_type": "code", "execution_count": null, "metadata": { "id": "8pyAJlvQsocc" }, "outputs": [], "source": [ "image_prompt = \"\"\"\n", "a french bulldog sitting in a living room on a couch with green throw pillows and a throw blanket,\n", "a circular mirror is on the wall above the couch\n", "\"\"\"\n", "generated_image = client.models.generate_images(\n", " model=generation_model,\n", " prompt=image_prompt,\n", " config=GenerateImagesConfig(\n", " number_of_images=1,\n", " aspect_ratio=\"1:1\",\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"DONT_ALLOW\",\n", " ),\n", ")\n", "\n", "edit_prompt = \"a corgi sitting on a couch\"\n", "raw_ref_image = RawReferenceImage(\n", " reference_image=generated_image.generated_images[0].image, reference_id=0\n", ")\n", "mask_ref_image = MaskReferenceImage(\n", " reference_id=1,\n", " reference_image=None,\n", " config=MaskReferenceConfig(\n", " mask_mode=\"MASK_MODE_SEMANTIC\",\n", " segmentation_classes=[8],\n", " mask_dilation=0.1,\n", " ),\n", ")\n", "edited_image = client.models.edit_image(\n", " model=edit_model,\n", " prompt=edit_prompt,\n", " reference_images=[raw_ref_image, mask_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_INPAINT_INSERTION\",\n", " number_of_images=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "display_images(\n", " generated_image.generated_images[0].image._pil_image,\n", " edited_image.generated_images[0].image._pil_image,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "AnneP_O4vdL1" }, "source": [ "Below is another instance of inpainting insert. This time, you'll use a local image and mask downloaded from Google Cloud Storage. When providing your own mask, specify ```MASK_MODE_USER_PROVIDED``` as the ```mask_mode```. If you don't already have a mask image, the sketch that follows shows one way to draw one with PIL." ] }, 
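{ "cell_type": "markdown", "metadata": {}, "source": [ "A user-provided mask is a grayscale image in which white (255) marks the region to edit and black (0) marks the content to preserve. Here is a minimal, hedged sketch of drawing a rectangular mask with PIL; the canvas size and rectangle coordinates are illustrative placeholders, not values the later cells rely on." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from PIL import ImageDraw\n", "\n", "# Hedged sketch: draw a rectangular edit region onto a black canvas.\n", "# White (255) = area to edit; black (0) = area to keep.\n", "# The canvas size and coordinates below are placeholders.\n", "mask_pil = PIL_Image.new(\"L\", (1024, 1024), 0)\n", "ImageDraw.Draw(mask_pil).rectangle([300, 400, 700, 900], fill=255)\n", "custom_mask = Image(image_bytes=get_bytes_from_pil(mask_pil))" ] }, 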
gsutil cp \"gs://cloud-samples-data/generative-ai/image/image-dog-mask.png\" .\n", "initial_image = Image.from_file(location=\"image-dog.png\")\n", "initial_image_mask = Image.from_file(location=\"image-dog-mask.png\")\n", "\n", "edit_prompt = \"a Persian cat sitting in a white cat bed\"\n", "raw_ref_image = RawReferenceImage(reference_image=initial_image, reference_id=0)\n", "mask_ref_image = MaskReferenceImage(\n", " reference_id=1,\n", " reference_image=initial_image_mask,\n", " config=MaskReferenceConfig(\n", " mask_mode=\"MASK_MODE_USER_PROVIDED\",\n", " mask_dilation=0.1,\n", " ),\n", ")\n", "\n", "edited_image = client.models.edit_image(\n", " model=edit_model,\n", " prompt=edit_prompt,\n", " reference_images=[raw_ref_image, mask_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_INPAINT_INSERTION\",\n", " number_of_images=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "display_images(\n", " initial_image._pil_image, edited_image.generated_images[0].image._pil_image\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "6ad62258e803" }, "source": [ "### Inpainting remove\n", "\n", "Inpainting remove allows you to use a mask area to remove image content.\n", "\n", "In this next example, you'll take an image in Google Cloud Storage of a wall with a mirror and some photos and create a mask over detected mirror instances. You'll then remove this object by setting the edit mode to \"EDIT_MODE_INPAINT_REMOVAL.\" For these types of requests the prompt can be an empty string." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mOdKM3ZCB4g6" }, "outputs": [], "source": [ "starting_image = Image(gcs_uri=\"gs://cloud-samples-data/generative-ai/image/mirror.png\")\n", "raw_ref_image = RawReferenceImage(reference_image=starting_image, reference_id=0)\n", "mask_ref_image = MaskReferenceImage(\n", " reference_id=1,\n", " reference_image=None,\n", " config=MaskReferenceConfig(\n", " mask_mode=\"MASK_MODE_SEMANTIC\", segmentation_classes=[85]\n", " ),\n", ")\n", "\n", "remove_image = client.models.edit_image(\n", " model=edit_model,\n", " prompt=\"\",\n", " reference_images=[raw_ref_image, mask_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_INPAINT_REMOVAL\",\n", " number_of_images=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "starting_image_show = PIL_Image.open(\n", " urllib.request.urlopen(\n", " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/mirror.png\"\n", " )\n", ")\n", "\n", "display_images(\n", " starting_image_show,\n", " remove_image.generated_images[0].image._pil_image,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "68909c926952" }, "source": [ "### Product background editing via background swap mode\n", "\n", "\n", "You can also use Imagen 3 for product image editing. By setting `edit_mode` to \"EDIT_MODE_BGSWAP\", you can maintain the product content while modifying the image background.\n", "\n", "For this example, start with an image stored in a Google Cloud Storage bucket, and provide a prompt describing the new background scene. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "S5zv8PqYweHW" }, "outputs": [], "source": [ "product_image = Image(\n", " gcs_uri=\"gs://cloud-samples-data/generative-ai/image/suitcase.png\"\n", ")\n", "raw_ref_image = RawReferenceImage(reference_image=product_image, reference_id=0)\n", "mask_ref_image = MaskReferenceImage(\n", " reference_id=1,\n", " reference_image=None,\n", " config=MaskReferenceConfig(mask_mode=\"MASK_MODE_BACKGROUND\"),\n", ")\n", "\n", "prompt = \"a light blue suitcase in front of a window in an airport, lots of bright, natural lighting coming in from the windows, planes taking off in the distance\"\n", "edited_image = client.models.edit_image(\n", " model=edit_model,\n", " prompt=prompt,\n", " reference_images=[raw_ref_image, mask_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_BGSWAP\",\n", " number_of_images=1,\n", " seed=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "product_image_show = PIL_Image.open(\n", " urllib.request.urlopen(\n", " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/suitcase.png\"\n", " )\n", ")\n", "display_images(product_image_show, edited_image.generated_images[0].image._pil_image)" ] }, { "cell_type": "markdown", "metadata": { "id": "76df73e7bbd2" }, "source": [ "### Outpainting\n", "\n", "Imagen 3 editing can be used for image outpainting. Outpainting is used to expand the content of an image to a larger area or area with different dimensions. To use the outpainting feature, you must create an image mask and prepare the original image by padding some empty space around it. Once you've padded the image, you can use the ```outpainting``` editing mode to fill in the empty space." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "yUevoHxxIsIN" }, "outputs": [], "source": [ "! 
gsutil cp \"gs://cloud-samples-data/generative-ai/image/living-room.png\" .\n", "initial_image = Image.from_file(location=\"living-room.png\")\n", "mask = PIL_Image.new(\"L\", initial_image._pil_image.size, 0)\n", "\n", "target_size_w = int(2500 * eval(\"3/4\"))\n", "target_size = (target_size_w, 2500)\n", "image_pil_outpaint, mask_pil_outpaint = pad_image_and_mask(\n", " initial_image._pil_image,\n", " mask,\n", " target_size,\n", " 0,\n", " 0,\n", ")\n", "image_pil_outpaint_image = Image(image_bytes=get_bytes_from_pil(image_pil_outpaint))\n", "mask_pil_outpaint_image = Image(image_bytes=get_bytes_from_pil(mask_pil_outpaint))\n", "\n", "raw_ref_image = RawReferenceImage(\n", " reference_image=image_pil_outpaint_image, reference_id=0\n", ")\n", "mask_ref_image = MaskReferenceImage(\n", " reference_id=1,\n", " reference_image=mask_pil_outpaint_image,\n", " config=MaskReferenceConfig(\n", " mask_mode=\"MASK_MODE_USER_PROVIDED\",\n", " mask_dilation=0.03,\n", " ),\n", ")\n", "\n", "prompt = \"a chandelier hanging from the ceiling\"\n", "edited_image = client.models.edit_image(\n", " model=edit_model,\n", " prompt=prompt,\n", " reference_images=[raw_ref_image, mask_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_OUTPAINT\",\n", " number_of_images=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "display_images(\n", " initial_image._pil_image, edited_image.generated_images[0].image._pil_image\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "D3A7GIQBjSQX" }, "source": [ "### Mask-free editing\n", "\n", "Imagen 3 editing also lets you edit images without a mask. Simply write the changes you wish to make to the image in the prompt and provide the original image as the sole reference image." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "E8df7clFJNRa" }, "outputs": [], "source": [ "original_image = Image(gcs_uri=\"gs://cloud-samples-data/generative-ai/image/latte.jpg\")\n", "raw_ref_image = RawReferenceImage(reference_image=original_image, reference_id=0)\n", "\n", "\n", "prompt = \"swan latte art in the coffee cup and an assortment of red velvet cupcakes in gold wrappers on the white plate\"\n", "edited_image = client.models.edit_image(\n", " model=edit_model,\n", " prompt=prompt,\n", " reference_images=[raw_ref_image],\n", " config=EditImageConfig(\n", " edit_mode=\"EDIT_MODE_DEFAULT\",\n", " number_of_images=1,\n", " safety_filter_level=\"BLOCK_MEDIUM_AND_ABOVE\",\n", " person_generation=\"ALLOW_ADULT\",\n", " ),\n", ")\n", "\n", "original_image_show = PIL_Image.open(\n", " urllib.request.urlopen(\n", " \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/latte.jpg\"\n", " )\n", ")\n", "display_images(original_image_show, edited_image.generated_images[0].image._pil_image)" ] }, { "cell_type": "markdown", "metadata": { "id": "5c37c0c51d8f" }, "source": [ "### Semantic segmentation classes\n", "\n", "| Class ID | Instance Type | Class ID | Instance Type | Class ID | Instance Type | Class ID | Instance Type |\n", "| --- | --- | --- | --- | --- | --- | --- | --- |\n", "| 0 | backpack | 50 | carrot | 100 | sidewalk_pavement | 150 | skis |\n", "| 1 | umbrella | 51 | hot_dog | 101 | runway | 151 | snowboard |\n", "| 2 | bag | 52 | pizza | 102 | terrain | 152 | sports_ball |\n", "| 3 | tie | 53 | donut | 103 | book | 153 | kite |\n", "| 4 | suitcase | 54 | cake | 104 | box | 154 | baseball_bat |\n", "| 5 | case | 55 | fruit_other | 105 | clock | 155 | baseball_glove |\n", "| 6 | bird | 56 | food_other | 106 | vase | 156 | skateboard |\n", "| 7 | cat | 57 | chair_other | 107 | scissors | 157 | surfboard |\n", "| 8 | dog | 58 | armchair | 108 | plaything_other | 158 | tennis_racket |\n", "| 9 | horse | 59 | swivel_chair | 109 | teddy_bear | 159 | net |\n", "| 10 | sheep | 60 | stool | 110 | hair_dryer | 160 | base |\n", "| 11 | cow | 61 | seat | 111 | toothbrush | 161 | sculpture |\n", "| 12 | elephant | 62 | couch | 112 | painting | 162 | column |\n", "| 13 | bear | 63 | trash_can | 113 | poster | 163 | fountain |\n", "| 14 | zebra | 64 | potted_plant | 114 | bulletin_board | 164 | awning |\n", "| 15 | giraffe | 65 | nightstand | 115 | bottle | 165 | apparel |\n", "| 16 | animal_other | 66 | bed | 116 | cup | 166 | banner |\n", "| 17 | microwave | 67 | table | 117 | wine_glass | 167 | flag |\n", "| 18 | radiator | 68 | pool_table | 118 | knife | 168 | blanket |\n", "| 19 | oven | 69 | barrel | 119 | fork | 169 | curtain_other |\n", "| 20 | toaster | 70 | desk | 120 | spoon | 170 | shower_curtain |\n", "| 21 | storage_tank | 71 | ottoman | 121 | bowl | 171 | pillow |\n", "| 22 | conveyor_belt | 72 | wardrobe | 122 | tray | 172 | towel |\n", "| 23 | sink | 73 | crib | 123 | range_hood | 173 | rug_floormat |\n", "| 24 | refrigerator | 74 | basket | 124 | plate | 174 | vegetation |\n", "| 25 | washer_dryer | 75 | chest_of_drawers | 125 | person | 175 | bicycle |\n", "| 26 | fan | 76 | bookshelf | 126 | rider_other | 176 | car |\n", "| 27 | dishwasher | 77 | counter_other | 127 | bicyclist | 177 | autorickshaw |\n", "| 28 | toilet | 78 | bathroom_counter | 128 | motorcyclist | 178 | motorcycle |\n", "| 29 | bathtub | 79 | kitchen_island | 129 | paper | 179 | airplane |\n", "| 30 | shower | 80 | door | 130 | streetlight | 180 | bus |\n", "| 31 
| tunnel | 81 | light_other | 131 | road_barrier | 181 | train |\n", "| 32 | bridge | 82 | lamp | 132 | mailbox | 182 | truck |\n", "| 33 | pier_wharf | 83 | sconce | 133 | cctv_camera | 183 | trailer |\n", "| 34 | tent | 84 | chandelier | 134 | junction_box | 184 | boat_ship |\n", "| 35 | building | 85 | mirror | 135 | traffic_sign | 185 | slow_wheeled_object |\n", "| 36 | ceiling | 86 | whiteboard | 136 | traffic_light | 186 | river_lake |\n", "| 37 | laptop | 87 | shelf | 137 | fire_hydrant | 187 | sea |\n", "| 38 | keyboard | 88 | stairs | 138 | parking_meter | 188 | water_other |\n", "| 39 | mouse | 89 | escalator | 139 | bench | 189 | swimming_pool |\n", "| 40 | remote | 90 | cabinet | 140 | bike_rack | 190 | waterfall |\n", "| 41 | cell phone | 91 | fireplace | 141 | billboard | 191 | wall |\n", "| 42 | television | 92 | stove | 142 | sky | 192 | window |\n", "| 43 | floor | 93 | arcade_machine | 143 | pole | 193 | window_blind |\n", "| 44 | stage | 94 | gravel | 144 | fence | | |\n", "| 45 | banana | 95 | platform | 145 | railing_banister | | |\n", "| 46 | apple | 96 | playingfield | 146 | guard_rail | | |\n", "| 47 | sandwich | 97 | railroad | 147 | mountain_hill | | |\n", "| 48 | orange | 98 | road | 148 | rock | | |\n", "| 49 | broccoli | 99 | snow | 149 | frisbee | | |\n" ] } ], "metadata": { "colab": { "name": "imagen3_editing.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }