vision/getting-started/image_generation.ipynb (622 lines of code) (raw):

{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "id": "uxCkB_DXTHzf" }, "outputs": [], "source": [ "# Copyright 2023 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "Hny4I-ODTIS6" }, "source": [ "# Image Generation with Imagen on Vertex AI\n", "\n", "> **NOTE:** This Notebook uses Imagen 2. Google recommends using [Imagen 3](https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview) for Image Generation tasks.\n", "\n", "<table align=\"left\">\n", " <td style=\"text-align: center\">\n", " <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/image_generation.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fvision%2Fgetting-started%2Fimage_generation.ipynb\">\n", " <img width=\"32px\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n", " </a>\n", " </td> \n", " <td style=\"text-align: center\">\n", " <a 
href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/vision/getting-started/image_generation.ipynb\">\n", " <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/image_generation.ipynb\">\n", " <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n", " </a>\n", " </td>\n", "</table>\n", "\n", "<div style=\"clear: both;\"></div>\n", "\n", "<b>Share to:</b>\n", "\n", "<a href=\"https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/image_generation.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg\" alt=\"LinkedIn logo\">\n", "</a>\n", "\n", "<a href=\"https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/image_generation.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg\" alt=\"Bluesky logo\">\n", "</a>\n", "\n", "<a href=\"https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/image_generation.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/5a/X_icon_2.svg\" alt=\"X logo\">\n", "</a>\n", "\n", "<a href=\"https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/image_generation.ipynb\" 
target=\"_blank\">\n", " <img width=\"20px\" src=\"https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png\" alt=\"Reddit logo\">\n", "</a>\n", "\n", "<a href=\"https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/vision/getting-started/image_generation.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg\" alt=\"Facebook logo\">\n", "</a> " ] }, { "cell_type": "markdown", "metadata": { "id": "048a84091064" }, "source": [ "| | |\n", "|-|-|\n", "|Author(s) | [Thu Ya Kyaw](https://github.com/iamthuya) |" ] }, { "cell_type": "markdown", "metadata": { "id": "-nLS57E2TO5y" }, "source": [ "## Overview\n", "\n", "[Imagen on Vertex AI](https://cloud.google.com/vertex-ai/docs/generative-ai/image/overview) brings Google's state-of-the-art generative AI capabilities to application developers. With Imagen on Vertex AI, application developers can build next-generation AI products that transform their users' imagination into high-quality visual assets in seconds.\n", "\n", "With Imagen, you can do the following:\n", "- Generate novel images using only a text prompt (text-to-image generation).\n", "- Edit an entire uploaded or generated image with a text prompt.\n", "- Edit only parts of an uploaded or generated image using a mask area you define.\n", "- Upscale existing, generated, or edited images.\n", "- Fine-tune a model with a specific subject (for example, a specific handbag or shoe) for image generation.\n", "- Get text descriptions of images with visual captioning.\n", "- Get answers to a question about an image with Visual Question Answering (VQA).\n", "\n", "This notebook focuses on **image generation** only. 
You can read more about Imagen's image generation feature [here](https://cloud.google.com/vertex-ai/docs/generative-ai/image/generate-images).\n" ] }, { "cell_type": "markdown", "metadata": { "id": "iXsvgIuwTPZw" }, "source": [ "### Objectives\n", "\n", "In this notebook, you will explore the image generation features of Imagen using the Vertex AI Python SDK. You will:\n", "\n", "- generate images using text prompts\n", "- experiment with different parameters, such as:\n", "  - increasing the number of images to be generated\n", "  - fixing a seed number for reproducibility\n", "  - influencing the output images using negative prompts" ] }, { "cell_type": "markdown", "metadata": { "id": "skXAu__iqks_" }, "source": [ "### Costs\n", "\n", "- This notebook uses billable components of Google Cloud:\n", "  - Vertex AI (Imagen)\n", "\n", "- Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage." ] }, { "cell_type": "markdown", "metadata": { "id": "mvKl-BtQTRiQ" }, "source": [ "## Getting Started" ] }, { "cell_type": "markdown", "metadata": { "id": "PwFMpIMrTV_4" }, "source": [ "### Install Vertex AI SDK for Python" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "WYUu8VMdJs3V" }, "outputs": [], "source": [ "%pip install --quiet --upgrade --user google-cloud-aiplatform" ] }, { "cell_type": "markdown", "metadata": { "id": "opUxT_k5TdgP" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "\n", "If you are running this notebook on Google Colab, run the following cell to authenticate your environment. This step is not required if you are using [Vertex AI Workbench](https://cloud.google.com/vertex-ai-workbench)." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "vbNgv4q1T2Mi" }, "outputs": [], "source": [ "import sys\n", "\n", "# Additional authentication is required for Google Colab\n", "if \"google.colab\" in sys.modules:\n", "    # Authenticate user to Google Cloud\n", "    from google.colab import auth\n", "\n", "    auth.authenticate_user()" ] }, { "cell_type": "markdown", "metadata": { "id": "ybBXSukZkgjg" }, "source": [ "### Define Google Cloud project information and initialize Vertex AI\n", "\n", "Initialize the Vertex AI SDK for Python for your project:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "5gUjJ42Nh5kf" }, "outputs": [], "source": [ "# Define project information\n", "PROJECT_ID = \"[your-project-id]\"  # @param {type:\"string\"}\n", "LOCATION = \"us-central1\"  # @param {type:\"string\"}\n", "\n", "# Initialize Vertex AI\n", "import vertexai\n", "\n", "vertexai.init(project=PROJECT_ID, location=LOCATION)" ] }, { "cell_type": "markdown", "metadata": { "id": "lhfneknwEDHT" }, "source": [ "### Load the image generation model\n", "\n", "Imagen model names on Vertex AI have two components: a model name and a version number. The naming convention follows this format: `<model-name>@<version-number>`. For example, `imagegeneration@006` represents version **006** of the **imagegeneration** model.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "nEKPNLNL5RhD" }, "outputs": [], "source": [ "from vertexai.preview.vision_models import ImageGenerationModel\n", "\n", "generation_model = ImageGenerationModel.from_pretrained(\"imagegeneration@006\")" ] }, { "cell_type": "markdown", "metadata": { "id": "qLZagQ8NUDiB" }, "source": [ "### Generate an image\n", "\n", "Use the `generate_images` method to generate images. All you need is a text prompt." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "0GYBwQuciCco" }, "outputs": [], "source": [ "prompt = \"aerial shot of a river flowing up a mystical valley\"\n", "\n", "response = generation_model.generate_images(\n", "    prompt=prompt,\n", ")\n", "\n", "response.images[0].show()" ] }, { "cell_type": "markdown", "metadata": { "id": "tZy8fNc5_HnY" }, "source": [ "You can now generate images from your own prompts. Here are some example prompts to inspire you:\n", "- A raccoon wearing formal clothes, wearing a top hat. Oil painting in the style of Vincent Van Gogh.\n", "- A digital collage of famous works of art, all blended together into one cohesive image.\n", "- A whimsical scene from a children's book, such as a tea party with talking animals.\n", "- A futuristic cityscape with towering skyscrapers and flying cars.\n", "- A studio photo of a modern armchair, dramatic lighting, warm.\n", "- A surreal landscape of a city made of ice and snow." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "id": "SACmbbUb_MHX" }, "outputs": [], "source": [ "prompt = \"A raccoon wearing formal clothes, wearing a top hat. Oil painting in the style of Vincent Van Gogh.\"  # @param {type:\"string\"}\n", "\n", "response = generation_model.generate_images(prompt=prompt)\n", "\n", "response.images[0].show()" ] }, { "cell_type": "markdown", "metadata": { "id": "3RuVUB6MVGa-" }, "source": [ "### Explore different parameters\n", "\n", "The `generate_images` method accepts additional parameters that influence the generated images. The following sections explore these parameters one at a time." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": { "id": "zbYtoZtTB98_" }, "outputs": [], "source": [ "import math\n", "\n", "import matplotlib.pyplot as plt\n", "\n", "\n", "# An auxiliary function to display images in a grid\n", "def display_images_in_grid(images):\n", "    \"\"\"Displays the provided images in a grid format. 4 images per row.\n", "\n", "    Args:\n", "        images: A list of generated images, such as `response.images`.\n", "    \"\"\"\n", "\n", "    # Determine the number of rows and columns for the grid layout.\n", "    nrows = math.ceil(len(images) / 4)  # Display at most 4 images per row\n", "    ncols = min(len(images), 4)  # Use fewer columns when there are fewer than 4 images\n", "\n", "    # Create a figure and axes for the grid layout.\n", "    # squeeze=False keeps `axes` a 2D array, even for a single image.\n", "    fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(12, 6), squeeze=False)\n", "\n", "    for i, ax in enumerate(axes.flat):\n", "        if i < len(images):\n", "            # Display the image in the current axis.\n", "            ax.imshow(images[i]._pil_image)\n", "\n", "            # Adjust the axis aspect ratio to maintain image proportions.\n", "            ax.set_aspect(\"equal\")\n", "\n", "            # Disable axis ticks for a cleaner appearance.\n", "            ax.set_xticks([])\n", "            ax.set_yticks([])\n", "        else:\n", "            # Hide empty subplots to avoid displaying blank axes.\n", "            ax.axis(\"off\")\n", "\n", "    # Adjust the layout to minimize whitespace between subplots.\n", "    plt.tight_layout()\n", "\n", "    # Display the figure with the arranged images.\n", "    plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "--0bCzEYB-uF" }, "source": [ "#### `number_of_images`\n", "\n", "Use the `number_of_images` parameter to generate up to four images per request." 
] }, { "cell_type": "code", "execution_count": 20, "metadata": { "id": "_9dPZC4VGISm" }, "outputs": [], "source": [ "prompt = \"a delicious bowl of pho from Vietnam\"\n", "\n", "response = generation_model.generate_images(\n", "    prompt=prompt,\n", "    number_of_images=4,\n", ")\n", "\n", "display_images_in_grid(response.images)" ] }, { "cell_type": "markdown", "metadata": { "id": "AkQGTuaobR-5" }, "source": [ "Try running the code a few times and observe the generated images. You will notice that the model generates a new set of images each time, even though the same prompt is used." ] }, { "cell_type": "markdown", "metadata": { "id": "Dg0D2VcfQPH3" }, "source": [ "#### `seed`\n", "\n", "With the `seed` parameter, you can make the model create the same output from the same input every time, although the order of the generated images may still change. The `seed` parameter only works when watermarking is disabled, so the calls below set `add_watermark=False`. The next two cells make the same call twice so that you can compare the results." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "id": "CPoUULlZHJNI" }, "outputs": [], "source": [ "response = generation_model.generate_images(\n", "    prompt=prompt,\n", "    number_of_images=4,\n", "    seed=42,\n", "    add_watermark=False,\n", ")\n", "\n", "display_images_in_grid(response.images)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "9kvARikDRYiZ" }, "outputs": [], "source": [ "response = generation_model.generate_images(\n", "    prompt=prompt,\n", "    number_of_images=4,\n", "    seed=42,\n", "    add_watermark=False,\n", ")\n", "\n", "display_images_in_grid(response.images)" ] }, { "cell_type": "markdown", "metadata": { "id": "JbuHaMy8ak_5" }, "source": [ "With the seed fixed, both calls produce the same set of images, although the order of the images may vary." 
] }, { "cell_type": "markdown", "metadata": { "id": "ihdvSoN2bn0b" }, "source": [ "#### `negative_prompt`\n", "\n", "Use the `negative_prompt` parameter to exclude objects you do not want in the generated images. The example below passes \"bean sprout\" as a negative prompt. As a result, despite using the same prompt and seed number, the model did not produce bean sprouts in the images." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "id": "Dt6zoj7hX4me" }, "outputs": [], "source": [ "response = generation_model.generate_images(\n", "    prompt=prompt,\n", "    number_of_images=4,\n", "    seed=42,\n", "    negative_prompt=\"bean sprout\",\n", "    add_watermark=False,\n", ")\n", "\n", "display_images_in_grid(response.images)" ] }, { "cell_type": "markdown", "metadata": { "id": "3f11f7254ee5" }, "source": [ "#### `aspect_ratio`\n", "\n", "Use the `aspect_ratio` parameter to set the ratio of the width to the height of the generated images. Supported values include:\n", "- `\"16:9\"`\n", "- `\"4:3\"`\n", "- `\"3:4\"`\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "89fcfe5962bb" }, "outputs": [], "source": [ "response = generation_model.generate_images(\n", "    prompt=prompt,\n", "    aspect_ratio=\"16:9\",\n", ")\n", "\n", "response.images[0].show()" ] }, { "cell_type": "markdown", "metadata": { "id": "b35916a1d950" }, "source": [ "#### `safety_filter_level`\n", "\n", "Use the `safety_filter_level` parameter to control the level of safety filtering. The available values are:\n", "- `block_most`: The highest level of filtering.\n", "- `block_some`: One level below. 
Blocks some prompts and images, but lets others pass.\n", "- `block_few`: Blocks fewer prompts and images.\n", "- `block_fewest`: Blocks the fewest prompts and images (available only to allowlisted users/projects)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "033224ea6578" }, "outputs": [], "source": [ "response = generation_model.generate_images(\n", "    prompt=prompt,\n", "    number_of_images=1,\n", "    aspect_ratio=\"4:3\",\n", "    safety_filter_level=\"block_most\",\n", ")\n", "\n", "display_images_in_grid(response.images)" ] }, { "cell_type": "markdown", "metadata": { "id": "e8a13b18d242" }, "source": [ "#### `add_watermark`\n", "\n", "Use the `add_watermark` boolean flag to indicate whether to add [a digital watermark](https://deepmind.google/technologies/synthid/) to the generated images. The default is `True`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "c7c81011835f" }, "outputs": [], "source": [ "response = generation_model.generate_images(\n", "    prompt=prompt,\n", "    number_of_images=1,\n", "    aspect_ratio=\"16:9\",\n", "    add_watermark=True,\n", ")\n", "\n", "response.images[0].show()" ] }, { "cell_type": "markdown", "metadata": { "id": "sk0eXjQ1nR4F" }, "source": [ "## Conclusion\n", "\n", "You have explored the Imagen image generation features through the Vertex AI Python SDK, including the additional parameters that influence image generation.\n", "\n", "As a next step, explore the [prompting guide](https://cloud.google.com/vertex-ai/docs/generative-ai/image/img-gen-prompt-guide) to sharpen your skills.\n", "\n", "Through practice, you will become proficient in the art of image prompting." 
] } ], "metadata": { "colab": { "name": "image_generation.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }