gemini/getting-started/intro_gemini_2_0_image_gen.ipynb (453 lines of code) (raw):
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "BCl9pcbOOeA0"
},
"outputs": [],
"source": [
"# Copyright 2025 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZPC2X_a9ErW7"
},
"source": [
"# Gemini 2.0 Flash Image Generation in Vertex AI\n",
"\n",
"<table align=\"left\">\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_2_0_image_gen.ipynb\">\n",
" <img width=\"32px\" src=\"https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fgetting-started%2Fintro_gemini_2_0_image_gen.ipynb\">\n",
" <img width=\"32px\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/getting-started/intro_gemini_2_0_image_gen.ipynb\">\n",
" <img src=\"https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg\" alt=\"Vertex AI logo\"><br> Open in Vertex AI Workbench\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_2_0_image_gen.ipynb\">\n",
" <img width=\"32px\" src=\"https://www.svgrepo.com/download/217753/github.svg\" alt=\"GitHub logo\"><br> View on GitHub\n",
" </a>\n",
" </td>\n",
"</table>\n",
"\n",
"<div style=\"clear: both;\"></div>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "f0cc0f48513b"
},
"source": [
"| Author |\n",
"| --- |\n",
"| [Nikita Namjoshi](https://github.com/nikitamaia) |"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "axauUzNXEl_R"
},
"source": [
"## Overview\n",
"\n",
"In this tutorial, you learn how to use Gemini 2.0 image generation features in Vertex AI using the Google Gen AI SDK.\n",
"\n",
"You'll try out the following scenarios:\n",
"* text --> image\n",
"* text --> image + text (interleaved)\n",
"* text + image --> image"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "D50ekWXjEl_S"
},
"source": [
"## Getting Started"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jLJQdbgSbb4M"
},
"source": [
"### Install required libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "SQ0qcEWuXNXs"
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet google-genai"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dmWOrTJ3gx13"
},
"source": [
"### Authenticate your notebook environment (Colab only)\n",
"\n",
"If you are running this notebook on Google Colab, run the following cell to authenticate your environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "NyKGtVQjgx13"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
" from google.colab import auth\n",
"\n",
" auth.authenticate_user()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "O6ZGaZlxP9L0"
},
"source": [
"### Set Google Cloud project\n",
"\n",
"To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
"\n",
"Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "u8IivOG5SqY6"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"PROJECT_ID = \"[your-project-id]\" # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n",
"if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n",
" PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
"\n",
"LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n",
"\n",
"from google import genai\n",
"\n",
"client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "854fbf388e2b"
},
"source": [
"## Use the Gemini 2.0 Flash model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "aRyjFTFxj-K8"
},
"outputs": [],
"source": [
"from IPython.display import Image, Markdown, display\n",
"\n",
"# import the Google Gen AI SDK\n",
"from google.genai.types import GenerateContentConfig, Part"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7eeb063ac6d4"
},
"outputs": [],
"source": [
"MODEL_ID = \"gemini-2.0-flash-exp\" # @param {type: \"string\"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xgOucVQlVR4t"
},
"source": [
"## Generate an image from a text prompt\n",
"\n",
"First, send a text prompt to Gemini 2.0 describing the image you want generated."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MmA_RAbFwED4"
},
"source": [
"In the cell below, you'll call the `generate_content` method and pass in the following arguments:\n",
"\n",
"* `model`: The ID of the model you want to use, in this case `gemini-2.0-flash-exp`.\n",
"* `contents`: Your prompt, in this case a text-only user message describing the image to generate.\n",
"* `config`: A `GenerateContentConfig` that specifies the desired `response_modalities`, in this case `TEXT` and `IMAGE`. If you do not specify `IMAGE`, you will not get image output."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "QK2Mi3zmkHSA"
},
"outputs": [],
"source": [
"response = client.models.generate_content(\n",
" model=MODEL_ID,\n",
" contents=\"generate an image of a grackle wearing a top hat and monocle\",\n",
" config=GenerateContentConfig(\n",
" response_modalities=[\"TEXT\", \"IMAGE\"],\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OBuEhRnIbdaR"
},
"outputs": [],
"source": [
"for part in response.candidates[0].content.parts:\n",
" if part.text:\n",
" display(Markdown(part.text))\n",
" if part.inline_data:\n",
" display(Image(data=part.inline_data.data))"
]
},
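{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loop above only displays the image. If you also want to keep it, one option is to write the bytes in `part.inline_data.data` to disk. The cell below is a minimal sketch, not part of the SDK: the helper name `save_image_parts`, the `generated_images` directory, and the file naming are hypothetical, and it assumes each image part exposes `inline_data.data` (bytes) and `inline_data.mime_type`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import mimetypes\n",
"from pathlib import Path\n",
"\n",
"\n",
"def save_image_parts(parts, output_dir=\"generated_images\", prefix=\"image\"):\n",
"    \"\"\"Write every inline image in `parts` to `output_dir` and return the paths.\"\"\"\n",
"    out = Path(output_dir)\n",
"    out.mkdir(parents=True, exist_ok=True)\n",
"    saved = []\n",
"    for part in parts:\n",
"        inline = getattr(part, \"inline_data\", None)\n",
"        if not inline or not inline.data:\n",
"            continue  # skip text-only parts\n",
"        # Guess a file extension from the MIME type; fall back to .png.\n",
"        ext = mimetypes.guess_extension(inline.mime_type or \"\") or \".png\"\n",
"        path = out / f\"{prefix}_{len(saved)}{ext}\"\n",
"        path.write_bytes(inline.data)\n",
"        saved.append(path)\n",
"    return saved\n",
"\n",
"\n",
"# Example (hypothetical usage):\n",
"# save_image_parts(response.candidates[0].content.parts)"
]
},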
{
"cell_type": "markdown",
"metadata": {
"id": "5l4YLy8Vq_-v"
},
"source": [
"## Generate interleaved text and images\n",
"\n",
"In addition to generating a single image, Gemini can generate multiple images and text in an interleaved fashion.\n",
"\n",
"For example, you could ask the model to generate a recipe for banana bread with images showing different stages of the cooking process. Or you could ask the model to generate images of different wildflowers with accompanying titles and descriptions.\n",
"\n",
"Let's try out the interleaved text+image functionality by prompting Gemini 2.0 to create an illustrated children's story.\n",
"\n",
"You'll notice that in the prompt we ask the model to generate both text and images for each episode of the narrative. This will nudge the model to create text with images interleaved."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WdkrXmyIFtgv"
},
"source": [
"⚠️ Note that we are asking the model to generate a lot of content in this prompt, so it will take a bit of time for this cell to finish executing."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "FwCeB0Hxlrz2"
},
"outputs": [],
"source": [
"response = client.models.generate_content(\n",
" model=MODEL_ID,\n",
" contents=\"Make a children's storybook about a curious young fox named Mosi, who sets off on a magical adventure through a forest in search of a special star. The story unfolds over three episodes, with each episode introducing Mosi to a new friend and revealing wondrous and magical landscapes. For each episode, provide a title, a captivating narrative, and also generate a realistic image illustrating everything in the scene described in the narrative of that episode\",\n",
" config=GenerateContentConfig(\n",
" response_modalities=[\"TEXT\", \"IMAGE\"],\n",
" ),\n",
")"
]
},
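{
"cell_type": "markdown",
"metadata": {
"id": "FX3vDsibcUWg"
},
"source": [
"The length of `content.parts` is typically 6, since we asked the model to produce three episodes, each with text and an image. The exact count can vary between runs because the model decides how to split its output into parts."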
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "WQMIOWzJcFLo"
},
"outputs": [],
"source": [
"len(response.candidates[0].content.parts)"
]
},
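{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since the total number of parts alone does not say how many are text and how many are images, it can help to count them separately. The cell below is a small sketch under the same assumptions as the display loops in this notebook (text parts set `text`, image parts set `inline_data`); `summarize_parts` is a hypothetical helper, not an SDK function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter\n",
"\n",
"\n",
"def summarize_parts(parts):\n",
"    \"\"\"Count how many parts carry text and how many carry inline image data.\"\"\"\n",
"    counts = Counter()\n",
"    for part in parts:\n",
"        if getattr(part, \"text\", None):\n",
"            counts[\"text\"] += 1\n",
"        if getattr(part, \"inline_data\", None):\n",
"            counts[\"image\"] += 1\n",
"    return dict(counts)\n",
"\n",
"\n",
"# Example (hypothetical usage):\n",
"# summarize_parts(response.candidates[0].content.parts)"
]
},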
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Jh7YJ_7Td1pD"
},
"outputs": [],
"source": [
"display(Markdown(response.text))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KuyZyzBSdrpf"
},
"source": [
"Let's visualize the response by displaying each text and image part in order."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "FxfW8ly7WVZ9"
},
"outputs": [],
"source": [
"for part in response.candidates[0].content.parts:\n",
" if part.text:\n",
" display(Markdown(part.text))\n",
" if part.inline_data:\n",
" display(Image(data=part.inline_data.data))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "R3g5n23lDtsN"
},
"source": [
"## Generate a new image from a text + image prompt\n",
"\n",
"You can pass text and an image to Gemini 2.0 for use cases such as writing product captions, answering questions about an image, or editing an image.\n",
"\n",
"Let's try out a style transfer example and ask Gemini 2.0 to create a baroque-style portrait of the dog in the sample image referenced in the next cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UAiUyW1VMzPn"
},
"outputs": [],
"source": [
"response = client.models.generate_content(\n",
" model=MODEL_ID,\n",
" contents=[\n",
" Part.from_uri(\n",
" file_uri=\"gs://cloud-samples-data/generative-ai/image/small-dog-pink.jpg\",\n",
" mime_type=\"image/jpeg\",\n",
" ),\n",
" \"Generate a baroque style portrait painting of this dog.\",\n",
" ],\n",
" config=GenerateContentConfig(\n",
" response_modalities=[\"TEXT\", \"IMAGE\"],\n",
" ),\n",
")"
]
},
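{
"cell_type": "markdown",
"metadata": {},
"source": [
"The display loops in this notebook index `response.candidates[0].content.parts` directly, which raises an error if the response has no candidates or the first candidate has no content (for example, if a request is blocked). The cell below is a defensive sketch that assumes those fields may be `None` or empty; `get_parts` is a hypothetical helper, not part of the SDK."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_parts(response):\n",
"    \"\"\"Return the parts of the first candidate, or an empty list if there are none.\"\"\"\n",
"    candidates = getattr(response, \"candidates\", None) or []\n",
"    if not candidates:\n",
"        return []\n",
"    content = getattr(candidates[0], \"content\", None)\n",
"    # `content` or `content.parts` may be missing; treat both as \"no parts\".\n",
"    return list(getattr(content, \"parts\", None) or [])\n",
"\n",
"\n",
"# Example (hypothetical usage):\n",
"# for part in get_parts(response): ..."
]
},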
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "zlzYCkQEfQX4"
},
"outputs": [],
"source": [
"for part in response.candidates[0].content.parts:\n",
" if part.text:\n",
" display(Markdown(part.text))\n",
" if part.inline_data:\n",
" display(Image(data=part.inline_data.data))"
]
}
],
"metadata": {
"colab": {
"name": "intro_gemini_2_0_image_gen.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}