gemini/use-cases/education/use_cases_for_education.ipynb (1,228 lines of code) (raw):
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ijGzTHJJUCPY"
},
"outputs": [],
"source": [
"# Copyright 2024 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "VEqbX8OhE8y9"
},
"source": [
"# Using Gemini in Education\n",
"\n",
"<table align=\"left\">\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/education/use_cases_for_education.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Run in Colab\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fuse-cases%2Feducation%2Fuse_cases_for_education.ipynb\">\n",
" <img width=\"32px\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" alt=\"Google Cloud Colab Enterprise logo\"><br> Run in Colab Enterprise\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/use-cases/education/use_cases_for_education.ipynb\">\n",
" <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Vertex AI Workbench\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/education/use_cases_for_education.ipynb\">\n",
" <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n",
" </a>\n",
" </td>\n",
" <td style=\"text-align: center\">\n",
" <a href=\"https://goo.gle/4jeQypm\">\n",
" <img width=\"32px\" src=\"https://cdn.qwiklabs.com/assets/gcp_cloud-e3a77215f0b8bfa9b3f611c0d2208c7e8708ed31.svg\" alt=\"Google Cloud logo\"><br> Open in Cloud Skills Boost\n",
" </a>\n",
" </td>\n",
"</table>\n",
"\n",
"<div style=\"clear: both;\"></div>\n",
"\n",
"<b>Share to:</b>\n",
"\n",
"<a href=\"https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/education/use_cases_for_education.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg\" alt=\"LinkedIn logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/education/use_cases_for_education.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg\" alt=\"Bluesky logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/education/use_cases_for_education.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/5a/X_icon_2.svg\" alt=\"X logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/education/use_cases_for_education.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png\" alt=\"Reddit logo\">\n",
"</a>\n",
"\n",
"<a href=\"https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/education/use_cases_for_education.ipynb\" target=\"_blank\">\n",
" <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg\" alt=\"Facebook logo\">\n",
"</a> \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3cf4e0393128"
},
"source": [
"| Author |\n",
"| --- |\n",
"| [Laurent Picard](https://github.com/PicardParis) |"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VK1Q5ZYdVL4Y"
},
"source": [
"## Overview\n",
"\n",
"In this notebook, you will explore a variety of use cases enabled by Gemini in the context of education.\n",
"\n",
"### Gemini\n",
"\n",
"Gemini is a family of generative AI models developed by Google DeepMind.\n",
"\n",
"The Gemini models are built for multimodality from the ground up:\n",
"\n",
"- Supported inputs: Text, code, images, audio, video, video with audio, and PDF\n",
"- Generated output: Text\n",
"\n",
"### Gemini API in Vertex AI\n",
"\n",
"The Gemini API in Vertex AI provides a unified interface for interacting with the Gemini models.\n",
"\n",
"For more information, see [Gemini models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uVL_vGs4q3pg"
},
"source": [
"### Objectives\n",
"\n",
"The main objective of this notebook is to demonstrate a variety of educational use cases that can benefit from Gemini.\n",
"\n",
"The steps performed include:\n",
"\n",
"- Installing the Python SDK\n",
"- Loading Gemini\n",
"- Reasoning at different levels\n",
"- Reasoning on text\n",
"- Reasoning on numbers\n",
"- Reasoning on a single image\n",
"- Reasoning on multiple images\n",
"- Reasoning on a video"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bRdtKLfTsQ27"
},
"source": [
"### Costs\n",
"\n",
"This tutorial uses billable components of Google Cloud:\n",
"\n",
"- Vertex AI\n",
"\n",
"Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QDU0XJ1xRDlL"
},
"source": [
"## Getting Started\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SBGrQE22sVrt"
},
"source": [
"### Install Google Gen AI SDK\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-hqq5vomsW_P"
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet google-genai"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "N5afkyDMSBW5"
},
"source": [
"### Authenticate your notebook environment (Colab only)\n",
"\n",
"If you are running this notebook on Google Colab, run the following cell to authenticate your environment. This step is not required if you are using [Vertex AI Workbench](https://cloud.google.com/vertex-ai-workbench).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Ab4Y6eSIUknb"
},
"outputs": [],
"source": [
"import sys\n",
"\n",
"# Additional authentication is required for Google Colab\n",
"if \"google.colab\" in sys.modules:\n",
"    # Authenticate user to Google Cloud\n",
"    from google.colab import auth\n",
"\n",
"    auth.authenticate_user()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kc4WxYmLSBW5"
},
"source": [
"### Define Google Cloud project information and initialize Vertex AI\n",
"\n",
"Initialize the Gen AI SDK for Python for your project:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YmY9HVVGSBW5"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"from google import genai\n",
"\n",
"PROJECT_ID = \"[your-project-id]\"  # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n",
"# Fall back to the environment variable if no project ID is provided.\n",
"if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n",
"    PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
"\n",
"LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n",
"\n",
"client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kdINrwJZsj1d"
},
"source": [
"### Import libraries\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "stNmWCsRsotM"
},
"outputs": [],
"source": [
"from IPython.display import Markdown, display\n",
"from google.genai.types import GenerateContentConfig, Part"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0v63w6fWs9Dx"
},
"source": [
"### Define helper functions\n",
"\n",
"Define some helper functions to load and display images.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OGvJLH4DmZfw"
},
"outputs": [],
"source": [
"def generate_content(\n",
"    model_id: str,\n",
"    contents: list | str,\n",
"    temperature: float = 0.0,\n",
") -> str:\n",
"    \"\"\"Call the Gemini API in Vertex AI.\n",
"\n",
"    The default temperature (randomness/creativity) is set low for more consistent responses.\n",
"    \"\"\"\n",
"    return client.models.generate_content(\n",
"        model=model_id,\n",
"        contents=contents,\n",
"        config=GenerateContentConfig(temperature=temperature, max_output_tokens=8192),\n",
"    ).text"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "s9R3nV5bE22Z"
},
"source": [
"## Loading Gemini Model\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UnUmflwsE22a"
},
"outputs": [],
"source": [
"MODEL_ID = \"gemini-2.0-flash\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9fm71OTvpyqD"
},
"source": [
"## Reasoning at different levels\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zgf3t9odFlIj"
},
"source": [
"You can ask for direct answers:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "tBpDHLmMv2un"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"What happened to the dinosaurs? When?\n",
"Explain simply in one sentence.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Rfs1qLAoFrZu"
},
"source": [
"… as well as for more nuanced answers:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "AcVfT9bowUKW"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"Are we 100% sure about what happened to the dinosaurs?\n",
"If not, detail the current main hypotheses.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-KPNK1hNGCk4"
},
"source": [
"You can ask for simple answers:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "SPoEA7UUqlvy"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"Explain why it's summer here in France and winter in Australia.\n",
"I'm a kid. Answer in simple key points.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gMlbj1-pGKkA"
},
"source": [
"… or for more detailed answers:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "LdiaJBtbpXjM"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"Explain why we have tides.\n",
"I'm a scientist. Provide a detailed answer using bullet points.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hV6ULv8KHkxO"
},
"source": [
"You can ask closed questions:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dGk1uEPN0XXq"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"When were the previous and penultimate leap years?\n",
"List 3 international competitions that took place during the penultimate one.\n",
"Detail dates, cities, and venues.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SzmiMEq_HsNF"
},
"source": [
"… as well as questions that are more open:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "HqcYe_w5BvLT"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"What came first, the chicken or the egg? Explain from 3 different perspectives.\n",
"What do we call a \"chicken and egg\" problem? Give 1 example that can occur in education.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BoqqOTjfL6HX"
},
"source": [
"## Reasoning on text\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1dEt9vXqmGN5"
},
"source": [
"You can summarize and translate text:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "jAbIQ5U3mGN5"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"Summarize the following text in English, in 3 short bullet points, only using the text.\n",
"\n",
"TEXT:\n",
"- Les hommes naissent et demeurent libres et égaux en droits. Les distinctions sociales ne peuvent être fondées que sur l'utilité commune.\n",
"- Le but de toute association politique est la conservation des droits naturels et imprescriptibles de l'homme. Ces droits sont la liberté, la propriété, la sûreté et la résistance à l'oppression.\n",
"- Le principe de toute souveraineté réside essentiellement dans la Nation. Nul corps, nul individu ne peut exercer d'autorité qui n'en émane expressément.\n",
"- La liberté consiste à pouvoir faire tout ce qui ne nuit pas à autrui : ainsi, l'exercice des droits naturels de chaque homme n'a de bornes que celles qui assurent aux autres membres de la société la jouissance de ces mêmes droits. Ces bornes ne peuvent être déterminées que par la loi.\n",
"La loi n'a le droit de défendre que les actions nuisibles à la société. Tout ce qui n'est pas défendu par la loi ne peut être empêché, et nul ne peut être contraint à faire ce qu'elle n'ordonne pas.\n",
"- La loi est l'expression de la volonté générale. Tous les citoyens ont droit de concourir personnellement ou par leurs représentants à sa formation. Elle doit être la même pour tous, soit qu'elle protège, soit qu'elle punisse. Tous les citoyens, étant égaux à ses yeux, sont également admissibles à toutes dignités, places et emplois publics, selon leur capacité et sans autre distinction que celle de leurs vertus et de leurs talents.\n",
"- Nul homme ne peut être accusé, arrêté ou détenu que dans les cas déterminés par la loi et selon les formes qu'elle a prescrites. Ceux qui sollicitent, expédient, exécutent ou font exécuter des ordres arbitraires doivent être punis ; mais tout citoyen appelé ou saisi en vertu de la loi doit obéir à l'instant ; il se rend coupable par la résistance.\n",
"- La loi ne doit établir que des peines strictement et évidemment nécessaires, et nul ne peut être puni qu'en vertu d'une loi établie et promulguée antérieurement au délit, et légalement appliquée.\n",
"Tout homme étant présumé innocent jusqu'à ce qu'il ait été déclaré coupable, s'il est jugé indispensable de l'arrêter, toute rigueur qui ne serait pas nécessaire pour s'assurer de sa personne doit être sévèrement réprimée par la loi.\n",
"- Nul ne doit être inquiété pour ses opinions, même religieuses, pourvu que leur manifestation ne trouble pas l'ordre public établi par la loi.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ri8Yx5vtE22b"
},
"source": [
"… as well as ask for ideas:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "k5pXqmNJmGN5"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"Provide an outline in 5 key points for a \"chocolate in the world\" presentation.\n",
"One part must be about its origin in Mexico (my teacher has family there).\n",
"The last one will be a tasting with everybody in the classroom.\n",
"\"\"\"\n",
"\n",
"# For more creative responses, let's increase the level of randomness with a higher temperature.\n",
"# Successive requests will likely return different answers.\n",
"display(Markdown(generate_content(MODEL_ID, contents, temperature=1.0)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zf25rsTSkdXu"
},
"source": [
"You can also ask for text corrections:\n",
"\n",
"Below, examples are provided to help the model generate responses with the expected structure and formatting. This is also called few-shot prompting.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "jgKEDKVkL8Ub"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"I'm a non-native English speaker.\n",
"Check whether the following sentences are correct.\n",
"When incorrect, provide a correction and an explanation.\n",
"Use the same structure as in the given examples.\n",
"\n",
"EXAMPLES:\n",
"- **Hi!**\n",
"  - Status: ✔️\n",
"- **Your my best freind!**\n",
"  - Status: ❌\n",
"  - Correction: **You're my best friend!**\n",
"  - Explanation:\n",
"    - \"**Your**\" is incorrect. It seems that you meant \"You're\", which is the short form of \"You are\".\n",
"    - \"**freind**\" is misspelled. The correct spelling is \"**friend**\".\n",
"\n",
"SENTENCES:\n",
"- They're twins, isn't it?\n",
"- I assisted to the meeting.\n",
"- You received important informations.\n",
"- I digged a hole in the ice and saw lots of fishes.\n",
"- That's all folks!\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ach9JtNikxVz"
},
"source": [
"… as well as ask for elaborate tasks on text and languages:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "P0xdsfQDE22b"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"Translate the text into the following languages.\n",
"\n",
"TEXT:\n",
"Hello folks! I hope you're all doing well. Let's get this workshop started!\n",
"We'll stick to English because, actually, I can't speak all those languages.\n",
"\n",
"LANGUAGES:\n",
"German, French, Greek, Bulgarian, Japanese\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "LSNoxDSeZWw-"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"I'm a non-native English speaker and made mistakes in the following sentences.\n",
"Guess my native language (if there are several possibilities, here is a hint: I like cheese).\n",
"Explain why these are typical mistakes.\n",
"\n",
"SENTENCES:\n",
"- They are twin sisters, isn't it?\n",
"- I assisted to the meeting.\n",
"- I saw lots of fishes.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AIl7R_jBUsaC"
},
"source": [
"## Reasoning on numbers\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hm61coMZJX-o"
},
"source": [
"> Note: Depending on inputs and parameters, large language models can hallucinate and generate inaccurate outputs, including for math operations. As a best practice, consider using prompts with step-by-step instructions to reduce hallucinations, or use a calculator library for more advanced math. You can also ask Gemini to generate problem-solving code (see [Doing Math with Large Language Models](https://medium.com/google-cloud/doing-math-with-large-language-models-69d94c8b0590)).\n",
"\n",
"You can ask about real-life problems:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "nmvIJfDUmGN6"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"Patricia is a good runner and runs at an average of 12 km/h.\n",
"- On Monday, she ran for 1.5 hours. What distance did she run?\n",
"- On Tuesday, she ran for 21 km. How long did she run?\n",
"- On Wednesday, she ran for 150 minutes. What distance did she run?\n",
"- Next, she plans to do a marathon (42 km). How long should it take?\n",
"- To complete a marathon in 3 hours, how much faster does she need to run?\n",
"\n",
"Detail the answers step by step.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_nZwoSe0mGN6"
},
"source": [
"… or about classical problems:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "R6dmn9_Q0Be0"
},
"outputs": [],
"source": [
"contents = \"\"\"\n",
"I just borrowed 1,000 EUR from a friend.\n",
"We agreed on a 4.5% simple interest rate, based solely on the initial amount borrowed.\n",
"I want to know how much I'll have to refund in 1, 2, or 3 years.\n",
"Present the results in a recap table.\n",
"\"\"\"\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
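{
"cell_type": "markdown",
"metadata": {},
"source": [
"The note at the start of this section recommends checking model arithmetic with code. The cell below is a minimal sketch of such a check for the two problems above: it recomputes the expected values in plain Python, without calling Gemini, so you can compare them with the model's answers. The constant names are illustrative and not part of the original tutorial.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sanity check (no API call): recompute the expected answers locally.\n",
"SPEED_KMH = 12  # Patricia's average speed, as stated in the prompt\n",
"\n",
"print(f\"Monday: {SPEED_KMH * 1.5:g} km\")\n",
"print(f\"Tuesday: {21 / SPEED_KMH:g} h\")\n",
"print(f\"Wednesday: {SPEED_KMH * 150 / 60:g} km\")\n",
"print(f\"Marathon: {42 / SPEED_KMH:g} h\")\n",
"print(f\"Extra speed needed for a 3-hour marathon: {42 / 3 - SPEED_KMH:g} km/h\")\n",
"\n",
"# Simple interest: total = principal * (1 + rate * years)\n",
"PRINCIPAL_EUR = 1_000\n",
"ANNUAL_RATE = 0.045\n",
"for years in (1, 2, 3):\n",
"    total = PRINCIPAL_EUR * (1 + ANNUAL_RATE * years)\n",
"    print(f\"After {years} year(s): {total:,.2f} EUR\")"
]
},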
{
"cell_type": "markdown",
"metadata": {
"id": "dvdNYQKJE22c"
},
"source": [
"## Reasoning on a single image\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UFW-KLIjmGN7"
},
"source": [
"You can ask for an image description:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "aFvAEetIcSA7"
},
"outputs": [],
"source": [
"prompt = \"Describe this image in a short sentence:\"\n",
"\n",
"# Image by Crissy Jarvis on Unsplash: https://unsplash.com/photos/cHhbULJbPwM\n",
"image_abacus = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1548690596-f1722c190938?q=80&w=3548&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"\n",
"contents = [prompt, image_abacus]\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jjqqXuauE22c"
},
"source": [
"… or ask specific questions:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "H0SOqKjyi1tH"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Answer the following questions about this image.\n",
"Return the results as a JSON list containing \"question\" and \"answer\" key pairs.\n",
"\n",
"QUESTIONS:\n",
"- What does the image show?\n",
"- How does it work?\n",
"- When was it invented?\n",
"- What's the name of this object in French, Italian, Spanish, Dutch, and German?\n",
"- What are the most prominent colors in the image?\n",
"\"\"\"\n",
"\n",
"contents = [prompt, image_abacus]\n",
"\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
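{
"cell_type": "markdown",
"metadata": {},
"source": [
"The prompt above asks for a JSON list, but the helper function returns plain text, and a model may surround the JSON with extra text or a Markdown code fence. The cell below is a minimal sketch of how such a response could be parsed into Python objects. `parse_json_list` is a hypothetical helper and `sample_response` is a made-up stand-in for a model response; neither is part of the original tutorial.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"\n",
"def parse_json_list(text: str) -> list:\n",
"    \"\"\"Extract and parse the outermost JSON list found in a text response (hypothetical helper).\"\"\"\n",
"    start, end = text.find(\"[\"), text.rfind(\"]\")\n",
"    if start == -1 or end < start:\n",
"        raise ValueError(\"No JSON list found in the response\")\n",
"    return json.loads(text[start : end + 1])\n",
"\n",
"\n",
"# Made-up sample standing in for a model response (not actual Gemini output).\n",
"sample_response = 'Here is the result:\\n[{\"question\": \"What does the image show?\", \"answer\": \"An abacus.\"}]'\n",
"for item in parse_json_list(sample_response):\n",
"    print(f\"{item['question']} -> {item['answer']}\")"
]
},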
{
"cell_type": "markdown",
"metadata": {
"id": "I9SIjHnbmGN8"
},
"source": [
"Your specific questions can have follow-up questions:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bGIsUZFAmGN8"
},
"outputs": [],
"source": [
"# Image by Brett Jordan on Unsplash: https://unsplash.com/photos/E1por_SGvJE\n",
"image_tiles = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1612174636808-fdb8c5d80780?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"\n",
"prompt = \"\"\"\n",
"- What expression can be read in this image? How is it presented?\n",
"- What is the opposite expression?\n",
"- What is a recommendation, starting with this expression, a teacher could give his students for an exam?\n",
"- With the opposite expression?\n",
"\"\"\"\n",
"\n",
"contents = [image_tiles, prompt]\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "trT6xm249rqo"
},
"source": [
"Information can have multiple forms. It can be objects, printed text, handwritten text, and more:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VKsArAoJmGN8"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Follow the instructions.\n",
"Write math expressions in LaTeX.\n",
"Use a table with a row for each instruction and its result.\n",
"\n",
"INSTRUCTIONS:\n",
"- Extract the formula.\n",
"- What is the symbol right before Pi? What does it mean?\n",
"- Is this a famous formula? Does it have a name?\n",
"- Why is it special?\n",
"- Extract the caption.\n",
"- What color is it?\n",
"- What color is the formula?\n",
"- What's the object at the bottom?\n",
"- What can you conclude about the object?\n",
"\"\"\"\n",
"\n",
"image_euler = Part.from_uri(\n",
" file_uri=\"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/math_beauty.jpg\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"\n",
"contents = [prompt, image_euler]\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XHHKFGbMU-Re"
},
"source": [
"You can also ask for interpretations and suggestions:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cJ0TsH-7unIE"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Answer the following questions about the image.\n",
"Present the results in a table with a row for each question and its answer.\n",
"\n",
"QUESTIONS:\n",
"- What is visible?\n",
"- What are the reasons it's funny?\n",
"- What could be a fun caption?\n",
"- What could happen next?\n",
"- How would you alter the image? Would it still be funny and why?\n",
"- How would you make it funnier?\n",
"\"\"\"\n",
"# Image by Elimende Inagella on Unsplash: https://unsplash.com/photos/4ApmfdVo32Q\n",
"image_classroom = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1613905780946-26b73b6f6e11?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"\n",
"contents = [prompt, image_classroom]\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qU5Tm-FQE22c"
},
"source": [
"## Reasoning on multiple images\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Vbf1HALvE22d"
},
"source": [
"You can also use multiple images:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sLUesezCE22d"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Answer the following questions for each image.\n",
"Present the results in a table with a row for each image and a column for each question.\n",
"\n",
"QUESTIONS:\n",
"- What can we see in the image?\n",
"- Where does it take place? (answer in one word)\n",
"\"\"\"\n",
"caption_b1 = \"Image 1:\"\n",
"caption_b2 = \"Image 2:\"\n",
"caption_b3 = \"Image 3:\"\n",
"\n",
"# Photo by Deleece Cook on Unsplash: https://unsplash.com/photos/zzjLGF_6dx4\n",
"image_b1 = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1542587227-8802646daa56?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"# Photo by Natasha Kapur on Unsplash: https://unsplash.com/photos/ndAHi2Wxcok\n",
"image_b2 = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1498669374702-58e97ebbede3?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"# Photo by Roman Mager on Unsplash: https://unsplash.com/photos/5mZ_M06Fc9g\n",
"image_b3 = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1453733190371-0a9bedd82893?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"contents = [prompt, caption_b1, image_b1, caption_b2, image_b2, caption_b3, image_b3]\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "e0968313024e"
},
"source": [
"… or make comparisons between images:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "d1a33383a717"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Answer the following questions about the images, with a short answer and a detailed reason for the answer.\n",
"\n",
"QUESTIONS:\n",
"- What do the images have in common?\n",
"- Which one would be of interest to a mathematician?\n",
"- Which one indicates it's the end of vacation?\n",
"- Which one suggests we may get a coffee there?\n",
"\"\"\"\n",
"\n",
"contents = [prompt, caption_b1, image_b1, caption_b2, image_b2, caption_b3, image_b3]\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uqFSjI4_mGN9"
},
"source": [
"You can use Gemini's level of language and visual understanding to work with concepts or even get suggestions on new images:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "QjbP2y26mGN9"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Answer the following questions about the images, with a short answer and a detailed reason for the answer.\n",
"\n",
"QUESTIONS:\n",
"- What does the first image represent?\n",
"- What does the second image represent?\n",
"- What could be the next logical image?\n",
"\"\"\"\n",
"caption_w1 = \"Image 1:\"\n",
"caption_w2 = \"Image 2:\"\n",
"# Photo by Diego Ballon Vargas on Unsplash: https://unsplash.com/photos/TA5bUTySOrg\n",
"image_w1 = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1502251590879-9b45a5d29c7f?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"# Photo by Reza Shayestehpour on Unsplash: https://unsplash.com/photos/Nw_D8v79PM4\n",
"image_w2 = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1428592953211-077101b2021b?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"\n",
"contents = [\n",
" caption_w1,\n",
" image_w1,\n",
" caption_w2,\n",
" image_w2,\n",
" prompt,\n",
"]\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dJdBiddamGN9"
},
"source": [
"This is really up to your imagination:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "iFhubVwFmGN9"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Answer the following questions, with a short answer and a detailed reason for the answer.\n",
"\n",
"QUESTIONS:\n",
"- What theme do these images illustrate?\n",
"- What could be another image to replace the first one?\n",
"- What other image could replace the second one?\n",
"- What would be an alternative to the third image?\n",
"\"\"\"\n",
"caption_s1 = \"Image 1:\"\n",
"caption_s2 = \"Image 2:\"\n",
"caption_s3 = \"Image 3:\"\n",
"\n",
"# Photo by Tomoko Uji on Unsplash: https://unsplash.com/photos/eriuKJwcdjI\n",
"image_s1 = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1522748906645-95d8adfd52c7?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"\n",
"# Photo by Todd Trapani on Unsplash: https://unsplash.com/photos/QldMpmrmWuc\n",
"image_s2 = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1598920710727-e6c74781538c?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"\n",
"# Photo by Olivia Hutcherson on Unsplash: https://unsplash.com/photos/rN3m7aTH3io\n",
"image_s3 = Part.from_uri(\n",
" file_uri=\"https://images.unsplash.com/photo-1538580619159-6c19131e1062?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb&w=600\",\n",
" mime_type=\"image/jpeg\",\n",
")\n",
"\n",
"contents = [\n",
" prompt,\n",
" caption_s1,\n",
" image_s1,\n",
" caption_s2,\n",
" image_s2,\n",
" caption_s3,\n",
" image_s3,\n",
"]\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JkjzQsgKGS7o"
},
"source": [
"## Reasoning on a video\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_irp8pd4GS7o"
},
"source": [
"You can also extract information from a video:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "o9dOaKJoUknq"
},
"outputs": [],
"source": [
"prompt = \"\"\"\n",
"Answer the following questions using the video only.\n",
"Present the results for each question and its answer, as well as the timestamps where the answer can be found and whether the info source comes from \"Image\", \"Text\", and/or \"Speech\".\n",
"\n",
"QUESTIONS:\n",
"- Where was the video likely shot?\n",
"- What real animals are first visible as a group?\n",
"- What animals are cartoon characters doing a close-up selfie?\n",
"- What does the electronic device let real animals do?\n",
"- What is the veterinarian's full name?\n",
"- Where does he work?\n",
"- What is Courtney's job position?\n",
"- What's her full name?\n",
"- Which famous brand is first visible?\n",
"- Which famous brand is last visible?\n",
"- What happens at timestamp 0:36?\n",
"- What happens at timestamp 1:05?\n",
"\"\"\"\n",
"video = Part.from_uri(\n",
" file_uri=\"gs://cloud-samples-data/video/animals.mp4\",\n",
" mime_type=\"video/mp4\",\n",
")\n",
"\n",
"contents = [prompt, video]\n",
"display(Markdown(generate_content(MODEL_ID, contents)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "806cbfafe417"
},
"source": [
"Some remarks:\n",
"\n",
"- Gemini uses both video and audio information.\n",
"  - Video files are sampled at 1 frame per second (1 fps).\n",
"  - Video and audio samples are analyzed with their timestamps.\n",
"  - Text (typewritten or handwritten) can be detected in video frames, like when reasoning on images.\n",
"  - This explains why the extracted samples can give you three possible sources of information:\n",
"    - image (video frame)\n",
"    - text (text in the video frame)\n",
"    - speech (audio sample)\n",
"- Using timestamps, you can also ask questions about specific moments in the video.\n"
]
},
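{
"cell_type": "markdown",
"metadata": {},
"source": [
"The remarks on sampling and timestamps can be made concrete with a small conversion. Assuming frames are sampled at 1 fps starting at 0:00 (an assumption made here for illustration), a timestamp expressed in seconds also approximates the index of the corresponding sampled frame. `timestamp_to_seconds` is a hypothetical helper, not part of the original tutorial.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def timestamp_to_seconds(timestamp: str) -> int:\n",
"    \"\"\"Convert an \"m:ss\" or \"h:mm:ss\" timestamp into seconds (hypothetical helper).\"\"\"\n",
"    seconds = 0\n",
"    for part in timestamp.split(\":\"):\n",
"        seconds = seconds * 60 + int(part)\n",
"    return seconds\n",
"\n",
"\n",
"# Timestamps used in the video prompt above.\n",
"for ts in (\"0:36\", \"1:05\"):\n",
"    print(f\"{ts} -> {timestamp_to_seconds(ts)} s\")"
]
},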
{
"cell_type": "markdown",
"metadata": {
"id": "jWT14Lw6Uknq"
},
"source": [
"## Conclusion\n",
"\n",
"In this tutorial, you saw examples of how to use Gemini in education to reason on text, numbers, images, and video.\n",
"\n",
"For more information, see the following documentation pages:\n",
"\n",
"- [Image understanding](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/image-understanding)\n",
"- [Video understanding](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/video-understanding)\n",
"- [Audio understanding](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding)\n",
"- [Document understanding](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/document-understanding)\n",
"\n",
"You may also want to explore other tutorials that focus on different domains or specificities of the Gemini API in Vertex AI."
]
}
],
"metadata": {
"colab": {
"name": "use_cases_for_education.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}