
{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "id": "5C5UHf-cgV0h" }, "outputs": [], "source": [ "# Copyright 2024 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "jXgCZ0VqYfNM" }, "source": [ "# Introduction to ReAct Agents with Gemini & Function Calling\n", "\n", "<table align=\"left\">\n", " <td style=\"text-align: center\">\n", " <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_diy_react_agent.ipynb\">\n", " <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Run in Colab\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Ffunction-calling%2Fintro_diy_react_agent.ipynb\">\n", " <img width=\"32px\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" alt=\"Google Cloud Colab Enterprise logo\"><br> Run in Colab Enterprise\n", " </a>\n", " </td> \n", " <td style=\"text-align: center\">\n", " <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_diy_react_agent.ipynb\">\n", " <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n", " </a>\n", " </td>\n", " <td style=\"text-align: center\">\n", " <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/function-calling/intro_diy_react_agent.ipynb\">\n", " <img width=\"32px\" src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> \n", " Open in Vertex AI Workbench\n", " </a>\n", " </td> \n", "</table>\n", "\n", "<div style=\"clear: both;\"></div>\n", "\n", "<b>Share to:</b>\n", "\n", "<a href=\"https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_diy_react_agent.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg\" alt=\"LinkedIn logo\">\n", "</a>\n", "\n", "<a href=\"https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_diy_react_agent.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg\" alt=\"Bluesky logo\">\n", "</a>\n", "\n", "<a 
href=\"https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_diy_react_agent.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/5a/X_icon_2.svg\" alt=\"X logo\">\n", "</a>\n", "\n", "<a href=\"https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_diy_react_agent.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png\" alt=\"Reddit logo\">\n", "</a>\n", "\n", "<a href=\"https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/intro_diy_react_agent.ipynb\" target=\"_blank\">\n", " <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg\" alt=\"Facebook logo\">\n", "</a> " ] }, { "cell_type": "markdown", "metadata": { "id": "f92df1bde019" }, "source": [ "| | |\n", "|-|-|\n", "|Author(s) | [Gary Ng](https://github.com/gkcng) |" ] }, { "cell_type": "markdown", "metadata": { "id": "tGULENpgf_Pz" }, "source": [ "## Overview\n", "\n", "This notebook illustrates that at its simplest, a ReAct agent is a piece of code that coordinates between reasoning and acting, where:\n", "- The reasoning is carried out by the language model.\n", "- The application code performs the acting, at the instruction of the language model.\n", "\n", "This allows problems to be solved by letting a model 'think' through the tasks step-by-step, taking actions and getting action feedback before determining the next steps.\n", "\n", "<div>\n", " <table align=\"center\">\n", " <tr><td>\n", " <img src=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiuuYg9Pduep9GkUfjloNVOiy3qjpPbT017GKlgGEGMaLNu_TCheEeJ7r8Qok6-0BK3KMfLvsN2vSgFQ8xOvnHM9CAb4Ix4I62bcN2oXFWfqAJzGAGbVqbeCyVktu3h9Dyf5ameRe54LEr32Emp0nG52iofpNOTXCxMY12K7fvmDZNPPmfJaT5zo1OBQA/s595/Screen%20Shot%202022-11-08%20at%208.53.49%20AM.png\" alt=\"The Reasoning and Acting Cycle\" width=\"500\" align=\"center\"/>\n", " </td></tr>\n", " <tr><td><div align=\"center\"><em>From the paper: <a href=\"https://research.google/blog/react-synergizing-reasoning-and-acting-in-language-models/\">ReAct: Synergizing Reasoning and Acting in Language Models</a></em></div></td></tr>\n", " </table>\n", "</div>\n", "\n", "This coordination between the language model and the environment is made possible by asking the language model to communicate the intended actions in a specific and structured manner. The response is 'specific' in that the list of possible actions is predefined and thus necessarily constrained. The response is also 'structured', so the function parameters given in the response can be used directly by the application code, minimizing the need for further parsing, interpretation, or transformations. \n", "\n", "Both requirements can be supported by many language models, as they are equivalent to performing natural language tasks such as classification and information extraction. As illustrated in the first two examples in this notebook, the tasks of identifying suitable function names and extracting function parameters can be done using prompting and response parsing alone. 
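For example, in Example 1 below the model is asked to pick from a fixed list of action labels and to reply with a single JSON object that the application can parse directly; a response from the run below looks like:\n", "\n", "```json\n", "{\"next_action\": \"pick_up_clothes\", \"rationale\": \"The clothes are on the floor and need to be picked up before they can be put in the hamper.\"}\n", "```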
\n", "\n", "However, to strengthen the quality of function call responses in terms of validity, reliability, and consistency, many models now feature built-in APIs supporting 'Function Calling' or 'Tool Calling' (these terms are often used interchangeably). Such built-in support reduces the amount of defensive safeguards a developer has to build around response handling in their applications. " ] }, { "cell_type": "markdown", "metadata": { "id": "d295151a7c9b" }, "source": [ "### Function / Tool-Calling APIs and Agent Frameworks\n", "\n", "In the third example in this notebook, we leverage [Function Calling in Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling) to build our simple agent. It lets developers create a description of a function in their code, then pass that description to a language model in a request. The response from the model includes the name of a function that matches the description and the arguments to call it with.\n", "\n", "There are also other tool-calling and agent-building frameworks that increase developer productivity: for example, [Tool-Calling Agents](https://python.langchain.com/v0.1/docs/modules/agents/agent_types/tool_calling/) from LangChain, and, at an even higher level of abstraction, [Reasoning Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/reasoning-engine/overview), a Google Cloud managed service that helps you build and deploy an agent reasoning framework ([see sample notebooks](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/reasoning-engine)). Reasoning Engine integrates closely with the Python SDK for the Gemini model in Vertex AI, and it can manage prompts, agents, and examples in a modular way. Reasoning Engine is compatible with LangChain, LlamaIndex, and other Python frameworks. " ] }, { "cell_type": "markdown", "metadata": { "id": "6edffc043e41" }, "source": [ "### Objectives\n", "\n", "To illustrate the basic building blocks of function calling and their utility, this notebook builds the same agent with Gemini in three different ways, via:\n", "\n", "1. Prompting alone - using the single-turn `generate_content` API. \n", "1. Prompting alone - using the `ChatSession` API instead.\n", "1. Function Calling - modified from the `ChatSession` example.\n", "\n", "In the first example, the list of possible functions is presented to the API every time because the API is stateless. In the second example, because the `ChatSession` is stateful on the client side, we only need to present the list of function choices at the beginning of the session. The first two examples introduce the building blocks that are now reliably supported by Gemini and many other model APIs as 'Tool' / 'Function' calling; the dedicated Gemini API is demonstrated in the third example. \n", "\n", "The raw prompting examples are only used to explain the building blocks and to help you understand the dedicated APIs. For productivity and response reliability, you are encouraged to use an API that supports function calling. \n", "\n", "In the first example, we also illustrate the concept of explicit goal checking vs model-based goal checking. Use explicit goal checking when the goal can easily be defined in code; it saves cost and improves speed. Otherwise, use model-based goal checking when the goal is too complex or variable; specifying the goal in natural language and letting the model handle the interpretation is simpler and faster to implement than writing the full checks in code.
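 Both loop variants appear verbatim in Example 1 below:\n", "\n", "```python\n", "# Explicit goal checking: the loop condition tests the goal in code.\n", "room_state, result = main_react_loop(\n", "    lambda c, r: c <= 10 and not is_room_tidy(r), logging\n", ")\n", "\n", "# Model-based goal checking: the code only bounds the number of cycles;\n", "# the model signals completion by choosing the 'done' action.\n", "room_state, result = main_react_loop(lambda c, r: c <= 10, logging)\n", "```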
" ] }, { "cell_type": "markdown", "metadata": { "id": "d4d798879063" }, "source": [ "### Background\n", "This example was suggested by Gemini Advanced as a simple, text-based demo that highlights the core ReAct concepts: autonomy, cyclic operation, and reasoning. The agent's thoughts demonstrate a simple form of reasoning, connecting observations to actions.\n", "\n", "<div>\n", " <table align=\"center\">\n", " <tr><td>\n", " <img src=\"https://services.google.com/fh/files/misc/gemini_react_suggestion.jpg\" alt=\"Gemini's suggestion\" width=\"500\" align=\"center\"/>\n", " </td></tr>\n", " <tr><td><div align=\"center\"><em>Scenario: A ReAct agent designed to tidy up a virtual room.</em></div></td></tr>\n", " </table>\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "id": "39992f621eb0" }, "source": [ "### Costs\n", "\n", "This tutorial uses billable components of Google Cloud:\n", "\n", "- Google Foundational Models on Vertex AI ([Function Calling](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling#pricing))\n", "\n", "Learn about [Generative AI on Vertex AI Pricing](https://cloud.google.com/vertex-ai/generative-ai/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage." ] }, { "cell_type": "markdown", "metadata": { "id": "39fb8cb87102" }, "source": [ "## Getting Started" ] }, { "cell_type": "markdown", "metadata": { "id": "996eae6d82d3" }, "source": [ "### Install Vertex AI SDK for Python\n", "This notebook uses the [Vertex AI SDK for Python](https://cloud.google.com/vertex-ai/generative-ai/docs/reference/python/latest)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "2969acc84135" }, "outputs": [], "source": [ "%pip install --upgrade --user google-cloud-aiplatform" ] }, { "cell_type": "markdown", "metadata": { "id": "70b9a7f00179" }, "source": [ "### Restart current runtime\n", "\n", "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which will restart the current kernel." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "cf71cbda01f9" }, "outputs": [], "source": [ "# Restart kernel after installs so that your environment can access the new packages\n", "import IPython\n", "\n", "app = IPython.Application.instance()\n", "app.kernel.do_shutdown(True)" ] }, { "cell_type": "markdown", "metadata": { "id": "02d6dfc513c3" }, "source": [ "<div class=\"alert alert-block alert-warning\">\n", "<b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️</b>\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "id": "575d71266b5b" }, "source": [ "### Authenticate your notebook environment (Colab only)\n", "\n", "If you are running this notebook on Google Colab, run the following cell to authenticate your environment. This step is not required if you are using [Vertex AI Workbench](https://cloud.google.com/vertex-ai-workbench)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "b95a7aa9f3d8" }, "outputs": [], "source": [ "import sys\n", "\n", "# Additional authentication is required for Google Colab\n", "if \"google.colab\" in sys.modules:\n", " # Authenticate user to Google Cloud\n", " from google.colab import auth\n", "\n", " auth.authenticate_user()" ] }, { "cell_type": "markdown", "metadata": { "id": "4ee80c5b9d54" }, "source": [ "### Set Google Cloud project information and initialize Vertex AI SDK\n", "\n", "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n", "\n", "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "a41550d555ea" }, "outputs": [], "source": [ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", "LOCATION = \"us-central1\" # @param {type:\"string\"}\n", "\n", "import vertexai\n", "\n", "vertexai.init(project=PROJECT_ID, location=LOCATION)" ] }, { "cell_type": "markdown", "metadata": { "id": "71b40692ace5" }, "source": [ "### Import Libraries" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "a17e54f9eb9d" }, "outputs": [], "source": [ "from collections.abc import Callable\n", "import json\n", "import sys\n", "import traceback\n", "\n", "from google.protobuf.json_format import MessageToJson\n", "from vertexai import generative_models\n", "from vertexai.generative_models import FunctionDeclaration, GenerativeModel, Part, Tool" ] }, { "cell_type": "markdown", "metadata": { "id": "Az-OexEYJ9_I" }, "source": [ "### Prepare a model with system instructions" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "fY9wu9XUcxzy" }, "outputs": [], "source": [ "model = GenerativeModel(\n", " \"gemini-2.0-flash\",\n", " system_instruction=[\n", " \"You are an assistant that helps me tidy my room.\",\n", " \"Your goal is to make sure all the books are on the shelf, all clothes are in the hamper, and the trash is empty.\",\n", " \"You cannot receive any input from me.\",\n", " ],\n", " generation_config={\"temperature\": 0.0},\n", " safety_settings=[\n", " generative_models.SafetySetting(\n", " category=generative_models.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,\n", " method=generative_models.SafetySetting.HarmBlockMethod.PROBABILITY,\n", " threshold=generative_models.HarmBlockThreshold.BLOCK_ONLY_HIGH,\n", " ),\n", " ],\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "4d292a1ed394" }, "source": [ "## Helper Functions" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "b724d6f5d271" }, "outputs": [], "source": [ "verbose = True" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "dnk1DurbxspA" }, "outputs": [], "source": [ "# Convenience function to print multiline text indented\n", "\n", "\n", "def indent(text, amount, ch=\" \"):\n", " padding = amount * ch\n", " return \"\".join(padding + line for line in text.splitlines(True))\n", "\n", "\n", "# Convenience function for logging statements\n", "def logging(msg):\n", " global verbose\n", " print(msg) if verbose else None\n", "\n", "\n", "# Retrieve the text from a model response\n", "def get_text(resp):\n", " return resp.candidates[0].content.parts[0].text\n", "\n", "\n", "# Retrieve the function call information from a model 
response\n", "def get_function_call(resp):\n", " return resp.candidates[0].function_calls[0]\n", "\n", "\n", "def get_action_label(json_payload, log, role=\"MODEL\"):\n", " log(f\"{role}: {json_payload}\")\n", " answer = json.loads(json_payload)\n", " action = answer[\"next_action\"]\n", " return action\n", "\n", "\n", "def get_action_from_function_call(func_payload, log, role=\"MODEL\"):\n", " json_payload = MessageToJson(func_payload._pb)\n", " log(f\"{role}: {json_payload}\")\n", " return func_payload.name" ] }, { "cell_type": "markdown", "metadata": { "id": "DyVd9-OALAKc" }, "source": [ "### Action definitions\n", "These are the pseudo actions declared as simple Python functions. With the Function Calling pattern, the orchestration layer of an agent calls these tools to carry out actions." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "id": "sho-9jxJW7Fe" }, "outputs": [], "source": [ "# Initial room state\n", "\n", "\n", "def reset_room_state(room_state):\n", " room_state.clear()\n", " room_state[\"clothes\"] = \"floor\"\n", " room_state[\"books\"] = \"scattered\"\n", " room_state[\"wastebin\"] = \"empty\"\n", "\n", "\n", "# Functions for the actions (the pseudo actions the agent can execute)\n", "def pick_up_clothes(room_state):\n", " room_state[\"clothes\"] = \"carrying by hand\"\n", " return room_state, \"The clothes are now being carried.\"\n", "\n", "\n", "def put_clothes_in_hamper(room_state):\n", " room_state[\"clothes\"] = \"hamper\"\n", " return room_state, \"The clothes are now in the hamper.\"\n", "\n", "\n", "def pick_up_books(room_state):\n", " room_state[\"books\"] = \"in hand\"\n", " return room_state, \"The books are now in my hand.\"\n", "\n", "\n", "def place_books_on_shelf(room_state):\n", " room_state[\"books\"] = \"shelf\"\n", " return room_state, \"The books are now on the shelf.\"\n", "\n", "\n", "def empty_wastebin(room_state):\n", " room_state[\"wastebin\"] = \"empty\"\n", " return room_state, \"The wastebin is emptied.\"\n", "\n", "\n", "# Maps a function string to its respective function reference.\n", "def get_func(action_label):\n", " return None if action_label == \"\" else getattr(sys.modules[__name__], action_label)" ] }, { "cell_type": "markdown", "metadata": { "id": "cdca5fc7ee78" }, "source": [ "### Explicit goal checking\n", "This is only used in the first example to illustrate the concept: the goal-checking responsibility can either live in code or be delegated to the model, depending on factors such as the complexity of the goal and the ease of defining it in code." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "163e0a64b8b1" }, "outputs": [], "source": [ "# Function to check if the room is tidy\n", "# Some examples below do not call this function,\n", "# for those examples the model takes on the goal validation role.\n", "\n", "\n", "def is_room_tidy(room_state):\n", " return all(\n", " [\n", " room_state[\"clothes\"] == \"hamper\",\n", " room_state[\"books\"] == \"shelf\",\n", " room_state[\"wastebin\"] == \"empty\",\n", " ]\n", " )" ] }, { "cell_type": "markdown", "metadata": { "id": "da5935b90607" }, "source": [ "### Prompt Templates" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "id": "l5V0GIOASWm8" }, "outputs": [], "source": [ "functions = \"\"\"\n", "<actions>\n", " put_clothes_in_hamper - place clothes into hamper, instead of carrying them around in your hand.\n", " pick_up_clothes - pick clothes up from the floor.\n", " pick_up_books - pick books up from anywhere not on the shelf\n", " place_books_on_shelf - self explanatory.\n", " empty_wastebin - self explanatory.\n", " done - when everything are in the right place.\n", "</actions>\"\"\"\n", "\n", "\n", "def get_next_step_full_prompt(state, cycle, log):\n", " observation = f\"The room is currently in this state: {state}.\"\n", " prompt = \"\\n\".join(\n", " [\n", " observation,\n", " f\"You can pick any of the following action labels: {functions}\",\n", " \"Which one should be the next step to achieve the goal? \",\n", " 'Return a single JSON object containing fields \"next_action\" and \"rationale\".',\n", " ]\n", " )\n", " (\n", " log(\"PROMPT:\\n{}\".format(indent(prompt, 1, \"\\t\")))\n", " if cycle == 1\n", " else log(f\"OBSERVATION: {observation}\")\n", " )\n", "\n", " return prompt" ] }, { "cell_type": "markdown", "metadata": { "id": "L2Ytt0GjKfRv" }, "source": [ "## Example 1: Multiple single-turn `generate_content` calls with full prompts" ] }, { "cell_type": "markdown", "metadata": { "id": "zbzqp2YJ3bfc" }, "source": [ "An example turn.\n", "\n", "```\n", "You are an assistant that helps me tidy my room.\n", "Your goal is to make sure all the books are on the shelf, all clothes are in the hamper, and the trash is empty.\n", "You cannot receive any input from me.\n", "\n", "The room is currently in this state: {'clothes': 'floor', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\n", "You can pick any of the following action labels:\n", "<actions>\n", " put_clothes_in_hamper - place clothes into hamper, instead of carrying them around in your hand.\n", " pick_up_clothes - pick clothes up from the floor.\n", " pick_up_books - pick books up from anywhere not on the shelf\n", " place_books_on_shelf - self explanatory.\n", " empty_wastebin - self explanatory.\n", " done - when everything are in the right place.\n", "</actions>\n", "Which one should be the next step to achieve the goal?\n", "Return a single JSON object containing fields \"next_action\" and \"rationale\".\n", "\n", "RAW MODEL RESPONSE:\n", "\n", "candidates {\n", " content {\n", " role: \"model\"\n", " parts {\n", " text: \"{\\\"next_action\\\": \\\"pick_up_clothes\\\", \\\"rationale\\\": \\\"The clothes are on the floor and need to be picked up before they can be put in the hamper.\\\"}\\n\"\n", " }\n", " }\n", " finish_reason: STOP,\n", " ...\n", "}\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "Obyi7GxaUXjE" }, "source": [ "### The Main ReAct Loop\n", "Interleaving asking for next steps and executing the steps.\n", "\n", "Notice that at cycle 4 the environment 
has changed to have a non-empty wastebin.\n", "Because the goal includes the trash being empty, the model recognizes the change and behaves accordingly, without the need to restate anything.\n", "\n", "This is well within expectation, as this loop prompts the model with all the information every time." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "id": "MzlEFdXVKAFm" }, "outputs": [], "source": [ "# Main ReAct loop\n", "\n", "\n", "def main_react_loop(loop_continues, log):\n", " room_state = {}\n", " reset_room_state(room_state)\n", " trash_added = False\n", "\n", " cycle = 1\n", " while loop_continues(cycle, room_state):\n", " log(f\"Cycle #{cycle}\")\n", "\n", " # Observe the environment (use Gemini to generate an action thought)\n", " try: # REASON #\n", " response = model.generate_content(\n", " get_next_step_full_prompt(room_state, cycle, log),\n", " generation_config={\"response_mime_type\": \"application/json\"},\n", " ) # JSON Mode\n", " action_label = get_action_label(get_text(response).strip(), log)\n", "\n", " except Exception:\n", " traceback.print_exc()\n", " log(response)\n", " break\n", "\n", " # Execute the action and get the observation\n", " if action_label == \"done\":\n", " break\n", "\n", " try: # ACTION #\n", " # Call the function mapped from the label\n", " room_state, acknowledgement = get_func(action_label)(room_state)\n", " log(f\"ACTION: {action_label}\\nEXECUTED: {acknowledgement}\\n\")\n", "\n", " except Exception:\n", " log(\"No action suggested.\")\n", "\n", " # Simulating a change in environment\n", " if cycle == 4 and not trash_added:\n", " room_state[\"wastebin\"] = \"1 item\"\n", " trash_added = True\n", "\n", " cycle += 1\n", " # End of while loop\n", "\n", " # Determine the final result\n", " result = (\n", " \"The room is tidy!\" if is_room_tidy(room_state) else \"The room is not tidy!\"\n", " )\n", "\n", " return room_state, result" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "id": "4GGRQo8WQvV0" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cycle #1\n", "PROMPT:\n", "\tThe room is currently in this state: {'clothes': 'floor', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tYou can pick any of the following action labels: \n", "\t<actions>\n", "\t put_clothes_in_hamper - place clothes into hamper, instead of carrying them around in your hand.\n", "\t pick_up_clothes - pick clothes up from the floor.\n", "\t pick_up_books - pick books up from anywhere not on the shelf\n", "\t place_books_on_shelf - self explanatory.\n", "\t empty_wastebin - self explanatory.\n", "\t done - when everything are in the right place.\n", "\t</actions>\n", "\tWhich one should be the next step to achieve the goal? \n", "\tReturn a single JSON object containing fields \"next_action\" and \"rationale\".\n", "MODEL: {\"next_action\": \"pick_up_clothes\", \"rationale\": \"The clothes are on the floor and need to be picked up before they can be put in the hamper.\"}\n", "ACTION: pick_up_clothes\n", "EXECUTED: The clothes are now being carried.\n", "\n", "Cycle #2\n", "OBSERVATION: The room is currently in this state: {'clothes': 'carrying by hand', 'books': 'scattered', 'wastebin': 'empty'}.\n", "MODEL: {\"next_action\": \"put_clothes_in_hamper\", \"rationale\": \"The clothes need to be in the hamper, and they are currently being carried. 
So the next step is to put them in the hamper.\"}\n", "ACTION: put_clothes_in_hamper\n", "EXECUTED: The clothes are now in the hamper.\n", "\n", "Cycle #3\n", "OBSERVATION: The room is currently in this state: {'clothes': 'hamper', 'books': 'scattered', 'wastebin': 'empty'}.\n", "MODEL: {\"next_action\": \"pick_up_books\", \"rationale\": \"The goal is to have all books on the shelf, so we need to pick them up first.\"}\n", "ACTION: pick_up_books\n", "EXECUTED: The books are now in my hand.\n", "\n", "Cycle #4\n", "OBSERVATION: The room is currently in this state: {'clothes': 'hamper', 'books': 'in hand', 'wastebin': 'empty'}.\n", "MODEL: {\"next_action\": \"place_books_on_shelf\", \"rationale\": \"The books need to be on the shelf, and they are currently in hand.\"}\n", "ACTION: place_books_on_shelf\n", "EXECUTED: The books are now on the shelf.\n", "\n", "Cycle #5\n", "OBSERVATION: The room is currently in this state: {'clothes': 'hamper', 'books': 'shelf', 'wastebin': '1 item'}.\n", "MODEL: {\"next_action\": \"empty_wastebin\", \"rationale\": \"The wastebin has one item in it and needs to be emptied to achieve the goal.\"}\n", "ACTION: empty_wastebin\n", "EXECUTED: The wastebin is emptied.\n", "\n", "{'clothes': 'hamper', 'books': 'shelf', 'wastebin': 'empty'} The room is tidy!\n" ] } ], "source": [ "# We are passing in a while loop continuation test function:\n", "# Continue while loop when number of cycles <= 10 AND the room is not yet tidy.\n", "# We are explicitly testing if the room is tidy within code.\n", "#\n", "# To save space, only the first cycle prints the full prompt.\n", "# The same prompt template is used for every model call with a modified room state.\n", "room_state, result = main_react_loop(\n", " lambda c, r: c <= 10 and not is_room_tidy(r), logging\n", ")\n", "print(room_state, result)" ] }, { "cell_type": "markdown", "metadata": { "id": "VY6cFvSvhAmt" }, "source": [ "### The Model decides when the goal is reached\n", "\n", "The model can also decide if the goal has been reached, instead of the application explicitly testing for the condition.\n", "This is useful in scenarios where the goal state is variable and/or too complex to define in code.\n", "\n", "To facilitate this, instead of:\n", "\n", " while cycle <= 10 and not is_room_tidy(room_state):\n", "\n", "we just have:\n", "\n", " while cycle <= 10:\n", "\n", "Recall that we previously defined a \"done\" action above. Even though it is not a real function, the model and the application can use it to determine termination. Note that this creates an extra cycle.\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "id": "c69dbb409b30" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cycle #1\n", "PROMPT:\n", "\tThe room is currently in this state: {'clothes': 'floor', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tYou can pick any of the following action labels: \n", "\t<actions>\n", "\t put_clothes_in_hamper - place clothes into hamper, instead of carrying them around in your hand.\n", "\t pick_up_clothes - pick clothes up from the floor.\n", "\t pick_up_books - pick books up from anywhere not on the shelf\n", "\t place_books_on_shelf - self explanatory.\n", "\t empty_wastebin - self explanatory.\n", "\t done - when everything are in the right place.\n", "\t</actions>\n", "\tWhich one should be the next step to achieve the goal? 
\n", "\tReturn a single JSON object containing fields \"next_action\" and \"rationale\".\n", "MODEL: {\"next_action\": \"pick_up_clothes\", \"rationale\": \"The clothes are on the floor and need to be picked up before they can be put in the hamper.\"}\n", "ACTION: pick_up_clothes\n", "EXECUTED: The clothes are now being carried.\n", "\n", "Cycle #2\n", "OBSERVATION: The room is currently in this state: {'clothes': 'carrying by hand', 'books': 'scattered', 'wastebin': 'empty'}.\n", "MODEL: {\"next_action\": \"put_clothes_in_hamper\", \"rationale\": \"The clothes need to be in the hamper, and they are currently being carried. So the next step is to put them in the hamper.\"}\n", "ACTION: put_clothes_in_hamper\n", "EXECUTED: The clothes are now in the hamper.\n", "\n", "Cycle #3\n", "OBSERVATION: The room is currently in this state: {'clothes': 'hamper', 'books': 'scattered', 'wastebin': 'empty'}.\n", "MODEL: {\"next_action\": \"pick_up_books\", \"rationale\": \"The goal is to have all books on the shelf, so we need to pick them up first.\"}\n", "ACTION: pick_up_books\n", "EXECUTED: The books are now in my hand.\n", "\n", "Cycle #4\n", "OBSERVATION: The room is currently in this state: {'clothes': 'hamper', 'books': 'in hand', 'wastebin': 'empty'}.\n", "MODEL: {\"next_action\": \"place_books_on_shelf\", \"rationale\": \"The books need to be on the shelf, and they are currently in hand.\"}\n", "ACTION: place_books_on_shelf\n", "EXECUTED: The books are now on the shelf.\n", "\n", "Cycle #5\n", "OBSERVATION: The room is currently in this state: {'clothes': 'hamper', 'books': 'shelf', 'wastebin': '1 item'}.\n", "MODEL: {\"next_action\": \"empty_wastebin\", \"rationale\": \"The wastebin has one item in it and needs to be emptied.\"}\n", "ACTION: empty_wastebin\n", "EXECUTED: The wastebin is emptied.\n", "\n", "Cycle #6\n", "OBSERVATION: The room is currently in this state: {'clothes': 'hamper', 'books': 'shelf', 'wastebin': 'empty'}.\n", "MODEL: {\"next_action\": \"done\", \"rationale\": \"All items are already in their correct places: clothes in the hamper, books on the shelf, and the wastebin is empty.\"}\n", "{'clothes': 'hamper', 'books': 'shelf', 'wastebin': 'empty'} The room is tidy!\n" ] } ], "source": [ "# We are passing in a while loop continuation test function:\n", "# Continue while loop when number of cycles <= 10\n", "# We are no longer testing if the room is tidy within code.\n", "# The decision is now up to the model.\n", "room_state, result = main_react_loop(lambda c, r: c <= 10, logging)\n", "print(room_state, result)" ] }, { "cell_type": "markdown", "metadata": { "id": "1g8wcyWLay_8" }, "source": [ "## Example 2: Incremental Messaging Using the Chat API" ] }, { "cell_type": "markdown", "metadata": { "id": "bwpqqrqcfL6l" }, "source": [ "### The Chat session loop\n", "\n", "The difference between using the stateless API and the stateful chat session is that the list of function choices is only given to the session object once. In subsequent chat messages we only send the action response and the current state of the environment. You can see in this loop that we formulate the prompt / message differently depending on whether we are at the start of the session or have just performed an action.\n"
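, "\n", "For reference, these are the key session calls used in the loop below:\n", "\n", "```python\n", "session = model.start_chat()  # stateful: the chat history is kept on the client\n", "response = session.send_message(\n", "    msg, generation_config={\"response_mime_type\": \"application/json\"}\n", ")\n", "```"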
] }, { "cell_type": "code", "execution_count": 12, "metadata": { "id": "Yk872z-Rax0l" }, "outputs": [], "source": [ "# Main ReAct loop\n", "\n", "\n", "def main_react_loop_chat(session, loop_continues, log):\n", " room_state = {}\n", " reset_room_state(room_state)\n", " trash_added = False\n", "\n", " prev_action = None\n", " msg = \"\"\n", " cycle = 1\n", " while loop_continues(cycle, room_state):\n", " log(f\"Cycle #{cycle}\")\n", " # Observe the environment (use Gemini to generate an action thought)\n", " try: # REASON #\n", " if prev_action:\n", " msg = \"\\n\".join(\n", " [\n", " prev_action,\n", " f\"ENVIRONMENT: The room is currently in this state: {room_state}.\",\n", " \"Which should be the next action?\",\n", " ]\n", " )\n", " log(\"MESSAGE:\\n{}\".format(indent(msg, 1, \"\\t\")))\n", " else:\n", " msg = get_next_step_full_prompt(room_state, cycle, log)\n", "\n", " # MODEL CALL\n", " response = session.send_message(\n", " msg, generation_config={\"response_mime_type\": \"application/json\"}\n", " )\n", " action_label = get_action_label(get_text(response).strip(), log)\n", "\n", " except Exception:\n", " traceback.print_exc()\n", " log(response)\n", " break\n", "\n", " # Execute the action and get the observation\n", " if action_label == \"done\":\n", " break\n", "\n", " try: # ACTION #\n", " # Call the function mapped from the label\n", " room_state, acknowledgement = get_func(action_label)(room_state)\n", " prev_action = f\"ACTION: {action_label}\\nEXECUTED: {acknowledgement}\\n\"\n", " log(prev_action)\n", "\n", " except Exception:\n", " log(\"No action suggested.\")\n", "\n", " # Simulating a change in environment\n", " if cycle == 4 and not trash_added:\n", " room_state[\"wastebin\"] = \"1 item\"\n", " trash_added = True\n", "\n", " cycle += 1\n", " # End of while loop\n", "\n", " # Determine the final result\n", " result = (\n", " \"The room is tidy!\" if is_room_tidy(room_state) else \"The room is not tidy!\"\n", " )\n", "\n", " return room_state, result" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "id": "I6dcLLuTduZY" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cycle #1\n", "PROMPT:\n", "\tThe room is currently in this state: {'clothes': 'floor', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tYou can pick any of the following action labels: \n", "\t<actions>\n", "\t put_clothes_in_hamper - place clothes into hamper, instead of carrying them around in your hand.\n", "\t pick_up_clothes - pick clothes up from the floor.\n", "\t pick_up_books - pick books up from anywhere not on the shelf\n", "\t place_books_on_shelf - self explanatory.\n", "\t empty_wastebin - self explanatory.\n", "\t done - when everything are in the right place.\n", "\t</actions>\n", "\tWhich one should be the next step to achieve the goal? 
\n", "\tReturn a single JSON object containing fields \"next_action\" and \"rationale\".\n", "MODEL: {\"next_action\": \"pick_up_clothes\", \"rationale\": \"The clothes are on the floor and need to be picked up before they can be put in the hamper.\"}\n", "ACTION: pick_up_clothes\n", "EXECUTED: The clothes are now being carried.\n", "\n", "Cycle #2\n", "MESSAGE:\n", "\tACTION: pick_up_clothes\n", "\tEXECUTED: The clothes are now being carried.\n", "\t\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'carrying by hand', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\"next_action\": \"put_clothes_in_hamper\", \"rationale\": \"Now that the clothes are picked up, they should be put in the hamper.\"}\n", "ACTION: put_clothes_in_hamper\n", "EXECUTED: The clothes are now in the hamper.\n", "\n", "Cycle #3\n", "MESSAGE:\n", "\tACTION: put_clothes_in_hamper\n", "\tEXECUTED: The clothes are now in the hamper.\n", "\t\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'hamper', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\"next_action\": \"pick_up_books\", \"rationale\": \"The clothes are put away, so now we should pick up the scattered books.\"}\n", "ACTION: pick_up_books\n", "EXECUTED: The books are now in my hand.\n", "\n", "Cycle #4\n", "MESSAGE:\n", "\tACTION: pick_up_books\n", "\tEXECUTED: The books are now in my hand.\n", "\t\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'hamper', 'books': 'in hand', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\"next_action\": \"place_books_on_shelf\", \"rationale\": \"The books need to be placed on the shelf to achieve the goal.\"}\n", "ACTION: place_books_on_shelf\n", "EXECUTED: The books are now on the shelf.\n", "\n", "Cycle #5\n", "MESSAGE:\n", "\tACTION: place_books_on_shelf\n", "\tEXECUTED: The books are now on the shelf.\n", "\t\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'hamper', 'books': 'shelf', 'wastebin': '1 item'}.\n", "\tWhich should be the next action?\n", "MODEL: {\"next_action\": \"empty_wastebin\", \"rationale\": \"The wastebin has one item in it and needs to be emptied.\"}\n", "ACTION: empty_wastebin\n", "EXECUTED: The wastebin is emptied.\n", "\n", "Cycle #6\n", "MESSAGE:\n", "\tACTION: empty_wastebin\n", "\tEXECUTED: The wastebin is emptied.\n", "\t\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'hamper', 'books': 'shelf', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\"next_action\": \"done\", \"rationale\": \"All clothes are in the hamper, books are on the shelf, and the wastebin is empty. 
The room is tidy.\"}\n", "{'clothes': 'hamper', 'books': 'shelf', 'wastebin': 'empty'} The room is tidy!\n" ] } ], "source": [ "session = model.start_chat()\n", "\n", "room_state, result = main_react_loop_chat(session, lambda c, r: c <= 10, logging)\n", "print(room_state, result)" ] }, { "cell_type": "markdown", "metadata": { "id": "DUTSUNDHfHS6" }, "source": [ "### Display the full chat history" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "id": "VjhvHk8wfGnc" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[role: \"user\"\n", "parts {\n", " text: \"The room is currently in this state: {\\'clothes\\': \\'floor\\', \\'books\\': \\'scattered\\', \\'wastebin\\': \\'empty\\'}.\\nYou can pick any of the following action labels: \\n<actions>\\n put_clothes_in_hamper - place clothes into hamper, instead of carrying them around in your hand.\\n pick_up_clothes - pick clothes up from the floor.\\n pick_up_books - pick books up from anywhere not on the shelf\\n place_books_on_shelf - self explanatory.\\n empty_wastebin - self explanatory.\\n done - when everything are in the right place.\\n</actions>\\nWhich one should be the next step to achieve the goal? \\nReturn a single JSON object containing fields \\\"next_action\\\" and \\\"rationale\\\".\"\n", "}\n", ", role: \"model\"\n", "parts {\n", " text: \"{\\\"next_action\\\": \\\"pick_up_clothes\\\", \\\"rationale\\\": \\\"The clothes are on the floor and need to be picked up before they can be put in the hamper.\\\"}\\n\\n\"\n", "}\n", ", role: \"user\"\n", "parts {\n", " text: \"ACTION: pick_up_clothes\\nEXECUTED: The clothes are now being carried.\\n\\nENVIRONMENT: The room is currently in this state: {\\'clothes\\': \\'carrying by hand\\', \\'books\\': \\'scattered\\', \\'wastebin\\': \\'empty\\'}.\\nWhich should be the next action?\"\n", "}\n", ", role: \"model\"\n", "parts {\n", " text: \"{\\\"next_action\\\": \\\"put_clothes_in_hamper\\\", \\\"rationale\\\": \\\"Now that the clothes are picked up, they should be put in the hamper.\\\"}\\n\"\n", "}\n", ", role: \"user\"\n", "parts {\n", " text: \"ACTION: put_clothes_in_hamper\\nEXECUTED: The clothes are now in the hamper.\\n\\nENVIRONMENT: The room is currently in this state: {\\'clothes\\': \\'hamper\\', \\'books\\': \\'scattered\\', \\'wastebin\\': \\'empty\\'}.\\nWhich should be the next action?\"\n", "}\n", ", role: \"model\"\n", "parts {\n", " text: \"{\\\"next_action\\\": \\\"pick_up_books\\\", \\\"rationale\\\": \\\"The clothes are put away, so now we should pick up the scattered books.\\\"}\\n\"\n", "}\n", ", role: \"user\"\n", "parts {\n", " text: \"ACTION: pick_up_books\\nEXECUTED: The books are now in my hand.\\n\\nENVIRONMENT: The room is currently in this state: {\\'clothes\\': \\'hamper\\', \\'books\\': \\'in hand\\', \\'wastebin\\': \\'empty\\'}.\\nWhich should be the next action?\"\n", "}\n", ", role: \"model\"\n", "parts {\n", " text: \"{\\\"next_action\\\": \\\"place_books_on_shelf\\\", \\\"rationale\\\": \\\"The books need to be placed on the shelf to achieve the goal.\\\"}\\n\"\n", "}\n", ", role: \"user\"\n", "parts {\n", " text: \"ACTION: place_books_on_shelf\\nEXECUTED: The books are now on the shelf.\\n\\nENVIRONMENT: The room is currently in this state: {\\'clothes\\': \\'hamper\\', \\'books\\': \\'shelf\\', \\'wastebin\\': \\'1 item\\'}.\\nWhich should be the next action?\"\n", "}\n", ", role: \"model\"\n", "parts {\n", " text: \"{\\\"next_action\\\": \\\"empty_wastebin\\\", \\\"rationale\\\": \\\"The wastebin has one item 
in it and needs to be emptied.\\\"}\\n\"\n", "}\n", ", role: \"user\"\n", "parts {\n", " text: \"ACTION: empty_wastebin\\nEXECUTED: The wastebin is emptied.\\n\\nENVIRONMENT: The room is currently in this state: {\\'clothes\\': \\'hamper\\', \\'books\\': \\'shelf\\', \\'wastebin\\': \\'empty\\'}.\\nWhich should be the next action?\"\n", "}\n", ", role: \"model\"\n", "parts {\n", " text: \"{\\\"next_action\\\": \\\"done\\\", \\\"rationale\\\": \\\"All clothes are in the hamper, books are on the shelf, and the wastebin is empty. The room is tidy.\\\"}\\n\"\n", "}\n", "]\n" ] } ], "source": [ "print(session.history)" ] }, { "cell_type": "markdown", "metadata": { "id": "Xylz5_c8foms" }, "source": [ "## Example 3: Leveraging Gemini Function Calling Support\n", "\n", "For more details, please refer to the documentation on [Function Calling](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling). \n", "\n", "In the previous two examples we simulated the function calling feature by explicitly prompting the model with a list of action labels and setting a JSON mode output. This example uses the Function Calling feature: the list of possible actions is supplied as 'Tool' declarations, and by default the feature returns structured results." ] }, { "cell_type": "markdown", "metadata": { "id": "bIDKrkZ1-3ke" }, "source": [ "### Tool Declarations\n", "See [Best Practices](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling#best-practices) for guidance on achieving good results with Function Calling." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "id": "I8xgYekJ-_Ks" }, "outputs": [], "source": [ "# Function declarations describing each action to the model\n", "pick_up_clothes_func = FunctionDeclaration(\n", " name=\"pick_up_clothes\",\n", " description=\"The act of picking clothes up from any place\",\n", " parameters={\"type\": \"object\"},\n", ")\n", "\n", "put_clothes_in_hamper_func = FunctionDeclaration(\n", " name=\"put_clothes_in_hamper\",\n", " description=\"Put the clothes being carried into a hamper\",\n", " parameters={\"type\": \"object\"},\n", ")\n", "\n", "pick_up_books_func = FunctionDeclaration(\n", " name=\"pick_up_books\",\n", " description=\"The act of picking books up from any place\",\n", " parameters={\"type\": \"object\"},\n", ")\n", "\n", "place_books_on_shelf_func = FunctionDeclaration(\n", " name=\"place_books_on_shelf\",\n", " description=\"Put the books being carried onto a shelf\",\n", " parameters={\"type\": \"object\"},\n", ")\n", "\n", "empty_wastebin_func = FunctionDeclaration(\n", " name=\"empty_wastebin\",\n", " description=\"Empty out the wastebin\",\n", " parameters={\"type\": \"object\"},\n", ")\n", "\n", "done_func = FunctionDeclaration(\n", " name=\"done\", description=\"The goal has been reached\", parameters={\"type\": \"object\"}\n", ")\n", "\n", "room_tools = Tool(\n", " function_declarations=[\n", " pick_up_clothes_func,\n", " put_clothes_in_hamper_func,\n", " pick_up_books_func,\n", " place_books_on_shelf_func,\n", " empty_wastebin_func,\n", " done_func,\n", " ],\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "Y9k3LX6fDlzB" }, "source": [ "### Model with tool declarations" ] }, { "cell_type": "markdown", "metadata": { "id": "1Rp8IV5XDla9" }, "source": [ "NOTE: Tools can be passed in during the initial creation of the model reference as below, or during `send_message()` and `generate_content()`. 
The choice depends on the variability of the set of tools to be used.\n", "\n", "```\n", "model_fc = GenerativeModel(\n", " \"gemini-2.0-flash\", \n", " system_instruction=[\n", " \"You are an assistant that helps me tidy my room.\",\n", " \"Your goal is to make sure all the books are on the shelf, all clothes are in the hamper, and the trash is empty.\",\n", " \"You cannot receive any input from me.\"\n", " ],\n", " tools=[ room_tools ],\n", ")\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "ZiqEr7OwCs4v" }, "source": [ "### The function calling model response\n", "With Function Calling, the tool choices are supplied through the API, so it is no longer necessary to include them in your prompt or to specify the output format. For more details see the function calling [API Reference](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/function-calling#python_1).\n", "```\n", "response = session.send_message( msgs, tools=[ room_tools ]) \n", "```\n", "\n", "The following raw model response is expected:\n", "```\n", "MESSAGE:\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'floor', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "\n", "RAW RESPONSE:\n", "\n", "candidates {\n", " content {\n", " role: \"model\"\n", " parts {\n", " function_call {\n", " name: \"pick_up_clothes\"\n", " args {\n", " }\n", " }\n", " }\n", " },\n", " finish_reason: STOP,\n", " ...\n", "}\n", "```\n", "Use the following function to extract the function calling information from the response object:\n", "```\n", "from typing import Dict, List\n", "\n", "from vertexai.generative_models import GenerationResponse\n", "\n", "\n", "# Helper function to extract one or more function calls from a Gemini Function Call response\n", "def extract_function_calls(response: GenerationResponse) -> List[Dict]:\n", " function_calls = []\n", " if response.candidates[0].function_calls:\n", " for function_call in response.candidates[0].function_calls:\n", " function_call_dict = {function_call.name: {}}\n", " for key, value in function_call.args.items():\n", " function_call_dict[function_call.name][key] = value\n", " function_calls.append(function_call_dict)\n", " return function_calls\n", "```\n", "In recent versions of specific Gemini models (from May 2024 onward), Gemini has the ability to return two or more function calls in parallel (i.e., two or more function calls within a single response object). Parallel function calling allows you to fan out and parallelize your API calls or other actions that you perform in your application code, so you don't have to work through each function call and its response one by one. Refer to the [Gemini Function Calling documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling) for more information on which Gemini model versions support parallel function calling, and this [notebook on parallel function calling](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/function-calling/parallel_function_calling.ipynb) for examples." ] }, { "cell_type": "markdown", "metadata": { "id": "e98c7c33c936" }, "source": [ "### The Main ReAct Loop\n", "\n", "In this third example we have reorganized the code for easier comprehension. 
The three main components of the loop are broken out into separate functions:\n", "- observe and reason - modified to use the Function Calling feature*\n", "- execute action - simplified\n", "- main loop - calling the other two functions cyclically.\n", "\n", "\\* Main changes:\n", "- The list of tools declared above is sent to the model via the `tools=` argument of the `send_message()` call.\n", "- Any function execution responses are reported back to the model as a structured input 'Part' object in the next cycle." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "id": "f66655a3b3a7" }, "outputs": [], "source": [ "# Wrapping the observation and model calling code into a function for better main loop readability.\n", "\n", "\n", "def observe_and_reason(session, state: dict, prev_action: str, log: Callable) -> str:\n", " \"\"\"Uses the language model (Gemini) to select the next action.\"\"\"\n", " try:\n", " msgs = []\n", " if prev_action:\n", " msgs.append(\n", " Part.from_function_response(\n", " name=\"previous_action\", response={\"content\": prev_action}\n", " )\n", " )\n", "\n", " prompt = \"\\n\".join(\n", " [\n", " f\"ENVIRONMENT: The room is currently in this state: {state}.\",\n", " \"Which should be the next action?\",\n", " ]\n", " )\n", " msgs.append(prompt)\n", " log(\n", " \"MESSAGE:\\n{}\".format(\n", " indent(\n", " \"\\n\".join([prev_action, prompt] if prev_action else [prompt]),\n", " 1,\n", " \"\\t\",\n", " )\n", " )\n", " )\n", "\n", " response = session.send_message(\n", " msgs, tools=[room_tools]\n", " ) # JSON mode unnecessary.\n", " action_label = get_action_from_function_call(get_function_call(response), log)\n", " return action_label\n", "\n", " except Exception as e:\n", " log(f\"Error during action selection: {e}\")\n", " traceback.print_exc()\n", " return \"done\" # Or a suitable default action" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "id": "da084bc71468" }, "outputs": [], "source": [ "# Wrapping the action execution code into a function for better main loop readability.\n", "\n", "\n", "def execute_action(state: dict, action_label: str, log: Callable) -> tuple[dict, str]:\n", " \"\"\"Executes the action on the room state and returns the updated state and an acknowledgement.\"\"\"\n", " try:\n", " # Call the function mapped from the label\n", " state, acknowledgement = get_func(action_label)(state)\n", "\n", " except Exception:\n", " acknowledgement = \"No action suggested or action not recognized.\"\n", "\n", " return state, acknowledgement" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "id": "6fb3f986f6e7" }, "outputs": [], "source": [ "# Main ReAct loop\n", "\n", "\n", "def main_react_loop_chat_fc(session, loop_continues, log):\n", " room_state = {}\n", " reset_room_state(room_state)\n", " trash_added = False\n", "\n", " prev_action = None\n", " cycle = 1\n", " while loop_continues(cycle, room_state):\n", " log(f\"Cycle #{cycle}\")\n", " # Observe the environment (use Gemini to generate an action thought)\n", " action_label = observe_and_reason(session, room_state, prev_action, log)\n", "\n", " # Execute the action and get the observation\n", " if action_label == \"done\":\n", " break\n", " room_state, acknowledgement = execute_action(room_state, action_label, log)\n", " prev_action = f\"ACTION: {action_label}\\nEXECUTED: {acknowledgement}\"\n", " log(prev_action + \"\\n\")\n", "\n", " # Simulating a change in environment\n", " if cycle == 4 and not trash_added:\n", " room_state[\"wastebin\"] = \"1 
item\"\n", " trash_added = True\n", "\n", " cycle += 1\n", " # End of while loop\n", "\n", " # Determine the final result\n", " result = (\n", " \"The room is tidy!\" if is_room_tidy(room_state) else \"The room is not tidy!\"\n", " )\n", "\n", " return room_state, result" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "id": "0ekJ1kScDNc1" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cycle #1\n", "MESSAGE:\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'floor', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\n", " \"name\": \"pick_up_clothes\",\n", " \"args\": {}\n", "}\n", "ACTION: pick_up_clothes\n", "EXECUTED: The clothes are now being carried.\n", "\n", "Cycle #2\n", "MESSAGE:\n", "\tACTION: pick_up_clothes\n", "\tEXECUTED: The clothes are now being carried.\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'carrying by hand', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\n", " \"name\": \"put_clothes_in_hamper\",\n", " \"args\": {}\n", "}\n", "ACTION: put_clothes_in_hamper\n", "EXECUTED: The clothes are now in the hamper.\n", "\n", "Cycle #3\n", "MESSAGE:\n", "\tACTION: put_clothes_in_hamper\n", "\tEXECUTED: The clothes are now in the hamper.\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'hamper', 'books': 'scattered', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\n", " \"name\": \"pick_up_books\",\n", " \"args\": {}\n", "}\n", "ACTION: pick_up_books\n", "EXECUTED: The books are now in my hand.\n", "\n", "Cycle #4\n", "MESSAGE:\n", "\tACTION: pick_up_books\n", "\tEXECUTED: The books are now in my hand.\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'hamper', 'books': 'in hand', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\n", " \"name\": \"place_books_on_shelf\",\n", " \"args\": {}\n", "}\n", "ACTION: place_books_on_shelf\n", "EXECUTED: The books are now on the shelf.\n", "\n", "Cycle #5\n", "MESSAGE:\n", "\tACTION: place_books_on_shelf\n", "\tEXECUTED: The books are now on the shelf.\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'hamper', 'books': 'shelf', 'wastebin': '1 item'}.\n", "\tWhich should be the next action?\n", "MODEL: {\n", " \"name\": \"empty_wastebin\",\n", " \"args\": {}\n", "}\n", "ACTION: empty_wastebin\n", "EXECUTED: The wastebin is emptied.\n", "\n", "Cycle #6\n", "MESSAGE:\n", "\tACTION: empty_wastebin\n", "\tEXECUTED: The wastebin is emptied.\n", "\tENVIRONMENT: The room is currently in this state: {'clothes': 'hamper', 'books': 'shelf', 'wastebin': 'empty'}.\n", "\tWhich should be the next action?\n", "MODEL: {\n", " \"name\": \"done\",\n", " \"args\": {}\n", "}\n", "{'clothes': 'hamper', 'books': 'shelf', 'wastebin': 'empty'} The room is tidy!\n" ] } ], "source": [ "session = model.start_chat()\n", "\n", "room_state, result = main_react_loop_chat_fc(session, lambda c, r: c <= 10, logging)\n", "print(room_state, result)" ] } ], "metadata": { "colab": { "name": "intro_diy_react_agent.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }