notebooks/content_api_intro.ipynb (680 lines of code) (raw):

{ "cells": [ { "metadata": { "id": "WRPtnVSAOwcZ" }, "cell_type": "code", "source": [ "# Copyright 2025 DeepMind Technologies Limited. All Rights Reserved.\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# http://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ], "outputs": [], "execution_count": null }, { "metadata": { "id": "hYapZeTkOwcZ" }, "cell_type": "markdown", "source": [ "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/google-gemini/genai-processors/blob/main/notebooks/content_api_intro.ipynb)" ] }, { "metadata": { "id": "AzRYfdhEZCj8" }, "cell_type": "markdown", "source": [ "# Welcome to the GenAI Processors Content API\n", "\n", "This notebook is your friendly introduction to the `content_api` module within\n", "the GenAI Processors library. We'll explore how to create, manipulate, and\n", "interact with the fundamental building blocks of content: `ProcessorPart` and\n", "`ProcessorContent`.\n", "\n", "**What you'll learn:**\n", "\n", "* How to create `ProcessorPart` objects from various data types (strings, \n", " bytes, PIL Images, GenAI Parts).\n", "* The key attributes of a `ProcessorPart` (like `mimetype`, `substream_name`,\n", " `role`, `metadata`).\n", "* How to construct `ProcessorContent` objects to group multiple \n", " `ProcessorPart`s.\n", "* Useful utility functions for working with content (e.g., `as_text`).\n", "* How `ProcessorPart` and `ProcessorContent` integrate with GenAI models.\n", "\n", "Let's dive in and unlock the power of structured content in your AI pipelines!" ] }, { "metadata": { "id": "EHBc--e3ZCj8" }, "cell_type": "markdown", "source": [ "## 1. 🛠️ Setup" ] }, { "metadata": { "id": "PyvMqTpOZCj8" }, "cell_type": "code", "source": [ "!pip install genai_processors" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "CkG_a8JhZCj8" }, "cell_type": "code", "source": [ "# @title Import modules\n", "import dataclasses\n", "import dataclasses_json\n", "from genai_processors import content_api\n", "from genai_processors import processor\n", "from genai_processors import streams\n", "from genai_processors.core import genai_model\n", "from google.colab import userdata\n", "from google.genai import types as genai_types\n", "from IPython.display import display\n", "import nest_asyncio\n", "import PIL.Image\n", "import requests\n", "\n", "nest_asyncio.apply() # Needed to run async loops in Colab\n", "\n", "# Convenient aliases\n", "ProcessorPart = content_api.ProcessorPart\n", "ProcessorContent = content_api.ProcessorContent\n", "as_text = content_api.as_text\n", "is_text = content_api.is_text\n", "is_json = content_api.is_json\n", "\n", "# For GenAI Model interaction (optional, but useful for demonstration!)\n", "try:\n", " API_KEY = userdata.get(\"GOOGLE_API_KEY\")\n", " if not API_KEY:\n", " print(\n", " \"⚠️ API Key not found in Colab secrets. GenAI model examples will be\"\n", " \" skipped.\"\n", " )\n", " print(\n", " \"To run these, add your Gemini API Key to Colab Secrets with the name\"\n", " \" 'GOOGLE_API_KEY'.\"\n", " )\n", "except userdata.SecretNotFoundError:\n", " API_KEY = None # Or set your API key directly here if not using Colab secrets\n", " print(\n", " \"`userdata` not imported. Set API_KEY manually if you want to run GenAI\"\n", " \" model examples.\"\n", " )\n", "\n", "GenaiModel = genai_model.GenaiModel\n", "\n", "\n", "def generate_gdm_logo_bytes() -\u003e bytes:\n", " # The URL of the DeepMind logo image\n", " image_url = \"https://upload.wikimedia.org/wikipedia/commons/thumb/6/6a/DeepMind_new_logo.svg/2560px-DeepMind_new_logo.svg.png\"\n", " headers = {\"User-Agent\": \"genai_processors Colab\"}\n", " response = requests.get(image_url, headers=headers)\n", " response.raise_for_status()\n", " return response.content" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "Y_xs5YKoZCj8" }, "cell_type": "markdown", "source": [ "## 2. 🧱 ProcessorPart: The Atomic Unit of Content\n", "\n", "A `ProcessorPart` is the smallest, indivisible piece of content your processors\n", "will handle. Think of it as a typed container for a single modality and a single\n", "role.\n", "\n", "You can create `ProcessorPart` objects from various Python types:" ] }, { "metadata": { "id": "y5MU5ZmgZCj9" }, "cell_type": "code", "source": [ "text_data = \"Hello, GenAI World! This is a text part.\"\n", "text_part = ProcessorPart(text_data)\n", "\n", "print(f\"Original data: {text_data}\")\n", "print(f\"ProcessorPart: {text_part}\")\n", "print(\n", " f\"MIME type: {text_part.mimetype}\"\n", ") # Automatically inferred as 'text/plain'\n", "print(f\"Text content: '{text_part.text}'\")" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "cAXSW37sZCj9" }, "cell_type": "markdown", "source": [ "#### Key Attributes of ProcessorPart:\n", "\n", "* `part`: The underlying `google.genai.types.Part` object.\n", "* `text`: Access the text content (raises ValueError if not text).\n", "* `bytes`: Access the content as bytes.\n", "* `pil_image`: Access the content as a PIL Image (raises ValueError if not an\n", " image).\n", "* `mimetype`: The\n", " [MIME type](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types)\n", " of the content (e.g., `text/plain`, `image/png`, `application/json`).\n", "* `role`: Indicates the producer of the content (e.g., `\"user\"`, `\"model\"`,\n", " `\"tool\"`). Useful for conversational AI. Default is `\"\"`.\n", "* `substream_name`: A custom string to categorize or route parts. Useful for\n", " distinguishing different types of information or alternative responses.\n", " Default is `\"\"`.\n", " - Some substream names have special meaning in processors; for example,\n", " `status` and `debug` are reserved for content that needs to be returned\n", " early to the user and will not be processed further down the pipeline.\n", " - `realtime` is for content that will use the Live API\n", " `send_realtime_content()` method (in contrast to\n", " `send_client_content()`).\n", "* `metadata`: A dictionary for any other arbitrary information you want to\n", " attach to the part.\n", "\n", "Let's create a text part with more attributes:" ] }, { "metadata": { "id": "nbkbhZB1ZCj9" }, "cell_type": "code", "source": [ "detailed_text_part = ProcessorPart(\n", " \"This is a user query.\",\n", " role=\"user\",\n", " substream_name=\"user_query_main\",\n", " metadata={\n", " \"timestamp\": \"2024-07-29T10:00:00Z\",\n", " \"session_id\": \"xyz123\",\n", " },\n", ")\n", "\n", "print(f\"ProcessorPart: {detailed_text_part}\")\n", "print(f\"Role: {detailed_text_part.role}\")\n", "print(f\"Substream Name: {detailed_text_part.substream_name}\")\n", "print(f\"Custom Metadata: {detailed_text_part.metadata}\")\n", "print(\n", " f\"Timestamp from metadata: {detailed_text_part.get_metadata('timestamp')}\"\n", ")" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "RaX7sJ1pZCj9" }, "cell_type": "markdown", "source": [ "### From Bytes (e.g., Image Data)\n", "\n", "When creating a `ProcessorPart` from raw bytes, you **must** specify the\n", "`mimetype`." ] }, { "metadata": { "id": "nUhKz8_jZCj9" }, "cell_type": "code", "source": [ "gdm_png_bytes = generate_gdm_logo_bytes()\n", "\n", "image_bytes_part = ProcessorPart(gdm_png_bytes, mimetype=\"image/png\")\n", "\n", "print(f\"ProcessorPart (from bytes): {image_bytes_part}\")\n", "print(f\"MIME type: {image_bytes_part.mimetype}\")\n", "print(f\"Has bytes: {image_bytes_part.bytes is not None}\")\n", "\n", "# You can access it as a PIL Image too\n", "try:\n", " pil_img = image_bytes_part.pil_image\n", " display(pil_img)\n", "except Exception as e:\n", " print(f\"Error converting to PIL Image: {e}\")" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "Fg7llHSVZCj9" }, "cell_type": "markdown", "source": [ "### From a PIL (Pillow) Image Object\n", "\n", "You can directly pass a `PIL.Image.Image` object. The library will handle\n", "converting it to bytes and inferring the MIME type (defaults to `image/webp` if\n", "the PIL Image has no format, or uses its existing format like `image/png`,\n", "`image/jpeg`)." ] }, { "metadata": { "id": "px4r9E8uZCj9" }, "cell_type": "code", "source": [ "# Create a simple PIL Image\n", "pil_image_obj = PIL.Image.new(\"RGB\", (60, 30), color=\"red\")\n", "pil_image_part = ProcessorPart(pil_image_obj)\n", "\n", "print(f\"ProcessorPart (from PIL Image): {pil_image_part}\")\n", "print(\n", " f\"Inferred MIME type: {pil_image_part.mimetype}\"\n", ") # Likely image/webp or image/png\n", "display(pil_image_part.pil_image) # Display in Colab\n", "\n", "# You can also specify a mimetype if you want a different format\n", "jpeg_pil_image_part = ProcessorPart(pil_image_obj, mimetype=\"image/jpeg\")\n", "print(f\"ProcessorPart (from PIL with specified JPEG): {jpeg_pil_image_part}\")\n", "print(f\"MIME type: {jpeg_pil_image_part.mimetype}\")\n", "display(jpeg_pil_image_part.pil_image)" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "P5p_W5AyZCj9" }, "cell_type": "markdown", "source": [ "### From a `google.genai.types.Part`\n", "\n", "If you're working with the Google AI SDK, you can directly wrap\n", "`genai_types.Part` objects in a `ProcessorPart`." ] }, { "metadata": { "id": "UwycCtcdZCj9" }, "cell_type": "code", "source": [ "genai_text_part_sdk = genai_types.Part(text=\"From GenAI SDK!\")\n", "processor_part_from_sdk = ProcessorPart(genai_text_part_sdk, role=\"model\")\n", "\n", "print(f\"ProcessorPart (from GenAI SDK Part): {processor_part_from_sdk}\")\n", "print(f\"Text: {processor_part_from_sdk.text}\")\n", "print(f\"Role: {processor_part_from_sdk.role}\")" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "afMps2VBZCj9" }, "cell_type": "markdown", "source": [ "### From another ProcessorPart\n", "\n", "Creating a `ProcessorPart` from an existing `ProcessorPart` creates a new\n", "instance, which uses the underlying `genai_types.Part` of the existing\n", "`ProcessorPart`. This means changes to the underlying `genai_types.Part` in the\n", "original `ProcessorPart` will be reflected in the new one.\n", "\n", "You can override attributes like `role`, `substream_name`, and `metadata` when\n", "creating a new `ProcessorPart` from an existing one." ] }, { "metadata": { "id": "enXjCrpSZCj9" }, "cell_type": "code", "source": [ "original_part = ProcessorPart(\n", " \"Original message.\",\n", " role=\"user\",\n", " substream_name=\"original_stream\",\n", " metadata={\"version\": 1},\n", ")\n", "\n", "# Simple copy\n", "copied_part = ProcessorPart(original_part)\n", "print(f\"Original: {original_part}\")\n", "print(f\"Copied: {copied_part}\")\n", "print(f\"Are they the same object? {original_part is copied_part}\") # False\n", "print(f\"Are they equal in value? {original_part == copied_part}\") # True\n", "\n", "# Copy with overridden attributes\n", "modified_copy_part = ProcessorPart(\n", " original_part,\n", " role=\"model\", # Changed role\n", " substream_name=\"modified_stream\", # Changed substream\n", " metadata={\"version\": 2, \"status\": \"processed\"}, # New metadata\n", ")\n", "print(f\"\\nModified Copy: {modified_copy_part}\")\n", "print(f\"Role: {modified_copy_part.role}\")\n", "print(f\"Substream: {modified_copy_part.substream_name}\")\n", "print(f\"Metadata: {modified_copy_part.metadata}\")" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "1b8eTkJvZCj9" }, "cell_type": "markdown", "source": [ "### From a Python Dataclass (Structured Data)\n", "\n", "For structured data, you can use `ProcessorPart.from_dataclass()`. This will\n", "serialize your dataclass to JSON and set the `mimetype` to `application/json;\n", "type=\u003cClassName\u003e`." ] }, { "metadata": { "id": "_50DaL3NZCj9" }, "cell_type": "code", "source": [ "@dataclasses_json.dataclass_json # Important for JSON serialization\n", "@dataclasses.dataclass\n", "class MyStructuredData:\n", " id: int\n", " name: str\n", " tags: list[str]\n", "\n", "\n", "my_data_instance = MyStructuredData(\n", " id=101, name=\"Alpha\", tags=[\"important\", \"beta\"]\n", ")\n", "\n", "dataclass_part = ProcessorPart.from_dataclass(\n", " dataclass=my_data_instance,\n", " role=\"system_event\",\n", " metadata={\"source\": \"internal_module\"},\n", ")\n", "\n", "print(f\"ProcessorPart (from dataclass): {dataclass_part}\")\n", "print(f\"MIME type: {dataclass_part.mimetype}\")\n", "print(f\"Underlying text (JSON): {dataclass_part.text}\")\n", "assert is_json(dataclass_part.mimetype)\n", "\n", "# To get the dataclass back:\n", "retrieved_data_instance = dataclass_part.get_dataclass(MyStructuredData)\n", "print(f\"\\nRetrieved dataclass instance: {retrieved_data_instance}\")\n", "print(\n", " \"Is original equal to retrieved?\"\n", " f\" {my_data_instance == retrieved_data_instance}\"\n", ")" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "Dj--SH4-ZCj9" }, "cell_type": "markdown", "source": [ "## 3. 📦 ProcessorContent: A Collection of Parts\n", "\n", "`ProcessorContent` is a container for one or more `ProcessorPart` objects. It is\n", "often used to represent a complete turn in a conversation or a collection of\n", "multimodal inputs.\n", "\n", "It behaves like a list of `ProcessorPart`s and can be constructed by passing\n", "`ProcessorPart` instances (or data that can be converted to `ProcessorPart`s) to\n", "its constructor." ] }, { "metadata": { "id": "teuUSQKhZCj9" }, "cell_type": "markdown", "source": [ "### Creating ProcessorContent" ] }, { "metadata": { "id": "yl6Dxu6ZZCj9" }, "cell_type": "code", "source": [ "# From individual strings/parts\n", "content1 = ProcessorContent(\n", " \"This is the first part.\",\n", " ProcessorPart(generate_gdm_logo_bytes(), mimetype=\"image/png\", role=\"user\"),\n", " \"And a final textual comment.\",\n", ")\n", "\n", "print(\"Content 1:\")\n", "for part in content1: # You can iterate directly over ProcessorContent\n", " print(\n", " f\" - {part.mimetype}:\"\n", " f\" {part.text if is_text(part.mimetype) else '[binary data]'}\"\n", " )\n", "print(f\"Length of Content 1: {len(content1)}\")\n", "\n", "# From a list of ProcessorPart objects\n", "parts_list = [\n", " ProcessorPart(\"Query about cats.\", role=\"user\"),\n", " ProcessorPart(\"Cats are fascinating creatures!\", role=\"model\"),\n", "]\n", "content2 = ProcessorContent(parts_list)\n", "\n", "print(\"\\nContent 2:\")\n", "for (\n", " mime,\n", " part_obj,\n", ") in content2.items(): # .items() yields (mimetype, ProcessorPart)\n", " print(f\" - Role: {part_obj.role}, Mimetype: {mime}, Text: {part_obj.text}\")\n", "\n", "# From another ProcessorContent object (creates a new collection)\n", "content3 = ProcessorContent(content1)\n", "print(f\"\\nContent 3 (copy of Content 1):\")\n", "print(f\"Is Content 1 same object as Content 3? {content1 is content3}\")\n", "print(f\"Is Content 1 equal to Content 3? {content1 == content3}\")" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "7NX_7Q07ZCj9" }, "cell_type": "markdown", "source": [ "### Utility: `as_text()`\n", "\n", "A common task is to extract all textual information from a `ProcessorContent`\n", "object. The `content_api.as_text()` function does exactly this, concatenating\n", "the text from all text-based parts. You can specify the `substream_name`\n", "argument to extract the text of a given substream." ] }, { "metadata": { "id": "KynozC5PZCj9" }, "cell_type": "code", "source": [ "multimodal_content = ProcessorContent(\n", " ProcessorPart(\"Here is some initial text. \", substream_name=\"other\"),\n", " ProcessorPart(\n", " generate_gdm_logo_bytes(), mimetype=\"image/png\"\n", " ), # This will be ignored by as_text\n", " \"Followed by more text. \",\n", ")\n", "\n", "all_text = as_text(multimodal_content)\n", "print(f\"Concatenated text from multimodal_content:\\n{all_text}\")\n", "\n", "other_text_only = as_text(multimodal_content, substream_name=\"other\")\n", "print(f\"Text from 'other' substream:\\n{other_text_only}\")" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "eOpQGy0EZCj9" }, "cell_type": "markdown", "source": [ "## 4. 🤖 Integration with GenAI Models (Optional)\n", "\n", "The `ProcessorPart` and `ProcessorContent` are designed to work seamlessly with\n", "GenAI models via the `GenaiModel` processor. The `GenaiModel` processor expects\n", "an `AsyncIterable[ProcessorPart]` as input, and `ProcessorContent` (or a list of\n", "`ProcessorPart`s, or even strings) can be easily converted to such a stream\n", "using `streams.stream_content()`." ] }, { "metadata": { "id": "sTO2tLJoZCj9" }, "cell_type": "markdown", "source": [ "#### Async" ] }, { "metadata": { "id": "BFnw61pcZCj9" }, "cell_type": "code", "source": [ "p_genai = GenaiModel(\n", " api_key=API_KEY,\n", " model_name=\"gemini-2.0-flash-lite\",\n", ")\n", "input_stream = streams.stream_content(\n", " \"What is the best part of owning a Dalmatian?\"\n", ")\n", "async for content_part in p_genai(input_stream):\n", " print(content_part.text)" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "adSUaiWDZCj-" }, "cell_type": "markdown", "source": [ "#### Sync\n", "\n", "If you are in a sync environment and prefer to block on the model call, use the\n", "`apply_sync` method, which takes a list of `ProcessorPart` objects as input." ] }, { "metadata": { "id": "pzgSXjQAZCj-" }, "cell_type": "code", "source": [ "p = GenaiModel(\n", " api_key=API_KEY,\n", " model_name='gemini-2.0-flash-lite',\n", ")\n", "\n", "genai_content = processor.apply_sync(\n", " p, ['What is the best part of owning a Dalmatian?']\n", ")\n", "\n", "print('Raw parts\\n\\n')\n", "for content_part in genai_content:\n", " print(content_part)\n", "print('\\n\\n')\n", "print('Using `content_api.as_text:\\n\\n')\n", "print(content_api.as_text(genai_content))" ], "outputs": [], "execution_count": null }, { "metadata": { "id": "FwBCJxi6ZCj-" }, "cell_type": "markdown", "source": [ "## 5. Next Steps\n", "\n", "This tutorial covered the creation of `ProcessorPart` and `ProcessorContent`\n", "objects and how they form the basis of content handling in GenAI Processors. You\n", "also saw how they integrate with GenAI models.\n", "\n", "Check the\n", "[processor intro](https://colab.research.google.com/github/google-gemini/genai-processors/blob/main/notebooks/processor_intro.ipynb)\n", "notebook to dive deeper into creating real-time processors using the Live API." ] } ], "metadata": { "colab": { "last_runtime": { "build_target": "", "kind": "local" }, "private_outputs": true, "provenance": [ { "file_id": "1OsaTR6fZwYkLSzjjVAWyFUDYNVkJQs2D", "timestamp": 1743469082836 }, { "file_id": "1nPCZWSex6YMDln9a00jv1ZinyUETqZfI", "timestamp": 1743007845412 } ], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }