notebooks/model-predictions.ipynb

{ "cells": [ { "cell_type": "code", "source": [ "# Copyright 2024 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ], "metadata": { "id": "l6nGHoRo3mym" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "# Generative AI Knowledge Base model predictions\n", "\n", "To run this notebook, make sure you have uploaded at least one document into your knowledge base.\n", "\n", "> ⭐️ If you haven't, follow the [**Uploading documents and query model** tutorial](https://console.cloud.google.com/products/solutions/deployments?walkthrough_id=panels--sic--generative-ai-knowledge-base_toc).\n", "\n", "Before you begin, make sure all the dependencies are installed." ], "metadata": { "id": "PQFrKlY5Yi2w" } }, { "cell_type": "code", "source": [ "!pip install google-genai google-cloud-aiplatform google-cloud-firestore" ], "metadata": { "id": "W9C3mHjIiZn1", "collapsed": true }, "execution_count": 1, "outputs": [] }, { "cell_type": "markdown", "source": [ "## Overview\n", "\n", "A **Large Language Model (LLM)** can be very good at answering general questions.\n", "But it might not do as well to answer questions from your documents on its own.\n", "\n", "The LLM will answer only from what it learned from its _training dataset_.\n", "Your documents might include information or words that weren't on that dataset.\n", "Or they might be used in a different or more specialized context.\n", "\n", "This is where **Vector Search** comes into place.\n", "Each time you upload a document, the Cloud Function webhook processes it.\n", "When a document is processed, each individual page is _indexed_.\n", "This allows us to not only find documents, but the specific pages.\n", "\n", "The relevant pages can then be used as _context_ for the LLM to answer the question.\n", "This _grounds_ the model to answer questions based on the documents only.\n", "Without this, the model might give wrong answers, or _hallucinations_." ], "metadata": { "id": "tXeqwSesfIjO" } }, { "cell_type": "markdown", "source": [ "## My Google Cloud resources\n", "\n", "Fill in your project ID, the\n", "[Google Cloud location](https://cloud.google.com/about/locations)\n", "you want to use, and your\n", "Vector Search index endpoint ID.\n", "If you followed the tutorial, the deployed index ID should be `deployed_index`, otherwise change it to the ID you chose.\n", "\n", "You can find your Vector Search index endpoint ID in the [Index endpoints tab](https://console.cloud.google.com/vertex-ai/matching-engine/index-endpoints).\n", "\n", "> 💡 The Vector Search index endpoint ID looks like a number, like `1234567890123456789`.\n", "\n", "Run the following cell to set up your resources and authenticate to your account." 
], "metadata": { "id": "nZeNBhYcknZK" } }, { "cell_type": "code", "source": [ "# @title\n", "from google.colab import auth\n", "\n", "project_id = \"\" # @param {type:\"string\"}\n", "location = \"us-central1\" # @param {type:\"string\"}\n", "index_endpoint_id = \"\" # @param {type:\"string\"}\n", "deployed_index_id = \"deployed_index\" # @param {type:\"string\"}\n", "\n", "assert project_id, \"Please set the project_id\"\n", "assert location, \"Please set the location\"\n", "assert index_endpoint_id, \"Please set the index_endpoint_id\"\n", "assert deployed_index_id, \"Please set the deployed_index_id\"\n", "\n", "auth.authenticate_user(project_id=project_id)" ], "metadata": { "cellView": "form", "id": "4EctJVdOj0MY" }, "execution_count": 2, "outputs": [] }, { "cell_type": "markdown", "source": [ "The first step is to initialize the GenAI client library using the project and location of your choice." ], "metadata": { "id": "1P7apRRQabq8" } }, { "cell_type": "code", "source": [ "from google import genai\n", "\n", "genai_client = genai.Client(vertexai=True, project=project_id, location=location)" ], "metadata": { "id": "nkPB50oClSD6" }, "execution_count": 3, "outputs": [] }, { "cell_type": "markdown", "source": [ "## Get text embeddings\n", "\n", "You can use the Gecko model to get embeddings from text.\n", "For more information, see the\n", "[Get text embeddings](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings)\n", "page." ], "metadata": { "id": "5rDc4RataxgE" } }, { "cell_type": "code", "source": [ "def get_text_embedding(text: str) -> list[float]:\n", " response = genai_client.models.embed_content(\n", " model=\"text-embedding-005\",\n", " contents=[text],\n", " )\n", " embeddings = response.embeddings or []\n", " assert len(embeddings) == 1, f\"expected 1 embedding, got {len(embeddings)}\"\n", " return embeddings[0].values or []\n", "\n", "# Convert the question into an embedding.\n", "question = \"What are LFs and why are they useful?\"\n", "question_embedding = get_text_embedding(question)\n", "print(f\"Embedding dimensions: {len(question_embedding)}\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "YAOLxbMFrxfh", "outputId": "cbc1a7c6-a8c8-4acb-b168-600d13c97090" }, "execution_count": 4, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Embedding dimensions: 768\n" ] } ] }, { "cell_type": "markdown", "source": [ "## Find document context\n", "\n", "All the documents you have processed have been indexed into your Vector Search index.\n", "You can query for the closest embeddings to a given embedding from your Vector Search index endpoint.\n", "\n", "> 💡 If you haven't processed any documents yet, you won't get any results." 
], "metadata": { "id": "vnJfXPXAb-1Y" } }, { "cell_type": "code", "source": [ "from google.cloud import aiplatform\n", "from itertools import groupby\n", "\n", "aiplatform.init(project=project_id, location=location)\n", "\n", "def find_document(question: str, index_endpoint_id: str, deployed_index_id: str) -> tuple[str, int]:\n", " # Get embeddings for the question.\n", " embedding = get_text_embedding(question)\n", "\n", " # Find the closest point from the Vector Search index endpoint.\n", " endpoint = aiplatform.MatchingEngineIndexEndpoint(index_endpoint_id)\n", " point = endpoint.find_neighbors(\n", " deployed_index_id=deployed_index_id,\n", " queries=[embedding],\n", " num_neighbors=1,\n", " )[0][0]\n", "\n", " # Get the document name and page number from the point ID.\n", " (filename, page_number) = point.id.split(':', 1)\n", " return (filename, int(page_number))\n", "\n", "# Query the Vector Search index for the most relevant page.\n", "(filename, page_number) = find_document(question, index_endpoint_id, deployed_index_id)\n", "print(f\"{filename=} {page_number=}\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "YxLfbjSLeaIh", "outputId": "34e9f917-163a-4613-db97-3fccb10f8869" }, "execution_count": 5, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "filename='arxiv/cmp-lg/pdf/9410/9410009v1.pdf' page_number=3\n" ] } ] }, { "cell_type": "markdown", "source": [ "## Get document text\n", "\n", "When documents were processed, their text was stored in Firestore as well.\n", "The Vector Search query returned the relevant documents with their page numbers.\n", "With this you can download the document's pages and give only the most relevant page to the model." ], "metadata": { "id": "BzRC13xdeK5m" } }, { "cell_type": "code", "source": [ "from google.cloud import firestore\n", "\n", "def get_document_text(filename: str, page_number: int) -> str:\n", " db = firestore.Client(database=\"knowledge-base-database\")\n", " doc = db.collection(\"documents\").document(filename.replace('/', '-'))\n", " return doc.get().get('pages')[page_number]\n", "\n", "# Download the document's page text from Firestore.\n", "context = get_document_text(filename, page_number)\n", "print(f\"{context[:1000]}\\n...\\n...\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "nTJqJg1dfRY5", "outputId": "1f3d7c8a-9545-4c74-9c47-7a4022d4baaf" }, "execution_count": 6, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "EN SEM IND\n", "FR SEM IND\n", "VAR\n", "REST {Magn( 1 )}\n", "VAR\n", "REST {Magn(\n", "The interlingual status of the lexical function is\n", "self-evident. Any occurrence of Magn will be left\n", "intact during transfer and it will be the generation\n", "component that ultimately assigns a monolingual\n", "lexical entry to the LF.6\n", "3.2 Problems\n", "Lexical Functions abstract away from certain nu-\n", "ances in meaning and from different syntactic re-\n", "alizations. We discuss some of the problems raised\n", "by this abstraction in this section.\n", "Overgenerality An important problem stems\n", "from the interpretation of LFs implied by their\n", "use as an interlingua namely that the mean-\n", "ing of the collocate in some ways reduces to the\n", "meaning implied by the lexical function. This in-\n", "terpretation is trouble-free if we assume that LFs\n", "always deliver unique values; unfortunately cases\n", "to the contrary can be readily observed. 
{ "cell_type": "markdown", "source": [ "## Ask a foundation model\n", "\n", "With the relevant context ready, you can now make a _prompt_ that includes both the context and the question.\n", "\n", "Here's Gemini's response.\n", "Note that Gemini responds in [Markdown](https://www.markdownguide.org)." ], "metadata": { "id": "5NB2BO0tSBFu" } },
{ "cell_type": "code", "source": [ "from IPython.display import Markdown, display\n", "from google.genai.types import GenerateContentConfig\n", "\n", "def ask_model(question: str) -> None:\n", "    # Retrieve the most relevant page and use it to ground the model.\n", "    (filename, page_number) = find_document(question, index_endpoint_id, deployed_index_id)\n", "    context = get_document_text(filename, page_number)\n", "    response = genai_client.models.generate_content(\n", "        model=\"gemini-2.0-flash\",\n", "        contents=question,\n", "        config=GenerateContentConfig(\n", "            system_instruction=[\n", "                \"Answer the question based on the following text:\",\n", "                context,\n", "            ],\n", "        ),\n", "    )\n", "    print(question)\n", "    display(Markdown(response.text))\n", "\n", "ask_model(\"What are LFs and why are they useful?\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 138 }, "id": "0KJSHiAOu_z6", "outputId": "d07464c0-2253-4bb9-d80f-e1c60a9932bf" }, "execution_count": 7, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "What are LFs and why are they useful?\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ "<IPython.core.display.Markdown object>" ], "text/markdown": "Based on the provided text, LFs (Lexical Functions) are used as an interlingua in machine translation. Their usefulness stems from their ability to:\n\n1. **Abstract Away Nuances:** They abstract away from certain nuances in meaning and from different syntactic realizations.\n2. **Interlingual Representation:** The interlingual status of the lexical function is self-evident, and any occurrence of Magn will be left intact during transfer and it will be the generation component that ultimately assigns a monolingual lexical entry to the LF.\n" }, "metadata": {} } ] },
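{ "cell_type": "markdown", "source": [ "Because retrieval identifies exactly which page grounded the answer, you can cite the source alongside the response.\n", "The next cell is a small variation on `ask_model` that also prints the source; the `ask_model_with_source` name is introduced here for illustration, and it reuses the helpers defined above." ], "metadata": {} },
{ "cell_type": "code", "source": [ "def ask_model_with_source(question: str) -> None:\n", "    # Same flow as ask_model, but also reports which page grounded the answer.\n", "    (filename, page_number) = find_document(question, index_endpoint_id, deployed_index_id)\n", "    context = get_document_text(filename, page_number)\n", "    response = genai_client.models.generate_content(\n", "        model=\"gemini-2.0-flash\",\n", "        contents=question,\n", "        config=GenerateContentConfig(\n", "            system_instruction=[\n", "                \"Answer the question based on the following text:\",\n", "                context,\n", "            ],\n", "        ),\n", "    )\n", "    print(question)\n", "    display(Markdown(response.text))\n", "    print(f\"Source: {filename}, page {page_number}\")\n", "\n", "ask_model_with_source(\"What are LFs and why are they useful?\")" ], "metadata": {}, "execution_count": null, "outputs": [] },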
], "metadata": { "id": "OwXchBOF_rjW" } }, { "cell_type": "code", "source": [ "tuning_job_id = \"\" # @param {type:\"string\"}\n", "\n", "assert tuning_job_id, \"Please set the tuning_job_id\"" ], "metadata": { "cellView": "form", "id": "8dwbv-Fm_D8o" }, "execution_count": 9, "outputs": [] }, { "cell_type": "code", "source": [ "from vertexai.tuning import sft\n", "\n", "tuning_job = sft.SupervisedTuningJob(tuning_job_id)\n", "assert tuning_job.has_ended, \"Please wait until the tuning job finishes.\"\n", "assert tuning_job.tuned_model_endpoint_name\n", "\n", "tuned_model_endpoint = tuning_job.tuned_model_endpoint_name\n", "print(f\"{tuned_model_endpoint=}\")\n", "# The tuned model endpoint follows this format:\n", "# projects/<PROJECT_NUMBER>/locations/<LOCATION>/endpoints/<MODEL_ENDPOINT_ID>" ], "metadata": { "id": "jVJC211J6Imx" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "from vertexai.generative_models import GenerativeModel\n", "\n", "def ask_tuned_model(tuned_model_endpoint: str, question: str) -> None:\n", " (filename, page_number) = find_document(question, index_endpoint_id, deployed_index_id)\n", " context = get_document_text(filename, page_number)\n", " response = genai_client.models.generate_content(\n", " model=tuned_model_endpoint,\n", " contents=[f\"Text: {context}\", question],\n", " config=GenerateContentConfig(\n", " system_instruction=[\n", " \"Answer the question based on the following text\",\n", " ],\n", " ),\n", " )\n", " print(question)\n", " display(Markdown(response.text))\n", "\n", "ask_tuned_model(tuned_model_endpoint, \"What are LFs and why are they useful?\")" ], "metadata": { "id": "ZgXqwCDq8wAf" }, "execution_count": null, "outputs": [] } ], "metadata": { "colab": { "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }