tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/notebook.ipynb

{ "cells": [ { "cell_type": "markdown", "id": "b19f6be7", "metadata": { "id": "b19f6be7" }, "source": [ "# Personalized Product Descriptions with Weaviate and Gemini\n", "\n", "Weaviate is an open-source vector database that enables you to build AI-Native applications with Gemini! This notebook has four parts:\n", "1. [Part 1: Connect to Weaviate, Define Schema, and Import Data](#part-1-install-dependencies-and-connect-to-weaviate)\n", "\n", "2. [Part 2: Run Vector Search Queries](#part-2-vector-search)\n", "\n", "3. [Part 3: Generative Feedback Loops](#part-3-generative-feedback-loops)\n", "\n", "4. [Part 4: Personalized Product Descriptions](#part-4-personalization)\n", "\n", "\n", "In this demo, we will show you how to embed your data, run a semantic search, make a generative call to Gemini and store the output in your vector database, and personalize the description based on the user profile. We are using the Google merch products as our dataset and will generate product descriptions by calling the Gemini API.\n", "\n", "# Use Case\n", "\n", "We will be working with an e-commerce dataset containing Google merch. We will load the data into the Weaviate vector database and use the semantic search features to retrieve data. Next, we will generate product descriptions and store them back into the database with a vector embedding for retrieval (aka, generative feedback loops). Lastly, we will create a small knowledge graph with uniquely generated product descriptions for the buyer personas Alice and Bob.\n", "\n", "### Requirements\n", "1. Weaviate vector database\n", "1. Gemini API key\n", "\n", "### Video\n", "**For an awesome walk through of this demo, check out [this](https://youtu.be/WORgeRAAN-4?si=-WvqNkPn8oCmnLGQ&t=1138) presentation from Google Cloud Next!**\n", "\n", "[![From RAG to autonomous apps with Weaviate and Gemini on Google Kubernetes Engine](http://i3.ytimg.com/vi/WORgeRAAN-4/hqdefault.jpg)](https://youtu.be/WORgeRAAN-4?si=-WvqNkPn8oCmnLGQ&t=1138)" ] }, { "cell_type": "markdown", "id": "7Wlb0vCDUK3h", "metadata": { "id": "7Wlb0vCDUK3h" }, "source": [ "## Install Dependencies and Libraries" ] }, { "cell_type": "code", "execution_count": null, "id": "1eQSHZzRx3n6", "metadata": { "id": "1eQSHZzRx3n6" }, "outputs": [], "source": [ "!pip install weaviate-client==4.5.5\n", "!pip install google-generativeai\n", "!pip install requests\n", "!pip install python-dotenv" ] }, { "cell_type": "code", "execution_count": null, "id": "iKmPS8v7s_Xc", "metadata": { "id": "iKmPS8v7s_Xc" }, "outputs": [], "source": [ "import weaviate\n", "import weaviate.classes.config as wvcc\n", "from weaviate.embedded import EmbeddedOptions\n", "import weaviate.classes as wvc\n", "from weaviate.classes.config import Property, DataType, ReferenceProperty\n", "from weaviate.util import generate_uuid5\n", "from weaviate.classes.query import QueryReference\n", "\n", "import os\n", "from dotenv import load_dotenv\n", "import json\n", "import requests\n", "import PIL\n", "import IPython\n", "\n", "from PIL import Image\n", "from io import BytesIO\n", "import google.generativeai as genai\n", "\n", "# Convert image links to PIL object\n", "def url_to_pil(url):\n", " response = requests.get(url)\n", " return Image.open(BytesIO(response.content))" ] }, { "cell_type": "markdown", "id": "cee8989d", "metadata": { "id": "cee8989d" }, "source": [ "## Part 1: Connect to Weaviate, Define Schema, and Import Data" ] }, { "cell_type": "markdown", "id": "t1Uc93joUOAR", "metadata": { "id": "t1Uc93joUOAR" }, "source": [ "### Connect to Weaviate\n", "You will need to deploy Weaviate on Kubernetes. Learn how to install the Weaviate helm chart [here](https://weaviate.io/developers/weaviate/installation/kubernetes)." ] }, { "cell_type": "code", "execution_count": null, "id": "a405d84c", "metadata": {}, "outputs": [], "source": [ "client = weaviate.connect_to_custom(\n", " http_host=WEAVIATE_HTTP_URL, # URL for the Weaviate HTTP endpoint\n", " http_port=\"80\", # Port number for the Weaviate HTTP endpoint\n", " http_secure=False, \n", " grpc_host=WEAVIATE_GRPC_URL, # URL for the Weaviate gRPC endpoint\n", " grpc_port=\"50051\", # Port number for the Weaviate gRPC endpoint\n", " grpc_secure=False, \n", " auth_credentials=weaviate.auth.AuthApiKey(WEAVIATE_AUTH) # Authentication credentials\n", ")" ] }, { "cell_type": "markdown", "id": "1199263a", "metadata": { "id": "1199263a" }, "source": [ "### Choose **only one** installation option\n", "\n", "Pick one of the three options below to run Weaviate" ] }, { "cell_type": "markdown", "id": "11886426", "metadata": { "id": "11886426" }, "source": [ "#### 1. Weaviate Cloud Service\n", "\n", "The first option is the [Weaviate Cloud Service](https://console.weaviate.cloud/), you can connect your notebook to a serverless Weaviate to keep the data persistent in the cloud." ] }, { "cell_type": "code", "execution_count": null, "id": "f984616a", "metadata": { "id": "f984616a" }, "outputs": [], "source": [ "load_dotenv()\n", "\n", "client = weaviate.connect_to_wcs(\n", " cluster_url=os.getenv(WCS_DEMO_URL), # Replace with your WCS URL\n", " auth_credentials=weaviate.auth.AuthApiKey(os.getenv(WCS_DEMO_RO_KEY)), # Replace with your WCS key\n", " headers={\"X-PaLM-Api-Key\": os.getenv(\"PALM-API-KEY\")}, # Replace with your Gemini API key\n", ")\n", "\n", "print(client.is_ready())" ] }, { "cell_type": "markdown", "id": "897684f3", "metadata": { "id": "897684f3" }, "source": [ "#### 2. Weaviate Embedded\n", "\n", "The second option is Weaviate embedded. This runs Weaviate inside your notebook. Ideal for quick experimentation." ] }, { "cell_type": "code", "execution_count": null, "id": "QveBlU9aL7jI", "metadata": { "id": "QveBlU9aL7jI" }, "outputs": [], "source": [ "client = weaviate.WeaviateClient(\n", " embedded_options=EmbeddedOptions(\n", " version=\"1.24.8\",\n", " additional_env_vars={\n", " \"ENABLE_MODULES\": \"text2vec-palm, generative-palm\"\n", " }),\n", " additional_headers={\n", " \"X-PaLM-Api-Key\": 'PALM-API-KEY' # Replace with your Gemini API key\n", " }\n", ")\n", "\n", "client.connect()" ] }, { "cell_type": "markdown", "id": "a1c36425", "metadata": { "id": "a1c36425" }, "source": [ "#### 3. Local (Docker)\n", "\n", "If you like to run Weaviate yourself, you can download the [Docker files](https://weaviate.io/developers/weaviate/installation/docker-compose) and run it locally on your machine or in the cloud. Make sure to include the Google module in the configurator." ] }, { "cell_type": "code", "execution_count": null, "id": "1f042619", "metadata": { "id": "1f042619" }, "outputs": [], "source": [ "client = weaviate.connect_to_local()\n", "\n", "print(client.is_ready())" ] }, { "cell_type": "markdown", "id": "mBahZ-eCrjJD", "metadata": { "id": "mBahZ-eCrjJD" }, "source": [ "### Create schema\n", "The schema tells Weaviate how you want to store your data. We will have two collections: Products and Personas. Each collection has metadata (properties) and specifies the embedding and language model." ] }, { "cell_type": "code", "execution_count": null, "id": "842e00de", "metadata": { "id": "842e00de" }, "outputs": [], "source": [ "# This is optional to empty your database\n", "result = client.collections.delete(\"Products\")\n", "print(result)\n", "result = client.collections.delete(\"Personas\")\n", "print(result)\n", "result = client.collections.delete(\"Personalized\")\n", "print(result)" ] }, { "cell_type": "code", "execution_count": null, "id": "LUmbb63eOGvX", "metadata": { "id": "LUmbb63eOGvX" }, "outputs": [], "source": [ "# Products Collection\n", "if not client.collections.exists(\"Products\"):\n", " collection = client.collections.create(\n", " name=\"Products\",\n", " vectorizer_config=wvcc.Configure.Vectorizer.text2vec_palm\n", " (\n", " project_id=\"project-id\", # Only required if you're using Vertex AI. Replace with your project id\n", " api_endpoint=\"generativelanguage.googleapis.com\",\n", " model_id=\"embedding-gecko-001\" # default model. You can switch to another model if desired\n", " ),\n", " generative_config=wvcc.Configure.Generative.palm(\n", " project_id=\"project-id\", # Only required if you're using Vertex AI. Replace with your project id\n", " api_endpoint=\"generativelanguage.googleapis.com\",\n", " model_id=\"gemini-pro-vision\" # You can switch to another model if desired\n", " ),\n", " properties=[ # properties for the Products collection\n", " Property(name=\"product_id\", data_type=DataType.TEXT),\n", " Property(name=\"title\", data_type=DataType.TEXT),\n", " Property(name=\"category\", data_type=DataType.TEXT),\n", " Property(name=\"link\", data_type=DataType.TEXT),\n", " Property(name=\"description\", data_type=DataType.TEXT),\n", " Property(name=\"brand\", data_type=DataType.TEXT),\n", " Property(name=\"generated_description\", data_type=DataType.TEXT),\n", " ]\n", " )\n", "\n", "# Personas Collection\n", "if not client.collections.exists(\"Personas\"):\n", " collection = client.collections.create(\n", " name=\"Personas\",\n", " vectorizer_config=wvcc.Configure.Vectorizer.text2vec_palm\n", " (\n", " project_id=\"project-id\", # Only required if you're using Vertex AI. Replace with your project id\n", " api_endpoint=\"generativelanguage.googleapis.com\",\n", " model_id=\"embedding-gecko-001\" # default model. You can switch to another model if desired\n", " ),\n", " generative_config=wvcc.Configure.Generative.palm(\n", " project_id=\"project-id\", # Only required if you're using Vertex AI. Replace with your project id\n", " api_endpoint=\"generativelanguage.googleapis.com\",\n", " model_id=\"gemini-pro-vision\" # You can switch to another model if desired\n", " ),\n", " properties=[ # properties for the Personas collection\n", " Property(name=\"name\", data_type=DataType.TEXT),\n", " Property(name=\"description\", data_type=DataType.TEXT),\n", " ]\n", " )" ] }, { "cell_type": "markdown", "id": "v5sYXBkMAZZm", "metadata": { "id": "v5sYXBkMAZZm" }, "source": [ "### Import Objects" ] }, { "cell_type": "code", "execution_count": null, "id": "vo0WckWt_gyq", "metadata": { "id": "vo0WckWt_gyq" }, "outputs": [], "source": [ "# URL to the raw JSON file\n", "url = 'https://raw.githubusercontent.com/bkauf/next-store/main/first_99_objects.json'\n", "response = requests.get(url)\n", "\n", "# Load the entire JSON content\n", "data = json.loads(response.text)" ] }, { "cell_type": "code", "execution_count": null, "id": "-uxOVFZ6_iA7", "metadata": { "id": "-uxOVFZ6_iA7" }, "outputs": [], "source": [ "# Print first object\n", "\n", "data[0]" ] }, { "cell_type": "markdown", "id": "3QhqNKBsvTND", "metadata": { "id": "3QhqNKBsvTND" }, "source": [ "#### Upload to Weaviate\n", "We will use Weaviate's batch import to get the 99 objects into our database" ] }, { "cell_type": "code", "execution_count": null, "id": "UUZ1yJAvuQXT", "metadata": { "id": "UUZ1yJAvuQXT" }, "outputs": [], "source": [ "products = client.collections.get(\"Products\")\n", "\n", "with products.batch.dynamic() as batch:\n", " for item in data:\n", " batch.add_object(\n", " properties={\n", " \"product_id\": item['product_id'],\n", " \"title\": item['title'],\n", " \"category\": item['category'],\n", " \"link\": item['link'],\n", " \"description\": item['description'],\n", " \"brand\": item['brand']\n", " }\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "7XWKsV920vje", "metadata": { "id": "7XWKsV920vje" }, "outputs": [], "source": [ "# count how many objects are in the database\n", "products = client.collections.get(\"Products\")\n", "response = products.aggregate.over_all(total_count=True)\n", "print(response.total_count)" ] }, { "cell_type": "code", "execution_count": null, "id": "P5gC08735iVZ", "metadata": { "id": "P5gC08735iVZ" }, "outputs": [], "source": [ "# print the objects uuid and properties\n", "\n", "for product in products.iterator():\n", " print(product.uuid, product.properties)" ] }, { "cell_type": "markdown", "id": "a8c64f21", "metadata": { "id": "a8c64f21" }, "source": [ "From the printed list above, select one `uuid` and paste it in the below cell.\n", "\n", "Note: If you run the cell below without grabbing a `uuid`, it will result in an error." ] }, { "cell_type": "code", "execution_count": null, "id": "i-OEzKw85-LY", "metadata": { "id": "i-OEzKw85-LY" }, "outputs": [], "source": [ "product = products.query.fetch_object_by_id(\n", " \"87e5a137-d943-4863-90df-7eed6415fd58\", # <== paste a new product UUID here after importing\n", " include_vector=True\n", ")\n", "\n", "print(product.properties[\"title\"], product.vector[\"default\"])" ] }, { "cell_type": "markdown", "id": "1c6202c0", "metadata": { "id": "1c6202c0" }, "source": [ "## Part 2: Vector Search" ] }, { "cell_type": "markdown", "id": "RVKqhfrcyHOb", "metadata": { "id": "RVKqhfrcyHOb" }, "source": [ "### Vector Search\n", "Vector search returns the objects with most similar vectors to that of the query. We will use the `near_text` operator to find objects with the nearest vector to an input text." ] }, { "cell_type": "code", "execution_count": null, "id": "F70gYJJzxYHt", "metadata": { "id": "F70gYJJzxYHt" }, "outputs": [], "source": [ "products = client.collections.get(\"Products\")\n", "\n", "response = products.query.near_text(\n", " query=\"travel mug\",\n", " return_properties=[\"title\", \"description\", \"link\"], # only return these 3 properties\n", " limit=3 # limited to 3 objects\n", ")\n", "\n", "for product in response.objects:\n", " print(json.dumps(product.properties, indent=2))" ] }, { "cell_type": "markdown", "id": "Wm5SR_0nrlju", "metadata": { "id": "Wm5SR_0nrlju" }, "source": [ "### Hybrid Search\n", "[Hybrid search](https://weaviate.io/developers/weaviate/search/hybrid) combines keyword (BM25) and vector search together, giving you the best of both algorithms.\n", "\n", "To use hybrid search in Weaviate, all you have to do is define the `alpha` parameter to determine the weighting.\n", "\n", "`alpha` = 0 --> pure BM25\n", "\n", "`alpha` = 0.5 --> half BM25, half vector search\n", "\n", "`alpha` = 1 --> pure vector search" ] }, { "cell_type": "code", "execution_count": null, "id": "egqvUe2-rpnh", "metadata": { "id": "egqvUe2-rpnh" }, "outputs": [], "source": [ "products = client.collections.get(\"Products\")\n", "\n", "response = products.query.hybrid(\n", " query = \"dishwasher safe container\", # query\n", " alpha = 0.75, # leaning more towards vector search\n", " return_properties=[\"title\", \"description\", \"link\"], # return these 3 properties\n", " limit = 3 # limited to only 3 objects\n", ")\n", "\n", "for product in response.objects:\n", " print(json.dumps(product.properties, indent=2))" ] }, { "cell_type": "markdown", "id": "M7bX-bV5rqnR", "metadata": { "id": "M7bX-bV5rqnR" }, "source": [ "### Autocut\n", "Rather than hard-coding the limit on the number of objects (seen above), we can use [autocut](https://weaviate.io/developers/weaviate/api/graphql/additional-operators#autocut) to cut off the result set. Autocut limits the number of results returned based on significant variations in the result set's metrics, such as vector distance or score.\n", "\n", "\n", "To use autocut, you must specify the `auto_limit` parameter, which will stop returning results after the specified number of variations, or \"jumps,\" is reached.\n", "\n", "We will use the same hybrid search query above but use `auto_limit` rather than `limit`. Notice how there are actually 4 objects retrieved in this case, compared to the 3 objects returned in the previous query." ] }, { "cell_type": "code", "execution_count": null, "id": "JaEOs-mVruBf", "metadata": { "id": "JaEOs-mVruBf" }, "outputs": [], "source": [ "# auto_limit set to 1\n", "\n", "products = client.collections.get(\"Products\")\n", "\n", "response = products.query.hybrid(\n", " query = \"dishwasher safe container\", # query\n", " alpha = 0.75, # leaning more towards vector search\n", " return_properties=[\"title\", \"description\", \"link\"], # return these 3 properties\n", " auto_limit = 1 # autocut after 1 jump\n", ")\n", "\n", "for product in response.objects:\n", " print(json.dumps(product.properties, indent=2))" ] }, { "cell_type": "markdown", "id": "WJG6LXJ3yKYl", "metadata": { "id": "WJG6LXJ3yKYl" }, "source": [ "### Filters\n", "We can narrow down our results by adding a filter to the query.\n", "\n", "We will look for objects where `category` is equal to `drinkware`." ] }, { "cell_type": "code", "execution_count": null, "id": "JnuEwgEG0PVM", "metadata": { "id": "JnuEwgEG0PVM" }, "outputs": [], "source": [ "products = client.collections.get(\"Products\")\n", "\n", "response = products.query.near_text(\n", " query=\"travel cup\",\n", " return_properties=[\"title\", \"description\", \"category\", \"link\"], # returned properties\n", " filters=wvc.query.Filter.by_property(\"category\").equal(\"Drinkware\"), # filter\n", " limit=3, # limit to 3 objects\n", ")\n", "\n", "for product in response.objects:\n", " print(product.properties)\n", " print('===')" ] }, { "cell_type": "markdown", "id": "oU8Uc1FQimaf", "metadata": { "id": "oU8Uc1FQimaf" }, "source": [ "## Part 3: Generative Feedback Loops\n", "\n", "[Generative Feedback Loops](https://weaviate.io/blog/generative-feedback-loops-with-llms) refers to the process of storing the output from the language model back to the database.\n", "\n", "We will generate a description for each product in our database using Gemini and save it to the `generated_description` property in the `Products` collection." ] }, { "cell_type": "markdown", "id": "rCUXn9q0rDxf", "metadata": { "id": "rCUXn9q0rDxf" }, "source": [ "### Connect and configure Gemini model" ] }, { "cell_type": "code", "execution_count": null, "id": "JajEq9ihp6ed", "metadata": { "id": "JajEq9ihp6ed" }, "outputs": [], "source": [ "genai.configure(api_key='gemini-api-key') # gemini api key\n", "\n", "# Multimodal model\n", "model_pro_vision = genai.GenerativeModel(model_name='gemini-pro-vision') # multi-modal model (text and image)\n", "\n", "# LLM\n", "model_pro = genai.GenerativeModel(model_name='gemini-pro') # text only model" ] }, { "cell_type": "markdown", "id": "4h7iw16Viny1", "metadata": { "id": "4h7iw16Viny1" }, "source": [ "### Generate a description and store it in the `Products` collection\n", "\n", "Steps for the below cell:\n", "1. Run a vector search query to find travel jackets\n", " 1. Learn more about autocut (`auto_limit`) [here](https://weaviate.io/developers/weaviate/api/graphql/additional-operators#autocut).\n", "\n", "2. Grab the returned objects, prompt Gemini with the task and image, store the description in the `generated_description` property" ] }, { "cell_type": "code", "execution_count": null, "id": "Qjz4iYtbP5Ni", "metadata": { "id": "Qjz4iYtbP5Ni" }, "outputs": [], "source": [ "response = products.query.near_text( # first find travel jackets\n", " query=\"travel jacket\",\n", " return_properties=[\"title\", \"description\", \"category\", \"link\"],\n", " auto_limit=1, # limit it to 1 close group\n", ")\n", "\n", "for product in response.objects:\n", " if \"link\" in product.properties:\n", " id = product.uuid\n", " img_url = product.properties[\"link\"]\n", "\n", " pil_image = url_to_pil(img_url) # convert image to PIL object\n", " generated_description = model_pro_vision.generate_content([\"Write a short Facebook ad about this product photo.\", pil_image]) # prompt to Gemini\n", " generated_description = generated_description.text\n", " print(img_url)\n", " print(generated_description)\n", " print('===')\n", "\n", " # Update the Product collection with the generated description\n", " products.data.update(uuid=id, properties={\"generated_description\": generated_description})" ] }, { "cell_type": "markdown", "id": "7LjeX-2_vVb5", "metadata": { "id": "7LjeX-2_vVb5" }, "source": [ "### Vector Search on the `generated_description` property\n", "\n", "Since the product description was saved in our `Products` collection, we can run a vector search query on it." ] }, { "cell_type": "code", "execution_count": null, "id": "suWRGv6Zu6g2", "metadata": { "id": "suWRGv6Zu6g2" }, "outputs": [], "source": [ "products = client.collections.get(\"Products\")\n", "\n", "response = products.query.near_text(\n", " query=\"travel jacket\",\n", " return_properties=[\"generated_description\", \"description\", \"title\"],\n", " limit=1\n", " )\n", "\n", "for o in response.objects:\n", " print(o.uuid)\n", " print(json.dumps(o.properties, indent=2))" ] }, { "cell_type": "markdown", "id": "Qs5hE2k0OuOh", "metadata": { "id": "Qs5hE2k0OuOh" }, "source": [ "## Part 4: Personalization\n", "\n", "So far, we've generated product descriptions using Gemini's multi-modal model. In Part 4, we will generate product descriptions tailored to the persona.\n", "\n", "We will use [cross-references](https://weaviate.io/developers/weaviate/manage-data/cross-references) to establish directional relationships between collections." ] }, { "cell_type": "code", "execution_count": null, "id": "ba56WznY53C1", "metadata": { "id": "ba56WznY53C1" }, "outputs": [], "source": [ "# Personalized Collection\n", "\n", "if not client.collections.exists(\"Personalized\"):\n", " collection = client.collections.create(\n", " name=\"Personalized\",\n", " vectorizer_config=wvcc.Configure.Vectorizer.text2vec_palm\n", " (\n", " project_id=\"project-id\", # Only required if you're using Vertex AI. Replace with your project id\n", " api_endpoint=\"generativelanguage.googleapis.com\",\n", " model_id=\"embedding-gecko-001\" # default model. You can switch to another model if desired\n", " ),\n", " generative_config=wvcc.Configure.Generative.palm(\n", " project_id=\"project-id\", # Only required if you're using Vertex AI. Replace with your project id\n", " api_endpoint=\"generativelanguage.googleapis.com\",\n", " model_id=\"gemini-pro-vision\" # You can switch to another model if desired\n", " ),\n", " properties=[\n", " Property(name=\"description\", data_type=DataType.TEXT),\n", " ],\n", " # cross-references\n", " references=[\n", " ReferenceProperty(\n", " name=\"ofProduct\",\n", " target_collection=\"Products\" # connect personalized to the products collection\n", " ),\n", " ReferenceProperty(\n", " name=\"ofPersona\",\n", " target_collection=\"Personas\" # connect personalized to the personas collection\n", " )\n", " ]\n", ")" ] }, { "cell_type": "markdown", "id": "a44952d7", "metadata": { "id": "a44952d7" }, "source": [ "### Create two personas (Alice and Bob)" ] }, { "cell_type": "code", "execution_count": null, "id": "j1RlEbYpOw_5", "metadata": { "id": "j1RlEbYpOw_5" }, "outputs": [], "source": [ "personas = client.collections.get(\"Personas\")\n", "\n", "for persona in ['Alice', 'Bob']:\n", " generated_description = model_pro.generate_content([\"Create a fictional buyer persona named \" + persona + \", write a short description about them\"]) # use gemini-pro to generate persona description\n", " uuid = personas.data.insert({\n", " \"name\": persona,\n", " \"description\": generated_description.text\n", " })\n", " print(uuid)\n", " print(generated_description.text)\n", " print(\"===\")" ] }, { "cell_type": "code", "execution_count": null, "id": "B0oKIH5vQhhw", "metadata": { "id": "B0oKIH5vQhhw" }, "outputs": [], "source": [ "# print objects in the Personas collection\n", "\n", "personas = client.collections.get(\"Personas\")\n", "\n", "for persona in personas.iterator():\n", " print(persona.uuid, persona.properties)" ] }, { "cell_type": "markdown", "id": "4442bbc4", "metadata": { "id": "4442bbc4" }, "source": [ "### Generate a product description tailored to the persona\n", "\n", "Grab the product uuid from Part 1 and paste it below" ] }, { "cell_type": "code", "execution_count": null, "id": "UHi0V1MX2uNO", "metadata": { "id": "UHi0V1MX2uNO" }, "outputs": [], "source": [ "personalized = client.collections.get(\"Personalized\")\n", "\n", "product = products.query.fetch_object_by_id(\"87e5a137-d943-4863-90df-7eed6415fd58\") # <== paste a new product UUID here after importing\n", "print(product.properties['link'])\n", "print('===')\n", "\n", "personas = client.collections.get(\"Personas\")\n", "\n", "for persona in personas.iterator():\n", " generated_description = model_pro.generate_content([\"Create a product description tailored to the following person, make sure to use the name (\", persona.properties[\"name\"],\") of the persona.\\n\\n\", \"# Product Description\\n\", product.properties[\"description\"], \"# Persona\", persona.properties[\"description\"]]) # generate a description tailored to the persona\n", " print(generated_description.text)\n", " # Add the personalized description to the `description` property in the Personalized collection\n", " new_uuid = personalized.data.insert(\n", " properties={\n", " \"description\": generated_description.text },\n", " references={\n", " \"ofProduct\": product.uuid, # add cross-reference to the Product collection\n", " \"ofPersona\": persona.uuid # add cross-reference to the Persona collection\n", " },\n", " )\n", " print(\"New UUID\", new_uuid)\n", " print('===')" ] }, { "cell_type": "markdown", "id": "f3f56984", "metadata": { "id": "f3f56984" }, "source": [ "### Fetch the objects in the `Personalized` collection" ] }, { "cell_type": "code", "execution_count": null, "id": "gxZ1W_cK4_dw", "metadata": { "id": "gxZ1W_cK4_dw" }, "outputs": [], "source": [ "personalized = client.collections.get(\"Personalized\")\n", "\n", "response = personalized.query.fetch_objects(\n", " limit=2,\n", " include_vector=True,\n", " return_references=[QueryReference(\n", " link_on=\"ofProduct\", # return the title property from the Product collection\n", " return_properties=[\"title\"]\n", " ),\n", " QueryReference(\n", " link_on=\"ofPersona\",\n", " return_properties=[\"name\"] # return the name property from the Persona collection\n", " )\n", " ]\n", ")\n", "\n", "for item in response.objects:\n", " print(item.properties)\n", " for ref_obj in item.references[\"ofProduct\"].objects:\n", " print(ref_obj.properties)\n", " for ref_obj in item.references[\"ofPersona\"].objects:\n", " print(ref_obj.properties)\n", " print(item.vector[\"default\"])\n", " print(\"===\")" ] } ], "metadata": { "colab": { "collapsed_sections": [ "7Wlb0vCDUK3h", "t1Uc93joUOAR" ], "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.6" } }, "nbformat": 4, "nbformat_minor": 5 }

tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/notebook.ipynb (969 lines of code) (raw):