colab-enterprise/Campaign-Assets-Hyper-Personalized-Email.ipynb (2,552 lines of code) (raw):
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### <font color='#4285f4'>Overview</font>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook empowers marketers to create hyper-personalized emails with unique marketing text and images generated for each customer. By analyzing current events in Paris and leveraging Gemini's advanced capabilities, we identify relevant events and tailor messages to customers' interests, even translating content and providing transparent explanations for all AI-generated elements.\n",
"\n",
"Process Flow:\n",
"1. Download current events for the next week\n",
"2. Extract up to 7 keywords from the events using Gemini\n",
"3. Create vector embeddings of the keywords\n",
"4. Ask Gemini for the top 3 public events in Paris\n",
"5. Match the top 3 events (vector search) with customers who have similar interests\n",
"6. Generate a marketing message based upon the customer's profile. Determine their top 3 menu items and craft a message that hints at the event taking place.\n",
"7. Pass the marketing message and ask Gemini to create a prompt for Imagen3\n",
" * a. Gemini with author our Imagen3 prompt for us.\n",
"8. Generate the image based upon the image prompt Gemini created\n",
"9. Pass the marketing message, user profile and generated image to Gemini Vision.\n",
" * a. Ask Gemini to verify that the image satisfies the original request.\n",
"10. Ask Gemini to translate the marketing text into another language.\n",
" * a. Ask Gemini to verify the translation to make sure it did what we asked\n",
"11. Create the HTML email message.\n",
"\n",
"Notes:\n",
"* We could also select menu items that also match the customer's food allergies\n",
"* Since we are download event data, we could partner with the events\n",
"\n",
"Cost:\n",
"* Very Small: A few calls to Gemini, Imagen3 and BigQuery\n",
"* Medium: Remember to stop your Colab Enterprise Notebook Runtime\n",
"\n",
"Author: \n",
"* Adam Paternostro"
]
},
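{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next cell is a minimal sketch of the process flow above, expressed as plain Python so the orchestration is easy to follow. The function names in it (download_events, extract_keywords, etc.) are illustrative placeholders, not helpers defined in this notebook; the real implementations are in the sections that follow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch only: the function names below are placeholders (not defined in this notebook).\n",
"# The actual implementation of each step lives in the sections further down.\n",
"def hyper_personalized_email_pipeline():\n",
"    events = download_events()                                 # 1. Download the next week of Paris events\n",
"    events = extract_keywords(events)                          # 2. Gemini extracts up to 7 keywords per event\n",
"    events = embed_keywords(events)                            # 3. Create vector embeddings of the keywords\n",
"    top_events = pick_top_3_public_events(events)              # 4. Gemini selects the top 3 public events\n",
"    customers = vector_search_customers(top_events)            # 5. Vector search for customers with similar interests\n",
"    for customer in customers:\n",
"        message = generate_marketing_message(customer)         # 6. Personalized marketing text (top 3 menu items + event hint)\n",
"        image_prompt = author_image_prompt(message)            # 7. Gemini authors the Imagen3 prompt\n",
"        image = generate_image(image_prompt)                   # 8. Imagen3 generates the image\n",
"        verify_image(message, customer, image)                 # 9. Gemini Vision verifies the image\n",
"        translation = translate_and_verify(message)            # 10. Translate the text and verify the translation\n",
"        create_html_email(customer, message, image, translation)  # 11. Assemble the HTML email"
]
},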
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Architecture Diagram\n",
"from IPython.display import Image\n",
"Image(url='https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Artifacts/Campaign-Assets-Hyper-Personalized-Email-Architecture.png', width=1200)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### <font color='#4285f4'>Video Walkthrough</font>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/Campaign-Assets-Hyper-Personalized-Email.mp4)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"\n",
"HTML(\"\"\"\n",
"<video width=\"800\" height=\"600\" controls>\n",
" <source src=\"https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/Campaign-Assets-Hyper-Personalized-Email.mp4\" type=\"video/mp4\">\n",
" Your browser does not support the video tag.\n",
"</video>\n",
"\"\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HMsUvoF4BP7Y"
},
"source": [
"### <font color='#4285f4'>License</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jQgQkbOvj55d"
},
"source": [
"```\n",
"# Copyright 2024 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License.\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "m65vp54BUFRi"
},
"source": [
"### <font color='#4285f4'>Pip installs</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5MaWM6H5i6rX"
},
"outputs": [],
"source": [
"# PIP Installs\n",
"import sys\n",
"\n",
"# https://serpapi.com/\n",
"!{sys.executable} -m pip install google-search-results"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UmyL-Rg4Dr_f"
},
"source": [
"### <font color='#4285f4'>Initialize</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "q6vploHkW-Vk"
},
"outputs": [],
"source": [
"# You will need to create an account and verify your email.\n",
"# https://serpapi.com/\n",
"# You get 100 free calls (per month)\n",
"\n",
"serpapi_key = \"<<replace with your key>>\"\n",
"\n",
"# date:today - Today's Events\n",
"# date:tomorrow - Tomorrow's Events\n",
"htichips = \"date:today\"\n",
"htichips = \"date:week\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xOYsEVSXp6IP"
},
"outputs": [],
"source": [
"from PIL import Image\n",
"from IPython.display import HTML\n",
"import IPython.display\n",
"import google.auth\n",
"import requests\n",
"import json\n",
"import uuid\n",
"import base64\n",
"import os\n",
"import cv2\n",
"import random\n",
"import time\n",
"import datetime\n",
"import base64\n",
"import random\n",
"\n",
"import logging\n",
"from tenacity import retry, wait_exponential, stop_after_attempt, before_sleep_log, retry_if_exception"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "wMlHl3bnkFPZ"
},
"outputs": [],
"source": [
"# Set these (run this cell to verify the output)\n",
"\n",
"bigquery_location = \"${bigquery_location}\"\n",
"region = \"${region}\"\n",
"location = \"${location}\"\n",
"storage_account = \"${chocolate_ai_bucket}\"\n",
"\n",
"# Get the current date and time\n",
"now = datetime.datetime.now()\n",
"\n",
"# Format the date and time as desired\n",
"formatted_date = now.strftime(\"%Y-%m-%d-%H-%M\")\n",
"\n",
"gemini_languages = [\"Arabic (ar)\", \"Bengali (bn)\", \"Bulgarian (bg)\", \"Chinese simplified (zh)\", \"Chinese traditional (zh)\",\n",
" \"Croatian (hr)\", \"Czech (cs)\", \"Danish (da)\", \"Dutch (nl)\", \"Estonian (et)\", \"Finnish (fi)\", \"French (fr)\",\n",
" \"German (de)\", \"Greek (el)\", \"Hebrew (iw)\", \"Hindi (hi)\", \"Hungarian (hu)\", \"Indonesian (id)\", \"Italian (it)\", \"Japanese (ja)\",\n",
" \"Korean (ko)\", \"Latvian (lv)\", \"Lithuanian (lt)\", \"Norwegian (no)\", \"Polish (pl)\", \"Portuguese (pt)\", \"Romanian (ro)\", \"Russian (ru)\",\n",
" \"Serbian (sr)\", \"Slovak (sk)\", \"Slovenian (sl)\", \"Spanish (es)\", \"Swahili (sw)\", \"Swedish (sv)\", \"Thai (th)\", \"Turkish (tr)\", \"Ukrainian (uk)\",\n",
" \"Vietnamese (vi)\"]\n",
"\n",
"# Get some values using gcloud\n",
"project_id = !(gcloud config get-value project)\n",
"user = !(gcloud auth list --filter=status:ACTIVE --format=\"value(account)\")\n",
"\n",
"if len(project_id) != 1:\n",
" raise RuntimeError(f\"project_id is not set: {project_id}\")\n",
"project_id = project_id[0]\n",
"\n",
"if len(user) != 1:\n",
" raise RuntimeError(f\"user is not set: {user}\")\n",
"user = user[0]\n",
"\n",
"print(f\"project_id = {project_id}\")\n",
"print(f\"user = {user}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sZ6m_wGrK0YG"
},
"source": [
"### <font color='#4285f4'>Helper Methods</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JbOjdSP1kN9T"
},
"source": [
"#### restAPIHelper\n",
"Calls the Google Cloud REST API using the current users credentials."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "40wlwnY4kM11"
},
"outputs": [],
"source": [
"def restAPIHelper(url: str, http_verb: str, request_body: str) -> str:\n",
" \"\"\"Calls the Google Cloud REST API passing in the current users credentials\"\"\"\n",
"\n",
" import requests\n",
" import google.auth\n",
" import json\n",
"\n",
" # Get an access token based upon the current user\n",
" creds, project = google.auth.default()\n",
" auth_req = google.auth.transport.requests.Request()\n",
" creds.refresh(auth_req)\n",
" access_token=creds.token\n",
"\n",
" headers = {\n",
" \"Content-Type\" : \"application/json\",\n",
" \"Authorization\" : \"Bearer \" + access_token\n",
" }\n",
"\n",
" if http_verb == \"GET\":\n",
" response = requests.get(url, headers=headers)\n",
" elif http_verb == \"POST\":\n",
" response = requests.post(url, json=request_body, headers=headers)\n",
" elif http_verb == \"PUT\":\n",
" response = requests.put(url, json=request_body, headers=headers)\n",
" elif http_verb == \"PATCH\":\n",
" response = requests.patch(url, json=request_body, headers=headers)\n",
" elif http_verb == \"DELETE\":\n",
" response = requests.delete(url, headers=headers)\n",
" else:\n",
" raise RuntimeError(f\"Unknown HTTP verb: {http_verb}\")\n",
"\n",
" if response.status_code == 200:\n",
" return json.loads(response.content)\n",
" #image_data = json.loads(response.content)[\"predictions\"][0][\"bytesBase64Encoded\"]\n",
" else:\n",
" error = f\"Error restAPIHelper -> ' Status: '{response.status_code}' Text: '{response.text}'\"\n",
" raise RuntimeError(error)"
]
},
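{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional, illustrative usage of restAPIHelper (not required by the rest of the notebook):\n",
"# list the BigQuery datasets in the current project via the BigQuery REST API.\n",
"example_url = f\"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/datasets\"\n",
"example_response = restAPIHelper(example_url, \"GET\", None)\n",
"print(json.dumps(example_response, indent=2))"
]
},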
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### RetryCondition (for retrying LLM calls)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def RetryCondition(error):\n",
" error_string = str(error)\n",
" print(error_string)\n",
"\n",
" retry_errors = [\n",
" \"RESOURCE_EXHAUSTED\",\n",
" \"No content in candidate\",\n",
" # Add more error messages here as needed\n",
" ]\n",
"\n",
" for retry_error in retry_errors:\n",
" if retry_error in error_string:\n",
" print(\"Retrying...\")\n",
" return True\n",
"\n",
" return False"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3Q7UAkst0phu"
},
"source": [
"#### Imagen3 Image Generation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OjHSz0k20sgf"
},
"outputs": [],
"source": [
"def ImageGen(prompt):\n",
" creds, project = google.auth.default()\n",
" auth_req = google.auth.transport.requests.Request()\n",
" creds.refresh(auth_req)\n",
" access_token=creds.token\n",
"\n",
" headers = {\n",
" \"Content-Type\" : \"application/json\",\n",
" \"Authorization\" : \"Bearer \" + access_token\n",
" }\n",
"\n",
" model_version = \"imagen-3.0-generate-001\" # imagen-3.0-fast-generate-001\n",
" #model_version = \"imagen-3.0-generate-preview-0611\" # Preview Access Model\n",
"\n",
" # https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/image-generation\n",
" # url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/publishers/google/models/imagegeneration:predict\"\n",
" url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/publishers/google/models/{model_version}:predict\"\n",
"\n",
" payload = {\n",
" \"instances\": [\n",
" {\n",
" \"prompt\": prompt\n",
" }\n",
" ],\n",
" \"parameters\": {\n",
" \"sampleCount\": 1,\n",
" \"personGeneration\" : \"dont_allow\" # change to allow_adult for people generation\n",
" }\n",
" }\n",
"\n",
" response = requests.post(url, json=payload, headers=headers)\n",
"\n",
" if response.status_code == 200:\n",
" response_json = json.loads(response.content)\n",
" print(f\"Imagen3 response_json: {response_json}\")\n",
"\n",
" if \"blocked\" in response_json:\n",
" print(f\"Blocked: {response_json['blocked']}\")\n",
"\n",
" if \"predictions\" in response_json:\n",
" image_data = response_json[\"predictions\"][0][\"bytesBase64Encoded\"]\n",
" image_data = base64.b64decode(image_data)\n",
" filename= str(uuid.uuid4()) + \".png\"\n",
" with open(filename, \"wb\") as f:\n",
" f.write(image_data)\n",
" print(f\"Image generated OK.\")\n",
" return filename\n",
" else:\n",
" raise RuntimeError(f\"No predictions in response: {response.content}\")\n",
" else:\n",
" error = f\"Error with prompt:'{prompt}' Status:'{response.status_code}' Text:'{response.text}'\"\n",
" raise RuntimeError(error)"
]
},
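{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional, illustrative usage of ImageGen (the campaign images are generated later in this notebook).\n",
"# Note: this makes one extra Imagen3 call, so skip this cell if you want to avoid the (small) additional cost.\n",
"example_image_filename = ImageGen(\"A photorealistic close-up of artisan chocolate truffles on a marble cafe counter, warm lighting\")\n",
"IPython.display.display(IPython.display.Image(filename=example_image_filename, width=400))"
]
},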
{
"cell_type": "markdown",
"metadata": {
"id": "vOFTk6sj1YIV"
},
"source": [
"#### Gemini LLM (Pro 1.0 , Pro 1.5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xHit3Hh-1ZAW"
},
"outputs": [],
"source": [
"@retry(wait=wait_exponential(multiplier=1, min=1, max=60), stop=stop_after_attempt(10), retry=retry_if_exception(RetryCondition), before_sleep=before_sleep_log(logging.getLogger(), logging.INFO))\n",
"def GeminiLLM(prompt, model = \"gemini-2.0-flash\", response_schema = None,\n",
" temperature = 1, topP = 1, topK = 32):\n",
"\n",
" # https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#supported_models\n",
"\n",
" llm_response = None\n",
" if temperature < 0:\n",
" temperature = 0\n",
"\n",
" creds, project = google.auth.default()\n",
" auth_req = google.auth.transport.requests.Request() # required to acess access token\n",
" creds.refresh(auth_req)\n",
" access_token=creds.token\n",
"\n",
" headers = {\n",
" \"Content-Type\" : \"application/json\",\n",
" \"Authorization\" : \"Bearer \" + access_token\n",
" }\n",
"\n",
" # https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference\n",
" url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/publishers/google/models/{model}:generateContent\"\n",
"\n",
" generation_config = {\n",
" \"temperature\": temperature,\n",
" \"topP\": topP,\n",
" \"maxOutputTokens\": 8192,\n",
" \"candidateCount\": 1,\n",
" \"responseMimeType\": \"application/json\",\n",
" }\n",
"\n",
" # Add inthe response schema for when it is provided\n",
" if response_schema is not None:\n",
" generation_config[\"responseSchema\"] = response_schema\n",
"\n",
" if model == \"gemini-2.0-flash\":\n",
" generation_config[\"topK\"] = topK\n",
"\n",
" payload = {\n",
" \"contents\": {\n",
" \"role\": \"user\",\n",
" \"parts\": {\n",
" \"text\": prompt\n",
" },\n",
" },\n",
" \"generation_config\": {\n",
" **generation_config\n",
" },\n",
" \"safety_settings\": {\n",
" \"category\": \"HARM_CATEGORY_SEXUALLY_EXPLICIT\",\n",
" \"threshold\": \"BLOCK_LOW_AND_ABOVE\"\n",
" }\n",
" }\n",
"\n",
" response = requests.post(url, json=payload, headers=headers)\n",
"\n",
" if response.status_code == 200:\n",
" try:\n",
" json_response = json.loads(response.content)\n",
" except Exception as error:\n",
" raise RuntimeError(f\"An error occurred parsing the JSON: {error}\")\n",
"\n",
" if \"candidates\" in json_response:\n",
" candidates = json_response[\"candidates\"]\n",
" if len(candidates) > 0:\n",
" candidate = candidates[0]\n",
" if \"content\" in candidate:\n",
" content = candidate[\"content\"]\n",
" if \"parts\" in content:\n",
" parts = content[\"parts\"]\n",
" if len(parts):\n",
" part = parts[0]\n",
" if \"text\" in part:\n",
" text = part[\"text\"]\n",
" llm_response = text\n",
" else:\n",
" raise RuntimeError(\"No text in part: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No parts in content: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No parts in content: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No content in candidate: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No candidates: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No candidates: {response.content}\")\n",
"\n",
" # Remove some typically response characters (if asking for a JSON reply)\n",
" llm_response = llm_response.replace(\"```json\",\"\")\n",
" llm_response = llm_response.replace(\"```\",\"\")\n",
" llm_response = llm_response.replace(\"\\n\",\"\")\n",
"\n",
" return llm_response\n",
"\n",
" else:\n",
" raise RuntimeError(f\"Error with prompt:'{prompt}' Status:'{response.status_code}' Text:'{response.text}'\")"
]
},
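{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional, illustrative usage of GeminiLLM with a response schema (the same pattern is used throughout this notebook).\n",
"# The schema and prompt below are just examples; Gemini is asked to return a JSON object with a single \"answer\" field.\n",
"example_response_schema = {\n",
"    \"type\": \"object\",\n",
"    \"required\": [\"answer\"],\n",
"    \"properties\": { \"answer\": { \"type\": \"string\" } }\n",
"}\n",
"\n",
"example_llm_response = GeminiLLM(\"In one short sentence, why do chocolate and coffee pair so well together?\",\n",
"                                 response_schema=example_response_schema)\n",
"print(json.loads(example_llm_response)[\"answer\"])"
]
},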
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rWXSCd5VCPjf"
},
"outputs": [],
"source": [
"@retry(wait=wait_exponential(multiplier=1, min=1, max=60), stop=stop_after_attempt(10), retry=retry_if_exception(RetryCondition), before_sleep=before_sleep_log(logging.getLogger(), logging.INFO))\n",
"def GeminiLLM_VerifyImage(prompt, imageBase64, model = \"gemini-2.0-flash\", response_schema = None,\n",
" temperature = 1, topP = 1, topK = 32):\n",
"\n",
" # https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#supported_models\n",
" # model = \"gemini-2.0-flash\"\n",
"\n",
" llm_response = None\n",
" if temperature < 0:\n",
" temperature = 0\n",
"\n",
" creds, project = google.auth.default()\n",
" auth_req = google.auth.transport.requests.Request() # required to acess access token\n",
" creds.refresh(auth_req)\n",
" access_token=creds.token\n",
"\n",
" headers = {\n",
" \"Content-Type\" : \"application/json\",\n",
" \"Authorization\" : \"Bearer \" + access_token\n",
" }\n",
"\n",
" # https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference\n",
" url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/publishers/google/models/{model}:generateContent\"\n",
"\n",
" generation_config = {\n",
" \"temperature\": temperature,\n",
" \"topP\": topP,\n",
" \"maxOutputTokens\": 8192,\n",
" \"candidateCount\": 1,\n",
" \"responseMimeType\": \"application/json\",\n",
" }\n",
"\n",
" # Add inthe response schema for when it is provided\n",
" if response_schema is not None:\n",
" generation_config[\"responseSchema\"] = response_schema\n",
"\n",
" if model == \"gemini-2.0-flash\" or model == \"gemini-1.0-pro\":\n",
" generation_config[\"topK\"] = topK\n",
"\n",
" payload = {\n",
" \"contents\": {\n",
" \"role\": \"user\",\n",
" \"parts\": [\n",
" { \"text\": prompt },\n",
" { \"inlineData\": { \"mimeType\": \"image/png\", \"data\": f\"{imageBase64}\" } }\n",
" ]\n",
" },\n",
" \"generation_config\": {\n",
" **generation_config\n",
" },\n",
" \"safety_settings\": {\n",
" \"category\": \"HARM_CATEGORY_SEXUALLY_EXPLICIT\",\n",
" \"threshold\": \"BLOCK_LOW_AND_ABOVE\"\n",
" }\n",
" }\n",
"\n",
" response = requests.post(url, json=payload, headers=headers)\n",
"\n",
" if response.status_code == 200:\n",
" try:\n",
" json_response = json.loads(response.content)\n",
" except Exception as error:\n",
" raise RuntimeError(f\"An error occurred parsing the JSON: {error}\")\n",
"\n",
" if \"candidates\" in json_response:\n",
" candidates = json_response[\"candidates\"]\n",
" if len(candidates) > 0:\n",
" candidate = candidates[0]\n",
" if \"content\" in candidate:\n",
" content = candidate[\"content\"]\n",
" if \"parts\" in content:\n",
" parts = content[\"parts\"]\n",
" if len(parts):\n",
" part = parts[0]\n",
" if \"text\" in part:\n",
" text = part[\"text\"]\n",
" llm_response = text\n",
" else:\n",
" raise RuntimeError(\"No text in part: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No parts in content: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No parts in content: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No content in candidate: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No candidates: {response.content}\")\n",
" else:\n",
" raise RuntimeError(\"No candidates: {response.content}\")\n",
"\n",
" # Remove some typically response characters (if asking for a JSON reply)\n",
" llm_response = llm_response.replace(\"```json\",\"\")\n",
" llm_response = llm_response.replace(\"```\",\"\")\n",
" llm_response = llm_response.replace(\"\\n\",\"\")\n",
"\n",
" return llm_response\n",
"\n",
" else:\n",
" raise RuntimeError(f\"Error with prompt:'{prompt}' Status:'{response.status_code}' Text:'{response.text}'\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bI-KJELZ1jgt"
},
"source": [
"#### Helper Functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "pmnCwYvA1kZv"
},
"outputs": [],
"source": [
"def RunQuery(sql):\n",
" import time\n",
" from google.cloud import bigquery\n",
" client = bigquery.Client()\n",
"\n",
" if (sql.startswith(\"SELECT\") or sql.startswith(\"WITH\")):\n",
" df_result = client.query(sql).to_dataframe()\n",
" return df_result\n",
" else:\n",
" job_config = bigquery.QueryJobConfig(priority=bigquery.QueryPriority.INTERACTIVE)\n",
" query_job = client.query(sql, job_config=job_config)\n",
"\n",
" # Check on the progress by getting the job's updated state.\n",
" query_job = client.get_job(\n",
" query_job.job_id, location=query_job.location\n",
" )\n",
" print(\"Job {} is currently in state {} with error result of {}\".format(query_job.job_id, query_job.state, query_job.error_result))\n",
"\n",
" while query_job.state != \"DONE\":\n",
" time.sleep(2)\n",
" query_job = client.get_job(\n",
" query_job.job_id, location=query_job.location\n",
" )\n",
" print(\"Job {} is currently in state {} with error result of {}\".format(query_job.job_id, query_job.state, query_job.error_result))\n",
"\n",
" if query_job.error_result == None:\n",
" return True\n",
" else:\n",
" raise Exception(query_job.error_result)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_OAO_-LC1k3m"
},
"outputs": [],
"source": [
"def convert_png_to_base64(image_path):\n",
" image = cv2.imread(image_path)\n",
"\n",
" # Convert the image to a base64 string.\n",
" _, buffer = cv2.imencode('.png', image)\n",
" base64_string = base64.b64encode(buffer).decode('utf-8')\n",
"\n",
" return base64_string"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VNAmwvAf1knl"
},
"outputs": [],
"source": [
"def PrettyPrintJson(json_string):\n",
" json_object = json.loads(json_string)\n",
" json_formatted_str = json.dumps(json_object, indent=2)\n",
" return json_formatted_str"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "IBH7sIg3XlGv"
},
"outputs": [],
"source": [
"def GetNextPrimaryKey(fully_qualified_table_name, field_name):\n",
" from google.cloud import bigquery\n",
" client = bigquery.Client()\n",
" sql = f\"\"\"\n",
" SELECT IFNULL(MAX({field_name}),0) AS result\n",
" FROM `{fully_qualified_table_name}`\n",
" \"\"\"\n",
" # print(sql)\n",
" df_result = client.query(sql).to_dataframe()\n",
" # display(df_result)\n",
" return df_result['result'].iloc[0] + 1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2apWaWdnTxnz"
},
"outputs": [],
"source": [
"def haveEventsBeenDownloaded(fully_qualified_table_name, field_name):\n",
" from google.cloud import bigquery\n",
" client = bigquery.Client()\n",
" sql = f\"\"\"\n",
" SELECT IFNULL(COUNT({field_name}),0) AS result\n",
" FROM `{fully_qualified_table_name}`\n",
" WHERE download_date = CURRENT_DATE()\n",
" \"\"\"\n",
" # print(sql)\n",
" df_result = client.query(sql).to_dataframe()\n",
" # display(df_result)\n",
" if df_result['result'].iloc[0] > 0:\n",
" return True\n",
" else:\n",
" return False"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "BljTy6yYaIp5"
},
"outputs": [],
"source": [
"# This was generated by GenAI\n",
"\n",
"def copy_file_to_gcs(local_file_path, bucket_name, destination_blob_name):\n",
" \"\"\"Copies a file from a local drive to a GCS bucket.\n",
"\n",
" Args:\n",
" local_file_path: The full path to the local file.\n",
" bucket_name: The name of the GCS bucket to upload to.\n",
" destination_blob_name: The desired name of the uploaded file in the bucket.\n",
"\n",
" Returns:\n",
" None\n",
" \"\"\"\n",
"\n",
" import os\n",
" from google.cloud import storage\n",
"\n",
" # Ensure the file exists locally\n",
" if not os.path.exists(local_file_path):\n",
" raise FileNotFoundError(f\"Local file '{local_file_path}' not found.\")\n",
"\n",
" # Create a storage client\n",
" storage_client = storage.Client()\n",
"\n",
" # Get a reference to the bucket\n",
" bucket = storage_client.bucket(bucket_name)\n",
"\n",
" # Create a blob object with the desired destination path\n",
" blob = bucket.blob(destination_blob_name)\n",
"\n",
" # Upload the file from the local filesystem\n",
" content_type = \"\"\n",
" if local_file_path.endswith(\".html\"):\n",
" content_type = \"text/html; charset=utf-8\"\n",
"\n",
" if local_file_path.endswith(\".json\"):\n",
" content_type = \"application/json; charset=utf-8\"\n",
"\n",
" if content_type == \"\":\n",
" blob.upload_from_filename(local_file_path)\n",
" else:\n",
" blob.upload_from_filename(local_file_path, content_type = content_type)\n",
"\n",
" print(f\"File '{local_file_path}' uploaded to GCS bucket '{bucket_name}' as '{destination_blob_name}. Content-Type: {content_type}'.\")"
]
},
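{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional, illustrative usage of copy_file_to_gcs (the notebook uses it later for the generated images and HTML).\n",
"# This writes a small local test file and uploads it to the notebook's bucket; the destination path is just an example.\n",
"with open(\"copy-file-test.json\", \"w\") as f:\n",
"    f.write(json.dumps({\"test\": \"copy_file_to_gcs\"}))\n",
"\n",
"copy_file_to_gcs(\"copy-file-test.json\", storage_account, \"chocolate-ai/temp/copy-file-test.json\")"
]
},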
{
"cell_type": "markdown",
"metadata": {
"id": "EYRHDPdVKBzd"
},
"source": [
"### <font color='#4285f4'>Create BigQuery Tables</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "PJsTtUbD5SsQ"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"#DROP TABLE IF EXISTS `chocolate_ai.customer_hyper_personalized_email`;\n",
"\n",
"CREATE TABLE IF NOT EXISTS `chocolate_ai.customer_hyper_personalized_email`\n",
"(\n",
" customer_id INTEGER NOT NULL OPTIONS(description=\"Primary key.\"),\n",
" email_date DATE NOT NULL OPTIONS(description=\"The date of the email.\"),\n",
"\n",
" llm_marketing_prompt STRING OPTIONS(description=\"The prompt to generate the marketing email text.\"),\n",
" llm_marketing_prompt_response_json JSON OPTIONS(description=\"The response from the llm marketing prompt in json.\"),\n",
" llm_marketing_prompt_response_text STRING OPTIONS(description=\"The response from the llm marketing prompt in text.\"),\n",
"\n",
" llm_orginial_image_prompt STRING OPTIONS(description=\"The prompt to generate the original image text.\"),\n",
" llm_orginial_image_prompt_response_json JSON OPTIONS(description=\"The response from the llm original prompt in json.\"),\n",
" llm_orginial_image_prompt_response_text STRING OPTIONS(description=\"The response from the llm original prompt in text.\"),\n",
"\n",
" llm_improved_image_prompt STRING OPTIONS(description=\"The prompt to generate the improved image text.\"),\n",
" -- The improved prompt will be passed to Imagen3\n",
" --llm_improved_image_prompt_response_json JSON OPTIONS(description=\"The response from the llm improved prompt in json.\"),\n",
" --llm_improved_image_prompt_response_text STRING OPTIONS(description=\"The response from the llm improved prompt in text.\"),\n",
"\n",
" llm_verify_image_prompt STRING OPTIONS(description=\"The prompt to verify the generated image.\"),\n",
" llm_verify_image_response_json JSON OPTIONS(description=\"The response from verify the generated image in json.\"),\n",
" llm_verify_image_text STRING OPTIONS(description=\"The response from verify the generated image in text.\"),\n",
"\n",
" llm_translation_language_prompt STRING OPTIONS(description=\"The prompt to generate the secondary lanagage text.\"),\n",
" llm_translation_language_prompt_response_json JSON OPTIONS(description=\"The response from the llm secondary lanagage prompt in json.\"),\n",
" llm_translation_language_prompt_response_text STRING OPTIONS(description=\"The response from the llm secondary lanagage prompt in text.\"),\n",
"\n",
" llm_validate_translation_prompt STRING OPTIONS(description=\"The prompt to generate the vadiation of the translation text.\"),\n",
" llm_validate_translation_prompt_response_json JSON OPTIONS(description=\"The response from the llm vadiation of the translation prompt in json.\"),\n",
" llm_validate_translation_prompt_response_text STRING OPTIONS(description=\"The response from the llm vadiation of the translation prompt in text.\"),\n",
"\n",
" image_gcs_filename STRING OPTIONS(description=\"The GCS path for the marketing campaign image.\"),\n",
" image_http_url STRING OPTIONS(description=\"The HTTP path for the marketing campaign image.\"),\n",
" image_generated BOOLEAN OPTIONS(description=\"Has the image been generated and saved to GCS.\"),\n",
" image_verified BOOLEAN OPTIONS(description=\"Did the image pass verification.\"),\n",
"\n",
" html_gcs_filename STRING OPTIONS(description=\"The GCS path for the marketing campaign HTML file.\"),\n",
" html_http_url STRING OPTIONS(description=\"The HTTP path for the marketing campaign HTML file.\"),\n",
" html_generated BOOLEAN OPTIONS(description=\"Has the HTML been generated and saved to GCS.\"),\n",
" translation_verified BOOLEAN OPTIONS(description=\"Did the translation pass verification.\"),\n",
")\n",
"CLUSTER BY customer_id;"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "FxcTX0z6XZD5"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"--DROP TABLE IF EXISTS `chocolate_ai.event`;\n",
"\n",
"CREATE TABLE IF NOT EXISTS `chocolate_ai.event`\n",
"(\n",
" event_id INT NOT NULL OPTIONS(description=\"The primary key of the event.\"),\n",
" download_date DATE NOT NULL OPTIONS(description=\"The of the downloaded data (we download many event dates at once)\"),\n",
" event_title STRING NOT NULL OPTIONS(description=\"The title of the event.\"),\n",
" event_date DATE NOT NULL OPTIONS(description=\"The date of the event.\"),\n",
" event_time_string STRING NOT NULL OPTIONS(description=\"The time (string value) of the event.\"),\n",
" event_venue STRING NOT NULL OPTIONS(description=\"The venue of the event.\"),\n",
" event_venue_link STRING NOT NULL OPTIONS(description=\"The generated insight in text\"),\n",
" event_address STRING NOT NULL OPTIONS(description=\"The full address of the event.\"),\n",
" event_description STRING NOT NULL OPTIONS(description=\"The description of the event.\"),\n",
" event_reviews INT64 NOT NULL OPTIONS(description=\"The number of reviews of the event.\"),\n",
" event_thumbnail STRING NOT NULL OPTIONS(description=\"The thumbnail image for the event.\"),\n",
"\n",
" llm_event_prompt STRING OPTIONS(description=\"The LLM prompt to extract the event keywords.\"),\n",
" llm_event_prompt_json_result JSON OPTIONS(description=\"The unparsed json results from the LLM\"),\n",
"\n",
" event_keywords STRING OPTIONS(description=\"The keywords for the event.\"),\n",
" event_keywords_embeddings ARRAY<FLOAT64> OPTIONS(description=\"Vector embedding of the event keywords column.\"),\n",
")\n",
"CLUSTER BY event_id;"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QIM9CI13Xqhy"
},
"source": [
"### <font color='#4285f4'>Get the Events (You only need to run this once since they are saved)</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zoWCxMCWXuaM"
},
"source": [
"You can use a 3rd party service to download the events\n",
"- https://serpapi.com/google-events-api\n",
"- https://serpapi.com/blog/scrape-google-events-results-with-python/"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HNZanAdJEaPq"
},
"source": [
"#### Use serpapi"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "qXsRiPrQXzI7"
},
"outputs": [],
"source": [
"# %%bigquery\n",
"# OPTIONAL: Remove the current days events (in case you re-run this notebook). This will force a call to serpapi\n",
"# DELETE FROM `chocolate_ai.event` AS event WHERE event_date = CURRENT_DATE();"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "u-8V_cqdXsiy"
},
"outputs": [],
"source": [
"from serpapi import GoogleSearch\n",
"from datetime import date\n",
"\n",
"event_records = []\n",
"\n",
"if haveEventsBeenDownloaded(\"${project_id}.${bigquery_chocolate_ai_dataset}.event\",\"event_id\") == False:\n",
" event_id = GetNextPrimaryKey(\"${project_id}.${bigquery_chocolate_ai_dataset}.event\",\"event_id\")\n",
" print(f\"event_id: {event_id}\")\n",
"\n",
" event_date = date.today()\n",
"\n",
" google_events_params = {\n",
" \"engine\": \"google_events\",\n",
" \"q\": f\"Events in Paris, France\",\n",
" \"hl\": \"en\",\n",
" \"gl\": \"fl\",\n",
" \"api_key\": f\"{serpapi_key}\",\n",
" 'start': 0,\n",
" \"htichips\" : f\"{htichips}\"\n",
" }\n",
"\n",
" while True:\n",
" search = GoogleSearch(google_events_params)\n",
" event_search_results = search.get_dict()\n",
" if 'error' in event_search_results:\n",
" break\n",
"\n",
" for item in event_search_results[\"events_results\"]:\n",
" try:\n",
" event_venue = \"\"\n",
" address = \"\"\n",
" for address_item in item[\"address\"]:\n",
" if event_venue == \"\":\n",
" split_text = address_item.split(\", \")\n",
" event_venue = split_text[0]\n",
" address = split_text[1]\n",
" else:\n",
" address = address + \" \" + address_item\n",
"\n",
" print(f\"item: {item}\")\n",
" print(f\"event_venue: {event_venue}\")\n",
" print(f\"address: {address}\")\n",
" print(\"------------------------------------------------------\")\n",
"\n",
" event = {\n",
" \"event_id\" : event_id,\n",
" \"event_title\" : item['title'],\n",
" \"event_date\" : event_date, # Placeholder\n",
" \"event_time_string\" : item['date']['when'],\n",
" \"event_venue\" : event_venue,\n",
" \"event_venue_link\" : item['venue']['link'],\n",
" \"event_address\" : address,\n",
" \"event_description\" : item['description'],\n",
" \"event_reviews\" : item['venue']['reviews'],\n",
" \"event_thumbnail\" : item['thumbnail']\n",
" }\n",
" event_id = event_id + 1\n",
"\n",
" event_records.append(event)\n",
"\n",
" except Exception as error:\n",
" print(f\"error: {error}\")\n",
" print(f\"item: {item}\")\n",
" print(f\"------------------------------\")\n",
"\n",
" google_events_params['start'] += 10\n",
"\n",
" if google_events_params['start'] > 30:\n",
" break # only do 30 for now so our prompt is not too large"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oMl8zBaOEfy8"
},
"source": [
"#### Convert the Event Date to a proper date"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "idiMrKUlhJLU"
},
"outputs": [],
"source": [
"# if we downloaded events then save them (otherwise skip)\n",
"from datetime import date\n",
"\n",
"if len(event_records) > 0 :\n",
" from datetime import datetime\n",
" # Write me the json in OpenAPI 3.0 schema object for the below object.\n",
" # Make all fields required.\n",
" # {\n",
" # \"date\" : \"YYYY-MM-DD\"\n",
" # }\n",
" response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\"date\"],\n",
" \"properties\": {\n",
" \"date\": {\n",
" \"type\": \"string\"\n",
" }\n",
" }\n",
" }\n",
"\n",
" for item in event_records:\n",
" if \"download_date\" not in item:\n",
" item[\"download_date\"] = date.today()\n",
"\n",
" # fix the event data (we need gemini to tell us a good event data)\n",
" llm_date_prompt = f\"\"\"Parse the following date into the following format: YYYY-MM-DD.\n",
" If the date spans several days then use the first day of the date.\n",
" The date to format is: {item['event_time_string']}\"\"\"\n",
"\n",
" print (llm_date_prompt)\n",
"\n",
" llm_date_response = GeminiLLM(llm_date_prompt, response_schema=response_schema)\n",
" llm_date_dict = json.loads(llm_date_response)\n",
"\n",
" item[\"event_date\"] = datetime.strptime(llm_date_dict['date'], \"%Y-%m-%d\").date()\n",
" print(event_date)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xPM7R0UdU7Q4"
},
"source": [
"#### Save the Events (bulk insert into BigQuery)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-bEZ5IVbX04l"
},
"outputs": [],
"source": [
"# if we downloaded events then save them (otherwise skip)\n",
"\n",
"if len(event_records) > 0 :\n",
" import pandas as pd\n",
" from google.cloud import bigquery\n",
"\n",
" # Load the events table (in bulk)\n",
" table_id = \"${project_id}.${bigquery_chocolate_ai_dataset}.event\"\n",
"\n",
" dataframe = pd.DataFrame(\n",
" pd.DataFrame(event_records), # Your source data\n",
" columns=[\n",
" \"event_id\",\n",
" \"download_date\",\n",
" \"event_title\",\n",
" \"event_date\",\n",
" \"event_time_string\",\n",
" \"event_venue\",\n",
" \"event_venue_link\",\n",
" \"event_address\",\n",
" \"event_description\",\n",
" \"event_reviews\",\n",
" \"event_thumbnail\"\n",
" ],\n",
" )\n",
"\n",
" job_config = bigquery.LoadJobConfig(\n",
" schema=[\n",
" bigquery.SchemaField(\"event_id\", bigquery.enums.SqlTypeNames.INT64, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"download_date\", bigquery.enums.SqlTypeNames.DATE, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_title\", bigquery.enums.SqlTypeNames.STRING, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_date\", bigquery.enums.SqlTypeNames.DATE, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_time_string\", bigquery.enums.SqlTypeNames.STRING, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_venue\", bigquery.enums.SqlTypeNames.STRING, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_venue_link\", bigquery.enums.SqlTypeNames.STRING, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_address\", bigquery.enums.SqlTypeNames.STRING, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_description\", bigquery.enums.SqlTypeNames.STRING, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_reviews\", bigquery.enums.SqlTypeNames.INT64, mode=\"REQUIRED\"),\n",
" bigquery.SchemaField(\"event_thumbnail\", bigquery.enums.SqlTypeNames.STRING, mode=\"REQUIRED\")\n",
" ],\n",
" write_disposition=\"WRITE_APPEND\",\n",
" )\n",
"\n",
" load_client = bigquery.Client()\n",
" job = load_client.load_table_from_dataframe(dataframe, table_id, job_config=job_config)\n",
" job.result() # Wait for the job to complete.\n",
"\n",
" table = load_client.get_table(table_id) # Make an API request.\n",
" print(\"Loaded {} rows and {} columns to {}\".format(table.num_rows, len(table.schema), table_id))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VZX7nZoLX2rh"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"# Display the results\n",
"SELECT event_id, download_date, event_title, event_date, event_time_string, event_venue\tevent_venue_link, event_address, event_description\n",
" FROM `chocolate_ai.event`\n",
"WHERE download_date = CURRENT_DATE()\n",
"ORDER BY event_id"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "c51M89g0Ejmz"
},
"source": [
"### <font color='#4285f4'>Use Gemini to extract the Event Keywords and create Vector Embeddings</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "U4A_7n4SEPNO"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"-- Create our prompt for Gemini and store it in a column so we can then batch score\n",
"UPDATE `chocolate_ai.event`\n",
" SET llm_event_prompt = CONCAT(\n",
" 'Extract up to 7 keywords from this event that would pertain to peoples hobbies or interests.\\n',\n",
" 'The keywords should be in English.\\n',\n",
" 'You will be perform a semantic search with these words.\\n',\n",
" 'Do not include \"Paris\" or \"France\" as a keyword since the search will be done only on Paris data.\\n',\n",
" 'Do not include dates in the results.\\n',\n",
" 'Do not include postal codes in the results.\\n',\n",
" 'Return the results in JSON with no special formatting, preceeding text or markdown text. It must only be JSON.\\n'\n",
" 'Return the keywords in the following JSON format:\\n',\n",
" '- Example 1: { \"keywords\": [\"text\",\"text\"] }\\n',\n",
" '- Example 2: { \"keywords\": [\"word\",\"keyword\"] }\\n',\n",
" '<event_data>\\n',\n",
" TO_JSON_STRING(TO_JSON(STRUCT(event_title, event_venue, event_description))),\n",
" '\\n',\n",
" '</event_data>')\n",
" WHERE download_date = CURRENT_DATE();"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1MaT_0vREO_F"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"-- Call Gemini Pro on all of our prompts (batch score)\n",
"UPDATE `chocolate_ai.event` AS event_parent\n",
" SET llm_event_prompt_json_result = llm_query.ml_generate_text_result\n",
" FROM (SELECT *\n",
" FROM ML.GENERATE_TEXT(MODEL`${project_id}.${bigquery_chocolate_ai_dataset}.gemini_model`,\n",
" (SELECT event_id,\n",
" llm_event_prompt AS prompt\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.event`\n",
" WHERE download_date = CURRENT_DATE()),\n",
" STRUCT(.5 AS temperature, .8 AS top_p)\n",
" )\n",
" ) AS llm_query\n",
" WHERE event_parent.event_id = llm_query.event_id;"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Z7p_x0EsHLxy"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"-- We get an array back from Gemini, so we need to parse it into keywords, convert the json array to an array of strings and then flatten the string array with spaces as the seperator\n",
"SELECT ARRAY_TO_STRING(JSON_EXTRACT_STRING_ARRAY(JSON_EXTRACT(\n",
" PARSE_JSON('{ \"keywords\": [\"construction\", \"architecture\", \"building\", \"materials\", \"design\", \"technology\", \"innovation\"] }'),\n",
" '$.keywords')), \" \") AS keywords_string;"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8sT3vZwmCZXN"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"-- Extract the keywords from the returned JSON\n",
"UPDATE `chocolate_ai.event`\n",
" SET event_keywords = ARRAY_TO_STRING(JSON_EXTRACT_STRING_ARRAY(JSON_EXTRACT(\n",
" `${project_id}.${bigquery_chocolate_ai_dataset}.gemini_model_result_as_json`(llm_event_prompt_json_result), '$.keywords')), \" \")\n",
" WHERE download_date = CURRENT_DATE();"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "uMCv1S_DNekR"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"-- Create vector embeddings on all the keywords\n",
"UPDATE `chocolate_ai.event` AS event_parent\n",
" SET event_keywords_embeddings = llm_query.text_embedding\n",
" FROM (SELECT *\n",
" FROM ML.GENERATE_TEXT_EMBEDDING(MODEL`${project_id}.${bigquery_chocolate_ai_dataset}.textembedding_model`,\n",
" (SELECT event_id,\n",
" event_keywords AS content\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.event`\n",
" WHERE download_date = CURRENT_DATE()\n",
" --AND ARRAY_LENGTH(event_keywords_embeddings) = 0 -- Always generate\n",
" ),\n",
" STRUCT(TRUE AS flatten_json_output,\n",
" 'SEMANTIC_SIMILARITY' as task_type,\n",
" 768 AS output_dimensionality)\n",
" )\n",
" ) AS llm_query\n",
" WHERE event_parent.event_id = llm_query.event_id;"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OtYES3ssr0AW"
},
"outputs": [],
"source": [
"%%bigquery\n",
"\n",
"SELECT event_id,\n",
" event_date,\n",
" event_title,\n",
" event_description,\n",
" event_keywords\n",
" FROM `chocolate_ai.event`\n",
"WHERE download_date = CURRENT_DATE()\n",
"ORDER BY event_id;"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SLK7IX7QruT3"
},
"source": [
"### <font color='#4285f4'>Ask Gemini for the best event this week to target</font>\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xqFnAxLFQtAI"
},
"outputs": [],
"source": [
"%%bigquery event_df\n",
"\n",
"SELECT TO_JSON_STRING(TO_JSON(STRUCT(event_id,\n",
" event_date,\n",
" event_title,\n",
" event_description,\n",
" event_keywords))) AS json_string\n",
" FROM `chocolate_ai.event`\n",
"WHERE download_date = CURRENT_DATE()\n",
"ORDER BY event_id;"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "9ccdxP2cmKmJ"
},
"outputs": [],
"source": [
"event_list_as_string = \"\"\n",
"\n",
"for index, row in event_df.iterrows():\n",
" event_list_as_string += row[\"json_string\"] + \"\\n\"\n",
"\n",
"print(event_list_as_string)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ruCYNkxIpIPM"
},
"outputs": [],
"source": [
"#Write me the json in OpenAPI 3.0 schema object for the below object.\n",
"#Make all fields required.\n",
"# {\n",
"# [\n",
"# \"event_id\" : 2\n",
"# \"event_title\" : \"\"\n",
"# \"explanation\" : \"\"\n",
"# ]\n",
"# }\n",
"response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\"event_array\"],\n",
" \"properties\": {\n",
" \"event_array\": {\n",
" \"type\": \"array\",\n",
" \"items\": {\n",
" \"type\": \"object\",\n",
" \"required\": [\"event_id\", \"event_title\", \"event_keywords\", \"explanation\" ],\n",
" \"properties\": {\n",
" \"event_id\": {\n",
" \"type\": \"integer\"\n",
" },\n",
" \"event_title\": {\n",
" \"type\": \"string\"\n",
" },\n",
" \"event_keywords\": {\n",
" \"type\": \"string\"\n",
" },\n",
" \"explanation\": {\n",
" \"type\": \"string\"\n",
" }\n",
" }\n",
" }\n",
" }\n",
" }\n",
"}\n",
"\n",
"\n",
"best_event_prompt = f\"\"\"You work in the marketing department at Chocolate AI a chocolate, dessert and coffee shop located in Paris, France.\n",
"You are reviewing this weeks list of events that are happening in Paris.\n",
"We want you to select the top 3 best events, most public, event that will draw the most amount of people to it.\n",
"Explain your reasoning.\n",
"\n",
"Here are the events:\n",
"{event_list_as_string}\n",
"\"\"\"\n",
"\n",
"\n",
"best_event_response = GeminiLLM(best_event_prompt, response_schema=response_schema)\n",
"best_event_dict = json.loads(best_event_response)\n",
"\n",
"best_event_1 = best_event_dict[\"event_array\"][0][\"event_id\"]\n",
"best_event_2 = best_event_dict[\"event_array\"][1][\"event_id\"]\n",
"best_event_3 = best_event_dict[\"event_array\"][2][\"event_id\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7Hu9LxVnqyph"
},
"outputs": [],
"source": [
"print(PrettyPrintJson(json.dumps(best_event_dict)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2olP2ZnglJlr"
},
"source": [
"### <font color='#4285f4'>Do a Vector Search to find customer who are likely to attend the event</font>\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "g9eObGGQlMZw"
},
"outputs": [],
"source": [
"# You can adjust the LIMIT clause for the number of items to generate\n",
"\n",
"sql = f\"\"\"WITH semantic_search AS\n",
"(\n",
" -- Find customers that match the top 3 events (the keywords we extracted)\n",
" SELECT query.event_id AS event_id,\n",
" query.event_title AS event_title,\n",
" query.event_description AS event_description,\n",
" query.event_keywords AS event_keywords,\n",
" base.customer_id AS customer_id,\n",
" base.customer_marketing_insights AS customer_marketing_insights,\n",
" distance\n",
" FROM VECTOR_SEARCH( ( -- table to search\n",
" SELECT customer_id, customer_marketing_insights, customer_marketing_insights_embedding\n",
" FROM `chocolate_ai.customer_marketing_profile`\n",
" WHERE ARRAY_LENGTH(customer_marketing_insights_embedding) = 768\n",
" ),\n",
" 'customer_marketing_insights_embedding', -- the column name that contains our embedding (from query above)\n",
" ( -- source of embeddings\n",
" SELECT event_id, event_title, event_description, event_keywords, event_keywords_embeddings\n",
" FROM `chocolate_ai.event`\n",
" WHERE ARRAY_LENGTH(event_keywords_embeddings) = 768\n",
" AND download_date = CURRENT_DATE()\n",
" AND event_id IN ({best_event_1}, {best_event_2}, {best_event_3})\n",
" ),\n",
" 'event_keywords_embeddings', -- the column name of our newly embedded data (from query above)\n",
" top_k => 10\n",
" )\n",
")\n",
"-- Get each customer's top 3 menu items\n",
"SELECT semantic_search.*,\n",
" customer.customer_name,\n",
" customer.country_code,\n",
" CONCAT(menu_1.menu_name,':',menu_1.menu_description) AS top_1_favorite_menu_item_name,\n",
" CONCAT(menu_2.menu_name,':',menu_2.menu_description) AS top_2_favorite_menu_item_name,\n",
" CONCAT(menu_3.menu_name,':',menu_3.menu_description) AS top_3_favorite_menu_item_name,\n",
"FROM semantic_search\n",
" INNER JOIN `chocolate_ai.customer_marketing_profile` AS customer_marketing_profile\n",
" ON semantic_search.customer_id = customer_marketing_profile.customer_id\n",
" INNER JOIN `chocolate_ai.customer` AS customer\n",
" ON semantic_search.customer_id = customer.customer_id\n",
" AND customer.country_code = 'FR' -- You can do different countries (assuming you know they are travel?)\n",
" INNER JOIN `chocolate_ai.menu` AS menu_1\n",
" ON CAST(JSON_VALUE(JSON_EXTRACT_ARRAY(customer_marketing_profile.customer_loyalty_data.top_3_favorite_menu_items,'$')[0]) AS INT64) = menu_1.menu_id\n",
" INNER JOIN `chocolate_ai.menu` AS menu_2\n",
" ON CAST(JSON_VALUE(JSON_EXTRACT_ARRAY(customer_marketing_profile.customer_loyalty_data.top_3_favorite_menu_items,'$')[1]) AS INT64) = menu_2.menu_id\n",
" INNER JOIN `chocolate_ai.menu` AS menu_3\n",
" ON CAST(JSON_VALUE(JSON_EXTRACT_ARRAY(customer_marketing_profile.customer_loyalty_data.top_3_favorite_menu_items,'$')[2]) AS INT64) = menu_3.menu_id\n",
"ORDER BY distance -- RAND() (use random to show different results on different runs, \"for testing\")\n",
"LIMIT 10 -- You can change this\n",
"\"\"\"\n",
"\n",
"top_3_events = RunQuery(sql)\n",
"\n",
"customer_list = []\n",
"for row in top_3_events.itertuples():\n",
" customer_dict = {\n",
" \"event_id\" : row.event_id,\n",
" \"event_title\" : row.event_title,\n",
" \"event_description\" : row.event_description,\n",
" \"event_keywords\" : row.event_keywords,\n",
" \"customer_id\" : row.customer_id,\n",
" \"customer_name\" : row.customer_name,\n",
" \"country_code\" : row.country_code,\n",
" \"customer_marketing_insights\" : row.customer_marketing_insights,\n",
" \"distance\" :row.distance,\n",
" \"top_1_favorite_menu_item_name\" : row.top_1_favorite_menu_item_name,\n",
" \"top_2_favorite_menu_item_name\" : row.top_2_favorite_menu_item_name,\n",
" \"top_3_favorite_menu_item_name\" : row.top_3_favorite_menu_item_name\n",
" }\n",
" print(f\"Customer: {customer_dict}\")\n",
" customer_list.append(customer_dict)\n",
"\n",
"# Just the Ids so we can query\n",
"customer_id_list = ([customer['customer_id'] for customer in customer_list])\n",
"customer_id_list_str = (', '.join(map(str, customer_id_list)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TqCpfi1P9KtL"
},
"source": [
"### <font color='#4285f4'>Generate and run the LLM Marketing Prompt</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "MzZdNARrKHQZ"
},
"outputs": [],
"source": [
"# In case you re-run\n",
"\n",
"sql = f\"\"\"DELETE\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE();\"\"\"\n",
"\n",
"RunQuery(sql)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "mx8rLbqJ95d0"
},
"outputs": [],
"source": [
"# For each customer, generate the marketing prompt and run against Gemini\n",
"\n",
"# Write me the json in OpenAPI 3.0 schema object for the below object.\n",
"# Make all fields required.\n",
"# {\n",
"# \"customer_id\" : 0,\n",
"# \"email_subject\" : \"text\",\n",
"# \"marketing_text\" : \"text\",\n",
"# \"explanation\" : \"text\"\n",
"# }\n",
"response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"customer_id\",\n",
" \"email_subject\",\n",
" \"marketing_text\",\n",
" \"explanation\"\n",
" ],\n",
" \"properties\": {\n",
" \"customer_id\": {\n",
" \"type\": \"integer\"\n",
" },\n",
" \"email_subject\": {\n",
" \"type\": \"string\"\n",
" },\n",
" \"marketing_text\": {\n",
" \"type\": \"string\"\n",
" },\n",
" \"explanation\": {\n",
" \"type\": \"string\"\n",
" }\n",
" }\n",
"}\n",
"\n",
"for item in customer_list:\n",
" print(f\"Customer id: {item['customer_id']}\")\n",
" retry = 0\n",
" success = False\n",
" while not success:\n",
" try:\n",
" prompt = f\"\"\"You are a marketing expert at Chocolate A.I., a Paris-based chocolate, dessert, and coffee shop. Your task is to craft a highly personalized email for customer {item['customer_name']}.\n",
"\n",
" Leverage the following customer data:\n",
"\n",
" Top Favorite Menu Items: {item['top_1_favorite_menu_item_name']}, {item['top_2_favorite_menu_item_name']}, {item['top_3_favorite_menu_item_name']}\n",
"\n",
" Customer Marketing Insights: {item['customer_marketing_insights']}\n",
"\n",
" The customer lives in: {item['country_code']}\n",
"\n",
" Context: A relevant event is happening this week in Paris. We believe this customer would be interested based on their profile.\n",
"\n",
" Goal: Create a persuasive email that encourages the customer to visit Chocolate A.I. and highlights their favorite menu items, subtly connecting them to the ongoing event.\n",
"\n",
" Format: Please provide a complete, ready-to-send email, tailored to the specific customer and event details. Avoid using placeholders or templates.\n",
"\n",
" Do not tell the customer this, but here are the event details:\n",
" event_id: {item['event_id']}\n",
" event_title: {item['event_title']}\n",
" event_description: {item['event_description']}\n",
" event_keywords: {item['event_keywords']}\n",
"\n",
" Explanation: Briefly explain your thought process and how you've customized the email to resonate with the customer's interests and the event's theme.\n",
" \"\"\"\n",
"\n",
" print(prompt)\n",
" llm_result = GeminiLLM(prompt,response_schema=response_schema)\n",
" print(llm_result)\n",
" json_result = json.loads(llm_result)\n",
" result_escaped_quotes = llm_result.replace(\"\\\\\",\"\\\\\\\\\").replace(\"'\",\"\\'\")\n",
" #json_text = json_result['marketing_text'].replace(\"'\",\"\\'\")\n",
"\n",
" # Save to database\n",
" try:\n",
"\n",
" sql=f\"\"\"INSERT INTO `chocolate_ai.customer_hyper_personalized_email`\n",
" (customer_id, email_date, llm_marketing_prompt, llm_marketing_prompt_response_json, llm_marketing_prompt_response_text)\n",
" VALUES ({item['customer_id']}, CURRENT_DATE(), \\\"\\\"\\\"{prompt}\\\"\\\"\\\", JSON\\\"\\\"\\\"{result_escaped_quotes}\\\"\\\"\\\",\\\"\\\"\\\"{json_result['marketing_text']}\\\"\\\"\\\")\"\"\"\n",
" # print(sql)\n",
" RunQuery(sql)\n",
"\n",
" # Jump out of loop\n",
" item['marketing_text'] = json_result['marketing_text']\n",
" item['email_subject'] = json_result['email_subject']\n",
" success = True\n",
"\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"LLM Marketing Prompt [Success] for Customer {item['customer_id']}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"LLM Marketing Prompt [SQL Error] for Customer {item['customer_id']}: {sql}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"LLM Marketing Prompt [Error] for Customer {item['customer_id']}: {e}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" if retry > 5:\n",
" print(\"!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\")\n",
" print(f\"LLM Marketing Prompt [Retry Limit Reached - Skipping] for Customer {item['customer_id']}\")\n",
" print(\"!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\")\n",
" break # Skip this customer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Xj5NUbf_HoyN"
},
"outputs": [],
"source": [
"# View Results\n",
"sql=f\"\"\"SELECT customer_id, llm_marketing_prompt, llm_marketing_prompt_response_json, llm_marketing_prompt_response_text\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE()\"\"\"\n",
"\n",
"print(sql)\n",
"df_process = RunQuery(sql)\n",
"df_process"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aqiUIjPCQPux"
},
"source": [
"### <font color='#4285f4'>Create an image prompt and enhance it by running it through Gemini</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "yWN__E_hQPu8"
},
"outputs": [],
"source": [
"# For each customer, generate an improved image prompt using Gemini\n",
"\n",
"# Write me the json in OpenAPI 3.0 schema object for the below object.\n",
"# Make all fields required.\n",
"# {\n",
"# \"customer_id\" : 0,\n",
"# \"image_prompt\" : \"text\"\n",
"# \"explanation\" : \"text\"\n",
"# }\n",
"response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"customer_id\",\n",
" \"image_prompt\",\n",
" \"explanation\"\n",
" ],\n",
" \"properties\": {\n",
" \"customer_id\": {\n",
" \"type\": \"integer\"\n",
" },\n",
" \"image_prompt\": {\n",
" \"type\": \"string\"\n",
" },\n",
" \"explanation\": {\n",
" \"type\": \"string\"\n",
" }\n",
" }\n",
"}\n",
"\n",
"for item in customer_list:\n",
" print(f\"Customer id: {item['customer_id']}\")\n",
" retry = 0\n",
" success = False\n",
" while not success:\n",
" try:\n",
" prompt = f\"\"\"You are a marketing expert at Chocolate A.I., a Paris-based chocolate, dessert, and coffee shop. Your task is to craft a highly personalized image for customer {item['customer_name']}.\n",
" You need to send out a hyper-personalized email to a customer.\n",
" Generate a LLM Prompt to generate a marketing image based upon the customer's profile and the marketing message we are sending in the email.\n",
" The image needs to include one of the below top 3 menu items.\n",
" We want the image to be specific to this customer and their interests.\n",
" Think creatively and use the customer's interests to create a unique image.\n",
" This is important to show something about their interests, hobbies, sports, etc.\n",
" Do not include any names of professional sports teams since they are copyrighted.\n",
" Make sure you state the image should be photo realistic.\n",
" Avoid and copyrighted names such as sporting teams names.\n",
" Avoid mentioning any celebrity names or specific people.\n",
" Do not include references to kids or children in the image prompt.\n",
" Only audits can be rendered by the image process.\n",
" This this through step by step.\n",
" Double check for kids, children, or copyrighted sports teams names.\n",
"\n",
" Customer's profile:\n",
" customer_marketing_insights: {item['customer_marketing_insights']}\n",
" top_1_favorite_menu_item_name: {item['top_1_favorite_menu_item_name']}\n",
" top_2_favorite_menu_item_name: {item['top_2_favorite_menu_item_name']}\n",
" top_3_favorite_menu_item_name: {item['top_3_favorite_menu_item_name']}\n",
"\n",
" Marketing Message:\n",
" {item['marketing_text']}\n",
"\n",
" Do not tell the customer this, but there is an event in Paris that matches their hobbies so you should focus on an image that is specific to that event.\n",
" event_id: {item['event_id']}\n",
" event_title: {item['event_title']}\n",
" event_description: {item['event_description']}\n",
" event_keywords: {item['event_keywords']}\n",
" \"\"\"\n",
"\n",
" print(prompt)\n",
" llm_result = GeminiLLM(prompt,response_schema=response_schema)\n",
" print(llm_result)\n",
" json_result = json.loads(llm_result)\n",
" result_escaped_quotes = llm_result.replace(\"\\\\\",\"\\\\\\\\\").replace(\"'\",\"\\'\")\n",
"\n",
" # Save to database\n",
" try:\n",
" sql=f\"\"\"UPDATE `chocolate_ai.customer_hyper_personalized_email`\n",
" SET llm_orginial_image_prompt = \\\"\\\"\\\"{prompt}\\\"\\\"\\\",\n",
" llm_orginial_image_prompt_response_json = JSON\\\"\\\"\\\"{result_escaped_quotes}\\\"\\\"\\\",\n",
" llm_orginial_image_prompt_response_text = \\\"\\\"\\\"{json_result['image_prompt']}\\\"\\\"\\\",\n",
" llm_improved_image_prompt = \\\"\\\"\\\"{json_result['image_prompt']}\\\"\\\"\\\"\n",
" WHERE customer_id = {item['customer_id']}\"\"\"\n",
"\n",
" #print(sql)\n",
" RunQuery(sql)\n",
"\n",
" # Jump out of loop\n",
" item['image_prompt'] = json_result['image_prompt']\n",
" item['image_explanation'] = json_result['explanation']\n",
" success = True\n",
"\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"LLM Image Prompt [Success] for Customer {item['customer_id']}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"LLM Image Prompt [SQL Error] for Customer {item['customer_id']}: {sql}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"LLM Image Prompt [Error] for Customer {item['customer_id']}: {e}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" if retry > 5:\n",
" print(\"!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\")\n",
" print(f\"LLM Image Prompt [Retry Limit Reached - Skipping] for Customer {item['customer_id']}\")\n",
" print(\"!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\")\n",
" break # Skip this customer"
]
},
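{
"cell_type": "markdown",
"metadata": {},
"source": [
"The UPDATE above inlines the prompt and the raw LLM response into the SQL text, which is why quotes and backslashes have to be escaped by hand. The next cell is an optional sketch of the same update written with BigQuery query parameters, which avoids the manual escaping. It assumes the google-cloud-bigquery client library available in this runtime and the table used throughout this notebook; it is not called by the cells that follow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: write the LLM output with BigQuery query parameters instead of\n",
"# inlining values into the SQL string (no manual quote escaping required).\n",
"from google.cloud import bigquery\n",
"\n",
"def save_image_prompt_parameterized(customer_id, prompt_text, response_json, image_prompt):\n",
"  client = bigquery.Client()  # uses the default project of this notebook runtime\n",
"  sql = '''UPDATE chocolate_ai.customer_hyper_personalized_email\n",
"              SET llm_orginial_image_prompt = @prompt_text,\n",
"                  llm_orginial_image_prompt_response_json = PARSE_JSON(@response_json),\n",
"                  llm_orginial_image_prompt_response_text = @image_prompt,\n",
"                  llm_improved_image_prompt = @image_prompt\n",
"            WHERE customer_id = @customer_id'''\n",
"  job_config = bigquery.QueryJobConfig(query_parameters=[\n",
"      bigquery.ScalarQueryParameter('prompt_text', 'STRING', prompt_text),\n",
"      bigquery.ScalarQueryParameter('response_json', 'STRING', response_json),\n",
"      bigquery.ScalarQueryParameter('image_prompt', 'STRING', image_prompt),\n",
"      bigquery.ScalarQueryParameter('customer_id', 'INT64', customer_id),\n",
"  ])\n",
"  client.query(sql, job_config=job_config).result()  # wait for the DML statement to finish\n",
"\n",
"# Example (hypothetical usage):\n",
"# save_image_prompt_parameterized(item['customer_id'], prompt, llm_result, json_result['image_prompt'])"
]
},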
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "nJ7hDwU8QPu8"
},
"outputs": [],
"source": [
"# View Results\n",
"sql=f\"\"\"SELECT customer_id, llm_orginial_image_prompt, llm_orginial_image_prompt_response_json, llm_improved_image_prompt\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE()\"\"\"\n",
"\n",
"print(sql)\n",
"df_process = RunQuery(sql)\n",
"df_process"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0o5gCL_AWqBQ"
},
"source": [
"### <font color='#4285f4'>Call Imagen3 with the updated prompt</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GGgdgKdwWqBQ"
},
"outputs": [],
"source": [
"# For each customer, generate the image using Imagen3\n",
"for item in customer_list:\n",
" print(f\"Customer id: {item['customer_id']}\")\n",
" try:\n",
" image_prompt = item['image_prompt']\n",
" print(f\"Image Prompt: {image_prompt}\")\n",
" print(f\"Image Prompt Explanation: {item['image_explanation']}\")\n",
" filename = ImageGen(item['image_prompt'])\n",
"\n",
" img = Image.open(filename)\n",
" img.thumbnail([500,500]) # width, height\n",
" IPython.display.display(img)\n",
"\n",
" # Save image to GCS\n",
" dest_filename = f\"email_campaign_{item['customer_id']}.png\"\n",
" copy_file_to_gcs(filename, storage_account, f\"chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}/{dest_filename}\")\n",
" item['gcs_filename'] = f\"gs://{storage_account}/chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}/{dest_filename}\"\n",
" item['html_filename'] = f\"https://storage.cloud.google.com/{storage_account}/chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}/{dest_filename}\"\n",
"\n",
" # Update table in BigQuery\n",
" try:\n",
" sql=f\"\"\"UPDATE `chocolate_ai.customer_hyper_personalized_email`\n",
" SET image_gcs_filename = '{item['gcs_filename']}',\n",
" image_http_url = '{item['html_filename']}',\n",
" image_generated = TRUE\n",
" WHERE customer_id = {item['customer_id']}\"\"\"\n",
"\n",
" #print(sql)\n",
" RunQuery(sql)\n",
" item['image_filename'] = filename\n",
"\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Imagen3 [Success] for Customer {item['customer_id']}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Imagen3 [SQL Error] for Customer {item['customer_id']}: {sql}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Imagen3 [Error] for Customer {item['customer_id']}: {e}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n"
]
},
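{
"cell_type": "markdown",
"metadata": {},
"source": [
"copy_file_to_gcs is defined in the setup section earlier in this notebook. For readers following along outside the notebook, the next cell is a minimal sketch of what such a helper typically looks like with the google-cloud-storage client; the function and parameter names here are illustrative, and the cells that follow keep using the original helper."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: what a helper like copy_file_to_gcs typically looks like with the\n",
"# google-cloud-storage client. The real helper is defined earlier in this notebook.\n",
"from google.cloud import storage\n",
"\n",
"def copy_file_to_gcs_sketch(local_file_path, bucket_name, destination_blob_name):\n",
"  '''Uploads a local file (e.g. a generated PNG) to gs://bucket_name/destination_blob_name.'''\n",
"  client = storage.Client()\n",
"  bucket = client.bucket(bucket_name)\n",
"  blob = bucket.blob(destination_blob_name)\n",
"  blob.upload_from_filename(local_file_path)\n",
"  return f'gs://{bucket_name}/{destination_blob_name}'\n",
"\n",
"# Example (hypothetical usage):\n",
"# copy_file_to_gcs_sketch(filename, storage_account, f'chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}/{dest_filename}')"
]
},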
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "KmJEbPmWO7PE"
},
"outputs": [],
"source": [
"# To view the bucket\n",
"print(f\"https://console.cloud.google.com/storage/browser/{storage_account}/chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "SGhMT7ohDPcI"
},
"outputs": [],
"source": [
"# View Results\n",
"sql=f\"\"\"SELECT customer_id, image_gcs_filename, image_http_url, image_generated\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE()\"\"\"\n",
"\n",
"print(sql)\n",
"df_process = RunQuery(sql)\n",
"df_process"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OfwG_ykdCypl"
},
"source": [
"### <font color='#4285f4'>Verify the Generated Image with Gemini Vision</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "V-_J7-7Y-S5W"
},
"outputs": [],
"source": [
"# Verify the generate image is correct\n",
"\n",
"# Write me the json in OpenAPI 3.0 schema object for the below object.\n",
"# Make all fields required.\n",
"# {\n",
"# \"image_verified\" : true\n",
"# \"explanation\" : \"text\"\n",
"# }\n",
"response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"image_verified\",\n",
" \"explanation\"\n",
" ],\n",
" \"properties\": {\n",
" \"image_verified\": {\n",
" \"type\": \"boolean\"\n",
" },\n",
" \"explanation\": {\n",
" \"type\": \"string\"\n",
" }\n",
" }\n",
"}\n",
"\n",
"json_schema = '{ \"image_verified\" : true, \"explanation\" : \"text\" }'\n",
"\n",
"for item in customer_list:\n",
" print(f\"Customer id: {item['customer_id']}\")\n",
" try:\n",
" image_prompt = item['image_prompt']\n",
" print(f\"Image Prompt: {image_prompt}\")\n",
" print(f\"Image Prompt Explanation: {item['image_explanation']}\")\n",
" filename = item['image_filename']\n",
"\n",
" prompt = f\"\"\"I need you to verify that the below image meets the following criteria:\n",
" <ImagePrompt>\n",
" {item['image_prompt']}\n",
" </ImagePrompt>\n",
" <ImageExplanation>\n",
" {item['image_explanation']}\n",
" </ImageExplanation>\n",
" \"\"\"\n",
"\n",
" print(prompt)\n",
" imageBase64 = convert_png_to_base64(filename)\n",
" llm_result = GeminiLLM_VerifyImage(prompt, imageBase64, response_schema=response_schema)\n",
" print(llm_result)\n",
" json_result = json.loads(llm_result)\n",
" result_escaped_quotes = llm_result.replace(\"\\\\\",\"\\\\\\\\\").replace(\"'\",\"\\'\")\n",
"\n",
"\n",
" # Update table in BigQuery\n",
" try:\n",
" sql=f\"\"\"UPDATE `chocolate_ai.customer_hyper_personalized_email`\n",
" SET llm_verify_image_prompt = \\\"\\\"\\\"{prompt}\\\"\\\"\\\",\n",
" llm_verify_image_response_json = JSON\\\"\\\"\\\"{result_escaped_quotes}\\\"\\\"\\\",\n",
" llm_verify_image_text = \\\"\\\"\\\"{json_result['explanation']}\\\"\\\"\\\",\n",
" image_verified = {json_result['image_verified']}\n",
" WHERE customer_id = {item['customer_id']}\"\"\"\n",
"\n",
" #print(sql)\n",
" RunQuery(sql)\n",
"\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Imagen3 Verification [Success] for Customer {item['customer_id']}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Imagen3 Verification [SQL Error] for Customer {item['customer_id']}: {sql}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Imagen3 Verification [Error] for Customer {item['customer_id']}: {e}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n"
]
},
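{
"cell_type": "markdown",
"metadata": {},
"source": [
"The verification above sends the generated image to Gemini as a base64 string via the convert_png_to_base64 helper defined earlier in this notebook. The next cell is only a sketch of that encoding step for readers working outside the notebook; it assumes the helper simply base64-encodes the raw image bytes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: base64-encode an image file so it can be sent inline to Gemini.\n",
"# The notebook's own convert_png_to_base64 helper is defined in the setup section.\n",
"import base64\n",
"\n",
"def convert_png_to_base64_sketch(image_path):\n",
"  '''Returns the file contents as a base64-encoded string.'''\n",
"  with open(image_path, 'rb') as f:\n",
"    return base64.b64encode(f.read()).decode('utf-8')\n",
"\n",
"# Example (hypothetical usage): image_base64 = convert_png_to_base64_sketch(item['image_filename'])"
]
},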
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "c0H1cwl2CJdj"
},
"outputs": [],
"source": [
"# View Results\n",
"sql=f\"\"\"SELECT customer_id, llm_verify_image_prompt, llm_verify_image_response_json, llm_verify_image_text, image_verified\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE()\"\"\"\n",
"\n",
"print(sql)\n",
"df_process = RunQuery(sql)\n",
"df_process"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OabA8ZAhC5IX"
},
"source": [
"### <font color='#4285f4'>Translate the Marketing Message to another language</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1xYdA9VKDIQV"
},
"outputs": [],
"source": [
"# Translate the marketing text into another language (we will randomly pick on)\n",
"\n",
"# Write me the json in OpenAPI 3.0 schema object for the below object.\n",
"# Make all fields required.\n",
"# {\n",
"# \"translated_text\" : \"text\"\n",
"# }\n",
"response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"translated_text\"\n",
" ],\n",
" \"properties\": {\n",
" \"translated_text\": {\n",
" \"type\": \"string\"\n",
" }\n",
" }\n",
"}\n",
"\n",
"\n",
"for item in customer_list:\n",
" # Pick an random language from the list\n",
" random_language = 11 # we will do French or you can do a Random language: random.randint(0,len(gemini_languages)-1)\n",
" print(f\"Random Language: {gemini_languages[random_language]}\")\n",
" print(f\"Customer id: {item['customer_id']}\")\n",
" item['translation_language'] = gemini_languages[random_language]\n",
" try:\n",
"\n",
" prompt = f\"\"\"You are an expert translator for the following language {gemini_languages[random_language]}.\n",
" Translate the following text from English to {gemini_languages[random_language]}:\n",
" <Text>\n",
" {item['marketing_text']}\n",
" </Text>\n",
" \"\"\"\n",
"\n",
" print(prompt)\n",
" llm_result = GeminiLLM(prompt, response_schema=response_schema)\n",
" print(llm_result)\n",
" json_result = json.loads(llm_result)\n",
" result_escaped_quotes = llm_result.replace(\"\\\\\",\"\\\\\\\\\").replace(\"'\",\"\\'\")\n",
"\n",
" # Update table in BigQuery\n",
" try:\n",
" sql=f\"\"\"UPDATE `chocolate_ai.customer_hyper_personalized_email`\n",
" SET llm_translation_language_prompt = \\\"\\\"\\\"{prompt}\\\"\\\"\\\",\n",
" llm_translation_language_prompt_response_json = JSON\\\"\\\"\\\"{result_escaped_quotes}\\\"\\\"\\\",\n",
" llm_translation_language_prompt_response_text = \\\"\\\"\\\"{json_result['translated_text']}\\\"\\\"\\\"\n",
" WHERE customer_id = {item['customer_id']}\"\"\"\n",
"\n",
" #print(sql)\n",
" RunQuery(sql)\n",
" item['translated_text'] = json_result['translated_text']\n",
"\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Translation [Success] for Customer {item['customer_id']}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Translation [SQL Error] for Customer {item['customer_id']}: {sql}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Translation [Error] for Customer {item['customer_id']}: {e}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n"
]
},
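{
"cell_type": "markdown",
"metadata": {},
"source": [
"The translation loop above hardcodes index 11 of gemini_languages (French). The next cell is a small sketch of choosing the target language with random.choice instead; it assumes the gemini_languages list defined earlier in this notebook and is not used by the cells that follow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: pick the translation target language at random instead of hardcoding French.\n",
"# Assumes the gemini_languages list defined earlier in this notebook.\n",
"import random\n",
"\n",
"def pick_translation_language(languages, default='French', randomize=False):\n",
"  '''Returns the default language unless randomize is True, in which case one is drawn at random.'''\n",
"  if randomize:\n",
"    return random.choice(languages)\n",
"  return default\n",
"\n",
"# Example (hypothetical usage):\n",
"# item['translation_language'] = pick_translation_language(gemini_languages, randomize=True)"
]
},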
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_7QtxRWnP0Ll"
},
"outputs": [],
"source": [
"# View Results\n",
"sql=f\"\"\"SELECT customer_id, llm_translation_language_prompt, llm_translation_language_prompt_response_json, llm_translation_language_prompt_response_text\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE()\"\"\"\n",
"\n",
"print(sql)\n",
"df_process = RunQuery(sql)\n",
"df_process"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lrS2IH6VC9l-"
},
"source": [
"### <font color='#4285f4'>Verify the Translation</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "I1p7flicDItv"
},
"outputs": [],
"source": [
"# Verify the translation is correct\n",
"\n",
"# Write me the json in OpenAPI 3.0 schema object for the below object.\n",
"# Make all fields required.\n",
"# {\n",
"# \"translation_verified\" : true\n",
"# \"explanation\" : \"text\"\n",
"# }\n",
"response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"translation_verified\",\n",
" \"explanation\"\n",
" ],\n",
" \"properties\": {\n",
" \"translation_verified\": {\n",
" \"type\": \"boolean\"\n",
" },\n",
" \"explanation\": {\n",
" \"type\": \"string\"\n",
" }\n",
" }\n",
"}\n",
"\n",
"\n",
"for item in customer_list:\n",
" print(f\"Customer id: {item['customer_id']}\")\n",
" try:\n",
" prompt = f\"\"\"I need you to verify that the below text is in the langugage of \"{item['translation_language']}\".\n",
" It was originially in English, so make sure it is not still English.\n",
" <Text>\n",
" {item['translated_text']}\n",
" </Text>\n",
" \"\"\"\n",
"\n",
" print(prompt)\n",
" llm_result = GeminiLLM(prompt, response_schema=response_schema)\n",
" print(llm_result)\n",
" json_result = json.loads(llm_result)\n",
" result_escaped_quotes = llm_result.replace(\"\\\\\",\"\\\\\\\\\").replace(\"'\",\"\\'\")\n",
"\n",
" # Update table in BigQuery\n",
" try:\n",
" sql=f\"\"\"UPDATE `chocolate_ai.customer_hyper_personalized_email`\n",
" SET llm_validate_translation_prompt = \\\"\\\"\\\"{prompt}\\\"\\\"\\\",\n",
" llm_validate_translation_prompt_response_json = JSON\\\"\\\"\\\"{result_escaped_quotes}\\\"\\\"\\\",\n",
" llm_validate_translation_prompt_response_text = \\\"\\\"\\\"{json_result['explanation']}\\\"\\\"\\\",\n",
" translation_verified = {json_result['translation_verified']}\n",
" WHERE customer_id = {item['customer_id']}\"\"\"\n",
"\n",
" #print(sql)\n",
" RunQuery(sql)\n",
"\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Translation Verification [Success] for Customer {item['customer_id']}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Translation Verification [SQL Error] for Customer {item['customer_id']}: {sql}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"Translation Verification [Error] for Customer {item['customer_id']}: {e}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n"
]
},
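{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a complement to the Gemini-based check above, a lightweight local language detector can confirm that the translated text is no longer English. The next cell is an optional sketch using the open-source langdetect package, which is not installed by this notebook's setup; it only compares ISO 639-1 codes (for example, fr for French)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sketch: a local sanity check with the langdetect package (not installed by\n",
"# this notebook's setup). Run !pip install langdetect first if you want to try it.\n",
"from langdetect import detect\n",
"\n",
"def looks_like_language(text, expected_iso_code):\n",
"  '''True when langdetect's best guess matches the expected ISO 639-1 code, e.g. fr for French.'''\n",
"  try:\n",
"    return detect(text) == expected_iso_code\n",
"  except Exception:\n",
"    return False  # detection can fail on very short or mixed-language text\n",
"\n",
"# Example (hypothetical usage): looks_like_language(item['translated_text'], 'fr')"
]
},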
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dNQoVMarRGxl"
},
"outputs": [],
"source": [
"# View Results\n",
"sql=f\"\"\"SELECT customer_id, llm_validate_translation_prompt, llm_validate_translation_prompt_response_json, llm_validate_translation_prompt_response_text, translation_verified\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE()\"\"\"\n",
"\n",
"print(sql)\n",
"df_process = RunQuery(sql)\n",
"df_process"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5sCUluQyDB6-"
},
"source": [
"### <font color='#4285f4'>Generate the HTML and Save</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xT1UZX2rDJNv"
},
"outputs": [],
"source": [
"html_template = \"\"\"<!DOCTYPE html>\n",
"<html>\n",
"<head>\n",
" <title>Coffee Campaign</title>\n",
" <style>\n",
" body {\n",
" font-family: 'Helvetica Neue', sans-serif;\n",
" }\n",
" .email-campaign {\n",
" background-color: #EDF2F9;\n",
" padding: 20px;\n",
" margin-bottom: 20px;\n",
" border-bottom: 2px solid #ddd;\n",
" }\n",
" h3 {\n",
" font-size: 16px;\n",
" margin-bottom: 10px;\n",
" color: #333;\n",
" }\n",
" p {\n",
" font-size: 14px;\n",
" line-height: 1.5;\n",
" color: #555;\n",
" }\n",
" </style>\n",
"</head>\n",
"<body>\n",
" <div class=\"email-campaign\">\n",
" <h3>Email Campaign (English)</h3>\n",
" <p style=\"font-weight: bold;\">Subject: ##email_subject##</p>\n",
" <p>##marketing_text##</p>\n",
" </div>\n",
" <div>\n",
" <img src=\"##html_filename##\" width=\"500\" height=\"500\" alt=\"Item Image\">\n",
" </div>\n",
"\n",
" <hr/>\n",
"\n",
" <div class=\"email-campaign\">\n",
" <h3>Email Campaign (##translation_language##)</h3>\n",
" <p>##translated_text##</p>\n",
" </div>\n",
" <div>\n",
" <img src=\"##html_filename##\" width=\"500\" height=\"500\" alt=\"Item Image\">\n",
" </div>\n",
"\n",
"</body>\n",
"</html>\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "I7ApKaWhSdHE"
},
"outputs": [],
"source": [
"# Create HTML using the Template\n",
"\n",
"for item in customer_list:\n",
" print(item)\n",
" if 'html_filename' not in item:\n",
" # Error generating image\n",
" print(\"Error: 'html_filename' not in item\")\n",
" continue\n",
"\n",
" if 'translated_text' not in item:\n",
" # Error generating translation\n",
" print(\"Error: 'translated_text' not in item\")\n",
" continue\n",
"\n",
" # Replace the placeholders with the actual values\n",
" html = html_template \\\n",
" .replace(\"##email_subject##\", item['email_subject']) \\\n",
" .replace(\"##marketing_text##\", item['marketing_text']) \\\n",
" .replace(\"##translation_language##\", item['translation_language']) \\\n",
" .replace(\"##translated_text##\", item['translated_text']) \\\n",
" .replace(\"##html_filename##\", item['html_filename'])\n",
"\n",
" filename = f\"email_campaign_{item['customer_id']}.html\"\n",
"\n",
" # Save the HTML to a file\n",
" with open(filename, \"w\") as f:\n",
" f.write(html)\n",
"\n",
" copy_file_to_gcs(filename, storage_account,f\"chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}/\" + filename)\n",
"\n",
" # Update table in BigQuery\n",
" try:\n",
" html_gcs_filename = f\"gs://{storage_account}/chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}/{filename}\"\n",
" html_http_url = f\"https://storage.cloud.google.com/{storage_account}/chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}/{filename}\"\n",
"\n",
" sql=f\"\"\"UPDATE `chocolate_ai.customer_hyper_personalized_email`\n",
" SET html_gcs_filename = '{html_gcs_filename}',\n",
" html_http_url = '{html_http_url}',\n",
" html_generated = TRUE\n",
" WHERE customer_id = {item['customer_id']}\"\"\"\n",
"\n",
" #print(sql)\n",
" RunQuery(sql)\n",
"\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"HTML Generation [Success] for Customer {item['customer_id']}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
"\n",
" except Exception as e:\n",
" retry += 1\n",
" print(\"---------------------------------------------------------------------------------------\")\n",
" print(f\"HTML Generation [SQL Error] for Customer {item['customer_id']}: {sql}\")\n",
" print(\"---------------------------------------------------------------------------------------\")\n"
]
},
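{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook stops at generating the HTML and saving it to GCS and BigQuery; actually delivering the email would normally be handed off to your email service provider. The next cell is a hypothetical sketch of sending one of the generated HTML files with Python's standard library; the SMTP host, port, credentials, and addresses are placeholders, not values used anywhere in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical sketch: send one of the generated HTML emails with the standard library.\n",
"# The SMTP host, credentials, and addresses below are placeholders only.\n",
"import smtplib\n",
"from email.mime.multipart import MIMEMultipart\n",
"from email.mime.text import MIMEText\n",
"\n",
"def send_campaign_email(html_body, subject, sender, recipient,\n",
"                        smtp_host='smtp.example.com', smtp_port=587):\n",
"  message = MIMEMultipart('alternative')\n",
"  message['Subject'] = subject\n",
"  message['From'] = sender\n",
"  message['To'] = recipient\n",
"  message.attach(MIMEText(html_body, 'html'))  # the hyper-personalized HTML built above\n",
"  with smtplib.SMTP(smtp_host, smtp_port) as server:\n",
"    server.starttls()\n",
"    # server.login('smtp-user', 'smtp-password')  # placeholder credentials\n",
"    server.send_message(message)\n",
"\n",
"# Example (hypothetical usage):\n",
"# send_campaign_email(html, item['email_subject'], 'marketing@example.com', 'customer@example.com')"
]
},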
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "eAPgeLTndZTF"
},
"outputs": [],
"source": [
"# To view the bucket\n",
"print(f\"https://console.cloud.google.com/storage/browser/{storage_account}/chocolate-ai/Campaign-Assets-Hyper-Personalized-Email/email-{formatted_date}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VThvmO_SZmF5"
},
"outputs": [],
"source": [
"# View Results\n",
"sql=f\"\"\"SELECT customer_id, html_gcs_filename, html_http_url, html_generated\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE();\"\"\"\n",
"\n",
"print(sql)\n",
"df_process = RunQuery(sql)\n",
"df_process"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "WT3u1L_aVayT"
},
"outputs": [],
"source": [
"filename = f\"email_campaign_{customer_list[0]['customer_id']}.html\"\n",
"IPython.display.HTML(filename=filename)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "XcRVFkNIXNT1"
},
"outputs": [],
"source": [
"filename = f\"email_campaign_{customer_list[1]['customer_id']}.html\"\n",
"IPython.display.HTML(filename=filename)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FfLXgbircpZk"
},
"source": [
"### <font color='#4285f4'>View all results</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "31W7KOUzZs4g"
},
"outputs": [],
"source": [
"# View All Results\n",
"sql=f\"\"\"SELECT *\n",
" FROM `chocolate_ai.customer_hyper_personalized_email`\n",
" WHERE customer_id IN ({customer_id_list_str})\n",
" AND email_date = CURRENT_DATE();\"\"\"\n",
"\n",
"print(sql)\n",
"df_process = RunQuery(sql)\n",
"df_process"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "42IxhtRRrvR-"
},
"source": [
"### <font color='#4285f4'>Clean Up</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "6lF2Z7skFbvf"
},
"outputs": [],
"source": [
"# Placeholder"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ASQ2BPisXDA0"
},
"source": [
"### <font color='#4285f4'>Reference Links</font>\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rTY6xJdZ3ul8"
},
"source": [
"- [Imagen3](https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/image-generation)"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"HMsUvoF4BP7Y",
"m65vp54BUFRi",
"UmyL-Rg4Dr_f",
"JbOjdSP1kN9T",
"EYRHDPdVKBzd",
"HNZanAdJEaPq",
"oMl8zBaOEfy8",
"xPM7R0UdU7Q4",
"c51M89g0Ejmz",
"SLK7IX7QruT3",
"2olP2ZnglJlr",
"ASQ2BPisXDA0"
],
"name": "Campaign-Assets-Hyper-Personalized-Email",
"private_outputs": true,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}