{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "MokqWo8Oiykt"
},
"source": [
"## <font color='#4285f4'>Overview</font>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook uses Gemini to analyze marketing campaigns and identify ideal customer segments for Chocolate AI. You can start with either a campaign description or a video ad, and Gemini will recommend customer segments to target based on 1) a predefined set of segments and 2) a dynamic segment that doesn't already exist. The notebook shows how you can easily query complex segment data using JSON, and it demonstrates how to find matches for a net-new segment using vector search in BigQuery.\n",
"\n",
"Process Flow: \n",
"1. Retrieve pre-defined customer segments from a BigQuery table (${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments)\n",
"2. Text-based Analysis:\n",
" - Fetch the details of the latest generated marketing campaign from another BigQuery table (${project_id}.${bigquery_chocolate_ai_dataset}.campaign)\n",
" - Construct a detailed prompt for Gemini LLM, providing campaign information and asking it to identify relevant pre-defined and novel customer segments along with explanations.\n",
" - Process the LLM response and extract the suggested segments and reasoning.\n",
" - Query BigQuery to count customers matching the recommended pre-defined segments.\n",
"3. Video-based Analysis:\n",
" - Upload a sample advertisement video to a Google Cloud Storage bucket.\n",
" - Call multimodal Gemini, providing the video URI and a prompt to identify customer segments likely receptive to the ad.\n",
"    - Extract and display the recommended pre-defined and novel segments, along with explanations.\n",
"4. Increase reach with dynamic segments powered by vector search\n",
" - Extract keywords from the LLM-generated ideal segment description.\n",
"    - Perform vector search on customer profiles using these keywords to retrieve a list of matching customers (e.g., the top 100).\n",
"\n",
"Author: Paul Ramsey"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HMsUvoF4BP7Y"
},
"source": [
"## <font color='#4285f4'>License</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jQgQkbOvj55d"
},
"source": [
"```\n",
"# Copyright 2024 Google LLC\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License.\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UmyL-Rg4Dr_f"
},
"source": [
"## <font color='#4285f4'>Initialize</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pZcs1uLnI6U2"
},
"source": [
"### Pip Installs and Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xOYsEVSXp6IP"
},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"from functools import reduce\n",
"import google.auth\n",
"import requests\n",
"import json\n",
"import markdown\n",
"\n",
"import logging\n",
"from tenacity import retry, wait_exponential, stop_after_attempt, before_sleep_log, retry_if_exception\n",
"from google.cloud import bigquery\n",
"client = bigquery.Client()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "wMlHl3bnkFPZ"
},
"outputs": [],
"source": [
"# Update these variables to match your environment\n",
"location=\"us-central1\" # Your region\n",
"bigquery_location = \"${bigquery_location}\" # Must be \"us\" or \"eu\"\n",
"\n",
"### Do not change the values in this cell below this line ###\n",
"project_id = !(gcloud config get-value project)\n",
"user = !(gcloud auth list --filter=status:ACTIVE --format=\"value(account)\")\n",
"\n",
"if len(project_id) != 1:\n",
" raise RuntimeError(f\"project_id is not set: {project_id}\")\n",
"project_id = project_id[0]\n",
"\n",
"if len(user) != 1:\n",
" raise RuntimeError(f\"user is not set: {user}\")\n",
"user = user[0]\n",
"\n",
"bucket_name = \"${chocolate_ai_bucket}\"\n",
"\n",
"print(f\"project_id = {project_id}\")\n",
"print(f\"user = {user}\")\n",
"print(f\"location = {location}\")\n",
"print(f\"bigquery_location = {bigquery_location}\")\n",
"print(f\"bucket_name = {bucket_name}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sZ6m_wGrK0YG"
},
"source": [
"### Helper Methods"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "g57ZpqGRgg51"
},
"source": [
"#### RetryCondition(error)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "lT9geulqggwr"
},
"outputs": [],
"source": [
"def RetryCondition(error):\n",
" error_string = str(error)\n",
" print(error_string)\n",
"\n",
" retry_errors = [\n",
" \"RESOURCE_EXHAUSTED\",\n",
" \"No content in candidate\",\n",
" # Add more error messages here as needed\n",
" ]\n",
"\n",
" for retry_error in retry_errors:\n",
" if retry_error in error_string:\n",
" print(\"Retrying...\")\n",
" return True\n",
"\n",
" return False"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JbOjdSP1kN9T"
},
"source": [
"#### restAPIHelper\n",
"Calls the Google Cloud REST API using the current user's credentials."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "40wlwnY4kM11"
},
"outputs": [],
"source": [
"def restAPIHelper(url: str, http_verb: str, request_body: str) -> str:\n",
"  \"\"\"Calls the Google Cloud REST API passing in the current user's credentials\"\"\"\n",
"\n",
" import requests\n",
" import google.auth\n",
" import json\n",
"\n",
" # Get an access token based upon the current user\n",
" creds, project = google.auth.default()\n",
" auth_req = google.auth.transport.requests.Request()\n",
" creds.refresh(auth_req)\n",
" access_token=creds.token\n",
"\n",
" headers = {\n",
" \"Content-Type\" : \"application/json\",\n",
" \"Authorization\" : \"Bearer \" + access_token\n",
" }\n",
"\n",
" if http_verb == \"GET\":\n",
" response = requests.get(url, headers=headers)\n",
" elif http_verb == \"POST\":\n",
" response = requests.post(url, json=request_body, headers=headers)\n",
" elif http_verb == \"PUT\":\n",
" response = requests.put(url, json=request_body, headers=headers)\n",
" elif http_verb == \"PATCH\":\n",
" response = requests.patch(url, json=request_body, headers=headers)\n",
" elif http_verb == \"DELETE\":\n",
" response = requests.delete(url, headers=headers)\n",
" else:\n",
" raise RuntimeError(f\"Unknown HTTP verb: {http_verb}\")\n",
"\n",
" if response.status_code == 200:\n",
" return json.loads(response.content)\n",
" else:\n",
" error = f\"Error restAPIHelper -> ' Status: '{response.status_code}' Text: '{response.text}'\"\n",
" raise RuntimeError(error)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4VMFSz-R3BW2"
},
"source": [
"#### GeminiLLM(prompt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "wnf6nhAc3JwL"
},
"outputs": [],
"source": [
"@retry(wait=wait_exponential(multiplier=1, min=1, max=60), stop=stop_after_attempt(10), retry=retry_if_exception(RetryCondition), before_sleep=before_sleep_log(logging.getLogger(), logging.INFO))\n",
"def GeminiLLM(prompt, model = \"gemini-2.0-flash\", response_schema = None,\n",
" temperature = 1, topP = 1, topK = 32):\n",
"\n",
" # https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#supported_models\n",
" # model = \"gemini-2.0-flash\"\n",
"\n",
" llm_response = None\n",
" if temperature < 0:\n",
" temperature = 0\n",
"\n",
" creds, project = google.auth.default()\n",
"  auth_req = google.auth.transport.requests.Request() # required to refresh the access token\n",
" creds.refresh(auth_req)\n",
" access_token=creds.token\n",
"\n",
" headers = {\n",
" \"Content-Type\" : \"application/json\",\n",
" \"Authorization\" : \"Bearer \" + access_token\n",
" }\n",
"\n",
" # https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference\n",
" url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/publishers/google/models/{model}:generateContent\"\n",
"\n",
" generation_config = {\n",
" \"temperature\": temperature,\n",
" \"topP\": topP,\n",
" \"maxOutputTokens\": 8192,\n",
" \"candidateCount\": 1,\n",
" \"responseMimeType\": \"application/json\",\n",
" }\n",
"\n",
"  # Add in the response schema when it is provided\n",
" if response_schema is not None:\n",
" generation_config[\"responseSchema\"] = response_schema\n",
"\n",
" if model == \"gemini-2.0-flash\":\n",
" generation_config[\"topK\"] = topK\n",
"\n",
" payload = {\n",
" \"contents\": {\n",
" \"role\": \"user\",\n",
" \"parts\": {\n",
" \"text\": prompt\n",
" },\n",
" },\n",
" \"generation_config\": {\n",
" **generation_config\n",
" },\n",
" \"safety_settings\": {\n",
" \"category\": \"HARM_CATEGORY_SEXUALLY_EXPLICIT\",\n",
" \"threshold\": \"BLOCK_LOW_AND_ABOVE\"\n",
" }\n",
" }\n",
"\n",
" response = requests.post(url, json=payload, headers=headers)\n",
"\n",
" if response.status_code == 200:\n",
" try:\n",
" json_response = json.loads(response.content)\n",
" except Exception as error:\n",
" raise RuntimeError(f\"An error occurred parsing the JSON: {error}\")\n",
"\n",
" if \"candidates\" in json_response:\n",
" candidates = json_response[\"candidates\"]\n",
" if len(candidates) > 0:\n",
" candidate = candidates[0]\n",
" if \"content\" in candidate:\n",
" content = candidate[\"content\"]\n",
" if \"parts\" in content:\n",
" parts = content[\"parts\"]\n",
" if len(parts):\n",
" part = parts[0]\n",
" if \"text\" in part:\n",
" text = part[\"text\"]\n",
" llm_response = text\n",
" else:\n",
"                      raise RuntimeError(f\"No text in part: {response.content}\")\n",
"                  else:\n",
"                    raise RuntimeError(f\"No parts in content: {response.content}\")\n",
"                else:\n",
"                  raise RuntimeError(f\"No parts in content: {response.content}\")\n",
"              else:\n",
"                raise RuntimeError(f\"No content in candidate: {response.content}\")\n",
"            else:\n",
"              raise RuntimeError(f\"No candidates: {response.content}\")\n",
"          else:\n",
"            raise RuntimeError(f\"No candidates: {response.content}\")\n",
"\n",
"      # Remove some typical response characters (when asking for a JSON reply)\n",
" llm_response = llm_response.replace(\"```json\",\"\")\n",
" llm_response = llm_response.replace(\"```\",\"\")\n",
" llm_response = llm_response.replace(\"\\n\",\"\")\n",
"\n",
" return llm_response\n",
"\n",
" else:\n",
" raise RuntimeError(f\"Error with prompt:'{prompt}' Status:'{response.status_code}' Text:'{response.text}'\")"
]
},
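The nested `candidates → content → parts → text` checks above can be condensed into one helper. Below is a minimal, self-contained sketch of the same extraction and fence-stripping logic; the `raw` response body and segment value are invented for illustration, not real API output:

```python
import json

# Illustrative generateContent-style response body (values are made up)
raw = json.dumps({
    "candidates": [{
        "content": {"parts": [{"text": "```json\n{\"novel_segment\": \"Late-night gifters\"}\n```"}]}
    }]
})

def extract_text(response_body: str) -> str:
    """Pull the first text part out of a generateContent-style reply and
    strip the Markdown fences Gemini sometimes wraps around JSON."""
    data = json.loads(response_body)
    candidates = data.get("candidates", [])
    if not candidates:
        raise RuntimeError(f"No candidates: {response_body}")
    parts = candidates[0].get("content", {}).get("parts", [])
    if not parts or "text" not in parts[0]:
        raise RuntimeError(f"No text in response: {response_body}")
    text = parts[0]["text"]
    for fence in ("```json", "```"):  # order matters: strip the longer token first
        text = text.replace(fence, "")
    return text.strip()

segments = json.loads(extract_text(raw))
print(segments["novel_segment"])
```

Raising on missing keys (rather than returning `None`) is what lets the tenacity retry decorator treat "No content in candidate" responses as transient and retry them.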
{
"cell_type": "markdown",
"metadata": {
"id": "CbBALqHr3OsE"
},
"source": [
"#### Gemini LLM - Multimodal"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "n6LMtgHX3SXU"
},
"outputs": [],
"source": [
"@retry(wait=wait_exponential(multiplier=1, min=1, max=60), stop=stop_after_attempt(10), retry=retry_if_exception(RetryCondition), before_sleep=before_sleep_log(logging.getLogger(), logging.INFO))\n",
"def GeminiLLM_Multimodal(\n",
" text_prompt, video_uri, model=\"gemini-2.0-flash\", response_schema=None, temperature=1, topP=1, topK=32\n",
"):\n",
" \"\"\"\n",
"  Calls the Gemini API with a text prompt and a video.\n",
"\n",
"  Args:\n",
"    text_prompt: The text prompt.\n",
"    video_uri: The Cloud Storage URI of the video to analyze.\n",
" model: The Gemini model to use.\n",
" response_schema: Optional response schema.\n",
" temperature: Temperature parameter for the model.\n",
" topP: Top-p parameter for the model.\n",
" topK: Top-k parameter for the model.\n",
"\n",
" Returns:\n",
" The Gemini response as a string.\n",
" \"\"\"\n",
"\n",
" creds, project = google.auth.default()\n",
" auth_req = google.auth.transport.requests.Request()\n",
" creds.refresh(auth_req)\n",
" access_token = creds.token\n",
"\n",
" headers = {\n",
" \"Content-Type\": \"application/json\",\n",
" \"Authorization\": \"Bearer \" + access_token\n",
" }\n",
"\n",
" url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/publishers/google/models/{model}:generateContent\"\n",
"\n",
"\n",
" generation_config = {\n",
" \"temperature\": temperature,\n",
" \"topP\": topP,\n",
" \"maxOutputTokens\": 8192,\n",
" \"candidateCount\": 1,\n",
"      \"responseMimeType\": \"application/json\" # applies to the model's text output\n",
" }\n",
"\n",
" if response_schema is not None:\n",
" generation_config[\"responseSchema\"] = response_schema\n",
"\n",
" if model in [\"gemini-2.0-flash\"]:\n",
" generation_config[\"topK\"] = topK\n",
"\n",
"  # Construct the payload with the video URI\n",
" payload = {\n",
" \"contents\": [{\n",
" \"role\": \"user\",\n",
" \"parts\": [\n",
" {\n",
" \"text\": text_prompt\n",
" },\n",
" {\n",
" \"fileData\": {\n",
" \"fileUri\": video_uri,\n",
" \"mimeType\": \"video/mp4\"\n",
" }\n",
" }\n",
" ]\n",
" }],\n",
" \"generation_config\": generation_config,\n",
" \"safety_settings\": {\n",
" \"category\": \"HARM_CATEGORY_SEXUALLY_EXPLICIT\",\n",
" \"threshold\": \"BLOCK_ONLY_HIGH\"\n",
" }\n",
" }\n",
"\n",
" response = requests.post(url, json=payload, headers=headers)\n",
"\n",
" if response.status_code == 200:\n",
" return json.loads(response.text)\n",
" else:\n",
" raise RuntimeError(\n",
" f\"Error with prompt:'{text_prompt}' Status:'{response.status_code}' Text:'{response.text}'\"\n",
" )\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FumgN4cWV5An"
},
"source": [
"#### RunQuery()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "FrHOgq0sV45p"
},
"outputs": [],
"source": [
"def RunQuery(sql, job_config = None):\n",
" import time\n",
"\n",
"  sql = sql.strip()\n",
"  if (sql.upper().startswith(\"SELECT\") or sql.upper().startswith(\"WITH\")):\n",
"    df_result = client.query(sql).to_dataframe()\n",
"    return df_result\n",
"  else:\n",
"    if job_config is None:\n",
"      job_config = bigquery.QueryJobConfig(priority=bigquery.QueryPriority.INTERACTIVE)\n",
"    query_job = client.query(sql, job_config=job_config)\n",
"\n",
" # Check on the progress by getting the job's updated state.\n",
" query_job = client.get_job(\n",
" query_job.job_id, location=query_job.location\n",
" )\n",
" print(\"Job {} is currently in state {} with error result of {}\".format(query_job.job_id, query_job.state, query_job.error_result))\n",
"\n",
" while query_job.state != \"DONE\":\n",
" time.sleep(2)\n",
" query_job = client.get_job(\n",
" query_job.job_id, location=query_job.location\n",
" )\n",
" print(\"Job {} is currently in state {} with error result of {}\".format(query_job.job_id, query_job.state, query_job.error_result))\n",
"\n",
"    if query_job.error_result is None:\n",
" return True\n",
" else:\n",
" return False"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "apiQ6P-4gZUh"
},
"source": [
"#### PrettyPrintJson(json_string)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "aUrsncqA3lqK"
},
"outputs": [],
"source": [
"def PrettyPrintJson(json_string):\n",
"  # Validate the JSON and return it as a normalized single-line string\n",
"  json_object = json.loads(json_string)\n",
"  return json.dumps(json_object)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6iSUuDYYExOz"
},
"source": [
"#### DisplayMarkdown()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8aZDTHvNExIK"
},
"outputs": [],
"source": [
"def DisplayMarkdown(text):\n",
" \"\"\"\n",
" Displays text in markdown/HTML format in a Colab notebook.\n",
"\n",
" Args:\n",
" text: The text to display. Can be plain text or Markdown.\n",
" \"\"\"\n",
"\n",
" formatted_text = markdown.markdown(text) # Convert to HTML if necessary\n",
" display(HTML(formatted_text))\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "axOnALYKCtHl"
},
"source": [
"### Campaign Methods"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "97MGoAdc_1Vh"
},
"source": [
"#### GetPreDefinedSegments()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OwiODKR-6XGM"
},
"outputs": [],
"source": [
"def GetPreDefinedSegments():\n",
" sql = f\"\"\"SELECT JSON_OBJECT(\n",
" 'pre_defined_customer_segments', JSON_ARRAY (\n",
" (\n",
" --benefits_sought\n",
"    SELECT JSON_OBJECT('benefits_sought', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT benefits_sought AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
" (\n",
" --browsing_behavior\n",
" SELECT JSON_OBJECT('browsing_behavior', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT browsing_behavior AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
" (\n",
" --occasion_timing\n",
" SELECT JSON_OBJECT('occasion_timing', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT occasion_timing AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
" (\n",
" --purchase_history\n",
" SELECT JSON_OBJECT('purchase_history', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT purchase_history AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
" (\n",
" --spending_habits\n",
" SELECT JSON_OBJECT('spending_habits', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT spending_habits AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --usage_frequency\n",
" SELECT JSON_OBJECT('usage_frequency', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT usage_frequency AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --user_status\n",
" SELECT JSON_OBJECT('user_status', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT user_status AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --age\n",
" SELECT JSON_OBJECT('age', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT age AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --education\n",
" SELECT JSON_OBJECT('education', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT education AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --ethnicity\n",
" SELECT JSON_OBJECT('ethnicity', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT ethnicity AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --family_size\n",
" SELECT JSON_OBJECT('family_size', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT family_size AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --gender\n",
" SELECT JSON_OBJECT('gender', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT gender AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --generation\n",
" SELECT JSON_OBJECT('generation', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT generation AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --income\n",
" SELECT JSON_OBJECT('income', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT income AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" WHERE income NOT LIKE 'Unknown%'\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --language\n",
" SELECT JSON_OBJECT('language', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT language AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --marital_status\n",
" SELECT JSON_OBJECT('marital_status', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT marital_status AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --occupation\n",
" SELECT JSON_OBJECT('occupation', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT occupation AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --challenges\n",
" SELECT JSON_OBJECT('challenges', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT challenges AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --goals\n",
" SELECT JSON_OBJECT('goals', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT goals AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --pain_points\n",
" SELECT JSON_OBJECT('pain_points', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT pain_points AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --priorities\n",
" SELECT JSON_OBJECT('priorities', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT priorities AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --specific_needs\n",
" SELECT JSON_OBJECT('specific_needs', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT specific_needs AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --interests\n",
" SELECT JSON_OBJECT('interests', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT interests AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --lifestyle\n",
" SELECT JSON_OBJECT('lifestyle', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT lifestyle AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --motivations\n",
" SELECT JSON_OBJECT('motivations', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT motivations AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --personality\n",
" SELECT JSON_OBJECT('personality', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT personality AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --customer_values\n",
" SELECT JSON_OBJECT('customer_values', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT customer_values AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --social_media_platforms\n",
" SELECT JSON_OBJECT('social_media_platforms', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT social_media_platforms AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" ),\n",
"\n",
" (\n",
" --tech_savviness\n",
" SELECT JSON_OBJECT('tech_savviness', ARRAY_AGG(DISTINCT TRIM(segment, \" \")))\n",
" FROM (\n",
" SELECT DISTINCT tech_savviness AS segments\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" ), UNNEST(SPLIT(segments, ',')) AS segment\n",
" )\n",
" )\n",
" )\"\"\"\n",
"\n",
" result = RunQuery(sql)\n",
" return result"
]
},
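Each subquery in `GetPreDefinedSegments()` applies the same pattern: split a comma-separated column into individual values, trim whitespace, and keep only distinct entries. A small pure-Python sketch of that normalization, using invented sample rows:

```python
# Invented comma-separated segment rows, mimicking one profile column
rows = ["Dark Chocolate, Gifting", "Gifting,Seasonal Treats", "Dark Chocolate"]

def distinct_segments(values):
    """Split comma-separated segment strings, trim whitespace, and dedupe,
    preserving first-seen order (SQL's ARRAY_AGG(DISTINCT ...) is unordered)."""
    seen = []
    for value in values:
        for segment in value.split(","):
            segment = segment.strip()
            if segment and segment not in seen:
                seen.append(segment)
    return seen

print(distinct_segments(rows))  # ['Dark Chocolate', 'Gifting', 'Seasonal Treats']
```

Doing this in SQL keeps the normalization next to the data, which is why the notebook repeats the SPLIT/TRIM/ARRAY_AGG pattern once per profile column instead of post-processing in Python.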
{
"cell_type": "markdown",
"metadata": {
"id": "YVx2ex_1BtqR"
},
"source": [
"#### GetMenuItem(menu_id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OYh4A4jHBt0a"
},
"outputs": [],
"source": [
"def GetMenuItem(menu_id):\n",
" sql = f\"\"\"SELECT *\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.menu`\n",
" WHERE menu_id = {menu_id}\n",
" \"\"\"\n",
" result = RunQuery(sql)\n",
" return result"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "PexYQ-Lb_4vp"
},
"source": [
"#### GetSegmentsFromCampaign()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "KR4NXhRHWfnM"
},
"outputs": [],
"source": [
"def GetSegmentsFromCampaign(campaign):\n",
"\n",
" # Extract campaign variables from the campaign object\n",
" campaign_id = campaign['campaign_id']\n",
" menu_id = campaign['menu_id']\n",
" campaign_name = campaign['campaign_name']\n",
" campaign_description = campaign['campaign_description']\n",
" campaign_goal = campaign['campaign_goal']\n",
" target_audience = campaign['target_audience']\n",
" marketing_channels = campaign['marketing_channels']\n",
" budget = campaign['budget']\n",
" explanation = campaign['explanation']\n",
" campaign_outcomes = campaign['campaign_outcomes']\n",
" campaign_start_date = campaign['campaign_start_date']\n",
" campaign_end_date = campaign['campaign_end_date']\n",
" campaign_created_date = campaign['campaign_created_date']\n",
"\n",
" # Get menu item details\n",
" menu_item = GetMenuItem(menu_id)\n",
" menu_item = menu_item.iloc[0]\n",
" menu_item_name = menu_item['menu_name']\n",
" menu_item_description = menu_item['menu_description']\n",
" menu_item_price = menu_item['menu_price']\n",
"\n",
" # Get pre-defined segments from the customer_marketing_profile table\n",
" pre_defined_segments = GetPreDefinedSegments()\n",
" pre_defined_segments = pre_defined_segments.iloc[0,0]\n",
"\n",
" # Define prompt\n",
" prompt = f\"\"\"You are a Marketing Analyst, and you are reviewing Chocolate AI's latest\n",
" marketing campaign called: {campaign_name}. Your task is to determine the ideal\n",
" customer segments that should be targeted with the new campaign. You should choose\n",
" at least two segments from the following set of pre-defined customer segments:\n",
"\n",
" {pre_defined_segments}\n",
"\n",
" You should also define one segment that doesn't exist in the pre-defined set of\n",
" segments that describes the perfect target audience for the new campaign.\n",
"\n",
" Here are the details of the campaign:\n",
"\n",
" Campaign Name: {campaign_name}\n",
" Campaign Description: {campaign_description}\n",
"  Menu Item being Promoted: {menu_item_name} - {menu_item_description} (Price: {menu_item_price})\n",
" Campaign Goal: {campaign_goal}\n",
" Target Audience: {target_audience}\n",
" Marketing Channels: {marketing_channels}\n",
" Budget: {budget}\n",
" Explanation: {explanation}\n",
" Campaign Outcomes: {campaign_outcomes}\n",
" Campaign Start Date: {campaign_start_date}\n",
" Campaign End Date: {campaign_end_date}\n",
"\n",
" Now choose at least two segments from the set of pre-defined customer segments,\n",
" and come up with one segment that doesn't exist in the pre-defined set of segments\n",
" that describes the perfect target audience for the new campaign.\n",
"\n",
" Think step by step and explain your reasoning.\"\"\"\n",
"\n",
" response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"pre_defined_segments\", \"novel_segment\", \"explanation\"\n",
" ],\n",
" \"properties\": {\n",
" \"pre_defined_segments\": {\n",
" \"type\": \"array\",\n",
" \"description\": \"An array of customer segments chosen from a pre-defined list.\",\n",
" \"items\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"A segment that is chosen from a pre-defined list.\"\n",
" }\n",
" },\n",
" \"novel_segment\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"A segment that is created to fit the ideal audience and does not exist in the pre-defined list.\"\n",
" },\n",
" \"explanation\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"An explanation for your reasoning on why you picked these segments.\"\n",
" }\n",
" }\n",
" }\n",
"\n",
" result = GeminiLLM(prompt, response_schema=response_schema)\n",
" return result"
]
},
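Because `GeminiLLM` is called with a `response_schema`, the reply should be a JSON object containing the three required keys. A quick sanity check before trusting the output; the `reply` string below is a hypothetical example, not a real model response:

```python
import json

# Hypothetical structured reply shaped like the response_schema above
reply = (
    '{"pre_defined_segments": ["Gifting", "Premium Buyers"], '
    '"novel_segment": "Parisian chocolate tourists", '
    '"explanation": "Both segments align with the campaign goal."}'
)

segments = json.loads(reply)

# Verify every key the schema marks as required is actually present
required = {"pre_defined_segments", "novel_segment", "explanation"}
missing = required - segments.keys()
if missing:
    raise RuntimeError(f"Schema violation, missing keys: {missing}")

print(segments["novel_segment"])
```

Schema-constrained decoding makes such violations rare, but validating before the downstream BigQuery count query fails with a confusing error is cheap insurance.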
{
"cell_type": "markdown",
"metadata": {
"id": "IYiVTw90dTTX"
},
"source": [
"#### GetSegmentsFromVideo(video_uri)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Mcz8Ut7ucSAV"
},
"outputs": [],
"source": [
"def GetSegmentsFromVideo(video_uri):\n",
"\n",
" # Get pre-defined segments from the customer_marketing_profile table\n",
" pre_defined_segments = GetPreDefinedSegments()\n",
" pre_defined_segments = pre_defined_segments.iloc[0,0]\n",
"\n",
" # Define prompt\n",
" text_prompt = f\"\"\"You are a marketing analyst, and you are reviewing a new video\n",
" ad created by your Content & Creatives team. Your task is to watch the new ad\n",
" and determine the customer segments that would be most receptive to the new ad\n",
" and also the most likely to make a purchase after seeing the ad.\n",
"\n",
" You should choose at least two segments from the following set of pre-defined\n",
" customer segments:\n",
"\n",
" {pre_defined_segments}\n",
"\n",
" You should also define one segment that doesn't exist in the pre-defined set of\n",
" segments that describes the perfect target audience for the new campaign.\n",
"\n",
" Now watch the supplied video, then choose at least two segments from the set of\n",
" pre-defined customer segments, and come up with one segment that doesn't exist\n",
" in the pre-defined set of segments that describes the perfect target audience\n",
" for the new ad.\n",
"\n",
" Think step by step and explain your reasoning.\"\"\"\n",
"\n",
" response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"pre_defined_segments\", \"novel_segment\", \"explanation\"\n",
" ],\n",
" \"properties\": {\n",
" \"pre_defined_segments\": {\n",
" \"type\": \"array\",\n",
" \"description\": \"An array of customer segments chosen from a pre-defined list.\",\n",
" \"items\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"A segment that is chosen from a pre-defined list.\"\n",
" }\n",
" },\n",
" \"novel_segment\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"A segment that is created to fit the ideal audience and does not exist in the pre-defined list.\"\n",
" },\n",
" \"explanation\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"An explanation for your reasoning on why you picked these segments.\"\n",
" }\n",
" }\n",
" }\n",
"\n",
" response = GeminiLLM_Multimodal(text_prompt, video_uri, response_schema=response_schema)\n",
"\n",
" return response"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EYRHDPdVKBzd"
},
"source": [
"## <font color='#4285f4'>Get Segments from Campaign Description</font>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Kcqp37O39uSr"
},
"outputs": [],
"source": [
"# Get latest generated campaign\n",
"sql = \"\"\"SELECT * FROM `${project_id}.${bigquery_chocolate_ai_dataset}.campaign`\n",
"         ORDER BY campaign_created_date DESC\n",
" LIMIT 1;\"\"\"\n",
"\n",
"campaign = RunQuery(sql)\n",
"campaign = campaign.iloc[0]\n",
"\n",
"recommended_segments = GetSegmentsFromCampaign(campaign)\n",
"recommended_segments = json.loads(recommended_segments)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_pt6bshvDKqZ"
},
"outputs": [],
"source": [
"result_string = f\"\"\"\n",
"## Suggested Customer Segments\n",
"\n",
"**Campaign Name:**\n",
"{campaign['campaign_name']}\n",
"\n",
"**Campaign Description:**\n",
"{campaign['campaign_description']}\n",
"\n",
"**Suggested customer segments (existing):**\n",
"{recommended_segments[\"pre_defined_segments\"]}\n",
"\n",
"**Ideal customer segment (new segment):**\n",
"{recommended_segments[\"novel_segment\"]}\n",
"\n",
"**Reasoning for segment choices:**\n",
"{recommended_segments['explanation']}\n",
"\"\"\"\n",
"\n",
"DisplayMarkdown(result_string)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Gysuhni0G2hH"
},
"source": [
"Let's assume that Gemini chose the following values:\n",
"\n",
"- Existing customer segments to target: `{\"purchase_history\": \"High Spender\", \"social_media_platforms\": [\"Instagram\", \"Facebook\"]}`\n",
"- Ideal customer segment: \"Luxury Foodie: Individuals with discerning palates who seek unique and high-quality culinary experiences. They are interested in the story behind their food, appreciate artisanal craftsmanship, and are willing to spend a premium for exceptional taste and ingredients.\"\n",
"\n",
"We can search through our customer data to get a list of customers we should target for this campaign based on these criteria."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bFqD5NxRDfBl"
},
"outputs": [],
"source": [
"sql = \"\"\"SELECT COUNT(*) AS prospect_count\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" WHERE purchase_history = 'High Spender'\n",
" OR purchase_history = 'Frequent Buyer';\"\"\"\n",
"\n",
"prospect_count = RunQuery(sql)\n",
"prospect_count = prospect_count.iloc[0]\n",
"\n",
"print(f\"There are {prospect_count['prospect_count']} customers in the recommended pre-defined segments.\")\n",
"\n",
"sql = \"\"\"SELECT *\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" WHERE purchase_history = 'High Spender'\n",
" OR purchase_history = 'Frequent Buyer';\"\"\"\n",
"\n",
"prospect_list = RunQuery(sql)\n",
"prospect_list"
]
},
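{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you want to hand this list to a downstream campaign tool, one option (a minimal sketch, assuming `prospect_list` is the pandas DataFrame returned above; the filename `campaign_prospects.csv` is purely illustrative) is to export it to a CSV file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch: export the prospect list for a downstream campaign tool.\n",
"# The filename is illustrative; adjust it for your environment.\n",
"prospect_list.to_csv('campaign_prospects.csv', index=False)\n",
"print(f\"Exported {len(prospect_list)} prospects to campaign_prospects.csv\")"
]
},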
{
"cell_type": "markdown",
"metadata": {
"id": "tS1IwLSNIW8i"
},
"source": [
"We can also use Gemini to extract keywords from the description of the new ideal customer segment, then use vector search on our customer_marketing_profile data to get a list of customers who match these criteria."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dEs51PpKIWMR"
},
"outputs": [],
"source": [
"prompt = f\"\"\"Extract up to 7 key words from the following customer segment description:\n",
"\n",
"{recommended_segments[\"novel_segment\"]}\n",
"\"\"\"\n",
"\n",
"response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"key_words\"\n",
" ],\n",
" \"properties\": {\n",
" \"key_words\": {\n",
" \"type\": \"array\",\n",
" \"description\": \"An array of key words extracted from the supplied text.\",\n",
" \"items\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"A key term extracted from the supplied text.\"\n",
" }\n",
" }\n",
" }\n",
" }\n",
"\n",
"key_words = GeminiLLM(prompt, response_schema=response_schema)\n",
"key_words = json.loads(key_words)\n",
"key_words_string = \" \".join(key_words['key_words'])\n",
"print(f\"Key words: {key_words_string}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Wuk4iwsGJ-RW"
},
"outputs": [],
"source": [
"# Get the top 1000 customers that match the target customer segment\n",
"row_limit = 1000\n",
"sql = f\"\"\"WITH target_customers AS (\n",
" SELECT\n",
" distance,\n",
" base.customer_id,\n",
" base.customer_marketing_insights,\n",
" base.customer_profile_data,\n",
" base.customer_loyalty_data,\n",
" base.customer_segmentation_data\n",
" --base.customer_marketing_insights_embedding\n",
" FROM VECTOR_SEARCH(\n",
" -- base table or subquery\n",
" TABLE `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile`,\n",
"\n",
" -- embedding column to search in base table - must be of type ARRAY<FLOAT64>\n",
" 'customer_marketing_insights_embedding',\n",
"\n",
" -- query table or subquery - this is where you generate the search embedding\n",
" (\n",
" SELECT ml_generate_embedding_result, content AS query\n",
" FROM ML.GENERATE_EMBEDDING(\n",
" MODEL `${project_id}.${bigquery_chocolate_ai_dataset}.textembedding_model`,\n",
" (\n",
" -- Search term\n",
" SELECT \"{key_words_string}\" AS content\n",
" ),\n",
" STRUCT(\n",
" TRUE AS flatten_json_output,\n",
" 'SEMANTIC_SIMILARITY' as task_type,\n",
" 768 AS output_dimensionality\n",
" )\n",
" )\n",
" ),\n",
" top_k => {row_limit}\n",
" ) )\n",
" SELECT distance,\n",
" target_customers.customer_id,\n",
" cus.customer_name,\n",
" cus.customer_email,\n",
" cus.customer_yob,\n",
" target_customers.customer_marketing_insights,\n",
" target_customers.customer_profile_data,\n",
" target_customers.customer_loyalty_data,\n",
" target_customers.customer_segmentation_data\n",
" FROM target_customers\n",
" JOIN `${project_id}.${bigquery_chocolate_ai_dataset}.customer` cus\n",
" ON target_customers.customer_id = cus.customer_id\n",
" ORDER BY distance;\n",
" \"\"\"\n",
"\n",
"result = RunQuery(sql)\n",
"result"
]
},
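{
"cell_type": "markdown",
"metadata": {},
"source": [
"VECTOR_SEARCH returns a `distance` for each match, where smaller values mean a customer's marketing insights are semantically closer to the keyword query. As a minimal sketch (assuming `result` is the pandas DataFrame returned above; the `0.5` cutoff is hypothetical and should be tuned for your embedding model and data), you could trim the list to only the closest matches:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical cutoff: tune this value for your embedding model and data.\n",
"distance_cutoff = 0.5\n",
"close_matches = result[result['distance'] <= distance_cutoff]\n",
"print(f\"{len(close_matches)} of {len(result)} matches fall within the distance cutoff.\")"
]
},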
{
"cell_type": "markdown",
"metadata": {
"id": "Kq1b6W0YKAzm"
},
"source": [
"## <font color='#4285f4'>Get Segments from Video</font>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "D-Jv3zKwf3KA"
},
"source": [
"### Copy Video to Local Bucket"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "vLIjv5AaJqTL"
},
"outputs": [],
"source": [
"# Copy the sample ad video to your local bucket\n",
"source_video_uri = \"gs://data-analytics-golden-demo/chocolate-ai/v1/Campaign-Assets-Text-to-Video-01/story-01/full-video-with-audio-en-GB.mp4\"\n",
"target_video_uri = f\"gs://{bucket_name}/chocolate_ai/videos/chocolate-ai_story-HITL-01_full-video-with-audio-en-GB.mp4\"\n",
"\n",
"!gsutil -m cp {source_video_uri} {target_video_uri}\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ixu4Ptebf5wo"
},
"source": [
"### Watch the New Marketing Video"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "EYu_0qI7emJJ"
},
"outputs": [],
"source": [
"# View the video\n",
"uri_index = target_video_uri.index(bucket_name)\n",
"video_uri = 'https://storage.cloud.google.com/' + target_video_uri[uri_index:]\n",
"\n",
"HTML(f\"\"\"\n",
"<video width=600 height=337 controls autoplay>\n",
" <source src=\"{video_uri}\" type=\"video/mp4\">\n",
"</video>\n",
"\"\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "I-BVMQfogbt1"
},
"source": [
"### Get Recommended Segments for Ad from Gemini"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "nnomvE_EpMh-"
},
"outputs": [],
"source": [
"# Have Gemini watch the ad and suggest target segments\n",
"result = GetSegmentsFromVideo(target_video_uri)\n",
"\n",
"json_result = json.loads(result['candidates'][0]['content']['parts'][0]['text'])\n",
"pre_defined_segments = json_result['pre_defined_segments']\n",
"novel_segment = json_result['novel_segment']\n",
"explanation = json_result['explanation']\n",
"\n",
"result_string = f\"\"\"\n",
"## Suggested Customer Segments for Video\n",
"\n",
"**Suggested customer segments (existing):**\n",
"{pre_defined_segments}\n",
"\n",
"**Ideal customer segment (new segment):**\n",
"{novel_segment}\n",
"\n",
"**Reasoning for segment choices:**\n",
"{explanation}\n",
"\"\"\"\n",
"\n",
"DisplayMarkdown(result_string)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2jJTNc92sIY0"
},
"source": [
"Let's assume that Gemini chose the following values:\n",
"\n",
"- Existing customer segments to target: ['Special Occasions', 'High Spender']\n",
"- Ideal customer segment: \"Luxury Dessert Enthusiast\"\n",
"\n",
"We can search through our customer data to get a list of customers we should target for this campaign based on these criteria."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "MshapYjXsudH"
},
"outputs": [],
"source": [
"sql = \"\"\"SELECT COUNT(*) AS prospect_count\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" WHERE purchase_history = 'High Spender'\n",
" OR occasion_timing = 'Special Occasions';\"\"\"\n",
"\n",
"prospect_count = RunQuery(sql)\n",
"prospect_count = prospect_count.iloc[0]\n",
"\n",
"print(f\"There are {prospect_count['prospect_count']} customers in the recommended pre-defined segments.\")\n",
"\n",
"sql = \"\"\"SELECT *\n",
" FROM `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile_segments`\n",
" WHERE purchase_history = 'High Spender'\n",
" OR occasion_timing = 'Special Occasions';\"\"\"\n",
"\n",
"prospect_list = RunQuery(sql)\n",
"prospect_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "swjNwrTqsudI"
},
"source": [
"We can also use Gemini to generate keywords related to the new ideal customer segment, then use vector search on our customer_marketing_profile data to get a list of customers who match these criteria."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "hvVJvVO4sudI"
},
"outputs": [],
"source": [
"prompt = f\"\"\"Generate up to 7 key words that describe a customer who fits into the following customer segment:\n",
"\n",
"{novel_segment}\n",
"\"\"\"\n",
"\n",
"response_schema = {\n",
" \"type\": \"object\",\n",
" \"required\": [\n",
" \"key_words\"\n",
" ],\n",
" \"properties\": {\n",
" \"key_words\": {\n",
" \"type\": \"array\",\n",
" \"description\": \"An array of key words extracted from the supplied text.\",\n",
" \"items\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"A key term extracted from the supplied text.\"\n",
" }\n",
" }\n",
" }\n",
" }\n",
"\n",
"key_words = GeminiLLM(prompt, response_schema=response_schema)\n",
"key_words = json.loads(key_words)\n",
"key_words_string = \" \".join(key_words['key_words'])\n",
"print(f\"Key words: {key_words_string}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Niq8YYlesudJ"
},
"outputs": [],
"source": [
"# Get the top 1000 customers that match the target customer segment\n",
"row_limit = 1000\n",
"sql = f\"\"\"WITH target_customers AS (\n",
" SELECT\n",
" distance,\n",
" base.customer_id,\n",
" base.customer_marketing_insights,\n",
" base.customer_profile_data,\n",
" base.customer_loyalty_data,\n",
" base.customer_segmentation_data\n",
" --base.customer_marketing_insights_embedding\n",
" FROM VECTOR_SEARCH(\n",
" -- base table or subquery\n",
" TABLE `${project_id}.${bigquery_chocolate_ai_dataset}.customer_marketing_profile`,\n",
"\n",
" -- embedding column to search in base table - must be of type ARRAY<FLOAT64>\n",
" 'customer_marketing_insights_embedding',\n",
"\n",
" -- query table or subquery - this is where you generate the search embedding\n",
" (\n",
" SELECT ml_generate_embedding_result, content AS query\n",
" FROM ML.GENERATE_EMBEDDING(\n",
" MODEL `${project_id}.${bigquery_chocolate_ai_dataset}.textembedding_model`,\n",
" (\n",
" -- Search term\n",
" SELECT \"{key_words_string}\" AS content\n",
" ),\n",
" STRUCT(\n",
" TRUE AS flatten_json_output,\n",
" 'SEMANTIC_SIMILARITY' as task_type,\n",
" 768 AS output_dimensionality\n",
" )\n",
" )\n",
" ),\n",
" top_k => {row_limit}\n",
" ) )\n",
" SELECT distance,\n",
" target_customers.customer_id,\n",
" cus.customer_name,\n",
" cus.customer_email,\n",
" cus.customer_yob,\n",
" target_customers.customer_marketing_insights,\n",
" target_customers.customer_profile_data,\n",
" target_customers.customer_loyalty_data,\n",
" target_customers.customer_segmentation_data\n",
" FROM target_customers\n",
" JOIN `${project_id}.${bigquery_chocolate_ai_dataset}.customer` cus\n",
" ON target_customers.customer_id = cus.customer_id\n",
" ORDER BY distance ASC;\n",
" \"\"\"\n",
"\n",
"result = RunQuery(sql)\n",
"result"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ASQ2BPisXDA0"
},
"source": [
"## <font color='#4285f4'>Reference Links</font>\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "D2TgWfsR9uSr"
},
"source": [
"- [Generate multimodal content with the Gemini Enterprise API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference)\n",
"- [BigQuery JSON Functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions)\n",
"- [Controlled Generation with Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/control-generated-output)\n",
"- [Working with JSON Data in BigQuery](https://cloud.google.com/bigquery/docs/json-data)\n",
"- [Generate Embeddings in BigQuery with ML.GENERATE_EMBEDDING](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-embedding)\n",
"- [BigQuery Vector Indexes](https://cloud.google.com/bigquery/docs/vector-index)\n",
"- [BigQuery VECTOR_SEARCH() Function](https://cloud.google.com/bigquery/docs/reference/standard-sql/search_functions#vector_search)"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"42IxhtRRrvR-"
],
"name": "DB-GMA-Create-Campaign-Customer-Segmentation",
"private_outputs": true,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}