colab-enterprise/Create-Campaign-Recommendations.ipynb (1,847 lines of code) (raw):

{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "2eJaZU5YAQUf" }, "source": [ "## <font color='#4285f4'>Overview</font>" ] }, { "cell_type": "markdown", "metadata": { "id": "sk7VH1mzWFT9" }, "source": [ "This notebook demonstrates a process for generating marketing campaign recommendations for Chocolate AI, a fictional French chocolatier and bakery. It leverages BigQuery for data storage and analysis, Gemini (a large language model) for creative content generation, and TimesFM (a time series forecasting model) for predicting campaign performance. Informed by your existing data, Gen AI can create powerful campaign recommendations to jumpstart your marketing team’s efforts. Marketing projects that previously took weeks to plan and brainstorm can be rapidly prototyped using Gen AI.\n", "\n", "Process Flow:\n", "1. Define Campaign Recommendation Functions:\n", " - GenerateCampaignRecommendations: Uses Gemini LLM to suggest campaign ideas based on provided goals, product information, and budget.\n", " - GetRankedProducts: Queries BigQuery to identify under-performing products based on sales changes over various time periods (daily, weekly, monthly).\n", " - GetFinalProductRanking: Combines rankings from different time periods to produce a final list of products to consider for promotion.\n", " - Visualize*: Several functions to create informative graphs for visualizing product rankings and campaign performance.\n", " - GenerateCampaignPerformance: Simulates randomized campaign performance data influenced by budget.\n", "2. 
Campaign Generation Workflow:\n", " - Step 1: Product Ranking:\n", " - Retrieves and visualizes rankings of the lowest-performing products over different time periods.\n", " - Combines these rankings into a final ranked list.\n", " - Uploads the visualization to Google Cloud Storage and uses Gemini (with multi-modal input) to interpret the graph and suggest products for promotion.\n", " - Step 2: Discount Optimization:\n", " - Based on the selected product, uses TimesFM to simulate the impact of different discount levels on sales, recommending a discount to achieve the desired sales increase.\n", " - Step 3: Campaign Ideation:\n", " - Fetches product details from BigQuery.\n", " - Calls GenerateCampaignRecommendations to create campaign ideas, including name, description, target audience, marketing channels, and expected outcomes.\n", " - Step 4: Campaign Performance Simulation:\n", " - Simulates the campaign's performance multiple times using GenerateCampaignPerformance to visualize potential outcomes.\n", " - Step 5: Mock Campaign Execution and Results:\n", " - Runs a final simulation to represent actual campaign results.\n", " - Visualizes the performance of the simulated campaign.\n", " - Saves the campaign details and simulated performance data into the BigQuery tables.\n", "\n", "Author: Paul Ramsey" ] }, { "cell_type": "markdown", "metadata": { "id": "HMsUvoF4BP7Y" }, "source": [ "## <font color='#4285f4'>License</font>" ] }, { "cell_type": "markdown", "metadata": { "id": "jQgQkbOvj55d" }, "source": [ "```\n", "# Copyright 2024 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# 
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License.\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "5OSlvfBidakD" }, "source": [ "## <font color='#4285f4'>Deploy TimesFM</font>" ] }, { "cell_type": "markdown", "metadata": { "id": "G3qHbG1QdeeV" }, "source": [ "1. Open Vertex Model Garden\n", " - https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/timesfm\n", "2. Click the Deploy button\n", "3. Select\n", " - Resource Id: google/timesfm-v20240828\n", " - Model Name: (leave default - name does not matter) \n", " - Endpoint name: (leave default - name does not matter)\n", " - Region: us-central1 (if you change this, you must also update the **Initialize** variables below)\n", " - Machine spec: (leave default - n1-standard-8)\n", "4. Click Deploy\n", "5. Wait 20-45 minutes\n", "6. Open Vertex Model Registry\n", " - https://console.cloud.google.com/vertex-ai/models\n", "7. Click on the model name\n", "8. Click on the model name under \"Deploy your model\"\n", "9. Click on \"Sample Request\" (at the top)\n", "10. Copy the endpoint id (e.g. ```ENDPOINT_ID=\"8770814150373801984\"```)\n", "11. 
Update the variable endpoint_id in the **Initialize** code below.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### TimesFM Deployment Video" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[![TimesFM Deployment Video](https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/adam-paternostro-video.png)](https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/Campaign-Performance-Forecasting-TimesFM-Install.mp4)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.display import HTML\n", "\n", "HTML(\"\"\"\n", "<h2>Deploying TimesFM to a Vertex AI Endpoint Instructions</h2>\n", "<video width=\"800\" height=\"600\" controls>\n", " <source src=\"https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/Campaign-Performance-Forecasting-TimesFM-Install.mp4\" type=\"video/mp4\">\n", " Your browser does not support the video tag.\n", "</video>\n", "\"\"\")" ] }, { "cell_type": "markdown", "metadata": { "id": "UmyL-Rg4Dr_f" }, "source": [ "## <font color='#4285f4'>Initialize</font>" ] }, { "cell_type": "markdown", "metadata": { "id": "Rj0rVQrQn-fk" }, "source": [ "### Imports\n", "\n", "NOTE: The timesfm install and import takes about 5 minutes for the initial run and may require a restart of the kernel." 
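] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before running the full workflow, you can optionally smoke-test the deployed TimesFM endpoint. The commented-out sketch below is an assumption, not the definitive request format: the exact schema for your deployment is shown under \"Sample Request\" in the Vertex Model Registry, and here the endpoint is assumed to accept instances with an `input` series and a `horizon`. It reuses the `timesFMInference()` helper and the `project_number` and `endpoint_id` variables defined later in this notebook, so run it only after the **Initialize** and helper-method cells." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hypothetical smoke test for the deployed TimesFM endpoint (assumed request schema).\n", "# Requires timesFMInference(), project_number, and endpoint_id from the cells below.\n", "# payload = {\n", "#   \"instances\": [{\n", "#     \"input\": [10.0, 12.0, 11.5, 13.0, 12.5, 14.0, 13.5, 15.0],  # historical series\n", "#     \"horizon\": 4  # number of future points to forecast\n", "#   }]\n", "# }\n", "# print(timesFMInference(project_number, endpoint_id, payload))"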
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "xOYsEVSXp6IP" }, "outputs": [], "source": [ "from IPython.display import HTML\n", "import google.auth\n", "import requests\n", "import json\n", "import random\n", "from datetime import datetime\n", "import numpy as np\n", "import logging\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import statistics\n", "from io import BytesIO\n", "import markdown\n", "import inspect\n", "\n", "from tenacity import retry, wait_exponential, stop_after_attempt, before_sleep_log, retry_if_exception\n", "from google.cloud import bigquery\n", "client = bigquery.Client()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wMlHl3bnkFPZ" }, "outputs": [], "source": [ "endpoint_id=\"2481537270750904320\" # <- YOU MUST SET THIS !!!!\n", "\n", "# Update these variables to match your environment\n", "bigquery_location = \"${bigquery_location}\"\n", "region = \"${region}\"\n", "location = \"${location}\"\n", "storage_account = \"${chocolate_ai_bucket}\"\n", "\n", "### Do not change the values in this cell below this line ###\n", "project_id = !(gcloud config get-value project)\n", "user = !(gcloud auth list --filter=status:ACTIVE --format=\"value(account)\")\n", "\n", "if len(project_id) != 1:\n", " raise RuntimeError(f\"project_id is not set: {project_id}\")\n", "project_id = project_id[0]\n", "\n", "if len(user) != 1:\n", " raise RuntimeError(f\"user is not set: {user}\")\n", "user = user[0]\n", "\n", "bucket_name = \"${chocolate_ai_bucket}\"\n", "\n", "print(f\"project_id = {project_id}\")\n", "print(f\"user = {user}\")\n", "print(f\"location = {location}\")\n", "print(f\"bigquery_location = {bigquery_location}\")\n", "print(f\"bucket_name = {bucket_name}\")" ] }, { "cell_type": "markdown", "metadata": { "id": "sZ6m_wGrK0YG" }, "source": [ "### Define Helper Methods" ] }, { "cell_type": "markdown", "metadata": { 
"id": "J49uCpLAWFT_" }, "source": [ "#### RetryCondition(error)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "uMtt0ZCiWFT_" }, "outputs": [], "source": [ "def RetryCondition(error):\n", " error_string = str(error)\n", " print(error_string)\n", "\n", " retry_errors = [\n", " \"RESOURCE_EXHAUSTED\",\n", " \"No content in candidate\",\n", " # Add more error messages here as needed\n", " ]\n", "\n", " for retry_error in retry_errors:\n", " if retry_error in error_string:\n", " print(\"Retrying...\")\n", " return True\n", "\n", " return False" ] }, { "cell_type": "markdown", "metadata": { "id": "JbOjdSP1kN9T" }, "source": [ "#### restAPIHelper\n", "Calls the Google Cloud REST API using the current user's credentials." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "40wlwnY4kM11" }, "outputs": [], "source": [ "def restAPIHelper(url: str, http_verb: str, request_body: str) -> str:\n", " \"\"\"Calls the Google Cloud REST API passing in the current user's credentials\"\"\"\n", "\n", " import requests\n", " import google.auth\n", " import json\n", "\n", " # Get an access token based upon the current user\n", " creds, project = google.auth.default()\n", " auth_req = google.auth.transport.requests.Request()\n", " creds.refresh(auth_req)\n", " access_token=creds.token\n", "\n", " headers = {\n", " \"Content-Type\" : \"application/json\",\n", " \"Authorization\" : \"Bearer \" + access_token\n", " }\n", "\n", " if http_verb == \"GET\":\n", " response = requests.get(url, headers=headers)\n", " elif http_verb == \"POST\":\n", " response = requests.post(url, json=request_body, headers=headers)\n", " elif http_verb == \"PUT\":\n", " response = requests.put(url, json=request_body, headers=headers)\n", " elif http_verb == \"PATCH\":\n", " response = requests.patch(url, json=request_body, headers=headers)\n", " elif http_verb == \"DELETE\":\n", " response = requests.delete(url, headers=headers)\n", " else:\n", " raise 
RuntimeError(f\"Unknown HTTP verb: {http_verb}\")\n", "\n", " if response.status_code == 200:\n", " return json.loads(response.content)\n", " else:\n", " error = f\"Error restAPIHelper -> Status: '{response.status_code}' Text: '{response.text}'\"\n", " raise RuntimeError(error)" ] }, { "cell_type": "markdown", "metadata": { "id": "tZSpPUde3FRX" }, "source": [ "#### Gemini LLM" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "h-PgzuF53LoS" }, "outputs": [], "source": [ "@retry(wait=wait_exponential(multiplier=1, min=1, max=60), stop=stop_after_attempt(10), retry=retry_if_exception(RetryCondition), before_sleep=before_sleep_log(logging.getLogger(), logging.INFO))\n", "def GeminiLLM(prompt, model = \"gemini-2.0-flash\", response_schema = None,\n", " temperature = 1, topP = 1, topK = 32):\n", "\n", " # https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#supported_models\n", " # model = \"gemini-2.0-flash\"\n", "\n", " llm_response = None\n", " if temperature < 0:\n", " temperature = 0\n", "\n", " creds, project = google.auth.default()\n", " auth_req = google.auth.transport.requests.Request() # required to refresh the access token\n", " creds.refresh(auth_req)\n", " access_token=creds.token\n", "\n", " headers = {\n", " \"Content-Type\" : \"application/json\",\n", " \"Authorization\" : \"Bearer \" + access_token\n", " }\n", "\n", " # https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference\n", " url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/publishers/google/models/{model}:generateContent\"\n", "\n", " generation_config = {\n", " \"temperature\": temperature,\n", " \"topP\": topP,\n", " \"maxOutputTokens\": 8192,\n", " \"candidateCount\": 1,\n", " \"responseMimeType\": \"application/json\",\n", " }\n", "\n", " # Add in the response 
schema for when it is provided\n", " if response_schema is not None:\n", " generation_config[\"responseSchema\"] = response_schema\n", "\n", " if model == \"gemini-2.0-flash\":\n", " generation_config[\"topK\"] = topK\n", "\n", " payload = {\n", " \"contents\": {\n", " \"role\": \"user\",\n", " \"parts\": {\n", " \"text\": prompt\n", " },\n", " },\n", " \"generation_config\": {\n", " **generation_config\n", " },\n", " \"safety_settings\": {\n", " \"category\": \"HARM_CATEGORY_SEXUALLY_EXPLICIT\",\n", " \"threshold\": \"BLOCK_LOW_AND_ABOVE\"\n", " }\n", " }\n", "\n", " response = requests.post(url, json=payload, headers=headers)\n", "\n", " if response.status_code == 200:\n", " try:\n", " json_response = json.loads(response.content)\n", " except Exception as error:\n", " raise RuntimeError(f\"An error occurred parsing the JSON: {error}\")\n", "\n", " if \"candidates\" in json_response:\n", " candidates = json_response[\"candidates\"]\n", " if len(candidates) > 0:\n", " candidate = candidates[0]\n", " if \"content\" in candidate:\n", " content = candidate[\"content\"]\n", " if \"parts\" in content:\n", " parts = content[\"parts\"]\n", " if len(parts):\n", " part = parts[0]\n", " if \"text\" in part:\n", " text = part[\"text\"]\n", " llm_response = text\n", " else:\n", " raise RuntimeError(f\"No text in part: {response.content}\")\n", " else:\n", " raise RuntimeError(f\"No parts in content: {response.content}\")\n", " else:\n", " raise RuntimeError(f\"No parts in content: {response.content}\")\n", " else:\n", " raise RuntimeError(f\"No content in candidate: {response.content}\")\n", " else:\n", " raise RuntimeError(f\"No candidates: {response.content}\")\n", " else:\n", " raise RuntimeError(f\"No candidates: {response.content}\")\n", "\n", " # Remove some typical response characters (if asking for a JSON reply)\n", " llm_response = llm_response.replace(\"```json\",\"\")\n", " llm_response = llm_response.replace(\"```\",\"\")\n", " llm_response = 
llm_response.replace(\"\\n\",\"\")\n", "\n", " return llm_response\n", "\n", " else:\n", " raise RuntimeError(f\"Error with prompt:'{prompt}' Status:'{response.status_code}' Text:'{response.text}'\")" ] }, { "cell_type": "markdown", "metadata": { "id": "j0NorIvF3QRi" }, "source": [ "#### Gemini LLM - Multimodal" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "r_BorHLGBnej" }, "outputs": [], "source": [ "@retry(wait=wait_exponential(multiplier=1, min=1, max=60), stop=stop_after_attempt(10), retry=retry_if_exception(RetryCondition), before_sleep=before_sleep_log(logging.getLogger(), logging.INFO))\n", "def GeminiLLM_Multimodal(\n", " text_prompt, image_uri, model=\"gemini-2.0-flash\", response_schema=None, temperature=1, topP=1, topK=32\n", "):\n", " \"\"\"\n", " Calls the Gemini API with a text prompt and an image prompt.\n", "\n", " Args:\n", " text_prompt: The text prompt.\n", " image_uri: The Cloud Storage URI of the image to include in the prompt.\n", " model: The Gemini model to use.\n", " response_schema: Optional response schema.\n", " temperature: Temperature parameter for the model.\n", " topP: Top-p parameter for the model.\n", " topK: Top-k parameter for the model.\n", "\n", " Returns:\n", " The parsed Gemini JSON response.\n", " \"\"\"\n", "\n", " creds, project = google.auth.default()\n", " auth_req = google.auth.transport.requests.Request()\n", " creds.refresh(auth_req)\n", " access_token = creds.token\n", "\n", " headers = {\n", " \"Content-Type\": \"application/json\",\n", " \"Authorization\": \"Bearer \" + access_token\n", " }\n", "\n", " url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/publishers/google/models/{model}:generateContent\"\n", "\n", "\n", " generation_config = {\n", " \"temperature\": temperature,\n", " \"topP\": topP,\n", " \"maxOutputTokens\": 8192,\n", " \"candidateCount\": 1,\n", " # \"responseMimeType\" is omitted - it is invalid for multi-modal responses\n", " }\n", "\n", " 
if response_schema is not None:\n", " generation_config[\"responseSchema\"] = response_schema\n", "\n", " if model in [\"gemini-2.0-flash\"]:\n", " generation_config[\"topK\"] = topK\n", "\n", " # Construct the payload with the image URI\n", " payload = {\n", " \"contents\": [{\n", " \"role\": \"user\",\n", " \"parts\": [\n", " {\n", " \"text\": text_prompt\n", " },\n", " {\n", " \"fileData\": {\n", " \"fileUri\": image_uri,\n", " \"mimeType\": \"image/png\"\n", " }\n", " }\n", " ]\n", " }],\n", " \"generation_config\": generation_config,\n", " \"safety_settings\": {\n", " \"category\": \"HARM_CATEGORY_SEXUALLY_EXPLICIT\",\n", " \"threshold\": \"BLOCK_LOW_AND_ABOVE\"\n", " }\n", " }\n", "\n", " response = requests.post(url, json=payload, headers=headers)\n", "\n", " if response.status_code == 200:\n", " return json.loads(response.text)\n", " else:\n", " raise RuntimeError(\n", " f\"Error with prompt:'{text_prompt}' Status:'{response.status_code}' Text:'{response.text}'\"\n", " )\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Wp6zIq-3M86P" }, "source": [ "#### RunQuery()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BbDTEwYWM4Vv" }, "outputs": [], "source": [ "def RunQuery(sql, job_config = None):\n", " import time\n", "\n", " if sql.startswith(\"SELECT\") or sql.startswith(\"WITH\"):\n", " df_result = client.query(sql).to_dataframe()\n", " return df_result\n", " else:\n", " if job_config is None:\n", " job_config = bigquery.QueryJobConfig(priority=bigquery.QueryPriority.INTERACTIVE)\n", " query_job = client.query(sql, job_config=job_config)\n", "\n", " # Check on the progress by getting the job's updated state.\n", " query_job = client.get_job(\n", " query_job.job_id, location=query_job.location\n", " )\n", " print(\"Job {} is currently in state {} with error result of {}\".format(query_job.job_id, query_job.state, query_job.error_result))\n", "\n", " while query_job.state != \"DONE\":\n", " time.sleep(2)\n", " query_job = 
client.get_job(\n", " query_job.job_id, location=query_job.location\n", " )\n", " print(\"Job {} is currently in state {} with error result of {}\".format(query_job.job_id, query_job.state, query_job.error_result))\n", "\n", " if query_job.error_result is None:\n", " return True\n", " else:\n", " return False" ] }, { "cell_type": "markdown", "metadata": { "id": "tuLsHkPq3dFq" }, "source": [ "#### PrettyPrintJson(json_string)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "9VMElV2A3m56" }, "outputs": [], "source": [ "def PrettyPrintJson(json_string):\n", " json_object = json.loads(json_string)\n", " json_formatted_str = json.dumps(json_object, indent=2)\n", " print(json_formatted_str)\n", " return json.dumps(json_object)" ] }, { "cell_type": "markdown", "metadata": { "id": "TFoIjOkq6jX6" }, "source": [ "#### UploadImageToGcs()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QvnwDdDq6kBo" }, "outputs": [], "source": [ "from google.cloud import storage\n", "\n", "def UploadImageToGcs(image_data, bucket_name, image_name):\n", " \"\"\"Uploads an image to Google Cloud Storage.\"\"\"\n", "\n", " storage_client = storage.Client()\n", " bucket = storage_client.bucket(bucket_name)\n", " blob = bucket.blob(image_name)\n", " blob.upload_from_file(image_data, content_type='image/png')\n", " return f\"gs://{bucket_name}/{image_name}\"" ] }, { "cell_type": "markdown", "metadata": { "id": "L19Qk4yhIfQl" }, "source": [ "#### DisplayMarkdown()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "DKtThrq3IfnO" }, "outputs": [], "source": [ "def DisplayMarkdown(text):\n", " \"\"\"\n", " Displays text in markdown/HTML format in a Colab notebook.\n", "\n", " Args:\n", " text: The text to display. 
Can be plain text or Markdown.\n", " \"\"\"\n", "\n", " formatted_text = markdown.markdown(text) # Convert to HTML if necessary\n", " display(HTML(formatted_text))\n" ] }, { "cell_type": "markdown", "metadata": { "id": "zYlKv698XvUZ" }, "source": [ "#### getProjectNumber()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BfVj1YXUXvNh" }, "outputs": [], "source": [ "def getProjectNumber(project_id):\n", " \"\"\"Gets the project number for the given project id\"\"\"\n", "\n", " url = f\"https://cloudresourcemanager.googleapis.com/v1/projects/{project_id}\"\n", " json_result = restAPIHelper(url, \"GET\", None)\n", "\n", " project_number = json_result[\"projectNumber\"]\n", " return project_number" ] }, { "cell_type": "markdown", "metadata": { "id": "7D5nNRqiZTIE" }, "source": [ "#### timesFMInference()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "uYOV9054ZS_u" }, "outputs": [], "source": [ "def timesFMInference(project_number, endpoint_id, payload):\n", " url = f\"https://{location}-aiplatform.googleapis.com/v1/projects/{project_number}/locations/{location}/endpoints/{endpoint_id}:predict\"\n", " # print(f\"url: {url}\")\n", " response = restAPIHelper(url, http_verb=\"POST\", request_body=payload)\n", " # print(f\"response: {response}\")\n", " return response" ] }, { "cell_type": "markdown", "metadata": { "id": "LnoeM8d1oEj-" }, "source": [ "## <font color='#4285f4'>Table Definitions</font>" ] }, { "cell_type": "markdown", "metadata": { "id": "WtkUBfH8O0VM" }, "source": [ "This notebook leverages the tables and data created by notebook `DB-GMA-Synthetic-Data-Generation-Campaigns.ipynb`. 
The table definitions are provided below for reference.\n", "\n", "```sql\n", "CREATE TABLE IF NOT EXISTS `${project_id}.${bigquery_chocolate_ai_dataset}.campaign` (\n", " campaign_id STRING NOT NULL OPTIONS(description=\"Unique identifier for each campaign (e.g., 'SummerSale2024', 'HolidayPromoDec')\"),\n", " menu_id INT64 OPTIONS(description=\"Id of the menu item being promoted in the campaign\"),\n", " campaign_name STRING OPTIONS(description=\"A descriptive name for the campaign.\"),\n", " campaign_goal STRING OPTIONS(description=\"The objective of the campaign (e.g., 'Increase brand awareness', 'Drive website traffic', 'Generate leads')\"),\n", " campaign_start_date DATE OPTIONS(description=\"The date the campaign started.\"),\n", " campaign_end_date DATE OPTIONS(description=\"The date the campaign ended.\"),\n", " target_audience STRING OPTIONS(description=\"The intended audience for the campaign (e.g., 'Women aged 25-35', 'Small business owners in Texas')\"),\n", " marketing_channels STRING OPTIONS(description=\"The marketing channels used in the campaign (e.g., 'Social Media', 'Email', 'SEO', 'Paid Ads')\"),\n", " budget FLOAT64 OPTIONS(description=\"The overall budget allocated for the campaign.\"),\n", " explanation STRING OPTIONS(description=\"The model reasoning for creating the campaign.\"),\n", " campaign_outcomes STRING OPTIONS(description=\"Expected outcomes of the campaign.\"),\n", ") CLUSTER BY campaign_id;\n", "```\n", "\n", "```sql\n", "CREATE TABLE IF NOT EXISTS `${project_id}.${bigquery_chocolate_ai_dataset}.campaign_performance` (\n", " campaign_id STRING NOT NULL OPTIONS(description=\"Foreign key referencing the campaign table. 
Unique identifier for the campaign.\"),\n", " impressions INT64 OPTIONS(description=\"Number of times the campaign was displayed.\"),\n", " reach INT64 OPTIONS(description=\"Number of unique individuals exposed to the campaign.\"),\n", " website_traffic INT64 OPTIONS(description=\"Visits to your website attributed to the campaign.\"),\n", " leads_generated INT64 OPTIONS(description=\"Number of new leads captured (e.g., email sign-ups, contact form submissions).\"),\n", " conversions INT64 OPTIONS(description=\"Number of desired actions taken (e.g., purchases, downloads).\"),\n", " cost_per_click FLOAT64 OPTIONS(description=\"Average cost per click on your ads (if applicable).\"),\n", " cost_per_acquisition FLOAT64 OPTIONS(description=\"Average cost to acquire a new customer.\"),\n", " return_on_investment FLOAT64 OPTIONS(description=\"Return on Investment calculated as (Revenue - Cost) / Cost.\")\n", ") CLUSTER BY campaign_id;\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "SVC7WZCdGHMj" }, "source": [ "## <font color='#4285f4'>Define Campaign Functions</font>" ] }, { "cell_type": "markdown", "metadata": { "id": "9_UFktuayJXv" }, "source": [ "### GenerateCampaignRecommendations()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "YHSA5g1JpXJU" }, "outputs": [], "source": [ "# Use tenacity to retry in case of resource exhausted errors (429)\n", "@retry(wait=wait_exponential(multiplier=1, min=1, max=60), stop=stop_after_attempt(10), retry=retry_if_exception(RetryCondition), before_sleep=before_sleep_log(logging.getLogger(), logging.INFO))\n", "def GenerateCampaignRecommendations(campaign_goal, product_name, product_description, budget):\n", "\n", " # Build the campaign recommendation prompt:\n", " prompt = f\"\"\"\n", " You are an expert Marketing Analyst, and you are creating a new marketing campaign for a\n", " French chocolatier and bakery called 'Chocolate AI'.\n", "\n", " Your task is to recommend a 
campaign for the product that will achieve the campaign goals within budget.\n", "\n", " You should include the following information in your response:\n", " - A creative campaign name.\n", " - A detailed description of the campaign.\n", " - The ideal target audience.\n", " - Ideal marketing channels. This should be optimized for the target audience.\n", " - Expected campaign outcomes. This should map directly to the campaign goals.\n", " - An explanation/justification for the reasoning behind choosing this campaign.\n", "\n", " Your total budget for the campaign is: {budget}\n", "\n", " Here is the goal of the campaign:\n", " {campaign_goal}\n", "\n", " Here is the product you are promoting:\n", " {product_name}: {product_description}\n", "\n", " Additional instructions:\n", " - Results can be creative, but they must be plausible.\n", "\n", " Now generate the campaign recommendation. Think step by step and explain your reasoning.\n", " \"\"\"\n", "\n", " # Ref: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/Schema\n", " response_schema = {\n", " \"type\": \"object\",\n", " \"required\": [\"campaign_name\", \"campaign_description\", \"target_audience\", \"marketing_channels\", \"campaign_outcomes\", \"explanation\"],\n", " \"properties\": {\n", " \"campaign_name\": {\n", " \"type\": \"string\",\n", " \"description\": \"A descriptive and creative name for the campaign.\"\n", " },\n", " \"campaign_description\": {\n", " \"type\": \"string\",\n", " \"description\": \"A detailed description of the campaign.\"\n", " },\n", " \"target_audience\": {\n", " \"type\": \"string\",\n", " \"description\": \"The intended audience of the campaign.\"\n", " },\n", " \"marketing_channels\": {\n", " \"type\": \"string\",\n", " \"description\": \"The recommended marketing channels to be used in the campaign.\"\n", " },\n", " \"campaign_outcomes\": {\n", " \"type\": \"string\",\n", " \"description\": \"The expected outcomes of the 
campaign.\"\n", " },\n", " \"explanation\": {\n", " \"type\": \"string\",\n", " \"description\": \"An explanation for the reasoning behind choosing this campaign.\"\n", " }\n", " }\n", " }\n", "\n", " result = GeminiLLM(prompt, response_schema=response_schema)\n", " return result\n" ] }, { "cell_type": "markdown", "metadata": { "id": "MRTWH4BDyMV3" }, "source": [ "### GetRankedProducts()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "toM5TtUiP0CS" }, "outputs": [], "source": [ "def GetRankedProducts(count, time_period='monthly', top_bottom='top'):\n", " \"\"\"\n", " Get ranked products based on sales change over a specified time period.\n", " Valid time periods are 'yearly', 'monthly', 'weekly', and 'daily'.\n", "\n", " Args:\n", " count: The number of products to return.\n", " time_period: The time period for comparison ('yearly', 'monthly', 'weekly', 'daily').\n", " top_bottom: Whether to return the 'top' or 'bottom' performing products.\n", "\n", " Returns:\n", " A pandas DataFrame with the ranked products.\n", " \"\"\"\n", "\n", " sort_order = 'DESC' if top_bottom == 'top' else 'ASC'\n", "\n", " # Define intervals based on the time_period parameter\n", " intervals = {\n", " 'yearly': '365 DAY',\n", " 'monthly': '30 DAY',\n", " 'weekly': '7 DAY',\n", " 'daily': '1 DAY'\n", " }\n", "\n", " interval = intervals.get(time_period)\n", " if interval is None:\n", " raise ValueError(f\"Invalid time_period: {time_period}\")\n", "\n", " sql = f\"\"\"WITH current_period_sales AS (\n", " SELECT\n", " m.menu_name,\n", " m.menu_id,\n", " m.menu_description,\n", " SUM(oi.item_total) AS cumulative_sales\n", " FROM\n", " `${project_id}.${bigquery_chocolate_ai_dataset}.order_item` AS oi\n", " INNER JOIN `${project_id}.${bigquery_chocolate_ai_dataset}.order` AS o ON o.order_id = oi.order_id\n", " INNER JOIN `${project_id}.${bigquery_chocolate_ai_dataset}.menu` AS m ON m.menu_id = oi.menu_id\n", " WHERE o.order_datetime BETWEEN TIMESTAMP(DATE_SUB(CURRENT_DATE(), INTERVAL {interval}))\n", " AND TIMESTAMP(CURRENT_DATE())\n", " GROUP BY menu_name, 
m.menu_id, m.menu_description\n", " ), prior_period_sales AS (\n", " SELECT\n", " m.menu_name,\n", " m.menu_id,\n", " m.menu_description,\n", " SUM(oi.item_total) AS cumulative_sales\n", " FROM\n", " `${project_id}.${bigquery_chocolate_ai_dataset}.order_item` AS oi\n", " INNER JOIN `${project_id}.${bigquery_chocolate_ai_dataset}.order` AS o ON o.order_id = oi.order_id\n", " INNER JOIN `${project_id}.${bigquery_chocolate_ai_dataset}.menu` AS m ON m.menu_id = oi.menu_id\n", " WHERE o.order_datetime BETWEEN TIMESTAMP(DATE_SUB(CURRENT_DATE(), INTERVAL 2 * {interval}))\n", " AND TIMESTAMP(DATE_SUB(CURRENT_DATE(), INTERVAL {interval}))\n", " GROUP BY menu_name, m.menu_id, m.menu_description\n", " ) SELECT current_period_sales.menu_id,\n", " current_period_sales.menu_name,\n", " current_period_sales.menu_description,\n", " FORMAT('%s €', CAST(ROUND(current_period_sales.cumulative_sales, 0) as STRING)) AS current_period_cumulative_sales,\n", " FORMAT('%s €', CAST(ROUND(prior_period_sales.cumulative_sales, 0) as STRING)) AS prior_period_cumulative_sales,\n", " FORMAT('%s €', CAST(ROUND(current_period_sales.cumulative_sales - prior_period_sales.cumulative_sales, 0) as STRING)) AS change,\n", " ROUND((current_period_sales.cumulative_sales - prior_period_sales.cumulative_sales) / prior_period_sales.cumulative_sales * 100,0) AS change_percent\n", " FROM current_period_sales\n", " JOIN prior_period_sales ON current_period_sales.menu_name = prior_period_sales.menu_name\n", " ORDER BY change_percent {sort_order}\n", " LIMIT {count};\n", " \"\"\"\n", "\n", " return RunQuery(sql)\n" ] }, { "cell_type": "markdown", "metadata": { "id": "SlO6yP77yN1A" }, "source": [ "### GetFinalProductRanking()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "yMMMON0aXcHR" }, "outputs": [], "source": [ "def GetFinalProductRanking(daily_df, weekly_df, monthly_df):\n", " \"\"\"\n", " Ranks products based on appearances and positions in three DataFrames.\n", "\n", " Args:\n", " 
daily_df: DataFrame with daily rankings.\n", " weekly_df: DataFrame with weekly rankings.\n", " monthly_df: DataFrame with monthly rankings.\n", "\n", " Returns:\n", " A pandas DataFrame with the final product ranking.\n", " \"\"\"\n", "\n", " # 1. Assign weights based on time period\n", " daily_df['weight'] = 1\n", " weekly_df['weight'] = 3\n", " monthly_df['weight'] = 10\n", "\n", " # 2. Concatenate DataFrames\n", " combined_df = pd.concat([daily_df, weekly_df, monthly_df])\n", "\n", " # 3. Calculate weighted rank\n", " combined_df['rank'] = combined_df.groupby('menu_name')['change_percent'].rank(ascending=True)\n", " combined_df['weighted_rank'] = combined_df['rank'] * combined_df['weight']\n", "\n", " # 4. Aggregate weighted rank and count appearances\n", " final_ranking = combined_df.groupby('menu_name').agg(\n", " final_weighted_rank=('weighted_rank', 'sum'),\n", " appearances=('menu_name', 'count')\n", " ).reset_index()\n", "\n", " # 5. Sort by weighted rank and appearances\n", " final_ranking = final_ranking.sort_values(\n", " by=['appearances', 'final_weighted_rank'], ascending=[False, False]\n", " )\n", "\n", " return final_ranking" ] }, { "cell_type": "markdown", "metadata": { "id": "472c27EayREC" }, "source": [ "### VisualizePreliminaryProductRankings()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6trcTNPeWG4I" }, "outputs": [], "source": [ "def VisualizePreliminaryProductRankings(daily_df, weekly_df, monthly_df):\n", " \"\"\"\n", " Visualizes the product rankings from three DataFrames (daily, weekly, monthly).\n", "\n", " Args:\n", " daily_df: DataFrame with daily rankings.\n", " weekly_df: DataFrame with weekly rankings.\n", " monthly_df: DataFrame with monthly rankings.\n", " \"\"\"\n", "\n", " fig, axes = plt.subplots(3, 1, figsize=(10, 12)) # 3 subplots for daily, weekly, monthly\n", "\n", " # Helper function to plot on each subplot\n", " def plot_ranking(ax, df, time_period):\n", " ax.barh(df['menu_name'], 
df['change_percent'], color='skyblue')\n", " ax.set_xlabel('Percent Change in Sales')\n", " ax.set_title(f'Top 10 Lowest Performing Products ({time_period})')\n", " ax.invert_yaxis() # Invert y-axis for better readability\n", "\n", " plot_ranking(axes[0], daily_df, 'Daily')\n", " plot_ranking(axes[1], weekly_df, 'Weekly')\n", " plot_ranking(axes[2], monthly_df, 'Monthly')\n", "\n", " plt.tight_layout()\n", " plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "QnbIJW_UySnC" }, "source": [ "### VisualizeFinalProductRankings()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3IzHBHjfaLgX" }, "outputs": [], "source": [ "def VisualizeFinalProductRankings(ranking_df, title):\n", " \"\"\"\n", " Visualizes the product rankings from a DataFrame, including appearances.\n", "\n", " Args:\n", " ranking_df: DataFrame with product rankings.\n", " title: Title of the plot.\n", "\n", " Returns:\n", " An image object suitable for Gemini interpretation.\n", " \"\"\"\n", "\n", " fig, ax1 = plt.subplots(figsize=(10, 6))\n", "\n", " # Bar chart for weighted rank\n", " ax1.barh(ranking_df['menu_name'], ranking_df['final_weighted_rank'], color='skyblue', label='Weighted Rank')\n", " ax1.set_xlabel('Final Weighted Rank')\n", " ax1.set_title(title)\n", " ax1.invert_yaxis()\n", "\n", " # Add appearances as text annotations\n", " for i, (index, row) in enumerate(ranking_df.iterrows()):\n", " ax1.text(row['final_weighted_rank'], i, f\" ({row['appearances']})\",\n", " va='center', color='black')\n", "\n", " # Legend\n", " ax1.legend(loc='lower right')\n", "\n", " plt.tight_layout()\n", "\n", " # Save the plot to a BytesIO object\n", " image_data = BytesIO()\n", " plt.savefig(image_data, format='png')\n", " image_data.seek(0) # Reset the stream position\n", "\n", " plt.show()\n", " plt.close(fig) # Close the figure to free up resources\n", "\n", " return image_data" ] }, { "cell_type": "markdown", "metadata": { "id": "hAdAwphRgH7j" }, "source": [ "### 
VisualizeSimulatedCampaignPerformance()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wmTCoJ1lgJKa" }, "outputs": [], "source": [ "def VisualizeSimulatedCampaignPerformance(campaign_results):\n", "  \"\"\"\n", "  Visualizes campaign performance metrics for multiple simulations.\n", "\n", "  Args:\n", "    campaign_results: A list of lists, where each inner list contains campaign performance data\n", "                      for a single simulation.\n", "  \"\"\"\n", "\n", "  num_simulations = len(campaign_results)\n", "  num_metrics = len(campaign_results[0][0]) - 1  # Exclude campaign_id\n", "\n", "  # Metric names (assumed consistent across simulations)\n", "  metric_names = [\n", "      \"Impressions\", \"Reach\", \"Website Traffic\", \"Leads Generated\", \"Conversions\",\n", "      \"Cost per Click\", \"Cost per Acquisition\", \"Return on Investment\"\n", "  ]\n", "\n", "  # Set up figure and axes\n", "  fig, axes = plt.subplots(nrows=num_metrics, ncols=1, figsize=(10, 6 * num_metrics))\n", "  fig.suptitle('Campaign Performance Metrics Across Simulations', fontsize=16)\n", "\n", "  # Iterate through metrics\n", "  for i in range(num_metrics):\n", "    ax = axes[i]\n", "    max_campaigns = max(len(sim_result) for sim_result in campaign_results)\n", "    for j, simulation_results in enumerate(campaign_results):\n", "      metric_values = [result[i + 1] for result in simulation_results]  # Skip campaign_id at index 0\n", "      x_pos = np.arange(len(simulation_results)) + (j - num_simulations / 2) * 0.2  # Adjust bar positions\n", "      ax.bar(x_pos, metric_values, width=0.2, label=f\"Simulation {j+1}\")\n", "\n", "    # Calculate and plot the average line\n", "    avg_metric_values = np.mean([result[i + 1] for sim_result in campaign_results for result in sim_result])\n", "    ax.axhline(y=avg_metric_values, color='black', linestyle='--', label=\"Average\")\n", "\n", "    ax.set_xticks(np.arange(max_campaigns))\n", "    ax.set_xticklabels([f\"Campaign {i+1}\" for i in range(max_campaigns)])\n", "    
ax.set_ylabel(metric_names[i])\n", " ax.legend()\n", "\n", " plt.tight_layout()\n", " plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "xL6qURJN7cc3" }, "source": [ "### VisualizeSingleCampaignPerformance()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "28AP3Vn77b1u" }, "outputs": [], "source": [ "def VisualizeSingleCampaignPerformance(campaign_data):\n", " \"\"\"\n", " Visualizes performance metrics for a single campaign.\n", "\n", " Args:\n", " campaign_data: A tuple containing campaign performance data.\n", " \"\"\"\n", "\n", " # Extract data from the tuple\n", " campaign_id = campaign_data[0]\n", " impressions = campaign_data[1]\n", " reach = campaign_data[2]\n", " website_traffic = campaign_data[3]\n", " leads_generated = campaign_data[4]\n", " conversions = campaign_data[5]\n", " cost_per_click = campaign_data[6]\n", " cost_per_acquisition = campaign_data[7]\n", " return_on_investment = campaign_data[8]\n", "\n", " # Metric names corresponding to the data\n", " metric_names = [\n", " \"Impressions\", \"Reach\", \"Website Traffic\", \"Leads Generated\", \"Conversions\",\n", " \"Cost per Click\", \"Cost per Acquisition\", \"Return on Investment\"\n", " ]\n", "\n", " # Metric values\n", " metric_values = [\n", " impressions, reach, website_traffic, leads_generated, conversions,\n", " cost_per_click, cost_per_acquisition, return_on_investment\n", " ]\n", "\n", " # Create the bar chart\n", " plt.figure(figsize=(10, 6))\n", " bars = plt.bar(metric_names, metric_values, color='skyblue') # Store the bar objects\n", " plt.title(f\"Campaign Performance\")\n", " plt.xlabel(\"Metrics\")\n", " plt.ylabel(\"Value\")\n", " plt.xticks(rotation=45)\n", " plt.yscale('log') # logarithmic scale to avoid losing small numbers in graph\n", "\n", " # Add labels above each bar\n", " for bar in bars:\n", " height = bar.get_height()\n", " plt.annotate(f'{height}',\n", " xy=(bar.get_x() + bar.get_width() / 2, height),\n", " xytext=(0, 3), # 3 
points vertical offset\n", "                 textcoords=\"offset points\",\n", "                 ha='center', va='bottom')\n", "\n", "\n", "  plt.tight_layout()\n", "  plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "YapRbglyeMva" }, "source": [ "### GenerateCampaignPerformance()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Iqf70nAReNmg" }, "outputs": [], "source": [ "def GenerateCampaignPerformance(budget):\n", "  \"\"\"\n", "  Generates randomized campaign performance data, which is influenced by the budget.\n", "\n", "  Args:\n", "    budget: The budget for the campaign.\n", "\n", "  Returns:\n", "    A dictionary containing the performance data.\n", "\n", "  Example:\n", "    budget = 5000\n", "    performance_data = GenerateCampaignPerformance(budget)\n", "    print(performance_data)\n", "  \"\"\"\n", "\n", "  # Generate random scale factors within a range for non-linear relationship\n", "  impressions_factor = random.uniform(0.6, 0.8)\n", "  reach_factor = random.uniform(0.5, 0.7)\n", "  traffic_factor = random.uniform(0.4, 0.6)\n", "  leads_factor = random.uniform(0.3, 0.5)\n", "  conversions_factor = random.uniform(0.2, 0.4)\n", "\n", "  # Generate data points with randomness and scaling\n", "  impressions = int(random.uniform(0.8, 1.2) * (budget ** impressions_factor) * 100)\n", "  reach = int(random.uniform(0.7, 1.1) * (budget ** reach_factor) * 100)\n", "  website_traffic = int(random.uniform(0.6, 1.0) * (budget ** traffic_factor) * 10)\n", "  leads_generated = int(random.uniform(0.5, 0.9) * (budget ** leads_factor) * 5)\n", "  conversions = int(random.uniform(0.4, 0.8) * (budget ** conversions_factor) * 2)\n", "\n", "\n", "  # Cost per click and acquisition (with some randomness)\n", "  cost_per_click = 
round(random.uniform(0.2, 1.5), 2)\n", "  cost_per_acquisition = round(random.uniform(5, 50) * (15000 / budget), 2)\n", "\n", "  # ROI (mostly positive, but some negative)\n", "  roi_sign = random.choices([-1, 1], weights=[0.2, 0.8])[0]  # 20% chance of negative ROI\n", "  return_on_investment = round(roi_sign * random.uniform(0.5, 10.0), 2)\n", "\n", "  return {\n", "      \"impressions\": impressions,\n", "      \"reach\": reach,\n", "      \"website_traffic\": website_traffic,\n", "      \"leads_generated\": leads_generated,\n", "      \"conversions\": conversions,\n", "      \"cost_per_click\": cost_per_click,\n", "      \"cost_per_acquisition\": cost_per_acquisition,\n", "      \"return_on_investment\": return_on_investment\n", "  }" ] }, { "cell_type": "markdown", "metadata": { "id": "CZ2X18WoU5Tk" }, "source": [ "## <font color='#4285f4'>Generate Campaign Recommendations</font>" ] }, { "cell_type": "markdown", "metadata": { "id": "q_lxFdgOU-eH" }, "source": [ "### Step 1: Get Ranked List of Under-performing Products" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fUXILcWbN-v7" }, "outputs": [], "source": [ "# Get a list of the 10 worst-performing products over the last day, week, and month\n", "product_ranking_daily = GetRankedProducts(10,'daily','bottom')\n", "product_ranking_weekly = GetRankedProducts(10,'weekly','bottom')\n", "product_ranking_monthly = GetRankedProducts(10,'monthly','bottom')\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wQc7qSHMWRy4" }, "outputs": [], "source": [ "# Visualize the results\n", "VisualizePreliminaryProductRankings(product_ranking_daily, product_ranking_weekly, product_ranking_monthly)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "nWLjcInOWxVu" }, "outputs": [], "source": [ "# Combine the 3 data frames into a final ranked data frame\n", "# Products that appear multiple times are ranked higher\n", "# Products that appear in longer time-horizon rankings are also ranked higher\n", 
"final_ranking = GetFinalProductRanking(product_ranking_daily, product_ranking_weekly, product_ranking_monthly)\n", "\n", "final_ranking.head(10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "E4lp5EdyavwO" }, "outputs": [], "source": [ "# Visualize the final rankings\n", "final_ranking_img = VisualizeFinalProductRankings(final_ranking, 'Final Product Ranking (Daily, Weekly, Monthly)')\n", "\n", "# Upload image to GCS\n", "now = datetime.now()\n", "timestamp_str = now.strftime(\"%Y%m%d_%H%M%S\")\n", "image_name = f\"images/product_ranking_{timestamp_str}.png\"\n", "image_uri = UploadImageToGcs(final_ranking_img, bucket_name, image_name)\n", "\n", "# Have Gemini review the graph data and recommend a product to promote.\n", "preliminary_ranking_function = inspect.getsource(GetRankedProducts)\n", "final_ranking_function = inspect.getsource(GetFinalProductRanking)\n", "\n", "text_prompt = f\"\"\"You are a marketing analyst, and you are reviewing product\n", "sales to determine which low-performing products you should promote with a\n", "marketing campaign. The graph supplied with this prompt shows the worst-performing\n", "products over the last day, week, and month. It was created by running the\n", "GetRankedProducts() function three times: once each for daily, weekly, monthly\n", "time periods. Then the results of those three GetRankedProducts() objects were\n", "re-ranked by the GetFinalProductRanking() function.\n", "\n", "Here is the definition of GetRankedProducts():\n", "```\n", "{preliminary_ranking_function}\n", "```\n", "\n", "And here is the definition of GetFinalProductRanking():\n", "```\n", "{final_ranking_function}\n", "```\n", "\n", "Your task is to interpret the supplied graph and explain it to a business\n", "stakeholder. 
Then select 2 products you recommend promoting based on the data,\n", "and explain your reasoning.\"\"\"\n", "\n", "response = GeminiLLM_Multimodal(text_prompt, image_uri)\n", "\n", "# Display the results\n", "response_text = response['candidates'][0]['content']['parts'][0]['text']\n", "DisplayMarkdown(response_text)" ] }, { "cell_type": "markdown", "metadata": { "id": "paYayUy_evtx" }, "source": [ "### Step 2: Determine Discount Amount" ] }, { "cell_type": "markdown", "metadata": { "id": "LK5qwrRfe2Cj" }, "source": [ "Let's say that we decide we should promote \"Earl Grey & Bergamot Chocolate Symphony\", and we want to increase sales by 30% in the next week. We can model the impact of offering different discount levels using [TimesFM](https://github.com/google-research/timesfm) to help us find the right offer to achieve our goals." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BdqrBPpYk8Fk" }, "outputs": [], "source": [ "# Define product to promote and desired sales increase during promotion period\n", "product_to_promote = \"Earl Grey & Bergamot Chocolate Symphony\" # Product name\n", "target_sales_increase = 30 # 30% increase in average daily sales\n", "max_promotion_discount_percent = 50 # 50% - The maximum discount you're willing to offer during the promotion\n", "\n", "# Get sales of the product for the last 42 days (6 weeks)\n", "sql = f\"\"\"SELECT\n", "  m.menu_name,\n", "  SUM(oi.item_total) AS sales,\n", "  TIMESTAMP_TRUNC(o.order_datetime, DAY) AS day\n", "  FROM\n", "  `${project_id}.${bigquery_chocolate_ai_dataset}.order_item` AS oi\n", "  INNER JOIN `${project_id}.${bigquery_chocolate_ai_dataset}.order` AS o ON o.order_id = oi.order_id\n", "  INNER JOIN `${project_id}.${bigquery_chocolate_ai_dataset}.menu` AS m ON m.menu_id = oi.menu_id\n", "  WHERE m.menu_name = \"{product_to_promote}\"\n", "  AND o.order_datetime BETWEEN TIMESTAMP(DATE_SUB(CURRENT_DATE(), INTERVAL 42 DAY)) AND CURRENT_TIMESTAMP()\n", "  GROUP BY 1, 3\n", "  
ORDER BY 3\n", "  LIMIT 42;\"\"\"\n", "\n", "daily_sales = RunQuery(sql)\n", "daily_sales.head()\n", "#daily_sales.describe()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "o8y4hX66ur7D" }, "outputs": [], "source": [ "# Configure TimesFM\n", "\n", "context_len = 512\n", "horizon_len = 7 # Predict the next 7 days. This could be up to 128 without requiring additional compute (129 would be a step up); it is effectively the maximum horizon length.\n", "input_patch_len = 32\n", "output_patch_len = 128\n", "num_layers = 20\n", "model_dims = 1280\n", "timesfm_backend = \"cpu\" # cpu, gpu or cuda\n", "xreg_mode = \"xreg + timesfm\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "WT1NXFZnojVK" }, "outputs": [], "source": [ "# Define function to forecast promotion sales\n", "def ForecastPromotionSales(inputs, current_promotion_discount_percent, past_promotion_discount_percent_assumption=40):\n", "\n", "  # IMPORTANT:\n", "  # The covariates MUST cover both the context data PLUS the horizon length.\n", "  # For example, inputs has a length of 42, so covariate arrays must be of length 49.\n", "\n", "  # Calculate stats for input sales\n", "  median = statistics.median(inputs)\n", "  standard_deviation = statistics.stdev(inputs)\n", "\n", "  # Generate discount array - for demo purposes, we're inferring that we had a\n", "  # promotion running on days that we had higher than median + stdev sales\n", "  discount_array = [0 if x < median + standard_deviation else past_promotion_discount_percent_assumption for x in inputs]\n", "\n", "  # Add 7 days of current_promotion_discount_percent to discount_array to cover horizon_len\n", "  discount_array.extend([current_promotion_discount_percent] * 7)\n", "\n", "  # Define day of week array with 7 weeks of data to cover inputs + horizon_len\n", "  day_of_week_array = [str(x + 1) for x in range(7)] * 7\n", "\n", "  # These are our categorical covariates (additional factors that we think 
might influence the thing we're trying to predict).\n", "  # Here we consider the day of the week\n", "  dynamic_categorical_covariates = {\n", "      \"day_of_week\": day_of_week_array\n", "  }\n", "\n", "  # These are our numerical covariates (additional numeric factors, just like the categories, but numbers)\n", "  # Here we consider the discount percent offered on each day\n", "  dynamic_numerical_covariates = {\n", "      \"discount_percent\": discount_array\n", "  }\n", "\n", "  # These are our static covariates (additional factors that we think are fixed, like the price of the product)\n", "  # Here we consider the price of the product\n", "  static_numerical_covariates = {\n", "      \"price\": 18.99\n", "  }\n", "\n", "  # These are our static categorical covariates (additional factors that we think are fixed, like the menu item)\n", "  # Here we consider the menu item\n", "  static_categorical_covariates = {\n", "      \"menu_item\" : \"chocolate_tasting_flight\"\n", "  }\n", "\n", "\n", "  # frequency of each context time series. 
0 for high frequency (default), 1 for medium, and 2 for low.\n", "  frequency = 0\n", "\n", "  # Build JSON payload\n", "  payload = {\n", "      \"instances\": [\n", "          {\n", "              \"input\": inputs,\n", "              \"freq\": frequency,\n", "              \"horizon\": horizon_len,\n", "              \"dynamic_numerical_covariates\": dynamic_numerical_covariates,\n", "              \"dynamic_categorical_covariates\": dynamic_categorical_covariates,\n", "              \"static_numerical_covariates\": static_numerical_covariates,\n", "              \"static_categorical_covariates\": static_categorical_covariates,\n", "              \"xreg_kwargs\": {\n", "                  \"xreg_mode\" : xreg_mode\n", "              }\n", "          }\n", "      ]\n", "  }\n", "\n", "  # Get the project number in order to call the endpoint\n", "  project_number = getProjectNumber(project_id)\n", "\n", "  # Call TimesFM to make a prediction\n", "  times_fm_inference = timesFMInference(project_number, endpoint_id, payload)\n", "\n", "  # Return the array of forecasted points\n", "  return times_fm_inference[\"predictions\"][0][\"point_forecast\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5UuZY48Y9RV3" }, "outputs": [], "source": [ "# Run forecasts based on our sales data and desired sales increase\n", "inputs = [int(round(x, 0)) for x in daily_sales['sales']]\n", "average_sales = statistics.mean(inputs)\n", "target_sales = average_sales * (1 + (target_sales_increase/100))\n", "recommended_discount_percent = 0\n", "\n", "print(f\"Current average daily sales: {round(average_sales, 2)}\")\n", "print(f\"Target average daily sales: {round(target_sales,2)} ({target_sales_increase}% increase)\")\n", "print(f\"Maximum promotion discount: {max_promotion_discount_percent}%\\n\")\n", "\n", "print(\"Forecasting ideal discount to achieve your sales target...\")\n", "for discount in range(max_promotion_discount_percent + 1): # include the maximum discount\n", "  result = ForecastPromotionSales(inputs, discount)\n", "  if statistics.mean(result) >= target_sales:\n", "    print(f\"A promotion discount of 
{discount}% is projected to achieve your sales target.\")\n", "    print(f\"\\nSetting recommended discount to {discount}%.\")\n", "    recommended_discount_percent = discount\n", "    break\n", "  else:\n", "    print(f\"A promotion discount of {discount}% is NOT likely to achieve your sales target.\")" ] }, { "cell_type": "markdown", "metadata": { "id": "pON1Nd6LOLwf" }, "source": [ "### Step 3: Generate Campaign Ideas\n", "\n", "Now that we've decided which product to promote and the target discount amount, let's generate some campaign ideas with Gemini." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1_f0Mn3vXPCm" }, "outputs": [], "source": [ "# Get product details for the product(s) we will be promoting\n", "sql = f\"\"\"SELECT * FROM `${project_id}.${bigquery_chocolate_ai_dataset}.menu`\n", "WHERE menu_name = '{product_to_promote}'\n", "\"\"\"\n", "\n", "product_details = RunQuery(sql)\n", "product_details = product_details.iloc[0]\n", "\n", "product_name = product_details['menu_name']\n", "product_description = product_details['menu_description']\n", "menu_id = product_details['menu_id']" ] }, { "cell_type": "markdown", "metadata": { "id": "--Am44LQXR0Q" }, "source": [ "Run the cell below as many times as you'd like until you're happy with the recommended campaign."
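] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: the generation cell assumes the model's response parses as JSON containing the keys `campaign_name`, `campaign_description`, `target_audience`, `campaign_outcomes`, `explanation`, and `marketing_channels`. LLM output is not guaranteed to match that schema, so a small validation helper (a sketch; the helper name and key list are assumptions, not part of the original workflow) can guard the rendering step:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hypothetical helper (sketch): check that a generated campaign JSON string\n", "# contains every key the rendering cell expects before using it.\n", "# Assumes the json module is already imported by the notebook setup.\n", "EXPECTED_CAMPAIGN_KEYS = {\n", "    \"campaign_name\", \"campaign_description\", \"target_audience\",\n", "    \"campaign_outcomes\", \"explanation\", \"marketing_channels\"\n", "}\n", "\n", "def ValidateCampaignJson(result):\n", "  \"\"\"Returns the parsed dict if it has the expected keys, otherwise None.\"\"\"\n", "  try:\n", "    parsed = json.loads(result)\n", "  except json.JSONDecodeError:\n", "    return None\n", "  if not isinstance(parsed, dict) or not EXPECTED_CAMPAIGN_KEYS.issubset(parsed):\n", "    return None\n", "  return parsed"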
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "SYdgrzyUutdb" }, "outputs": [], "source": [ "# Define goal and budget\n", "campaign_goal = f\"Increase sales of {product_to_promote} by {target_sales_increase}%.\"\n", "budget = 10000\n", "\n", "# Generate campaign recommendations for low-performing products\n", "result = GenerateCampaignRecommendations(campaign_goal, product_name, product_description, budget)\n", "#PrettyPrintJson(result)\n", "\n", "json_result = json.loads(result)\n", "result_string = f\"\"\"\n", "## Suggested campaign for {product_to_promote}\n", "\n", "**Campaign Name:** {json_result[\"campaign_name\"]}\n", "\n", "**Campaign Description:** {json_result[\"campaign_description\"]}\n", "\n", "**Target Audience:** {json_result[\"target_audience\"]}\n", "\n", "**Predicted Campaign Outcomes:**\n", "{json_result[\"campaign_outcomes\"]}\n", "\n", "**Reasoning for Campaign Choice:**\n", "{json_result[\"explanation\"]}\n", "\n", "**Suggested Marketing Channels:**\n", "{json_result[\"marketing_channels\"]}\n", "\"\"\"\n", "\n", "DisplayMarkdown(result_string)" ] }, { "cell_type": "markdown", "metadata": { "id": "emvzU_tHZJAv" }, "source": [ "Run the cell below to save the campaign to the database."
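] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The save cell interpolates generated text directly into the `INSERT` statement, relying on `json.dumps()` for quoting. An alternative sketch (assuming the `google-cloud-bigquery` client library is installed and authenticated in this environment; the helper name and trimmed column list are illustrative, not part of the original workflow) passes the values as query parameters so BigQuery handles all quoting:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch (assumption: google-cloud-bigquery is available and authenticated).\n", "# Query parameters let BigQuery quote generated text safely, instead of\n", "# escaping it ourselves with json.dumps().\n", "from google.cloud import bigquery\n", "\n", "def InsertCampaignWithParams(project, dataset, campaign):\n", "  client = bigquery.Client(project=project)\n", "  sql = f\"\"\"INSERT INTO `{project}.{dataset}.campaign` (campaign_id, campaign_name, campaign_description)\n", "  VALUES (GENERATE_UUID(), @campaign_name, @campaign_description)\"\"\"\n", "  job_config = bigquery.QueryJobConfig(query_parameters=[\n", "      bigquery.ScalarQueryParameter(\"campaign_name\", \"STRING\", campaign[\"campaign_name\"]),\n", "      bigquery.ScalarQueryParameter(\"campaign_description\", \"STRING\", campaign[\"campaign_description\"]),\n", "  ])\n", "  client.query(sql, job_config=job_config).result()"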
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "e1cYwvNcUsfo" }, "outputs": [], "source": [ "# Set start and end dates for campaign to activate it\n", "campaign_start_date = '2024-09-30'\n", "campaign_end_date = '2024-10-06'\n", "\n", "# Save campaign to database\n", "# Using json.dumps() to avoid unexpected quoting errors on insert\n", "campaign_name = json.dumps(json_result[\"campaign_name\"])\n", "campaign_description = json.dumps(json_result[\"campaign_description\"])\n", "campaign_outcomes = json.dumps(json_result[\"campaign_outcomes\"])\n", "target_audience = json.dumps(json_result[\"target_audience\"])\n", "marketing_channels = json.dumps(json_result[\"marketing_channels\"])\n", "explanation = json.dumps(json_result[\"explanation\"])\n", "campaign_goal = json.dumps(campaign_goal)\n", "\n", "sql=f\"\"\"\n", "INSERT INTO `${project_id}.${bigquery_chocolate_ai_dataset}.campaign`\n", "  (campaign_id, campaign_name, campaign_description, campaign_goal, target_audience, marketing_channels, budget, explanation, campaign_outcomes, menu_id, campaign_start_date, campaign_end_date, campaign_created_date)\n", "VALUES\n", "  (GENERATE_UUID(), {campaign_name}, {campaign_description}, {campaign_goal}, {target_audience}, {marketing_channels}, {budget}, {explanation}, {campaign_outcomes}, {menu_id}, '{campaign_start_date}', '{campaign_end_date}', CAST(CURRENT_DATETIME() AS STRING));\n", "\"\"\"\n", "\n", "RunQuery(sql)\n" ] }, { "cell_type": "markdown", "metadata": { "id": "pojNNIWUbVwf" }, "source": [ "### Step 4: Simulate Campaign Performance" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "_hqF97i1adqe" }, "outputs": [], "source": [ "# This cell uses GenerateCampaignPerformance() to simulate the possible outcomes of this campaign.\n", "sql = f\"\"\"SELECT * FROM `${project_id}.${bigquery_chocolate_ai_dataset}.campaign`\n", "WHERE campaign_name = {campaign_name}\n", "ORDER BY campaign_created_date DESC;\n", 
"\"\"\"\n", "\n", "campaign_table = RunQuery(sql)\n", "campaign_details = campaign_table.iloc[0]\n", "campaign_id = campaign_details['campaign_id']\n", "campaign_budget = campaign_details['budget']\n", "\n", "simulation_count = 10\n", "\n", "sim_campaign_performance = []\n", "for sim in range(simulation_count):\n", "  sim_campaign_performance.append([(row.campaign_id, *[v for v in GenerateCampaignPerformance(campaign_budget).values()]) for row in campaign_table.itertuples()])\n", "\n", "VisualizeSimulatedCampaignPerformance(sim_campaign_performance)" ] }, { "cell_type": "markdown", "metadata": { "id": "4jxW-P1LoyHT" }, "source": [ "### Step 5: Run Campaign and View Results" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wzJs1g3VokN7" }, "outputs": [], "source": [ "# Run one final simulation to mock an actual outcome, and save it to the database\n", "campaign_performance = []\n", "campaign_performance.append([(row.campaign_id, *[v for v in GenerateCampaignPerformance(campaign_budget).values()]) for row in campaign_table.itertuples()])\n", "\n", "# Unnest results\n", "campaign_performance = campaign_performance[0][0]\n", "\n", "# Show campaign performance\n", "VisualizeSingleCampaignPerformance(campaign_performance)\n", "\n", "# Save to database\n", "sql=f\"\"\"INSERT INTO `${project_id}.${bigquery_chocolate_ai_dataset}.campaign_performance` (\n", "  campaign_id, impressions, reach, website_traffic, leads_generated, conversions, cost_per_click, cost_per_acquisition, return_on_investment)\n", "  VALUES {str(campaign_performance)}\n", "\"\"\"\n", "\n", "result = RunQuery(sql)\n" ] }, { "cell_type": "markdown", "metadata": { "id": "ASQ2BPisXDA0" }, "source": [ "## <font color='#4285f4'>Reference Links</font>\n" ] }, { "cell_type": "markdown", "metadata": { "id": "MNr3FTda-Hgl" }, "source": [ "- [TimesFM: a pre-trained time-series foundation model developed by Google Research for time-series 
forecasting](https://github.com/google-research/timesfm)\n", "- [Generate content with the Gemini Enterprise API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference)\n", "- [Controlled Generation with Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/control-generated-output)" ] } ], "metadata": { "colab": { "name": "Create-Campaign-Recommendations", "private_outputs": true, "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }