### <font color='#4285f4'>Overview</font>

TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for univariate time-series forecasting.

TimesFM is a state-of-the-art pretrained forecasting model (not a language model) that can forecast time-series data without any training on your own data. This notebook shows a basic example that you can adapt and incorporate into your own analytics. It is the best place to start with TimesFM before diving into the other notebooks.

Process Flow:

1. Gather 2 weeks of sales data
2. Gather any dynamic covariates
    * a. Categorical - categories that affect our sales (e.g. day of the week, if a marketing campaign was in effect)
    * b. Numerical - numeric values that affect our sales (e.g. temperature)

    For any dynamic covariates we need to add 7 more values since we want to predict the next 7 days of sales. We will know if we are running a marketing campaign and we can check the weather forecast for the future temperature.
3. Gather any static covariates
    * a. Categorical - categories that affect our sales (e.g. the menu item we are selling)
    * b. Numerical - numeric values that affect our sales (e.g. the price of the menu item)
4. Specify the frequency
    * a. 0 for high frequency (default), 1 for medium, and 2 for low.
5. Load the model
6. Run the prediction

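The length requirements in the process flow can be checked before any request is sent. The following is a minimal sketch, assuming 14 days of history and a 7-day horizon as in this notebook. `check_forecast_inputs` is a hypothetical helper written for this example; it is not part of TimesFM.

```python
def check_forecast_inputs(history, dynamic_covariates, horizon, freq):
    """Validate that each dynamic covariate covers history + horizon (hypothetical helper)."""
    expected = len(history) + horizon
    for name, values in dynamic_covariates.items():
        if len(values) != expected:
            raise ValueError(
                f"{name}: expected {expected} values "
                f"({len(history)} past + {horizon} future), got {len(values)}"
            )
    if freq not in (0, 1, 2):
        raise ValueError(f"freq must be 0, 1, or 2, got {freq}")
    return expected


history = [100, 105, 125, 133, 145, 107, 156, 101, 106, 105, 105, 104, 136, 165]
covariates = {
    "day_of_week": [1, 2, 3, 4, 5, 6, 7] * 3,  # 14 past days + 7 future days
    "temperature": [90] * 21,                  # placeholder values for the sketch
}
n_values = check_forecast_inputs(history, covariates, horizon=7, freq=0)
print(n_values)
```

A covariate with only 14 values would raise a `ValueError`, which is usually easier to debug than an error returned by the endpoint.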
Notes:

- TimesFM allows you to perform time-series forecasting without training a model.
- Run this on an e2-standard-8 machine with 250 GB of disk.
- A GPU is **not** required for testing purposes in this notebook.
- You can deploy TimesFM via the Vertex Model Garden **utilizing a GPU**.

Cost:
* Low: Using TimesFM locally
* Medium: Remember to stop your Colab Enterprise Notebook Runtime

Author:
* Adam Paternostro

**About TimesFM**
- View GitHub repository: [Link](https://github.com/google-research/timesfm/)
- TimesFM is a state-of-the-art time-series foundation model. It uses a decoder-only transformer, the same family of architecture used by large language models, but it is pretrained on time-series data rather than text.
- TimesFM supports univariate time-series forecasting
  - This is like trying to predict what the temperature will be tomorrow, based only on the past temperature data. We're not looking at other factors like rainfall or humidity, just the temperature itself. It's like saying, "Based on how the temperature has changed in the past, what's my best guess for tomorrow?"
- TimesFM also supports covariates (multivariate inputs)
  - Now imagine we want to improve our predictions by considering other factors that might influence the temperature, like rainfall or humidity. That's where covariate/multivariate support comes in. It's like saying, "Okay, I know past temperatures are important, but what if I also looked at past rainfall and humidity to make my prediction even better?"
    - Covariates: These are the additional factors (like rainfall and humidity) that we think might influence the thing we're trying to predict. They're like sidekicks helping us make a more informed guess.
    - Multivariate: This just means we're now dealing with multiple variables (temperature, rainfall, humidity) instead of just one.
  So with covariate/multivariate support, our forecasting model gets smarter. It can learn how changes in rainfall or humidity tend to affect the temperature, and use that information to make more accurate predictions. It's like having a team of experts working together to solve a puzzle, instead of just one person trying to figure it out alone.
  - Notebook with covariates ([link](https://github.com/google-research/timesfm/blob/master/notebooks/covariates.ipynb))
- Decoder-only patched-transformer architecture
  - This is getting into the technical nuts and bolts of how the model is built.
  - "Transformer" is a type of neural network architecture that became popular for language tasks and is used here for time series.
  - "Decoder-only" means the model generates its output step by step, with each new prediction conditioned only on the values that came before it.
  - "Patched" means the time series is split into patches (groups of consecutive time points), and each patch is treated like a token. This notebook's configuration uses an input patch length of 32 and an output patch length of 128.
- Can handle different context and horizon length
  - "Context" is the history of past values the model uses to make predictions. The model can work with varying amounts of history (this notebook sets a context length of 512).
  - "Horizon length" is how far into the future the model is trying to predict. This design can handle both short-term and longer-term predictions.
- Fast inference due to patching
  - "Inference" is the process of using the model to make predictions.
  - Patching helps make this process faster: the model emits a whole output patch per step instead of one time point at a time, so fewer steps are needed. This is important for real-world applications where you need quick responses.

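As a rough illustration of patching, the sketch below cuts a series into fixed-length patches and counts how many output patches are needed to cover a horizon. It is not the TimesFM implementation: `split_into_patches` and `decode_steps` are hypothetical helpers, and left-padding with zeros is an assumption made only to keep the example simple.

```python
import math

import numpy as np


def split_into_patches(series, patch_len):
    """Left-pad with zeros to a multiple of patch_len, then reshape into rows of patches."""
    values = np.asarray(series, dtype=float)
    pad = (-len(values)) % patch_len
    padded = np.concatenate([np.zeros(pad), values])
    return padded.reshape(-1, patch_len)


def decode_steps(horizon_len, output_patch_len):
    """Number of output patches needed to cover the requested horizon."""
    return math.ceil(horizon_len / output_patch_len)


patches = split_into_patches(range(70), patch_len=32)
print(patches.shape)          # 70 points padded to 96, giving 3 patches of 32
print(decode_steps(7, 128))   # a 7-day horizon fits in one output patch
print(decode_steps(129, 128)) # one point past 128 needs a second output patch
```

Under these assumptions, any horizon up to the output patch length costs the same single step, which is why a 7-day and a 128-day forecast are comparable in compute.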

### <font color='#4285f4'>Video Walkthrough</font>

[![Video](https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/adam-paternostro-video.png)](https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/Campaign-Performance-Forecasting-TimesFM.mp4)


In [None]:
from IPython.display import HTML

HTML("""
<video width="800" height="600" controls>
  <source src="https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/Campaign-Performance-Forecasting-TimesFM.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>
""")

### <font color='#4285f4'>License</font>

```
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
```

### <font color='#4285f4'>Deploy TimesFM</font>

1. Open Vertex Model Garden
   -  https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/timesfm
2. Click the Deploy button
3. Select
   - Resource Id:  google/timesfm-v20240828
   - Model Name: (leave default - name does not matter)   
   - Endpoint name: (leave default - name does not matter)
   - Region: us-central1 (if you change you need to change the **Initialize** variables below)
   - Machine spec: (leave default - n1-standard-8)
4. Click Deploy
5. Wait 20 minutes
6. Open Vertex Model Registry
   - https://console.cloud.google.com/vertex-ai/models
7. Click on the model name
8. Click on the model name under "Deploy your model"
9. Click on "Sample Request" (at the top)
10. Copy the ```ENDPOINT_ID="000000000000000000"```
11. Update the variable endpoint_id in the **Initialize** code below.

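Forgetting to replace the placeholder endpoint ID is an easy mistake. The check below is a minimal sketch, assuming the all-digit ID that the console shows in the sample request; `is_endpoint_id_set` is a hypothetical helper, and endpoints created with a custom ID would need a different check.

```python
def is_endpoint_id_set(endpoint_id, placeholder="000000000000000000"):
    """Return True when the ID is all digits and is not the notebook's placeholder."""
    return endpoint_id.isdigit() and endpoint_id != placeholder


print(is_endpoint_id_set("000000000000000000"))   # placeholder, still needs to be set
print(is_endpoint_id_set("1234567890123456789"))  # made-up ID used only for illustration
```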


##### TimesFM Deployment Video

[![TimesFM Deployment Video](https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/adam-paternostro-video.png)](https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/Campaign-Performance-Forecasting-TimesFM-Install.mp4)

In [None]:
from IPython.display import HTML

HTML("""
<h2>Deploying TimesFM to a Vertex AI Endpoint Instructions</h2>
<video width="800" height="600" controls>
  <source src="https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Videos/Campaign-Performance-Forecasting-TimesFM-Install.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>
""")

### <font color='#4285f4'>Pip installs</font>

In [None]:
# PIP Installs
import sys

# https://PLACEHOLDER.com/index.html

# For better performance and production, deploy to Vertex AI endpoint with GPU
# This takes about 5 minutes to install and you will need to reset your runtime
# !{sys.executable} -m pip install timesfm <- THERE ARE TOO MANY DEPENDENCIES TO EASILY DO THIS IN COLAB

### <font color='#4285f4'>Initialize</font>

In [None]:
import IPython.display
import google.auth
import requests
import json
import uuid
import base64
import os
import cv2
import random
import time
import datetime

In [None]:
# Set these (run this cell to verify the output)

endpoint_id="000000000000000000"  # <- YOU MUST SET THIS !!!!

bigquery_location = "${bigquery_location}"
region = "${region}"
location = "${location}"
storage_account = "${chocolate_ai_bucket}"

# Get the current date and time
now = datetime.datetime.now()

# Format the date and time as desired
formatted_date = now.strftime("%Y-%m-%d-%H-%M")


# Get some values using gcloud
project_id = !(gcloud config get-value project)
user = !(gcloud auth list --filter=status:ACTIVE --format="value(account)")

if len(project_id) != 1:
  raise RuntimeError(f"project_id is not set: {project_id}")
project_id = project_id[0]

if len(user) != 1:
  raise RuntimeError(f"user is not set: {user}")
user = user[0]

print(f"project_id = {project_id}")
print(f"user = {user}")

### <font color='#4285f4'>Helper Methods</font>

#### restAPIHelper
Calls the Google Cloud REST API using the current user's credentials.

In [None]:
def restAPIHelper(url: str, http_verb: str, request_body: dict) -> dict:
  """Calls the Google Cloud REST API passing in the current user's credentials"""

  import requests
  import google.auth
  import google.auth.transport.requests  # required for the Request() call below
  import json

  # Get an access token based upon the current user
  creds, project = google.auth.default()
  auth_req = google.auth.transport.requests.Request()
  creds.refresh(auth_req)
  access_token=creds.token

  headers = {
    "Content-Type" : "application/json",
    "Authorization" : "Bearer " + access_token
  }

  # request_body is a dict (or None) and is serialized to JSON by requests
  if http_verb == "GET":
    response = requests.get(url, headers=headers)
  elif http_verb == "POST":
    response = requests.post(url, json=request_body, headers=headers)
  elif http_verb == "PUT":
    response = requests.put(url, json=request_body, headers=headers)
  elif http_verb == "PATCH":
    response = requests.patch(url, json=request_body, headers=headers)
  elif http_verb == "DELETE":
    response = requests.delete(url, headers=headers)
  else:
    raise RuntimeError(f"Unknown HTTP verb: {http_verb}")

  if response.status_code == 200:
    return json.loads(response.content)
  else:
    error = f"Error restAPIHelper -> Status: '{response.status_code}' Text: '{response.text}'"
    raise RuntimeError(error)

#### timesFMInference
Calls TimesFM Vertex Model Endpoint

In [None]:
def timesFMInference(project_number, endpoint_id, payload):
  """Sends the payload to the TimesFM endpoint (uses the global `location` set in Initialize)"""
  url = f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_number}/locations/{location}/endpoints/{endpoint_id}:predict"
  # print(f"url: {url}")
  response = restAPIHelper(url, http_verb="POST", request_body=payload)
  # print(f"response: {response}")
  return response

#### getProjectNumber
Gets the project number from a project id

In [None]:
def getProjectNumber(project_id):
  """Gets the project number for a project id"""

  url = f"https://cloudresourcemanager.googleapis.com/v1/projects/{project_id}"
  json_result = restAPIHelper(url, "GET", None)
  print(f"getProjectNumber (GET) json_result: {json_result}")

  project_number = json_result["projectNumber"]
  return project_number

### <font color='#4285f4'>Tutorial: Sales Forecast (Marketing Campaign, Day of Week, Temperature)</font>

In [None]:
# We want to predict our chocolate ai sales based upon past sales, past marketing campaigns and the temperature.

# Let's view our example data.
# We have 2 weeks of existing sales data (columns C through P)
#   For the sales data we know:
#      1. If a marketing campaign was taking place
#      2. The day of the week (maybe weekends are busier?)
#      3. The temperature (maybe we sell more on cold days than hot?)
# We have the price and item name which are "static"

# We want to predict the next 1 week of sales data
#   We need to provide if we will be running a marketing campaign, the temperature (so get the next week's weather data) and the day of the week
#   We can then run our prediction

from IPython.display import Image
Image(url='https://storage.googleapis.com/data-analytics-golden-demo/chocolate-ai/v1/Artifacts/TimesFM.png', width=1200)

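The day-of-week covariate has to cover both the 14 known days and the 7 days being predicted. It can be generated rather than typed by hand. This is a minimal sketch: `day_of_week_covariate` is a hypothetical helper, the start date is made up, and the Monday=1 through Sunday=7 numbering is an assumption (the numeric coding is arbitrary as long as it is consistent).

```python
import datetime


def day_of_week_covariate(start_date, context_days, horizon_days):
    """ISO weekday (Monday=1 ... Sunday=7) for each past and future day."""
    total = context_days + horizon_days
    return [
        (start_date + datetime.timedelta(days=offset)).isoweekday()
        for offset in range(total)
    ]


# Hypothetical start date; 2024-01-01 is a Monday.
dow_values = day_of_week_covariate(
    datetime.date(2024, 1, 1), context_days=14, horizon_days=7
)
print(dow_values)
```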
### <font color='#4285f4'>Configure TimesFM</font>

In [None]:
# import timesfm

In [None]:
context_len = 512
horizon_len = 7 # Predict next 7 days. This could be up to 128 (the output patch length) without requiring extra compute; 129 would be a step up. This is more of the max horizon len.
input_patch_len = 32
output_patch_len = 128
num_layers = 20
model_dims = 1280
timesfm_backend = "cpu" # cpu, gpu or cuda
xreg_mode = "xreg + timesfm"

# from jax._src import config
# config.update("jax_platforms", timesfm_backend)

# model = timesfm.TimesFm(
#     context_len=context_len,
#     horizon_len=horizon_len,
#     input_patch_len=input_patch_len,
#     output_patch_len=output_patch_len,
#     num_layers=num_layers,
#     model_dims=model_dims,
#     backend=timesfm_backend,
# )

# Load the model
# This can produce "ERROR:absl:For checkpoint version > 1.0, we require users to provide", you can ignore that
# model.load_from_checkpoint(repo_id="google/timesfm-1.0-200m")

### <font color='#4285f4'>Run the Prediction</font>

In [None]:
# This is our sales data, 2 weeks of data
# Array size: 14 elements in the inner array since we have 2 weeks of sales data
inputs = [[100,105,125,133,145,107,156,101,106,105,105,104,136,165]]

# These are our categorical covariates (additional factors that we think might influence the thing we're trying to predict).
# Here we consider the day of the week and if a marketing campaign was in progress
# Array size: 21 elements in the inner array since we have 2 weeks of sales data and are predicting 7 more days. We need to provide the future 7 days of data.
dynamic_categorical_covariates = {
    "day_of_week": [[1,2,3,4,5,6,7,1,2,3,4,5,6,7,1,2,3,4,5,6,7]],
    "marketing_campaign": [["N","N","Y","Y","Y","N","N","N","N","N","N","N","Y","N","N","Y","N","N","N","N","N"]]
}

# These are our numerical covariates (additional numeric factors, just like the categories, but numbers)
# Here we consider the temperature of the day
# Array size: 21 elements in the inner array since we have 2 weeks of sales data and are predicting 7 more days. We need to provide the future 7 days of data.
dynamic_numerical_covariates = {
    "temperature": [[90,90,90,90,90,90,100,90,90,90,90,90,90,100,90,90,90,90,90,100,90]]
}

# These are our static covariates (additional factors that we think are fixed, like the price of the product)
# Here we consider the price of the product
# Array size: 1 element since we have 1 static value per time series (it applies to the entire prediction)
static_numerical_covariates = {
    "price": [7.95]
}

# These are our static categorical covariates (additional factors that we think are fixed)
# Here we consider the menu item
# Array size: 1 element since we have 1 static value per time series (it applies to the entire prediction)
static_categorical_covariates = {
    "menu_item" : ["cafe-mocha"]
}

# frequency of each context time series. 0 for high frequency (default), 1 for medium, and 2 for low.
# Array size: 1 element since we have 1 frequency per time series
frequency = [0]

# model_forecast, xreg_forecast = model.forecast_with_covariates(
#     inputs=inputs,
#     dynamic_categorical_covariates=dynamic_categorical_covariates,
#     dynamic_numerical_covariates=dynamic_numerical_covariates,
#     static_numerical_covariates=static_numerical_covariates,
#     static_categorical_covariates=static_categorical_covariates,
#     freq=frequency,
#     xreg_mode="xreg + timesfm",              # default
#     ridge=0.0,
#     force_on_cpu=False,
#     normalize_xreg_target_per_input=True,    # default
# )

# See the next 7 days of forecasted values
# model_forecast[0]

In [None]:
# Create the JSON payload to the Vertex Model Endpoint
# You can have multiple array elements under "instances" and TimesFM will create a prediction for each
# The array elements from above are placed in the payload
payload = {
  "instances": [
    {
        "input": inputs[0],
        "freq": frequency[0],
        "horizon": horizon_len,
        "dynamic_numerical_covariates": {
            "temperature": dynamic_numerical_covariates["temperature"][0]
        },
        "dynamic_categorical_covariates": {
            "day_of_week": dynamic_categorical_covariates["day_of_week"][0],
            "marketing_campaign": dynamic_categorical_covariates["marketing_campaign"][0]
        },
        "static_numerical_covariates": {
            "price": static_numerical_covariates["price"][0]
        },
        "static_categorical_covariates": {
            "menu_item": static_categorical_covariates["menu_item"][0]
        },
        "xreg_kwargs": {
            "xreg_mode" : xreg_mode
        }
    }
  ]
}

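Because "instances" is an array, several series can be sent in one request. The sketch below builds such a batch using the key names from the payload in this notebook. `build_instance` is a hypothetical helper, the second series is made-up data, and it is an assumption that the endpoint accepts instances without covariates.

```python
import json


def build_instance(series, freq, horizon,
                   dynamic_numerical=None, dynamic_categorical=None):
    """Assemble one entry for the "instances" array (key names taken from this notebook)."""
    instance = {"input": list(series), "freq": freq, "horizon": horizon}
    if dynamic_numerical:
        instance["dynamic_numerical_covariates"] = dynamic_numerical
    if dynamic_categorical:
        instance["dynamic_categorical_covariates"] = dynamic_categorical
    return instance


series_a = [100, 105, 125, 133, 145, 107, 156, 101, 106, 105, 105, 104, 136, 165]
series_b = [80, 82, 90, 95, 99, 85, 110, 81, 83, 84, 86, 85, 101, 120]  # made-up values

batch_payload = {
    "instances": [
        build_instance(series_a, freq=0, horizon=7),
        build_instance(series_b, freq=0, horizon=7),
    ]
}

# Serializing confirms the payload contains only JSON-compatible types
print(len(json.dumps(batch_payload)))
```

Each instance should come back as its own entry under "predictions", in the same order as it was sent.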
In [None]:
# Get the project number in order to call the endpoint
project_number = getProjectNumber(project_id)

In [None]:
# Calls TimesFM to make a prediction
times_fm_inference = timesFMInference(project_number, endpoint_id, payload)

In [None]:
# Create an array of forecasted elements
model_forecast = [times_fm_inference["predictions"][0]["point_forecast"]]

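Indexing straight into the response raises a bare `KeyError` or `IndexError` if the endpoint returns something unexpected. A more defensive version might look like the sketch below. `extract_point_forecasts` is a hypothetical helper, and the sample response is made up to match the shape this notebook reads ("predictions", then "point_forecast").

```python
def extract_point_forecasts(response):
    """Return one point-forecast list per prediction, failing loudly on an unexpected shape."""
    predictions = response.get("predictions")
    if not predictions:
        raise ValueError(f"No predictions in response: {response}")
    forecasts = []
    for index, prediction in enumerate(predictions):
        if "point_forecast" not in prediction:
            raise ValueError(f"Prediction {index} has no 'point_forecast' key")
        forecasts.append(list(prediction["point_forecast"]))
    return forecasts


# Made-up response; the numbers are placeholders, not model output
sample_response = {"predictions": [{"point_forecast": [1.0, 2.0, 3.0]}]}
print(extract_point_forecasts(sample_response))
```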
In [None]:
# View our prediction
model_forecast

### <font color='#4285f4'>Visualize the results</font>

- The light blue bars are past known sales data
- The dark blue bars are the sales predicted by TimesFM
- The labels that are bolded are days we are running a marketing campaign
- The dashed yellow-green line is our temperature

**Prediction**
- We predict higher sales on the 16th and 20th
  - The 16th is a marketing campaign (the label is bolded)
  - The 20th is a high temperature day (the line spikes)

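The days called out above can be read directly from the covariate arrays used in this notebook. A minimal sketch, where `flagged_future_days` is a hypothetical helper and "hot" is assumed to mean a temperature above 90:

```python
def flagged_future_days(values, is_flagged, context_len):
    """Return 1-based day numbers in the forecast window where is_flagged(value) is true."""
    return [
        day
        for day, value in enumerate(values, start=1)
        if day > context_len and is_flagged(value)
    ]


campaign = ["N", "N", "Y", "Y", "Y", "N", "N", "N", "N", "N", "N",
            "N", "Y", "N", "N", "Y", "N", "N", "N", "N", "N"]
temps = [90, 90, 90, 90, 90, 90, 100, 90, 90, 90, 90,
         90, 90, 100, 90, 90, 90, 90, 90, 100, 90]

campaign_days = flagged_future_days(campaign, lambda v: v == "Y", context_len=14)
hot_days = flagged_future_days(temps, lambda v: v > 90, context_len=14)
print(campaign_days, hot_days)  # [16] [20]
```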
In [None]:
import matplotlib.pyplot as plt

# Data
sales = inputs[0]
predicted_sales = model_forecast[0]
marketing_campaigns = dynamic_categorical_covariates["marketing_campaign"]
temperature = dynamic_numerical_covariates["temperature"][0]

# Create x-axis values
all_days = list(range(1, 22))  # Days as integers from 1 to 21
days = list(range(1, 15))  # Adjust range if needed
days_predicted = list(range(15, 22))

# Create the plot
fig, ax1 = plt.subplots(figsize=(12, 6))

# Plot sales as a bar chart
ax1.bar(days, sales, color='#{:02x}{:02x}{:02x}'.format(92, 200, 243), label='Sales')

# Plot predicted sales as a bar chart
ax1.bar(days_predicted, predicted_sales, color='#{:02x}{:02x}{:02x}'.format(53, 106, 228), label='Predicted Sales')

# Set x-axis ticks and labels for all days
ax1.set_xticks(all_days)
ax1.set_xticklabels(all_days)

ax1.set_xlabel('Days (bold means marketing campaign)')
ax1.set_ylabel('Sales', color="black")
ax1.tick_params(axis='y', labelcolor='#{:02x}{:02x}{:02x}'.format(53, 106, 228))

# Create a second y-axis for temperature
ax2 = ax1.twinx()
ax2.plot(all_days, temperature, color='#{:02x}{:02x}{:02x}'.format(176, 202, 78), linestyle = '--', alpha = 1, label='Temperature')
ax2.set_ylabel('Temperature', color="black")
ax2.tick_params(axis='y', labelcolor='#{:02x}{:02x}{:02x}'.format(176, 202, 78))

# Add marketing campaign indicators (bold the "days" when we had a marketing campaign)
for i, campaign in enumerate(marketing_campaigns[0]):
  if campaign == 'Y':
    ax1.get_xticklabels()[i].set_weight('bold')

# Add legend
lines, labels = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax1.legend(lines + lines2, labels + labels2)

plt.title('Predicted Sales by TimesFM')
plt.show()

### <font color='#4285f4'>Clean Up</font>

**To save on costs:**

1. Open Vertex Model Registry
   - https://console.cloud.google.com/vertex-ai/models
2. Click on the model name
3. Under "Deploy your model" click the 3 dots and select "Undeploy Model"
4. Under "Deploy your model" click the 3 dots and select "Delete Endpoint"
5. Go back one screen and click the 3 dots and select "Delete Model"


### <font color='#4285f4'>Reference Links</font>

- [GitHub](https://github.com/google-research/timesfm)
- [Vertex AI Model Garden](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/timesfm)