sdk/python/foundation-models/nixtla/05_demand_forecasting.ipynb

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Prerequisites\n", "\n", "Please make sure to follow these steps to start using TimeGEN:\n", "\n", "* Register for a valid Azure account with an active subscription\n", "* Make sure you have access to [Azure AI Studio](https://learn.microsoft.com/en-us/azure/ai-studio/what-is-ai-studio?tabs=home)\n", "* Create a project and resource group\n", "* Select `TimeGEN-1`.\n", "\n", " > Notice that some models may not be available in all the regions in Azure AI and Azure Machine Learning. In those cases, you can create a workspace or project in the region where the models are available and then consume it with a connection from a different one. To learn more about using connections, see [Consume models with connections](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deployments-connections)\n", "\n", "* Deploy with \"Pay-as-you-go\"\n", "\n", "Once the deployment succeeds, you will be assigned an API endpoint and a security key for inference.\n", "\n", "To complete this tutorial, you will need to:\n", "\n", "* Install `nixtla` and `pandas`:\n", "\n", "    ```bash\n", "    pip install nixtla pandas\n", "    ```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Forecasting Demand" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this tutorial, we show how to use TimeGEN on an intermittent series with many values at zero. Here, we use a subset of the M5 dataset that tracks the demand for food items in a Californian store. The dataset also includes exogenous variables like the sell price and the type of event occurring on a particular day.\n", "\n", "TimeGEN achieves the best performance with an MAE of 0.49, which represents a **14% improvement** over the best statistical model specifically built to handle intermittent time series data.\n", "\n", "To complete this tutorial, you will need to:\n", "\n", "* Install `nixtla`, `pandas`, `numpy`, `utilsforecast`, and `statsforecast`:\n", "\n", "    ```bash\n", "    pip install nixtla pandas numpy utilsforecast statsforecast\n", "    ```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initial setup\n", "\n", "We start off by importing the required packages for this tutorial and creating an instance of `NixtlaClient`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "import pandas as pd\n", "import numpy as np\n", "\n", "from nixtla import NixtlaClient\n", "\n", "from utilsforecast.losses import mae\n", "from utilsforecast.evaluation import evaluate" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nixtla_client = NixtlaClient(\n", "    base_url=\"your Azure AI endpoint\",\n", "    api_key=\"your api_key\",\n", ")" ] }
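, { "cell_type": "markdown", "metadata": {}, "source": [ "Rather than hard-coding credentials in the notebook, you can read them from environment variables. The cell below is an optional sketch of the same `NixtlaClient` setup; the variable names are hypothetical, and if they are not set the placeholder values above are used." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# Optional: read the endpoint and key from environment variables instead of\n", "# hard-coding them (the variable names here are hypothetical examples).\n", "azure_ai_endpoint = os.environ.get(\"AZURE_AI_TIMEGEN_ENDPOINT\", \"your Azure AI endpoint\")\n", "azure_ai_key = os.environ.get(\"AZURE_AI_TIMEGEN_KEY\", \"your api_key\")\n", "\n", "nixtla_client = NixtlaClient(base_url=azure_ai_endpoint, api_key=azure_ai_key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now read the dataset and plot it."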
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv(\n", "    \"https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/m5_sales_exog_small.csv\"\n", ")\n", "df[\"ds\"] = pd.to_datetime(df[\"ds\"])\n", "\n", "df.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nixtla_client.plot(\n", "    df,\n", "    max_insample_length=365,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the figure above, we can see the intermittent nature of this dataset, with many periods of zero demand.\n", "\n", "Now, let's use TimeGEN to forecast the demand of each product." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bounded forecasts\n", "\n", "To avoid getting negative predictions from the model, we use a log transformation on the data. That way, the model is forced to predict only positive values.\n", "\n", "Note that due to the presence of zeros in our dataset, we add one to all points before taking the log." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_transformed = df.copy()\n", "\n", "# Add 1 before taking the log so that zero-demand days remain defined\n", "df_transformed[\"y\"] = np.log(df_transformed[\"y\"] + 1)\n", "\n", "df_transformed.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's keep the last 28 time steps of each series for the test set and use the rest as input to the model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "test_df = df_transformed.groupby(\"unique_id\").tail(28)\n", "\n", "input_df = df_transformed.drop(test_df.index).reset_index(drop=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Forecasting with TimeGEN" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "start = time.time()\n", "\n", "fcst_df = nixtla_client.forecast(\n", "    df=input_df,\n", "    h=28,\n", "    level=[80],  # Generate an 80% confidence interval\n", "    finetune_steps=10,  # Specify the number of steps for fine-tuning\n", "    finetune_loss=\"mae\",  # Use the MAE as the loss function for fine-tuning\n", "    time_col=\"ds\",\n", "    target_col=\"y\",\n", "    id_col=\"unique_id\",\n", ")\n", "\n", "end = time.time()\n", "\n", "TimeGEN_duration = end - start\n", "\n", "print(f\"Time (TimeGEN): {TimeGEN_duration}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! TimeGEN was done in **5.8 seconds** and we now have predictions. However, those predictions are transformed, so we need to invert the transformation to get back to the original scale. Therefore, we take the exponential and subtract one from each data point." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cols = [col for col in fcst_df.columns if col not in [\"ds\", \"unique_id\"]]\n", "\n", "for col in cols:\n", "    fcst_df[col] = np.exp(fcst_df[col]) - 1\n", "\n", "fcst_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluation" ] }
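, { "cell_type": "markdown", "metadata": {}, "source": [ "For reference, the mean absolute error (MAE) used throughout this tutorial is the average absolute difference between the observed demand $y_t$ and the forecast $\\hat{y}_t$ over the $H$ forecast steps: $\\mathrm{MAE} = \\frac{1}{H}\\sum_{t=1}^{H} |y_t - \\hat{y}_t|$. The `evaluate` function from `utilsforecast` computes this per series and model; the cell below is only a minimal illustration with made-up numbers, not part of the original workflow." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Minimal sketch of the MAE computation (hypothetical helper)\n", "def manual_mae(actuals: pd.Series, forecasts: pd.Series) -> float:\n", "    return (actuals - forecasts).abs().mean()\n", "\n", "\n", "# Illustrative example with made-up numbers\n", "manual_mae(pd.Series([0.0, 2.0, 1.0]), pd.Series([0.5, 1.5, 1.0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before measuring the performance metric, let's plot the predictions against the actual values."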
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nixtla_client.plot(\n", "    test_df, fcst_df, models=[\"TimeGPT\"], level=[80], time_col=\"ds\", target_col=\"y\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can measure the mean absolute error (MAE) of the model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fcst_df[\"ds\"] = pd.to_datetime(fcst_df[\"ds\"])\n", "\n", "# Left join the TimeGEN forecasts onto the test set on series id and date\n", "test_df = pd.merge(test_df, fcst_df, \"left\", [\"unique_id\", \"ds\"])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "evaluation = evaluate(\n", "    test_df, metrics=[mae], models=[\"TimeGPT\"], target_col=\"y\", id_col=\"unique_id\"\n", ")\n", "\n", "average_metrics = evaluation.groupby(\"metric\")[\"TimeGPT\"].mean()\n", "average_metrics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Forecasting with statistical models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `statsforecast` library by Nixtla provides a suite of statistical models specifically built for intermittent forecasting, such as Croston, IMAPA, and TSB. Let's use these models and see how they perform against TimeGEN." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from statsforecast import StatsForecast\n", "from statsforecast.models import CrostonClassic, CrostonOptimized, IMAPA, TSB" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we use four models: two versions of Croston, IMAPA, and TSB." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "models = [CrostonClassic(), CrostonOptimized(), IMAPA(), TSB(0.1, 0.1)]\n", "\n", "sf = StatsForecast(models=models, freq=\"D\", n_jobs=-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, we fit the models on our data and generate predictions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "start = time.time()\n", "\n", "sf.fit(df=input_df)\n", "\n", "sf_preds = sf.predict(h=28)\n", "\n", "end = time.time()\n", "\n", "sf_duration = end - start\n", "\n", "print(f\"Statistical models took: {sf_duration}s\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, fitting and predicting with four statistical models took 5.2 seconds, while TimeGEN took 5.8 seconds, so TimeGEN was only 0.6 seconds slower." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, we need to invert the transformation. Remember that the training data was previously transformed using the log function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cols = [col for col in sf_preds.columns if col not in [\"ds\", \"unique_id\"]]\n", "\n", "for col in cols:\n", "    sf_preds[col] = np.exp(sf_preds[col]) - 1\n", "\n", "sf_preds.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluation" ] }
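, { "cell_type": "markdown", "metadata": {}, "source": [ "Depending on your `statsforecast` version, `predict` may return `unique_id` as the index rather than as a column, while the merge in the next cells needs it as a column. The defensive check below is a small sketch added for robustness, not part of the original workflow." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Some statsforecast versions return 'unique_id' as the index of the\n", "# predictions frame; make sure it is a regular column before merging.\n", "if \"unique_id\" not in sf_preds.columns:\n", "    sf_preds = sf_preds.reset_index()\n", "\n", "sf_preds.columns.tolist()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's combine the predictions from all methods and see which performs best."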
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "test_df = pd.merge(test_df, sf_preds, \"left\", [\"unique_id\", \"ds\"])\n", "test_df.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "evaluation = evaluate(\n", "    test_df,\n", "    metrics=[mae],\n", "    models=[\"TimeGPT\", \"CrostonClassic\", \"CrostonOptimized\", \"IMAPA\", \"TSB\"],\n", "    target_col=\"y\",\n", "    id_col=\"unique_id\",\n", ")\n", "\n", "average_metrics = evaluation.groupby(\"metric\")[\n", "    [\"TimeGPT\", \"CrostonClassic\", \"CrostonOptimized\", \"IMAPA\", \"TSB\"]\n", "].mean()\n", "average_metrics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the table above, we can see that TimeGEN achieves the lowest MAE, a 12.8% improvement over the best-performing statistical model.\n", "\n", "Note that this was done without using any of the available exogenous features. While the statistical models do not support them, let's try including them in TimeGEN." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Forecasting with exogenous variables using TimeGEN" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To forecast with exogenous variables, we need to specify their future values over the forecast horizon. Therefore, let's simply keep the event-type columns, as those dates are known in advance." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Keep only the future exogenous variables that are known in advance (event types)\n", "futr_exog_df = test_df.drop(\n", "    [\n", "        \"TimeGPT\",\n", "        \"CrostonClassic\",\n", "        \"CrostonOptimized\",\n", "        \"IMAPA\",\n", "        \"TSB\",\n", "        \"y\",\n", "        \"TimeGPT-lo-80\",\n", "        \"TimeGPT-hi-80\",\n", "        \"sell_price\",\n", "    ],\n", "    axis=1,\n", ")\n", "futr_exog_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, we simply call the `forecast` method and pass `futr_exog_df` to the `X_df` parameter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "start = time.time()\n", "\n", "fcst_df = nixtla_client.forecast(\n", "    df=input_df,\n", "    X_df=futr_exog_df,\n", "    h=28,\n", "    level=[80],  # Generate an 80% confidence interval\n", "    finetune_steps=10,  # Specify the number of steps for fine-tuning\n", "    finetune_loss=\"mae\",  # Use the MAE as the loss function for fine-tuning\n", "    time_col=\"ds\",\n", "    target_col=\"y\",\n", "    id_col=\"unique_id\",\n", ")\n", "\n", "end = time.time()\n", "\n", "TimeGEN_duration = end - start\n", "\n", "print(f\"Time (TimeGEN): {TimeGEN_duration}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! Remember that the predictions are transformed, so we have to invert the transformation again." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Rename the forecast column so the run with exogenous features gets its own name\n", "fcst_df.rename(\n", "    columns={\n", "        \"TimeGPT\": \"TimeGPT_ex\",\n", "    },\n", "    inplace=True,\n", ")\n", "\n", "cols = [col for col in fcst_df.columns if col not in [\"ds\", \"unique_id\"]]\n", "\n", "for col in cols:\n", "    fcst_df[col] = np.exp(fcst_df[col]) - 1\n", "\n", "fcst_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluation" ] }
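, { "cell_type": "markdown", "metadata": {}, "source": [ "The next cell attaches the new forecasts by position, which assumes `fcst_df` and `test_df` share the same row order. If you prefer not to rely on row order, merging on the key columns is a safer alternative; the cell below is an optional sketch of that approach, not part of the original workflow." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: align the exogenous-feature forecasts on series id and date\n", "# instead of relying on row order.\n", "fcst_df[\"ds\"] = pd.to_datetime(fcst_df[\"ds\"])\n", "test_df_merged = pd.merge(\n", "    test_df,\n", "    fcst_df[[\"unique_id\", \"ds\", \"TimeGPT_ex\"]],\n", "    how=\"left\",\n", "    on=[\"unique_id\", \"ds\"],\n", ")\n", "test_df_merged.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's evaluate the performance of TimeGEN with exogenous features."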
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Attach the exogenous-feature forecasts to the test set by position\n", "test_df[\"TimeGPT_ex\"] = fcst_df[\"TimeGPT_ex\"].values\n", "test_df.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "evaluation = evaluate(\n", "    test_df,\n", "    metrics=[mae],\n", "    models=[\n", "        \"TimeGPT\",\n", "        \"CrostonClassic\",\n", "        \"CrostonOptimized\",\n", "        \"IMAPA\",\n", "        \"TSB\",\n", "        \"TimeGPT_ex\",\n", "    ],\n", "    target_col=\"y\",\n", "    id_col=\"unique_id\",\n", ")\n", "\n", "average_metrics = evaluation.groupby(\"metric\")[\n", "    [\"TimeGPT\", \"CrostonClassic\", \"CrostonOptimized\", \"IMAPA\", \"TSB\", \"TimeGPT_ex\"]\n", "].mean()\n", "average_metrics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the table above, we can see that using exogenous features improved the performance of TimeGEN: it now represents a 14% improvement over the best statistical model.\n", "\n", "Using TimeGEN with exogenous features took 6.8 seconds. This is 1.6 seconds slower than the statistical models, but it resulted in much better predictions." ] } ], "metadata": { "kernelspec": { "display_name": "python3", "language": "python", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 4 }