notebooks/additional

{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "block_hidden": true }, "outputs": [], "source": [ "%load_ext rpy2.ipython\n", "%matplotlib inline\n", "import pandas as pd\n", "import numpy as np\n", "from prophet import Prophet\n", "import logging\n", "logging.getLogger('prophet').setLevel(logging.ERROR)\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "block_hidden": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:numexpr.utils:NumExpr defaulting to 8 threads.\n" ] }, { "data": { "text/plain": [ "<prophet.forecaster.Prophet at 0x7f578ce95760>" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame({\n", " 'ds': pd.date_range(start='2020-01-01', periods=20),\n", " 'y': np.arange(20),\n", "})\n", "m = Prophet(weekly_seasonality=False)\n", "m.fit(df)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "block_hidden": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "R[write to console]: Loading required package: Rcpp\n", "\n", "R[write to console]: Loading required package: rlang\n", "\n", "R[write to console]: Disabling yearly seasonality. Run prophet with yearly.seasonality=TRUE to override this.\n", "\n", "R[write to console]: Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.\n", "\n", "R[write to console]: n.changepoints greater than number of observations. 
Using 15\n", "\n" ] } ], "source": [ "%%R\n", "library(prophet)\n", "df <- data.frame(\n", " ds=seq(as.Date(\"2020-01-01\"), by = \"day\", length.out = 20),\n", " y=1:20\n", ")\n", "m <- prophet(df, weekly.seasonality=FALSE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Saving models\n", "\n", "It is possible to save fitted Prophet models so that they can be loaded and used later.\n", "\n", "In R, this is done with `saveRDS` and `readRDS`:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "%%R\n", "saveRDS(m, file=\"model.RDS\") # Save model\n", "m <- readRDS(file=\"model.RDS\") # Load model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In Python, models should not be saved with pickle; the Stan backend attached to the model object will not pickle well, and will produce issues under certain versions of Python. Instead, you should use the built-in serialization functions to serialize the model to json:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import json\n", "from prophet.serialize import model_to_json, model_from_json\n", "\n", "with open('serialized_model.json', 'w') as fout:\n", " json.dump(model_to_json(m), fout) # Save model\n", "\n", "with open('serialized_model.json', 'r') as fin:\n", " m = model_from_json(json.load(fin)) # Load model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The json file will be portable across systems, and deserialization is backwards compatible with older versions of prophet." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Flat trend and custom trends\n", "\n", "For time series that exhibit strong seasonality patterns rather than trend changes, it may be useful to force the trend growth rate to be flat. 
This can be achieved by passing `growth='flat'` when creating the model:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "output_hidden": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "R[write to console]: Disabling yearly seasonality. Run prophet with yearly.seasonality=TRUE to override this.\n", "\n", "R[write to console]: Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.\n", "\n", "R[write to console]: n.changepoints greater than number of observations. Using 15\n", "\n" ] } ], "source": [ "%%R\n", "m <- prophet(df, growth='flat')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "m = Prophet(growth='flat')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that if this is used on a time series that doesn't have a constant trend, any trend will be fit with the noise term and so there will be high predictive uncertainty in the forecast.\n", "\n", "To use a trend besides these three built-in trend functions (piecewise linear, piecewise logistic growth, and flat), you can download the source code from GitHub, modify the trend function as desired in a local branch, and then install that local version. [This PR](https://github.com/facebook/prophet/pull/1466/files) provides a good illustration of what must be done to implement a custom trend, as does [this one](https://github.com/facebook/prophet/pull/1794) that implements a step function trend and [this one](https://github.com/facebook/prophet/pull/1778) for a new trend in R." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Updating fitted models\n", "\n", "A common setting for forecasting is fitting models that need to be updated as additional data come in. Prophet models can only be fit once, so a new model must be fit when new data become available. 
In most settings, model fitting is fast enough that there isn't any issue with re-fitting from scratch. However, it is possible to speed things up by warm-starting the fit from the model parameters of the earlier model. This code example shows how this can be done in Python:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.33 s ± 55.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n", "185 ms ± 4.46 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n" ] } ], "source": [ "def stan_init(m):\n", "    \"\"\"Retrieve parameters from a trained model.\n", "    \n", "    Retrieve parameters from a trained model in the format\n", "    used to initialize a new Stan model.\n", "    \n", "    Parameters\n", "    ----------\n", "    m: A trained model of the Prophet class.\n", "    \n", "    Returns\n", "    -------\n", "    A dictionary containing retrieved parameters of m.\n", "    \n", "    \"\"\"\n", "    res = {}\n", "    for pname in ['k', 'm', 'sigma_obs']:\n", "        res[pname] = m.params[pname][0][0]\n", "    for pname in ['delta', 'beta']:\n", "        res[pname] = m.params[pname][0]\n", "    return res\n", "\n", "df = pd.read_csv('../examples/example_wp_log_peyton_manning.csv')\n", "df1 = df.loc[df['ds'] < '2016-01-19', :] # All data except the last day\n", "m1 = Prophet().fit(df1) # A model fit to all data except the last day\n", "\n", "\n", "%timeit m2 = Prophet().fit(df) # Adding the last day, fitting from scratch\n", "%timeit m2 = Prophet().fit(df, init=stan_init(m1)) # Adding the last day, warm-starting from m1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As can be seen, the parameters from the previous model are passed in to the fitting for the next with the kwarg `init`. In this case, model fitting was about 7x faster when using warm-starting (1.33 s versus 185 ms). 
The speedup will generally depend on how much the optimal model parameters have changed with the addition of the new data.\n", "\n", "There are a few caveats that should be kept in mind when considering warm-starting. First, warm-starting may work well for small updates to the data (like the addition of one day in the example above) but can be worse than fitting from scratch if there are large changes to the data (e.g., many days have been added). This is because when a large amount of history is added, the locations of the changepoints will be very different between the two models, and so the parameters from the previous model may actually produce a bad trend initialization. Second, as a detail, the number of changepoints needs to be consistent from one model to the next, or else an error will be raised because the changepoint prior parameter `delta` will be the wrong size." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### External references\n", "These GitHub repositories provide examples of building on top of Prophet in ways that may be of broad interest:\n", "* [forecastr](https://github.com/garethcull/forecastr): A web app that provides a UI for Prophet.\n", "* [NeuralProphet](https://github.com/ourownstory/neural_prophet): A Prophet-style model implemented in PyTorch, to be more adaptable and extensible." ] } ], "metadata": { "celltoolbar": "Edit Metadata", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 2 }

notebooks/additional_topics.ipynb (291 lines of code) (raw):