- This article is a learn-by-doing introduction to Orbit (Object-Oriented Bayesian Time-Series), an open-source Python framework created by Uber for full Bayesian time series forecasting and inference.
- Time series models help Uber predict demand so they know where to send their drivers, forecast hardware and computation requirements so their servers don’t go down, and allocate billions of dollars in annual marketing budget.
- Currently, Orbit supports concrete implementations for the following models:
- Exponential Smoothing (ETS)
- Local Global Trend (LGT)
- Damped Local Trend (DLT)
- Kernel Time-based Regression (KTR)
- It also supports the following sampling and optimization methods for model estimation and inference:
- Markov-Chain Monte Carlo (MCMC) as a full sampling method
- Maximum a Posteriori (MAP) as a point estimate method
- Variational Inference (VI) as a hybrid-sampling method on an approximate distribution
- A notable feature of Orbit is its use of probabilistic modeling to capture the uncertainty inherent in time-series data. This lets users obtain probabilistic forecasts with credible intervals rather than point estimates alone.
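To make the idea of a probabilistic forecast concrete, the sketch below collapses a set of forecast draws into a median forecast and a 90% interval by taking percentiles. Everything here is an assumption for illustration: the draws are synthetic NumPy random numbers standing in for posterior predictive samples, and the helper `summarize_draws` is hypothetical, not part of Orbit.

```python
import numpy as np

def summarize_draws(draws, percentiles=(5, 50, 95)):
    """Collapse forecast draws of shape (n_draws, horizon) into percentile bands."""
    draws = np.asarray(draws, dtype=float)
    bands = np.percentile(draws, percentiles, axis=0)
    return {f"p{p}": band for p, band in zip(percentiles, bands)}

# Synthetic stand-in for posterior predictive samples:
# 2000 draws, 4 steps ahead, with uncertainty growing with the horizon.
rng = np.random.default_rng(0)
horizon = 4
draws = 100.0 + rng.normal(loc=0.0, scale=np.arange(1, horizon + 1), size=(2000, horizon))

summary = summarize_draws(draws)
interval_width = summary["p95"] - summary["p5"]
print(summary["p50"].round(1))   # median forecast per step
print(interval_width.round(1))   # width of the 90% band per step
```

The band widens with the horizon because the simulated draws spread out further ahead in time, which is the kind of behaviour a point forecast cannot show.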
- Installing from PyPI
!pip install orbit-ml
- Setting the working directory YOURPATH
import os
os.chdir('YOURPATH') # Set working directory
os.getcwd()
Insurance Claims
- Let’s look at the iclaims dataset, which contains the weekly initial claims for US unemployment benefits alongside a few related Google Trends queries.
- Basic imports
import pandas as pd
import numpy as np
import orbit
import matplotlib.pyplot as plt
from orbit.utils.dataset import load_iclaims
from orbit.diagnostics.plot import plot_predicted_data, plot_predicted_components
from orbit.utils.plot import get_orbit_style
plt.style.use(get_orbit_style())
from orbit.models import ETS
orbit.__version__
'1.1.4.2'
- Loading the log-transformed time-series data and making the train-test split (the last 52 weeks are held out for testing)
raw_df = load_iclaims(transform=True)
raw_df.dtypes
week datetime64[ns]
claims float64
trend.unemploy float64
trend.filling float64
trend.job float64
sp500 float64
vix float64
dtype: object
df = raw_df.copy()
test_size=52
train_df=df[:-test_size]
test_df=df[-test_size:]
- Training the ETS forecasting model
ets = ETS(
response_col='claims',
date_col='week',
seasonality=52,
seed=2020,
estimator='stan-mcmc',
)
ets.fit(train_df)
predicted_df = ets.predict(df=df, decompose=True)
_ = plot_predicted_data(training_actual_df=train_df,
predicted_df=predicted_df,
date_col='week',
actual_col='claims',
test_actual_df=test_df)

_ = plot_predicted_components(predicted_df=predicted_df, date_col='week')
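For intuition about what an ETS model estimates, here is a minimal NumPy sketch of the textbook level-plus-additive-seasonality smoothing recursion. It is an illustration under stated assumptions, not Orbit's Stan implementation: the function `ets_additive_filter`, its initialization, and the synthetic series are hypothetical, and the argument names `lev_sm` and `sea_sm` simply echo the smoothing-parameter names that show up in the model's posterior samples.

```python
import numpy as np

def ets_additive_filter(y, period, lev_sm, sea_sm):
    """One-step-ahead fitted values from level + additive seasonality smoothing."""
    y = np.asarray(y, dtype=float)
    level = y[:period].mean()              # initial level: mean of the first season
    season = list(y[:period] - level)      # initial seasonal offsets
    fitted = np.empty_like(y)
    for t, obs in enumerate(y):
        s = season[t % period]
        fitted[t] = level + s              # forecast made before seeing obs
        # smooth the level towards the deseasonalized observation
        level = lev_sm * (obs - s) + (1 - lev_sm) * level
        # smooth the seasonal offset towards the detrended observation
        season[t % period] = sea_sm * (obs - level) + (1 - sea_sm) * s
    return fitted, level, season

# Synthetic, perfectly periodic series (period 4) around a level of 10.
pattern = np.array([1.0, -1.0, 0.0, 0.0])
y = 10.0 + np.tile(pattern, 6)
fitted, level, season = ets_additive_filter(y, period=4, lev_sm=0.3, sea_sm=0.2)
print(np.abs(y - fitted).max())
```

On this noise-free series the recursion reproduces the data up to floating-point error; on real data the two smoothing parameters control how quickly the level and the seasonal pattern adapt.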

- Extracting the posterior samples produced by MCMC and analyzing them with ArviZ
posterior_samples = ets.get_posterior_samples()
posterior_samples.keys()
dict_keys(['l', 'lev_sm', 'obs_sigma', 's', 'sea_sm', 'loglk'])
import arviz as az
posterior_samples = ets.get_posterior_samples(permute=False)
# example from https://arviz-devs.github.io/arviz/index.html
az.style.use("arviz-darkgrid")
az.plot_pair(
posterior_samples,
var_names=["sea_sm", "lev_sm", "obs_sigma"],
kind="kde",
marginals=True,
textsize=15,
)
plt.show()

- Training the Local Global Trend (LGT) model
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import orbit
from orbit.models import LGT
from orbit.diagnostics.plot import plot_predicted_data
from orbit.diagnostics.plot import plot_predicted_components
from orbit.utils.dataset import load_iclaims
# load data
df = load_iclaims()
# define date and response column
date_col = 'week'
response_col = 'claims'
df.dtypes
test_size = 52
train_df = df[:-test_size]
test_df = df[-test_size:]
lgt = LGT(
response_col=response_col,
date_col=date_col,
estimator='stan-map',
seasonality=52,
seed=8888,
)
%%time
lgt.fit(df=train_df)
CPU times: total: 125 ms
Wall time: 41.4 s
predicted_df = lgt.predict(df=test_df)
_ = plot_predicted_data(training_actual_df=train_df, predicted_df=predicted_df,
date_col=date_col, actual_col=response_col,
test_actual_df=test_df, title='Prediction with LGTMAP Model')

- Training the Damped Local Trend (DLT) model
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import orbit
from orbit.models import DLT
from orbit.diagnostics.plot import plot_predicted_data,plot_predicted_components
from orbit.utils.dataset import load_iclaims
import warnings
warnings.filterwarnings('ignore')
print(orbit.__version__)
1.1.4.2
# load log-transformed data
df = load_iclaims()
train_df = df[df['week'] < '2017-01-01']
test_df = df[df['week'] >= '2017-01-01']
response_col = 'claims'
date_col = 'week'
regressor_col = ['trend.unemploy', 'trend.filling', 'trend.job']
dlt = DLT(
response_col=response_col,
regressor_col=regressor_col,
date_col=date_col,
seasonality=52,
prediction_percentiles=[5, 95],
)
dlt.fit(train_df)
- Plotting the DLT prediction vs train data
predicted_df = dlt.predict(df=train_df, decompose=True)
_ = plot_predicted_data(train_df, predicted_df,
date_col=dlt.date_col, actual_col=dlt.response_col)

- Plotting the DLT prediction vs test data

- Plotting DLT prediction, trend, seasonality, and regression
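The "damped" part of Damped Local Trend can be illustrated with the standard damped-trend forecast formula, in which each further step ahead adds a geometrically shrinking share of the slope. The sketch below is a generic textbook form, assumed here for illustration; it is not claimed to match Orbit's exact DLT parameterization, and `damped_trend_forecast` with its arguments is a hypothetical helper.

```python
import numpy as np

def damped_trend_forecast(level, slope, damping, horizon):
    """h-step forecasts: level + (phi + phi^2 + ... + phi^h) * slope."""
    steps = np.arange(1, horizon + 1)
    cumulative_damping = np.cumsum(damping ** steps)
    return level + cumulative_damping * slope

# damping=1.0 gives an ordinary linear trend; damping<1 flattens it out
undamped = damped_trend_forecast(level=100.0, slope=2.0, damping=1.0, horizon=5)
damped = damped_trend_forecast(level=100.0, slope=2.0, damping=0.8, horizon=5)
print(undamped)
print(damped.round(2))
```

With a damping factor below 1 the forecast levels off instead of growing without bound, approaching level + slope * damping / (1 - damping) at long horizons (108 in this example).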

Store Unit Sales
- Our next example deals with the real sales data made available by Favorita, a large Ecuadorian grocery chain.
- Preparing the input data
import orbit
from orbit.models import DLT
import pandas as pd
import numpy as np
import os
def wmape(y_true, y_pred):
    return np.abs(y_true - y_pred).sum() / np.abs(y_true).sum()
path = 'train.csv'
data = pd.read_csv(path, index_col='id', parse_dates=['date'])
data2 = data.loc[((data['store_nbr'] == 1)), ['date', 'unit_sales', 'onpromotion']]
# add a Dec 25 row for each year, copying the values from Dec 18 (one week earlier)
dec25 = []
for year in range(2013, 2017):
    dec18 = data2.loc[data2['date'] == f'{year}-12-18']
    dec25.append({'date': pd.Timestamp(f'{year}-12-25'),
                  'unit_sales': dec18['unit_sales'].values[0],
                  'onpromotion': dec18['onpromotion'].values[0]})
data2 = pd.concat([data2, pd.DataFrame(dec25)], ignore_index=True).sort_values('date')
train = data2.loc[data2['date'] < '2017-01-01']
valid = data2.loc[(data2['date'] >= '2017-01-01') & (data2['date'] < '2017-04-01')]
df_daily = train.set_index('date').resample('D')["unit_sales"].sum().to_frame()
df_daily.tail()
unit_sales
date
2016-12-27 12157.823
2016-12-28 12144.918
2016-12-29 10244.317
2016-12-30 13584.621
2016-12-31 10741.060
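The `wmape` helper defined above computes the weighted mean absolute percentage error: total absolute error divided by total absolute actuals. As a quick worked example, the sketch below applies the same formula to a handful of made-up numbers (not Favorita data); the `np.asarray` conversions are an addition so that plain lists also work.

```python
import numpy as np

def wmape(y_true, y_pred):
    """Weighted MAPE: total absolute error divided by total absolute actuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_true - y_pred).sum() / np.abs(y_true).sum()

# Made-up numbers for illustration only.
actual = np.array([100.0, 200.0, 0.0, 50.0])
forecast = np.array([110.0, 180.0, 5.0, 50.0])
score = wmape(actual, forecast)
print(round(score, 4))  # (10 + 20 + 5 + 0) / 350 = 0.1
```

Because the errors are pooled before dividing, a day with zero sales does not cause a division by zero, which an ordinary per-day MAPE would hit.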
- ETS model prediction
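import orbit
import matplotlib.pyplot as plt
from orbit.models import ETS
ets = ETS(date_col='date',
          response_col='unit_sales',
          seasonality=7,
          prediction_percentiles=[5, 95],
          seed=1)
# Orbit looks up date_col among the columns, so move 'date' out of the index
ets_df = df_daily.reset_index()
# the model has to be fitted before predict() is called
ets.fit(df=ets_df)
p = ets.predict(df=ets_df)
plt.figure(figsize=(15,6))
plt.plot(p['date'],p['prediction'])
plt.plot(p['date'],df_daily['unit_sales'])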

- Plotting the ETS model prediction with percentiles vs actual data
fig, ax = plt.subplots(1,1, figsize=(1280/96, 720/96))
ax.plot(p['date'], df_daily['unit_sales'], label='actual')
ax.plot(p['date'], p['prediction'], label='prediction')
ax.fill_between(p['date'], p['prediction_5'], p['prediction_95'], alpha=0.2, color='orange', label='prediction percentiles')
ax.set_title('ETS Model')
ax.set_ylabel('Sales')
ax.set_xlabel('Date')
ax.legend()
plt.show()

- DLT model prediction
df = df_daily.reset_index()
df.tail()
date unit_sales
1455 2016-12-27 12157.823
1456 2016-12-28 12144.918
1457 2016-12-29 10244.317
1458 2016-12-30 13584.621
1459 2016-12-31 10741.060
dlt = DLT(
    response_col='unit_sales',
    date_col='date',
    estimator='stan-map',
    seasonality=52,
    seed=8888,
    global_trend_option='logistic',
    # for prediction uncertainty
    n_bootstrap_draws=1000,
)
dlt.fit(df)
p1 = dlt.predict(df)
fig, ax = plt.subplots(1,1, figsize=(1280/96, 720/96))
ax.plot(p1['date'], df['unit_sales'], label='actual')
ax.plot(p1['date'], p1['prediction'], label='prediction')
ax.fill_between(p1['date'], p1['prediction_5'], p1['prediction_95'], alpha=0.2, color='orange', label='prediction percentiles')
ax.set_title('DLT Model')
ax.set_ylabel('Sales')
ax.set_xlabel('Date')
ax.legend()
plt.show()

Summary
- Time series forecasting is an active R&D topic in academia as well as industry.
- In this article, we have tried out Orbit, a Bayesian time series modeling interface that is simple to use, adaptable, interoperable, and computationally fast.
- We have utilized the Forecaster objects to initiate processes like fitting, forecasting (prediction), and posterior sample extraction. The Forecaster class is a wrapper class for several Bayesian estimating flows.
- Throughout the study, we have used 2 public-domain datasets: the weekly initial claims for US unemployment benefits and the real sales data made available by Favorita, a large Ecuadorian grocery chain.
- Our future work will focus on Orbit’s kernel-based time-varying regression (KTR) model, which defines a smooth, time-varying representation of regression coefficients using latent variables.
