A time series is a sequence of chronologically ordered data points spaced at equal or unequal intervals. Forecasting consists of predicting future values of a time series, either by modeling the series solely from its own past behavior (autoregressive) or by also using other, external variables.
This paper describes how to use Scikit-learn regression models to perform forecasting on time series. Specifically, it introduces Skforecast, a simple package that contains the classes and functions necessary to adapt any Scikit-learn regression model to forecasting problems.
When working with time series, the objective is usually not only to predict the next element of the series ($t_{+1}$) but an entire future interval or a point far ahead in time ($t_{+n}$). Each step ahead in the prediction horizon is known as a step.
There are several strategies for generating this type of multi-step prediction.
Recursive multi-step forecasting
Since the value of $t_{+n-1}$ is needed to predict $t_{+n}$, but it is unknown, predictions must be made recursively: each new prediction uses the previous ones as predictors. This process is known as recursive forecasting or recursive multi-step forecasting.
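As a minimal, library-independent sketch of the idea (the model and last_window names are assumptions of this example, not part of Skforecast):

# Sketch: recursive multi-step forecasting
# ==============================================================================
import numpy as np

def recursive_forecast(model, last_window, steps):
    '''
    model: any fitted regressor with a predict() method.
    last_window: the most recent observed values (one per lag),
    ordered oldest to newest, matching how the model was trained.
    '''
    window = list(last_window)
    predictions = []
    for _ in range(steps):
        X = np.array(window[-len(last_window):]).reshape(1, -1)
        y_hat = model.predict(X)[0]   # one-step-ahead prediction
        predictions.append(y_hat)
        window.append(y_hat)          # the prediction becomes a new lag
    return np.array(predictions)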
The main adaptation needed to apply Scikit-learn models to recursive multi-step forecasting problems is to transform the time series into a matrix in which each value is associated with the time window (lags) that precedes it. This forecasting strategy can be easily implemented with the ForecasterAutoreg and ForecasterAutoregCustom classes from the Skforecast package.
This type of transformation also allows exogenous variables to be included alongside the time series.
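A minimal sketch of this transformation with plain pandas (the column names are illustrative); an exogenous variable would simply be appended as an extra column aligned with the target:

# Sketch: time series to lag matrix
# ==============================================================================
import pandas as pd

y = pd.Series(range(10), name='y')
n_lags = 3
X = pd.concat({f'lag_{i}': y.shift(i) for i in range(1, n_lags + 1)}, axis=1)
X = X.dropna()             # drop rows without a complete lag window
y_target = y.loc[X.index]  # value to be predicted at each row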
Direct multi-step forecasting
The direct multi-step forecasting method consists of training a different model for each step. For example, to predict the next 5 values of a time series, 5 different models must be trained, one for each step. As a result, the predictions are independent of one another.
The main complexity of this approach is generating the correct training matrices for each model. The ForecasterAutoregMultiOutput class of the Skforecast package automates this process. It is also important to bear in mind that this strategy has a higher computational cost, since it requires training multiple models. For example, with a response variable and two exogenous variables, a separate training matrix must be built for each of the models, as sketched below.
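A minimal sketch of the training logic (illustrative only, not the ForecasterAutoregMultiOutput implementation; the function names are assumptions):

# Sketch: direct multi-step forecasting, one model per step
# ==============================================================================
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_direct(y, n_lags, steps):
    '''Train one regressor per forecast horizon h = 1..steps. y is a 1-D array.'''
    models = []
    for h in range(1, steps + 1):
        X, targets = [], []
        for t in range(n_lags, len(y) - h + 1):
            X.append(y[t - n_lags:t])      # lag window
            targets.append(y[t + h - 1])   # value h steps ahead of the window
        models.append(LinearRegression().fit(np.array(X), np.array(targets)))
    return models

def predict_direct(models, last_window):
    X = np.asarray(last_window).reshape(1, -1)
    return np.array([m.predict(X)[0] for m in models])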
Multiple output forecasting
Certain models are capable of simultaneously predicting several values of a sequence (one-shot). An example of a model with this capability is the LSTM neural network.
The example uses a time series of monthly expenditure (in millions of dollars) on corticosteroid drugs by the Australian health system between 1991 and 2008. The goal is to create an autoregressive model capable of predicting future monthly expenditure.
# Data manipulation
# ==============================================================================
import numpy as np
import pandas as pd
# Plots
# ==============================================================================
import matplotlib.pyplot as plt
%matplotlib inline
# Warnings configuration
# ==============================================================================
import warnings
warnings.filterwarnings('ignore')
In addition to the above, Skforecast, a library containing the classes and functions needed to adapt any Scikit-learn regression model to forecasting problems, is used. It can be installed in the following ways:
pip install skforecast
A specific version:
pip install git+https://github.com/JoaquinAmatRodrigo/skforecast@v0.3.0
Latest version (unstable):
pip install git+https://github.com/JoaquinAmatRodrigo/skforecast#master
# Modeling and Forecasting
# ==============================================================================
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Lasso
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from skforecast.ForecasterAutoreg import ForecasterAutoreg
from skforecast.ForecasterAutoregCustom import ForecasterAutoregCustom
from skforecast.ForecasterAutoregMultiOutput import ForecasterAutoregMultiOutput
from skforecast.model_selection import grid_search_forecaster
from skforecast.model_selection import backtesting_forecaster
from skforecast.model_selection import backtesting_forecaster_intervals
from joblib import dump, load
The data used in the examples of this paper have been obtained from the magnificent book Forecasting: Principles and Practice by Rob J Hyndman and George Athanasopoulos.
# Data download
# ==============================================================================
url = 'https://raw.githubusercontent.com/JoaquinAmatRodrigo/skforecast/master/data/h2o.csv'
data_raw = pd.read_csv(url, sep=',')
data_raw = data_raw.rename(columns={'fecha': 'date'})
The date column has been stored as a string. To convert it to datetime, the pd.to_datetime() function can be used. Once in datetime format, and to make use of pandas functionalities, it is set as the index. Also, since the data are monthly, the frequency is set to month start ('MS').
# Data preparation
# ==============================================================================
data = data_raw.copy()
data['date'] = pd.to_datetime(data['date'], format='%Y/%m/%d')
data = data.set_index('date')
data = data.rename(columns={'x': 'y'})
data = data.asfreq('MS')
data = data['y']
data = data.sort_index()
The time series is verified to be complete.
# Verify that a temporary index is complete
# ==============================================================================
(data.index == pd.date_range(start=data.index.min(),
                             end=data.index.max(),
                             freq=data.index.freq)).all()
# Fill gaps in a temporary index
# ==============================================================================
# data.asfreq(freq='30min', fill_value=np.nan)
The last 36 months are used as the test set to evaluate the predictive capacity of the model.
# Split data into train-test
# ==============================================================================
steps = 36
data_train = data[:-steps]
data_test = data[-steps:]
fig, ax = plt.subplots(figsize=(9, 4))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
ax.legend();
With the ForecasterAutoreg class, a model is created and trained from a RandomForestRegressor regressor with a time window of 6 lags. This means that the model uses the previous 6 months as predictors.
# Create and train forecaster
# ==============================================================================
forecaster_rf = ForecasterAutoreg(
    regressor = RandomForestRegressor(random_state=123),
    lags = 6
)
forecaster_rf.fit(y=data_train)
forecaster_rf
# Predictions
# ==============================================================================
steps = 36
predictions = forecaster_rf.predict(steps=steps)
# Temporal index is added to predictions
predictions = pd.Series(data=predictions, index=data_test.index)
predictions.head()
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions.plot(ax=ax, label='predictions')
ax.legend();
The error that the model makes in its predictions is quantified. In this case, the metric used is the mean squared error (mse).
# Error
# ==============================================================================
error_mse = mean_squared_error(
    y_true = data_test,
    y_pred = predictions
)
print(f"Test error (mse): {error_mse}")
The trained ForecasterAutoreg uses a 6-lag time window and a Random Forest model with the default hyperparameters. However, there is no reason why these values should be the most suitable.
To identify the best combination of lags and hyperparameters, the Skforecast package provides time series cross-validation and backtesting strategies. Regardless of the procedure used, it is important not to include the test data in the search process, to avoid overfitting problems. In this case, time series cross-validation over the training dataset is used. For the first fold, the initial 50% of the observations form the training data and the next 10 steps form the validation set. In successive folds, the training set contains all the data used in the previous fold, and the next 10 steps are used as new validation data. This process is repeated until the entire training dataset has been used (see the sketch below).
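The resulting fold layout can be sketched as follows (an illustration of the scheme just described, not the library's internal code):

# Sketch: expanding-window cross-validation folds
# ==============================================================================
def expanding_folds(n_obs, initial_train_size, steps):
    '''Yield (train_size, validation_start, validation_end) per fold.'''
    train_end = initial_train_size
    while train_end + steps <= n_obs:
        yield train_end, train_end, train_end + steps
        train_end += steps  # the next fold absorbs the previous validation data

for fold in expanding_folds(n_obs=100, initial_train_size=50, steps=10):
    print(fold)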
# Hyperparameter Grid search
# ==============================================================================
forecaster_rf = ForecasterAutoreg(
    regressor = RandomForestRegressor(random_state=123),
    lags = 12  # This value will be replaced in the grid search
)

# Regressor's hyperparameters
param_grid = {'n_estimators': [100, 500],
              'max_depth': [3, 5, 10]}

# Lags used as predictors
lags_grid = [10, 20]

results_grid = grid_search_forecaster(
    forecaster = forecaster_rf,
    y = data_train,
    param_grid = param_grid,
    lags_grid = lags_grid,
    steps = 10,
    method = 'cv',
    metric = 'mean_squared_error',
    initial_train_size = int(len(data_train)*0.5),
    allow_incomplete_fold = False,
    return_best = True,
    verbose = False
)
# Grid Search results
# ==============================================================================
results_grid
The best results are obtained using a time window of 20 lags and a Random Forest configured with {'max_depth': 10, 'n_estimators': 500}.
Finally, a ForecasterAutoreg is trained with the optimal configuration found by validation. This step is not necessary if return_best = True is specified in the grid_search_forecaster() function.
# Create and train forecaster with the best hyperparameters
# ==============================================================================
regressor = RandomForestRegressor(max_depth=10, n_estimators=500, random_state=123)
forecaster_rf = ForecasterAutoreg(
    regressor = regressor,
    lags = 20
)
forecaster_rf.fit(y=data_train)
# Predictions
# ==============================================================================
predictions = forecaster_rf.predict(steps=steps)
# Temporal index is added to predictions
predictions = pd.Series(data=predictions, index=data_test.index)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions.plot(ax=ax, label='predictions')
ax.legend();
# Test error
# ==============================================================================
error_mse = mean_squared_error(
    y_true = data_test,
    y_pred = predictions
)
print(f"Test error (mse): {error_mse}")
The optimal combination of hyperparameters significantly reduces test error.
Since the ForecasterAutoreg object uses Scikit-learn models, the importance of the predictors can be accessed once it has been trained. When the regressor is a LinearRegression(), Lasso() or Ridge(), the model coefficients reflect the importance of each predictor; they are obtained with the get_coef() method. In GradientBoostingRegressor() or RandomForestRegressor() regressors, the importance of the predictors is based on impurity reduction and is accessible through the get_feature_importances() method. In both cases, the returned values are sorted in the same order as the lags.
# Predictors importance
# ==============================================================================
importance = forecaster_rf.get_feature_importances()
dict(zip(forecaster_rf.lags, importance))
In the previous example, only lags of the predicted variable itself were used as predictors. In certain scenarios, information about other variables whose future values are known is available, and these can serve as additional predictors in the model.
Continuing with the previous example, a new variable is simulated whose behavior is correlated with that of the modeled time series, and which is to be incorporated as a predictor. The same process applies when there are multiple exogenous variables.
# Data download
# ==============================================================================
url = 'https://raw.githubusercontent.com/JoaquinAmatRodrigo/skforecast/master/data/h2o_exog.csv'
data_raw = pd.read_csv(url, sep=',')
data_raw = data_raw.rename(columns={'fecha': 'date'})
# Data preparation
# ==============================================================================
data = data_raw.copy()
data['date'] = pd.to_datetime(data['date'], format='%Y/%m/%d')
data = data.set_index('date')
data = data.asfreq('MS')
data = data.sort_index()
fig, ax = plt.subplots(figsize=(9, 4))
data['y'].plot(ax=ax, label='y')
data['exog_1'].plot(ax=ax, label='exogenous variable')
ax.legend();
# Split data into train-test
# ==============================================================================
steps = 36
data_train = data[:-steps]
data_test = data[-steps:]
# Create and train forecaster
# ==============================================================================
forecaster_rf = ForecasterAutoreg(
    regressor = RandomForestRegressor(random_state=123),
    lags = 8
)
forecaster_rf.fit(y=data_train['y'], exog=data_train['exog_1'])
If the ForecasterAutoreg is trained with an exogenous variable, the values of this variable must also be passed to predict(). This is only applicable to scenarios in which future information on the exogenous variable is available.
# Predictions
# ==============================================================================
predictions = forecaster_rf.predict(steps=steps, exog=data_test['exog_1'])
# Temporal index is added to predictions
predictions = pd.Series(data=predictions, index=data_test.index)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
data_train['y'].plot(ax=ax, label='train')
data_test['y'].plot(ax=ax, label='test')
predictions.plot(ax=ax, label='predictions')
ax.legend();
# Error
# ==============================================================================
error_mse = mean_squared_error(
    y_true = data_test['y'],
    y_pred = predictions
)
print(f"Test error (mse): {error_mse}")
# Hyperparameter Grid search
# ==============================================================================
forecaster_rf = ForecasterAutoreg(
    regressor = RandomForestRegressor(random_state=123),
    lags = 12  # This value will be replaced in the grid search
)

param_grid = {'n_estimators': [50, 100, 500],
              'max_depth': [3, 5, 10]}

lags_grid = [5, 12, 20]

results_grid = grid_search_forecaster(
    forecaster = forecaster_rf,
    y = data_train['y'],
    exog = data_train['exog_1'],
    param_grid = param_grid,
    lags_grid = lags_grid,
    steps = 10,
    method = 'cv',
    metric = 'mean_squared_error',
    initial_train_size = int(len(data_train)*0.5),
    allow_incomplete_fold = False,
    return_best = True,
    verbose = False
)
# Grid Search results
# ==============================================================================
results_grid.head()
The best results are obtained using a time window of 12 lags and a Random Forest configured with {'max_depth': 3, 'n_estimators': 50}.
Since return_best = True was set in grid_search_forecaster(), the ForecasterAutoreg object has been modified after the search and trained with the best configuration found.
# Predictions
# ==============================================================================
predictions = forecaster_rf.predict(steps=steps, exog=data_test['exog_1'])
# Temporal index is added to predictions
predictions = pd.Series(data=predictions, index=data_test.index)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
data_train['y'].plot(ax=ax, label='train')
data_test['y'].plot(ax=ax, label='test')
predictions.plot(ax=ax, label='predictions')
ax.legend();
# Error
# ==============================================================================
error_mse = mean_squared_error(y_true = data_test['y'], y_pred = predictions)
print(f"Test error (mse): {error_mse}")
In some scenarios, in addition to the lags it may be interesting to incorporate other characteristics of the time series. For example, the moving average of the last n values could be used to capture the trend of the series.
The ForecasterAutoregCustom class behaves very much like the ForecasterAutoreg class seen in the previous sections, with the difference that the user defines the function used to create the predictors.
The first example of the paper, predicting the last 36 months of the time series, is repeated. In this case, the predictors are the first 10 lags and the moving average of the last 20 months.
# Data download
# ==============================================================================
url = 'https://raw.githubusercontent.com/JoaquinAmatRodrigo/skforecast/master/data/h2o.csv'
data_raw = pd.read_csv(url, sep=',')
data_raw = data_raw.rename(columns={'fecha': 'date'})
# Data preparation
# ==============================================================================
data = data_raw.copy()
data['date'] = pd.to_datetime(data['date'], format='%Y/%m/%d')
data = data.set_index('date')
data = data.rename(columns={'x': 'y'})
data = data.asfreq('MS')
data = data['y']
data = data.sort_index()
# Split data into train-test
# ==============================================================================
steps = 36
data_train = data[:-steps]
data_test = data[-steps:]
A ForecasterAutoregCustom is created and trained from a RandomForestRegressor regressor. The create_predictors() function, which calculates the first 10 lags and the moving average of the last 20 values, is used to create the predictors.
# Function to calculate predictors from time series
# ==============================================================================
def create_predictors(y):
    '''
    Create the first 10 lags.
    Calculate the moving average of the last 20 values.
    '''
    X_train = pd.DataFrame({'y': y.copy()})
    for i in range(0, 10):
        X_train[f'lag_{i+1}'] = X_train['y'].shift(i)
    X_train['moving_avg'] = X_train['y'].rolling(20).mean()
    X_train = X_train.drop(columns='y').tail(1).to_numpy()
    return X_train
When creating the forecaster, the window_size argument must be equal to or greater than the window used by the function that creates the predictors. In this case, that value is 20.
# Create and train forecaster
# ==============================================================================
forecaster_rf = ForecasterAutoregCustom(
    regressor = RandomForestRegressor(random_state=123),
    fun_predictors = create_predictors,
    window_size = 20
)
forecaster_rf.fit(y=data_train)
forecaster_rf
# Predictions
# ==============================================================================
steps = 36
predictions = forecaster_rf.predict(steps=steps)
# Temporal index is added to predictions
predictions = pd.Series(data=predictions, index=data_test.index)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions.plot(ax=ax, label='predictions')
ax.legend();
# Error
# ==============================================================================
error_mse = mean_squared_error(
    y_true = data_test,
    y_pred = predictions
)
print(f"Test error (mse): {error_mse}")
When using the grid_search_forecaster() function with a ForecasterAutoregCustom, the lags_grid argument is not specified.
# Hyperparameter Grid search
# ==============================================================================
forecaster_rf = ForecasterAutoregCustom(
    regressor = RandomForestRegressor(random_state=123),
    fun_predictors = create_predictors,
    window_size = 20
)

# Regressor's hyperparameters
param_grid = {'n_estimators': [100, 500],
              'max_depth': [3, 5, 10]}

results_grid = grid_search_forecaster(
    forecaster = forecaster_rf,
    y = data_train,
    param_grid = param_grid,
    steps = 10,
    method = 'cv',
    metric = 'mean_squared_error',
    initial_train_size = int(len(data_train)*0.5),
    allow_incomplete_fold = True,
    return_best = True,
    verbose = False
)
# Grid Search results
# ==============================================================================
results_grid
# Predictions
# ==============================================================================
predictions = forecaster_rf.predict(steps=steps)
# Temporal index is added to predictions
predictions = pd.Series(data=predictions, index=data_test.index)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions.plot(ax=ax, label='predictions')
ax.legend();
# Error
# ==============================================================================
error_mse = mean_squared_error(y_true = data_test, y_pred = predictions)
print(f"Test error (mse): {error_mse}")
The ForecasterAutoreg and ForecasterAutoregCustom models follow a recursive prediction strategy in which each new prediction builds on the previous one. An alternative is to train a separate model for each of the steps to be predicted. This strategy, commonly known as direct multi-step forecasting, is computationally more expensive than the recursive one, since it requires training several models. However, in some scenarios, it achieves better results. These kinds of models can be obtained with the ForecasterAutoregMultiOutput class and can include one or multiple exogenous variables.
# Data download
# ==============================================================================
url = 'https://raw.githubusercontent.com/JoaquinAmatRodrigo/skforecast/master/data/h2o.csv'
data_raw = pd.read_csv(url, sep=',')
data_raw = data_raw.rename(columns={'fecha': 'date'})
# Data preparation
# ==============================================================================
data = data_raw.copy()
data['date'] = pd.to_datetime(data['date'], format='%Y/%m/%d')
data = data.set_index('date')
data = data.rename(columns={'x': 'y'})
data = data.asfreq('MS')
data = data['y']
data = data.sort_index()
# Split data into train-test
# ==============================================================================
steps = 36
data_train = data[:-steps]
data_test = data[-steps:]
Unlike with ForecasterAutoreg or ForecasterAutoregCustom, the number of steps to be predicted must be indicated when creating ForecasterAutoregMultiOutput models. This means that the number of predictions returned by the predict() method is always the same.
# Hyperparameter Grid search
# ==============================================================================
forecaster_rf = ForecasterAutoregMultiOutput(
    regressor = Lasso(random_state=123),
    steps = 36,
    lags = 8  # This value will be replaced in the grid search
)

param_grid = {'alpha': np.logspace(-5, 5, 10)}

lags_grid = [5, 12, 20]

results_grid = grid_search_forecaster(
    forecaster = forecaster_rf,
    y = data_train,
    param_grid = param_grid,
    lags_grid = lags_grid,
    steps = 36,
    method = 'cv',
    metric = 'mean_squared_error',
    initial_train_size = int(len(data_train)*0.5),
    allow_incomplete_fold = False,
    return_best = True,
    verbose = False
)
# Grid Search results
# ==============================================================================
results_grid.head()
The best results are obtained using a time window of 12 lags and a Lasso model configured with {'alpha': 0.001668}.
# Predictions
# ==============================================================================
predictions = forecaster_rf.predict()
# Temporal index is added to predictions
predictions = pd.Series(data=predictions, index=data_test.index)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions.plot(ax=ax, label='predictions')
ax.legend();
# Error
# ==============================================================================
error_mse = mean_squared_error(y_true = data_test, y_pred = predictions)
print(f"Test error (mse): {error_mse}")
The backtesting process consists of simulating the behavior the model would have had if it had been run on a recurring basis, for example, predicting a total of 9 years in intervals of 3 years (36 months). This type of evaluation can be easily applied with the backtesting_forecaster() function, which returns, in addition to the predictions, an error metric.
# Backtesting
# ==============================================================================
n_test = 36*3  # The last 9 years are separated for the backtest
data_train = data[:-n_test]
data_test = data[-n_test:]
steps = 36  # 3-year (36-month) folds are used
regressor = LinearRegression()
forecaster = ForecasterAutoreg(regressor=regressor, lags=15)

metric, predictions_backtest = backtesting_forecaster(
    forecaster = forecaster,
    y = data,
    initial_train_size = len(data_train),
    steps = steps,
    metric = 'mean_squared_error',
    verbose = True
)
print(f"Backtest error: {metric}")
# Add datetime index to predictions
predictions_backtest = pd.Series(data=predictions_backtest, index=data_test.index)
fig, ax = plt.subplots(figsize=(9, 4))
#data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions_backtest.plot(ax=ax, label='predictions')
ax.legend();
A prediction interval defines the range within which the true value of $y$ is expected to be found with a given probability.
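Formally, for a significance level $\alpha$, a prediction interval $[\hat{l}_{t+n}, \hat{u}_{t+n}]$ satisfies $P(\hat{l}_{t+n} \le y_{t+n} \le \hat{u}_{t+n}) = 1 - \alpha$. For example, the [1, 99] percentile interval used below is expected to contain the true value approximately 98% of the time.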
Rob J Hyndman and George Athanasopoulos list, in their book Forecasting: Principles and Practice, multiple ways to estimate prediction intervals, most of which require that the residuals (errors) of the model be normally distributed. When this property cannot be assumed, bootstrapping can be used instead, which only assumes that the residuals are uncorrelated. This is the method used in the Skforecast library for the ForecasterAutoreg and ForecasterAutoregCustom type models.
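As a simplified, illustrative sketch of the idea (Skforecast propagates the sampled errors through the recursive predictions, whereas here they are simply added to the point predictions afterwards):

# Sketch: prediction intervals by bootstrapping residuals
# ==============================================================================
import numpy as np

def bootstrap_intervals(point_preds, residuals, interval=(1, 99),
                        n_boot=500, seed=123):
    '''point_preds: array of point predictions; residuals: in-sample errors.'''
    rng = np.random.default_rng(seed)
    point_preds = np.asarray(point_preds)
    # Sample one residual per prediction, per bootstrap iteration
    sampled = rng.choice(residuals, size=(n_boot, len(point_preds)), replace=True)
    sims = point_preds + sampled
    lower = np.percentile(sims, interval[0], axis=0)
    upper = np.percentile(sims, interval[1], axis=0)
    return lower, upper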
# Data download
# ==============================================================================
url = 'https://raw.githubusercontent.com/JoaquinAmatRodrigo/skforecast/master/data/h2o.csv'
data_raw = pd.read_csv(url, sep=',')
data_raw = data_raw.rename(columns={'fecha': 'date'})
# Data preparation
# ==============================================================================
data = data_raw.copy()
data['date'] = pd.to_datetime(data['date'], format='%Y/%m/%d')
data = data.set_index('date')
data = data.rename(columns={'x': 'y'})
data = data.asfreq('MS')
data = data['y']
data = data.sort_index()
# Split data into train-test
# ==============================================================================
steps = 36
data_train = data[:-steps]
data_test = data[-steps:]
# Create and train forecaster
# ==============================================================================
forecaster = ForecasterAutoreg(
    regressor = LinearRegression(),
    lags = 15
)
forecaster.fit(y=data_train)
# Prediction intervals
# ==============================================================================
predictions = forecaster.predict_interval(
    steps = steps,
    interval = [1, 99],
    n_boot = 1000
)
# Datetime index added
predictions = pd.DataFrame(data=predictions, index=data_test.index)
# Prediction Error
# ==============================================================================
error_mse = mean_squared_error(
    y_true = data_test,
    y_pred = predictions.iloc[:, 0]
)
print(f"Test error (mse): {error_mse}")
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
#data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions.iloc[:, 0].plot(ax=ax, label='predictions')
ax.fill_between(predictions.index,
                predictions.iloc[:, 1],
                predictions.iloc[:, 2],
                alpha=0.5)
ax.legend();
# Backtest with prediction intervals
# ==============================================================================
n_test = 36*3
data_train = data[:-n_test]
data_test = data[-n_test:]
steps = 36
regressor = LinearRegression()
forecaster = ForecasterAutoreg(regressor=regressor, lags=15)
metric, predictions = backtesting_forecaster_intervals(
    forecaster = forecaster,
    y = data,
    initial_train_size = len(data_train),
    steps = steps,
    metric = 'mean_squared_error',
    interval = [1, 99],
    n_boot = 100,
    in_sample_residuals = True,
    verbose = True
)
print(metric)
# Datetime index is added
predictions = pd.DataFrame(data=predictions, index=data_test.index)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(9, 4))
#data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions.iloc[:, 0].plot(ax=ax, label='predictions')
ax.fill_between(predictions.index,
                predictions.iloc[:, 1],
                predictions.iloc[:, 2],
                alpha=0.5)
ax.legend();
Skforecast models can be stored and loaded using the pickle or joblib packages. A simple example using joblib is shown below.
# Create forecaster
forecaster = ForecasterAutoreg(LinearRegression(), lags=3)
forecaster.fit(y=pd.Series(np.arange(50)))
# Save model
dump(forecaster, filename='forecaster.joblib')
# Load model
forecaster_loaded = load('forecaster.joblib')
# Predict
forecaster_loaded.predict(steps=5)
# Session information
# ==============================================================================
import session_info
session_info.show(html=False)
Hyndman, R.J., & Athanasopoulos, G. (2021) Forecasting: Principles and Practice, 3rd edition. OTexts: Melbourne, Australia.
Svetunkov, I. Time Series Analysis and Forecasting with ADAM.
VanderPlas, J. Python Data Science Handbook. O'Reilly Media.
Hilpisch, Y. Python for Finance: Mastering Data-Driven Finance. O'Reilly Media.
How to cite this paper?
Forecasting series temporales con Python y Scikitlearn by Joaquín Amat Rodrigo and Javier Escobar Ortiz, available under an Attribution 4.0 International (CC BY 4.0) license at https://www.cienciadedatos.net/py27-forecasting-series-temporales-python-scikitlearn.html
This work by Joaquín Amat Rodrigo is licensed under a Creative Commons Attribution 4.0 International License.