SARIMA: Forecasting seasonal data with Python and R


ARIMA (Autoregressive Integrated Moving Average) is a major tool used in time series analysis to attempt to forecast future values of a variable based on its present value. For this particular example, a monthly weather dataset from 1941 for Dublin, Ireland from the Irish weather broadcaster Met Eireann is used, and an ARIMA model is constructed to forecast maximum temperature readings using this time series.



The purpose of ARIMA is to determine the nature of the relationship between our residuals, which would provide our model with a certain degree of forecasting power. In the first instance, in order to conduct a time series analysis we must express our dataset in terms of logarithms. If our data is expressed solely in price terms, then this does not allow for continuous compounding of returns over time and will give misleading results.

An ARIMA model consists of coordinates (p, d, q):

  • p stands for the number of autoregressive terms, i.e. the number of observations from past time values used to forecast future values. e.g. if the value of p is 2, then this means that two previous time observations in the series are being used to forecast the future trend.
  • d denotes the number of differences needed to make the time series stationary (i.e. one with a constant mean, variance, and autocorrelation). For instance, if d = 1, then it means that a first-difference of the series must be obtained to transform it into a stationary one.
  • q represents the moving average of the previous forecast errors in our model, or the lagged values of the error term. As an example, if q has a value of 1, then this means that we have one lagged value of the error term in the model.


Seasonality is a significant concern when it comes to modelling time series. Seasonality is a particularly endemic feature of weather data – hence why many parts of the world have four seasons!

When seasonality is not accounted for, one risks erroneous forecasts of the data. While one could forecast a mean value for a particular time series, the peaks and valleys around that mean affect the forecasts for that time series significantly.

For instance, the mean maximum recorded temperature in Dublin, Ireland for the year 2018 was 18.9°C. However, with a low of 11.9°C and a high of 26.7°C, the dispersion around this mean is significant and influenced by seasonality.

For instance, let’s take a look at ARIMA forecasts for maximum temperature both with and without seasonal components:

Seasonal ARIMA


Non-seasonal ARIMA


In these instances, we can see that the non-seasonal ARIMA forecast shows a much wider dispersion and makes it more unlikely that the maximum temperature for each month will be forecasted accurately.

In this regard, ARIMA needs to be modified in order to include a seasonal component.

ARIMA(p, d, q) × (P, D, Q)S

with p = non-seasonal AR order, d = non-seasonal differencing, q = non-seasonal MA order, P = seasonal AR order, D = seasonal differencing, Q = seasonal MA order, and S = time span of repeating seasonal pattern.

Seasonal ARIMA Analysis with R

Using the aforementioned data, the following procedures are carried out in R:

  1. The time series is converted into logarithmic format in order to smooth the volatility in the series.
  2. Autocorrelation and Partial Autocorrelation (ACF and PACF) plots are generated to detect the presence of stationarity, and a Dickey-Fuller test is conducted to validate the same.
  3. The time series is decomposed in order to examine the seasonal trend in isolation.
  4. auto.arima is then used to examine the best ARIMA configuration for the training data (the first 80% of all temperature data).
  5. The predicted values are then compared to the test values (the latter 20% of the data) to determine the model accuracy.
  6. Finally, the Ljung-Box test is used to determine if the data is independently distributed or exhibits serial correlation.

Here are the ACF and PACF plots:





Now, the time series is defined and the components are analysed:


From the above, we see that there is a clear seasonal component present in the time series. As a result, it is highly likely that the ARIMA model will need a seasonal component attached.

To determine the ARIMA configuration, the auto.arima function in R is used.

From the above, the best identified configuration on the basis of BIC is:


Now that the configuration has been selected, the forecasts can be made. With the size of the test data being 186 observations, 186 forecasts are run accordingly.

As previously plotted above, here is a visual representation of the forecast:


Now, the data needs to be converted back into its original format by calculating the exponent of the log predictions. These predictions are then compared against the test data to identify:

  1. The percentage differences between the predictions and the actual data.
  2. The number of predictions with a percentage difference of lower than 10% from the actual.

Now that the forecasted values have been calculated, we can compare this against the test data to forecast the percentage error:

From the above:

  1. The mean percentage error (or average of all percentage errors) is -0.3%.
  2. The number of predictions with a percentage error below 10% relative to the actual is over 70%.

In the context of weather data, having 70% of predictions within 10% of the actual value implies good performance. For instance, the difference between 14°C and 15°C is just over 7%, meaning that a 10% difference would imply a 1°C-2°C difference between the forecasted and actual values.

By plotting a histogram, we can see that a large majority of the forecasts are within 20% of the actual temperature values:


Finally, a Ljung-Box test is conducted. Essentially, the test is being used to determine if the residuals of our time series follow a random pattern, or if there is a significant degree of non-randomness.

H0: Residuals follow a random pattern
HA: Residuals do not follow a random pattern

Note that the method for choosing a specific number of lags for Ljung-Box can be quite arbitrary. In this regard, we will run the Ljung-Box test with lags 5, 10, and 15. To run this test in R, we use the following functions:

We see that across lags 5, 10, and 15, the null hypothesis that the lags follow a random pattern cannot be rejected and therefore our ARIMA model is free of autocorrelation.

Seasonal ARIMA Analysis with Python

Now, a similar analysis will be conducted on the data in Python.

In particular, the pyramid library will be used in a similar manner to auto.arima in R to determine the best model configuration.

Firstly, let’s generate a heatmap using seaborn for the most recent four years of data to determine the maximum temperature dispersion by month:


Again, we see significant dispersion in the maximum temperature across months.
Now, the relevant data processing is conducted and the time series is decomposed into its elements:




Now, the ARIMA model is generated with pyramid in order to identify the best configuration:

Now, a prediction can be made with the model for the next 185 periods, with the prediction data compared to the test data in a similar way as done with R.


However, note that in this instance, the accuracy came in at 40% (i.e. 40% of the predictions were within 10% accuracy of the actual), whereas the accuracy of the ARIMA model in R was over 70%.

Note that the SARIMA model generated in this instance was of a different configuration to that generated in R.

Model configuration indicated in Python

SARIMAX(1, 1, 1)x(0, 1, 1, 12)

Model configuration indicated in R

SARIMAX(1, 0, 0)x(2, 1, 0, 12)

In this regard, it would appear that R did a better job than Python in identifying the correct configuration.

To validate this, the anlaysis is re-run again in Python, using the same ARIMA configuration generated in R.

Here is the analysis:

It is observed that the updated configuration has raised the reported accuracy to over 70%, indicating that modifying the ARIMA model resulted in a significant improvement.

Again, a histogram illustrates that the majority of forecast errors lie below 20%:



In this example, we have seen:

  1. How to generate ARIMA models in Python and R
  2. Importance in accounting for seasonality trends and methods to accomplish this
  3. How to select the correct ARIMA modification and validate results

Thank you for your time.