Kalman Filter: Modelling Time Series Shocks with KFAS in R

We have already seen how time series models such as ARIMA can be used to make forecasts. While these models can be highly accurate, they have one major shortcoming: they are slow to account for “shocks”, or sudden changes in a time series. Let’s see how we can alleviate this problem using the Kalman Filter.

Time Series Shocks

Let’s take the stock market as an example. An index could have an overall upward trend, and then drop sharply during a sell-off. A conventional time series model wouldn’t necessarily account for this right away, and it would likely take several periods before the sudden change in trend is reflected in its forecasts.

Therefore, we want a time series model that is capable of accounting for such shocks.

The Kalman Filter is an algorithm for estimating the unobserved state of a state-space model. Because its estimates are updated with every new observation, it adjusts more quickly to shocks in a time series. Let’s see how this works using an example.

In January 2015, currency markets underwent one of the biggest shocks ever endured, when the Swiss National Bank decided to depeg the Swiss franc from the euro. As a result, the Swiss franc soared in value while other major currencies plummeted:

[Figure: usdchf]

Let’s see how the Kalman Filter adjusts for such a shock.

Kalman Filter with KFAS library

First, let’s download the USD/CHF data (the FRED series DEXSZUS, quoted as Swiss francs per US dollar) for the month of January 2015:

> require(Quandl)
Loading required package: Quandl
Loading required package: xts
Loading required package: zoo

Attaching package: ‘zoo’

The following objects are masked from ‘package:base’:

    as.Date, as.Date.numeric

> usdchf = Quandl("FRED/DEXSZUS", start_date="2015-01-01",end_date="2015-01-30",type="xts")
> usdchf=data.frame(usdchf)
> usdchf=(log(usdchf$usdchf))

We convert usdchf into a data frame and then take the natural logarithm of the series. Note that this gives log levels rather than returns; log returns would be the differences between consecutive log values, i.e. diff(log(x)).
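To make the difference between log levels and log returns concrete, here is a minimal sketch. The price vector is made up for illustration and is not the downloaded FRED series.

```r
# Illustrative prices only (hypothetical values, not the downloaded series)
price <- c(1.00, 1.01, 1.02, 0.86, 0.87)

log_price  <- log(price)        # log levels: what the model in this tutorial is fitted to
log_return <- diff(log_price)   # log returns: differences of consecutive log levels

# For small moves, a log return is close to the simple percentage change
simple_return <- diff(price) / head(price, -1)

print(round(cbind(log_return, simple_return), 4))
```

The two columns are nearly identical for the small moves and diverge for the large drop, which is one reason log returns are usually preferred when moves are big.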

Now, we will attempt to model this time series with the Kalman Filter using the KFAS library.

> #Kalman Filter
> library(KFAS)
Warning message:
package ‘KFAS’ was built under R version 3.5.1 
> logmodel <- SSModel(usdchf ~ SSMtrend(1, Q = 0.01), H = 0.01)
> out <- KFS(logmodel)
> ts.plot(ts(usdchf), out$a, out$att, out$alpha, col = 1:4)
> title("CHF/USD")

Let’s go through the above.

SSModel stands for “state space model”. The formula models the USD/CHF series as a trend component, SSMtrend. With a degree of 1, SSMtrend specifies a local level model: the observed series equals an unobserved level plus noise, and the level itself follows a random walk. The Kalman Filter’s job is to estimate that level.
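Written out, the model specified by SSMtrend(1, Q = ...) together with H = ... is the local level model. The symbols below (y for the observed log series, mu for the unobserved level) are notation chosen here for illustration:

```latex
\begin{aligned}
y_t       &= \mu_t + \varepsilon_t, & \varepsilon_t &\sim N(0, H), \\
\mu_{t+1} &= \mu_t + \eta_t,        & \eta_t        &\sim N(0, Q).
\end{aligned}
```

A larger Q relative to H tells the filter that the level itself moves a lot, so it trusts new observations more and adjusts faster.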

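Q is the variance of the disturbances to the level (the state equation), and H is the variance of the observation noise. Both are held constant over time. In practice they are usually estimated by maximum likelihood (KFAS provides fitSSM for this), but to keep things simple I’m going to fix both at 0.01.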
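For readers who would rather estimate the two variances than fix them, the following is a minimal sketch based on the local level example in the KFAS documentation. The series is simulated (hypothetical data, not the exchange rate), and the starting values are arbitrary assumptions.

```r
library(KFAS)

# Simulated local level series (hypothetical data, for illustration only)
set.seed(1)
n     <- 200
level <- cumsum(rnorm(n, sd = 0.1))   # random-walk level, true Q = 0.01
y     <- level + rnorm(n, sd = 0.2)   # noisy observations, true H = 0.04

# NA marks the variances that should be estimated
model <- SSModel(y ~ SSMtrend(1, Q = list(matrix(NA))), H = matrix(NA))

# The default update function of fitSSM fills the NAs with exp(parameter),
# so the starting values are supplied on the log scale
fit <- fitSSM(model, inits = log(c(0.01, 0.01)), method = "BFGS")

Q_hat <- as.numeric(fit$model$Q)
H_hat <- as.numeric(fit$model$H)
print(c(Q = Q_hat, H = H_hat))

# Filter and smooth with the estimated variances
out_fitted <- KFS(fit$model)
```

With only 20 observations, as in the January 2015 sample, such estimates are likely to be very noisy, which may be a reasonable argument for fixing the values instead.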

When we plot our time series, here is what we come up with:

[Figure: kalman filter]

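We can see that our a, att, and alpha series adjust to the shock quickly: att responds in the same period as the shock and a follows one period later, while alpha, the smoothed estimate, draws on the whole sample, including observations after the shock. (out$alpha in the code above resolves to out$alphahat through R’s partial matching of list element names.)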

  • a: One-step-ahead predictions of states
  • att: Filtered estimates of states
  • alphahat: Smoothed estimates of states
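To show where a and att come from, here is a hand-rolled sketch of the filtering recursions for the local level model in base R. The function local_level_filter is a hypothetical helper written for this illustration, not part of KFAS, and it approximates the diffuse initial state with a large starting variance rather than using KFAS's exact treatment. The input series is made up.

```r
# Kalman filter for the local level model:
#   y_t = mu_t + eps_t,  mu_{t+1} = mu_t + eta_t
# (hypothetical helper for illustration; not a KFAS function)
local_level_filter <- function(y, Q, H, a1 = y[1], P1 = 1e7) {
  n   <- length(y)
  a   <- numeric(n + 1)  # a[t]: prediction of the level at t from data up to t - 1
  att <- numeric(n)      # att[t]: filtered estimate of the level using data up to t
  K   <- numeric(n)      # Kalman gain at each step
  a[1] <- a1
  P    <- P1             # a large P1 approximates a diffuse prior
  for (t in seq_len(n)) {
    K[t]     <- P / (P + H)                  # weight given to the new observation
    att[t]   <- a[t] + K[t] * (y[t] - a[t])  # update with the observation at t
    Ptt      <- P * H / (P + H)              # variance after the update
    a[t + 1] <- att[t]                       # random-walk level: prediction = filtered value
    P        <- Ptt + Q                      # variance of the next prediction
  }
  list(a = a, att = att, K = K)
}

# Hypothetical log series with a sudden drop at observation 11
y <- c(rep(0, 10), rep(-0.15, 10))
f <- local_level_filter(y, Q = 0.01, H = 0.01)

print(round(cbind(y, a = f$a[1:20], att = f$att), 4))
print(round(tail(f$K, 1), 3))  # gain after the recursion has settled
```

In this sketch a[t + 1] equals att[t] by construction, which is why a looks like att delayed by one period. With Q equal to H the gain settles at about 0.62, so the filtered estimate closes roughly 62% of the gap on the day of the shock and most of the remainder over the next few observations.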

Let’s now combine the above into a data frame along with our original series and see what we come up with:

> df<-data.frame(usdchf,out$a[1:20],out$att[1:20],out$alpha[1:20])
> View(df)
> col_headings<-c("usdchf","a","att","alpha")
> names(df)<-col_headings
> View(df)

Here is the data frame we come up with:

[Figure: data frame]

Again, we can see that our estimates move largely in line with the USD/CHF series.

One commenter below made quite a good point about the Kalman Filter. Its purpose is not to “predict” a shock per se, but the advantage of the model is that it adjusts for a shock much faster than a traditional time series model would. Let’s illustrate this using ARIMA.

How does ARIMA perform?

When we model the same series with ARIMA, it is notable that the model fails to adjust significantly for the shock and does not give us a particularly clear direction regarding trend.

Let’s firstly run and plot our ARIMA model:

> #ARIMA
> library(tseries)

    ‘tseries’ version: 0.10-45

    ‘tseries’ is a package for time series analysis and computational finance.

    See ‘library(help="tseries")’ for details.

> library(forecast)
> fitusdchf<-auto.arima(usdchf[1:20])
> forecastedvalues_ln=forecast(fitusdchf,h=10)
> plot(forecastedvalues_ln)

auto.arima selects an ARIMA(0,1,0) configuration, that is, a random walk model.

[Figure: arima]

We see that when we run the ARIMA model on 20 days of data (including the shock), the prediction intervals of the ARIMA model remain quite wide, and it is therefore hard to predict future values.
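One way to see why the intervals are wide: under ARIMA(0,1,0) the forecast standard error grows with the square root of the horizon, and the residual variance is estimated from a sample that contains the jump. The sketch below uses base R's arima and predict on a made-up series (not the FRED data), so the numbers are illustrative only.

```r
# Hypothetical log series with a drop at observation 11 (not the FRED data)
set.seed(42)
y <- c(rep(0, 10), rep(-0.15, 10)) + rnorm(20, sd = 0.005)

fit <- arima(y, order = c(0, 1, 0))   # random walk, the order selected in the text
fc  <- predict(fit, n.ahead = 10)

# Point forecasts stay at the last observed value
print(unique(round(as.numeric(fc$pred), 6)))

# Standard errors grow like sqrt(h); sigma2 is inflated by the single large jump
print(round(cbind(h            = 1:10,
                  se           = as.numeric(fc$se),
                  sigma_sqrt_h = sqrt(fit$sigma2 * 1:10)), 4))
```

The flat point forecast and the widening band are properties of the random walk model itself, so a short sample dominated by one jump will tend to produce a wide fan regardless of the data set.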

Another Example

So, we’ve seen how the Kalman Filter adjusted to the sudden movement in the USD/CHF. Let’s take another example of a currency shock. When Britain voted for “Brexit” in June 2016, we saw the GBP/USD subsequently plunge.

[Figure: gbpusd]

How well would the Kalman Filter have modelled this? Let’s find out!

As in the example of USD/CHF, we download our GBP/USD data from Quandl and run the Kalman Filter:

require(Quandl)
library(KFAS)

gbpusd = Quandl("FRED/DEXUSUK", start_date="2016-01-01",end_date="2016-12-31",type="xts")
gbpusd=data.frame(gbpusd)
gbpusd=(log(gbpusd$gbpusd))

logmodel <- SSModel(gbpusd ~ SSMtrend(1, Q = 0.01), H = 0.01)
out <- KFS(logmodel)

df<-data.frame(gbpusd,out$a[1:251],out$att[1:251],out$alpha[1:251])
View(df)
col_headings<-c("gbpusd","a","att","alpha")
names(df)<-col_headings
View(df)
ts.plot(ts(gbpusd[100:150]), out$a[100:150], out$att[100:150], out$alpha[100:150], col = 1:4)
title("GBP/USD")

Here is a plot of our data. Again, we see that a, att, and alpha are adjusting quickly to the change:

[Figure: gbpusd kalman filter]

Here are the a, att, and alpha statistics:

[Figure: a att alpha]

Again, if we fit an ARIMA model instead, we see that the prediction intervals are a lot wider, which suggests that the model has not properly accounted for the time series shock:

library(tseries)
library(forecast)
fitgbpusd<-auto.arima(gbpusd[1:125])
forecastedvalues_ln=forecast(fitgbpusd,h=100)
plot(forecastedvalues_ln)

[Figure: arima 200]

Conclusion

In this tutorial, you have learned:

  • Importance of adjusting for time series shocks
  • How to implement a Kalman Filter using KFAS in R
  • How to interpret output from a Kalman Filter
  • Why the Kalman Filter is a suitable model for modelling time-series shocks

Many thanks for reading this tutorial, and please leave any questions you may have in the comments below.

Author: Michael Grogan

Michael Grogan is a machine learning consultant and educator, with a profound passion for statistics and data science.

5 thoughts on “Kalman Filter: Modelling Time Series Shocks with KFAS in R”

  1. Hi Michael,

    Thank you for the tutorial.
    But are you sure the “a: One-step-ahead predictions of states” actually predicts the direction?
    Looking at this example and trying it on a different data set, it looks like it just follows the direction with a one-step delay (rather than actually predicting it).

    Andrey

    1. Hi Andrey,

      Many thanks for your comment.

      The “a” variable – despite the name – is a one-step delay rather than a prediction as you correctly stated.

      However, what we are particularly interested in is the att and alpha variables, as this allows us to compare the actual values to the respective filtered and smoothed estimates of states.

      As an example, I ran this model again for the GBP/USD currency for the month of January 2017.

      Then, once the mean deviation between GBP/USD and a, att, and alpha is calculated, the following percentage deviations were obtained:

      a: -6.2%
      att: -0.27%
      alpha: 0.052%

      As you can see, the deviation between a and the currency rate (expressed in logarithmic form) is still significantly greater than for att and alpha, so it is these two variables that give us a more accurate prediction.

      Best,
      Michael

  2. Thank you for your reply Michael!

    The “a: One-step-ahead predictions of states” description confused me 🙂
    Also looking at the “a” and “att” values – they are identical with a one-lag delay, i.e. “a” is a delayed “filtered estimate”? And it (att) is not really useful in terms of predicting the next state.

    alphahat looks like it may predict the change, but does it really do it for the current data?
    In your example it shows the change, but I believe it’s just adjusted for the historical data (i.e. weighted avg or MA).

    Try this data set – just before the “shock”
    usdchf = Quandl("FRED/DEXSZUS", start_date="2015-01-01", end_date="2015-01-14", type="xts")

    The ts I’m working on is a seasonal one, so we observe a “change” at every 2-3 readings and it makes it harder to validate the model 🙂

    1. Hi Andrey,

      With these models, the more data you have, the better.

      For instance, I ran the model for USD/CHF again, but this time from 2010-01-01 to 2018-09-29.

      Now, you’re right that Kalman does rely on weighted averaging, and if there is a particularly big shock, the filter may not necessarily “predict” what the price on the next day will be, for instance. The model adjusts quickly, but the reading on the day before can be used as a guide.

      For instance, the shock happens at observation 1263, which is when we saw a sudden decrease in the USD/CHF. Let’s see what was happening with alpha the day before.

      What’s interesting is that the blue line (alpha) is actually decreasing significantly before the shock actually happens.

      [Figure: kalman filter]

      We can also see that the value of alpha is decreasing the day before.

      [Figure: table]

      Granted, the move is not as extensive, and the Kalman Filter could not have predicted just how low the USD/CHF would fall. However, the USD/CHF was rising in the days beforehand whereas the alpha value was not, so the discrepancy may have hinted at a sudden reversal.

      With a seasonal model, if you have an abrupt change for every 2-3 readings, then your dataset might well be stationary, in which case I don’t know if the Kalman filter would be applicable? Maybe not, but it would depend on the nature of your data. 🙂

  3. Hi Michael,

    Try the data just prior to the “shock” and see if any of the variables give you an indication or a hint:
    usdchf = Quandl("FRED/DEXSZUS", start_date="2015-01-01", end_date="2015-01-14", type="xts")
    and
    usdchf = Quandl("FRED/DEXSZUS", start_date="2010-01-01", end_date="2015-01-14", type="xts")

    It looks to me the alpha variable works simply as MA or weighted average and adjusts to give the indication only when the shock happened.

    And then try (i.e. the “shock” happened)
    usdchf = Quandl("FRED/DEXSZUS", start_date="2015-01-01", end_date="2015-01-15", type="xts")

    Kind Regards,
    Andrey
