Kalman Filter: Modelling Time Series Shocks with KFAS in R

We have already seen how time series models such as ARIMA can be used to make forecasts. While these models can be highly accurate, they have one major shortcoming: they do not account for “shocks”, or sudden changes in a time series. Let’s see how we can alleviate this problem using a model known as the Kalman Filter.

Working with panel data in R: Fixed vs. Random Effects (plm)

Panel data, along with cross-sectional and time series data, are the main data types that we encounter when working with regression analysis.

Robust Regressions: Dealing with Outliers

It is often the case that a dataset contains significant outliers, that is, observations that lie far outside the range of most other observations in the dataset. Let us see how we can use robust regressions to deal with this issue.

Variance-Covariance Matrix: Stock Price Analysis in R (corpcor, covmat)

A variance-covariance matrix shows the variance of each variable along its diagonal, while the off-diagonal entries show the covariance between every pair of variables.
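
As a minimal sketch of that layout, the snippet below builds a sample variance-covariance matrix by hand in plain Python. The post itself works in R with corpcor; the tickers and returns here are made-up illustrative values, and `covariance_matrix` is a hypothetical helper rather than anything from that library.

```python
from statistics import mean


def covariance_matrix(columns):
    """Sample variance-covariance matrix (n - 1 denominator) for equal-length columns."""
    n = len(columns[0])
    k = len(columns)
    means = [mean(col) for col in columns]
    return [
        [
            sum((columns[i][t] - means[i]) * (columns[j][t] - means[j]) for t in range(n)) / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


# Hypothetical daily returns for three made-up tickers (illustrative values only).
returns = {
    "AAA": [0.01, -0.02, 0.03, 0.00],
    "BBB": [0.02, -0.01, 0.02, 0.01],
    "CCC": [-0.01, 0.02, -0.03, 0.00],  # exact mirror image of AAA
}
names = list(returns)
cov = covariance_matrix([returns[name] for name in names])

# Diagonal entries are variances; off-diagonal entries are pairwise covariances.
for name, row in zip(names, cov):
    print(name, [round(value, 6) for value in row])
```

Because CCC is the mirror image of AAA in this toy data, their covariance is simply the negative of AAA's variance, which makes the matrix easy to sanity-check by eye.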

Sentiment Analysis with twitteR and tidytext

A sentiment analysis is a useful way of gauging group opinion on a certain topic at a particular point in time.

Using social media data, let us see how we can use the twitteR library to stream tweets from Twitter and conduct a sentiment analysis to determine current sentiment on gold prices.

Cumulative Binomial Probability with R and Shiny

In this probability analysis, the two quantities that determine the chance of an event happening are N (the number of trials) and λ (lambda, our hit rate, i.e. the chance of occurrence in a single trial; for a binomial distribution this is more conventionally written p). A cumulative binomial probability sums the probabilities of a range of outcomes. For example, the probability of at least one occurrence in N trials is 1 - (1 - λ)^N, which rises as the number of trials grows.
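
To make this concrete, here is a small sketch in plain Python (the post itself uses R and Shiny). The hit rate of 0.02 is an assumed illustrative value, and `binomial_pmf` and `prob_at_least` are hypothetical helper names.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k occurrences in n independent trials with hit rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k_min, n, p):
    """Cumulative probability of k_min or more occurrences in n trials."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))


hit_rate = 0.02  # assumed chance of occurrence in a single trial

# The chance of at least one occurrence grows with the number of trials.
for n_trials in (10, 50, 100):
    print(n_trials, round(prob_at_least(1, n_trials, hit_rate), 4))
```

Summing the probability mass function keeps the example general: the same `prob_at_least` call covers "at least two", "at least three", and so on by changing `k_min`.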

plyr and dplyr: Data Manipulation in R

The purpose of the plyr and dplyr libraries in R is to manipulate data with ease.

As we’ve seen in a previous post, there are various methods of wrangling and summarising data in R. However, wouldn’t it be great if we had some libraries that can greatly simplify this process for us?

ARIMA Models: Stock Price Forecasting with Python and R

ARIMA (Autoregressive Integrated Moving Average) is a major tool used in time series analysis to forecast future values of a variable based on its own past values and past forecast errors. For this example, I use a stock price dataset for Johnson & Johnson (JNJ) from 2006-2016 and apply the model to forecast prices for this time series.

PostgreSQL Databases: Connect To R and Python

PostgreSQL is a commonly used relational database system for storing and managing large amounts of data effectively.

Here, you will see how to:

  1. create a PostgreSQL database using the Linux terminal
  2. connect the PostgreSQL database to R using the “RPostgreSQL” library, and to Python using the “psycopg2” library

Creating functions and using lapply in R

Functions are used to simplify a series of calculations.

For instance, let us suppose that we have a vector of numbers, each of which we wish to add to another variable. Instead of carrying out a separate calculation for each number, it would be much easier to create a function that does this for us automatically.
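
The post works through this in R with lapply. As a rough sketch of the same idea in Python, the snippet below defines one small function and applies it to every element. The names `add_offset`, `numbers` and `offset` are hypothetical, and Python's built-in `map` is only a loose analogue of lapply.

```python
def add_offset(value, offset):
    """Add a fixed offset to a single number."""
    return value + offset


numbers = [1, 2, 3, 4]  # illustrative values
offset = 10

# Apply the function to every element instead of repeating the sum by hand.
shifted = [add_offset(x, offset) for x in numbers]

# The same result using map, which applies a function element by element.
shifted_via_map = list(map(lambda x: add_offset(x, offset), numbers))

print(shifted)
print(shifted_via_map)
```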
