In this example, an OLS regression model is constructed to forecast future S&P 500 levels from the price of Brent crude oil. However, since this regression uses time series data, potential violations such as […]

# Category: R

# Bayesian Statistics: Analysis of Health Data

The premise of Bayesian statistics is that distributions are based on a personal belief about their shape, in contrast to the classical approach, which does not take such subjectivity into account. In this regard, Bayesian statistics defines distributions […]

# Multilevel Modelling in R: Analysing Vendor Data

One of the main limitations of regression analysis arises when one needs to examine changes in data across several categories. This problem can be resolved by using a multilevel model, i.e. one that varies at more than one level and […]

# Visualizing New York City WiFi Access with K-Means Clustering

Visualization has become a key application of data science in the telecommunications industry. Specifically, telecommunication analysis is highly dependent on the use of geospatial data. This is because telecommunication networks in themselves are geographically dispersed, and analysis of such dispersions […]

# Keras with R: Predicting car sales

Keras is a high-level API for building and running neural networks. It runs on top of TensorFlow, which was developed by Google. In this particular example, a neural network will be built in Keras to solve a regression problem, i.e. […]

# Kalman Filter: Modelling Time Series Shocks with KFAS in R

When it comes to time series forecasts, conventional models such as ARIMA are a popular option. While these models can be highly accurate, they have one major shortcoming – they do not typically account for […]

# Working with panel data in R: Fixed vs. Random Effects (plm)

Panel data, along with cross-sectional and time series data, are the main data types that we encounter when working with regression analysis.

# Robust Regressions: Dealing with Outliers

It is often the case that a dataset contains significant outliers, i.e. observations that lie far outside the range of most other observations in the dataset. Let us see how we can use robust regressions to deal […]

# Variance-Covariance Matrix: Stock Price Analysis in R (corpcor, covmat)

A variance-covariance matrix shows the variance of each variable on its diagonal, while the off-diagonal entries show the covariance between every pair of variables.
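As a rough illustration of that layout, the sketch below builds a variance-covariance matrix with base R's `cov()` rather than the `corpcor` package named in the title. The `returns` data frame and its values are hypothetical, invented purely for this example.

```r
# Hypothetical daily returns for three stocks (illustrative values only)
returns <- data.frame(
  stock_a = c(0.010, -0.020, 0.015, 0.030, -0.010),
  stock_b = c(0.020, -0.010, 0.010, 0.025, -0.015),
  stock_c = c(-0.005, 0.010, 0.000, -0.020, 0.015)
)

# cov() on a data frame returns the sample variance-covariance matrix
vcov_matrix <- cov(returns)
print(vcov_matrix)

# The diagonal holds each variable's variance
variances <- diag(vcov_matrix)

# An off-diagonal entry holds the covariance of one pair of variables
cov_a_b <- vcov_matrix["stock_a", "stock_b"]
```

The matrix is symmetric, so `vcov_matrix["stock_a", "stock_b"]` and `vcov_matrix["stock_b", "stock_a"]` hold the same value.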

# Cumulative Binomial Probability with R and Shiny

In conducting probability analysis, the two variables that take account of the chance of an event happening are N (number of observations) and λ (lambda – our hit rate/chance of occurrence in a single interval). When we talk about a […]
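A minimal sketch of a cumulative binomial probability in base R follows. It assumes the excerpt's λ plays the role of the per-trial success probability (the `prob` argument of `pbinom()`). The values of `N`, `lambda` and `k` are hypothetical.

```r
# Hypothetical inputs: 10 observations, 50% hit rate, at most 3 successes
N <- 10
lambda <- 0.5
k <- 3

# P(X <= k): cumulative binomial probability
p_at_most_k <- pbinom(k, size = N, prob = lambda)

# Same quantity as an explicit sum of point probabilities
p_by_sum <- sum(dbinom(0:k, size = N, prob = lambda))

# P(X > k): upper tail
p_more_than_k <- pbinom(k, size = N, prob = lambda, lower.tail = FALSE)

print(p_at_most_k)
```

With these inputs, `p_at_most_k` equals 176/1024, since there are 1 + 10 + 45 + 120 ways to get three or fewer successes out of 2^10 equally likely outcomes.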
