In this example, an OLS regression model is constructed in an attempt to forecast future S&P 500 levels based on the price of Brent crude oil. However, since this OLS regression incorporates time series data, potential violations such as […]

# Bayesian Statistics: Analysis of Health Data

The premise of Bayesian statistics is that a distribution reflects a prior belief about its shape, in contrast to the classical (frequentist) approach, which does not take such subjectivity into account. In this regard, Bayesian statistics defines distributions […]

# Boosting: Is It Always The Best Option?

Gradient boosting has become quite a popular technique in the area of machine learning. Given its reputation for achieving potentially higher accuracy than other modelling techniques, it has become particularly popular as a “go-to” model for Kaggle competitions. However, use […]

# Multilevel Modelling in R: Analysing Vendor Data

One of the main limitations of regression analysis arises when one needs to examine changes in data across several categories. This problem can be resolved by using a multilevel model, i.e. one that varies at more than one level and […]

# Visualizing New York City WiFi Access with K-Means Clustering

Visualization has become a key application of data science in the telecommunications industry. Specifically, telecommunication analysis is highly dependent on the use of geospatial data. This is because telecommunication networks in themselves are geographically dispersed, and analysis of such dispersions […]

# Predicting Irish electricity consumption with an LSTM neural network

In this example, neural networks are used to forecast energy consumption of the Dublin City Council Civic Offices using data from April 2011 to February 2013. The original dataset is available from data.gov.ie, and daily data was created by summing […]

# Keras with R: Predicting car sales

Keras is a high-level API for building and running neural networks. It runs on top of TensorFlow, which was developed by Google. In this particular example, a neural network will be built in Keras to solve a regression problem, i.e. […]

# Image Recognition with Keras: Convolutional Neural Networks

Image recognition and classification is a rapidly growing field in the area of machine learning. In particular, object recognition is a key feature of image classification, and the commercial implications of this are vast.

# Keras: Regression-based neural networks

Keras is a high-level API for building and running neural networks. It runs on top of TensorFlow, which was developed by Google. The main competitor to Keras at this point in time is PyTorch, developed by Facebook. While PyTorch has […]

# K-Nearest Neighbors (KNN): Solving Classification Problems

In this tutorial, we are going to use the K-Nearest Neighbors (KNN) algorithm to solve a classification problem. Firstly, what exactly do we mean by classification? Classification across a variable means that results are categorised into a particular group. e.g. […]
