# Keras: Regression-based neural networks

Keras is a high-level API for building and training neural networks. It runs on top of TensorFlow, which was developed by Google. The main competitor to Keras at present is PyTorch, developed by Facebook. While PyTorch has […]

# JavaScript: Statistics and Visualizations

Let’s take a look at how we can do some statistical analysis and visualizations with JavaScript. Admittedly, JavaScript is not a language one particularly associates with data science – it has traditionally belonged to web developers. That said, I’ve […]

# K-Nearest Neighbors (KNN): Solving Classification Problems

In this tutorial, we are going to use the K-Nearest Neighbors (KNN) algorithm to solve a classification problem. First, what exactly do we mean by classification? Classifying a variable means that each result is assigned to one of a set of groups, e.g. […]
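As a rough illustration of the idea, KNN can be sketched in a few lines of plain Python: measure the distance from a new observation to every labelled observation, keep the `k` closest, and take a majority vote. The function name `knn_predict` and the toy data below are hypothetical and are not taken from the tutorial itself.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count how often each label appears
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per observation, two classes
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(points, labels, (1.1, 1.0), k=3)
print(prediction)
```

An odd `k` avoids tied votes when there are only two classes. In practice a library implementation such as scikit-learn's `KNeighborsClassifier` would normally be used instead of a hand-written loop.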

# VLOOKUP and SUMIF: Replicate in Python

Often, a new Python user will want to replicate analysis previously done in Excel. Two common examples are the VLOOKUP and SUMIF functions:

- VLOOKUP: Combining data through a common index
- SUMIF: Summing up values by category […]
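To make the two operations concrete, here is a minimal sketch using only built-in Python data structures. The product IDs, categories and amounts are invented for illustration. With larger datasets, pandas is the more usual route (`DataFrame.merge` for the lookup, `groupby(...).sum()` for the conditional sum).

```python
from collections import defaultdict

# Hypothetical data: a lookup table keyed by product ID, and a list of sales rows
categories = {"P1": "Fruit", "P2": "Dairy", "P3": "Fruit"}
sales = [("P1", 10.0), ("P2", 4.5), ("P3", 7.0), ("P1", 2.5)]

# VLOOKUP equivalent: attach each row's category by looking up the shared key
joined = [(pid, categories.get(pid), amount) for pid, amount in sales]

# SUMIF equivalent: total the amounts within each category
totals = defaultdict(float)
for _, category, amount in joined:
    totals[category] += amount

print(joined)
print(dict(totals))
```

Using `categories.get(pid)` returns `None` for an ID that is missing from the lookup table, which plays a similar role to the `#N/A` result in Excel.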

# matplotlib: Generating line and pie charts in Python

Let’s take a look at how we can generate plots in Python. matplotlib is a particularly powerful library that we can use to generate visualisations. Let’s see how this works using a couple of examples. In a previous tutorial, we […]

# Cross Correlation Analysis: Analysing Currency Pairs in Python

When working with a time series, one important thing we wish to determine is whether one series “causes” changes in another. In other words, is one time series strongly correlated with another at a given number of lags? […]
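The lagged-correlation question can be sketched as follows: shift one series against the other and compute the Pearson correlation of the overlapping values at each lag. This is a simplified version that assumes non-negative lags and uses invented data. Library routines for cross-correlation normalise slightly differently, so the numbers will not match them exactly.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag] for a non-negative lag."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


# Hypothetical series: y repeats x two periods later
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
y = [0.0, 0.0] + x[:-2]

# Pick the lag (0 to 3) at which the two series line up best
best_lag = max(range(4), key=lambda lag: lagged_correlation(x, y, lag))
print(best_lag)
```

A high correlation at some lag shows that one series tends to lead the other; on its own it does not establish causation.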

# Huber vs. Ridge Regressions: Accounting for Outliers

In a previous tutorial, we saw how we can use Huber and Bisquare weightings to adjust for outliers in a dataset. These weightings allow us to adjust our regression analysis to give less weight to extreme values. The previous analysis […]
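For reference, the two weight functions can be written out directly. The sketch below assumes the residuals have already been standardised by a robust scale estimate, and it uses the commonly quoted tuning constants (1.345 for Huber, 4.685 for bisquare). The sample residuals are made up.

```python
def huber_weight(residual, k=1.345):
    """Full weight inside the threshold k; beyond it the weight decays as k / |r|."""
    r = abs(residual)
    return 1.0 if r <= k else k / r


def bisquare_weight(residual, c=4.685):
    """Weight falls smoothly towards zero and is exactly zero at or beyond c."""
    r = abs(residual)
    if r >= c:
        return 0.0
    return (1 - (r / c) ** 2) ** 2


# Hypothetical standardised residuals; the last one is an extreme outlier
residuals = [0.2, -1.0, 2.5, 10.0]
huber = [huber_weight(r) for r in residuals]
bisquare = [bisquare_weight(r) for r in residuals]

print(huber)
print(bisquare)
```

The difference in behaviour is visible on the outlier: Huber still gives it a small positive weight, while bisquare discards it entirely.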

# pykalman: Analysis of USD/CHF with Kalman Filter

In a previous tutorial, we saw how the Kalman Filter can account for “shocks”, or sudden changes in a time series. The analysis was done within R. Let’s now see how we can analyse the USD/CHF currency pair with the […]
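To give a feel for what the filter does, here is a hand-rolled one-dimensional filter for a simple local-level (random walk plus noise) model. It is only a sketch and does not use pykalman. The function name, the variance settings and the sample series are all assumptions chosen for illustration.

```python
def kalman_filter_1d(observations, process_var=1e-3, obs_var=0.1):
    """Filter a series under a random-walk state observed with noise."""
    estimate, variance = observations[0], 1.0
    filtered = []
    for z in observations:
        # Predict: the state follows a random walk, so uncertainty grows
        variance += process_var
        # Update: blend the prediction with the new observation via the gain
        gain = variance / (variance + obs_var)
        estimate += gain * (z - estimate)
        variance *= 1 - gain
        filtered.append(estimate)
    return filtered


# Hypothetical series with a sudden level shift ("shock") halfway through
series = [1.0] * 5 + [2.0] * 5
smoothed = kalman_filter_1d(series)
print(smoothed)
```

After the shift, the filtered estimate moves towards the new level step by step rather than jumping immediately. A larger `process_var` makes it adapt faster, at the cost of tracking noise more closely.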

# Kalman Filter: Modelling Time Series Shocks with KFAS in R

We have already seen how time series models such as ARIMA can be used to make time series forecasts. While these models can prove to have high degrees of accuracy, they have one major shortcoming – they do not account […]

# Working with panel data in R: Fixed vs. Random Effects (plm)

Panel data, along with cross-sectional and time series data, are the main data types that we encounter when working with regression analysis.

Types of data:

- Cross-Sectional: Data collected at one particular point in time
- Time Series: Data collected across several […]