Creating functions and using lapply in R

Functions are used to simplify a series of calculations.

For instance, let us suppose that there exists an array of numbers which we wish to add to another variable. Instead of carrying out separate calculations for each number in the array, it would be much easier to simply create a function that does this for us automatically.
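As a minimal sketch of the idea above, the snippet below defines a small function and applies it to each number with lapply. The names `numbers`, `offset` and `add_offset` are hypothetical and chosen only for illustration.

```r
# Hypothetical data: a vector of numbers and a value to add to each of them
numbers <- c(1, 2, 3, 4)
offset <- 10

# A small function that adds `offset` to a single number
add_offset <- function(x, offset) {
  x + offset
}

# lapply calls add_offset once per element and returns a list
result <- lapply(numbers, add_offset, offset = offset)

# unlist flattens the list back into a numeric vector
result_vector <- unlist(result)
print(result_vector)
```

For simple arithmetic like this, R's vectorised `numbers + offset` gives the same values; lapply becomes more useful when the function body is longer or the input is a list.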

OLS and Logistic Regression Models in R

We use linear models primarily to analyse cross-sectional data; i.e. data collected at one specific point in time across several observations. We can also use such models with time series data, but need to be cautious of issues such as serial correlation.
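To illustrate the two model types, here is a short sketch that fits an OLS model with lm and a logistic model with glm. It assumes R's built-in mtcars dataset, which is cross-sectional; the choice of dataset and variables is an assumption made for this example only.

```r
# mtcars ships with R: one row per car, so the data are cross-sectional

# OLS: fuel economy (mpg) explained by weight (wt) and horsepower (hp)
ols <- lm(mpg ~ wt + hp, data = mtcars)
print(coef(ols))

# Logistic regression: probability of a manual transmission (am = 1)
# explained by weight
logit <- glm(am ~ wt, data = mtcars, family = binomial)
print(coef(logit))

# Fitted probabilities on the response scale (between 0 and 1)
predicted_prob <- predict(logit, type = "response")
```

Calling `summary(ols)` or `summary(logit)` prints standard errors and significance tests for each coefficient.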

Cross-Correlation of Currency Pairs In R (ccf)

When working with time series, one important thing we wish to determine is whether changes in one series “cause” changes in another. In other words, is one series strongly correlated with lagged values of another? We can detect this by measuring cross-correlation.
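The following sketch shows how cross-correlation can be measured with the ccf function from R's stats package. It uses two simulated series rather than real currency data, with `y` built to follow `x` after a delay of two periods; all variable names and the delay are assumptions for illustration.

```r
set.seed(42)
n <- 200

# Simulated series: x is white noise, y repeats x two periods later plus noise
x <- rnorm(n)
y <- c(rep(0, 2), x[1:(n - 2)]) + rnorm(n, sd = 0.5)

# ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t]
cc <- ccf(x, y, lag.max = 10, plot = FALSE)

# Lag with the largest absolute cross-correlation
best_lag <- cc$lag[which.max(abs(cc$acf))]
print(best_lag)
```

Under ccf's convention, a peak at a negative lag indicates that `x` leads `y`. A strong cross-correlation at some lag is evidence of a lead-lag relationship, not proof of causation.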

Chow Test For Structural Breaks in Time Series

A Chow test is designed to determine whether a structural break exists in a time series, that is, a sharp change in trend that merits further study. For instance, a structural break in one series can give useful clues as to whether such a change is being propagated across other variables, assuming that there is a significant correlation between them under normal circumstances.
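As a rough sketch, the Chow F-statistic can be computed in base R by comparing the residual sum of squares of one pooled regression against two regressions fitted on either side of a candidate break. The simulated data, the helper `rss` and the break date of 50 are all assumptions made for this example.

```r
set.seed(1)
n <- 100
time_idx <- 1:n
break_point <- 50  # assumed known break date

# Simulated series whose intercept and slope change after the break
y <- ifelse(time_idx <= break_point,
            1 + 0.5 * time_idx,
            40 - 0.2 * time_idx) + rnorm(n)
dat <- data.frame(time_idx = time_idx, y = y)

# Helper: residual sum of squares of a fitted model
rss <- function(model) sum(residuals(model)^2)

pooled <- lm(y ~ time_idx, data = dat)
before <- lm(y ~ time_idx, data = dat[dat$time_idx <= break_point, ])
after  <- lm(y ~ time_idx, data = dat[dat$time_idx >  break_point, ])

k <- 2  # parameters per regression: intercept and slope
rss_split <- rss(before) + rss(after)

# Chow F-statistic and its p-value under the null of no structural break
f_stat <- ((rss(pooled) - rss_split) / k) / (rss_split / (n - 2 * k))
p_value <- pf(f_stat, k, n - 2 * k, lower.tail = FALSE)
print(c(F = f_stat, p = p_value))
```

A small p-value rejects the null hypothesis that a single regression line describes both sub-periods.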

Decision Trees and Random Forests in R

Decision trees are a highly useful visual aid for analysing a series of predicted outcomes for a particular model. As such, they are often used as a supplement (or even an alternative) to regression analysis in determining how a series of explanatory variables will impact the dependent variable.
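A brief sketch of fitting a decision tree is shown below. It assumes the rpart package (a recommended package bundled with most R installations) and the built-in iris dataset; neither choice comes from the original post. Random forests require a separate package and are not shown here.

```r
library(rpart)

# Classification tree: predict the species from the four flower measurements
fit <- rpart(Species ~ ., data = iris, method = "class")

# Text view of the splits; plot(fit) followed by text(fit) draws the tree
print(fit)

# Predicted class for each row, and accuracy on the training data
predicted <- predict(fit, newdata = iris, type = "class")
accuracy <- mean(predicted == iris$Species)
print(accuracy)
```

Accuracy measured on the training data is optimistic; a held-out test set gives a fairer picture of how the tree generalises.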

Data Cleaning, Merging and Wrangling in R

One of the big issues when working with data in any context is cleaning and merging datasets. You will often find yourself having to collate data across multiple files, and will need to rely on R to carry out tasks that you would normally perform with functions like VLOOKUP in Excel.
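The VLOOKUP comparison can be sketched with base R's merge function, which joins two data frames on a shared key column. The `customers` and `orders` data frames below are hypothetical and exist only to show the mechanics.

```r
# Hypothetical lookup table: one row per customer
customers <- data.frame(
  id = c(1, 2, 3),
  name = c("Ann", "Bob", "Cara"),
  stringsAsFactors = FALSE
)

# Hypothetical transactions, including an id (4) missing from the lookup table
orders <- data.frame(
  id = c(1, 1, 3, 4),
  amount = c(20, 35, 15, 50)
)

# Left join: keep every order and look up the matching customer name
merged <- merge(orders, customers, by = "id", all.x = TRUE)
print(merged)
```

With `all.x = TRUE`, orders without a matching customer are kept and their `name` is set to NA, much like a VLOOKUP that finds no match.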

neuralnet: Train and Test Neural Networks Using R

A neural network is a computational system that creates predictions based on existing data. Let us train and test a neural network using the neuralnet library in R.

Serial Correlation: Durbin-Watson and Cochrane-Orcutt Remedy

Serial correlation (also known as autocorrelation) is a violation of the Ordinary Least Squares assumption that all observations of the error term in a dataset are uncorrelated. In a model with serial correlation, the current value of the error term is a function of the one immediately previous to it:

  e_t = ρ e_{t-1} + u_t
  where e = error term of the equation in question; ρ = first-order autocorrelation coefficient; u = classical (not serially correlated) error term
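
To make the equation concrete, the sketch below simulates errors that follow this first-order process, fits an OLS regression, and computes the Durbin-Watson statistic by hand from the residuals. The value ρ = 0.7, the regression coefficients and all variable names are assumptions for illustration.

```r
set.seed(123)
n <- 200
rho <- 0.7  # assumed first-order autocorrelation coefficient

# Build serially correlated errors: e_t = rho * e_{t-1} + u_t
u <- rnorm(n)
e <- numeric(n)
e[1] <- u[1]
for (i in 2:n) {
  e[i] <- rho * e[i - 1] + u[i]
}

# Regression whose error term is serially correlated
x <- rnorm(n)
y <- 2 + 3 * x + e
fit <- lm(y ~ x)
res <- residuals(fit)

# Durbin-Watson statistic: sum of squared successive differences of the
# residuals divided by the sum of squared residuals
dw <- sum(diff(res)^2) / sum(res^2)

# Approximate estimate of rho implied by the statistic (DW is roughly 2 * (1 - rho))
rho_hat <- 1 - dw / 2
print(c(DW = dw, rho_hat = rho_hat))
```

A statistic close to 2 suggests no first-order serial correlation, while values well below 2 point to positive serial correlation.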
