# Portfolio

## rvest: Web Scraping Using R

rvest is one of the standard R packages for web scraping. In the following example, we use it to import a sample table from a webpage.
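
As a minimal sketch of the table import described above, the snippet below parses a small inline HTML page rather than a live site, so that it runs offline. The table contents and the `sample_table` name are made up for illustration; for a real page one would pass its URL to `read_html()` instead of using `minimal_html()`.

```r
library(rvest)

# Hypothetical inline page standing in for the real webpage (assumption)
page <- minimal_html('
  <table>
    <tr><th>Name</th><th>Value</th></tr>
    <tr><td>alpha</td><td>1</td></tr>
    <tr><td>beta</td><td>2</td></tr>
  </table>
')

# Select the first <table> node and convert it to a data frame (tibble)
sample_table <- page %>%
  html_element("table") %>%
  html_table()

print(sample_table)
```

`html_element()` returns only the first match; `html_elements()` followed by `html_table()` would instead return a list with one data frame per table on the page.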

## Create a database with MySQL and execute queries

The following is a hypothetical dataset of 20 securities, each with several financial variables. As a relational database system, MySQL lets us select exactly the data we need and run calculations on the data already stored. We use the MySQL queries below to illustrate both: retrieving subsets of the database and computing new values from it (note that the securities in this database are hypothetical, and any resemblance to a real-life security or company is merely coincidental).
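
To keep the examples on this page in one language, the sketch below issues the kind of queries described above from R through the DBI package. It swaps in an in-memory SQLite database (via RSQLite) for a MySQL server so that it runs without any setup; the SQL itself is plain enough that it should also run on MySQL. The `securities` table, its columns, and its four rows are invented for illustration and are not the 20-security dataset.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for a MySQL server
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical securities with made-up values
securities <- data.frame(
  ticker = c("AAA", "BBB", "CCC", "DDD"),
  sector = c("Tech", "Tech", "Energy", "Energy"),
  price  = c(50, 120, 30, 80),
  eps    = c(5, 6, 3, 4)
)
dbWriteTable(con, "securities", securities)

# Select specific rows: securities priced above 60
expensive <- dbGetQuery(con, "
  SELECT ticker, price
  FROM securities
  WHERE price > 60
  ORDER BY price DESC
")

# Calculate a new column from existing ones: price-to-earnings ratio
pe <- dbGetQuery(con, "
  SELECT ticker, price / eps AS pe_ratio
  FROM securities
  ORDER BY ticker
")

# Aggregate: average price per sector
by_sector <- dbGetQuery(con, "
  SELECT sector, AVG(price) AS avg_price
  FROM securities
  GROUP BY sector
  ORDER BY sector
")

dbDisconnect(con)

print(expensive)
print(pe)
print(by_sector)
```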

## Serial Correlation: Durbin-Watson and Cochrane-Orcutt Remedy

Serial correlation (also known as autocorrelation) violates the Ordinary Least Squares assumption that the error terms of different observations are uncorrelated with one another. Under first-order serial correlation, the current value of the error term is a function of its immediately preceding value:

$$e_t = \rho e_{t-1} + u_t$$

where $e_t$ is the error term of the equation in question, $\rho$ is the first-order autocorrelation coefficient, and $u_t$ is a classical (not serially correlated) error term.
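
The following base-R sketch shows how the two tools named in the heading relate to this error process. It simulates a regression whose errors follow the first-order process above, computes the Durbin-Watson statistic by hand (values near 2 suggest no first-order serial correlation, values well below 2 suggest positive serial correlation), and then applies a single Cochrane-Orcutt step. The sample size, seed, coefficients, and all variable names are assumptions chosen for illustration; the full procedure would normally repeat the step until the estimate of ρ settles.

```r
set.seed(42)   # arbitrary seed for reproducibility
n   <- 500
rho <- 0.7     # assumed autocorrelation coefficient for the simulation

# Simulate first-order serially correlated errors: e_t = rho * e_{t-1} + u_t
u <- rnorm(n)
e <- numeric(n)
e[1] <- u[1]
for (i in 2:n) e[i] <- rho * e[i - 1] + u[i]

x <- rnorm(n)
y <- 1 + 2 * x + e

# Durbin-Watson statistic: sum of squared residual changes over sum of squares
durbin_watson <- function(res) sum(diff(res)^2) / sum(res^2)

ols    <- lm(y ~ x)
res    <- resid(ols)
dw_ols <- durbin_watson(res)

# One Cochrane-Orcutt step:
# 1. estimate rho by regressing each residual on the previous one (no intercept)
rho_hat <- unname(coef(lm(res[-1] ~ 0 + res[-n]))[1])

# 2. quasi-difference both variables and re-estimate by OLS
y_star <- y[-1] - rho_hat * y[-n]
x_star <- x[-1] - rho_hat * x[-n]
co     <- lm(y_star ~ x_star)
dw_co  <- durbin_watson(resid(co))

# The slope is comparable across both fits; the transformed intercept
# estimates beta0 * (1 - rho), not beta0 itself
cat("estimated rho:         ", round(rho_hat, 3), "\n")
cat("DW before remedy:      ", round(dw_ols, 3), "\n")
cat("DW after one CO step:  ", round(dw_co, 3), "\n")
```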

## Stationarity and Cointegration in R (adf, egcm, pp, kpss)

When we call a time series stationary, we mean that its mean, variance and autocorrelation are constant over time. Cointegration, on the other hand, occurs when two time series are each non-stationary, but some linear combination of them is stationary. So, why is stationarity important? A major purpose of time series modelling is to predict future values from current data, and a model fitted to a series whose statistical properties drift over time cannot be relied on to describe that series in the future.
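
The sketch below illustrates both ideas on simulated data, using `adf.test()` and `kpss.test()` from the tseries package. The two series, the seed, and the coefficients are assumptions made up for the example. Note that the final step only conveys the idea behind Engle-Granger cointegration testing: residuals from an estimated regression call for different critical values than a raw series, which is what a dedicated package such as egcm is meant to handle.

```r
library(tseries)

set.seed(1)   # arbitrary seed for reproducibility
n <- 500

# A random walk is non-stationary: every shock accumulates permanently
random_walk <- cumsum(rnorm(n))

# A second series sharing the same stochastic trend (cointegrated by construction)
partner <- 0.5 + 2 * random_walk + rnorm(n)

# tseries warns when a p-value falls outside its lookup table, so the
# warnings are suppressed here; the reported p-value is then a bound

# ADF test: the null hypothesis is a unit root (non-stationarity)
adf_level <- suppressWarnings(adf.test(random_walk))
adf_diff  <- suppressWarnings(adf.test(diff(random_walk)))

# KPSS test: the null hypothesis is stationarity (the reverse of ADF)
kpss_level <- suppressWarnings(kpss.test(random_walk))

# Engle-Granger idea: regress one series on the other and test the residuals
spread     <- resid(lm(partner ~ random_walk))
adf_spread <- suppressWarnings(adf.test(spread))

cat("ADF p-value, level:       ", adf_level$p.value, "\n")
cat("ADF p-value, differenced: ", adf_diff$p.value, "\n")
cat("KPSS p-value, level:      ", kpss_level$p.value, "\n")
cat("ADF p-value, residuals:   ", adf_spread$p.value, "\n")
```

Because the ADF and KPSS tests have opposite null hypotheses, running both gives a useful cross-check: a stationary series should reject the ADF null while failing to reject the KPSS null.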