Variance-Covariance Matrix: Stock Price Analysis in R (corpcor, covmat)

A variance-covariance matrix shows the variance of each variable on its diagonal, while the off-diagonal entries show the covariance between every pair of variables.
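As a rough illustration of that layout, here is a minimal sketch in plain Python. The three tickers and their daily returns are made up for the example, and the helper names (`sample_cov`, `cov_matrix`) are hypothetical; they are not the corpcor or covmat functions used in the full post.

```python
# Hypothetical daily returns for three made-up tickers (illustrative only).
returns = {
    "AAA": [0.01, -0.02, 0.015, 0.03, -0.01],
    "BBB": [0.02, -0.01, 0.010, 0.02, -0.02],
    "CCC": [-0.01, 0.02, -0.005, -0.02, 0.01],
}

def mean(xs):
    return sum(xs) / len(xs)

def sample_cov(xs, ys):
    """Sample covariance using the n - 1 denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def cov_matrix(data):
    """Return the variable names and the full variance-covariance matrix."""
    names = list(data)
    return names, [[sample_cov(data[a], data[b]) for b in names] for a in names]

names, matrix = cov_matrix(returns)

# Diagonal entries are variances; off-diagonal entries are covariances.
for name, row in zip(names, matrix):
    print(name, ["{:+.6f}".format(v) for v in row])
```

The covariance of a series with itself is its variance, which is why the diagonal of the printed matrix holds the variances.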

Sentiment Analysis with twitteR and tidytext

A sentiment analysis is a useful way of gauging group opinion on a certain topic at a particular point in time.

Using social media data, let us see how we can use the twitteR library to stream tweets from Twitter and conduct a sentiment analysis to determine current sentiment on gold prices.
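The scoring step of such an analysis can be sketched in plain Python. This is only an illustration under stated assumptions: the tiny lexicon and the three example texts are invented, and the sketch does not connect to Twitter or use the twitteR or tidytext packages from the full post.

```python
import re
from collections import Counter

# Tiny hypothetical lexicon; a real analysis would use an established one.
LEXICON = {
    "gain": 1, "rally": 1, "strong": 1, "bullish": 1,
    "fall": -1, "drop": -1, "weak": -1, "bearish": -1,
}

# Made-up example texts standing in for collected tweets.
tweets = [
    "Gold looks strong today, expecting a rally",
    "Bearish on gold, prices could drop further",
    "Gold price unchanged this morning",
]

def score(text):
    """Sum the lexicon values of every word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(w, 0) for w in words)

def label(s):
    """Map a numeric score to a coarse sentiment label."""
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

scores = [score(t) for t in tweets]
summary = Counter(label(s) for s in scores)
print(scores, dict(summary))
```

Counting labels across many texts gives a crude picture of overall sentiment at the time the texts were collected.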

Decision Trees with Python

Let’s take a look at how we can construct decision trees in Python.

A decision tree is a model used to solve classification and regression tasks. As we saw in our example for R, the model splits the data into branches that lead to different outcomes, which we can then use to make a decision.
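To make the idea of splitting concrete, here is a minimal sketch of a one-level tree (a "stump") for classification in plain Python. It picks the single threshold that minimises weighted Gini impurity. The data (age, income in thousands) and all function names are hypothetical, and this is not necessarily the library or method used in the full post.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best single split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put rows on both sides
            w = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or w < best[2]:
                best = (f, t, w)
    return best

def fit_stump(rows, labels):
    """Fit a one-level tree: one split plus a majority label on each side."""
    f, t, _ = best_split(rows, labels)
    left = [y for r, y in zip(rows, labels) if r[f] <= t]
    right = [y for r, y in zip(rows, labels) if r[f] > t]
    majority = lambda ys: max(set(ys), key=ys.count)
    return f, t, majority(left), majority(right)

def predict(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

# Made-up data: (age, income in thousands) -> did the customer buy?
rows = [(25, 30), (30, 45), (45, 80), (50, 90), (35, 40), (55, 60)]
labels = ["no", "no", "yes", "yes", "no", "yes"]

stump = fit_stump(rows, labels)
print(stump, predict(stump, (28, 50)))
```

A full decision tree repeats the same split search recursively on each side until a stopping rule is met.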

Voice Recognition with Python (speech_recognition and PyAudio)

Python has quite a handy library called speech_recognition, which we can use to create a program where a user’s voice can be transcribed into text.

Let’s have a look at how we can do this. Note that I’m using Python version 3.6.0 at the time of writing to run the code below.

Cumulative Binomial Probability with R and Shiny

In conducting probability analysis, the two variables that take account of the chance of an event happening are N (number of observations) and λ (lambda, our hit rate or chance of occurrence in a single interval; for a binomial trial this is more conventionally written p). When we talk about a cumulative binomial probability distribution, we mean to say that the greater the number of trials, the higher the overall probability of the event occurring at least once.
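The relationship between the number of trials and the cumulative probability can be sketched in a few lines of Python. The per-trial probability of 0.05 is a hypothetical value chosen for illustration, and the function names are my own rather than anything from the R and Shiny code in the full post.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def prob_at_least_one(n, p):
    """Probability of one or more successes: 1 - P(zero successes)."""
    return 1 - (1 - p)**n

p = 0.05  # hypothetical per-trial hit rate
for n in (1, 10, 50, 100):
    print(n, round(prob_at_least_one(n, p), 4))
```

With the per-trial probability held fixed, the printed probability of at least one occurrence rises as the number of trials grows.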

Linear and Logistic Regression Modelling in Python

The statsmodels and sklearn libraries are frequently used to generate regression output. Even so, it is often the case that a user needs to work with different libraries depending on the extent of the analysis.

plyr and dplyr: Data Manipulation in R

The purpose of the plyr and dplyr libraries in R is to manipulate data with ease.

As we’ve seen in a previous post, there are various methods of wrangling and summarising data in R. However, wouldn’t it be great if we had some libraries that can greatly simplify this process for us?

How To Create a Twitter App and API Interface Via Python

This tutorial illustrates how to connect to a Twitter account from Python using the Twitter library. Specifically, the API allows a user to extract large quantities of data pertaining to a specific Twitter account, as well as to control Twitter posts directly from Python (such as posting multiple tweets at once).

The rest of my tutorial is available at Sitepoint.

ARIMA Models: Stock Price Forecasting with Python and R

ARIMA (Autoregressive Integrated Moving Average) is a major tool used in time series analysis to attempt to forecast future values of a variable based on its own past values. For this particular example, I use a stock price dataset of Johnson & Johnson (JNJ) from 2006 to 2016, and use the aforementioned model to conduct price forecasting on this time series.
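To show the mechanics in miniature, here is a sketch of the simplest non-trivial case, ARIMA(1,1,0) without drift, in plain Python: difference the series once, fit a single autoregressive coefficient by least squares, and roll the forecast forward. The price series is made up (it is not the JNJ data), the function names are hypothetical, and a real analysis would rely on a dedicated library and would also consider moving-average terms, drift and diagnostics.

```python
def difference(series):
    """First difference: the 'I' (integrated) step with d = 1."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(x):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + e[t] (no intercept)."""
    num = sum(a * b for a, b in zip(x, x[1:]))
    den = sum(a * a for a in x[:-1])
    return num / den

def forecast(series, steps):
    """Forecast future levels by predicting differences and cumulating them."""
    d = difference(series)
    phi = fit_ar1(d)
    last_level, last_diff = series[-1], d[-1]
    out = []
    for _ in range(steps):
        last_diff = phi * last_diff   # predicted next change
        last_level += last_diff       # undo the differencing
        out.append(last_level)
    return phi, out

# Made-up price series whose changes halve each period (illustrative only).
prices = [100.0, 101.0, 101.5, 101.75, 101.875]

phi, predictions = forecast(prices, 2)
print(phi, predictions)
```

Because the model works on differences, each forecast is the previous level plus a predicted change, so the predictions depend only on the history of the series itself.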