A neural network is a computational system that creates predictions based on existing data. Let us train and test a neural network using the *neuralnet* library in R.

## How To Construct A Neural Network?

A neural network consists of:

- **Input layers:** Layers that take inputs based on existing data
- **Hidden layers:** Layers that use backpropagation to optimise the weights of the input variables in order to improve the predictive power of the model
- **Output layers:** Output of predictions based on the data from the input and hidden layers

## Model Background

Our goal is to develop a neural network *to determine if a stock pays a dividend or not*.

As such, we are using the neural network to solve a classification problem, that is, one where the network assigns each observation to a category.

In our dataset, we assign a value of **1** to a stock that pays a dividend. We assign a value of **0** to a stock that does not pay a dividend.

Our independent variables are as follows:

- **fcfps:** Free cash flow per share (in $)
- **earnings_growth:** Earnings growth in the past year (in %)
- **de:** Debt to Equity ratio
- **mcap:** Market Capitalization of the stock
- **current_ratio:** Current Ratio (or Current Assets/Current Liabilities)

Let’s take a look at the steps we will follow in constructing this model.

## Data Normalization

One of the most important procedures when forming a neural network is data normalization. This involves adjusting the data to a common scale so as to accurately compare predicted and actual values. Failure to normalize the data will typically result in the prediction value remaining the same across all observations, regardless of the input values.

We can do this in two ways in R:

- Scale the data frame automatically using the *scale* function in R
- Transform the data using a *max-min normalization* technique

We implement both techniques below but choose to use the max-min normalization technique. Please see this link for further details on how to use the normalization function.

```r
# Scaled Normalization
scaleddata <- scale(mydata)

# Max-Min Normalization
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
maxmindf <- as.data.frame(lapply(mydata, normalize))
```

We have now scaled our dataset and saved it into a data frame titled *maxmindf*.
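To see what the two approaches do, here is a small sketch on a made-up data frame. The `toy` values are hypothetical and are not taken from the tutorial's dataset. Max-min normalization maps every column onto the range 0 to 1, whereas `scale` centres each column at mean 0 with standard deviation 1, so its values can be negative.

```r
# Hypothetical toy data, used only to illustrate the two scaling methods
toy <- data.frame(fcfps = c(1.5, 2.0, 4.5, 3.0),
                  mcap  = c(100, 250, 900, 400))

# Max-min normalization: rescales each column to the range [0, 1]
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}
toy_maxmin <- as.data.frame(lapply(toy, normalize))

# scale(): subtracts the column mean and divides by the column standard deviation
toy_scaled <- as.data.frame(scale(toy))

# Minimum and maximum of each max-min normalized column
print(sapply(toy_maxmin, range))

# Column means after scale() (zero up to floating-point error)
print(round(colMeans(toy_scaled), 10))
```

Note that a column whose values are all identical would give a zero denominator in `normalize`, so such columns would need to be handled separately.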

We base our training data (trainset) on 80% of the observations. The test data (testset) is based on the remaining 20% of observations.

```r
# Training and Test Data
trainset <- maxmindf[1:160, ]
testset <- maxmindf[161:200, ]
```
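The split above takes the first 160 rows for training, which assumes the rows are not sorted in any meaningful way. If that is uncertain, one alternative is a random 80/20 split. The sketch below uses a hypothetical stand-in data frame so that it runs on its own; the seed value and the names `demo_df`, `train_idx`, `demo_train` and `demo_test` are illustrative and not part of the original code.

```r
# Hypothetical stand-in for maxmindf: 200 rows of already-normalized values
set.seed(123)  # arbitrary seed so the split is reproducible
demo_df <- data.frame(x = runif(200), dividend = rbinom(200, 1, 0.5))

# Draw 80% of the row indices at random for training
n <- nrow(demo_df)
train_idx <- sample(seq_len(n), size = floor(0.8 * n))

# Rows not drawn for training form the test set
demo_train <- demo_df[train_idx, ]
demo_test  <- demo_df[-train_idx, ]

dim(demo_train)
dim(demo_test)
```

With a random split, the reported accuracy will vary with the seed, so results may differ from the figures quoted in this tutorial.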

## Training a Neural Network Model using neuralnet

We now load the *neuralnet* library into R.

Observe that we are:

- Using neuralnet to "regress" the dependent *"dividend"* variable against the other independent variables
- Specifying two hidden layers, with 2 neurons in the first and 1 neuron in the second, through the hidden=c(2,1) argument
- Setting the linear.output argument to FALSE, given that the impact of the independent variables on the dependent variable (dividend) is assumed to be non-linear

Deciding on the number of hidden layers in a neural network is not an exact science. In fact, there are instances where accuracy will likely be higher without any hidden layers. Therefore, trial and error plays a significant role in this process.

One possibility is to compare how the accuracy of the predictions changes as we modify the number of hidden layers.

I found that using a (2,1) configuration ultimately yielded *92.5%* classification accuracy for this example.

```r
# Neural Network
library(neuralnet)
nn <- neuralnet(dividend ~ fcfps + earnings_growth + de + mcap + current_ratio,
                data = trainset, hidden = c(2, 1), linear.output = FALSE, threshold = 0.01)
nn$result.matrix
plot(nn)
```

Our neural network looks like this:

We now generate the error of the neural network model, along with the weights between the inputs, hidden layers, and outputs:

```
> nn$result.matrix
                                          1
error                        2.027188266758
reached.threshold            0.009190064608
steps                      750.000000000000
Intercept.to.1layhid1        3.287965374794
fcfps.to.1layhid1           -1.723307330428
earnings_growth.to.1layhid1 -0.076629853467
de.to.1layhid1               1.243670462201
mcap.to.1layhid1            -3.520369700429
current_ratio.to.1layhid1   -3.068677865885
Intercept.to.1layhid2        3.618803162161
fcfps.to.1layhid2            1.109150492946
earnings_growth.to.1layhid2 -11.588713924832
de.to.1layhid2              -1.526458929898
mcap.to.1layhid2            -3.769192938001
current_ratio.to.1layhid2   -4.547481937028
Intercept.to.2layhid1        2.991704593713
1layhid.1.to.2layhid1       -7.372717428050
1layhid.2.to.2layhid1      -22.367528820159
Intercept.to.dividend       -5.673537382132
2layhid.1.to.dividend       17.963989719804
```
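To make the weights more concrete, the sketch below pushes one observation through the network by hand. It assumes neuralnet's default logistic activation (the call above does not override it) and uses the weights printed above, rounded to six decimals. The helper names `sigmoid`, `W1`, `W2`, `Wout` and `forward`, and the two input vectors, are illustrative only.

```r
# Logistic activation (assumed: neuralnet's default act.fct)
sigmoid <- function(z) 1 / (1 + exp(-z))

# Weights copied from nn$result.matrix above (rounded to 6 decimals).
# Each row is one neuron: intercept first, then one weight per input,
# ordered as fcfps, earnings_growth, de, mcap, current_ratio.
W1 <- rbind(
  c(3.287965, -1.723307,  -0.076630,  1.243670, -3.520370, -3.068678),
  c(3.618803,  1.109150, -11.588714, -1.526459, -3.769193, -4.547482)
)
W2   <- c(2.991705, -7.372717, -22.367529)  # intercept, 1layhid1, 1layhid2
Wout <- c(-5.673537, 17.963990)             # intercept, 2layhid1

# Forward pass for one normalized observation x
forward <- function(x) {
  h1 <- sigmoid(W1 %*% c(1, x))       # first hidden layer (2 neurons)
  h2 <- sigmoid(sum(W2 * c(1, h1)))   # second hidden layer (1 neuron)
  sigmoid(sum(Wout * c(1, h2)))       # output layer, logistic because linear.output=FALSE
}

# Hypothetical inputs at the two extremes of the normalized range
forward(rep(0, 5))
forward(rep(1, 5))
```

Because the output neuron also applies the logistic function, every prediction falls strictly between 0 and 1 and can be read as a score for the "pays a dividend" class.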

## Testing The Accuracy Of The Model

As already mentioned, our neural network has been created using the training data. We then compare this to the test data to gauge the accuracy of the neural network forecast.

In the code below:

- The "subset" function is used to eliminate the dependent variable from the test data
- The "compute" function then creates the prediction variable
- A "results" variable then compares the predicted data with the actual data
- A confusion matrix is then created with the table function to compare the number of true/false positives and negatives

```r
# Test the resulting output
temp_test <- subset(testset, select = c("fcfps", "earnings_growth", "de", "mcap", "current_ratio"))
head(temp_test)
nn.results <- compute(nn, temp_test)

# Accuracy
results <- data.frame(actual = testset$dividend, prediction = nn.results$net.result)
results
roundedresults <- sapply(results, round, digits = 0)
roundedresultsdf = data.frame(roundedresults)
attach(roundedresultsdf)
table(actual, prediction)
```

The predicted results are compared to the actual results:

```
> results
    actual     prediction
161      0 0.003457573932
162      1 0.999946522139
163      0 0.006824520245
164      0 0.004010802517
165      0 0.003887702302
166      1 0.999874153644
167      1 0.999980366384
168      1 0.998146780599
169      0 0.003492978628
170      1 0.537350561975
171      0 0.004603236937
172      1 0.999939975488
173      1 0.999828510307
174      0 0.015485658976
175      0 0.003549498180
176      0 0.005980944621
177      1 0.030221943252
178      1 0.999984214195
179      1 0.999208836940
180      0 0.004112814595
181      1 0.999959074201
182      1 0.999968776118
183      1 0.999919457006
184      0 0.005201629773
185      1 0.999781888469
186      1 0.997984179158
187      1 0.954069175017
188      0 0.003532946404
189      1 0.226749902056
190      1 0.983333352988
191      0 0.003899637063
192      0 0.005530617432
193      1 0.926567574613
194      1 0.999921050854
195      0 0.007256878789
196      1 0.017199765149
197      1 0.999987719505
198      0 0.005474975456
199      0 0.003427332586
200      1 0.999985252611
```
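Rounding these predictions to zero digits amounts to applying a 0.5 cutoff. A minimal sketch that makes the cutoff explicit is shown here, using the actual and predicted values printed above (predictions rounded to four decimals for brevity). The names `cutoff`, `pred_class` and `cm`, and the choice to fix the factor levels, are assumptions made for this illustration.

```r
# Values copied from the results printout above (rows 161 to 200)
actual <- c(0, 1, 0, 0, 0, 1, 1, 1, 0, 1,
            0, 1, 1, 0, 0, 0, 1, 1, 1, 0,
            1, 1, 1, 0, 1, 1, 1, 0, 1, 1,
            0, 0, 1, 1, 0, 1, 1, 0, 0, 1)
prediction <- c(
  0.0035, 0.9999, 0.0068, 0.0040, 0.0039, 0.9999, 1.0000, 0.9981, 0.0035, 0.5374,
  0.0046, 0.9999, 0.9998, 0.0155, 0.0035, 0.0060, 0.0302, 1.0000, 0.9992, 0.0041,
  1.0000, 1.0000, 0.9999, 0.0052, 0.9998, 0.9980, 0.9541, 0.0035, 0.2267, 0.9833,
  0.0039, 0.0055, 0.9266, 0.9999, 0.0073, 0.0172, 1.0000, 0.0055, 0.0034, 1.0000)

# Classify as 1 (pays a dividend) when the score exceeds the cutoff
cutoff <- 0.5
pred_class <- ifelse(prediction > cutoff, 1, 0)

# Fixing the factor levels guarantees a 2x2 table even if one class is never predicted
cm <- table(actual = factor(actual, levels = c(0, 1)),
            prediction = factor(pred_class, levels = c(0, 1)))
cm
```

Making the cutoff a variable also allows it to be moved away from 0.5 if false negatives and false positives carry different costs.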

## Confusion Matrix

Then, we create a confusion matrix to compare the number of true/false positives and negatives:

```
> table(actual,prediction)
      prediction
actual  0  1
     0 17  0
     1  3 20
```

A confusion matrix is used to count the true and false positives and negatives generated by our predictions. The model generates 17 true negatives (0's) and 20 true positives (1's), while there are 3 false negatives and no false positives.

Ultimately, we yield a *92.5% (37/40)* accuracy rate in determining whether a stock pays a dividend or not.
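The accuracy figure can be checked directly from the four cells of the matrix. Reading rows as actual classes and columns as predicted classes, the 3 observations with actual = 1 and prediction = 0 are false negatives. The sketch below also computes sensitivity and specificity, which are not reported in the original text; the variable names are illustrative.

```r
# Counts taken from the confusion matrix above
tn <- 17  # actual 0, predicted 0
fp <- 0   # actual 0, predicted 1
fn <- 3   # actual 1, predicted 0
tp <- 20  # actual 1, predicted 1

accuracy    <- (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that are correct
sensitivity <- tp / (tp + fn)                   # share of dividend payers correctly identified
specificity <- tn / (tn + fp)                   # share of non-payers correctly identified

round(c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity), 3)
```

Accuracy comes out at 37/40, while sensitivity is 20/23, so all of the model's errors on this test set are dividend payers that it missed.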

## Conclusion

In this tutorial, you have learned how to use a neural network to solve classification problems.

Specifically, you saw how we can:

- Normalize data for meaningful analysis
- Classify data using a neural network
- Test accuracy using a confusion matrix

If you have any questions, please leave them in the comments below. Many thanks for your time!

It seems as though the data set you have provided is missing two variables. Is there any way to fix this on my end or is the complete data set posted elsewhere?

Hi Kristofer,

Thanks for your message. Apologies, that was a previous dataset.

I’ve uploaded the new one – please click the dividendinfo.csv link above and you will be able to access it.

Best,

Michael

Sir,

I am using a neural network for a binary classification problem with classes {0,1}.

I used the following command to train the neural network:

```r
nn <- neuralnet(f, data = trainDataset, hidden = c(node), stepmax = 1e+9, learningrate = 0.3, linear.output = F)
```

And I used the following command for prediction:

```r
pred = compute(nn, testvalue)$net.result
```

Then I created the confusion matrix using the following:

```r
dh = table(testClass, round(pred))
```

```
testClass  0  1  2
        0 42  3  0
        1  6 18  2
```

Q: Why is the neural network predicting an extra class 2, when the data set has only two classes, 0 and 1?

If I look at the output of round(pred), it contains 0, 1 and also 2.

How can I control this unknown class 2?

Thanks

Shuakat

Hi Shuakat,

I’m not sure I follow what you are trying to do here. Firstly, I can’t tell what hidden configuration you are using, and are you sure there are only two classes in your dataset?

I would recommend trying to run the problem above first. If that works, then it’s simply a matter of substituting the variables in your new dataset.

Hope this helps.

Best,

Michael

Hi Michael,

Excellent & clear article with example R code. Thank you!

Q: In the Confusion Matrix (at the end of the post), where it says “…while there are 3 false positives”, shouldn’t it say instead “…while there are 3 false negatives”?

After all, the confusion matrix is showing, under column “Predicted=0”, 3 items which are “Actual=1”. So they were predicted as negative, but that is false (i.e. a “false negative”), since they are actually positive (Actual 1).

Please, Michael, correct me if I’m wrong.

Thanks!

SFdude

Hi SFdude,

Sorry for the typo, that’s been amended. You’re very right – if it is marked as a negative when it is in fact a positive, then it is a “false negative”.

Thanks for bringing it to my attention!

Best,

Michael