Regression-based neural networks with Keras

For this example, we use a linear activation function within the Keras library to create a regression-based neural network. We will use the car sales dataset again (as we did with neuralnet in R).

Firstly, we import our libraries. Note that you will need TensorFlow installed on your system to be able to execute the code below; if you are on Windows 10, you can find one of my YouTube tutorials covering the installation process.
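In most environments, TensorFlow can be installed with pip (this assumes a suitably recent 64-bit Python installation):

pip install tensorflow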

Libraries

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

Set Directory

import os
path="C:/Users/michaeljgrogan/Documents/a_documents/computing/data science/datasets"
os.chdir(path)
os.getcwd()

Since we are implementing a neural network, the input and output variables need to be scaled to a common range so that no single feature dominates training. Therefore, our variables are transformed using MinMaxScaler():

#Variables
dataset = np.loadtxt("cars.csv", delimiter=",")
x = dataset[:, 0:5]
y = dataset[:, 5]
y = np.reshape(y, (-1, 1))

# Use a separate scaler for x and y; calling fit() twice on one scaler
# would overwrite the parameters learned from x with those learned from y.
scaler_x = MinMaxScaler()
scaler_y = MinMaxScaler()
xscale = scaler_x.fit_transform(x)
yscale = scaler_y.fit_transform(y)
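Note that keeping a separate scaler fitted on y means we can later call scaler_y.inverse_transform() to convert the model's scaled predictions back into actual sales figures.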

The data is then split into training and test data:

X_train, X_test, y_train, y_test = train_test_split(xscale, yscale)
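By default, train_test_split holds out 25% of the observations as test data; a different proportion can be specified with the test_size argument, and a random_state can be set if a reproducible split is needed.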

Keras Model Configuration: Neural Network API

Now, we define and train the neural network. We are using the five input variables (age, gender, miles, debt, and income), along with two hidden layers of 12 and 8 neurons respectively, and finally the linear activation function to process the output.

model = Sequential()
model.add(Dense(12, input_dim=5, kernel_initializer='normal', activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='linear'))
model.summary()
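The summary shows the number of trainable parameters in each layer: a Dense layer has (inputs + 1) × neurons parameters (weights plus biases), so the three layers contribute (5 + 1) × 12 = 72, (12 + 1) × 8 = 104, and (8 + 1) × 1 = 9 parameters respectively, for a total of 185.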

The mean squared error (mse) is our loss function, and we also track the mean absolute error (mae) as a metric – both are estimates of how accurately the neural network predicts unseen data. With validation_split set to 0.2, 80% of the training data is used to fit the model, while the remaining 20% is held out for validation purposes.

model.compile(loss='mse', optimizer='adam', metrics=['mse','mae'])
model.fit(X_train, y_train, epochs=150, batch_size=50,  verbose=1, validation_split=0.2)

From the output, we can see that as more epochs are run, our MSE and MAE fall on both the training and validation data, indicating that the model's predictions improve with each iteration.

Neural Network Output

>>> model.fit(X_train, y_train, epochs=150, batch_size=50,  verbose=1, validation_split=0.2)
Train on 577 samples, validate on 145 samples
Epoch 1/150
577/577 [==============================] - 1s 1ms/step - loss: 0.1522 - mean_squared_error: 0.1522 - mean_absolute_error: 0.3003 - val_loss: 0.1368 - val_mean_squared_error: 0.1368 - val_mean_absolute_error: 0.2714
Epoch 2/150
577/577 [==============================] - 0s 56us/step - loss: 0.1153 - mean_squared_error: 0.1153 - mean_absolute_error: 0.2524 - val_loss: 0.1027 - val_mean_squared_error: 0.1027 - val_mean_absolute_error: 0.2341
Epoch 3/150
577/577 [==============================] - 0s 56us/step - loss: 0.0843 - mean_squared_error: 0.0843 - mean_absolute_error: 0.2183 - val_loss: 0.0770 - val_mean_squared_error: 0.0770 - val_mean_absolute_error: 0.2095
Epoch 4/150
577/577 [==============================] - 0s 57us/step - loss: 0.0626 - mean_squared_error: 0.0626 - mean_absolute_error: 0.1952 - val_loss: 0.0583 - val_mean_squared_error: 0.0583 - val_mean_absolute_error: 0.1935
Epoch 5/150
577/577 [==============================] - 0s 58us/step - loss: 0.0475 - mean_squared_error: 0.0475 - mean_absolute_error: 0.1774 - val_loss: 0.0454 - val_mean_squared_error: 0.0454 - val_mean_absolute_error: 0.1798
...
Epoch 145/150
577/577 [==============================] - 0s 64us/step - loss: 0.0154 - mean_squared_error: 0.0154 - mean_absolute_error: 0.0902 - val_loss: 0.0161 - val_mean_squared_error: 0.0161 - val_mean_absolute_error: 0.0932
Epoch 146/150
577/577 [==============================] - 0s 61us/step - loss: 0.0154 - mean_squared_error: 0.0154 - mean_absolute_error: 0.0903 - val_loss: 0.0162 - val_mean_squared_error: 0.0162 - val_mean_absolute_error: 0.0936
Epoch 147/150
577/577 [==============================] - 0s 61us/step - loss: 0.0155 - mean_squared_error: 0.0155 - mean_absolute_error: 0.0913 - val_loss: 0.0161 - val_mean_squared_error: 0.0161 - val_mean_absolute_error: 0.0939
Epoch 148/150
577/577 [==============================] - 0s 63us/step - loss: 0.0153 - mean_squared_error: 0.0153 - mean_absolute_error: 0.0900 - val_loss: 0.0164 - val_mean_squared_error: 0.0164 - val_mean_absolute_error: 0.0934
Epoch 149/150
577/577 [==============================] - 0s 64us/step - loss: 0.0153 - mean_squared_error: 0.0153 - mean_absolute_error: 0.0897 - val_loss: 0.0161 - val_mean_squared_error: 0.0161 - val_mean_absolute_error: 0.0935
Epoch 150/150
577/577 [==============================] - 0s 59us/step - loss: 0.0153 - mean_squared_error: 0.0153 - mean_absolute_error: 0.0901 - val_loss: 0.0162 - val_mean_squared_error: 0.0162 - val_mean_absolute_error: 0.0934
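As a minimal sketch of the next step (assuming the model, X_test, y_test, and scaler_y objects defined above), we can check how the trained network fares on the held-out test set and convert a few predictions back to the original scale:

# Evaluate on the held-out test data; returns the loss plus the compiled metrics
test_results = model.evaluate(X_test, y_test, verbose=0)
print(test_results)

# Predict on the scaled test inputs and invert the scaling on the outputs
ypred_scaled = model.predict(X_test)
ypred = scaler_y.inverse_transform(ypred_scaled)
print(ypred[0:5])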

A Note on Validation

The data is partitioned into training and test sets to see how well the model could potentially perform on unseen data. That said, if we are also looking to guard against overfitting, then k-fold cross-validation could be a solution here.
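As a rough sketch of that idea (assuming the same five-column cars.csv layout and the scaled arrays from above, and using the KerasRegressor wrapper available in TensorFlow versions contemporary with this post), the Keras model can be wrapped for use with scikit-learn's cross-validation utilities:

from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import KFold, cross_val_score

# Build function returning a compiled model, as required by KerasRegressor
def build_model():
    m = Sequential()
    m.add(Dense(12, input_dim=5, kernel_initializer='normal', activation='relu'))
    m.add(Dense(8, activation='relu'))
    m.add(Dense(1, activation='linear'))
    m.compile(loss='mse', optimizer='adam')
    return m

# 5-fold cross-validation over the full scaled dataset (a sketch; fold count
# and scoring are illustrative choices, not the original post's settings)
estimator = KerasRegressor(build_fn=build_model, epochs=150, batch_size=50, verbose=0)
kfold = KFold(n_splits=5, shuffle=True)
scores = cross_val_score(estimator, xscale, yscale.ravel(), cv=kfold, scoring='neg_mean_squared_error')
print(scores.mean())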
