Predicting Irish electricity consumption with neural networks

predicted vs actual consumption

In this example, neural networks are used to forecast the electricity consumption of the Dublin City Council Civic Offices, using data from March 2011 to February 2013.

Summary of Study

This analysis is divided into two parts:

  1. The neuralnet library in R is used to predict electricity consumption through the use of various explanatory variables
  2. An LSTM network is generated using Keras to predict electricity consumption using the time series exclusive of any explanatory variables

The relevant data was sourced from [...] and [...]. Electricity consumption data was provided on an hourly basis, but was converted to daily data for the purpose of this analysis.
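
As a rough illustration of the hourly-to-daily conversion, here is a minimal Python sketch. It assumes the hourly readings are summed per calendar day; the study does not state whether sums or averages were used. The helper name hourly_to_daily and the sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def hourly_to_daily(readings):
    """Sum (timestamp, kwh) hourly readings into one total per calendar day."""
    totals = defaultdict(float)
    for timestamp, kwh in readings:
        totals[timestamp.date()] += kwh
    # Return the days in chronological order
    return sorted(totals.items())

# Hypothetical hourly readings (not taken from the study's data)
hourly = [
    (datetime(2011, 3, 1, 0), 150.0),
    (datetime(2011, 3, 1, 1), 160.0),
    (datetime(2011, 3, 2, 0), 170.0),
    (datetime(2011, 3, 2, 1), 180.5),
]
daily = hourly_to_daily(hourly)
for day, total in daily:
    print(day, total)
```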

The variables are as follows:

  1. eurgbp: EUR/GBP currency rate
  2. rain: Rainfall
  3. maxt: Maximum temperature
  4. mint: Minimum temperature
  5. wdsp: Wind speed
  6. sun: Sunlight hours
  7. kwh: KWH (consumption)

Ireland obtains about 45% of its electricity from natural gas, 96% of which is imported from Scotland. EUR/GBP currency fluctuations can therefore be expected to affect the cost of electricity in Ireland, so the EUR/GBP rate was included as an explanatory variable.

Moreover, with weather conditions also significantly influencing electricity usage, weather data for the Dublin region was also included for the relevant dates in question.

Key Findings

It was found that of the two models, LSTM was able to predict electricity consumption more accurately, with the training and test predictions closely mirroring actual consumption:

predicted vs actual consumption

The model demonstrated a root mean squared error (RMSE) of 353.25 kWh on the training dataset and 255.13 kWh on the test dataset, against daily consumption measured in thousands of kWh.

Part 1: neuralnet

A neural network consists of:

  • Input layers: Layers that take in the explanatory variables from the existing data
  • Hidden layers: Layers whose weights are optimised during training (using backpropagation) in order to improve the predictive power of the model
  • Output layers: Layers that output predictions based on the data passed through the input and hidden layers
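
To make the three layer types concrete, here is a minimal Python sketch of a forward pass through a tiny network. It assumes a logistic activation in the hidden layer and a linear output (as with linear.output=TRUE in the neuralnet call further down). All weights are hypothetical, and the sketch covers only how a prediction is computed, not how backpropagation adjusts the weights.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def identity(z):
    return z

def layer(inputs, weights, biases, activation):
    """Compute one fully connected layer: activation(bias + sum(w * x))."""
    return [
        activation(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]

# Hypothetical weights for a 2-input, 2-hidden-neuron, 1-output network
hidden_w = [[0.5, -0.5], [1.0, 1.0]]
hidden_b = [0.0, -1.0]
output_w = [[2.0, -2.0]]
output_b = [0.5]

x = [1.0, 1.0]                               # input layer
h = layer(x, hidden_w, hidden_b, logistic)   # hidden layer
y = layer(h, output_w, output_b, identity)   # linear output layer
print(h, y)
```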


1.1. Data Normalization

The data is normalized and split into training and test data:

> normalize <- function(x) {
>  return ((x - min(x)) / (max(x) - min(x)))
> }
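> # 'mydata' is an assumed name for the source data frame; the original name is missing
> maxmindf <- as.data.frame(lapply(mydata, normalize))
>
> # The first 378 rows are used for training and the remaining 94 for testing
> trainset <- maxmindf[1:378, ]
> testset <- maxmindf[379:472, ]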
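
For readers following along in Python, the same two steps can be sketched as below. This mirrors the normalize function above and a chronological split of roughly 80/20 (378 of 472 rows); the sample values are hypothetical.

```python
def normalize(values):
    """Min-max scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical daily consumption values in kWh
kwh = [4200.0, 4600.0, 5000.0, 4400.0, 4800.0]
scaled = normalize(kwh)

# Chronological split: the first 80% of rows train, the rest test
split = int(len(scaled) * 0.8)
train, test = scaled[:split], scaled[split:]
print(scaled, train, test)
```

The split keeps the rows in time order rather than shuffling them, so the test set always lies after the training set.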

1.2. Neural Network Output

The neural network is then run and the parameters are generated:

> library(neuralnet)
> nn <- neuralnet(kwh ~ eurgbp + rain + maxt + mint + wdsp + sun,data=trainset, hidden=c(5,2), linear.output=TRUE, threshold=0.01)
> nn$result.matrix
error                   2.168927756297
reached.threshold       0.008657878909
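steps                 994.000000000000
Intercept.to.1layhid1  -0.943475389102
eurgbp.to.1layhid1      1.221792852624
rain.to.1layhid1        0.222508044224
maxt.to.1layhid1        1.356892947349
mint.to.1layhid1       -0.377284881968
wdsp.to.1layhid1        0.749993672528
sun.to.1layhid1        -0.250669884677
Intercept.to.1layhid2   3.424295572041
eurgbp.to.1layhid2     -4.921292790902
rain.to.1layhid2        3.380551856044
maxt.to.1layhid2       -2.353604121342
mint.to.1layhid2        0.877423599705
wdsp.to.1layhid2       -0.581900515451
sun.to.1layhid2        -7.083263552687
Intercept.to.1layhid3   0.352457802915
eurgbp.to.1layhid3      3.715376984054
rain.to.1layhid3       -1.030450129246
maxt.to.1layhid3       -0.672907974572
mint.to.1layhid3        0.898040603876
wdsp.to.1layhid3       -1.474470972212
sun.to.1layhid3        -1.793900522508
Intercept.to.1layhid4   0.819225033685
eurgbp.to.1layhid4    -16.770362105816
rain.to.1layhid4       -2.483557437596
maxt.to.1layhid4       -0.059472312293
mint.to.1layhid4        2.650852686615
wdsp.to.1layhid4        3.863732942893
sun.to.1layhid4         0.224801123127
Intercept.to.1layhid5 -13.987427433833
eurgbp.to.1layhid5     -1.661519269508
rain.to.1layhid5      -52.279711798215
maxt.to.1layhid5       22.717540151979
mint.to.1layhid5       11.670399514036
wdsp.to.1layhid5        9.713301368020
sun.to.1layhid5        10.804887927196
Intercept.to.2layhid1  -0.834412474581
1layhid.1.to.2layhid1   1.629948945316
1layhid.2.to.2layhid1  -3.064448233097
1layhid.3.to.2layhid1   0.197497636177
1layhid.4.to.2layhid1  -0.370098281335
1layhid.5.to.2layhid1  -0.402324278545
Intercept.to.2layhid2  -1.176093680811
1layhid.1.to.2layhid2   1.312897190062
1layhid.2.to.2layhid2   0.593640022150
1layhid.3.to.2layhid2   1.906008701982
1layhid.4.to.2layhid2   1.811035017074
1layhid.5.to.2layhid2  -0.725078284924
Intercept.to.kwh       -0.093973916107
2layhid.1.to.kwh        0.700847362516
2layhid.2.to.kwh        0.922218125575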

Here is what our neural network looks like in visual format:


1.3. Model Validation

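We then validate the model (that is, test its accuracy) by comparing the consumption in kWh estimated by the neural network with the actual consumption recorded in the test set:

> # Predict on the test set using only the explanatory variables
> temp_test <- subset(testset, select = c("eurgbp", "rain", "maxt", "mint", "wdsp", "sun"))
> nn.results <- compute(nn, temp_test)
> results <- data.frame(actual = testset$kwh, prediction = nn.results$net.result)
> results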
          actual     prediction
379 0.8394856269  0.72836479401
380 0.7976933676  0.72836479401
381 0.8125463657  0.72836479401
382 0.8377382154  0.72836479401
383 0.8394856269  0.72836479401
384 0.8415242737  0.72836479401
...
467 0.7464359625  0.80778769677
468 0.7018769682  0.82063018370
469 0.7004207919  0.78094824279
470 0.6726078249  0.77185373598
471 0.7176036721  0.91671846789
472 0.7199335541  0.80974222504

1.4. Accuracy

The code below converts the scaled values back to their original units (they were previously scaled using max-min normalization) and computes accuracy as one minus the absolute value of the mean relative deviation between actual and predicted consumption. This yields a figure of 98%, i.e. the mean deviation stands at roughly 2%. Note that because the signed deviations are averaged before the absolute value is taken, over- and under-predictions partly cancel out, so this figure is not the same as a mean absolute deviation:

> predicted=results$prediction * abs(diff(range(kwh))) + min(kwh)
> actual=results$actual * abs(diff(range(kwh))) + min(kwh)
> comparison=data.frame(predicted,actual)
> deviation=((actual-predicted)/actual)
> comparison=data.frame(predicted,actual,deviation)
> accuracy=1-abs(mean(deviation))
> accuracy
[1] 0.9828191884

A mean accuracy of 98% is obtained using a (5,2) hidden configuration. However, note that since this is a mean accuracy, it does not necessarily imply that all predictions generated by the model will have such high accuracy. Indeed, accuracy is lower in certain cases as can be observed from the histogram below.
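
The following Python sketch illustrates why a high mean accuracy can coexist with sizeable individual errors. It compares the formula used above, 1 - abs(mean(deviation)), with an alternative that averages the absolute deviations instead. The numbers are hypothetical and chosen so that the errors cancel exactly.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical actual and predicted consumption in kWh
actual = [4000.0, 5000.0, 4000.0, 5000.0]
predicted = [4400.0, 4500.0, 3600.0, 5500.0]

# Relative deviation of each prediction from the actual value
deviation = [(a - p) / a for a, p in zip(actual, predicted)]

# Absolute value of the mean: over- and under-predictions offset each other
accuracy_signed = 1 - abs(mean(deviation))

# Mean of the absolute values: every miss counts, whatever its direction
accuracy_abs = 1 - mean([abs(d) for d in deviation])

print(round(accuracy_signed, 4), round(accuracy_abs, 4))
```

Every prediction here is off by 10%, yet the signed version reports no error at all, which is why the distribution of deviations is worth inspecting alongside the mean.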

When we plot a histogram of the deviation (with 100 breaks), we see that the majority of forecasts fall within 10% of the actual consumption.


When plotting the predicted and actual consumption, it is observed that while the prediction series generated by the neural network follows the general range of the actual series (i.e. between 4,200 and 5,000 kWh), the model is not particularly adept at predicting the peaks and valleys in the series (periods of abnormally high or low usage).

predicted vs actual consumption

Part 2: LSTM (Long Short-Term Memory Network)

A shortcoming of traditional neural network models is that they do not account for dependencies across time series data.

When the neural network was generated using neuralnet, it was assumed that all observations are independent of each other. However, this is not necessarily the case for time series data, where consecutive observations are often correlated.

2.1. Issue of Stationarity

When observing line charts for both KWH (consumption) and the EUR/GBP rate, the KWH time series appears to follow a stationary pattern (stationary meaning that the mean, variance, and autocorrelation are constant over time):


However, when the EUR/GBP currency fluctuations are plotted over the same time period, the data is clearly non-stationary, i.e. the mean, variance, and autocorrelation differ over time:
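
As an informal complement to eyeballing the charts, one can compare the mean and variance of the first and second halves of a series. The Python sketch below does this for two hypothetical series; half_stats is an illustrative helper, not part of the original analysis, and a formal test such as the augmented Dickey-Fuller test would be more rigorous.

```python
from statistics import mean, pvariance

def half_stats(series):
    """Return (mean, variance) for the first and second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

# Hypothetical series: one oscillating around a fixed level, one drifting upward
level = [10.0, 12.0, 10.0, 8.0] * 6
drift = [0.5 * t + v for t, v in enumerate(level)]

for name, series in (("level", level), ("drift", drift)):
    (m1, v1), (m2, v2) = half_stats(series)
    print(name, m1, m2, v1, v2)
```

The oscillating series has the same mean in both halves, while the drifting series does not, which is the kind of shift visible in the EUR/GBP chart.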


Given that non-stationarity was present in certain explanatory variables, the LSTM model will now be used to predict future values of KWH against the test set - independent of any other explanatory variables.

In other words, only the values of KWH will be predicted using LSTM. The analysis is carried out using the Keras library in Python. The following guide also provides a detailed overview of predictions with LSTM using a separate example.

2.2. Data Processing

Firstly, the relevant libraries are imported and data processing is carried out:

# Import libraries
import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
import os

# Form dataset matrix
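def create_dataset(dataset, previous=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-previous-1):
        # Window of `previous` consecutive values used as the input
        a = dataset[i:(i+previous), 0]
        dataX.append(a)
        # The value immediately after the window is the target
        dataY.append(dataset[i + previous, 0])
    return np.array(dataX), np.array(dataY)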

# fix random seed for reproducibility
# (the seed value is missing from this copy; 7 is an arbitrary, assumed choice)
np.random.seed(7)

# load dataset
dataframe = read_csv('data.csv', usecols=[0], engine='python', skipfooter=3)
dataset = dataframe.values
dataset = dataset.astype('float32')

# normalize dataset with MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

# Training and Test data partition
train_size = int(len(dataset) * 0.8)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]

# reshape into X=t and Y=t+1
previous = 1
X_train, Y_train = create_dataset(train, previous)
X_test, Y_test = create_dataset(test, previous)

# reshape input to be [samples, time steps, features]
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
X_test = np.reshape(X_test, (X_test.shape[0], 1, X_test.shape[1]))
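
To see what the windowing step produces, here is a small pure-Python sketch on a toy series. make_windows is a hypothetical helper written for illustration; unlike create_dataset above, whose loop stops one step earlier, it keeps the final input/target pair.

```python
def make_windows(series, previous=1):
    """Pair each window of `previous` values with the value that follows it."""
    X, y = [], []
    for i in range(len(series) - previous):
        X.append(series[i:i + previous])
        y.append(series[i + previous])
    return X, y

series = [10, 20, 30, 40, 50]   # hypothetical values
X, y = make_windows(series, previous=2)
for window, target in zip(X, y):
    print(window, "->", target)
```

With previous = 1, as in the analysis, each input is simply the value at time t and the target is the value at time t+1.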

2.3. LSTM Generation and Predictions

Then, the LSTM model is generated and predictions are yielded:

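# Generate LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, previous)))
# Single output neuron producing the one-step-ahead prediction
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, Y_train, epochs=100, batch_size=1, verbose=2)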

# Generate predictions
trainpred = model.predict(X_train)
testpred = model.predict(X_test)

# Convert predictions back to normal values
trainpred = scaler.inverse_transform(trainpred)
Y_train = scaler.inverse_transform([Y_train])
testpred = scaler.inverse_transform(testpred)
Y_test = scaler.inverse_transform([Y_test])

# calculate RMSE
trainScore = math.sqrt(mean_squared_error(Y_train[0], trainpred[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(Y_test[0], testpred[:,0]))
print('Test Score: %.2f RMSE' % (testScore))

# Train predictions
trainpredPlot = np.empty_like(dataset)
trainpredPlot[:, :] = np.nan
trainpredPlot[previous:len(trainpred)+previous, :] = trainpred

# Test predictions
testpredPlot = np.empty_like(dataset)
testpredPlot[:, :] = np.nan
testpredPlot[len(trainpred)+(previous*2)+1:len(dataset)-1, :] = testpred

# Plot all predictions
# (line handles get their own names so the prediction arrays are not overwritten)
actual_line, = plt.plot(scaler.inverse_transform(dataset))
train_line, = plt.plot(trainpredPlot)
test_line, = plt.plot(testpredPlot)
plt.title("Predicted vs. Actual Consumption")
plt.show()

The model is trained over 100 epochs, and the predictions are generated.

2.4. Accuracy

When plotting the actual consumption (blue line) with the training and test predictions (orange and green lines), the two series follow each other quite closely, with the exception of certain spikes downward (or periods of abnormally low usage):

predicted vs actual consumption

The training output for the final epochs (where the loss is the mean squared error on the scaled data) and the RMSE calculation are shown below:

Epoch 94/100
 - 1s - loss: 0.0108
Epoch 95/100
 - 1s - loss: 0.0108
Epoch 96/100
 - 1s - loss: 0.0107
Epoch 97/100
 - 1s - loss: 0.0108
Epoch 98/100
 - 1s - loss: 0.0108
Epoch 99/100
 - 1s - loss: 0.0108
Epoch 100/100
 - 1s - loss: 0.0109

>>> # calculate RMSE
... trainScore = math.sqrt(mean_squared_error(Y_train[0], trainpred[:,0]))
>>> print('Train Score: %.2f RMSE' % (trainScore))
Train Score: 353.25 RMSE
>>> testScore = math.sqrt(mean_squared_error(Y_test[0], testpred[:,0]))
>>> print('Test Score: %.2f RMSE' % (testScore))
Test Score: 255.13 RMSE

The model has a root mean squared error (RMSE) of 353.25 kWh on the training dataset and 255.13 kWh on the test dataset, against daily consumption measured in thousands of kWh.
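
For reference, the RMSE calculation can be reproduced without any libraries beyond the standard math module. The sketch below uses hypothetical values, not the study's data, to show how the metric is built up.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical daily consumption in kWh
actual = [4500.0, 4700.0, 4300.0, 4900.0]
predicted = [4200.0, 4700.0, 4700.0, 4900.0]

print("RMSE: %.2f kWh" % rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large misses (such as the downward spikes in the plot) raise the RMSE more than many small ones would.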


Of the two neural networks, LSTM proved to be more accurate at predicting fluctuations in electricity consumption.

In the case of neuralnet, the model was not completely adept at handling non-stationary data present in various explanatory variables.

Moreover, factors such as temperature generally follow established seasonal patterns (with the exception of abnormal weather, which might have an effect on consumption), so they may add little information beyond what the consumption series itself already contains.

In this regard, a traditional neural network with explanatory variables proved less effective in this instance than LSTM, which was able to model fluctuations in consumption without the need for explanatory data.
