Image Recognition with Keras: Convolutional Neural Networks

Image recognition and classification is a rapidly growing field within machine learning.

In particular, object recognition is a key application of image classification, and its commercial implications are vast.

For instance, image classifiers will be used in the future to:

  • Replace passwords with facial recognition
  • Allow autonomous vehicles to detect obstructions
  • Identify geographical features from satellite imagery

These are just a few examples of how image classification is likely to shape the world we live in.

So, let’s take a look at an example of how we can build our own image classifier.

Image Classification: Cars vs. Planes

In this instance, we carry out image classification with the Keras library. Specifically, we train a Keras model to distinguish between an image of a car and an image of a plane.

Here are the two images we wish to identify:

Car (7813125.jpg)

Plane (56315795.jpg)

This is a relatively simple example, so I used 100 images per class to train the model (80 as training data, 20 as test data). If we were trying to build an application that could identify faces with a high degree of accuracy, we would need many more images (potentially hundreds of thousands), because identifying a person from their face is a much harder task than telling apart two visually distinct objects such as a car and a plane.
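
The flow_from_directory method used later in this post infers the class labels from subfolder names, so the images are laid out roughly as follows (the folder names match the class_indices shown later, and the counts come from the 80/20 split described above):

data/
    training_set/
        cars/      (80 images)
        planes/    (80 images)
    test_set/
        cars/      (20 images)
        planes/    (20 images)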

Firstly, let’s import our libraries.

import numpy as np

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.preprocessing import image

Now, we set up what is called a CNN (convolutional neural network).

What is a CNN (Convolutional Neural Network)?

A CNN is a special type of neural network that is suited to analyzing visual imagery. This is essentially how the CNN works:

  1. Convolution extracts relevant features from the input images.
  2. Pooling reduces the dimensionality of the resulting feature maps while keeping the most important information.
  3. Flattening rearranges the 3D feature volumes into a 1D vector.
  4. Fully connected layers, in which every neuron is connected to all activations in the previous layer, combine these features to produce the final classification.

Let’s carry out this process.

classifier = Sequential()

# First convolution + pooling block: extract features and downsample
classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))

# Second convolution + pooling block
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))

# Flatten the 3D feature maps into a 1D vector
classifier.add(Flatten())

# Fully connected layers; a single sigmoid unit gives the binary output
classifier.add(Dense(units=128, activation='relu'))
classifier.add(Dense(units=1, activation='sigmoid'))
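
If you want to sanity-check the architecture at this point, Keras can print a layer-by-layer summary:

classifier.summary()  # prints each layer's output shape and parameter count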

Next, we will compile the CNN and train the classifier. We use binary_crossentropy as our loss function because we are classifying each image into one of two classes, e.g. 0 = car, 1 = plane. The loss measures the degree of error in the model’s predictions: a higher loss means the model is more likely to misclassify an image.
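
As a quick illustration of what this loss measures, here is a minimal NumPy sketch (not part of the training pipeline) of binary cross-entropy for a single prediction:

import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # -[y*log(p) + (1-y)*log(1-p)], with clipping to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(binary_crossentropy(1, 0.9))  # confident and correct: small loss (~0.105)
print(binary_crossentropy(1, 0.1))  # confident but wrong: large loss (~2.303)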

Classifier Training

We train for 20 epochs (an epoch being one full forward-and-backward pass through the training data, during which the network’s weights are updated), with 80 steps per epoch.

classifier.compile(optimizer='adam', loss='binary_crossentropy',
                   metrics=['accuracy'])

from keras.preprocessing.image import ImageDataGenerator

# Augment the training images; only rescale the test images
train_imagedata = ImageDataGenerator(rescale=1. / 255, shear_range=0.2,
                                     zoom_range=0.2, horizontal_flip=True)
test_imagedata = ImageDataGenerator(rescale=1. / 255)

training_set = train_imagedata.flow_from_directory(
    '/home/directory/image classification/data/training_set',
    target_size=(64, 64), batch_size=32, class_mode='binary')
test_set = test_imagedata.flow_from_directory(
    '/home/directory/image classification/data/test_set',
    target_size=(64, 64), batch_size=32, class_mode='binary')

history = classifier.fit_generator(training_set, steps_per_epoch=80, epochs=20,
                                   validation_data=test_set,
                                   validation_steps=80)
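
As a side note, rather than hard-coding steps_per_epoch and validation_steps, you can derive them from the generators themselves; a small sketch (samples and batch_size are attributes exposed by Keras directory iterators):

steps_per_epoch = training_set.samples // training_set.batch_size
validation_steps = test_set.samples // test_set.batch_size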

Now that we have trained the model, we can plot our model loss and accuracy across the 20 epochs:

import matplotlib.pyplot as plt
# Check which metric names were recorded during training
print(history.history.keys())
# Loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['loss', 'val_loss'], loc='upper left')
plt.show()
# Accuracy (in newer Keras versions these keys are 'accuracy' and 'val_accuracy')
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['acc', 'val_acc'], loc='upper left')
plt.show()

Let’s take a look at our plots:

Model Accuracy

Model Loss

In this instance, we can see that the validation loss is minimized and the validation accuracy is maximized after roughly 2 to 3 epochs. Therefore, we retrain the model using 2 epochs instead of 20. Given that we are dealing with fairly simple data, two epochs are enough to train the model successfully; training for more epochs results in overfitting. Note that we re-create the model below so that retraining starts from freshly initialized weights (compiling again on its own would not reset them).

# Re-create the model so that retraining starts from fresh weights
classifier = Sequential()
classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Flatten())
classifier.add(Dense(units=128, activation='relu'))
classifier.add(Dense(units=1, activation='sigmoid'))

classifier.compile(optimizer='adam', loss='binary_crossentropy',
                   metrics=['accuracy'])

# The data generators (training_set, test_set) defined earlier are reused as-is
history = classifier.fit_generator(training_set, steps_per_epoch=80, epochs=2,
                                   validation_data=test_set,
                                   validation_steps=80)
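
As an alternative to reading the epoch count off the plots by hand, you could let Keras stop training automatically once the validation loss stops improving. A minimal sketch using the EarlyStopping callback (not used in the run above):

from keras.callbacks import EarlyStopping

# Stop once val_loss has failed to improve for 2 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=2)
history = classifier.fit_generator(training_set, steps_per_epoch=80, epochs=20,
                                   validation_data=test_set,
                                   validation_steps=80,
                                   callbacks=[early_stop])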

Image Classification

Having retrained the model, we can now feed it an unseen image of a car and an unseen image of a plane to see whether each is identified correctly.

In this instance, when we feed an image of a car to the model, it is correctly identified as a 0:

>>> test_image = \
...     image.load_img('/home/directory/image classification/data/7813125.jpg'
...                    , target_size=(64, 64))
>>> test_image = image.img_to_array(test_image)
>>> test_image = np.expand_dims(test_image, axis=0)
>>> result = classifier.predict(test_image)
>>> training_set.class_indices
{'cars': 0, 'planes': 1}
>>> result
array([[0.]], dtype=float32)

Now, let’s try the same for the image of a plane:

>>> test_image = \
...     image.load_img('/home/directory/image classification/data/56315795.jpg'
...                    , target_size=(64, 64))
>>> test_image = image.img_to_array(test_image)
>>> test_image = np.expand_dims(test_image, axis=0)
>>> result = classifier.predict(test_image)
>>> training_set.class_indices
{'cars': 0, 'planes': 1}
>>> result
array([[1.]], dtype=float32)

This time, the classifier correctly identifies the image as a 1 (1 = plane).
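
Because the final layer is a sigmoid, predict returns a value between 0 and 1, and class_indices tells us which label each end of that range corresponds to. Here is a small helper along those lines (the predict_label name is just for illustration; it also applies the same 1/255 rescaling that was used on the training images):

def predict_label(img_path, model, class_indices, target_size=(64, 64)):
    # Preprocess the image the same way the training data was preprocessed
    img = image.load_img(img_path, target_size=target_size)
    x = image.img_to_array(img) / 255.0
    x = np.expand_dims(x, axis=0)
    # Sigmoid output: probability of the class mapped to 1 ('planes' here)
    prob = model.predict(x)[0][0]
    labels = {v: k for k, v in class_indices.items()}
    return labels[int(round(prob))], prob

label, prob = predict_label('/home/directory/image classification/data/56315795.jpg',
                            classifier, training_set.class_indices)
print(label, prob)  # for the plane image above, this should come out close to 1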

Conclusion

In this tutorial, you have seen:

  • How to construct a CNN
  • How to train a CNN to classify images
  • How to test the accuracy of image classification

Many thanks for your time, and please feel free to leave any questions or comments below. You can also find the full code in my GitHub repository.
