Neural Networks

Using neural networks to identify handwritten digits.

Our dataset contains 1593 flattened 16x16 black-and-white images (each pixel is represented by a 1 or 0) of handwritten digits (0-9). Approximately half of the digits were written neatly and the other half were written as quickly as possible, so some images are very difficult for even the human eye to decipher correctly. Each point in our dataset contains 266 features: the first 256 are the pixel values of the flattened image, and the last 10 identify the known label for the image using one-hot encoding. For example, [0,0,0,1,0,0,0,0,0,0] is a 3 and [0,0,0,0,0,0,0,0,0,1] is a 9.
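
As a quick illustration of the encoding, the digit a label represents is just the index of its single 1; a toy sketch in plain Python with a hypothetical label:

label = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]  # one-hot label for the digit 3
digit = label.index(1)                  # index of the 1 recovers the digit: 3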

In this example we will learn about:

- Training and applying neural networks through Nimble's interface with Keras
- Instantiating objects from an interfaced package with nimble.Init
- Working with multi-dimensional data objects

Getting Started

[2]:
import nimble

# fetchFiles downloads the dataset (or locates a cached copy) and returns
# a list of file paths; load the single data file as a Matrix object
paths = nimble.fetchFiles('uci::Semeion Handwritten Digit')
images = nimble.data(paths[0], returnType="Matrix")
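
For a quick look at what was loaded, show can print a preview of the object (an optional check, using the same show pattern that appears later in this example):

images.show('raw images', points=3)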

Preparing the data

We need to separate the features identifying the labels (the last 10 features) from the features containing our image data. The features.extract method performs this separation: the extracted labels are placed in a new labels object, and our images object then contains only the image data.

[3]:
labels = images.features.extract(range(256, len(images.features)))
labels.show('one-hot encoded labels', points=7)
one-hot encoded labels
1593pt x 10ft
         0      1      2      3      4      5      6      7      8      9
     ┌─────────────────────────────────────────────────────────────────────
   0 │ 1.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
   1 │ 1.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
   2 │ 1.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
   │ │   │      │      │      │      │      │      │      │      │      │
1590 │ 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.000
1591 │ 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.000
1592 │ 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.000

Rather than 10 one-hot encoded features, our neural network needs the labels as a single feature with the values 0-9. We can perform this conversion by matrix multiplying (with Python’s @ matrix multiplication operator) our labels object (1593 x 10) by a feature vector containing the sequential values 0-9 (10 x 1). Since each label contains nine 0 values and a single 1, the only non-zero product is between the label’s 1 and the feature-vector value at the matching index. For example, multiplying the label [0,0,0,1,0,0,0,0,0,0] by the column vector of 0-9 yields 3. So, the multiplication quickly creates a 1593 x 1 object with our labels as the integers 0 through 9.

[4]:
intLabels = labels @ nimble.data(range(10)).T
intLabels.show('integer labels', points=7)
integer labels
1593pt x 1ft
         0
     ┌──────
   0 │ 0.000
   1 │ 0.000
   2 │ 0.000
   │ │   │
1590 │ 9.000
1591 │ 9.000
1592 │ 9.000

Now that we have a single feature of labels, we can randomly partition our data into training and testing sets.

[5]:
trainX, trainY, testX, testY = images.trainAndTestSets(testFraction=0.25,
                                                       labels=intLabels)
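
As an optional sanity check (splitting 1593 points 75/25 should yield 1195 and 398), we can confirm the set sizes:

print(len(trainX.points), len(testX.points))  # expect 1195 and 398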

Simple neural network

For this example, we will use the Keras neural network package, so it must be installed in the current environment. Our first task is to build a simple Sequential model using Nimble’s interface with Keras. The layers argument for a Sequential object requires a list of Keras Layer objects. However, there is no need to import these objects directly from Keras. As long as Keras is installed, nimble.Init can search the interfaced package for the desired objects and instantiate them with any keyword arguments. So we can avoid extra imports (e.g., from keras.layers import Dense, Dropout), and there is no need to recall which of the package’s modules contain the objects we want to use.

[6]:
layer0 = nimble.Init('Dense', units=64, activation='relu')
layer1 = nimble.Init('Dropout', rate=0.5)
layer2 = nimble.Init('Dense', units=10, activation='softmax')
layers = [layer0, layer1, layer2]
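
For comparison, building the same layer list directly in Keras would need the imports that nimble.Init lets us skip; a hypothetical equivalent:

from keras.layers import Dense, Dropout

kerasLayers = [Dense(units=64, activation='relu'),
               Dropout(rate=0.5),
               Dense(units=10, activation='softmax')]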

Now that our layers are defined, we can use nimble.trainAndApply to train the neural network with our trainX and trainY data and make predictions on our testX data. Similar to above, the string ‘keras.Sequential’ informs Nimble to use the Sequential object from Keras, so importing the object manually is not necessary.

[7]:
predictions = nimble.trainAndApply(
    'keras.Sequential', trainX=trainX, trainY=trainY, testX=testX,
    layers=layers, optimizer='adam', loss='sparse_categorical_crossentropy',
    metrics=['accuracy'], epochs=10)

accuracy = nimble.calculate.fractionCorrect(testY, predictions)
print('Accuracy of simple neural network:', accuracy)
Epoch 1/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 1s 839us/step - accuracy: 0.1437 - loss: 2.4851
Epoch 2/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 777us/step - accuracy: 0.4535 - loss: 1.6186
Epoch 3/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 770us/step - accuracy: 0.6143 - loss: 1.2124
Epoch 4/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 739us/step - accuracy: 0.6832 - loss: 0.9522
Epoch 5/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 702us/step - accuracy: 0.7726 - loss: 0.7906
Epoch 6/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 726us/step - accuracy: 0.7689 - loss: 0.6958
Epoch 7/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 728us/step - accuracy: 0.8262 - loss: 0.6057
Epoch 8/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 761us/step - accuracy: 0.8232 - loss: 0.5558
Epoch 9/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 753us/step - accuracy: 0.8419 - loss: 0.5226
Epoch 10/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 770us/step - accuracy: 0.8590 - loss: 0.4621
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Accuracy of simple neural network: 0.9020100502512562

Some older Keras versions rely on sources of randomness outside of Nimble’s control, so your exact result could vary from ours, but it should be around 90% accuracy. This is pretty good for a simple model trained for only 10 epochs.
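
If you would rather keep the trained model around than train and predict in a single call, nimble.train returns a trained learner whose apply method can be reused; a minimal sketch with the same arguments as above:

tl = nimble.train(
    'keras.Sequential', trainX=trainX, trainY=trainY, layers=layers,
    optimizer='adam', loss='sparse_categorical_crossentropy',
    metrics=['accuracy'], epochs=10)
predictions = tl.apply(testX)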

Convolutional neural network

Let’s try to do better by increasing the complexity and creating a 2D convolutional neural network. This model requires the data to be formatted so that each image is recognized as a 16 x 16 single-channel (i.e., grayscale) image, so our flattened image data will not work; each point must instead be a 3D (16 x 16 x 1) object. Fortunately, Nimble supports multi-dimensional data, so we can reshape each point in our trainX and testX data using unflatten. Ultimately, this allows Nimble to identify each object as a four-dimensional object (a container of 3D objects representing 2D grayscale images). It is worth noting that while Nimble treats the object as if it has more than two dimensions, the underlying data object is always two-dimensional. For this reason, the shape attribute always provides the two-dimensional shape, while the dimensions attribute provides the dimensions that Nimble considers the object to have (shape and dimensions are the same for 2D data).

[8]:
def reshapePoint(pt):
    # Copy the flattened 256-value point, then reshape it in place to 16x16x1
    ret = pt.copy()
    ret.unflatten((16, 16, 1))
    return ret

trainX = trainX.points.calculate(reshapePoint)
testX = testX.points.calculate(reshapePoint)
print('trainX.shape', trainX.shape, 'trainX.dimensions', trainX.dimensions)
print('testX.shape', testX.shape, 'testX.dimensions', testX.dimensions)
trainX.shape (1195, 256) trainX.dimensions (1195, 16, 16, 1)
testX.shape (398, 256) testX.dimensions (398, 16, 16, 1)
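
To see unflatten in isolation, here is a toy sketch (hypothetical values) that reshapes a 1 x 6 object into 2 x 3; since the result is only two-dimensional, shape and dimensions agree:

toy = nimble.data([[1, 2, 3, 4, 5, 6]])
toy.unflatten((2, 3))
print(toy.shape, toy.dimensions)  # (2, 3) (2, 3)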

For our 2D convolutional neural network, we will need six different types of Keras Layer objects. Just as we did with our simple neural network above, we can use nimble.Init to instantiate these objects without importing them directly from Keras.

[9]:
layersCNN = []
layersCNN.append(nimble.Init("Input", shape=(16, 16, 1)))
layersCNN.append(nimble.Init('Conv2D', filters=64, kernel_size=3,
                             activation='relu', ))
layersCNN.append(nimble.Init('Conv2D', filters=32, kernel_size=3,
                             activation='relu'))
layersCNN.append(nimble.Init('Dropout', rate=0.2))
layersCNN.append(nimble.Init('MaxPooling2D', pool_size=2))
layersCNN.append(nimble.Init('Flatten'))
layersCNN.append(nimble.Init('Dense', units=128, activation='relu'))
layersCNN.append(nimble.Init('Dense', units=10, activation='softmax'))
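
# Shape arithmetic for this stack (assuming Keras's default 'valid' padding,
# where each 3x3 convolution trims one pixel from every border):
#   Input             16 x 16 x 1
#   Conv2D (3x3)      14 x 14 x 64
#   Conv2D (3x3)      12 x 12 x 32  (Dropout leaves the shape unchanged)
#   MaxPooling2D (2)   6 x  6 x 32
#   Flatten           1152 values feeding the final Dense layers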

predictionsCNN = nimble.trainAndApply(
    'keras.Sequential', trainX=trainX, trainY=trainY, testX=testX,
    layers=layersCNN, optimizer='adam', loss='sparse_categorical_crossentropy',
    metrics=['accuracy'], epochs=10)
Epoch 1/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - accuracy: 0.4267 - loss: 1.7646
Epoch 2/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.8351 - loss: 0.5234
Epoch 3/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.9171 - loss: 0.2316
Epoch 4/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.9604 - loss: 0.1268
Epoch 5/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.9743 - loss: 0.0839
Epoch 6/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.9821 - loss: 0.0631
Epoch 7/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - accuracy: 0.9910 - loss: 0.0312
Epoch 8/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.9957 - loss: 0.0231
Epoch 9/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.9859 - loss: 0.0425
Epoch 10/10
38/38 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - accuracy: 0.9852 - loss: 0.0392
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step

We see that the loss and accuracy of this model improved much faster than our previous model. Let’s check how it performed on our test set.

[10]:
accuracyCNN = nimble.calculate.fractionCorrect(testY, predictionsCNN)
print('Accuracy of 2D convolutional neural network:', accuracyCNN)
Accuracy of 2D convolutional neural network: 0.9321608040201005

With the same amount of training, our convolutional neural network is about 3% more accurate than our simple neural network. Considering some images are very difficult to correctly identify because they were drawn as quickly as possible, over 93% accuracy is a significant improvement and a very good result.

References:

Semeion Research Center of Sciences of Communication, Via Sersale 117, 00128 Rome, Italy; Tattile, Via Gaetano Donizetti 1-3-5, 25030 Mairano (Brescia), Italy.

Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

Link to original dataset: https://archive.ics.uci.edu/ml/datasets/Semeion+Handwritten+Digit