In this lab, we apply neural networks to handwritten digit recognition.
We use the MNIST database (Modified National Institute of Standards and Technology database), a large collection of \(28 \times 28\) handwritten digit images that is commonly used for training and testing machine learning algorithms.
You can prepare the data with the following code:
library(keras)
mnist <- dataset_mnist()
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y
Training set:
dim(x_train)
## [1] 60000 28 28
dim(y_train)
## [1] 60000
Let’s take a look at the first 10 images in the training set.
for (i in 1:10) {
  # reverse the rows and transpose so that image() draws the digit upright
  image(t(x_train[i, 28:1, ]), useRaster = TRUE, axes = FALSE,
        col = grey(seq(0, 1, length = 256)), main = y_train[i])
}
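Each pass through the loop draws a separate plot; if you would rather see all ten digits in one figure, base graphics’ par(mfrow = ...) can lay them out in a grid (a minimal sketch using the same plotting call):
op <- par(mfrow = c(2, 5), mar = c(0.5, 0.5, 1.5, 0.5))  # 2 x 5 grid, tight margins
for (i in 1:10) {
  image(t(x_train[i, 28:1, ]), useRaster = TRUE, axes = FALSE,
        col = grey(seq(0, 1, length = 256)), main = y_train[i])
}
par(op)  # restore the previous plot layout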
Vectorize each \(28 \times 28\) image into a \(784\)-vector and scale the entries to \([0, 1]\):
# reshape
x_train <- array_reshape(x_train, c(nrow(x_train), 784))
x_test <- array_reshape(x_test, c(nrow(x_test), 784))
# rescale pixel intensities from 0-255 to [0, 1]
x_train <- x_train / 255
x_test <- x_test / 255
dim(x_train)
## [1] 60000 784
dim(x_test)
## [1] 10000 784
Encode \(y\) as a binary class (one-hot) matrix, where column \(j\) indicates digit \(j - 1\):
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)
dim(y_train)
## [1] 60000 10
dim(y_test)
## [1] 10000 10
head(y_train)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 0 0 0 0 0 1 0 0 0 0
## [2,] 1 0 0 0 0 0 0 0 0 0
## [3,] 0 0 0 0 1 0 0 0 0 0
## [4,] 0 1 0 0 0 0 0 0 0 0
## [5,] 0 0 0 0 0 0 0 0 0 1
## [6,] 0 0 1 0 0 0 0 0 0 0
Fit a multinomial logit regression model to the training set and evaluate its accuracy on the test set. Plot the first 10 digits in the test set and compare them with their predicted values.
# a single dense softmax layer on the 784 inputs is exactly multinomial logit
mlogit <- keras_model_sequential()
mlogit %>%
  layer_dense(units = 10, activation = 'softmax', input_shape = c(784))
summary(mlogit)
## Model: "sequential"
## ________________________________________________________________________________
## Layer (type) Output Shape Param #
## ================================================================================
## dense (Dense) (None, 10) 7850
## ================================================================================
## Total params: 7,850
## Trainable params: 7,850
## Non-trainable params: 0
## ________________________________________________________________________________
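The parameter count checks out: \(7850 = 784 \times 10 + 10\), one weight per input pixel for each of the 10 classes plus 10 bias terms.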
# compile model
mlogit %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_rmsprop(),
  metrics = c('accuracy')
)
# fit model
mlogit_history <- mlogit %>% fit(
  x_train, y_train,
  epochs = 20, batch_size = 128,
  validation_split = 0.2
)
# Evaluate model performance on the test data:
mlogit %>% evaluate(x_test, y_test)
## loss accuracy
## 0.2713428 0.9258000
Generate predictions on new data:
# predict_classes() was removed in recent versions of the keras package;
# equivalently, take the class with the largest predicted probability
y_predict <- max.col(predict(mlogit, x_test)) - 1  # columns 1..10 map to digits 0..9
for (i in 1:10) {
  image(t(mnist$test$x[i, 28:1, ]), useRaster = TRUE, axes = FALSE,
        col = grey(seq(0, 1, length = 256)), main = y_predict[i])
}
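To make the comparison asked for above explicit, you can show the predicted and true labels together in each title; a small variation on the loop:
for (i in 1:10) {
  image(t(mnist$test$x[i, 28:1, ]), useRaster = TRUE, axes = FALSE,
        col = grey(seq(0, 1, length = 256)),
        main = paste0('pred: ', y_predict[i], '  true: ', mnist$test$y[i]))
}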
Fit a multi-layer neural network and perform the task in Q0 again.
You can refer to this example code: https://tensorflow.rstudio.com/guide/keras/examples/mnist_mlp/
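A sketch of one possible architecture, loosely following the linked mnist_mlp example; the layer widths, dropout rates, and epoch count here are illustrative choices, not values fixed by the lab:
mlp <- keras_model_sequential()
mlp %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = 'softmax')
mlp %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_rmsprop(),
  metrics = c('accuracy')
)
mlp %>% fit(
  x_train, y_train,
  epochs = 20, batch_size = 128,
  validation_split = 0.2
)
mlp %>% evaluate(x_test, y_test)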
Fit a convolutional neural network and perform the same task as in Q0.
You can refer to this example code: https://tensorflow.rstudio.com/guide/keras/examples/mnist_cnn/
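A sketch along the lines of the linked mnist_cnn example. A convolutional network consumes image tensors rather than flattened vectors, so the raw \(28 \times 28\) arrays are reshaped to include a channel dimension; the filter counts, optimizer, and epoch count below are illustrative choices:
# reshape the raw images to 28 x 28 x 1 tensors and rescale to [0, 1]
x_train_img <- array_reshape(mnist$train$x, c(60000, 28, 28, 1)) / 255
x_test_img <- array_reshape(mnist$test$x, c(10000, 28, 28, 1)) / 255
cnn <- keras_model_sequential()
cnn %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = 'relu',
                input_shape = c(28, 28, 1)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 10, activation = 'softmax')
cnn %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_adam(),
  metrics = c('accuracy')
)
cnn %>% fit(
  x_train_img, y_train,
  epochs = 12, batch_size = 128,
  validation_split = 0.2
)
cnn %>% evaluate(x_test_img, y_test)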
Summarize the prediction accuracy and runtime differences between these models.
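One simple way to measure runtime is to wrap each fit() call in system.time() and record the elapsed (wall-clock) component; a minimal sketch, assuming the three models are named mlogit, mlp, and cnn as above:
# wrap the original fit() call when you first train each model,
# then repeat for mlp and cnn; "elapsed" is wall-clock seconds
t_mlogit <- system.time(
  mlogit %>% fit(x_train, y_train, epochs = 20, batch_size = 128,
                 validation_split = 0.2)
)["elapsed"]
t_mlogit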