There are three main ways to build models in TensorFlow. The most commonly used is the Keras Sequential API, which builds a model as a linear stack of layers. It lets us create models quickly and simply, and if you have used the Keras library before, you probably already know it. The Sequential API is very convenient, but it is not flexible enough for building custom models in TensorFlow. We can define a Sequential model in two different ways.

In this tutorial, we will learn how to classify the MNIST dataset using the TensorFlow Keras Sequential API. We will use only a fully connected artificial neural network (ANN) for the model, not a CNN. Let’s take a close look at how to build the model with the Sequential API. First, we need to import TensorFlow, Keras, and the layers module.

import tensorflow as tf 
from tensorflow import keras # import keras from tensorflow
from tensorflow.keras import layers # import layers from tf.keras
from tensorflow.keras.datasets import mnist # import the mnist dataset from tf.keras.datasets
import matplotlib.pyplot as plt # to show images on the screen

Let’s create a neural network that recognizes handwritten digits using the MNIST dataset. MNIST consists of 70,000 images, split into 60,000 training images and 10,000 test images. The dataset has 10 classes, the digits 0 to 9, and these are the labels. Each image is 28 by 28 pixels in grayscale. In the following code, we load the MNIST dataset as x_train, y_train, x_test, and y_test.

(x_train, y_train),(x_test,y_test) = mnist.load_data()
print(x_train.shape) # Check x_train shape
print(y_train.shape) # Check y_train shape
print(x_test.shape) # Check x_test shape
print(y_test.shape) # Check y_test shape
Downloading data from
11493376/11490434 [==============================] - 0s 0us/step
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)

Let’s look at the images in x_train using the matplotlib library.
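As a minimal sketch of one way to do this, the helper below draws the first few images in a grid with their labels as titles. The name plot_digit_grid and the default grid size are illustrative choices, not part of matplotlib or Keras.

```python
import matplotlib.pyplot as plt


def plot_digit_grid(images, labels, rows=2, cols=5):
    """Draw the first rows*cols images in a grid, titled with their labels."""
    # squeeze=False keeps `axes` a 2-D array even for a 1x1 grid.
    fig, axes = plt.subplots(rows, cols, figsize=(cols * 1.5, rows * 1.5), squeeze=False)
    for ax, image, label in zip(axes.ravel(), images, labels):
        ax.imshow(image, cmap="gray")  # MNIST images are grayscale
        ax.set_title(str(label))
        ax.axis("off")
    fig.tight_layout()
    return fig
```

With the data loaded above, calling plot_digit_grid(x_train, y_train) followed by plt.show() displays the first ten training digits.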


Before we create our model, we need to prepare the MNIST dataset. Each image consists of pixel values between 0 and 255, and we scale these values to the range 0 to 1 so that the model trains faster. We also convert the pixels to float32 and flatten each 28 by 28 image into a vector of 784 values.

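# Flatten each 28x28 image to 784 values, convert to float32, and scale to [0, 1]
x_train = x_train.reshape(-1, x_train.shape[1] * x_train.shape[2]).astype("float32") / 255.0
x_test = x_test.reshape(-1, x_test.shape[1] * x_test.shape[2]).astype("float32") / 255.0
print(x_train.shape, x_test.shape, sep="\n") # Check the new shapes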
(60000, 784) 
(10000, 784)
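To see what this preprocessing does without downloading anything, here is a small sketch that applies the same reshape and scaling to a stand-in batch of random pixel values. The random array is an assumption for illustration only, not real MNIST data.

```python
import numpy as np

# Stand-in for a batch of four 28x28 grayscale images with values in 0..255.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Same steps as above: flatten, convert to float32, scale to [0, 1].
flat = images.reshape(-1, images.shape[1] * images.shape[2]).astype("float32") / 255.0

print(flat.shape, flat.dtype)                 # (4, 784) float32
print(flat.min() >= 0.0, flat.max() <= 1.0)   # True True
```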

Keras Sequential API

First Way

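The first way passes the whole list of layers to the Sequential constructor at once. It is compact, but it is less convenient for debugging, because we cannot inspect the model between layers while it is being built.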

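# Layer sizes follow the summary below; relu on the hidden layers matches the second way.
model = keras.models.Sequential([
    keras.Input(shape=(784,)), # flattened 28x28 image
    layers.Dense(512, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.summary()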

Model: "sequential"
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 512)               401920    
dense_1 (Dense)              (None, 128)               65664     
dense_2 (Dense)              (None, 64)                8256      
dense_3 (Dense)              (None, 10)                650       
Total params: 476,490
Trainable params: 476,490
Non-trainable params: 0
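The Param # column can be checked by hand: a Dense layer with n inputs and m units has n * m weights plus m biases. The sketch below assumes the flattened input of 784 pixels; the helper name dense_params is illustrative.

```python
def dense_params(n_inputs, n_units):
    """Number of weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# Input size followed by the number of units in each Dense layer.
layer_sizes = [784, 512, 128, 64, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [401920, 65664, 8256, 650]
print(sum(counts))  # 476490
```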

Second Way

The second way is more flexible than the first. We add layers one at a time with model.add, so we can debug the model while building it, for example by inspecting it after each layer is added.

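model = keras.models.Sequential()
model.add(keras.Input(shape=(784,))) # flattened 28x28 image, needed so the summary can be built
model.add(layers.Dense(512, activation="relu"))
model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))
model.summary()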

Model: "sequential_1"
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 512)               401920    
dense_5 (Dense)              (None, 128)               65664     
dense_6 (Dense)              (None, 64)                8256      
dense_7 (Dense)              (None, 10)                650       
Total params: 476,490
Trainable params: 476,490
Non-trainable params: 0
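One way to use this flexibility, sketched here under the assumption of flattened 784-value inputs, is to pass a dummy batch through the partially built model after each model.add call and check the output shape. The batch of zeros is a stand-in, not real data.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

model = keras.models.Sequential()
model.add(keras.Input(shape=(784,)))  # flattened 28x28 image

dummy_batch = tf.zeros((2, 784))  # stand-in input: two all-zero "images"
shapes = []
for units in (512, 128, 64):
    model.add(layers.Dense(units, activation="relu"))
    # Run the partial model to see what the newest layer produces.
    shapes.append(tuple(model(dummy_batch).shape))

model.add(layers.Dense(10, activation="softmax"))
shapes.append(tuple(model(dummy_batch).shape))

print(shapes)  # [(2, 512), (2, 128), (2, 64), (2, 10)]
```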

Let’s compile and fit the model we created.

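# The last layer already applies softmax, so the loss receives probabilities (from_logits=False).
model.compile(optimizer="adam", loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False), metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=32, epochs=3, validation_data=(x_test, y_test))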
Epoch 1/3
1875/1875 [==============================] - 6s 2ms/step - loss: 0.3415 - accuracy: 0.8957 - val_loss: 0.1159 - val_accuracy: 0.9626
Epoch 2/3
1875/1875 [==============================] - 4s 2ms/step - loss: 0.0902 - accuracy: 0.9716 - val_loss: 0.0975 - val_accuracy: 0.9708
Epoch 3/3
1875/1875 [==============================] - 4s 2ms/step - loss: 0.0595 - accuracy: 0.9817 - val_loss: 0.0813 - val_accuracy: 0.9753
<tensorflow.python.keras.callbacks.History at 0x7f79502d30b8>
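The from_logits argument of the loss says whether the model outputs raw scores (logits) or probabilities. The final Dense layer here uses a softmax activation, so the model outputs probabilities. The NumPy sketch below uses made-up scores and helper names that are not part of Keras. It illustrates why the setting should match the model: treating probabilities as logits amounts to applying softmax twice, which changes the loss.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def sparse_ce_from_probs(probs, labels):
    """Mean negative log-probability of the true class (integer labels)."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())


def sparse_ce_from_logits(logits, labels):
    """Same loss, but softmax is applied to the inputs first."""
    return sparse_ce_from_probs(softmax(logits), labels)


logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])  # made-up scores
labels = np.array([0, 2])
probs = softmax(logits)

matched = sparse_ce_from_probs(probs, labels)      # probabilities treated as probabilities
mismatched = sparse_ce_from_logits(probs, labels)  # probabilities treated as logits

print(matched < mismatched)  # True: the double softmax flattens the predictions
```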

Let’s evaluate the model we created on the test set.

loss_value, val_accuracy = model.evaluate(x_test, y_test, batch_size=32, verbose=0)
print("val_accuracy : ", val_accuracy)
val_accuracy :  0.9753000140190125
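The accuracy metric is the fraction of samples whose most probable class equals the true label. A small NumPy sketch with made-up probabilities shows the calculation; the helper name accuracy_from_probs is illustrative.

```python
import numpy as np


def accuracy_from_probs(probs, labels):
    """Fraction of rows whose highest-probability class equals the label."""
    predicted = np.argmax(probs, axis=1)
    return float(np.mean(predicted == labels))


# Made-up softmax outputs for four samples and three classes.
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
])
labels = np.array([1, 0, 1, 1])

print(accuracy_from_probs(probs, labels))  # 0.75
```

For the trained model, the same helper could be applied as accuracy_from_probs(model.predict(x_test), y_test).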