Pre-trained Neural Network Models

AI Maverick
3 min read · Oct 17, 2022


Introduction

In contrast to training a deep model from scratch, which requires an enormous amount of training and test data, computational time, and budget, pre-trained models are readily available and can be reused for many different purposes.

In the following post, we will learn about the nature of convolutional neural networks and their layers, and walk through a short example of defining transfer-learning models in Python with the Keras library.

Convolutional Neural Networks

Most pre-trained models are trained Convolutional Neural Networks, or CNNs. CNN models focus on feature extraction by stacking a series of convolutional and pooling layers. Complexity grows with depth: early layers extract simple features, while deeper layers combine them into more abstract ones. Pre-trained models are therefore skilled at image feature extraction, and a connected network is trained on top of them for classification.

The layers used in a CNN are as follows:

  • Convolutional

This is the central part of a CNN model and contains the learnable parameters (the filters). The output of the layer is the stacked activation maps of all filters.

  • Activation

The output of the convolutional layer goes through an element-wise activation function (for example, ReLU), which introduces non-linearity into the network.

  • Pooling

It is used to reduce the width and height of the output.

Note that there are different pooling options, including average pooling (AveragePooling2D in Keras) and max pooling (MaxPooling2D).
  • Dense

The fully connected network consists of simple layers of units, each receiving input from every unit of the previous layer; the final dense layer classifies the image.
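The layer types above can be sketched as a minimal Keras model. The input size and layer widths here are assumptions for illustration, not values from any particular dataset:

```python
import numpy as np
from tensorflow.keras import layers, models

# A minimal CNN sketch showing the layer types discussed above
# (hypothetical sizes; not tuned for any real dataset).
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),            # small RGB images
    layers.Conv2D(16, (3, 3), padding="same"),  # convolution: learnable filters
    layers.Activation("relu"),                  # element-wise activation
    layers.MaxPooling2D((2, 2)),                # pooling: halves width and height
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),     # fully connected classifier
])

preds = model.predict(np.random.rand(1, 32, 32, 3))
print(preds.shape)  # (1, 10): one probability per class
```

The softmax output sums to one, so each of the 10 units can be read as a class probability.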

Transfer Learning

Transfer learning refers to reusing a model whose knowledge was acquired on a different dataset as a pre-trained starting point for a new dataset. As an example, the Keras library provides models pre-trained on the ImageNet database.

from tensorflow.keras.applications import VGG16

model = VGG16(include_top=False, weights="imagenet")


With include_top=False, the fully connected classification head is removed, which gives us the convolutional layers only.

With a pre-trained model, the regular workflow is to first freeze the pre-trained layers and train only the newly added output layer,

A snip of the pre-trained model definition
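The original screenshot is not reproduced here, but the first stage could look like the sketch below. weights=None keeps the snippet download-free; in practice you would pass weights="imagenet" as in the earlier code. The 10-class head is an assumption for illustration:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Stage 1 sketch: freeze the convolutional base and train only a new head.
# weights=None avoids downloading the ImageNet weights in this snippet;
# use weights="imagenet" in practice, as in the post's example.
base = VGG16(include_top=False, weights=None, input_shape=(160, 160, 3))
base.trainable = False  # keep the pre-trained layers fixed

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),  # only this layer is trained
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
print(len(model.trainable_weights))  # 2: the Dense kernel and bias
```

Because the base is frozen, only the Dense layer's kernel and bias appear in the trainable weights.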

then fine-tune the pre-trained model by unfreezing its layers and updating their weights. In this second stage you do not have to define the model from scratch; all you have to do is make it trainable again by changing the following parameter.

Inference mode of the transfer learner
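Again in place of the missing screenshot, a sketch of that second stage: unfreeze the base and recompile with a small learning rate so the pre-trained weights are only nudged. As before, weights=None avoids a download here (use weights="imagenet" in practice), and the learning rate 1e-5 is an assumed value, not one from the original post:

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Stage 2 sketch: fine-tune the whole model.
# weights=None keeps the snippet download-free; use weights="imagenet"
# in practice. The learning rate 1e-5 is an illustrative assumption.
base = VGG16(include_top=False, weights=None, input_shape=(160, 160, 3))
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),
])

base.trainable = True  # every layer's weights will now be updated
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="categorical_crossentropy")
```

Recompiling after changing trainable matters: Keras fixes the set of trainable weights at compile time.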

And since the weights are already fitted, fine-tuning converges much faster than training from scratch and often reaches higher accuracy on the target dataset.

Keras

In the above example, I used the Keras library to import the pre-trained model. It is a great tool for anyone who needs to implement transfer-learning models quickly in Python. You can find various pre-trained models in Keras (following table).

Example of the Keras applications
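The table itself is not reproduced here, but a few of the architectures shipped in keras.applications can be browsed programmatically. weights=None just builds each architecture without downloading any weights:

```python
from tensorflow.keras import applications

# A few of the pre-trained architectures available in keras.applications;
# weights=None builds the architecture without downloading weights.
for name, builder in [("VGG16", applications.VGG16),
                      ("ResNet50", applications.ResNet50),
                      ("MobileNetV2", applications.MobileNetV2)]:
    net = builder(weights=None, include_top=False, input_shape=(224, 224, 3))
    print(name, f"{net.count_params():,} parameters")
```

Each builder accepts the same include_top and weights arguments used with VGG16 above, so swapping architectures is a one-line change.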

Conclusion

Convolutional Neural Networks are capable of handling image-processing problems as well as other tasks. They comprise different layers and filters that learn from the input images of the dataset. Each layer and filter has its own parameters, which govern its behavior and should be set carefully.

Because training these models from scratch demands a large computational budget, including hardware and time, transfer learning with pre-trained models was introduced. These models come with pre-trained layers whose weights have already been fitted on ImageNet or other datasets. Fine-tuning such a model is therefore much faster and can even achieve better generalization accuracy.
