Pre-trained Neural Network Models
Introduction
In contrast to training a deep model from scratch, which requires an enormous amount of training and test data, computational time, and budget, pre-trained models can be reused for many different purposes.
In this post, we will look at the nature of convolutional neural networks and their layers, and walk through a short example of defining transfer learning models in Python with the Keras library.
Convolutional Neural Networks
Pre-trained applications are trained Convolutional Neural Networks (CNNs). CNN models focus on feature extraction by stacking a series of convolutional and pooling layers. The complexity grows with depth: the early layers extract simple features, while deeper layers combine them into more abstract ones. Pre-trained models are therefore skilled at image feature extraction, and a fully connected network is trained on top of them for classification.
The layers used in a CNN are as follows:
- Convolutional
This is the core of a CNN model, containing the learnable parameters (the filters). The output of the layer is the stack of activation maps produced by all filters.
- Activation
The output of the convolutional layer is passed through an elementwise activation function (e.g. ReLU), which introduces non-linearity into the network.
- Pooling
This layer is used to reduce the width and height of the output.
Note that there are different pooling options, including average pooling and max pooling, e.g.
AveragePooling2D()
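To make the downsampling concrete, here is a minimal pure-Python sketch of 2x2 max pooling (the function name and the sample grid are my own illustration, not from the original post): each 2x2 block of the input is replaced by its maximum value, halving the width and height.

```python
def max_pool_2x2(image):
    """Downsample a 2D grid by taking the max of each 2x2 block."""
    h, w = len(image), len(image[0])
    return [
        [max(image[r][c], image[r][c + 1],
             image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, w - 1, 2)]
        for r in range(0, h - 1, 2)
    ]

grid = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 5, 6, 2],
    [7, 1, 3, 4],
]
print(max_pool_2x2(grid))  # → [[4, 2], [7, 6]]
```

Average pooling works the same way, except each block is replaced by its mean instead of its maximum.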
- Dense
This fully connected layer consists of units that each receive input from every unit in the previous layer; it performs the final classification of the image.
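The four layer types above can be assembled into a small Keras model. The following is a minimal sketch, and the layer sizes (8 filters, 10 output classes, 32x32 RGB input) are illustrative assumptions rather than values from the post:

```python
from tensorflow.keras import layers, models

# Minimal CNN with the four layer types discussed above.
model = models.Sequential([
    # Convolutional: 8 learnable 3x3 filters -> stacked activation maps
    layers.Conv2D(8, (3, 3), input_shape=(32, 32, 3)),
    # Activation: elementwise non-linearity applied to the maps
    layers.Activation("relu"),
    # Pooling: halves the width and height of the output
    layers.MaxPooling2D((2, 2)),
    # Dense: fully connected classifier over the flattened features
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.summary()
```

Deeper architectures simply repeat the convolution/activation/pooling pattern before the final dense classifier.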
Transfer Learning
Transfer learning refers to reusing a model that has already learned from a different dataset as a pre-trained starting point for a new dataset. As an example, we use the Keras library to load models pre-trained on the ImageNet database.
from tensorflow.keras.applications import VGG16
model = VGG16(include_top=False, weights="imagenet")
By setting include_top=False, the fully connected layers at the top have been removed, which gives us the convolutional layers only.
With a pre-trained model, the regular workflow is to first freeze the pre-trained layers and train only the new output layer of the model,
then fine-tune the pre-trained model by also updating the weights of its layers. For fine-tuning, you do not have to define the model from scratch; all you have to do is make the pre-trained layers trainable by setting their trainable attribute.
And because you already have the pre-trained weights, this training procedure is much faster to update the weights and typically reaches higher accuracy on the new dataset than training from scratch.
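The two-step workflow above can be sketched as follows. This is a minimal example under my own assumptions: 10 output classes, a GlobalAveragePooling2D head, and weights=None to keep the sketch light (in practice you would pass weights="imagenet" as in the snippet above):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Convolutional base; weights=None avoids the ImageNet download
# in this sketch -- use weights="imagenet" in practice.
base = VGG16(include_top=False, weights=None, input_shape=(224, 224, 3))

# Step 1: freeze the base and train only a new classification head.
base.trainable = False
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),  # 10 classes: an assumption
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
# model.fit(train_data, ...)  # train the head on your dataset

# Step 2: fine-tune by unfreezing the base and re-compiling;
# a low learning rate keeps the pre-trained weights from moving too fast.
base.trainable = True
model.compile(optimizer="adam", loss="categorical_crossentropy")
# model.fit(train_data, ...)  # continue training end to end
```

Re-compiling after changing trainable is required, because Keras fixes the set of trainable weights at compile time.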
Keras
In the above example, I used the Keras library to import the pre-trained model. It is a great tool for anyone who needs to implement transfer learning models quickly in Python. You can find the different pre-trained models available in Keras in the following table.
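All of these architectures live under the Keras applications module and are loaded the same way as VGG16. A quick sketch (the four names checked here are a small sample of what is available, not an exhaustive list):

```python
from tensorflow.keras import applications

# A few of the pre-trained architectures shipped with Keras;
# each can be loaded with weights="imagenet" just like VGG16.
for name in ("VGG16", "ResNet50", "MobileNetV2", "InceptionV3"):
    print(name, hasattr(applications, name))
```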
Conclusion
Convolutional Neural Networks are capable of handling image processing problems as well as other tasks. They consist of different layers and filters that learn from the input images of the dataset. Each layer and filter has its own parameters, which govern its behavior and should be set carefully.
Because training these models from scratch demands a larger computational budget in both hardware and time, transfer learning with pre-trained models was introduced. Such models come with pre-trained layers whose weights have already been fitted on ImageNet or other datasets. Fine-tuning them on a new dataset is therefore much faster and can even yield better generalization accuracy.