Employing Deep Learning Approaches for Automatic License Plate
Recognition: A Review
Sajjad Soroori¹, Ali Tourani¹ and Asadollah Shahbahrami¹
¹ Department of Computer Engineering, University of Guilan, Rasht, Iran
ABSTRACT
Employing deep learning approaches has resulted in remarkable improvements in computer vision applications in
recent years. In addition, Deep Neural Networks (DNNs) have been shown to be strong alternatives to common shallow
machine learning techniques such as Support Vector Machines (SVM). Deep learning provides effective solutions for
both classic and modern image processing, feature extraction and object detection problems. Considering the
advantages of utilizing DNNs in a wide range of computer vision fields, this paper presents a concise review of
different deep learning approaches employed in Automatic License Plate Recognition (ALPR) systems. In such systems,
deep learning techniques have been utilized in various phases of ALPR, including license plate detection, character
segmentation and Optical Character Recognition (OCR). Additionally, an overview of common DNN architectures is
given to help clarify and classify the reviewed methods.
Keywords: Deep Learning, Deep Neural Networks, Automatic License Plate Recognition, Intelligent Transportation
Systems, Image Processing.
1. INTRODUCTION
Urban development currently faces severe challenges due to population growth and a lack of transportation
infrastructure to meet people's transportation demands. In this regard, Intelligent Transportation Systems (ITS) have
emerged as effective solutions for handling traffic-related problems and avoiding occlusions [1]. These systems utilize
a wide range of devices for data gathering, processing and output model generation, such as cameras and image sensors,
radar guns, and inductive loop detectors [2]. Among them, roadway cameras play a key role in delivering data for better
urban planning policies, due to their flexibility in providing data, lower installation and maintenance costs, and the
ability to apply machine vision techniques to their output. Consequently, a new branch of ITS known as vision-based ITS
has formed, in which the image/video outputs of the cameras are used in further processing stages [3]. Examples of
vision-based ITS include speed measurement systems, vehicle detection and classification, and Automatic License Plate
Recognition (ALPR).
ALPR is used in most vision-based ITS, such as vehicle detection/tracking, speed measurement, traffic flow estimation,
and violation detection. The process consists of three sequential steps: License Plate Detection (LPD), which extracts
the license plate area from the image; character segmentation, which isolates the areas containing characters within
the plate; and Optical Character Recognition (OCR), which classifies the characters inside those segments. There is a
wide variety of approaches that use different object detection, image processing, and machine vision techniques to
automatically extract license plate data from still images [4-9]. With the advent of deep learning in a wide range of
applications, ALPR has also become an attractive field for researchers to test their Deep Neural Network (DNN)
architectures for license plate localization and recognition. Deep learning-based approaches generally achieve high
accuracy in license plate detection and character recognition, provided that a large dataset containing varied data is
available.
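The three sequential steps described above can be illustrated with a minimal sketch. It is not taken from any of the reviewed methods: the `AlprPipeline` class, the toy stage functions and the `TEMPLATES` lookup are all hypothetical, and a binary pixel grid stands in for a real image. In a real system each stage would typically be a trained model, but the data flow between stages is the same.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]            # toy binary image: 1 = ink, 0 = background
Box = Tuple[int, int, int, int]    # (x, y, width, height), a hypothetical convention


@dataclass
class AlprPipeline:
    """Chains the three ALPR stages: LPD, character segmentation, OCR."""
    detect_plate: Callable[[Image], Box]
    segment_characters: Callable[[Image], list]
    recognize_character: Callable[[tuple], str]

    def read(self, image: Image) -> str:
        x, y, w, h = self.detect_plate(image)                 # 1) LPD
        plate = [row[x:x + w] for row in image[y:y + h]]
        glyphs = self.segment_characters(plate)               # 2) segmentation
        return "".join(self.recognize_character(g) for g in glyphs)  # 3) OCR


def detect_plate(image: Image) -> Box:
    """Toy LPD: bounding box of all non-zero pixels."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    return cols[0], rows[0], cols[-1] - cols[0] + 1, rows[-1] - rows[0] + 1


def segment_characters(plate: Image) -> list:
    """Toy segmentation: split the plate wherever a column is empty."""
    glyphs, current = [], []
    for c in range(len(plate[0])):
        column = [row[c] for row in plate]
        if any(column):
            current.append(column)
        elif current:
            glyphs.append(current)
            current = []
    if current:
        glyphs.append(current)
    # Convert each glyph from a list of columns to a hashable tuple of rows.
    return [tuple(zip(*g)) for g in glyphs]


# Toy OCR: exact template lookup (a real system would use a classifier).
TEMPLATES = {
    ((1,), (1,), (1,)): "1",
    ((1, 1), (0, 1), (0, 1)): "7",
}


def recognize_character(glyph: tuple) -> str:
    return TEMPLATES.get(glyph, "?")


IMAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

pipeline = AlprPipeline(detect_plate, segment_characters, recognize_character)
print(pipeline.read(IMAGE))  # prints 17
```

Because each stage is passed in as a callable, any single stage can be replaced (for example, with a DNN-based detector) without changing the others, which mirrors how the reviewed studies often address one step independently.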
This paper provides a brief review of deep learning techniques employed in the different phases of ALPR systems. The
rest of the paper is organized as follows: Section 2 gives a brief introduction to common deep learning approaches and
algorithms. Section 3 presents a concise review of some recent ALPR methods based on DNNs. Finally, Section 4 concludes
the review.
2. BACKGROUND
Owing to the power of DNNs in classification and object detection systems, these architectures have attracted wide
attention in recent studies. The main concept of deep learning is tied to DNNs, in which several hidden layers process
the input data and produce a final object detection/classification model. It should be noted that these networks face
challenges such as harder training and overfitting compared to shallow learning structures [10]. Fundamentally, deep
learning approaches are computationally intensive and need large amounts of data and processing resources to produce a
high-accuracy model [11]. Thus, DNNs can replace classic object detection or classification techniques and deliver
remarkable results only if sufficient training data and some kind of parallel hardware architecture, e.g. multiple
GPUs, are provided.
In the specific case of ALPR and vision-based ITS, the input of deep learning systems is images, whether still images
or consecutive video frames. Owing to the 2D nature of images, a special type of DNN known as Convolutional Deep Neural
Networks (CDNNs or CNNs) is utilized in machine vision applications [12]. CNNs are multi-layer networks able to capture
spatial and temporal dependencies by applying convolution filters. In feature extraction applications, high-level and
non-linear features are extracted by convolution and fully-connected layers, respectively. By dividing an input image
into smaller portions and passing them through the hidden layers of the network, existing features are extracted and
determined in the form of candidate regions. Since the number of candidate regions in CNNs is huge, a related approach
called Region-based Convolutional Neural Networks (R-CNN) utilizes a Selective Search algorithm to limit the number of
final regions [13]. This is done by recursively merging neighboring candidate regions into larger ones, i.e. region
proposals. Although R-CNN greatly reduces the number of candidate regions, its feature extraction process is very
time-consuming. To address this, the Fast Region-based Convolutional Neural Network (Fast R-CNN) approach projects the
region proposals onto a shared convolutional feature map, so that the convolution is computed only once for each input
image [14]. Consequently, feature extraction takes much less time than in R-CNN. Other differences between the two
approaches are that Fast R-CNN adds a SoftMax probability layer for better classification and a Region of Interest
(RoI) module in the pooling layer to generate fixed-size outputs. Similarly, Faster Region-based Convolutional Neural
Networks (Faster R-CNN) achieves better performance by using a separate convolutional network called the Region
Proposal Network (RPN) as an alternative to the Selective Search algorithm [15]. The region proposals in this method
are fed to a Fast R-CNN detector for classification.
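To make the role of the RoI module concrete, the following is a minimal pure-Python sketch of RoI max pooling on a single-channel feature map. The function name `roi_max_pool`, the `(top, left, height, width)` RoI convention and the floor/ceil bin boundaries are assumptions made for illustration; library implementations differ in details such as coordinate scaling and multi-channel handling.

```python
def _bin_edges(start, length, bins):
    """Split [start, start + length) into `bins` non-empty, possibly overlapping ranges."""
    edges = []
    for i in range(bins):
        lo = start + (i * length) // bins              # floor of the bin start
        hi = start + -(-((i + 1) * length) // bins)    # ceil of the bin end
        edges.append((lo, hi))
    return edges


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region `roi` of a 2D feature map into an out_h x out_w grid.

    `roi` is (top, left, height, width) in feature-map coordinates
    (a hypothetical convention chosen for this sketch).
    """
    top, left, height, width = roi
    row_bins = _bin_edges(top, height, out_h)
    col_bins = _bin_edges(left, width, out_w)
    return [
        [
            max(feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1))
            for (c0, c1) in col_bins
        ]
        for (r0, r1) in row_bins
    ]


feature_map = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]

# Two proposals of different sizes both yield a fixed 2x2 output.
print(roi_max_pool(feature_map, (0, 0, 4, 6), 2, 2))  # [[9, 12], [21, 24]]
print(roi_max_pool(feature_map, (1, 1, 3, 3), 2, 2))  # [[15, 16], [21, 22]]
```

The fixed output size is what allows region proposals of arbitrary shape to be passed to the same fully-connected layers.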
In contrast to the approaches above, where smaller portions of the input image are used for feature extraction, some
other powerful methods look for these features across the entire image. You Only Look Once (YOLO) is a real-time object
detection technique that processes the whole image in a single step and extracts the regions with a higher probability
of containing features [16]. The input image is split into a grid of equal-sized cells, and a CNN processes the
bounding boxes inside the cells to calculate the probability of object existence. In the second version, i.e. YOLOv2 or
YOLO9000, a reordering of the architecture modules and training the network on the COCO and ImageNet datasets resulted
in faster performance and higher detection accuracy [17]. Similarly, the third version, i.e. YOLOv3, uses skip
connections with residual blocks and makes predictions at three levels of down-sampling, along with an improved feature
extractor, which makes it more powerful than the two previous versions [18].
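The grid idea can be sketched in a few lines. In the original YOLO formulation, the grid cell that contains the center of an object is the one responsible for predicting it. The function `grid_cell` below is a hypothetical helper written for illustration, and the 448x448 input size and 7x7 grid are used only as example values.

```python
def grid_cell(center, image_size, grid_size):
    """Return (row, col, x_offset, y_offset) of the grid cell containing `center`.

    center:     (cx, cy) object center in pixels
    image_size: (width, height) in pixels
    grid_size:  number of cells per side (the image is split into grid_size x grid_size)
    The offsets locate the center inside its cell, each in the range [0, 1).
    """
    cx, cy = center
    width, height = image_size
    gx = cx / width * grid_size       # center position in grid units
    gy = cy / height * grid_size
    col = min(int(gx), grid_size - 1)  # clamp centers lying on the right/bottom edge
    row = min(int(gy), grid_size - 1)
    return row, col, gx - col, gy - row


row, col, x_off, y_off = grid_cell((224, 100), (448, 448), 7)
print(row, col)  # 1 3
```

Each cell then only has to predict boxes and confidence scores for objects whose centers fall inside it, which is what allows the whole image to be processed in one forward pass.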
Table 1- A brief overview of some ALPR approaches implemented by deep learning techniques.
| Resource | Deep learning approach | License Plate Detection | Segmentation | Optical Character Recognition | Accuracy/Performance | Dataset(s) |
|---|---|---|---|---|---|---|
| Selmi et al. [22] | two sequential CNNs | ✓ | ✓ | ✓ | Precision = 93.8%, Recall = 91.3% | two generated datasets |
| Shivakumara et al. [23] | a combination of CNN and Recurrent Neural Networks | ✖ | ✓ | ✓ | Recognition rate = 86.4% | MIMOs dataset |
| Hendry et al. [24] | YOLO and sliding window methods | ✓ | ✓ | ✓ | LPD accuracy = 98.22%, OCR accuracy = 78% | AOLP dataset |
| Li et al. [25] | CNN and Region Proposal Networks | ✓ | ✓ | ✖ | Detection rate = 98.2% to 99.12% | CarFlag, AOLP, Caltech and PKUData |
| Hou et al. [26] | the best output of both YOLOv2 and YOLOv3 results | ✓ | ✓ | ✓ | Detection rate = 99.4%, Recognition rate = 97.9% | real condition datasets |
| Wang et al. [27] | a combination of CNN and Recurrent Neural Networks | ✓ | ✖ | ✓ | Precision = 92.3%, Recall = 92.1% | Caltech, generated dataset |
| Artan et al. [28] | detection based on color features | ✓ | ✓ | ✓ | Recognition rate = 92% | generated dataset |
| Hashmi et al. [29] | LPD by template matching and OCR by deep CNN | ✓ | ✓ | ✓ | Detection rate = 91%, Recognition rate = 93% | real condition datasets |
| Zhuang et al. [30] | a modified ResNet-101 architecture | ✓ | ✖ | ✓ | Detection rate = 99% | AOLP and Media Lab |
| Gonçalves et al. [31] | two sequential CNNs | ✓ | ✖ | ✓ | Detection rate = 85.6% | SSIG-SegPlate |
| Zhao et al. [32] | Unit-linking Pulse Coupled Neural Network | ✓ | ✓ | ✓ | Detection rate = 96.1%, Recognition rate = 96.7% | generated dataset |
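Several entries in Table 1 report precision and recall. As a reminder of how these two measures are commonly defined, the sketch below computes them from counts of true positives, false positives and false negatives. The counts in the example are made up for illustration and do not come from any of the reviewed papers, which may also differ in how they decide that a detection is correct.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts.

    precision = TP / (TP + FP): share of reported plates that are correct
    recall    = TP / (TP + FN): share of actual plates that were found
    A ratio with an empty denominator is reported as 0.0.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 missed plates.
print(precision_recall(90, 10, 30))  # (0.9, 0.75)
```

A method with high precision but low recall misses many plates while rarely raising false alarms, so the two values are best read together.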
4. CONCLUSIONS
This paper provided a concise study of recent deep learning approaches employed in Automatic License Plate Recognition
(ALPR) systems. ALPR consists of three major steps, namely License Plate Detection (LPD), character segmentation and
Optical Character Recognition (OCR), and some recent studies have focused on each step independently. We reviewed these
studies based on how they apply deep learning techniques. In addition, a brief introduction to popular Deep Neural
Network (DNN) architectures employed in machine vision-based approaches was provided. It should be noted that the scope
of this paper is mainly limited to studies published since 2015 that reported high-accuracy outcomes.
5. REFERENCES
[1] L. Janušová and S. Čičmancová, "Improving Safety of Transportation by using Intelligent Transport Systems," Procedia Engineering, vol. 134, pp. 14-22, 2016.
[2] L. Qi, "Research on Intelligent Transportation System Technologies and Applications," Workshop on Power Electronics and Intelligent Transportation System, Guangzhou, pp. 529-531, 2008.
[3] M. Bommes, A. Fazekas, T. Volkenhoff, and M. Oeser, "Video Based Intelligent Transportation Systems - State of the Art and Future Development," Transportation Research Procedia, vol. 14, pp. 4495-4504, 2016.
[4] N. Saleem, H. Muazzam, H. M. Tahir, and U. Farooq, "Automatic License Plate Recognition using Extracted Features," 4th International Symposium on Computational and Business Intelligence, Olten, pp. 221-225, 2016.
[5] T. Panchal, H. Patel, and A. Panchal, "License Plate Detection Using Harris Corner and Character Segmentation by Integrated Approach from an Image," Procedia Computer Science, vol. 79, pp. 419-425, 2016.
[6] M. A. Khan, M. Sharif, M. Y. Javed, T. Akram, M. Yasmin, and T. Saba, "License Number Plate Recognition System using Entropy-Based Features Selection Approach with SVM," IET Image Processing, vol. 12, no. 2, pp. 200-209, 2018.
[7] S. S. Tabrizi and N. Cavus, "A Hybrid KNN-SVM Model for Iranian License Plate Recognition," Procedia Computer Science, vol. 102, pp. 588-594, 2016.
[8] Y. Luo, Y. Li, S. Huang, and F. Han, "Multiple Chinese Vehicle License Plate Localization in Complex Scenes," IEEE 3rd International Conference on Image, Vision and Computing, Chongqing, pp. 745-749, 2018.
[9] R. M. Khoshki and S. Ganesan, "Improved Automatic License Plate Recognition (ALPR) System based on Single Pass Connected Component Labeling (CCL) and Reign Property Function," IEEE International Conference on Electro/Information Technology, Dekalb, pp. 426-431, 2015.
[10] L. Deng, "A Tutorial Survey of Architectures, Algorithms, and Applications for Deep Learning," APSIPA Transactions on Signal and Information Processing, vol. 3, no. 2, 2014.
[11] M. Hatt, C. Parmar, J. Qi, and I. El Naqa, "Machine (Deep) Learning Methods for Image Processing and Radiomics," IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 104-108, 2019.
[12] N. Aloysius and M. Geetha, "A Review on Deep Convolutional Neural Networks," International Conference on Communication and Signal Processing, Chennai, pp. 0588-0592, 2017.
[13] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142-158, 2016.
[14] R. Girshick, "Fast R-CNN," IEEE International Conference on Computer Vision, Santiago, pp. 1440-1448, 2015.
[15] Sh. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2015.
[16] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2016-Dec., pp. 779-788, 2016.
[17] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," IEEE Conference on Computer Vision and Pattern Recognition, pp. 6517-6525, 2016.
[18] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," Retrieved from https://arxiv.org/abs/1804.02767, 2018.
[19] N. Aloysius and M. Geetha, "A Review on Deep Convolutional Neural Networks," International Conference on Communication and Signal Processing, Chennai, pp. 0588-0592, 2017.
[20] P. P. Shinde and S. Shah, "A Review of Machine Learning and Deep Learning Applications," 4th International Conference on Computing Communication Control and Automation, Pune, pp. 1-6, 2018.
[21] L. Jiao, F. Zhang, F. Liu, S. Yang, L. Li, Z. Feng, and R. Qu, "A Survey of Deep Learning-Based Object Detection," IEEE Access, vol. 7, pp. 128837-128868, 2019.
[22] Z. Selmi, M. Ben Halima, and A. M. Alimi, "Deep Learning System for Automatic License Plate Detection and Recognition," 14th IAPR International Conference on Document Analysis and Recognition, Kyoto, pp. 1132-1138, 2017.
[23] P. Shivakumara, D. Tang, M. Asadzadehkaljahi, T. Lu, U. Pal, and M. Hossein Anisi, "CNN-RNN based Method for License Plate Recognition," CAAI Transactions on Intelligence Technology, vol. 3, no. 3, pp. 169-175, 2018.
[24] Hendry and Rung-Ching Chen, "Automatic License Plate Recognition via Sliding-Window Darknet-YOLO Deep Learning," Image and Vision Computing, vol. 87, pp. 47-56, 2019.
[25] H. Li, P. Wang, and C. Shen, "Toward End-to-End Car License Plate Detection and Recognition with Deep Neural Networks," IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 3, pp. 1126-1136, 2019.
[26] X. Hou, M. Fu, X. Wu, Z. Huang, and S. Sun, "Vehicle License Plate Recognition System Based on Deep Learning Deployed to PYNQ," 18th International Symposium on Communications and Information Technologies, Bangkok, pp. 79-84, 2018.
[27] J. Wang, X. Liu, A. Liu, and J. Xiao, "A Deep Learning-Based Method for Vehicle License Plate Recognition in Natural Scene," APSIPA Transactions on Signal and Information Processing, vol. 8, 2019.
[28] Y. Artan, B. Alkan, B. Balci, and A. Elihoş, "Deep Learning based Vehicle Make, Model and Color Recognition using License Plate Recognition Camera Images," 27th Signal Processing and Communications Applications Conference, Sivas, pp. 1-4, 2019.
[29] S. N. Hashmi, K. Kumar, S. Khandelwal, D. Lochan, and S. Mittal, "Real Time License Plate Recognition from Video Streams using Deep Learning," International Journal of Information Retrieval Research, vol. 9, no. 1, pp. 65-87, 2019.
[30] J. Zhuang, S. Hou, Z. Wang, and Z. J. Zha, "Towards Human-Level License Plate Recognition," The European Conference on Computer Vision, pp. 306-321, 2018.
[31] G. Resende Gonçalves, M. Alves Diniz, R. Laroca, D. Menotti, and W. Robson Schwartz, "Real-Time Automatic License Plate Recognition through Deep Multi-Task Networks," 31st Conference on Graphics, Patterns and Images, Parana, pp. 110-117, 2018.
[32] Y. Zhao and X. Gu, "Vehicle License Plate Localization and License Number Recognition using Unit-Linking Pulse Coupled Neural Network," Neural Information Processing: Lecture Notes in Computer Science, vol. 7667, Berlin, 2012.