A Simple Vehicle Counting System Using Deep Learning with YOLOv3 Model
Muhammad Fachrie
Universitas Teknologi Yogyakarta
RESTI Journal (System Engineering and Information Technology)
Vol. 4 No. 3 (2020) 462 - 468    ISSN (electronic): 2580-0760
Abstract
Deep Learning is a popular Machine Learning approach that is widely used in many areas of daily life. Its robust performance and ready-to-use frameworks and architectures enable many people to develop various Deep Learning-based software or systems that support human tasks and activities. Traffic monitoring is one area that utilizes Deep Learning for several purposes. Using cameras installed at certain spots along the road, many tasks such as vehicle counting, vehicle identification, traffic violation monitoring, and vehicle speed monitoring can be realized. In this paper, we discuss a Deep Learning implementation that creates a vehicle counting system without having to track vehicle movements. To enhance system performance and to reduce the time needed to deploy a Deep Learning architecture, a pretrained YOLOv3 model is used in this research because of its good performance and moderate computational time in object detection. This research aims to create a simple vehicle counting system that helps humans classify and count the vehicles crossing the street. The counting covers four types of vehicle, i.e. car, motorcycle, bus, and truck, whereas previous research counted cars only. As a result, the proposed system is capable of counting the vehicles crossing the road from video captured by a camera, with a highest accuracy of 97.72%.
Keywords: deep learning, yolov3, object detection, vehicle counting, traffic monitoring
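To make the detection stage concrete, the sketch below shows one way to run a pretrained YOLOv3 model restricted to the four vehicle types used in this paper. It is an illustrative sketch only, not the author's implementation; it assumes the standard Darknet configuration and weight files (yolov3.cfg, yolov3.weights), uses OpenCV's DNN module, and keeps only the COCO classes car, motorcycle, bus, and truck.

import cv2
import numpy as np

# Illustrative sketch: pretrained YOLOv3 (COCO) via OpenCV's DNN module,
# restricted to the four vehicle classes used in the paper.
VEHICLE_CLASSES = {2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}  # COCO class ids

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")  # assumed file names
output_layers = net.getUnconnectedOutLayersNames()

def detect_vehicles(frame, conf_threshold=0.5, nms_threshold=0.4):
    """Return a list of (class_name, confidence, (x, y, w, h)) for one video frame."""
    height, width = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(output_layers)

    boxes, confidences, class_ids = [], [], []
    for output in outputs:
        for det in output:                 # det = [cx, cy, w, h, objectness, class scores...]
            scores = det[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if class_id in VEHICLE_CLASSES and confidence > conf_threshold:
                cx, cy = det[0] * width, det[1] * height
                bw, bh = det[2] * width, det[3] * height
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                confidences.append(confidence)
                class_ids.append(class_id)

    # Non-maximum suppression to merge overlapping boxes of the same object.
    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
    return [(VEHICLE_CLASSES[class_ids[i]], confidences[i], tuple(boxes[i]))
            for i in np.array(keep).flatten()]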
Figure 3. Screenshots of videos from each recording scenario: (a) frontside 1x zoom, (b) frontside 2x zoom, (c) backside 1x zoom, and (d) backside 2x zoom.

Figure 5. Samples of detected vehicles: (a) car, (b) truck and bus, and (c) motorcycle.
In the first testing scenario, as described in Table 1, the system achieved the highest accuracy on the first video, which was recorded from the frontside with 1x zoom, with 96.96% accuracy. The types 'motorcycle' and 'car' had the most accurate counting, with only 2.14% and 5.3% average error respectively across all videos, while the type 'truck' got the worst accuracy, followed by the type 'bus'.
The second scenario was run using 15 fps videos with the same resolution. As described in Table 2, the counting accuracy decreases by 5% up to 14% for each video. Still, the first video got the highest accuracy, with 91.14%. The type 'car' got a relatively constant average error in both testing scenarios, while 'motorcycle' got up to 11% more error in the second scenario than in the first. 'Truck' performed somewhat better in the second scenario but was still the worst among the vehicle types.
Table 2. Results of the second testing scenario (1080p and 15 fps)

Video                  Vehicle   Counting Real   Counting System   Error   Overall Accuracy
Video 1                Car           149             148            -1
(frontside 1x zoom)    Motor         244             218            -26        91.14%
                       Bus             1               2            +1
                       Truck           1               8            +7
Video 2                Car           128             116            -12
(frontside 2x zoom)    Motor         232             212            -20        85.71%
                       Bus             3               5            +2
                       Truck           1              19            +18
Video 3                Car           122             113            -9
(backside 1x zoom)     Motor         173             128            -45        76.09%
                       Bus             1               3            +2
                       Truck           1              16            +15
Video 4                Car            99              94            -5
(backside 2x zoom)     Motor         247             209            -38        84.15%
                       Bus             0               2            +2
                       Truck           1              11            +10
Average Accuracy                                                               84.27%

As shown in Figure 9, a single vehicle is sometimes detected as two different types of vehicle (marked by two bounding boxes) at the same time while crossing the border line; this double detection explains why 'truck' always gets overcounted in both scenarios.

Figure 9. Sample of a vehicle that is detected as two different types of vehicle (marked by two bounding boxes) at the same time while crossing the border line.

The type 'motorcycle' is also well detected by YOLOv3, even though some motorcycles are not detected even when they are close to the camera. Fortunately, this missed detection does not decrease the counting accuracy, because some other motorcycles are double counted due to double detection on the same object, as in Figure 10. The double detection on motorcycles is caused by YOLOv3 recognizing two different kinds of motorcycle, i.e. a motorcycle with its driver and the motorcycle itself.
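The overall accuracy values reported in Tables 1-3 are consistent with taking one minus the ratio of the summed absolute counting errors to the total number of real vehicles in each video. The snippet below reproduces the Video 1 value from Table 2 under that assumption (the formula is inferred from the reported numbers, not quoted from the paper).

def counting_accuracy(real_counts, system_counts):
    # Overall accuracy = 1 - sum(|system - real|) / sum(real) for one video.
    total_error = sum(abs(s - r) for r, s in zip(real_counts, system_counts))
    return 1.0 - total_error / sum(real_counts)

# Video 1 of the second scenario (Table 2): car, motorcycle, bus, truck
real = [149, 244, 1, 1]
system = [148, 218, 2, 8]
print(f"{counting_accuracy(real, system):.2%}")  # -> 91.14%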
Table 3. Results of the first and the second testing scenarios after improvement

Video 1 (frontside 1x zoom)
  Vehicle   Real Count.   Counting 30 fps (prev. / new)   Counting 15 fps (prev. / new)
  Car           149              150 / 150                       148 / 148
  Motor         244              246 / 246                       218 / 218
  Bus             1                2 / 2                           2 / 2
  Truck           1                9 / 6                           8 / 5
  Accuracy                   96.96% / 97.72%                  91.14% / 91.90%

Video 2 (frontside 2x zoom)
  Vehicle   Real Count.   Counting 30 fps (prev. / new)   Counting 15 fps (prev. / new)
  Car           128              114 / 114                       116 / 116
  Motor         232              219 / 219                       212 / 212
  Bus             3                4 / 4                           5 / 5
  Truck           1               20 / 10                         19 / 9
  Accuracy                   87.09% / 89.83%                  85.71% / 88.46%

Video 3 (backside 1x zoom)
  Vehicle   Real Count.   Counting 30 fps (prev. / new)   Counting 15 fps (prev. / new)
  Car           122              114 / 113                       113 / 112
  Motor         173              170 / 170                       128 / 128
  Bus             1                3 / 2                           3 / 2
  Truck           1               17 / 14                         16 / 12
  Accuracy                   90.24% / 91.25%                  76.09% / 77.44%

Video 4 (backside 2x zoom)
  Vehicle   Real Count.   Counting 30 fps (prev. / new)   Counting 15 fps (prev. / new)
  Car            99               96 / 96                         94 / 94
  Motor         247              246 / 246                       209 / 209
  Bus             0                2 / 2                           2 / 2
  Truck           1               14 / 4                          11 / 4
  Accuracy                   94.52% / 97.41%                  84.15% / 86.17%
Since the counting accuracy of the type 'truck' is the worst, we tried to improve the system by adding a few lines of code to resolve the double detection on car objects, ignoring the 'truck' or 'bus' label when it is detected together with 'car' on the same object. This improvement gives better results in both testing scenarios, as can be seen in Table 3. However, 'truck' is still somewhat overcounted due to misclassification: some 'car' objects are misclassified as 'truck' or 'bus' by YOLOv3. Overall, though, the proposed system performs well in counting the vehicles on the road.
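The paper describes this fix only as ignoring the 'truck' or 'bus' label when it appears together with 'car' on the same object. One plausible way to implement such a rule, assuming that "the same object" means heavily overlapping bounding boxes and using a hypothetical IoU threshold, is sketched below; it is not the author's actual code.

def iou(box_a, box_b):
    # Intersection over union of two (x, y, w, h) boxes.
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def suppress_double_detections(detections, iou_threshold=0.6):
    # Drop 'truck'/'bus' boxes that largely overlap a 'car' box on the same object,
    # so the vehicle is counted once, as a car.
    car_boxes = [box for label, _conf, box in detections if label == "car"]
    kept = []
    for label, conf, box in detections:
        if label in ("truck", "bus") and any(iou(box, cb) > iou_threshold for cb in car_boxes):
            continue
        kept.append((label, conf, box))
    return kept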
4. Conclusion

A vehicle counting system has been developed using YOLOv3 without tracking the vehicle movements. The counting is simply executed by evaluating the distance between the vehicle's centroid and the border line. The system achieved its highest accuracy, 97.72%, on the frontside 1x zoom video. YOLOv3 plays a significant role in detecting the vehicles, since the counting applies only to detected objects. The type 'car' has the highest counting accuracy, followed by 'motorcycle' and 'bus', while 'truck' is the worst. The frame rate of the video also affects the performance, since it determines how much of the scene information is actually processed by the system. All in all, this work has been completed with good results. In the future, further improvements should be made to obtain a better system.