

Accredited by the National Journal Accreditation (ARJUNA), managed by the Ministry of Research, Technology, and Higher Education, Republic of Indonesia, with Second Grade (Peringkat 2, Sinta 2) from 2017 to 2021 according to decree No. 10/E/KPT/2019.

Published online on the journal's webpage: http://jurnal.iaii.or.id

RESTI Journal (System Engineering and Information Technology)
Vol. 4 No. 2 (2020) 462 - 468    ISSN Electronic Media: 2580-0760

A Simple Vehicle Counting System Using Deep Learning with YOLOv3 Model

Muhammad Fachrie
Informatics Department, Faculty of Electrical and Information Technology, Universitas Teknologi Yogyakarta
[email protected]

Accepted by editor: 07-04-2020 | Final Revision: 25-05-2020 | Online Publication: 20-06-2020

Abstract

Deep Learning is a popular Machine Learning approach that is widely used in many areas of daily life. Its robust performance and ready-to-use frameworks and architectures enable many people to develop various Deep Learning-based software or systems to support human tasks and activities. Traffic monitoring is one area that utilizes Deep Learning for several purposes. By using cameras installed at certain spots on the roads, many tasks such as vehicle counting, vehicle identification, traffic violation monitoring, and vehicle speed monitoring can be realized. In this paper, we discuss a Deep Learning implementation that creates a vehicle counting system without having to track the vehicles' movements. To enhance system performance and to reduce the time needed to deploy a Deep Learning architecture, a pretrained YOLOv3 model is used in this research due to its good performance and moderate computational time in object detection. This research aims to create a simple vehicle counting system that helps humans classify and count the vehicles that cross the street. The counting is based on four types of vehicle, i.e. car, motorcycle, bus, and truck, while previous research counted cars only. As a result, our proposed system is capable of counting the vehicles crossing the road from video captured by a camera, with a highest accuracy of 97.72%.

Keywords: deep learning, yolov3, object detection, vehicle counting, traffic monitoring

1. Introduction

Deep Learning (DL) outperforms conventional Machine Learning (ML) algorithms in many tasks, especially in Computer Vision (CV). ML plays an important role in CV due to its capability to learn the patterns of objects or images and to classify the objects captured by a camera. Previously, a CV system needed preprocessing and feature extraction steps before it could detect, classify, or recognize objects within an image using an ML algorithm [1]–[3]. Different objects or cases need different preprocessing and feature extraction techniques, which makes a single conventional CV model limited to detecting or recognizing certain objects only. DL, with its large and deep networks, automatically preprocesses the image and extracts its features within the network, then classifies the image, and can even detect the location of every single object inside the image. Nevertheless, DL requires a machine with high specifications and a large amount of data to train the networks and optimize their performance [1], [4].

Traffic monitoring is one area that implements CV technology for several tasks, e.g. intelligent traffic light systems [5], vehicle counting systems [6]–[9], vehicle speed monitoring [8], parking lot monitoring [10]–[12], and traffic violation monitoring [13]. Every task mentioned above starts by detecting the position of each vehicle, e.g. a car; hence, the object detection algorithm plays a crucial role in this part. The traditional Machine Learning approach needs preprocessing to complete this task, e.g. image gray scaling, image binarization, and background subtraction [6], [7], [14], or sometimes edge detection [5]. Of course, this approach has limitations: for example, when the vehicle's shadow appears in the image, the detection can be less precise. Inaccurate detection also occurs when changes happen to the surface of the road, e.g. road repair, road damage, or any obstacles on the road, because those can disturb the image subtraction process. The Deep Learning (DL) approach gives more flexible performance without having to preprocess the image and extract the features with separate methods, even though it is computationally expensive and needs a large amount of data to train the networks. Furthermore, there now exist better DL architectures that have been trained on millions of samples, which makes the development of CV systems easier.
In this work, we developed a simple vehicle counting system using a Deep Learning algorithm. A pretrained YOLOv3 is used as the DL architecture, which is well known for its good accuracy in object detection and its moderate computation compared to other DL architectures [15]–[17]. Moreover, YOLOv3 has been used in several vehicle detection systems, as in [15], [18]–[21]. This work is motivated by previous research that was mostly tested using highway videos passed only by cars, buses, or trucks, with no motorcycles. Besides, any buses or trucks were simply counted as 'car' without classifying them into the more detailed classes 'bus' or 'truck', while some traffic monitoring systems may need more detailed information about the type of each vehicle, whether it is a car, truck, bus, or motorcycle. The previous studies in [6]–[9] were also mostly tested in good traffic conditions with good driving manners, so that the counting gives accurate results. Hence, in this work we focus on developing a system that counts the number of vehicles crossing the road, where the counting is based on the type of the vehicle itself, i.e. car, bus, truck, and motorcycle, using a Deep Learning algorithm with the YOLOv3 architecture.

2. Research Method

This research was conducted in several phases, starting with a literature review to explore the results of previous related studies, in order to find which problems should be solved in the current research and to decide which methods should be implemented. The system was designed once the problem and the methods were clearly decided. As mentioned in the Introduction, YOLOv3 is used as the algorithm to detect the vehicles that cross the road.

To test the system, a set of data was collected in the form of videos recorded in Full HD resolution (1080p). The performance of the system is measured by its accuracy in counting the vehicles, compared to the real number of vehicles counted by a human.
2.1. System Architecture

The vehicle counting system built in this work has two main modules, i.e. an Object Detection Module and a Counting Module, as given in Figure 1. The first module reads every single frame from the video and performs vehicle detection using the YOLOv3 algorithm. This module outputs the location of every detected vehicle, i.e. the bounding box coordinates. Then, the second module counts the number of vehicles crossing the road based on the coordinates or locations of the vehicles. The result of the object detection module therefore plays a significant role in this system, because once a vehicle is not detected, it will not be counted.

Figure 1. General architecture of the vehicle counting system.

The system was developed and tested on a machine with a mobile-version GPU from Nvidia, i.e. an Nvidia GeForce MX150 with 384 CUDA cores, a 1.5 GHz clock speed, and 4 GB of VRAM. Although this engine is not as fast as the GTX series from Nvidia, it is quite enough to run the Deep Learning-based system. From our observation, it detects the objects in a single frame in about 0.2 – 0.3 seconds.
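The paper does not include source code for this pipeline. As an illustration only, the sketch below (Python with OpenCV) shows one way the two modules could be wired together; detect_vehicles and CountingModule are hypothetical names standing in for the Object Detection Module (a detection sketch is given in Section 2.3) and the Counting Module (Section 2.4), not the author's actual implementation.

# Minimal sketch of the two-module pipeline described above (illustrative only).
# `detect_vehicles` and `counting_module` are hypothetical stand-ins for the
# paper's Object Detection Module and Counting Module.
import cv2

def run_pipeline(video_path, detect_vehicles, counting_module):
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break                           # end of video
        # Object Detection Module: returns a list of (class_name, bounding_box)
        detections = detect_vehicles(frame)
        # Counting Module: updates per-class counters from the bounding boxes
        counting_module.update(detections, frame.shape)
    cap.release()
    return counting_module.totals()

The design choice follows the text above: detection and counting are kept separate, so any detector that returns labelled bounding boxes could replace YOLOv3 without changing the counting logic.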
2.2. Data Acquisition

In order to test the system's performance, several videos in Full HD resolution (1080p) were recorded at 30 fps using an 8-megapixel smartphone camera. The videos were taken manually from the top of a pedestrian bridge in one of the big cities in Indonesia. The distance between the surface of the road and the top of the bridge is approximately 5 meters, as illustrated in Figure 2.

Figure 2. Illustration of the data acquisition process.

The videos were recorded in fine weather at about 03:00 pm. There are four recording scenarios, i.e. frontside recording with 1x zoom, frontside recording with 2x zoom, backside recording with 1x zoom, and backside recording with 2x zoom. Therefore, there are four different videos, each three minutes long. The road is a one-way road with four lanes, and various vehicles pass along it, e.g. cars, trucks, buses, motorcycles, bicycles, etc. Figure 3 shows sample frames from each recording scenario.

Figure 3. Screenshots of videos from each recording scenario: (a) frontside 1x zoom, (b) frontside 2x zoom, (c) backside 1x zoom, and (d) backside 2x zoom.

2.3. YOLOv3

Deep Learning (DL) with the Convolutional Neural Network (CNN) architecture is well known for its very good performance in Computer Vision (CV), especially in object detection and classification. Several CNN-based architectures are used in CV for object detection and classification, e.g. Region-based CNN (R-CNN) [22], Fast R-CNN [23], Faster R-CNN [24], Region-based Fully Convolutional Network (R-FCN) [25], YOLO [26], Single Shot Detector (SSD) [27], YOLO9000 [28], YOLOv2 [28], Mask R-CNN [29], and YOLOv3 [17]. Among these object detection techniques, YOLOv3 is considered the most suitable model for this experiment due to its good accuracy and real-time speed of computation.

YOLOv3 is a Deep Learning model with a CNN architecture and is an improvement over the previous version, YOLOv2, which itself is an improvement over YOLO. Based on [17], YOLOv3 has 53 convolutional layers, as described in Figure 4, hence the network is named Darknet-53. Besides improving detection accuracy, this architecture also optimizes GPU utilization and makes the network more efficient in computation [17]. YOLO is also invariant to the size of the input image, which makes its implementation easier and more practical.

Figure 4. Architecture of Darknet-53 [17].

In this work, we used a pretrained YOLOv3 model that has been trained on the MS COCO dataset, which enables it to detect and classify 80 different object classes. However, we only use YOLOv3 to detect four types of vehicle, i.e. car, bus, truck, and motorcycle. Some detected vehicles are presented in Figure 5.

Figure 5. Samples of detected vehicles: (a) car, (b) truck and bus, and (c) motorcycle.
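The paper does not show how the pretrained model is loaded and filtered to these classes. The sketch below illustrates one common setup with OpenCV's dnn module, assuming the standard Darknet files yolov3.cfg and yolov3.weights and the usual coco.names ordering (car = 2, motorcycle = 3, bus = 5, truck = 7); file names and thresholds are our assumptions, not the author's code.

import cv2
import numpy as np

# Assumed file names for the standard pretrained Darknet YOLOv3 (MS COCO).
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
out_names = net.getUnconnectedOutLayersNames()

# COCO class ids under the usual coco.names ordering (assumption).
VEHICLE_CLASSES = {2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}

def detect_vehicles(frame, conf_thresh=0.5, nms_thresh=0.4):
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    boxes, scores, labels = [], [], []
    for output in net.forward(out_names):
        for det in output:                  # det = [cx, cy, bw, bh, obj, class scores...]
            class_scores = det[5:]
            class_id = int(np.argmax(class_scores))
            conf = float(class_scores[class_id])
            if class_id in VEHICLE_CLASSES and conf >= conf_thresh:
                cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                scores.append(conf)
                labels.append(VEHICLE_CLASSES[class_id])
    if not boxes:
        return []
    keep = cv2.dnn.NMSBoxes(boxes, scores, conf_thresh, nms_thresh)
    return [(labels[i], boxes[i]) for i in np.array(keep).flatten()]

Each returned item is a (label, box) pair with box = (x, y, width, height), which is the form assumed by the counting sketches in Section 2.4.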


2.4. Non-Tracking Vehicle Counting

Different from previous works that used object tracking to count the number of vehicles, we propose another simple strategy that counts the vehicles without having to track their movement from frame to frame. As described in Figure 6, our method simply evaluates the distance between the vehicle's centroid and the border line. If the distance is less than or equal to a predefined threshold value, the detection is counted as one vehicle. In this experiment, the threshold value was set to 1.5% of the video resolution after several observations.
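A minimal sketch of this counting rule is given below, assuming a horizontal border line and interpreting "1.5% of the video resolution" as 1.5% of the frame height; the helper names are ours, not the author's.

def centroid(box):
    # box = (x, y, w, h) in pixels; returns the centre point of the bounding box
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def crosses_border(box, border_y, frame_height, threshold_ratio=0.015):
    # Count the vehicle when its centroid lies within 1.5% of the frame height
    # from a horizontal border line placed at y = border_y.
    _, cy = centroid(box)
    return abs(cy - border_y) <= threshold_ratio * frame_height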

Figure 6. The vehicle is counted based on its centroid distance to the border line.
However, this simple strategy has a weakness: if the same vehicle is close to the border line in several frames, the system will count it as two or even three vehicles. Therefore, we applied an additional strategy, as shown in Figure 7, by analyzing three consecutive frames and evaluating the positions of the same vehicle relative to the border line. The same vehicles are identified by comparing their centroids with those in the previous frame (the same vehicle in different frames must have the closest centroid distance among all vehicles). This strategy decreases the counting error. We also observed that the centroid distance of the same vehicle between consecutive frames stays within about 4% of the video resolution.

Figure 7. Illustration of the non-tracking vehicle counting system.
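As an illustration, the sketch below matches a detection to the closest centroid from the previous frame and rejects the match when the displacement exceeds roughly 4% of the frame size, which is our reading of the observation above; the function name and the normalization by frame height are assumptions.

import math

def match_to_previous(curr_c, prev_centroids, frame_height, max_move_ratio=0.04):
    # Return the index of the closest centroid from the previous frame, or None
    # if no previous centroid is within ~4% of the frame size (the observed
    # bound on how far the same vehicle moves between consecutive frames).
    best_idx, best_dist = None, float("inf")
    for i, (px, py) in enumerate(prev_centroids):
        d = math.hypot(curr_c[0] - px, curr_c[1] - py)
        if d < best_dist:
            best_idx, best_dist = i, d
    if best_idx is not None and best_dist <= max_move_ratio * frame_height:
        return best_idx
    return None

A vehicle whose matched centroid was already counted near the border line in a previous frame can then be skipped, which is the intended effect of the three-consecutive-frames rule described above.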
2.5. Performance Measurement

The performance of the system is measured by comparing the real number of vehicles with the number of vehicles counted by the system. The percentage of accuracy is calculated using equation (1).

Accuracy = (1 − |RC − SC| / RC) × 100%      (1)

where RC is the real number of vehicles counted by a human and SC is the number of vehicles counted by the system.
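Equation (1) translates directly into a small helper function; the example below applies it to a single class, namely the 'car' row of Video 1 in Table 1.

def counting_accuracy(real_count, system_count):
    # Equation (1): Accuracy = (1 - |RC - SC| / RC) * 100%
    return (1 - abs(real_count - system_count) / real_count) * 100.0

# 'Car' in Video 1 of Table 1: RC = 149, SC = 150 -> accuracy ~ 99.33%
print(round(counting_accuracy(149, 150), 2))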
3. Result and Discussion

The vehicle counting system developed in this work was tested using the four different videos described in the previous section. Originally, the videos are in 1080p resolution at 30 fps, but to explore the performance of the system we used two different scenarios in the testing phase. Since we used a pretrained YOLOv3, there is no training phase in this work.

Figure 8. Sample of a detected frame from the testing scenario.

In the first testing scenario, we ran the system using all four videos at 1080p Full HD resolution and 30 fps, which gives a total of about 5400 frames. A sample of a detected frame is shown in Figure 8.

Table 1. Result of the first testing scenario (1080p and 30 fps)

Video                         Vehicle   Real Count   System Count   Counting Error   Overall Accuracy
Video 1 (frontside 1x zoom)   Car       149          150            +1               96.96%
                              Motor     244          246            +2
                              Bus       1            2              +1
                              Truck     1            9              +8
Video 2 (frontside 2x zoom)   Car       128          114            -14              87.09%
                              Motor     232          219            -13
                              Bus       3            4              +1
                              Truck     1            20             +19
Video 3 (backside 1x zoom)    Car       122          114            -8               90.24%
                              Motor     173          170            -3
                              Bus       1            3              +2
                              Truck     1            17             +16
Video 4 (backside 2x zoom)    Car       99           96             -3               94.52%
                              Motor     247          246            -1
                              Bus       0            2              +2
                              Truck     1            14             +13
Average Accuracy                                                                     92.20%


In the first testing scenario, as described in Table 1, the system achieved the highest accuracy on the first video, which was recorded from the frontside with 1x zoom, with 96.96% accuracy. The types 'motorcycle' and 'car' have the most accurate counting, with only 2.14% and 5.3% average error respectively across all videos, while the type 'truck' got the worst accuracy, followed by 'bus'.

The second scenario was run using 15 fps videos at the same resolution. As described in Table 2, the counting accuracy decreases by 5% up to 14% for each video, but the first video still gets the highest accuracy with 91.14%. The type 'car' has a relatively constant average error in both testing scenarios, while 'motorcycle' gets up to 11% more error in the second scenario than in the first. 'Truck' is somewhat better in the second scenario but still the worst among the vehicle types.
Table 2. Result of the second testing scenario (1080p and 15 fps)

Video                         Vehicle   Real Count   System Count   Counting Error   Overall Accuracy
Video 1 (frontside 1x zoom)   Car       149          148            -1               91.14%
                              Motor     244          218            -26
                              Bus       1            2              +1
                              Truck     1            8              +7
Video 2 (frontside 2x zoom)   Car       128          116            -12              85.71%
                              Motor     232          212            -20
                              Bus       3            5              +2
                              Truck     1            19             +18
Video 3 (backside 1x zoom)    Car       122          113            -9               76.09%
                              Motor     173          128            -45
                              Bus       1            3              +2
                              Truck     1            16             +15
Video 4 (backside 2x zoom)    Car       99           94             -5               84.15%
                              Motor     247          209            -38
                              Bus       0            2              +2
                              Truck     1            11             +10
Average Accuracy                                                                     84.27%

Based on these two scenarios, the video from the frontside with 1x zoom is the most suitable for the proposed system, even though the other videos also give good results. A higher frame rate gives better performance because there is more information processed by the system, while decreasing the fps discards some or even most of the information.

'Car' has the most stable counting accuracy among the types, even in the lower-fps videos. This is because car objects are well recognized by the YOLOv3 model, so that every 'car' object is detected in every frame. YOLOv3 is also capable of detecting small car objects located far from the camera, as can be seen in Figure 8. After some observations, we realized that there is a drawback: some cars are detected as two different types of vehicle at the same time, as shown in Figure 9. It happens to cars whose shape is similar to a 'truck' or 'bus'. Therefore, a single vehicle could be double counted as two different types of vehicle. This explains why 'truck' is always overcounted in both scenarios.

Figure 9. Sample of a vehicle that is detected as two different types of vehicle (marked by two bounding boxes) at the same time while crossing the border line.

The type 'motorcycle' is also well detected by YOLOv3, even though some motorcycles are not detected even when their location is close to the camera. Fortunately, this missed detection does not decrease the counting accuracy, because some other motorcycles are also double counted due to double detection of the same object, as in Figure 10. The double detection of motorcycles is caused by YOLOv3 recognizing two different kinds of motorcycle, i.e. a motorcycle with its driver and the motorcycle itself.

Figure 10. Samples of motorcycle objects: (a) undetected motorcycle when crossing the border line, (b) double-detected motorcycle when crossing the border line.


Table 3. Result of the first and the second testing scenarios after improvement

Video 1 (frontside 1x zoom)
  Vehicle   Real Count   Counting 30 fps (Prev. / New)   Counting 15 fps (Prev. / New)
  Car       149          150 / 150                       148 / 148
  Motor     244          246 / 246                       218 / 218
  Bus       1            2 / 2                           2 / 2
  Truck     1            9 / 6                           8 / 5
  Accuracy (Prev. / New): 30 fps 96.96% / 97.72%; 15 fps 91.14% / 91.90%

Video 2 (frontside 2x zoom)
  Vehicle   Real Count   Counting 30 fps (Prev. / New)   Counting 15 fps (Prev. / New)
  Car       128          114 / 114                       116 / 116
  Motor     232          219 / 219                       212 / 212
  Bus       3            4 / 4                           5 / 5
  Truck     1            20 / 10                         19 / 9
  Accuracy (Prev. / New): 30 fps 87.09% / 89.83%; 15 fps 85.71% / 88.46%

Video 3 (backside 1x zoom)
  Vehicle   Real Count   Counting 30 fps (Prev. / New)   Counting 15 fps (Prev. / New)
  Car       122          114 / 113                       113 / 112
  Motor     173          170 / 170                       128 / 128
  Bus       1            3 / 2                           3 / 2
  Truck     1            17 / 14                         16 / 12
  Accuracy (Prev. / New): 30 fps 90.24% / 91.25%; 15 fps 76.09% / 77.44%

Video 4 (backside 2x zoom)
  Vehicle   Real Count   Counting 30 fps (Prev. / New)   Counting 15 fps (Prev. / New)
  Car       99           96 / 96                         94 / 94
  Motor     247          246 / 246                       209 / 209
  Bus       0            2 / 2                           2 / 2
  Truck     1            14 / 4                          11 / 4
  Accuracy (Prev. / New): 30 fps 94.52% / 97.41%; 15 fps 84.15% / 86.17%

Since the counting accuracy of the type 'truck' is the worst, we tried to improve the system by adding some lines of code to resolve the double detection of car objects, ignoring the 'truck' or 'bus' label when it is detected together with 'car' on the same object. This improvement gives better results in both testing scenarios, as can be seen in Table 3. However, 'truck' is still somewhat overcounted due to misclassification of vehicles: some 'car' objects are misclassified as 'truck' or 'bus' by YOLOv3. Overall, however, the proposed system performs well in counting the vehicles on the road.
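The paper does not list the added lines of code. One plausible reading is an overlap test between boxes of different classes, sketched below: a 'truck' or 'bus' box that largely overlaps a 'car' box is dropped so that the object is counted once, as a car. The IoU threshold and helper names are assumptions, not the author's implementation.

def iou(a, b):
    # a, b = (x, y, w, h); intersection-over-union of two boxes
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def suppress_duplicate_labels(detections, iou_thresh=0.7):
    # detections: list of (label, box). Drop a 'truck'/'bus' detection that
    # largely overlaps a 'car' detection of the same object, so the vehicle
    # is counted once, as a car.
    cars = [box for label, box in detections if label == "car"]
    kept = []
    for label, box in detections:
        if label in ("truck", "bus") and any(iou(box, c) >= iou_thresh for c in cars):
            continue
        kept.append((label, box))
    return kept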
4. Conclusion

A vehicle counting system has been developed using YOLOv3 without tracking the vehicle movements. The counting is simply executed by evaluating the distance between the vehicle's centroid and the border line. It achieved the highest accuracy of 97.72% when using the frontside 1x zoom video. YOLOv3 plays a significant role in detecting the vehicles, since the counting applies only to detected objects. The object 'car' has the highest counting accuracy, followed by 'motorcycle' and 'bus', while 'truck' is the worst. The frame rate of the video also affects the performance, since it represents the completeness of the information processed by the system. All in all, this work has been completed with good results. In the future, further improvements should be made to obtain a better system.

References
[1] N. K. Chauhan and K. Singh, "A Review on Conventional Machine Learning vs Deep Learning," 2018 Int. Conf. Comput. Power Commun. Technol., pp. 347–352, 2018.
[2] N. O. Mahony et al., "Deep Learning vs. Traditional Computer Vision."
[3] M. F. Rachmadi, M. C. Valdés-Hernández, M. L. F. Agan, and T. Komura, "Deep Learning vs. Conventional Machine Learning: Pilot Study of WMH Segmentation in Brain MRI with Absence or Mild Vascular Pathology," pp. 1–19.
[4] J. Hagerty, R. J. Stanley, and W. V. Stoecker, "Medical Image Processing in the Age of Deep Learning: Is There Still Room for Conventional Medical Image Processing Techniques?," VISIGRAPP, pp. 306–311, 2017.
[5] J. T. G. Nodado, M. A. P. Abugan, A. C. Aralar, and H. C. P. Morales, "Intelligent Traffic Light System Using Computer Vision with Android Monitoring and Control," TENCON 2018 - 2018 IEEE Region 10 Conference, pp. 2461–2466, 2018.
[6] A. J. Kun and Z. Vámossy, "Traffic Monitoring with Computer Vision," in 7th Int'l Symposium on Applied Machine Intelligence and Informatics, 2009, pp. 131–134.
[7] Z. Iftikhar, P. Dissanayake, and P. Vial, "Computer Vision Based Traffic Monitoring System for Multi-track Freeways," in 2014 Int'l Conf. on Intelligent Computing, 2014, pp. 339–349.
[8] Krishna, M. Poddar, M. K. Giridhar, and A. S. Prabhu, "Automated Traffic Monitoring System Using Computer Vision," in 2016 Int'l Conf. on ICT in Business Industry & Governance, 2016.
[9] S. Alghyaline, N. K. T. El-Omari, R. M. Al-Khatib, and H. Y. Al-Kharbshh, "RT-VC: An Efficient Real Time Vehicle Counting Approach," J. Theor. Appl. Inf. Technol., vol. 97, no. 7, pp. 2062–2075, 2019.
[10] T. Paula, C. Florina, R. Brad, L. Br, and M. Greavu, "An Image Feature-Based Method for Parking Lot Occupancy," Future Internet, vol. 11, no. 169, pp. 1–17, 2019.
[11] T. Fabian, "A Vision-Based Algorithm for Parking Lot Utilization Evaluation Using Conditional Random Fields," in 2013 Int'l Symposium on Visual Computing, 2013, pp. 222–233.
[12] B. Y. Cai, R. Alvarez, M. Sit, F. Duarte, and C. Ratti, "Deep Learning Based Video System for Accurate and Real-Time Parking Measurement," IEEE Internet Things J., vol. 6, no. 5, pp. 7693–7701, 2019.
[13] W. Wu, O. Bulan, E. A. Bernal, and R. P. Loce, "Detection of Moving Violations," Comput. Vis. Imaging Intell. Transp. Syst., vol. 1, pp. 101–130, 2017.
[14] N. Seenouvong and U. Watchareeruetai, "A Computer Vision Based Vehicle Detection and Counting System," in 8th International Conference on Knowledge and Smart Technology, 2016, pp. 224–227.
[15] B. Benjdira, T. Khurseed, A. Koubaa, A. Ammar, and K. Ouni, "Car Detection using Unmanned Aerial Vehicles: Comparison between Faster R-CNN and YOLOv3," in 1st Unmanned Vehicle Systems Conference, 2018, pp. 1–6.
[16] M. Bugeja, A. Dingli, M. Attard, and D. Seychell, "Comparison of Vehicle Detection Techniques applied to IP Camera Video Feeds for use in Intelligent Transport Systems," Transp. Res. Procedia, vol. 45, pp. 971–978, 2020.
[17] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement."


[18] L. Ouyang and H. Wang, "Vehicle target detection in complex scenes based on YOLOv3 algorithm," IOP Conf. Ser. Mater. Sci. Eng., vol. 569, pp. 1–7, 2019.
[19] H. Song and H. Liang, "Vision-based vehicle detection and counting system using deep learning in highway scenes," Eur. Transp. Res. Rev., vol. 11, no. 51, pp. 1–16, 2019.
[20] J. Uus and T. Krilavičius, "Detection of different types of vehicles from aerial imagery," vol. 3, 2019.
[21] X. Ding and R. Yang, "Vehicle and Parking Space Detection Based on Improved YOLO Network Model," J. Phys. Conf. Ser., 2019.
[22] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 1, pp. 1–16, 2016.
[23] R. Girshick, "Fast R-CNN," in 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
[24] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," in Neural Information Processing Systems (NIPS) 2015, 2015, pp. 1–14.
[25] J. Dai et al., "R-FCN: Object Detection via Region-based Fully Convolutional Networks," in Neural Information Processing Systems (NIPS) 2016, 2016.
[26] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[27] W. Liu et al., "SSD: Single Shot MultiBox Detector," in European Conference on Computer Vision (ECCV) 2016, 2016.
[28] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1–9.
[29] K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," in IEEE International Conference on Computer Vision (ICCV), 2017.

