Fig.1. Block diagram of proposed vehicle detection, counting and classification system.
A. Background Learning Module
This is the first module in the system; its main purpose is to learn how the background differs from the foreground. Since the proposed system works on a video feed, this module extracts frames from the feed and builds a model of the background. In a traffic scene captured with a static camera installed at the roadside, the moving objects can be considered the foreground and the static objects the
background. Image-processing algorithms are used to learn the background model based on this distinction.
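Such a background model can be sketched as a running average over grayscale frames. The paper does not name the specific algorithm used, so the exponential-average formulation and the `alpha` value below are illustrative assumptions:

```python
import numpy as np

def learn_background(frames, alpha=0.05):
    """Learn a static-background model from a sequence of grayscale
    frames using an exponential running average.

    `alpha` controls how quickly the model adapts: pixels belonging to
    static objects converge to their true intensity, while transient
    (moving) objects are averaged out over time.
    """
    background = frames[0].astype(np.float64)
    for frame in frames[1:]:
        background = (1.0 - alpha) * background + alpha * frame.astype(np.float64)
    return background.astype(np.uint8)
```

With a small `alpha`, a vehicle passing through a pixel for a few frames barely disturbs the model, while the road surface, visible in most frames, dominates the average.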
B. Foreground Extraction Module
This module consists of three steps: background subtraction, image enhancement and foreground extraction. The background is subtracted so that only the foreground objects remain visible; this is usually done by setting the pixels of static objects to binary 0. After background subtraction, image-enhancement techniques such as noise filtering, dilation and erosion are applied to obtain proper contours of the foreground objects. The final result of this module is the foreground.
C. Vehicle Classification Module
The third and last module in the proposed system is classification. After the foreground extraction module is applied, proper contours are acquired. Features of these contours, such as centroid, aspect ratio, area, size and solidity, are extracted and used for the classification of the vehicles.
IV. DETAILED METHODOLOGY
The first step of the proposed system is to load the video on which classification is to be performed. After video selection, the ROI is defined. ROI definition needs careful human supervision because the region of interest and the imaginary counting line play an important role in classification. Once the ROI is defined, the system performs a series of tasks: applying the background mask, subtracting the mask, performing a binary threshold, morphology using erosion and dilation, median blur, applying the masked data to the frame, and converting the frame to grayscale. Contours are detected after these operations. Once contours are detected, the system analyses the moments of the contours, marks the detected contours and calculates the centroid. If the calculated centroid is within range of the diagonal, the system moves on to classification; otherwise it returns to contour detection. The last step is classification; the system classifies the vehicles with two different methods, i.e., using SVM and using the CC.
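The centroid-versus-diagonal check described above can be sketched as a point-to-line distance test. The tolerance value and the helper name below are assumptions, since the paper does not give the exact criterion:

```python
def crossed_line(centroid, line_p1, line_p2, tol=5.0):
    """Return True when a contour centroid lies within `tol` pixels of
    the counting line (the 'diagonal' drawn across the ROI)."""
    (x, y), (x1, y1), (x2, y2) = centroid, line_p1, line_p2
    # Perpendicular distance from the centroid to the infinite line.
    num = abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1)
    den = ((y2 - y1) ** 2 + (x2 - x1) ** 2) ** 0.5
    if num / den > tol:
        return False
    # Also require the centroid to fall between the line's endpoints.
    return min(x1, x2) - tol <= x <= max(x1, x2) + tol
```

A vehicle is counted once when this test first becomes true for its centroid, after which the system proceeds to the classification step.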
For the SVM-based classification, SIFT features are calculated for the contours and used as input to the SVM. Three types of vehicles are identified by the SVM: Low Transport Vehicle (LTV), Medium Transport Vehicle (MTV) and Heavy Transport Vehicle (HTV). The SVM classifies a vehicle using the features extracted with SIFT, and the corresponding counter (LTV, MTV or HTV) is incremented according to the output. In addition, classification of the vehicles using CC is done. Once the calculated centroid is within range of the diagonal, the properties of the contours are extracted, the extracted features are compared with the assumed values, and the output is calculated. In the end, the corresponding counters are incremented according to the output.
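The CC comparison against assumed values can be sketched as simple thresholding on contour area. The class boundaries below are placeholders, since the paper does not report its assumed values; in practice they depend on camera height, ROI size and resolution:

```python
def classify_cc(features, mtv_area=5000, htv_area=12000):
    """Contour-comparison (CC) classification: the extracted contour
    area is compared against assumed per-class bounds.

    The threshold values here are illustrative placeholders, not the
    values used in the paper.
    """
    if features["area"] >= htv_area:
        return "HTV"
    if features["area"] >= mtv_area:
        return "MTV"
    return "LTV"

# Per-class counters incremented according to the classifier output.
counts = {"LTV": 0, "MTV": 0, "HTV": 0}
counts[classify_cc({"area": 6000})] += 1
```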
A. Region of interest
ROI is a particular portion of an image on which an operation is to be performed. It gives the flexibility to work within a specific region instead of manipulating the whole image. In the proposed system, selection of the region of interest is very important for reducing false positives in the detection and classification of vehicles. Selecting the ROI is simple: once the video has started, the user presses the "I" key on the keyboard to activate input mode, then uses the mouse to select four points on the video that define the region of interest. Once the points are placed, pressing any key confirms the region of interest, crops it, and shows the new video feed restricted to that region. Fig. 2 shows the ROI selection input mode; the four green dots on the screen are the points defining the ROI, placed using mouse clicks.
Fig.5. Comparison of original count and system count values of five videos.
Fig.6. Vehicle classification comparisons between original count, CC, SVM and BoF methods in video.
Fig.7. Vehicle classification comparisons between original count, CC, SVM and BoF methods in video.
Fig.8. Vehicle classification comparisons between original count, CC, SVM and BoF methods in video.
Fig.9. Vehicle classification comparisons between original count, CC, SVM and BoF methods in video.
Fig.10. Vehicle classification comparisons between original count, CC, SVM and BoF methods in video.
Table 1. Vehicle classification errors of CC, SVM and BoF methods

          Ground Truth Count   Classification Error % (CC)   Classification Error % (BoF and SVM)
Video 1          123                     13                              22.76
Video 2          229                     1.3                             2.6
Video 3          258                     10.85                           0.7
Video 4          235                     4.25                            10.6
Video 5          220                     5.9                             8.18
VI. CONCLUSION
The proposed solution is implemented in Python, using the OpenCV bindings. Traffic-camera footage from a variety of sources is used in the implementation. A simple interface is developed for the user to select the region of interest to be analysed; image-processing techniques are then applied to calculate the vehicle count, and the vehicles are classified using machine-learning algorithms. From the experiments it is apparent that the CC method outperforms the BoF and SVM method in all results and gives classification results closer to the ground-truth values. The proposed system currently works with already-captured videos, but it can be modified to process live video streams [4] by adding microcontrollers. One limitation of the system is that it is not efficient at detecting occlusion between vehicles, which affects the accuracy of both counting and classification. This problem could be solved by introducing a second level of feature classification, such as classification on the basis of colour. Another limitation of the current system is that it needs human supervision for defining the region of interest: the user has to define an imaginary line that the contour centroids must intersect for vehicles to be counted, so the accuracy depends on the judgment of the human supervisor. Furthermore, the camera angle also affects the system, so camera-calibration techniques could be used for lane detection, giving a better view of the road and increasing efficiency. The system is not capable of detecting vehicles at night, as it needs the foreground objects to be visible in order to extract contour properties as well as the SIFT features used for classification [31]. The system could also be improved for better accuracy using more sophisticated image segmentation and artificial-intelligence operations.
VII. REFERENCES
[1] S.-Y. Cheung, and P.P. Varaiya, “Traffic surveillance by wireless sensor networks: Final report”, PhD
diss., University of California at Berkeley, 2006.
[2] S. Oh, S. Ritchie, and C. Oh, “Real-time traffic measurement from single loop inductive signatures”,
Transportation Research Record: Journal of the Transportation Research Board, (1804), pp. 98-106,
2002.