2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN)
Vision-Face Recognition Attendance Monitoring
System for Surveillance using Deep Learning
Technology and Computer Vision.
Harikrishnan J, Arya Sudarsan, Aravind Sadashiv, Remya Ajai A S
Department of Electronics and Communication Engineering,
Amrita Vishwa Vidyapeetham, Amritapuri, India
[email protected], [email protected], [email protected], [email protected]
Abstract— Nowadays, artificial neural networks can be trained over several billion images and used to detect and recognize faces in an instant, with relative ease and flexibility. This concept is used in the implementation of this real-time attendance cum surveillance system, which can be prototyped and set into action. Major applications of this method include face attendance using a single snap mode on smartphones for university classes, and real-time facial recognition surveillance of lab facilities and workplaces, where it can act as a first line of defense against intruders gaining access. A user-friendly graphical user interface provides flexibility and ease in running these powerful face recognition algorithms powered by deep learning. We have achieved a maximum recognition accuracy of 74 percent while running the real-time surveillance algorithm. This work was done as a solution to the absence of a robust and user-friendly face recognition attendance system.

Keywords— Face Attendance, Face Detection, Face Recognition, Computer Vision, Deep Learning, Biometric Surveillance.

I. INTRODUCTION

Face recognition algorithms have many limitations in real-world scenarios, such as varying lighting conditions, proxy attendance via pictures, and low-quality image processing. In this era of machine learning and artificial intelligence, computers have evolved to process huge chunks of data at very high rates with great efficiency, and image processing and computer vision powered by deep learning have made face recognition a practical reality, solving or reducing most of these limitations. Vision is a fully isolated, modular system set up on a Raspberry Pi. The system can be controlled via network communication with the Pi server, and all reports are maintained in the attendance server, both onboard and online. The module is equipped with a user-friendly GUI (Graphical User Interface) for ease of use. The GUI offers the administrator various options for operating the system, such as choosing between real-time surveillance and single snap mode, or training and updating the classifier with new face data (Fig. 4). The backbone of the software is Python code built on the open-source computer vision library OpenCV and the Tkinter library for the graphical user interface.

II. SYSTEM DESCRIPTION

There are four main phases in the Vision attendance system, shown in Fig. 1:

• Face detection and the gathering of face data.
• Training the recognizer.
• Facial recognition.
• Attendance management in Excel.

Fig. 1. System depiction and data transfer flow chart.
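The attendance-management phase can be sketched with the standard library alone. The system logs to an Excel sheet; as a stand-in for illustration only, the snippet below writes an equivalent record to a CSV file, and the helper name mark_present() is hypothetical rather than the paper's actual backend function:

```python
import csv
from datetime import date

def mark_present(path, name, day=None):
    """Append one 'Present' record for `name` to the attendance file."""
    day = day or date.today().isoformat()
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([day, name, "Present"])

# Single snap mode would call this once per recognized face:
mark_present("attendance.csv", "student_01", day="2019-03-30")
```

In the real system the same record is kept in the attendance server, both onboard and online.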
978-1-5386-9353-7/19/$31.00 ©2019 IEEE
Fig. 2. Program workflow and breakdown.
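The workflow of Fig. 2 can be condensed into a top-level driver. Every name below is a hypothetical stand-in for the backend functions described in the following sections, wired together here only to make the four-phase control flow concrete:

```python
# Hypothetical driver mirroring the program workflow in Fig. 2:
# detect -> recognize -> log attendance, after training the recognizer.

def run_vision(mode, detect, train, recognize, log_attendance, frames):
    """Route one session through the four phases of the system."""
    train()                            # phase 2: train/update the recognizer
    for frame in frames:               # phase 1: capture and detect faces
        face = detect(frame)
        if face is None:
            continue
        name = recognize(face)         # phase 3: facial recognition
        if name is not None:
            log_attendance(name)       # phase 4: attendance management
        if mode == "single_snap":      # single snap mode stops after one hit
            break

seen = []
run_vision(
    "single_snap",
    detect=lambda f: f,                       # toy detector: every frame "is" a face
    train=lambda: None,                       # no-op stand-in for the trainer
    recognize=lambda face: face.get("name"),  # toy recognizer
    log_attendance=seen.append,
    frames=[{"name": "student_01"}],
)
print(seen)  # ['student_01']
```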
III. IMAGE PRE-PROCESSING

The input image is read and converted into an 8-bit grayscale image. The image is flipped along the vertical axis and resized using bicubic interpolation over a 4x4 pixel neighborhood. Bicubic interpolation finds the positions of nearby points and the gradients at those positions, which are connected by a piecewise cubic spline to give a smoother image with fewer interpolation artefacts; it is also used to calculate the positions of points between the two end points, thereby obtaining the new resized pixel values. For facial recognition it is necessary to enhance the facial features. The method used here is to-zero thresholding (OpenCV's THRESH_TOZERO), in which pixels at or below the threshold value are converted to black while the remaining, lighter shades are kept unchanged. A morphological transformation is then performed on the image to remove noise and isolate individual elements. The morphological closing used here first dilates the image, removing noise inside the object and closing disjoint areas, and then applies erosion, which discards pixels near the boundary depending on the kernel used (Fig. 2).

IV. DETECTION

Face detection is done using the Haar cascade method (feature-based), which uses machine learning to classify images into two categories: images containing faces and images of other objects [1]. In this method we use cv2.CascadeClassifier() to classify the input images. Feature-wise extraction is then performed on the images. Features are selected in such a way that they can effectively classify images; for instance, the area of the eyes is compared with the area near the nose, and since the region of the eyes is typically darker than the region of the nose, this comparison can classify the samples [7]. All such features are applied to all the training images, and for each feature there is a threshold value that separates the face images from the non-face images and classifies them accordingly. Since there is a high chance of errors, we consider the features with the minimum error rate. During each classification round, the misclassified images are given more weight and the process is repeated; new error rates and weights are calculated until the requirements are met. To ease the process, the features are grouped into different stages of classifiers, which are applied to a window one stage at a time (Fig. 2). If a window fails the first stage, it is discarded; if it succeeds, the next set of features is applied. A window that passes all the stages is considered a face region.

V. TRAINING

In the training phase, all the user data in the database is selected and handed to the OpenCV recognizer ("trainer"). This operation is performed by a function in OpenCV, which results in a .yml file placed in the "trainer/" directory. The recognizer used here is Local Binary Patterns Histograms (LBPH), which is instantiated with the line of code cv2.face.LBPHFaceRecognizer_create(). To select all the photos from the "dataset/" directory, the getImagesAndLabels(path) function is used; it returns two arrays, "Ids" and "faces". These two arrays are then provided as inputs to recognizer.train(), which trains the recognizer and produces a file named "trainer.yml" saved in the trainer directory [2].
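The LBPH recognizer used for training and recognition rests on the Local Binary Pattern operator. As a self-contained sketch (pure Python over a nested-list "image"; OpenCV performs this internally across every pixel of the real image), the 3x3 computation can be written as:

```python
def lbp_code(img, r, c):
    """3x3 LBP at (r, c): each neighbour >= centre reads as 1, else 0,
    collected clockwise from the top-left into one 8-bit code."""
    centre = img[r][c]
    # clockwise neighbour offsets, starting at the top-left pixel
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dr, dc in offsets:
        code = (code << 1) | (1 if img[r + dr][c + dc] >= centre else 0)
    return code

img = [[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]]
print(lbp_code(img, 1, 1))  # -> 30 (binary 00011110)
```

A histogram of these codes over a face image is what the trained recognizer stores per person; recognition then returns the identity whose stored histogram best matches the new face.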
Fig. 3. Working demo with database results and Hardware Prototype.
VI. RECOGNITION

Image recognition is done using the Local Binary Pattern Histogram (LBPH), in which a 3x3 window is moved along the image and, at each position, the pixel value at the centre of the window is compared with the neighbouring pixel values. If a neighbouring value is greater than or equal to the central pixel value, it is read as 1; otherwise it is read as 0. The values are read in clockwise order, forming a binary number which is the local binary pattern of that specific area. This is done over the entire image and the LBPs are obtained [3].

The obtained LBPs are converted to their corresponding decimal values and a histogram is plotted over these values. Each face in the training dataset has such a histogram. When a new face is detected, its histogram is compared with those of the training dataset and the best-matching histogram is returned, together with the label of the person it belongs to. After that, when a face is captured on the camera, the recognizer tries to identify the face and make a prediction: it checks whether the face data is already available in the training dataset and, if so, returns the histogram ID and an index. A portion of the face is passed as a parameter to recognizer.predict(), which outputs the ID of the owner and the confidence index of the recognizer; the confidence index is zero for a perfect match. When a successful prediction is made, the probable ID and the probability of a correct match are displayed over the image; otherwise an "unknown" label is displayed (Fig. 3).

VII. SINGLE SNAP MODE

For single snap mode, the backend provides a function, snap(). This function captures the face of a student, identifies him/her and, if identified, marks him/her as present in the attendance Excel sheet [5]. Several variables are initialized first, including a list of the names of the trained faces. The function then initializes and starts real-time video capture. The image of the student is captured, flipped and converted into grayscale. Within a for loop, a rectangular box is drawn to outline the face, and recognizer.predict() returns a numerical value for the closeness, or similarity, of the face to the trained faces. Using if/else conditions, if the confidence value is within a certain threshold, the name of the face is shown in the window; otherwise "unknown face" is shown. The function prints the name of the student and the confidence level; the name and status record are then entered into the Excel sheet using the entry function described above, and the loop is exited.

VIII. REALTIME SURVEILLANCE

Real-time surveillance is provided by the backend function realtime(). This function produces a real-time monitor that can be used for surveillance of lab facilities and classrooms, and it demonstrates that the system works by showing a live feed (Fig. 3). The variables are initialized and the VideoCapture() function is used. Within an infinite loop, the images are captured using the read() function, flipped and converted into grayscale [4]. A rectangular box is drawn to show that a face has been identified using the predict() function, and the confidence levels are predicted [6]. If the predicted confidence level is within a certain threshold, the name of the student is displayed. The real-time monitor can be exited using the escape key.

IX. REPORT GENERATION

The backend of the program contains a function, Openreport(). This function opens the Excel report containing the attendance record of the students for a particular day. The frontend consists of the Tkinter program, which includes the necessary widgets, labels, etc., to create the visual output of the GUI. Within the frontend script, the functions are called from the backend script using the import module, which makes the backend functions available to the frontend. Each button is bound to the required function via its command option, so that when the button is pressed, the command accesses the function and performs the required task to obtain the desired result. In this case, if the system recognizes the student, it marks him/her as present in the attendance record for the current day (Fig. 3).

Fig. 4. Graphical User Interface of Vision.

X. FUTURE WORKS

On the basis of the present work, the scope for future work on detection and recognition is as follows [8]: increasing the efficiency of detection and recognition to above 90 percent; expanding the domain of the attendance system from colleges to offices and enterprises, monitoring the arrival and departure of employees accurately and reliably without any proxy attendance [16]; using superior algorithms such as the Fisherface algorithm for facial recognition and classification, to maximize the separation between classes during training; and developing the system into a simple portable hardware product. The proposed methods can also be applied to active student tracking, law enforcement, information security, voter verification, access control, surveillance and banking.

XI. RESULTS

Vision has been trained and tested over several situations in a classroom environment and has achieved a maximum recognition accuracy of 74 per cent. The entire graphical user interface is robust and easy to use and has been configured to run on a pocket-sized computer such as a Raspberry Pi (Fig. 3). The entire system can be run on a Raspberry Pi together with a surveillance camera, which are the only requirements. The modularity of the system enables it to be deployed in classrooms and lab facilities for real-time surveillance or regular attendance, even under poor lighting conditions, thanks to the image pre-processing techniques used in the system.

The system stores the attendance data automatically in the attendance server, both onboard and online, which saves time and is user friendly. Vision can be used in single snap mode or real-time surveillance mode according to the convenience of the user. Owing to the efficient onboard data storage mechanism and real-time surveillance, proxy via pictures is impossible, resolving a major limitation of face recognition systems. In comparison to reference [2], the Haar cascade classifier used in this system is far easier to implement, as it requires a smaller dataset than the Deep Semi-NMF features used in the cited reference. In comparison to reference [7], our system is much simpler and quicker, and provides a very high detection success rate of over 70 per cent. In comparison to reference [8], the Haar-feature-based cascade classifier is better at detecting edges and lines than the Viola-Jones algorithm used in the cited reference. In comparison to reference [13], the Haar cascade classifier used in our algorithm requires a comparatively smaller dataset than the cited reference and provides a higher face detection accuracy of over 70 per cent. In comparison to reference [10], the current model of Vision uses LBPH together with a Haar cascade classifier to provide high face detection accuracy with a smaller dataset and faster processing, and is able to detect all types of faces with high accuracy.

REFERENCES

[1] Li Cuimei, Qi Zhiliang, Jia Nan, Wu Jianhua, “Human face detection algorithm via Haar cascade classifier combined with three additional classifiers”, 13th IEEE International Conference on Electronic Measurement & Instruments (ICEMI), 2017.

[2] Qingdong Liang, Wenting Fang, “College Student Attendance System based on Face Recognition”, IOP Conf. Ser.: Mater. Sci. Eng. 466 012107, 2018.

[3] Kritika Shrivastava, Shweta Manda, P. S. Chavan, T. B. Patil, “Conceptual Model for Proficient Automated Attendance System based on Face Recognition and Gender Classification using Haar Cascade, LBPH Algorithm along with LDA Model”, International Journal of Applied Engineering Research, ISSN 0973-4562, Vol. 13, No. 10, pp. 8075-8080, 2018.

[4] Felix Juefei-Xu, Marios Savvides, “Subspace-Based Discrete Transform Encoded Local Binary Patterns Representations for Robust Periocular Matching on NIST's Face Recognition Grand Challenge”, IEEE Transactions on Image Processing, Vol. 23, No. 8, August 2014.

[5] Ankita Digambar Patil, S. M. Mukane, Aishwarya Vijaykumar Patil, Simran Raju Inamdar, “Automatic Attendance System using Webcam”, International Journal of Advance Research, Ideas and Innovations in Technology, ISSN: 2454-132X, Vol. 4, Issue 2.

[6] Visar Shehu, Agni Dika, “Using Real Time Computer Vision Algorithms in Automatic Attendance Management Systems”, 32nd International Conference on Information Technology Interfaces (IEEE ITI), 2010.
[7] Xiaoyang Tan, Bill Triggs, “Enhanced Local Texture Feature Sets for Face Recognition under Difficult Lighting Conditions”, IEEE Transactions on Image Processing, 2010.

[8] P. Deepan, R. Raja Vignesh, S. Venkateswaran, “Face Recognition based Automated Attendance Management System using Hybrid Classifier”, International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056, p-ISSN: 2395-0072, Vol. 4, Issue 5, May 2017.

[9] Ya-Nan Wang, Jian-Bo Su, “Symmetry Description and Face Recognition using Face Symmetry based on Local Binary Pattern Features”, Proceedings of the 32nd Chinese Control Conference, 2013.

[10] Timo Ahonen, Abdenour Hadid, Matti Pietikäinen, “Face Description with Local Binary Patterns: Application to Face Recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, No. 12, December 2006.

[11] K. Zhang, Z. Zhang, Z. Li, Y. Qiao, “Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks”, IEEE Signal Processing Letters, 2016.

[12] Paul Viola, Michael Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. I-511-I-518, 2001.

[13] Naved Khan Baloch, M. Haroon Yousaf, Waqar Ahmad, “Algorithm for Efficient Attendance Management: Face Recognition based Approach”, IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 4, No. 1, July 2012.

[14] Lopez Ruiz, “Local Binary Patterns applied to Face Detection and Recognition”.

[15] B. V. Pradeep, E. S. Rahul, R. R. Bhavani, “Follow Me Robot using Bluetooth-based Position Estimation”, International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 584-589, 2017.

[16] B. Subarna, D. M. Viswanathan, “Real Time Facial Expression Recognition based on Deep Convolutional Spatial Neural Networks”, International Conference on Emerging Trends and Innovations in Engineering and Technological Research (ICETIETR), 2018.

[17] A. Neethu, S. Athi Narayanan, K. Bijlani, “People Count Estimation using Hybrid Face Detection Method”, 2016 International Conference on Information Science (ICIS), pp. 144-148, 2016.

[18] S. Prathish, S. Athi Narayanan, K. Bijlani, “An Intelligent System for Online Exam Monitoring”, 2016 International Conference on Information Science (ICIS), pp. 138-143, 2016.

[19] R. K. Megalingam, R. N. Nair, S. M. Prakhya, “Automated Voice based Home Navigation System for the Elderly and the Physically Challenged”, 2011 2nd International Conference on Wireless Communication, Vehicular Technology, Information Theory and Aerospace & Electronic Systems Technology (Wireless VITAE), 2011.