Object Detection Using Machine Learning
BACHELOR OF TECHNOLOGY
IN
APPLIED ELECTRONICS & INSTRUMENTATION ENGINEERING
SUBMITTED BY
Deeplekha Gupta
Ankita Das
Aditi Choubey
DEPARTMENT OF AEIE
Acknowledgement
It is our great privilege to express our profound and sincere gratitude to our Project Supervisor,
Mrs. Lipika Mandal, for providing very cooperative and precious guidance at every stage of
the present project work carried out under her supervision. Her valuable advice and
instructions have made the present study a very rewarding and pleasurable experience that has
greatly benefited us throughout the course of the work.
We would like to convey our sincere gratitude to Mr. Intekhab Hussain, Head of the
Department of AEIE, Asansol Engineering College, for providing us the requisite support for timely
completion of our work. We would also like to pay our heartiest thanks and gratitude to all the
teachers of the Department of AEIE, Asansol Engineering College, for the various suggestions
they provided toward the success of our work.
We would like to express our earnest thanks to our colleagues and all the technical staff
of the Department of AEIE, Asansol Engineering College, for their valuable assistance
during our project work.
Finally, we would like to express our deep sense of gratitude to our parents for their constant
motivation and support throughout our work.
……………………………………………..
(Deeplekha Gupta)
……………………………………………..
(Ankita Das)
…………………………………………….
(Aditi Choubey)
DEPARTMENT OF APPLIED ELECTRONICS &
INSTRUMENTATION ENGINEERING
ASANSOL ENGINEERING COLLEGE
Certificate of Approval
This is to certify that the work presented in the project entitled “OBJECT DETECTION USING
MACHINE LEARNING”, submitted by Deeplekha Gupta, Ankita Das, Aditi Choubey in partial
fulfillment of the requirement for the award of the degree of B. Tech in Applied Electronics &
Instrumentation Engineering of Asansol Engineering College, Asansol, is an authentic work
carried out under my supervision and guidance.
To the best of my knowledge, the content of this project does not form a basis of the award of
any previous degree to anyone else.
…………………………….
Intekhab Hussain
(Head of the Department)
Dept. of AEIE
Asansol Engineering College

………………………………….
Lipika Mandal
(Assistant Professor)
Project Guide
Dept. of AEIE
Asansol Engineering College
…………………………………….
Bhaskar Roy
(Assistant Professor)
Final Year Project Coordinator
Dept. of AEIE
Asansol Engineering College
Abstract
Efficient and accurate object detection has been an important topic in the advancement of computer
vision systems. With the advent of deep learning techniques, the accuracy for object detection has
increased drastically. The aim of object detection is to detect all instances of objects from a known
class, such as people, cars or faces in an image. Generally, only a small number of instances of the
object are present in the image, but there are a very large number of possible locations and scales at
which they can occur, and these somehow need to be explored. In this project, we use a completely
deep learning-based approach to solve the problem of object detection in an end-to-end fashion. The
network is trained on the most challenging publicly available dataset, on which an object detection
challenge is conducted annually. Object recognition describes a collection of related computer
vision tasks that involve identifying objects in digital photographs. Image classification
involves predicting the class of one object in an image. Object localization refers to
identifying the location of one or more objects in an image and drawing a bounding box
around their extent. Object detection combines these two tasks, localizing and
classifying one or more objects in an image. Various applications of object detection
have been well researched, including face detection, character recognition and vehicle counting.
Object detection can be used for various purposes including retrieval and surveillance. The resulting
system is fast and accurate, thus aiding those applications which require object detection.
Contents
Acknowledgement
Certificate of Approval
Abstract
Chapter 1: Introduction
Chapter 2: Literature Survey
Chapter 3: Project Methodology
Chapter 4: Algorithms
4.1 SSD
4.2 MOBILENET
4.3 COCO DATASET
4.4 YOLO
4.5 VGG
4.6 R-CNN
Chapter 5: Image Processing
5.1 DESCRIPTION
5.2 DIGITAL IMAGE PROCESSING
5.3 GRAY SCALE IMAGE
5.4 COLOR IMAGE
5.5 RELATED TECHNOLOGY
Chapter 6: Software Requirement
PART 6.1: OBJECT DETECTION USING MACHINE LEARNING
6.1.1 JUPYTER NOTEBOOK
6.1.2 MODULES USED
6.1.3 PROGRAMMING LANGUAGE USED
PART 6.2: FACIAL EMOTION RECOGNITION USING MACHINE LEARNING
6.2.1 HARDWARE INTERFACES
6.2.2 PLANNING
6.2.3 THE LIBRARY & PACKAGES
6.2.4 HAAR CASCADE CLASSIFIER IN OPEN-CV
Chapter 7: Results
PART 7.1: OBJECT DETECTION USING MACHINE LEARNING
7.1.1 RESULTS
PART 7.2: FACIAL EMOTION RECOGNITION USING MACHINE LEARNING
Chapter 8: Conclusion
8.1 CONCLUSION
8.2 FUTURE SCOPE
Chapter 9: References
CHAPTER 1 - INTRODUCTION
PART 1: OBJECT DETECTION USING MACHINE LEARNING
1.1 INTRODUCTION:
Efficient and accurate object detection has been an important topic in the advancement of computer
vision systems. With the advent of deep learning techniques, the accuracy for object detection has
increased drastically. The project aims to incorporate state-of-the-art technique for object detection
with the goal of achieving high accuracy with a real-time performance. Object detection is a computer
vision technique that allows us to identify and locate objects in an image or video. With this kind of
identification and localization, object detection can be used to count objects in a scene and to determine
and track their precise locations, all while accurately labelling them. Object detection is breaking into
a wide range of industries, with use cases ranging from personal security to productivity in the
workplace. Object detection and recognition are applied in many areas of computer vision, including
image retrieval, security, surveillance, automated vehicle systems and machine inspection. Significant
challenges remain in the field of object recognition. The possibilities are endless when it comes to future
use cases for object detection.
The motive of object detection is to recognize and locate all known objects in a scene, preferably in 3D
space; recovering the pose of objects in 3D is very important for robotic control systems.
Imparting intelligence to machines and making robots more and more autonomous and independent
has been a sustained technological dream of mankind. It is our dream to let robots take on
tedious, boring or dangerous work so that we can commit our time to more creative tasks.
Unfortunately, the intelligent part still seems to be lagging behind. In real life, to achieve this goal,
besides hardware development, we need software that can give a robot the intelligence to do the
work and act independently. One of the crucial components in this regard is vision, apart from other
types of intelligence such as learning and cognitive thinking. A robot can hardly be intelligent if it
cannot see and adapt to a dynamic environment.
The searching or recognition process in real time scenario is very difficult. So far, no effective solution
has been found for this problem. Despite a lot of research in this area, the methods developed so far are
not efficient, require long training time, are not suitable for real time application, and are not scalable to
a large number of classes. Object detection is relatively simpler if the machine is looking for
one particular object. However, recognizing all objects inherently requires the ability to differentiate
one object from another, though they may be of the same type. Such a problem is very difficult for
machines if they do not know about the various possibilities of objects.
1.3 MOTIVATION:
Object detection is breaking into a wide range of industries, with use cases ranging from personal
security to productivity in the workplace. Object detection and recognition are applied in many areas of
computer vision, including image retrieval, security, surveillance, automated vehicle systems and
machine inspection. Significant challenges remain in the field of object recognition, and the possibilities are
endless when it comes to future use cases for object detection. Object detection is probably the most
profound aspect of computer vision due to the number of practical use cases. Object detection refers to the
capability of software systems to locate objects in an image or scene and identify each object. It has been
widely used for face detection, vehicle detection, pedestrian counting, web images, security systems
and driverless cars. There are many ways object detection can be used in many fields of
practice. Like every other computer technology, a wide range of creative and amazing uses of object
detection will surely come from the efforts of computer programmers and software developers.
Using modern object detection methods in applications and systems, as well as building new
applications based on these methods, is not a straightforward task. Early implementations of object
detection involved the use of classical algorithms, like the ones supported in OpenCV, the popular
computer vision library. However, these classical algorithms could not achieve sufficient performance to
work under different conditions.
Object detection from a complex background is a challenging application in image processing. The
goal of this project is to identify objects placed over a surface from a complex background image using
various techniques.
Many problems in computer vision were saturating in accuracy a decade ago. However, with
the rise of deep learning techniques, the accuracy of these problems improved drastically. One of the
major problems was image classification, which is defined as predicting the class of the image.
A slightly more complicated problem is image localization, where the image contains a single object
and the system should predict the class and the location of the object in the image (a bounding box
around the object). The more complicated problem (this project), object detection, involves both
classification and localization. In this case, the input to the system will be an image, and the output will
be bounding boxes corresponding to all the objects in the image, along with the class of the object in each
box. An overview of all these problems is depicted in Fig. 1.
The major challenge in this problem is the variable dimension of the output, which is caused by
the variable number of objects that can be present in any given input image. Any general machine
learning task requires a fixed dimension of input and output for the model to be trained. Another
important obstacle to the widespread adoption of object detection systems is the requirement of real-time
performance (>30 fps) while remaining accurate in detection. The more complex the model is, the more time
it requires for inference; and the less complex the model is, the lower its accuracy. This trade-off between
accuracy and performance needs to be chosen as per the application. The problem involves
classification as well as regression, requiring the model to learn both simultaneously. This adds to the
complexity of the problem.
PART 2: FACIAL EMOTION RECOGNITION USING MACHINE LEARNING
2.1 INTRODUCTION:
Human emotion detection is implemented in many areas requiring additional security or information
about the person. It can be seen as a second step to face detection, where we may be required to set up a
second layer of security in which, along with the face, the emotion is also detected. This can be useful
to verify that the person standing in front of the camera is not just a 2-dimensional representation
[1]. Another important domain where emotion detection matters is business promotion. Most
businesses thrive on customer responses to their products and offers. If an artificially
intelligent system can capture and identify real-time emotions from a user image or video,
it can decide whether the customer liked or disliked the product or offer. We have seen
that security is the main reason for identifying any person. Identification can be based on fingerprint
matching, voice recognition, passwords, retina detection, etc. Identifying the intent of a person can also be
important to avert threats. This can be helpful in vulnerable areas like airports, concerts and major
public gatherings, which have seen many breaches in recent years.
2.3 MOTIVATION:
In today’s networked world the need to maintain security of information or physical property is
becoming both increasingly important and increasingly difficult. In countries like Nepal the rate of
crime is increasing day by day, and there are no automatic systems that can track a person's activity. If
we are able to track people's facial expressions automatically, then we can find criminals more easily,
since facial expressions change while doing different activities. So, we decided to build a Facial Expression
Recognition System. We became interested in this project after going through a few papers in this area,
which describe how their systems were created to achieve accurate and reliable facial expression
recognition. As a result, we are highly motivated to develop a system that recognizes facial expressions
and tracks a person's activity.
Human facial expressions can be easily classified into 7 basic emotions: happy, sad, surprise, fear,
anger, disgust, and neutral. Our facial emotions are expressed through activation of specific sets of
facial muscles. These sometimes subtle, yet complex, signals in an expression often contain an
abundant amount of information about our state of mind. Through facial emotion recognition, we are
able to measure the effects that content and services have on the audience or users through an easy and
low-cost procedure. For example, retailers may use these metrics to evaluate customer interest. Health
care providers can provide better service by using additional information about a patient's emotional
state during treatment. Entertainment producers can monitor audience engagement in events to
consistently create desired content.
Humans are well trained in reading the emotions of others; in fact, at just 14 months old, babies can
already tell the difference between happy and sad. But can computers do a better job than us at
assessing emotional states? To answer this question, we designed a deep learning neural network.
CHAPTER 2 – LITERATURE SURVEY
In the year 2017, Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick proposed Mask R-CNN.
Mask R-CNN is not a typical object detection network; it was designed to solve the challenging
instance segmentation task, i.e., creating a mask for each object in the scene. Nonetheless,
Mask R-CNN proved an excellent extension to the Faster R-CNN framework, and in turn
motivated further object detection research. The fundamental idea is to add a binary mask prediction
branch after ROI pooling alongside the existing bounding box and classification branches. Moreover,
both the multi-task training (segmentation + detection) and the new ROI Align layer contribute some
improvement over the bounding box benchmark.
In the year 2017, Navaneeth Bodla, Bharat Singh, Rama Chellappa and Larry S. Davis proposed Soft-NMS:
Improving Object Detection with One Line of Code. Non-maximum suppression (NMS) is
widely used in anchor-based object detection networks to remove duplicate positive proposals that
are close by. More specifically, NMS iteratively eliminates candidate boxes if
they have a high IoU with a more confident candidate box. This can lead to unexpected behaviour when two
objects of the same class are indeed close to each other. Soft-NMS makes a small change:
it merely scales down the confidence score of the overlapping candidate boxes with a parameter. This scaling
parameter gives us more control when tuning localization performance, and also leads to
better precision when high recall is also required.
In the year 2017, Zhaowei Cai and Nuno Vasconcelos of UC San Diego proposed Cascade R-CNN:
Delving into High Quality Object Detection. While FPN investigated how to design a better R-CNN
neck to use backbone features, Cascade R-CNN examined an upgrade of the R-CNN classification and
regression head. The underlying assumption is simple yet insightful: the higher the IoU criterion we
use when assigning positive targets, the fewer false positive predictions the network will learn
to make. However, we cannot simply increase this IoU threshold from the commonly used 0.5 to a
more aggressive 0.7, because that could also lead to overwhelmingly more negative
examples during training.
In the year 2017, Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollar proposed Focal
Loss for Dense Object Detection. To understand why one-stage detectors are usually not comparable
to two-stage detectors, RetinaNet explored the foreground-background class imbalance issue arising from a
one-stage detector's dense predictions. Take YOLO for instance: it tries to predict classes and
bounding boxes for all potential locations at once, so the majority of the outputs are matched to the
negative class during training. SSD addressed this issue by online hard example mining. YOLO used
an objectness score to implicitly train a foreground classifier in the early stage of training.
RetinaNet argues that neither of them got to the key of the issue, so it invented a new loss
function called Focal Loss to help the network learn what is important.
In the year 2018, Shu Liu, Lu Qi, Haifang Qin, Jianping Shi and Jiaya Jia proposed Path Aggregation
Network for Instance Segmentation. Instance segmentation has a close relationship with
object detection, so a new instance segmentation network can often also benefit object
detection research indirectly.
In the year 2018, Chengji Liu, Yufan Tao, Jiawei Liang, Kai Li and Yihang Chen proposed Object Detection
Based on YOLO Network. YOLOv3 is the latest of the YOLO versions discussed in the paper. Following
YOLOv2's convention, YOLOv3 drew more ideas from previous research and became a powerful
one-stage detector. YOLOv3 balances speed, accuracy and implementation
complexity really well, and it became truly popular in industry as a result of its fast speed
and simple components. Essentially, YOLOv3's success comes from its more powerful backbone
feature extractor and a RetinaNet-like detection head with an FPN neck.
CHAPTER 3 -PROJECT METHODOLOGY
PART 3.1: OBJECT DETECTION USING MACHINE LEARNING
We used Python as the programming language. For labelling detected objects with text we use OpenCV,
and we use a pretrained deep learning architecture based on TensorFlow; in this regard we again use
OpenCV to load the already pre-trained TensorFlow frozen model.
First, we load the image into our model; the objects in it are then recognized by the model,
classification and localization are done instantly by the algorithm, and the objects are detected.
Segmentation refers to categorizing each pixel value of an image into a particular class. To build this
project we took the help of the SSD-MobileNetV2 algorithm, as sketched below.
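Below is a minimal sketch of this pipeline using OpenCV's DNN module, assuming a frozen SSD-MobileNetV2 graph and its config file are available; the file names frozen_inference_graph.pb, ssd_mobilenet_v2_coco.pbtxt and input.jpg are placeholders, not files shipped with this report:

import cv2
import numpy as np

# Load the pre-trained TensorFlow frozen model through OpenCV.
net = cv2.dnn.readNetFromTensorflow("frozen_inference_graph.pb",
                                    "ssd_mobilenet_v2_coco.pbtxt")

image = cv2.imread("input.jpg")
h, w = image.shape[:2]

# SSD-MobileNetV2 expects a 300x300 input blob; swapRB converts BGR to RGB.
blob = cv2.dnn.blobFromImage(image, size=(300, 300), swapRB=True)
net.setInput(blob)
detections = net.forward()  # shape (1, 1, N, 7): id, class, score, box

for det in detections[0, 0]:
    score = float(det[2])
    if score > 0.5:
        class_id = int(det[1])
        x1, y1, x2, y2 = (det[3:7] * np.array([w, h, w, h])).astype(int)
        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)),
                      (0, 255, 0), 2)
        cv2.putText(image, f"class {class_id}: {score:.2f}",
                    (int(x1), max(int(y1) - 5, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

cv2.imwrite("output.jpg", image)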
System design shows the overall design of the system. In this section we discuss the design
aspects of the system in detail:
3.2.2 FLOWCHART:
I. IMAGE PRE-PROCESSING:
Image pre-processing includes the removal of noise and normalization against the variation of pixel
position or brightness.
o Color Normalization
o Histogram Normalization
Face detection is useful in detecting facial images. Face detection is carried out on the training dataset
using the Haar classifier called the Viola-Jones face detector, implemented through OpenCV. Haar-like
features encode the difference in average intensity between different parts of the image and consist of
black and white connected rectangles, in which the value of the feature is the difference of the sums of
pixel values in the black and white regions [6].
Selection of the feature vector is the most important part of a pattern classification problem. The image
of the face after pre-processing is then used for extracting the important features. The inherent problems
related to image classification include scale, pose, translation and variations in illumination level
[6].
Emotion classification can be divided into two classes: primary emotions, such as joy, sadness, anger,
fear, disgust and surprise; and secondary emotions, which evoke a mental image that correlates to
memory or a primary emotion.
CHAPTER 4 - ALGORITHMS
4.1 SSD:
Given an image, the single shot multi-box detector divides the image into small patches, then, based on
the most salient features, joins those patches and asks the classifier to classify the image.
SSD uses VGG16 to extract feature maps. It then detects objects using the Conv4_3 layer. For
illustration, we draw the Conv4_3 layer as 8 × 8 spatially (it is actually 38 × 38). For each cell in the
feature map (also called a location), it makes 4 object predictions. Each prediction is composed of a
boundary box and 21 scores, one per class (with one extra class for "no object"), and we pick the
highest score as the class for the bounded object. Conv4_3 makes a total of 38 × 38 × 4 predictions:
four predictions per cell regardless of the depth of the feature maps. As expected, many predictions
contain no object, and SSD reserves class "0" to indicate this. SSD does not use a delegated region
proposal network. Instead, it resorts to a very simple method: it computes both the location and the
class scores using small convolution filters. After extracting the feature maps, SSD applies 3 × 3
convolution filters for each cell to make predictions. (These filters compute the results just like
regular CNN filters.) Each filter outputs 25 channels: 21 class scores plus 4 boundary box coordinates.
Above we described SSD as detecting objects from a single layer. Actually, it uses multiple layers
(multi-scale feature maps) to detect objects independently. As the CNN reduces the spatial dimension
gradually, the resolution of the feature maps also decreases. SSD uses lower-resolution layers to
detect larger-scale objects. For example, the 4 × 4 feature maps are used for larger-scale objects.
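A toy NumPy sketch of the numbers quoted above, assuming 38 × 38 cells, 4 default boxes per cell and 21 classes; random scores stand in for real network outputs:

import numpy as np

cells, boxes_per_cell, num_classes = 38, 4, 21
print(cells * cells * boxes_per_cell)  # 5776 predictions from Conv4_3

# Each prediction carries 21 class scores plus 4 box offsets (25 channels).
scores = np.random.rand(cells, cells, boxes_per_cell, num_classes)
best_class = scores.argmax(axis=-1)  # highest-scoring class per default box
print(best_class.shape)              # (38, 38, 4)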
4.2 MOBILENET:
MobileNet uses depthwise separable convolutions that help in building deep neural networks. The
MobileNet model is most appropriate for mobile and embedded vision-based applications where there
is a lack of computing power. The main objective of MobileNet is to optimize latency while
building small neural nets at the same time; rather than focusing only on size, it also focuses on speed.
MobileNets are constructed from depthwise separable convolutions. The number of parameters is
reduced significantly by this model through the use of depthwise separable convolutions, when
compared to a network with normal convolutions of the same depth. MobileNet is an efficient and
portable CNN architecture that is used in real-world applications. MobileNets primarily use depthwise
separable convolutions in place of the standard convolutions used in earlier architectures to build
lighter models. MobileNets introduce two new global hyperparameters (a width multiplier and a
resolution multiplier) that allow model developers to trade off latency or accuracy for speed and model
size depending on their requirements. Each depthwise separable convolution layer consists of a
depthwise convolution and a pointwise convolution. Counting depthwise and pointwise convolutions
as separate layers, a MobileNet has 28 layers. A standard MobileNet has 4.2 million parameters, which
can be further reduced by tuning the width multiplier hyperparameter appropriately.
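A minimal Keras sketch of one depthwise separable block, assuming TensorFlow is installed; the layer sizes are illustrative, not MobileNet's exact configuration:

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(224, 224, 3))
x = layers.DepthwiseConv2D(kernel_size=3, padding="same")(inputs)  # depthwise
x = layers.Conv2D(64, kernel_size=1)(x)                            # pointwise
model = tf.keras.Model(inputs, x)
model.summary()  # compare the parameter count against a plain 3x3 Conv2D(64)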
4.3 COCO DATASET:
COCO stands for Common Objects in Context; the image dataset was created with the goal of
advancing image recognition. The COCO dataset contains challenging, high-quality visual data for
computer vision, and state-of-the-art neural networks are mostly trained on it. For example, COCO is
often used to benchmark algorithms and compare the performance of real-time object detection. The
format of the COCO dataset is automatically interpreted by advanced neural network libraries.
4.4 YOLO:
YOLO performs real-time object detection. It applies one neural network to the complete image,
dividing the image into regions and predicting bounding boxes and probabilities for every region.
These bounding boxes are weighted by the predicted probabilities. A single neural network predicts
bounding boxes and class probabilities directly from full pictures in one evaluation. Since the full
detection pipeline is a single network, it can be optimized end-to-end directly on detection
performance. You Only Look Once (YOLO) is one of the most popular model architectures and object
detection algorithms. It uses one of the best neural network architectures to produce high accuracy and
overall processing speed, which is the main reason for its popularity; a web search for object detection
algorithms will most likely surface the YOLO model first.
The YOLO algorithm aims to predict the class of an object and the bounding box that defines the
object's location in the input image. It describes each bounding box using four numbers (a small
conversion sketch follows this list):
o Center of the bounding box (b_x, b_y)
o Width of the box (b_w)
o Height of the box (b_h)
o In addition, YOLO predicts the number c for the predicted class, as well as the probability of the
prediction (P_c)
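A small sketch converting YOLO's center-format box into corner coordinates for drawing; the pixel values below are made up for illustration:

def yolo_to_corners(bx, by, bw, bh):
    # (b_x, b_y) is the box center; return top-left and bottom-right corners.
    x1 = bx - bw / 2
    y1 = by - bh / 2
    x2 = bx + bw / 2
    y2 = by + bh / 2
    return x1, y1, x2, y2

print(yolo_to_corners(160, 120, 80, 60))  # (120.0, 90.0, 200.0, 150.0)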
4.5 VGG:
The VGG network is another convolutional neural network architecture used for image classification.
VGG stands for Visual Geometry Group; it is a standard deep convolutional neural network (CNN)
architecture with multiple layers. The "deep" refers to the number of layers, with VGG-16 and VGG-19
consisting of 16 and 19 weight layers respectively. The VGG architecture is the basis of
ground-breaking object recognition models.
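A minimal sketch loading the stock VGG-16 classifier from Keras, assuming TensorFlow is installed (the ImageNet weights download on first use):

from tensorflow.keras.applications import VGG16

model = VGG16(weights="imagenet")  # 16 weight layers, 1000 ImageNet classes
model.summary()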
4.6 R-CNN:
R-CNN is a progressive visual object detection system that combines bottom-up region proposals with
rich features computed by a convolutional neural network. R-CNN uses region proposal methods to
first generate potential bounding boxes in an image and then runs a classifier on these proposed boxes.
CHAPTER 5 - IMAGE PROCESSING
5.1 DESCRIPTION:
With the advent of modern technology our desires have grown, and they know no bounds. In the present era
a huge amount of research is going on in the fields of digital images and image processing. The
progression has been exponential and is ever increasing. Image processing is a vast area of research
in the present-day world and its applications are very widespread.
Image processing is the field of signal processing where both the input and output signals are images.
One of the most important applications of image processing is facial expression recognition. Our
emotions are revealed by the expressions on our face. Facial expressions play an important role in
interpersonal communication. A facial expression is a non-verbal gesture that appears on our face
according to our emotions. Automatic recognition of facial expressions plays an important role in
artificial intelligence and robotics, and thus it is a need of the generation. Some applications related to
this include personal identification and access control, videophone and teleconferencing, forensic
applications, human-computer interaction, automated surveillance, cosmetology and so on.
The objective of this project is to develop an Automatic Facial Expression Recognition System which
can take human facial images containing some expression as input and recognize and classify them
into seven different expression classes:
Neutral
Angry
Happy
Disgust
Fear
Sad
Surprise
Fig. 05: Basic different emotions
What is DIP?
An image may be defined as a two-dimensional function f(x, y), where x and y are spatial coordinates,
and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the
image at that point. When x, y and the amplitude values of f are all finite discrete quantities, we
call the image a digital image. The field of DIP refers to processing digital images by
means of a digital computer. A digital image is composed of a finite number of elements, each
of which has a particular location and value. These elements are called pixels.
Vision is the most advanced of our senses, so it is not surprising that images play the single most
important role in human perception. However, unlike humans, who are limited to
the visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire EM
spectrum, ranging from gamma rays to radio waves. They can also operate on images generated by sources
that humans are not accustomed to associating with images.
There is no general agreement among authors regarding where image processing stops and other
related areas, such as image analysis and computer vision, start. Sometimes a distinction
is made by defining image processing as a discipline in which both the input and output of a process
are images. This is a limiting and somewhat artificial boundary. The area of image
analysis lies between image processing and computer vision.
There are no clear-cut boundaries in the continuum from image processing at one end to complete vision at
the other. However, one useful paradigm is to consider three types of computerized processes in this
continuum: low-, mid- and high-level processes. Low-level processes involve primitive operations, such
as image preprocessing to reduce noise, contrast enhancement and image sharpening.
A low-level process is characterized by the fact that both its inputs and outputs are images.
Mid-level processing of images involves tasks such as segmentation, description of objects to
reduce them to a form suitable for computer processing, and classification of individual objects.
A mid-level process is characterized by the fact that its inputs generally are images but
its outputs are attributes extracted from those images. Finally, higher-level processing
involves "making sense" of an ensemble of recognized objects, as in image analysis, and, at the far end
of the continuum, performing the cognitive functions normally associated with human vision.
Digital image processing, as already defined, is used successfully in a broad range of
areas of exceptional social and economic value.
What is an Image?
An image is represented as a two-dimensional function f(x, y), where x and y are spatial coordinates and
the amplitude of f at any pair of coordinates (x, y) is called the intensity of the image at that point.
Processing on image:
Processing on an image can be of three types: low-level, mid-level and high-level.
Low-level Processing:
o Noise removal
o Contrast enhancement
o Image sharpening
Medium-level Processing:
o Segmentations
o Edge Detection
o Object Extraction
High-level Processing:
o Image analysis
o Scene interpretation
Since the digital image is invisible, it must be prepared for viewing on one or more output devices (laser
printer, monitor, etc.). The digital image can be optimized for the application by enhancing the appearance
of the structures within it.
Pixel:
A pixel is the smallest element of an image, and each pixel corresponds to a single value. In an 8-bit
grayscale image, the value of each pixel lies between 0 and 255. Each pixel stores a value proportional
to the light intensity at that particular location. Pixel density is indicated in either pixels per inch
or dots per inch.
Resolution:
Resolution can be defined in many ways, such as pixel resolution, spatial resolution, temporal
resolution and spectral resolution. In pixel resolution, the term resolution refers to the total number
of pixels in a digital image. For example, if an image has M rows and N columns, then its
resolution can be defined as M × N. The higher the pixel resolution, the higher the quality of the
image.
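A small sketch illustrating pixels and pixel resolution with OpenCV, assuming an image file input.jpg exists:

import cv2

gray = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)
M, N = gray.shape                # M rows, N columns
print(f"resolution: {M} x {N}")
print(gray[0, 0])                # one pixel's gray level, in the range 0-255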
Since achieving high resolution is not a cost-effective process, it is not always possible to obtain
high-resolution images at low cost. Hence super-resolution imaging is desirable: with the help of
certain methods and algorithms we can produce high-resolution images from low-resolution images.
5.3 GRAY SCALE IMAGE:
A grayscale image is a function I(x, y) of the two spatial coordinates of the image plane, where I(x, y)
is the intensity of the image at the point (x, y) on the image plane. I(x, y) takes non-negative values,
and we assume the image is bounded by a rectangle.
5.4 COLOR IMAGE:
A color image can be represented by three functions: R(x, y) for red, G(x, y) for green and B(x, y) for
blue. An image may be continuous with respect to the x and y coordinates and also in amplitude.
Converting such an image to digital form requires that the coordinates and the amplitude be digitized.
Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called
quantization.
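A small sketch of the three color functions R(x, y), G(x, y) and B(x, y), assuming input.jpg exists; note that OpenCV loads channels in BGR order:

import cv2

color = cv2.imread("input.jpg")
B, G, R = cv2.split(color)        # one 2-D function per channel
print(R[0, 0], G[0, 0], B[0, 0])  # the three amplitudes at pixel (0, 0)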
5.5 RELATED TECHNOLOGY:
I. TENSORFLOW:
TensorFlow is an open-source software library for high-performance numerical computation. Its
versatile design allows simple deployment of computation across a range of platforms (CPUs, GPUs,
TPUs), from desktops to clusters of servers to mobile and edge devices. TensorFlow was designed
and developed by researchers and engineers from the Google Brain team within Google's AI
organization. It comes with robust support for machine learning and deep learning, and its versatile
numerical computation core is used across several other scientific domains.
TensorFlow makes it easy to construct, train and deploy object detection models, and it provides a
collection of detection models pre-trained on the COCO dataset, the KITTI dataset and the
Open Images dataset. One of the numerous detection models is the combination of the Single
Shot Detector (SSD) and the MobileNet architecture, which is fast and efficient and does not need huge
computational capability to accomplish object detection.
“Deep Face” is a deep learning facial recognition system developed to identify human faces in a digital
image, designed and developed by a group of researchers at Facebook. Google also has its own facial
recognition system in Google Photos, which automatically separates all the photos according to the
person in the image. There are various components involved in facial recognition; it focuses on various
aspects like the eyes, nose, mouth and eyebrows for recognizing a face.
SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect
ratios and scales per feature map location. At the time of prediction, the network generates scores for
the presence of each object category in each default box and generates adjustments to the box to better
match the object shape. Additionally, the network combines predictions from multiple feature maps
with different resolutions to naturally handle objects of various sizes.
CHAPTER 6 – SOFTWARE REQUIREMENT
PART 6.1 OBJECT DETECTION USING MACHINE LEARNING
6.1.1 JUPYTER NOTEBOOK:
The Jupyter Notebook App is a server-client application that allows editing and running notebook
documents via a web browser. The Jupyter Notebook App can be executed on a local desktop requiring
no internet access (as described in this document) or can be installed on a remote server and accessed
through the internet. In addition to displaying, editing and running notebook documents, the Jupyter
Notebook App has a “Dashboard” (Notebook Dashboard), a “control panel” showing local files and
allowing one to open notebook documents or shut down their kernels.
6.1.2 MODULES USED:
I. OPEN CV:
OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine
learning software library. OpenCV was created to provide a common infrastructure for computer
vision applications and to accelerate the use of machine perception in commercial products [6]. It is
very easy for businesses to utilize and modify the code, as OpenCV is a BSD-licensed product. It is a
rich, wholesome library containing more than 2500 optimized algorithms, including a comprehensive
set of both classic and state-of-the-art computer vision and machine learning algorithms. These
algorithms can be used to detect and recognize faces, identify objects, classify human actions in
videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D
point clouds from stereo cameras, stitch images together to produce a high-resolution image of an
entire scene, find similar images from an image database, remove red eyes from images taken with
flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented
reality.
Officially launched in 1999, the OpenCV project was initially an Intel Research initiative to advance
CPU-intensive applications, part of a series of projects including real-time ray tracing and 3D display
walls. The main contributors to the project included a number of optimization experts at Intel Russia,
as well as Intel's Performance Library Team. In the early days of OpenCV, the goals of the project were
described as:
o Advance vision research by providing not only open but also optimized code for basic vision
infrastructure. No more reinventing the wheel.
o Disseminate vision knowledge by providing a common infrastructure that developers could
build on, so that code would be more readily readable and transferable.
II. MATPLOTLIB:
Matplotlib is a Python library used to create 2D graphs and plots from Python scripts. It has a
module named pyplot which makes plotting easy by providing features to control line styles,
font properties, axis formatting, etc. It supports a very wide variety of graphs and plots, namely
histograms, bar charts, power spectra, error charts, etc. It is used along with NumPy to provide an
environment that is an effective open-source alternative to MATLAB. It can also be used with graphics
toolkits like PyQt and wxPython.
Matplotlib is a comprehensive library for creating static, animated and interactive visualizations in
Python. Matplotlib makes easy things easy and hard things possible.
Matplotlib allows us visual access to huge amounts of data in easily digestible visuals. It consists of
several plot types like line, bar, scatter, histogram, etc., as in the sketch below.
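A minimal pyplot sketch of the kind of plot described above:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
plt.plot(x, np.sin(x), label="sin(x)")  # a simple line plot
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.legend()
plt.savefig("sine.png")                 # or plt.show() in a notebook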
6.1.3 PROGRAMMING LANGUAGE USED:
Python was used as the programming language; debugging Python programs is easy, and the debugger
is written in Python itself, testifying to Python's introspective power. On the other hand, often the
quickest way to debug a program is to add a few print statements to the source: the fast
edit-test-debug cycle makes this simple approach very effective.
PART 6.2: FACIAL EMOTION RECOGNITION USING MACHINE LEARNING
6.2.2 PLANNING:
6.2.3 THE LIBRARY & PACKAGES:
NumPy: NumPy is the fundamental package for scientific computing with Python. It contains, among
other things:
o A powerful N-dimensional array object
o Sophisticated (broadcasting) functions
o Tools for integrating C/C++ and Fortran code
o Useful linear algebra, Fourier transform and random number capabilities
NumPy Array: A NumPy array is a grid of values, all of the same type, and is indexed by a tuple of
non-negative integers. The number of dimensions is the rank of the array; the shape of an array is a
tuple of integers giving the size of the array along each dimension.
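A small sketch of rank and shape, assuming NumPy is installed:

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.ndim)   # rank: 2
print(a.shape)  # shape: (2, 3), the size along each dimension
print(a.dtype)  # all elements share one type, e.g. int64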
SciPy: SciPy (Scientific Python) is often mentioned in the same breath as NumPy. SciPy extends the
capabilities of NumPy with further useful functions for minimization, regression, Fourier
transformation and many others.
Keras: Keras is a high-level neural networks API, written in Python and capable of running on top of
TensorFlow, CNTK or Theano. It was developed with a focus on enabling fast experimentation. Keras
contains numerous implementations of commonly used neural network building blocks such as layers,
objectives, activation functions and optimizers, and a host of tools to make working with image and
text data easier. The code is hosted on GitHub, and community support forums include the GitHub
issues page and a Slack channel. Keras allows users to productize deep models on smartphones (iOS
and Android), on the web, or on the Java Virtual Machine. It also allows use of distributed training of
deep learning models on clusters of Graphics Processing Units (GPUs).
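A minimal sketch of assembling Keras building blocks (layers, activations, an optimizer and an objective); the layer sizes are illustrative only, not the model actually trained in this project:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(128, activation="relu", input_shape=(48 * 48,)),
    Dense(7, activation="softmax"),  # e.g. seven emotion classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()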
6.2.4 HAAR CASCADE CLASSIFIER IN OPEN-CV:
The algorithm needs a lot of positive images (images of faces) and negative images (images without
faces) to train the classifier. Then we need to extract features from them. These features are just like
our convolutional kernels. Each feature is a single value obtained by subtracting the sum of pixels
under the white rectangle from the sum of pixels under the black rectangle.
Now all possible sizes and locations of each kernel are used to calculate plenty of features. (Just
imagine how much computation it needs: even a 24x24 window results in over 160,000 features.) For
each feature calculation, we need to find the sum of the pixels under the white and black rectangles. To
solve this, they introduced integral images. An integral image simplifies the calculation of the sum
of pixels, however large the number of pixels may be, to an operation involving just four pixels. Nice,
isn't it? It makes things super-fast.
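A small NumPy sketch of the integral-image trick described above: the sum over any rectangle comes from just four lookups in the cumulative table.

import numpy as np

img = np.random.randint(0, 256, size=(24, 24))
ii = img.cumsum(axis=0).cumsum(axis=1)  # the integral image

def rect_sum(ii, r1, c1, r2, c2):
    # Sum of img[r1:r2+1, c1:c2+1] from four integral-image lookups.
    total = ii[r2, c2]
    if r1 > 0:
        total -= ii[r1 - 1, c2]
    if c1 > 0:
        total -= ii[r2, c1 - 1]
    if r1 > 0 and c1 > 0:
        total += ii[r1 - 1, c1 - 1]
    return total

print(rect_sum(ii, 2, 2, 10, 10) == img[2:11, 2:11].sum())  # True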
But among all these features we calculate, most are irrelevant. For example, consider the
image below. The top row shows two good features. The first feature selected seems to focus on the
property that the region of the eyes is often darker than the region of the nose and cheeks. The second
feature selected relies on the property that the eyes are darker than the bridge of the nose. But the same
windows applied to the cheeks or any other region are irrelevant. So how do we select the best features
out of 160,000+ features? This is achieved by AdaBoost.
For this, we apply each and every feature on all the training images. For each feature, we find the best
threshold which will classify the faces as positive or negative. Obviously, there will be errors or
misclassifications. We select the features with the minimum error rate, which means they are the
features that best classify the face and non-face images. (The process is not as simple as this. Each
image is given an equal weight in the beginning. After each classification, the weights of misclassified
images are increased. Then the same process is repeated: new error rates are calculated, along with new
weights. The process continues until the required accuracy or error rate is achieved, or the required
number of features is found.)
The final classifier is a weighted sum of these weak classifiers. They are called weak because each
alone cannot classify the image, but together with the others they form a strong classifier. The paper
says even 200 features provide detection with 95% accuracy. Their final setup had around 6000
features. (Imagine a reduction from 160,000+ features to 6000 features. That is a big gain.)
In an image, most of the image region is non-face region. So it is a better idea to have a simple method
to check whether a window is a face region; if it is not, discard it in a single shot and don't process it
again. Instead, focus on regions where there can be a face. This way, we spend more time checking
possible face regions.
For this they introduced the concept of a Cascade of Classifiers. Instead of applying all 6000 features
on a window, the features are grouped into different stages of classifiers and applied one by one.
(Normally the first few stages contain very few features.) If a window fails the first stage, discard it;
we don't consider the remaining features on it. If it passes, apply the second stage of features and
continue the process. A window which passes all stages is a face region. Haar-like features are digital
image features used in object recognition. They owe their name to their intuitive similarity with Haar
wavelets and were used in the first real-time face detector. Historically, working with only image
intensities (i.e., the RGB pixel values at each and every pixel of the image) made the task of feature
calculation computationally expensive. A publication by Papageorgiou et al. discussed working with
an alternative feature set based on Haar wavelets instead of the usual image intensities. Paul Viola and
Michael Jones adapted the idea of using Haar wavelets and developed the so-called Haar-like features.
A Haar-like feature considers adjacent rectangular regions at a specific location in a detection window,
sums up the pixel intensities in each region and calculates the difference between these sums. This
difference is then used to categorize subsections of an image. For example, with a human face, it is a
common observation that among all faces the region of the eyes is darker than the region of the cheeks.
Therefore, a common Haar feature for face detection is a set of two adjacent rectangles that lie above
the eye and the cheek region. The position of these rectangles is defined relative to a detection window
that acts like a bounding box to the target object (the face in this case).
In the detection phase of the Viola–Jones object detection framework, a window of the target size is
moved over the input image, and for each subsection of the image the Haar-like feature is calculated.
This difference is then compared to a learned threshold that separates non-objects from objects.
Because such a Haar-like feature is only a weak learner or classifier (its detection quality is slightly
better than random guessing) a large number of Haar-like features are necessary to describe an object
with sufficient accuracy.
In the Viola–Jones object detection framework, the Haar-like features are therefore organized in
something called a classifier cascade to form a strong learner or classifier.
The key advantage of a Haar-like feature over most other features is its calculation speed. Due to the
use of integral images, a Haar-like feature of any size can be calculated in constant time (approximately
60 microprocessor instructions for a 2-rectangle feature).
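A minimal sketch of cascade-based face detection with OpenCV, assuming an image faces.jpg exists; the cascade file itself ships with OpenCV:

import cv2

cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

img = cv2.imread("faces.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Slide the detection window over the image at several scales.
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
cv2.imwrite("faces_out.jpg", img)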
CHAPTER 7 – RESULTS
PART 7.1 OBJECT DETECTION USING MACHINE LEARNING
7.1.1 RESULTS
This is a snapshot of video data which we fed to the algorithm, expecting it to detect and identify
objects in the image and label them according to the class assigned to each.
PART 7.2 FACIAL EMOTION RECOGNITION USING MACHINE LEARNING
7.2.2 RESULTS
This is a snapshot taken before detection. After we feed that image into our model, we can see that our
algorithm predicts the emotion in the image and shows it by labelling a square box with text.
II. Second Input Result
As we can see, our project can detect the emotions of the subject in an image accurately.
CHAPTER 8 – CONCLUSION
8.1 CONCLUSION:
In this case, when the model predicts incorrectly, the correct label is often the second most likely
emotion. The facial expression recognition system presented in this research work contributes a
resilient face recognition model based on the mapping of behavioral characteristics onto
physiological biometric characteristics. The physiological characteristics of the human face with
relevance to various expressions such as happiness, sadness, fear, anger, surprise and disgust are
associated with geometrical structures, which are stored as the base matching template for the
recognition system. The behavioral aspect of this system relates the attitude behind different
expressions as a property base. The property bases are divided into exposed and hidden categories in
genetic algorithmic genes. The gene training set evaluates the expressional uniqueness of individual
faces and provides a resilient expression recognition model in the field of biometric security.
The design of a novel asymmetric cryptosystem based on biometrics, with features like hierarchical
group security, eliminates the use of passwords and smart cards, as opposed to earlier cryptosystems. It
requires special hardware support, like all other biometric systems. This research work promises a
new direction of research in the field of asymmetric biometric cryptosystems, which is highly desirable
in order to get rid of passwords and smart cards completely. Experimental analysis and study show that
the hierarchical security structures are effective in geometric shape identification for physiological
traits.
It is important to note that there is no specific formula for building a neural network that is guaranteed
to work well. Different problems require different network architectures and a lot of trial and
error to produce desirable validation accuracy. This is the reason why neural nets are often perceived
as "black box" algorithms.
In this project we achieved an accuracy of almost 70%, which is not bad at all compared with previous
models. However, we still need to improve in specific areas: due to the lack of a highly configured
system we could not go deeper into dense neural networks, as the system becomes very slow, and we
will try to improve in these areas in future.
We would also like to train more databases into the system to make the model more and more accurate,
but again resources become a hindrance on the path, and we also need to improve in several areas in
future to resolve the errors and improve the accuracy.
Having examined techniques to cope with expression variation, in future we may investigate the face
classification problem and the optimal fusion of color and depth information in more depth. Further
study can be laid down in the direction of matching alleles of genes to the geometric factors of facial
expressions. The genetic property evolution framework for the facial expression system can be studied
to suit the requirements of different security models such as criminal detection, governmental
confidential security breaches, etc.
CHAPTER 9 - REFERENCES
[1] A. Mollahosseini, D. Chan and M. H. Mahoor. Going deeper in facial expression recognition using
deep neural networks. IEEE Winter Conference on Applications of Computer Vision, 2016.
[2] B.-K. Kim, J. Roh, S.-Y. Dong and S.-Y. Lee. Hierarchical committee of deep convolutional neural
networks for robust facial expression recognition. Journal on Multimodal User Interfaces, pages 1–17,
2015.
[4] P. Ekman and W. V. Friesen. Emotional facial action coding system. Unpublished manuscript,
University of California at San Francisco, 1983.
[6] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing
internal covariate shift. JMLR Proceedings, 2015.
[8] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama and T. Darrell.
Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[10] M. Sokolova, N. Japkowicz and S. Szpakowicz. Beyond accuracy, F-score and ROC: a family of
discriminant measures for performance evaluation. In Australasian Joint Conference on Artificial
Intelligence, pages 1015–1021. Springer Berlin Heidelberg, December 2006.
[11] P. Michel and R. El Kaliouby. Facial expression recognition using support vector machines. In The
10th International Conference on Human-Computer Interaction, Crete, Greece, 2005.
[12] P. Michel and R. El Kaliouby. Real time facial expression recognition in video using support
vector machines. In Proceedings of the 5th International Conference on Multimodal Interfaces, pages
258–264. ACM, November 2003.
LIST OF PUBLICATION:
[1] Ankita Das, Aditi Choubey, Lipika Mandal, “Object Detection Using Machine Learning”, in National
Conference on Computational & Characterization Techniques in Engineering and Sciences (CCTES
2023), February 27-28, 2023.