CERTIFICATION
Signature ………………….
Signature ………………….
Date ……………………….
Date ……………………….
DECLARATION
I, FOKAM NYAPTSE LILIAN GERAUD, declare that I am the sole author of this
thesis. I authorize the UNIVERSITY INSTITUTE OF THE TROPICS to lend this thesis to
other institutions or individuals for the purpose of scholarly research. I understand the nature
of plagiarism and am fully aware of the institution’s policy on it. I declare that this report is my
original work and contains neither material previously published by another person nor
material that has been accepted for the award of a certificate in any other institution, except
where due acknowledgments have been made.
Date …………………………
Signature…………………….
DEDICATION
TO MY FAMILY
ACKNOWLEDGEMENTS
I would like, on a special note, to acknowledge all those who in one way or another assisted me. My profound gratitude goes to the following people for their various contributions to the realization of this report:
To my supervisor ATSOPMENE TANGO VANETTE ELEONORE, for the patience, moral support, guidance and encouragement shown to me, and for always being available whenever I needed help during the writing of this report;
To the Proprietor of IUGET, Hon. Dr. JOSEPH NGUEPI, for having created this institution and, more particularly, the Software Engineering department;
To the coordinator of South Polytech, Mr. TSAKOU KOUMETIO BILLY, for all his efforts to ensure that lectures took place and that I was pushed to surpass myself;
To all my lecturers at IUGET, especially Eng. Nyambi Blaise Prosper Mbuh, for his mentorship and numerous encouragements;
Special thanks also go to my parents, Mr. and Mrs. FOKAM, for their financial, material and moral support, and for always being by my side;
To Mr. and Mrs. MINGUE, for their guidance, their moral and financial support, and particularly for their warm welcome into their home;
To all my brothers and sisters, who have always been there to guide me during the writing of this report;
To all my classmates, who were always there when I had difficulty understanding some subjects.
TABLE OF CONTENTS
CERTIFICATION ..................................................................................................................... I
DECLARATION ...................................................................................................................... II
ACKNOWLEDGEMENTS .....................................................................................................IV
ABSTRACT........................................................................................................................... XII
1.8.4 Overfitting ................................................................................................................. 6
3.2.1 Materials ................................................................................................................. 42
REFERENCES ........................................................................................................................ 70
TABLE OF FIGURES
Figure 1: Generators of facial expressions and emotion (Trovato, Tatsuhiro , Nobutsuna , Kenji
, & Atsuo , 2012)...................................................................................................................... 12
Figure 2 Facial Muscles and Emotions (Zizhao , et al., 22) .................................................... 14
Figure 3 Action unit and associated muscle (Dorante, Miguel, et al., 2020)........................... 15
Figure 4 Examples of some action units extracted from Cohn and Kanade’s database .......... 18
Figure 5 Machine learning working......................................................................................... 23
Figure 6 Deep Learning working ............................................................................................. 24
Figure 7 Different types of facial expressions ......................................................................... 26
Figure 8 Different steps on the SDLC ..................................................................................... 28
Figure 9 Illustration of the waterfall model ............................................................................. 30
Figure 10 Illustration of the agile(iterative) model .................................................................. 30
Figure 11 Illustration of the spiral model ................................................................................ 31
Figure 12 Illustration of the V-Shape model ........................................................................... 31
Figure 13 Agile model overview ............................................................................................. 33
Figure 14 MERISE model ....................................................................................................... 35
Figure 15 Class diagram of the RTMRS.................................................................................. 37
Figure 16 Activity Diagram of RTMRS Showing the login process, user and admin process ...... 38
Figure 17 Activity Diagram of RTMRS showing the chatbot interaction ............................... 39
Figure 18 Use case diagram of RTMRS .................................................................................. 40
Figure 19 Sequence diagram for the admin functions ............................................................. 41
Figure 20 Sequence diagram showing how the system recognizes mood ............................... 42
Figure 21 Downloading dataset from Kaggle.com website ..................................................... 43
Figure 22 Hp Elitebook ............................................................................................................ 44
Figure 23 Intel Core Processor ................................................................................................ 44
Figure 24 1 TB Toshiba HDD ................................................................................... 44
Figure 25 Interface Design Using QtDesigner......................................................................... 45
Figure 26 Training Process ...................................................................................................... 56
Figure 27 The program File Organizational Structure ............................................................. 57
Figure 28 Registration interface............................................................................................... 58
Figure 29 Login interface......................................................................................................... 59
Figure 30 Login Interface showing wrong login credentials ................................................... 59
Figure 31 Login interface showing correct credentials but the server is not online ............... 60
Figure 32 Main user interface .................................................................................................. 61
Figure 33 Happy mood displayed ............................................................................................ 61
Figure 34 Sad mood displayed................................................................................................. 62
Figure 35 Neutral mood displayed........................................................................................... 62
Figure 36 Neutral mood displayed........................................................................................... 63
Figure 37 Surprised mood displayed ....................................................................................... 63
Figure 38 Admin Dashboard.................................................................................................... 64
Figure 39 The manage user interface ....................................................................................... 64
LIST OF ABBREVIATIONS
ECG Electrocardiogram
EEG Electroencephalogram
FACS Facial Action Coding System
Gb Gigabyte
GUI Graphical User Interface
HDD Hard Disk Drive
HOG Histogram of Oriented Gradients
HR Human Resource
Hz Hertz
IDE Integrated Development Environment
RNNs Recurrent Neural Networks
RTMRS Real Time Mood Recognition System.
SADT Structured Analysis and Design Technique
ABSTRACT
In daily life, the role of non-verbal communication is greatly significant, and our daily
activities generally shape our mood during the day. Human emotion detection from images is
one of the most powerful and challenging research tasks in social communication. This work
presents the development of a facial mood recognition system designed to detect emotions
from facial expressions using computerized algorithms. The system is intended to assist
enterprises in detecting the daily mood of users to ensure that daily output is not affected. To
achieve this, we developed a system that detects a person's mood in real time by capturing
images of the face with a camera, applying a face detection algorithm, extracting and
processing features, and finally applying a mood detection algorithm. The system uses the
FER2013 dataset downloaded from Kaggle, which contains 28,709 images for training and
7,178 images for testing. It employs the Haar Cascade classifier for face detection; Qt Designer
is used for form design, PyCharm serves as the IDE, and Python is the programming language.
A convolutional neural network (CNN) processes the features extracted from the facial
expression and determines the mood of the individual; the trained model achieved an accuracy
of 58.7%. The system also includes a database to store user information and training data for
the mood detector. The admin can manage the user database and train the mood detector model
through the GUI. This system provides an effective means of detecting users' mood in order to
ensure a positive work environment and maintain high levels of productivity. It also includes
a chatbot engine that allows users to interact with the system and receive personalized
responses based on their mood.
Keywords: CNN, Python, PyCharm, GUI, Mood detection, Face detection, Dataset, Kaggle,
Emotions, Facial expression.
RESUME
Dans la vie quotidienne, le rôle de la communication non verbale est très important. De
plus, les activités quotidiennes sont généralement à la base de notre humeur au cours de cette
journée. La détection des émotions humaines à partir d'images est l'une des tâches de recherche
les plus puissantes et les plus difficiles en communication sociale. Le développement d'un
système de reconnaissance de l'humeur faciale conçu pour détecter les émotions à partir des
expressions faciales à l'aide d'algorithmes informatisés. Ce système de reconnaissance de
l'humeur faciale est conçu pour aider les entreprises à détecter l'humeur quotidienne des
employés afin de s'assurer que la production quotidienne n'est pas affectée. Pour y parvenir,
nous avons développé un système qui sera capable de détecter l'humeur des humains en temps
réel en capturant des images du visage à l'aide d'une caméra, puis en appliquant les algorithmes
de détection de visage, l'extraction et le traitement des caractéristiques et plus tard l'algorithme
de détection d'humeur. Le système utilise l'ensemble de données fer2013, qui a été téléchargé
à partir de Kaggle, l'ensemble de données qui contient 28709 images pour la formation et 7178
images pour les tests, utilise la bibliothèque Haar Cascade pour la détection de visage, Qt-
designer est utilisé pour la conception de formulaires, tandis que PyCharm sert d'IDE, et Python
est utilisé comme langage de programmation. Le système utilise un réseau neuronal convolutif
(CNN) pour la détection de l'humeur. L'algorithme CNN traite les caractéristiques extraites de
l'expression faciale et détermine l'humeur de l'individu. Nous avons formé notre système et
avons obtenu une précision de 58,7 % sur notre modèle formé. Le système comprend
également une base de données pour stocker des informations utilisateur et des données de
formation pour le détecteur d'humeur. L'administrateur peut gérer la base de données des
utilisateurs et former le modèle de détecteur d'humeur via l'interface graphique. Ce système
fournit un moyen efficace de détection d'humeur des employés pour assurer un environnement
de travail positif et maintenir des niveaux élevés de productivité. Le système comprend
également un moteur de Chatbot qui permet aux employés d'interagir avec le système et de
recevoir des réponses personnalisées en fonction de leur humeur.
Mots-clés : CNN, Python, PyCharm, GUI, Détection d'humeur, Détection de visage, Ensemble
de données, Émotions, Expression faciale.
CHAPTER ONE
GENERAL INTRODUCTION
In the contemporary digital landscape, the interaction between humans and technology has
reached unprecedented levels of sophistication. With the rapid advancement of artificial
intelligence (AI) and machine learning technologies, there is an emerging demand for systems
that go beyond simple task execution to understanding and responding to human emotions.
This has given rise to mood recognition systems, which are designed to detect and interpret
human emotions through the analysis of facial expressions, voice tones, and other behavioral
and physiological cues.
Mood recognition systems hold immense potential across a variety of domains. In mental
health, these systems can offer new ways to monitor emotional well-being, providing valuable
insights that aid in the early detection and management of mental health disorders. For instance,
mood recognition technology can help in identifying symptoms of depression or anxiety,
enabling timely intervention and personalized treatment. This is particularly significant in a
world where mental health issues are increasingly prevalent and often go unrecognized until
they become severe.
In the realm of human-computer interaction, mood recognition can significantly enhance the
user experience. By enabling machines to understand and react to the emotional states of users,
these systems can create more intuitive, empathetic, and engaging interactions. This can be
applied in various contexts, from virtual assistants and customer service bots to educational
software and gaming. For example, an emotionally aware virtual assistant can adjust its
responses based on the user’s mood, providing a more tailored and supportive experience.
Social robotics is another area poised to benefit from mood recognition technology. Social
robots, designed to interact with humans in environments such as homes, hospitals, and care
facilities, can greatly improve their functionality and acceptance by being able to recognize
and respond to human emotions. This capability can enhance the effectiveness of robots in
providing companionship, support, and care, particularly for individuals who may be
emotionally or socially isolated, such as the elderly or those with certain disabilities.
Moreover, the implications of mood recognition extend into sectors such as marketing,
entertainment, and personal wellness. In marketing, understanding consumer emotions can lead
to more effective and personalized advertising strategies. In entertainment, mood recognition
can be used to adapt content dynamically, creating more immersive and engaging experiences.
For personal wellness, wearable devices equipped with mood recognition capabilities can
provide users with real-time feedback on their emotional states, encouraging mindfulness and
stress management.
Despite its promising applications, the development and deployment of mood recognition
systems are not without challenges. Ensuring the accuracy of emotion detection in diverse and
dynamic real-world conditions is a significant hurdle. Moreover, there are important ethical
considerations related to privacy, consent, and the potential misuse of emotional data that must
be carefully addressed. It is crucial to develop these technologies responsibly, with robust
safeguards to protect user data and clear ethical guidelines to govern their use.
In conclusion, mood recognition systems represent a significant leap forward in the capability
of machines to interact with humans in a meaningful and emotionally intelligent manner. By
leveraging advances in AI and machine learning, these systems have the potential to
revolutionize a wide array of fields, improving mental health care, enhancing user experiences,
and advancing the development of social robotics. As research and development in this area
continue to progress, it is essential to balance innovation with ethical responsibility, ensuring
that the benefits of mood recognition technology are realized in a way that respects and protects
individual rights and well-being.
Nowadays, facial expressions carry so many meanings that it is often very difficult to know
what a given expression means. Also, because of the frequent changes in people's emotions, it
is sometimes difficult to know what someone feels at a given moment. For this reason, a system
that can detect the mood of a person, and help understand why, is needed.
Facial expression recognition is a dynamic and rapidly advancing field with the potential to
significantly enhance the interaction between humans and machines. By addressing current
challenges and exploring new research directions, we can develop more accurate, reliable, and
ethically sound systems that leverage the power of facial expressions to understand and respond
to human emotions. This study aims to contribute to this exciting field by developing a robust
facial expression recognition system using state-of-the-art deep learning techniques, with
applications in mental health, human-computer interaction, and social robotics.
Facial expressions have different meanings; human intentions and emotions are expressed
through them, and they are an efficient and fundamental feature used in facial expression recognition systems.
In today's rapidly evolving digital era, the interaction between humans and machines is
becoming increasingly sophisticated and nuanced. The advancement of AI and machine
learning technologies has brought about significant transformations in various aspects of our
daily lives. As these technologies continue to progress, there is a growing and compelling need
for systems that can not only process data and execute tasks but also understand and respond
to human emotions in a meaningful and empathetic manner.
Mood recognition systems represent a frontier in this endeavor. These systems are designed to
detect and interpret human emotions by analyzing facial expressions, voice tones, and other
physiological and behavioral cues. The potential applications of mood recognition technology
are vast and varied, with the capability to revolutionize numerous domains, including mental
health, human-computer interaction, and social robotics.
Improve the model's accuracy under different conditions such as varying lighting and
angles.
“Can computers be used to detect emotions accurately, thereby offering good insight to the community?”
What are moods (emotions)?
How can human emotions be detected?
What are the effects of mood changes in the society?
To address these questions, starting with the first two, we will establish what emotions are,
the different types that exist, and the ones most frequently expressed. Considering the third
question, we will examine the different methods used in mood detection and recognition, and
their importance to the community. Finally, we will look at the different effects of mood
changes on society and how to remedy them.
The system can provide early detection of emotional disorders, aiding healthcare professionals
in diagnosis and monitoring.
The study serves as an educational tool for students and researchers in AI and psychology. It
raises public awareness about the benefits of emotional recognition technology.
Commercial Opportunities:
Reliable mood recognition technology opens new market opportunities in entertainment,
customer service, and wellness. It allows companies to differentiate their products with
advanced mood recognition features.
Social Robotics Enhancement:
Mood recognition can enable more human-like interactions in social robots, increasing their
effectiveness in care settings. It enhances assistive technologies for individuals with emotional
or communication difficulties.
Crime detection
Emotion recognition systems can be used to detect and reduce fraudulent insurance claims and
to deploy fraud prevention strategies.
Public safety
Fatigue during driving can be detected from the driver's mood, thereby preventing accidents.
Some lie detectors are also built on emotion recognition systems.
1.6 HYPOTHESIS
Based on the study and the realization of this project, some assumptions have been made;
Small marks on the face that have occurred due to accidents are ignored by the system and do
not affect the detection of the mood of a user.
The intensity of the expressions shown on the face affects the results.
If a user has a facial disability, the system does not take it into consideration before detecting
the mood; it detects the mood based solely on the physical appearance of the face.
We assume that the webcam used in the capture phase is of good quality.
For the realization of this project, three main steps were followed, namely:
Classification of the different states of emotion
For our project, the FER2013 dataset, downloaded from the Kaggle website, is used for the
training and testing of our model. The facial expressions are classified into 7 categories:
Anger, Disgust, Fear, Happy, Sad, Surprise, and Neutral.
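As an illustration only, the following minimal Python sketch shows one way the FER2013 images could be loaded for training and testing; it assumes the Kaggle archive has been extracted into train/ and test/ folders with one sub-folder per emotion, and the paths and batch size are placeholder choices rather than the project's exact configuration.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Assumed folder layout: data/train/<emotion>/*.jpg and data/test/<emotion>/*.jpg
train_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "data/train",              # placeholder path to the 28,709 training images
    target_size=(48, 48),      # FER2013 faces are 48x48 pixels
    color_mode="grayscale",
    class_mode="categorical",  # one-hot labels for the 7 emotion classes
    batch_size=64,
)
test_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "data/test",               # placeholder path to the 7,178 test images
    target_size=(48, 48),
    color_mode="grayscale",
    class_mode="categorical",
    batch_size=64,
)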
Many issues arise in the development and testing of a facial expression mood recognition
system; some of them are:
Emotion (Mood)
This is a strong feeling deriving from one's circumstances, such as relationships and interactions
with others. The basic ones are anger, fear, disgust, happiness, sadness, surprise, and neutral.
Facial expression
A facial expression is one or more positions or motions of the facial muscles beneath the skin
of the face.
Facial expression recognition
This is a technology used to analyze facial expressions in both videos and static images in
order to extract information about a person's emotional state.
Neural network
This is an interconnection of neurons that share resources and work together; an artificial
neural network is, correspondingly, an interconnection of artificial neurons.
Deep learning
Deep learning, a subset of machine learning, involves the use of neural networks with many
layers (hence "deep") to model complex patterns in large datasets. The power of deep learning
lies in its ability to automatically learn hierarchical features from raw data, enabling it to
achieve high levels of accuracy in tasks that were previously considered challenging for
computers.
Convolutional Neural Network (CNN)
This is a deep neural network used for the analysis of visual imagery. Instead of relying only
on general matrix multiplication, at least one of its layers uses a specific technique known as
convolution.
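To make the idea concrete, the sketch below defines a small CNN of the kind that could classify 48x48 grayscale FER2013 faces into the 7 emotion classes; the layer sizes and other choices are illustrative assumptions, not the exact architecture used in this project.

from tensorflow.keras import layers, models

def build_cnn(num_classes=7):
    # Illustrative architecture: two convolution/pooling stages followed by a dense classifier.
    model = models.Sequential([
        layers.Input(shape=(48, 48, 1)),              # one 48x48 grayscale face crop
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),                          # helps against overfitting
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model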
Machine Learning
Machine learning is a subset of artificial intelligence that involves the development of
algorithms and statistical models that enable computers to perform specific tasks without
explicit instructions. By using data to learn and improve from experiences, machine learning
algorithms can identify patterns, make decisions, and predict outcomes.
Python
Python is a high-level, interpreted programming language known for its readability and
simplicity. It supports multiple programming paradigms, including procedural, object-oriented,
and functional programming. Python is widely used in various fields such as web development,
data analysis, artificial intelligence, and scientific computing due to its extensive libraries and
community support.
Kaggle
This is widely used online platform for data science and machine learning competitions. It
provides datasets, code, and tools for users to collaborate, learn, and compete in solving
complex data-driven problems. Kaggle also offers educational resources and a community
where data scientists and machine learning practitioners can share their knowledge and
expertise.
FER 2013
FER 2013 (Facial Expression Recognition 2013) is a publicly available dataset commonly used
for training and evaluating facial expression recognition systems. It consists of grayscale
images of human faces displaying seven different emotions: anger, disgust, fear, happiness,
sadness, surprise, and neutral.
Artificial Neurons
Artificial neurons, inspired by biological neurons, are the fundamental units of artificial neural
networks. Each artificial neuron receives one or more inputs, processes them using a weighted
sum and an activation function, and produces an output. These neurons are interconnected to
form layers in a neural network, enabling the system to learn complex patterns and perform
tasks such as classification, regression, and pattern recognition.
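As a small illustration of this definition, the snippet below computes the output of a single artificial neuron as a sigmoid activation applied to the weighted sum of its inputs; the inputs, weights and bias are arbitrary example values.

import math

def neuron_output(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, passed through a sigmoid activation.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

print(neuron_output([0.5, 0.2], [0.8, -0.4], 0.1))  # example output of one neuron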
In order to attain the above listed objectives, we have divided this work into five chapters:
Chapter ONE, entitled "GENERAL INTRODUCTION ", presents the background, the
problem statement, research question, research hypothesis, objectives, significance,
scope and delimitation of the study, and definition of some keywords and terms.
Chapter TWO, entitled "LITERATURE REVIEW", deals with the literature review, a
general description of mood recognition systems, and previous work.
Chapter THREE, entitled "RESEARCH METHODOLOGY AND MATERIALS
USED", will deal with the methodology which will be used in the development of the
software. It presents the analysis of needs, systems design, development tools and
materials used for the application.
Chapter FOUR, entitled "RESULTS AND DISCUSSIONS", will deal with the
practical issues of development and implementation of the software. It presents the
results obtained, the comments on screen captures, and simulations of the application
conceived.
Chapter FIVE, entitled "CONCLUSION AND RECOMMENDATION",
presents the summary of findings, recommendations, and suggestions for further studies.
CHAPTER TWO
LITERATURE REVIEW
Mood recognition, also known as emotion recognition, involves capturing and identifying
human emotions from facial expressions, voice tones, body language, and physiological
signals. This technology is increasingly used in fields like human-computer interaction, mental
health monitoring, and customer service.
Previous studies exist in the fields of emotion detection, biofield analysis, and sound
therapy. Over the past decades, much research has been conducted to acquire a better
understanding of human emotions and develop improved applications in different domains.
Nowadays, dedicated emotion detection is a major area of research that aims to describe the
interaction between humans and robots or machines.
concluded that better outcomes can be achieved by using multimodal techniques; therefore,
further research is needed in this domain.
In addition, (Seyeditabari, Narges, & Wlodek, 2018) proposed that the analysis of
emotions via text for emotion detection is a multi-class classification problem that needs
advanced artificial intelligence methods. Supervised machine learning approaches have been
used for sentiment analysis, since an abundance of data are now generated on social media
platforms. The authors also concluded that there is much inefficiency in the current emotion
classifiers due to the lack of quality data and the complex nature of emotional expression, and
therefore, the development of improved hybrid approaches is critical.
Moreover, (Gosai, Himangini J. , & Prof. Hardik S. , 2018) proposed a new approach
for emotion detection via textual data based on a natural language processing technique. The
authors suggest a context-based approach can yield better results than context-free approaches
and that semantic and syntactic data help to improve the prediction accuracy.
(Wagh p. & K. , 2018) reviewed works on EEG signals for brain–computer interaction,
which can be used to interact with the world and develop applications in the biomedical domain.
In addition, this work introduced a methodology to detect emotions via EEG signals that
involves complex computing, including signal pre-processing, feature extraction, and signal
classifications. However, this proposed technique requires wearable sensors and involves many
uncertainties. In this same domain of emotion recognition,
(Dzedzickis, Artūras, & Vytautas, 2020) analyzed various technical articles and
scientific papers on contact-based and contactless methods used for human emotion detection.
The analysis was based on each emotion according to its detected intensity and usefulness for
emotion classification. The authors also elaborated on the emotion sensors along with their
area of application, expected outcomes, and limitations associated with them. The authors
concluded the article by providing a two-step procedure for detected emotion classification.
Step one includes the selection of parameters and methods, whereas step two regards the
selection of sensors. However, such techniques (EEG, ECG, Heart Rate, Respiration Rate, etc.)
still suffer from many limitations, and the measurement of uncertainties, the lack of clear
specifications for recognizing a particular emotion, etc., must be considered thoroughly for
future applications using IoT and big data techniques. The use of multimodal techniques
integrated with machine learning seems to be powerful for future applications.
2.3 REVIEW BY CONCEPTS
Figure 1: Generators of facial expressions and emotion (Trovato, Tatsuhiro , Nobutsuna , Kenji , & Atsuo , 2012)
person's inner emotional state without observing their facial expressions. Thus, facial
expressions are crucial for nonverbal communication
From a physiological perspective, facial expressions result from the activity of facial muscles,
also known as mimetic muscles or muscles of facial expression. These muscles are part of the
head's muscle group, which also includes scalp muscles, chewing muscles responsible for jaw
movement, and tongue muscles. The facial muscles are innervated by the facial nerve, which
branches throughout the face. When activated, this nerve causes muscle contractions, leading
to the formation of facial expressions.
Emotions are the experience of a person's attitude toward the satisfaction of objective things
and are critical to an individual's mental health and social behavior. Emotions consist of three
components: subjective experience, external performance, and physiological arousal. The
external performance of emotions is often reflected by facial expression, which is an important
tool for expressing and recognizing emotions. Expressing and recognizing facial expressions
are crucial skills for human social interaction. It has been demonstrated by much research that
inferences of emotion from facial expressions are based on facial movement cues, i.e., muscle
movements of the face.
Based on the knowledge of facial muscle movements, researchers usually described facial
muscle movement objectively by creating facial coding systems, including the Facial Action
Coding System (FACS), Face Animation Parameters, the Maximally Discriminative Facial
Movement Coding System, the Monadic Phases Coding System and the Facial Expression
Coding System. Depending upon the instantaneous changes in facial appearance produced by
muscle activity, the majority of these facial coding systems divide facial expressions into
different action units (AUs), which can be used to perform quantitative analysis of facial
expressions (Zizhao et al., 22).
enough to fit the tens of millions of learned parameters in deep learning networks. (Zizhao , et
al., 22)
Based on the principle of universality, (Ekman & Friesen, 1978) developed methods to measure
facial behavior, notably creating the widely used Facial Action Coding System (FACS). This
system uses approximately forty independent anatomical features to define a comprehensive
taxonomy of all facial expressions. FACS is the most popular standard for systematically
classifying the physical expressions of facial emotions and is utilized by psychologists and
computer graphics artists alike.
FACS defines 46 Action Units (AUs), each representing specific contractions or relaxations of
one or more facial muscles. These AUs can be combined in various ways to produce different
facial expressions. The original philosophy of FACS was to train experts to recognize and
interpret these Action Units. Today, the system is also used to automate the recognition of AUs
and the corresponding expressions, as well as for graphical simulations of faces.
"FACS experts" analyze observed expressions by breaking them down into their constituent
Action Units. For any given facial expression, the result is a list of AUs that produced it. The
system also allows for the recording of duration, intensity, and asymmetry of expressions,
providing a precise description of facial movements.
Ekman formalized this expertise through the FACS Manual (Ekman & Friesen, 1978), a detailed
and technical guide explaining how to categorize facial behaviors based on the muscles
involved. In other words, it explains how muscle actions relate to facial appearances. The
FACS Manual enhances awareness and sensitivity to subtle facial behaviors, which can be
valuable for psychotherapists, interviewers, and other professionals who require deep insights
into interpersonal communication.
Figure 3 Action unit and associated muscle (Dorante, Miguel, et al., 2020)
Facial expressions result from coordinated muscle movements in the face, and they play a
crucial role in conveying emotions. The study of these expressions involves understanding the
specific muscle actions that produce them. The Facial Action Coding System (FACS),
developed by Ekman and Friesen, categorizes these muscle actions into Action Units (AUs),
allowing for a systematic analysis of facial expressions.
Anger
This is a powerful emotional response that can lead to aggressive behavior, arising from various
sources, including frustration, physical threats, verbal threats, false accusations, and violations
of personal values. Physiologically, anger increases blood pressure and muscle tension. Facial
cues associated with anger include lowered and tightened eyebrows with vertical wrinkles
between them, tight and straight eyelids, narrowed eyes with pupils focused on the source of
anger, and lips that are either tightly closed or slightly open, indicating preparation for shouting.
These facial characteristics prepare the body for potential physical or verbal confrontation.
Disgust
It is a negative emotion often triggered by unpleasant smells, tastes, or sights. Unlike other
emotions, what provokes disgust can vary greatly based on cultural and personal factors. In
extreme cases, disgust can cause vomiting. The most prominent facial cues of disgust include
a lifted upper lip, a wrinkled nose, raised cheeks, and lifted but not closed eyelids with wrinkles
under the eyes. The eyebrows are usually lowered. These facial expressions are responses to
sensory experiences that are perceived as repulsive.
Fear
It is triggered by dangerous or stressful situations and can be related to future events, such as
fear of loss or violence. Fear prepares the body to either escape or defend itself, causing
increases in heart rate and blood pressure, and eyes open wide to absorb more light. In extreme
cases, fear can cause paralysis. Facial cues of fear include raised and drawn inward eyebrows,
an open mouth with visible teeth, lifted cheeks, and wrinkles appearing outside the corners of
the eyes. These responses prepare the body for quick reactions to perceived threats.
Sadness
It occurs in response to loss or pain, typically manifesting as a calm emotion often accompanied
by tears. When experiencing sadness, facial muscles lose tension, resulting in facial cues such
as lowered inner parts of the eyebrows and retracted, quivering lip corners. These expressions
reflect inner emotional pain and a state of withdrawal.
Surprise
It is a brief, sudden emotion triggered by unexpected situations, leading to either positive or
negative reactions depending on the context. It is characterized by a rapid onset and short
duration. The most prominent facial cues of surprise include raised eyebrows, causing wrinkles
on the forehead, wide-open eyes, and an open mouth often shaped in an "O." These signals
indicate an immediate reaction to unexpected events, often leading to further emotional
responses such as happiness or sadness.
Happiness
This is typically expressed through smiling, where the corners of the mouth are turned up. The
cheeks may be raised, and crow's feet wrinkles can appear around the eyes. Happiness is often
recognized by the upturned mouth corners, raised cheeks, and the appearance of crow’s feet
around the eyes. This expression signals contentment, pleasure, or joy and is easily recognized
due to its universally positive nature.
Each primary facial expression is associated with specific AUs. For example:
Happiness: AU6 (cheek raiser) and AU12 (lip corner puller)
Sadness: AU1 (inner brow raiser) and AU15 (lip corner depressor)
Anger: AU4 (brow lowerer) and AU23 (lip tightener)
Fear: AU1+2 (inner and outer brow raiser) and AU5 (upper lid raiser)
Surprise: AU1+2 (inner and outer brow raiser) and AU26 (jaw drop)
Disgust: AU9 (nose wrinkler) and AU10 (upper lip raiser)
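For implementation purposes, these prototypical combinations can be expressed as a simple lookup structure, as in the hedged Python sketch below; the dictionary mirrors the list above and is only an illustrative convenience, not part of the FACS standard itself.

# Prototypical Action Unit combinations for the six basic emotions (after Ekman and Friesen's FACS).
EMOTION_AUS = {
    "happiness": ["AU6", "AU12"],         # cheek raiser + lip corner puller
    "sadness":   ["AU1", "AU15"],         # inner brow raiser + lip corner depressor
    "anger":     ["AU4", "AU23"],         # brow lowerer + lip tightener
    "fear":      ["AU1", "AU2", "AU5"],   # inner and outer brow raiser + upper lid raiser
    "surprise":  ["AU1", "AU2", "AU26"],  # inner and outer brow raiser + jaw drop
    "disgust":   ["AU9", "AU10"],         # nose wrinkler + upper lip raiser
}

def emotions_containing(au):
    # Return the emotions whose prototypical pattern includes a given Action Unit.
    return [emotion for emotion, aus in EMOTION_AUS.items() if au in aus]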
Figure 4 Examples of some action units extracted from Cohn and Kanade’s database
from various data sources. By analyzing facial expressions, voice tones, body language, and
physiological signals, mood recognition systems can provide valuable insights and enhance
applications across multiple domains. As technology advances, the accuracy and reliability of
mood recognition will continue to improve, enabling more sophisticated and empathetic
human-computer interactions.
Face recognition
Feature extraction
Mood classification
Face detection
For several years, face detection in images and videos has become a reality thanks to
advancements in learning algorithms. This technology is now a common feature in our daily
lives. It's prevalent in cameras and smartphones, and social networks like Facebook use it to
identify faces in photos for tagging. Similarly, Snapchat's latest updates utilize the front camera
of smartphones to detect faces and apply various animations and deformations.
o Data Collection: Gathering data from relevant sources such as images, audio
recordings, video footage, and physiological sensors.
o Preprocessing: Preparing the data for analysis by cleaning, normalizing, and
augmenting it. For facial expressions, this may involve detecting and aligning faces
in images; for audio, it may involve noise reduction and feature extraction.
We can categorize face detection algorithms into two types: absolute detectors and
differential detectors.
-Absolute detectors
Also known as frame-by-frame detectors, determine the position of a face in each frame of a
video independently of previous frames. The main advantage of these detectors is their ability
to be easily parallelized across multiple frames, allowing them to quickly react to changes in
the number of faces in an image and avoid "drift," which refers to losing track of a face's
position. However, these detectors do not use temporal information, which could make them
faster and more precise. A foundational technique used by most absolute detectors is the Viola-
Jones algorithm (Viola and Jones, 2001), which trains a classifier to differentiate faces from
other objects using images of various face sizes and other objects, requiring a higher number
of non-face objects for optimal efficiency.
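Since this project relies on OpenCV's Haar cascade classifier, which follows the Viola-Jones approach, a minimal frame-by-frame detection sketch is given below; the webcam index, window name and exit key are illustrative assumptions.

import cv2

# Load the frontal-face Haar cascade shipped with OpenCV.
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)  # 0 = default webcam (assumption)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Each frame is processed independently of the previous ones (absolute detection).
    faces = cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("Face detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to stop
        break
cap.release()
cv2.destroyAllWindows()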
-Differential detectors
Also known as face trackers, use the previous position of a face to determine its current
position. These detectors rely on an absolute detector for initialization in the first frame and
track the face's position in subsequent frames. They are highly accurate and fast, but the
accumulation of small positional errors over time can lead to "drift" since the detector does not
reset once initialized. A well-known example of a differential detector is the Active
Appearance Model (AAM) algorithm. AAM represents a face as a triangular mesh-like model
with about 70 points, built through learning on various faces with different expressions. The
algorithm matches this model to any face in an image by warping it to fit the face's parameters.
After initializing the positions of the face parameters in the first frame (manually or using an
absolute detector), the algorithm tracks these parameters in subsequent frames using pre-
calculated deformations. AAM not only detects faces but also extracts their features.
Feature Extraction
Here, relevant features that can be used to identify emotions are extracted from the data. Common
features include facial landmarks, Mel-Frequency Cepstral Coefficients (MFCCs) for audio,
and key points for body posture.
Mood Classification
Model Training
Training machine learning or deep learning models on labeled datasets to recognize different
emotions. Techniques often involve CNNs for image data, RNNs or LSTMs for audio data,
and hybrid models for multi-modal data.
The trained model is then applied to new data to predict emotions; the results are analyzed and
the model refined for better accuracy.
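Putting these steps together, a hedged sketch of the training stage is shown below; it reuses the illustrative build_cnn() function and data generators from the earlier sketches, and the epoch count and file name are arbitrary assumptions.

# Assumes build_cnn(), train_gen and test_gen from the earlier sketches.
model = build_cnn(num_classes=7)
history = model.fit(train_gen, validation_data=test_gen, epochs=30)
model.save("mood_cnn.h5")  # hypothetical file name for the saved model
print("Best validation accuracy:", max(history.history["val_accuracy"]))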
During training, the weights are iteratively readjusted to minimize the error. As the training database is
repeatedly presented to the network, the error decreases until it stabilizes at a zero or constant
level, at which point learning should be stopped. After training, the network is ready for use.
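In gradient-descent terms, the readjustment described above follows the standard weight-update rule (written here in generic notation, where η is the learning rate and E the error on the training set):

w_ij ← w_ij − η · ∂E/∂w_ij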
Naive Bayes Algorithm:
This algorithm uses a probabilistic model to determine the class of an instance. It calculates the
probability that the instance belongs to a particular class by multiplying the class probability
with the probability of each feature value for that class, based on pre-calculated mean values
and variances.
k-Nearest Neighbors (k-NN) Algorithm:
This algorithm classifies an instance based on the majority class among its "k" nearest
neighbors. During the training phase, it stores the data instances to optimize the neighbor
search. When classifying a new instance, it finds the "k" nearest neighbors based on Euclidean
distance and assigns the instance to the most common class among these neighbors.
Decision Tree Algorithm:
This algorithm classifies instances by using a series of threshold values for specific features.
During training, it evaluates the entropy of each feature and selects the one with the lowest
entropy as the next node in the tree. The process continues until all instances at a node belong
to the same class, at which point the node becomes a leaf. After training, the decision tree can
instantly classify new instances by traversing the tree based on feature thresholds.
Support Vector Machine (SVM) Algorithm:
This algorithm maximizes the margin between different classes using a minimum number of
support vectors. During training, it uses provided data to find a set of linear functions that best
separate the classes. By mapping the data into a higher-dimensional space, the algorithm finds
simple functions to separate the classes. After training, the boundaries of each class are defined,
allowing new instances to be classified immediately.
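To make the comparison concrete, the sketch below fits these four classifiers on the same feature matrix with scikit-learn; X and y stand for any extracted facial-feature vectors and their emotion labels, and random placeholder data is generated here only so the snippet runs.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# Placeholder feature matrix (200 samples, 10 features) and 7-class emotion labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.integers(0, 7, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

classifiers = {
    "Naive Bayes": GaussianNB(),
    "k-NN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, "accuracy:", clf.score(X_test, y_test))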
Artificial Intelligence is "The science and engineering of making intelligent machines,
especially intelligent computer programs", according to the father of AI, John McCarthy.
Artificial Intelligence is applied in a lot of present world activities as well as different sectors
and domains. Some major parts of them include:
Emotion recognition
Robotics
Speech and Facial Recognition
Object Detection
Medical Diagnosis, etc.
AI aims to create machines capable of reasoning and thinking logically, similar to
humans. To achieve this goal, two main approaches have been developed: Machine
Learning and Deep Learning.
Figure 5 Machine learning working
Figure 6 Deep Learning working
This helps in the conversion of different types of documents, such as scanned paper
documents or PDFs, into editable and searchable text.
Emotion recognition systems have seen significant advancements in recent years, leveraging
various methodologies and technologies.
Yu et al. (2013) introduced a semi-automatic method for creating a facial expression
database. They utilized a web search for emotion keywords, followed by the Viola-Jones face
detector to filter out non-facial images. A Binary Support Vector Machine (SVM) was then
employed to select relevant images, ultimately forming a diversified facial expression database.
The study proposed using Weber Local Descriptor (WLD) and histogram contextualization for
multi-resolution analysis of faces, demonstrating a fast and precise framework. However, the
WLD descriptor produced large parameter vectors, necessitating reduction, which made the
method somewhat slow and unreliable.
Carcagni et al. (2015) developed a system using Histogram of Oriented Gradients
(HOG) for feature extraction. The HOG descriptor extracted regions of interest from the image
through gradients. The system involved three phases: cropping and registering faces, applying
the HOG descriptor, and using an SVM classifier to estimate facial expressions. They tested
the system on various publicly available facial databases under different conditions and
validated it on continuous data streams online, achieving a 95.9% classification accuracy, 98%
accuracy, and a 98.9% recognition rate, processing seven images per second. However, the
algorithm was limited to a face pose range of -30 to 30 degrees.
Radlak and Smolka in 2016 combined two facial detection techniques: the method of
Zhu and Ramanan (2012) and the Dlib detector. Facial landmarks were extracted using Kazemi
and Sullivan's (2014) technique. If the Dlib detector failed, Zhu and Ramanan's method was
used. The detected face was normalized using affine transformation, and facial landmarks were
used to extract multiscale patches for feature vectors. The Random Frog algorithm was used
for feature extraction, and a multiclass SVM classifier achieved a recognition rate of 36.93%
on the validation database, with the best results for anger expressions, while disgust and fear
had insufficient results.
Pu and other researchers used action units (AUs) for facial expression recognition and
analysis in videos with two random forest algorithms. The first algorithm detected AUs, and
the second classified facial expressions. Facial movements were tracked using the Active
Appearance Model (AAM) and the Lucas-Kanade (LK) optical flow tracker. Their method
achieved an accuracy of 89.37% for the double classifier by random forest, a 100% AU
recognition rate, and a 96.38% expression recognition rate. However, the system was not tested
in real-time, and the extracted features' dimensionality was high, requiring discrimination.
2.5 EMPIRICAL REVIEW
2.5.1 Emotion Recognition Based on Facial Expressions
The process of human communication is intricately tied to the variation of emotions. When
individuals experience fundamental emotions, their faces exhibit various expression patterns,
each with distinct features and scales of distribution. Facial expression recognition is an
essential component of human-computer interaction, enabling computers to interpret facial
expressions based on human cognition. The facial expression recognition process can be
divided into three key modules: face detection, feature extraction, and classification. Face
detection, a critical technology in face recognition (Zizhao et al., 22), has matured
significantly, allowing effective extraction of salient features from the original facial image.
The accurate classification of these features is crucial to the recognition outcome. For instance,
Gao and Ma (2020) extracted facial expression attributes from images to predict emotional
states based on changes in facial expressions.
to identify and interpret the speaker's emotional state. Ton-That and Cao (2019) utilized speech
signals for emotion recognition and achieved promising results using a voice emotion database.
However, individual differences can cause significant variability in speech signals,
necessitating the creation of a large phonetic database, which presents recognition challenges.
Additionally, noisy environments can degrade speech quality, thus affecting emotion
recognition accuracy, indicating that high-quality speech signal acquisition demands a
controlled environment.
Among the various methods of conveying human emotional information, including facial
expressions, voice, physiological signals, and gestures, facial expressions are the most direct
and are relatively easy to capture in most settings. Consequently, this study focuses on using
facial expressions to analyze human emotional states.
CHAPTER THREE
RESEARCH METHODOLOGY AND MATERIALS
3.1 RESEARCH DESIGN
In this crucial section, we outline the methods and procedures for collecting and analyzing
data. A sound research design ensures that the research questions are effectively addressed and
that the study is conducted in a structured and systematic way. Its main function is to create a
clear plan that guides the research process, so that the research is carried out in an organized,
efficient, and ethical way, leading to accurate results.
Set schedule
3. System Design
Develop high-level and detailed design documents
Create system models
Design database structures
4. Implementation (Coding)
Write code
Develop modules
Integrate components
5. Testing
Conduct unit testing
Perform integration testing
Execute system testing
Carry out acceptance testing
6. Deployment
Install software
Configure system
Migrate data
Train users
7. Maintenance
Fix bugs
Update software
Provide user support
Improve features
8. Documentation
Write user manuals
Create technical documentation
Develop system guides
Some software development methodologies include;
WATERFALL MODEL:
This is a linear, sequential approach where each phase must be completed before the next
begins.
Figure 9 Illustration of the waterfall model
AGILE METHODOLOGY:
It is an iterative and incremental approach in which each step can be performed by a person or
a group of people and, after a step is completed, the team can return to it for modifications. It
emphasizes flexibility, collaboration, and customer feedback. Common frameworks include
Scrum and Kanban.
SPIRAL MODEL:
This model combines iterative (agile) development with systematic aspects of the waterfall
model, focusing on risk assessment and reduction.
Figure 11 Illustration of the spiral model
3.1.1.1 Choice and Justification
In our project we decided to use the Agile methodology
Agile Software Development is an iterative and incremental approach to software
development that emphasizes the importance of delivering a working product quickly and
frequently. It involves close collaboration between the development team and the customer to
ensure that the product meets their needs and expectations.
for software development that prioritize individuals and interactions, working software,
customer collaboration, and responding to change.
The different phases in the Agile development cycle may not happen in succession; they
are flexible and always evolving, with many occurring in parallel. The Agile Software
Development process typically consists of the following steps:
Gathering of requirements
The customer’s requirements for the software are gathered and prioritized.
Planning
The development team creates a plan for delivering the software, including the features that
will be delivered in each iteration.
Development
The development team works to build the software, using frequent and rapid iterations.
Testing
The software is thoroughly tested to ensure that it meets the customer’s requirements and is of
high quality.
Deployment
Maintenance
The software is maintained to ensure that it continues to meet the customer’s needs and
expectations.
Disadvantages of Agile
While flexibility in Agile is usually a positive, it also comes with some trade-offs. It can be
hard to establish a solid delivery date, documentation can be neglected, or the final product can
be very different than originally intended. Here are some of the disadvantages of Agile:
Planning can be less concrete
Because project managers are often reprioritizing tasks, it’s possible some items scheduled for
delivery may not be complete in time. And, additional sprints may be added at any time in the
project, adding to the overall timeline.
Team must be knowledgeable
Agile teams are usually small, so team members must be highly skilled in a variety of areas
and understand Agile methodology.
Time commitment from developers
Active involvement and collaboration are required throughout the Agile process, which is more
time consuming than a traditional approach.
Documentation can be neglected
Agile prefers working deliverables over comprehensive documentation. While documentation
on its own does not lead to success, teams should find the right balance between documentation
and discussion.
3.1.2 DESIGN METHODOLOGY
In software development, many methodologies are used for designing software. Some of these
methodologies are MERISE, SADT (Structured Analysis and Design Technique), OMT
(Object-Modeling Technique) and UML (Unified Modelling Language). In the development
of our project, we will use the UML methodology.
UNIFIED MODELLING LANGUAGE (UML)
Unified Modelling Language (UML) is a modelling language used for specifying, visualizing,
constructing, and documenting the artefacts of software systems.
MERISE
MERISE is a French acronym, «Méthode de Recherche Informatique pour les Systèmes
d'Entreprise». It is a method of analysis, design and IT project management. Merise proceeds
to separate treatment of data and processes, where the data-oriented view is modelled in three
stages, from conceptual, logical through to physical. This methodology uses data dictionary to
represent data.
SADT is a methodology of functional analysis and the most known methodology of project
management. It does not only allow the project management developers to describe the tasks
of the project and their interaction, but also to describe the system which the project aims to
study, create or modify.
OMT is an object modeling language for software modeling and designing. The methodology
consists of building a model of the system and then adding implementation details during its
design. It describes Object model or static structure of the system.
UML was created by the Object Management Group (OMG) and the 1.0 specification
draft was proposed to the OMG in January 1997.
3.1.2.1 Choice and Justification
For our project we’ll use the UML methodology.
UML has its own syntax (the structure of statements in a computer language) and
semantics (meanings of symbols and statements).
Even though UML is generally used in the modeling software systems, it is also used to model
non-software systems as well. For example, the process flow in a manufacturing unit, etc.
Furthermore, UML is not a programming language but tools can be used to generate
source-code in various languages using UML diagrams. UML has a direct relation with object-
oriented analysis and design. After some standardization, UML has become an OMG standard.
Advantages of UML
When UML diagrams have been drawn properly, it becomes very easy to understand
the particular system,
Independence with respect to any programming language,
Powerful support of communication,
Readily used and flexible,
Description of all the models of the analysis to the realization of the software,
CLASS DIAGRAM
This is a UML diagram which shows the different class, their respective attributes,
methods and the relationship between the different classes of a system. It shows the static part
of a software.
o A class is a blueprint for creating objects, it has a name, attributes and methods.
o Attributes or data fields are specifications such as color, height etc.
o A method or behavior is what that particular class do.
Figure 15 Class diagram of the RTMRS
The Face_detector class uses the Camera class, which takes frames (images) using the captureImage() method, and detects all faces present using the detectsFace() method.
Feature_Extractor analyzes the image data to identify the user's facial expression; it has an extractsExpression() method that returns the emotion data.
Mood_Detector is responsible for detecting the user's mood based on the facial expression extracted from the image. It uses the Feature_Extractor class to perform the analysis through the getFacialExpression() method and returns the result; finally, it displays the detected mood using the displaymood() method.
The Chatbot class is responsible for generating a response based on the user's mood; it does this using the initiates_chat() method, and uses the mood object to determine the appropriate response.
Finally, we have the User class, which represents the user of the system, from whom the mood is detected and who chats with the bot after the mood has been detected.
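To make the class diagram more concrete, the following minimal Python sketch mirrors the classes and method names of Figure 15. The method bodies are simplified placeholders for illustration only; the actual implementation relies on OpenCV and the trained model.

class Camera:
    def captureImage(self):
        # Placeholder: the real implementation grabs a frame from the webcam with OpenCV
        return "frame"

class Face_detector:
    def __init__(self, camera):
        self.camera = camera
    def detectsFace(self):
        frame = self.camera.captureImage()
        # Placeholder: the real implementation runs a Haar cascade on the frame
        return [frame]

class Feature_Extractor:
    def extractsExpression(self, face):
        # Placeholder: the real implementation feeds the face to the trained CNN
        return "Happy"

class Mood_Detector:
    def __init__(self, extractor):
        self.extractor = extractor
    def getFacialExpression(self, face):
        return self.extractor.extractsExpression(face)
    def displaymood(self, mood):
        print("Detected mood:", mood)

class Chatbot:
    def initiates_chat(self, mood):
        # Placeholder: the real implementation generates a mood-aware reply
        return "I see you are feeling " + mood.lower() + " today."

# Example wiring of the classes
detector = Face_detector(Camera())
mood_detector = Mood_Detector(Feature_Extractor())
for face in detector.detectsFace():
    mood = mood_detector.getFacialExpression(face)
    mood_detector.displaymood(mood)
    print(Chatbot().initiates_chat(mood))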
ACTIVITY DIAGRAM
This is a simple UML diagram which shows the different activities of a system as a flowchart. It has a start node which shows the starting point of the system, an end node which marks the end of the system's activities, activities which are represented by rectangles with rounded edges, and decision nodes which help in decision making.
Figure 16 Activity Diagram of RTMRS Showing the login process, user and admin process
The activity diagram starts with the users (admin or user) starting the application and interacting with the GUI. The user is then asked to log in: a simple user is redirected to the user's GUI, where he or she can click on the Detect mood button; this starts the camera, which captures images in the form of a video used for mood detection. The mood is analyzed and displayed, and the appropriate response is generated by the chatbot based on the user's mood. An admin, on the other hand, is redirected to the admin GUI.
Figure 17 Activity Diagram of RTMRS showing the chatbot interaction
From figure 17 above we can see how the chatbot interacts with the user with the help of a UI form acting as a third party.
USE CASE DIAGRAM
A use case diagram shows the functionality of a system using actors and use cases; it is a dynamic or behaviour diagram in UML.
Use cases are a set of actions, services, and functions that the system needs to perform.
Actors are people or entities operating under defined roles within the system.
Figure 18 Use case diagram of RTMRS
In figure 18 above we can see the use case diagram of our system with its two actors (user and admin) and their respective use cases.
SEQUENCE DIAGRAM
The sequence diagram shows the flow of messages in the system. It portrays the communication between any two lifelines as a time-ordered sequence of events in which these lifelines take part at run time. It simply depicts the interaction between objects in sequential order. A sequence diagram is also known as an event diagram, since it shows the different events in the system in the form of messages.
Figure 19 Sequence diagram for the admin functions
Figure 19 shows the sequence diagram for the admin's functions. It has three objects; the admin, the system and the database, which interact with each other through the flow of messages.
Figure 20 Sequence diagram showing how the system recognizes mood
Figure 20 shows the sequence diagram of the mood detection process. Here we have three objects; the user, the mood detector and the chatbot, which interact with each other through the flow of messages.
Disadvantages of UML
UML diagrams can become large and complex, making them time consuming to create and maintain,
Diagrams can quickly become outdated if they are not updated alongside the code,
UML describes the design of the system but does not, by itself, guarantee a good implementation.
3.2 MATERIALS, PROGRAMMING TECHNIQUES AND SOFTWARE
Here we'll discuss the different materials, programming techniques, and software used during the development of the project.
3.2.1 Materials
Here we'll discuss the different materials used;
3.2.1.1 Dataset
A dataset is a collection of data organized in a structured manner, typically in tabular form,
where each row represents a single data point and each column represents a specific variable
or attribute. Datasets are used for analysis, machine learning, and data mining. They can come
from various sources, including databases, sensors, surveys, and experiments, and can include
a wide range of data types, such as numerical, categorical, textual, or image data. For our
project, we’ve used the FER2013 dataset downloaded from Kaggle
(https://www.kaggle.com/datasets/deadskull7/fer2013)
The FER2013 dataset from Kaggle is a widely recognized resource for training and evaluating
facial emotion recognition models. It consists of 35,887 grayscale images, each measuring
48x48 pixels and capturing human faces with various expressions. The images are categorized
into seven distinct emotion labels: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral.
These images are split into three subsets: 28,709 images for training, 3,589 for validation, and
3,589 for testing. Each image is represented in a CSV file, where the emotion is denoted by an
integer (0-6), and the pixel values are listed as a space-separated string of 2,304 values.
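As an illustrative sketch only (assuming the fer2013.csv layout described above, with 'emotion', 'pixels' and 'Usage' columns), the dataset can be loaded with pandas and reshaped into 48x48 image arrays as follows:

import numpy as np
import pandas as pd

# Load the FER2013 CSV (the file path is an assumption for illustration)
data = pd.read_csv("fer2013.csv")

# Each row stores 2,304 space-separated pixel values (48 x 48)
pixels = data["pixels"].apply(lambda s: np.array(s.split(), dtype=np.uint8))
images = np.stack(pixels.values).reshape(-1, 48, 48)
labels = data["emotion"].values  # integers 0-6 (Angry ... Neutral)

# The 'Usage' column splits the data into Training / PublicTest / PrivateTest
train_mask = data["Usage"] == "Training"
print(images[train_mask].shape, labels[train_mask].shape)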
HP EliteBook 8570p
8 GB RAM,
1 TB HDD,
A processor speed of 2.7 GHz (4 CPUs),
Intel Core(TM) i5 CPU.
Figure 22 Hp Elitebook
Qt Designer is a visual design tool for the Qt framework that is employed in developing software with graphical interfaces. It enhances this process by offering an intuitive drag-and-drop interface, allowing developers to design windows, dialogs, and forms visually without needing to write code. This approach simplifies the design process and accelerates development.
One of the key features of Qt Designer is its comprehensive set of widgets and layouts.
Developers can easily use buttons, labels, text fields, and various layouts such as horizontal,
vertical, and grid, to create complex and responsive user interfaces. Each element can be
customized and configured to meet specific design requirements. Additionally, Qt Designer
supports the powerful signal and slot mechanism, a core feature of the Qt framework, which
facilitates communication between different UI components. This feature enables developers
to connect signals (events) to slots (functions) to define the behavior and interactions within
the interface seamlessly.
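As a minimal sketch of the signal and slot mechanism described above (assuming PyQt5 is installed; this is not the project's actual UI code):

import sys
from PyQt5.QtWidgets import QApplication, QPushButton, QLabel, QVBoxLayout, QWidget

app = QApplication(sys.argv)
window = QWidget()
label = QLabel("Click the button")
button = QPushButton("Detect mood")

# Connect the button's 'clicked' signal to a slot (an ordinary Python callable)
button.clicked.connect(lambda: label.setText("Button clicked"))

layout = QVBoxLayout(window)
layout.addWidget(label)
layout.addWidget(button)
window.show()
sys.exit(app.exec_())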
3.2.2 Database and DBMS
3.2.2.1 Database
A database is an organized collection of data that is stored and accessed electronically.
For our project, we've used the MySQL database. Structured Query Language (SQL) is the primary language for querying and managing the data. SQL commands allow users to perform operations such as creating tables, inserting data, querying data, updating records, and deleting information. The MySQL tooling used for the project supports the following operations:
Database and Table Management: Create, drop, and modify databases and tables.
Query Execution: Run SQL queries and view the results.
Data Import/Export: Import and export data in various formats (e.g., SQL, CSV,
Excel).
User Management: Manage database users and their permissions.
Backup and Restore: Backup databases and restore them from backups.
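For illustration only, and assuming a users table whose columns match the fields used later in the user-management code (the exact schema is an assumption), the database could be created and queried from Python as follows:

import mysql.connector

# Assumes the mood_rec database already exists, as used by the application
conn = mysql.connector.connect(host="localhost", user="root", password="", database="mood_rec")
cur = conn.cursor()

# Hypothetical table layout matching the fields used by the application
cur.execute("""
    CREATE TABLE IF NOT EXISTS users (
        id INT AUTO_INCREMENT PRIMARY KEY,
        firstName VARCHAR(50),
        lastName VARCHAR(50),
        username VARCHAR(50),
        password VARCHAR(255)
    )
""")

# Parameterized insert and select
cur.execute("INSERT INTO users (firstName, lastName, username, password) VALUES (%s, %s, %s, %s)",
            ("Jane", "Doe", "jdoe", "secret"))
conn.commit()
cur.execute("SELECT id, firstName, lastName, username FROM users")
print(cur.fetchall())
conn.close()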
3.2.3.1 Python
Python is a high-level, interpreted programming language known for its simplicity and
readability. Designed with a focus on code readability and developer productivity, Python uses
a clear and straightforward syntax that allows programmers to express concepts in fewer lines
of code compared to other languages. Its extensive standard library and supportive community
make it a versatile tool for various programming tasks.
Some benefits of Python are;
Ease of Use: Python's clear syntax and readability make it accessible for both beginners
and experienced developers. This ease of use allows for rapid development and
prototyping, which is crucial in AI research and development.
Rich Ecosystem of Libraries: Python boasts a vast array of libraries and frameworks
specifically designed for AI and machine learning, such as TensorFlow, Keras, PyTorch,
and scikit-learn. These libraries provide pre-built functions and models that facilitate
complex AI tasks.
Community Support: The Python community is large and active, providing extensive
support through forums, tutorials, and documentation. This strong community support
ensures that developers can find solutions to problems and stay updated with the latest
advancements in AI.
Rapid Prototyping: Python's simplicity allows for quick experimentation and
iteration, which is beneficial for developing and testing AI models. This rapid
prototyping capability accelerates the development cycle and fosters innovation.
3.2.3.2 Libraries
Some libraries used are;
Keras
Keras is a high-level neural networks API written in Python, capable of running on top of
TensorFlow. It allows for easy and fast prototyping through a user-friendly, modular, and
extensible interface. Keras supports both convolutional networks and recurrent networks, as
well as combinations of the two. It provides a clean and concise way to build deep learning
models and has become an essential tool for researchers and practitioners in the field of
machine learning due to its simplicity and flexibility.
MySQL Connector/Python
MySQL Connector/Python is a driver that provides a native Python interface to MySQL
databases. This library allows Python programs to connect to MySQL databases and execute
queries. It supports all standard MySQL operations and ensures compatibility with Python's
DB-API. With robust connection management, transaction support, and error handling, it
simplifies the process of integrating MySQL databases into Python applications, making it a
vital tool for developers working with MySQL.
NumPy
NumPy is a fundamental package for scientific computing in Python. It provides support for
large multi-dimensional arrays and matrices, along with a collection of mathematical functions
to operate on these arrays. NumPy is essential for performing numerical computations
efficiently and is the foundation for many other scientific computing libraries, including SciPy
and scikit-learn. Its powerful n-dimensional array object, broadcasting functions, and tools for
integrating C/C++ and Fortran code make it a cornerstone of the Python scientific stack.
OpenAI
OpenAI is an organization that develops advanced artificial intelligence models and provides
APIs for integrating these models into various applications. The OpenAI Python library allows
developers to access OpenAI's machine learning models, such as GPT (Generative Pre-trained
Transformer), to perform tasks like natural language understanding, text generation, and more.
This library simplifies the process of building applications that leverage cutting-edge AI
technology for diverse use cases, from chatbots to data analysis.
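Purely as a sketch of what a mood-aware chatbot request might look like; the model name and the pre-1.0 openai package interface shown here are assumptions, not necessarily what the project uses:

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder key

detected_mood = "Sad"
# Ask the model for a short, mood-aware reply (openai<1.0 style interface)
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a supportive chatbot that responds to the user's mood."},
        {"role": "user", "content": f"The detected mood is {detected_mood}. Start a short conversation."},
    ],
)
print(response["choices"][0]["message"]["content"])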
OpenCV
OpenCV (Open Source Computer Vision Library) is a comprehensive library for computer
vision and image processing tasks. It contains over 2,500 optimized algorithms for various
image and video analysis tasks, such as object detection, facial recognition, image
segmentation, and motion tracking. OpenCV is widely used in both academic research and
industrial applications due to its efficiency and wide range of functionality, making it an
indispensable tool for developing computer vision applications.
Pandas
Pandas is a powerful data manipulation and analysis library for Python. It provides data
structures like DataFrame and Series, which are essential for handling structured data. Pandas
offers numerous functions for data cleaning, transformation, aggregation, and visualization,
making it an invaluable tool for data scientists and analysts. It is designed to work seamlessly
with other libraries in the Python ecosystem, such as NumPy, SciPy, and Matplotlib, providing
a comprehensive solution for data analysis tasks.
Pillow
Pillow is a popular Python Imaging Library (PIL) fork that adds image processing capabilities
to your Python interpreter. It provides extensive file format support, an efficient internal
representation, and powerful image processing capabilities. With Pillow, you can perform a
wide range of image operations, including opening, manipulating, and saving images in various
formats. It is widely used in web development, scientific applications, and desktop GUIs for
image manipulation tasks.
PyQt5
PyQt5 is a set of Python bindings for Qt libraries, used for developing cross-platform
applications and GUIs. PyQt5 enables the creation of highly interactive and visually appealing
applications with a modern look and feel. It provides a wide range of modules and tools for
working with GUI components, event handling, and multimedia. PyQt5's extensive
functionality and ease of use make it a popular choice for developing desktop applications in
Python.
Python-dateutil
Python-dateutil is a powerful library that provides extensions to the standard Python datetime
module. It offers comprehensive parsing of dates in various formats, arithmetic operations on
date objects, and support for time zones. Python-dateutil simplifies complex date
manipulations and is widely used in applications that require robust date and time handling.
Scikit-learn
Scikit-learn is a machine learning library for Python that provides simple and efficient tools for
data mining and data analysis. It offers a wide range of supervised and unsupervised learning
algorithms, such as regression, classification, clustering, and dimensionality reduction. Scikit-
learn is built on NumPy, SciPy, and matplotlib, and is designed to be accessible and reusable
in various contexts, making it a key library for machine learning practitioners.
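As an example of the kind of evaluation scikit-learn enables for a seven-class emotion problem (the labels below are dummy values, not results from our model):

from sklearn.metrics import classification_report, confusion_matrix

emotions = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]
y_true = [3, 3, 6, 0, 4, 5, 3, 6]   # dummy ground-truth labels (0-6)
y_pred = [3, 6, 6, 0, 4, 5, 4, 6]   # dummy predicted labels

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, labels=list(range(7)), target_names=emotions, zero_division=0))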
SciPy
SciPy is an open-source Python library used for scientific and technical computing. It builds
on NumPy and provides additional functionality for optimization, integration, interpolation,
eigenvalue problems, algebraic equations, and other scientific computations. SciPy's extensive
range of algorithms and tools makes it essential for researchers and engineers performing
complex numerical calculations.
Setuptools
Setuptools is a Python package that facilitates the packaging, distribution, and installation of
Python projects. It extends the capabilities of the standard library's distutils and allows for more
complex build processes, dependency management, and project metadata. Setuptools is crucial
for developers looking to distribute their Python software efficiently and manage dependencies
seamlessly.
TensorFlow
TensorFlow is an open-source machine learning framework developed by Google. It is used
for a wide range of machine learning tasks, including deep learning, and provides a
comprehensive ecosystem for building, training, and deploying machine learning models.
TensorFlow supports computation on various platforms, from CPUs to GPUs and TPUs, and
offers tools for model optimization and deployment across different environments. It is widely
adopted in both research and industry for developing scalable and efficient machine learning
applications.
Wheel
Wheel is a built-package format for Python that helps to improve the speed and reliability of
package installation. Unlike the older .egg format, wheels are a binary distribution format that
eliminates the need for building packages from source during installation. This results in faster
and more consistent installations, especially for packages with complex dependencies or those
that require compilation. Wheel is an essential tool for Python package distribution and is
supported by pip, the standard package manager for Python.
3.2.3.3 Code Sample
Training Code
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def train_start():
    dataset = 'data'
    img_width, img_height = 48, 48
    epochs = 50
    batch_size = 32
    num_classes = 7
    train_set = os.path.join(dataset, 'train')
    test_set = os.path.join(dataset, 'test')
    # Rescale pixel values and read the images from the train/test folders
    datagen = ImageDataGenerator(rescale=1.0 / 255)
    train_data = datagen.flow_from_directory(train_set, target_size=(img_width, img_height),
        color_mode='grayscale', batch_size=batch_size, class_mode='categorical')
    test_data = datagen.flow_from_directory(test_set, target_size=(img_width, img_height),
        color_mode='grayscale', batch_size=batch_size, class_mode='categorical')
    # Simple CNN (a representative reconstruction; the exact layers used may differ)
    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(img_width, img_height, 1)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(num_classes, activation='softmax')])
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    model.fit(train_data, validation_data=test_data, epochs=epochs)
    model.save('training3.h5')
User Management Code
import sys
import mysql.connector
from PyQt5.QtWidgets import QApplication, QDialog, QMessageBox, QTableWidgetItem
from PyQt5.uic import loadUi

class users(QDialog):
    def __init__(self):
        super(users, self).__init__()
        loadUi('users.ui', self)
        self.setWindowTitle('MANAGE USERS')
        self.But1.clicked.connect(self.select)
        self.But2.clicked.connect(self.Save)
        self.But3.clicked.connect(self.update)
        self.tab2.cellClicked.connect(self.show_id)
        self.But4.clicked.connect(self.delete_user)

    def show_id(self, row, column):
        # Keep the id of the clicked row so it can be used by delete_user()
        self.label6.setText(self.tab2.item(row, 0).text())
    def dbcon(self):
        conn = mysql.connector.connect(
            host="localhost",
            user="root",
            password="",
            database="mood_rec",
        )
        return conn
    def clear(self):
        self.text1.setText('')
        self.text2.setText('')
        self.text3.setText('')
        self.text4.setText('')
    def Save(self):
        firstname = self.text2.toPlainText()
        lastname = self.text1.toPlainText()
        uname = self.text3.toPlainText()
        upass = self.text4.toPlainText()
        conr = self.dbcon()
        cur = conr.cursor()
        # Parameterized query avoids SQL injection
        cur.execute("SELECT * FROM users WHERE username = %s AND password = %s", (uname, upass))
        result = cur.fetchone()
        if result:
            QMessageBox.information(self, "ERROR", "Username or Password already exist!!!")
        elif firstname == "" or lastname == "" or uname == "" or upass == "":
            QMessageBox.information(self, "Error", "All fields are Required")
        else:
            query = "INSERT INTO users (firstName, lastName, username, password) VALUES (%s, %s, %s, %s)"
            val = (firstname, lastname, uname, upass)
            cur.execute(query, val)
            conr.commit()
            self.clear()
            QMessageBox.information(self, "Registration Output", "Registration successful!")
            self.select()
    def select(self):
        con = self.dbcon()
        cur = con.cursor()
        # Fetch user information from the database
        # (query assumed from the columns displayed in the users table)
        sql1 = "SELECT id, firstName, lastName, username FROM users"
        cur.execute(sql1)
        rows = cur.fetchall()
        self.tab2.setRowCount(len(rows))
        row = 0
        for record in rows:
            for col, value in enumerate(record):
                self.tab2.setItem(row, col, QTableWidgetItem(str(value)))
            row += 1
        con.close()
    def delete_user(self):
        user_id = self.label6.text()
        cond = self.dbcon()
        cursd = cond.cursor()
        cursd.execute("DELETE FROM users WHERE id = %s", (user_id,))
        cond.commit()
        QMessageBox.information(self, "Done", "Deleted successfully.")
        self.select()
if __name__ == "__main__":
    app = QApplication(sys.argv)
    widget = users()
    widget.show()
    sys.exit(app.exec_())
Mood Detection Code
import cv2
import numpy as np
import tensorflow as tf

def mood_start():
    # Load the face detection Haar cascade
    face_cascade_path = 'C:/Users/Lilian/Desktop/realTimemoodRecognitionSystem/haarcascade_frontalface_default.xml'
    face_cascade = cv2.CascadeClassifier(face_cascade_path)
    if face_cascade.empty():
        print(f"Error loading Haar Cascade file from {face_cascade_path}")
        return
    # Load the trained model
    model_path = 'C:/Users/Lilian/Desktop/realTimemoodRecognitionSystem/training3.h5'
    try:
        model = tf.keras.models.load_model(model_path)
    except Exception as e:
        print(f"Error loading model from {model_path}: {e}")
        return
    # Emotion labels (FER2013 integer order; must match the class order used during training)
    moods = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']
    # Open the default webcam
    webcam = cv2.VideoCapture(0)
    while True:
        # Capture a frame from the webcam
        ret, frame = webcam.read()
        if not ret:
            break
        # Detection and prediction steps below reconstructed to complete the listing
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
        for (x, y, w, h) in faces:
            # Crop the face, resize to 48x48 and normalize, as expected by the model
            roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
            roi = roi.reshape(1, 48, 48, 1)
            prediction = model.predict(roi)
            mood = moods[int(np.argmax(prediction))]
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
            # Display the predicted mood on the frame
            label_text = 'Mood: ' + mood
            cv2.putText(frame, label_text, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
        cv2.imshow('Real Time Mood Recognition', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    webcam.release()
    cv2.destroyAllWindows()
CHAPTER FOUR
PRESENTATION OF DATA AND ANALYSIS
4.1 PRESENTATION OF DATA
Figure 26 shows the training results at each epoch, including the validation accuracy and the validation loss.
4.1.2 The Program File Organization
This shows the structure and organization of our project and its different files.
Sequence Diagrams
o Admin sequence diagram
o User sequence diagram
Activity Diagrams
o Chatbot activity
o RTMRS activity
Class Diagram
o RTMRS class diagram
Use Case Diagrams
o RTMRS use case
XML File
o `haarcascade_frontalface_default.xml` for face detection.
Model File
o `model.h5` for the trained model.
Python Scripts
o `registration.py` for handling user registration.
o `train.py` and `training.py` for model training.
o Etc.
Training Data
o `training3.h5` and `training4.h5` for different versions of the trained models.
4.1.3 The Registration Page
4.1.4 Login Interfaces
Figure 29 shows the login interface where the user or the admin can log in using their credentials.
Here the user or an admin logs into the system through the same interface but is redirected to a different interface based on his or her privileges.
Figure 30 shows the login interface when wrong login credentials are entered. In that case, a message is displayed on the text field to notify the user of the wrong entry, so the user must enter correct credentials in order to log into the system.
Figure 31 Login interface showing correct credentials but the server is not online
Figure 31 shows the login interface when the credentials are correct but the local database server is offline. In this case a message is displayed to notify the user that the server is not reachable at that particular moment.
4.1.5 Main User Interface
The main user interface has a "Detect mood" button which, when clicked, starts the detection phase.
Here we can see how the system detects the happy mood and displays it on the GUI.
Figure 34 Sad mood displayed
Here, figure 34 shows how the system detects the sad mood and displays it on the GUI. We can also see that the facial expressions and the positioning of parts of the face differ from those of the happy mood.
Here, figure 36 shows how the system detects the neutral mood and displays it on the GUI.
From figure 36 we can clearly see that the system detects the mood only from the face and ignores other objects in the scene.
Here, figure 37 shows how the system detects the surprise mood and displays it on the GUI.
Figure 38 is a simple admin dashboard where the admin of the system can perform basic management tasks.
Figure 39 shows the interface where the admin clicks on buttons to perform the following actions:
View User: Fetch and display details of the selected user in the input fields.
Add User: Insert a new user record into the database with the provided details.
Update User: Update the existing user record in the database with the new details
entered.
Delete User: Remove the selected user from the database.
Also, we have the List of Users, which is a table displaying user information with the columns UserID, Firstname, Lastname and Username.
4.2 CONCLUSION
At the end of our training, we obtained an accuracy of 58.7% after training our model for about two hours; the resulting model is able to detect moods on a wide variety of faces.
CHAPTER FIVE
DISCUSSION, RECOMMENDATIONS AND CONCLUSION
5.1 DISCUSSION AND SUMMARY OF FINDINGS
The development and evaluation of the mood recognition system trained on facial expressions from the FER2013 dataset have yielded several important insights. The system, developed using PyCharm as the IDE, achieved an accuracy of 58.7%. While this indicates some level of success, it also highlights several areas that need further refinement and optimization.
The model's performance, as reflected by the 58.7% accuracy, suggests that it has
learned to distinguish between various facial expressions to a certain extent. However, this
moderate accuracy highlights the complexity and variability inherent in the FER2013 dataset.
The choice of feature extraction methods, likely involving convolutional layers within a deep
learning framework, played a crucial role in capturing essential facial features. Despite this,
the model's performance suggests that the feature representation could be improved, potentially
by employing more sophisticated techniques.
Data preprocessing steps, such as normalization, data augmentation, and face alignment,
are critical in preparing the dataset for effective model training. Any shortcomings in these
steps can significantly affect the model's overall performance. For instance, data augmentation
helps create a more robust model by simulating various real-world scenarios, but the balance
must be carefully managed to avoid introducing noise.
The architecture of the deep learning model, possibly a convolutional neural network
(CNN), was effective to some degree but might require further tuning. Exploring deeper
architectures or incorporating advanced techniques like residual connections or attention
mechanisms could enhance performance. Additionally, fine-tuning hyperparameters, such as
learning rate, batch size, and the number of epochs, through techniques like grid search or
random search, might yield better results. The selection of an appropriate loss function and
optimization algorithm is also crucial, and experimenting with different options could improve
convergence and accuracy.
The FER2013 dataset presents its own set of challenges, including class imbalances and
variations in image quality. Addressing these issues through techniques like oversampling, undersampling, or using weighted loss functions could help improve model performance.
Effective data augmentation strategies, such as random cropping, rotation, and color jittering,
can help the model generalize better. However, over-augmentation might introduce noise, so
finding the right balance is essential. Employing cross-validation techniques ensures that the
model's performance is robust and not overfitted to a particular subset of the data, providing a
more reliable estimate of the model's true performance.
In summary, the system achieved an accuracy of 58.7% on the FER2013 dataset, indicating
both promise and areas for improvement. Enhancing the model architecture, tuning
hyperparameters, improving data handling, and employing robust training techniques are
critical for better performance. Future work could explore integrating multi-modal data, such
as combining facial expressions with speech or physiological signals, to enhance the accuracy
of mood recognition systems. While the current system shows promise, achieving higher
accuracy will require iterative improvements and possibly integrating more sophisticated
techniques. The lessons learned from this project provide a solid foundation for future
enhancements in facial expression-based mood recognition systems.
5.2 CHALLENGES FACED
Here are some of the major challenges faced during the development of this project:
play a crucial role. Often, publicly available datasets are collected under conditions that differ significantly from those of the target environment. Consequently, datasets may not accurately represent the actual conditions relevant to the specific area of interest.
5.2.4 Model Training
Training deep learning models, especially complex ones with many layers, demands
significant computational power and memory. This often requires access to GPUs or TPUs,
which may not be available to all practitioners. Deep learning models can take a long time to
train, depending on the complexity of the model and the size of the dataset. This can be a
bottleneck, especially if iterative experimentation is required.
5.2.5 Overfitting
Deep learning models are prone to overfitting, especially when trained on small
datasets. This happens when the model learns to memorize the training data rather than
generalize from it. Techniques like dropout, data augmentation, and regularization are used to
combat this, but they are not always fully effective.
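As a brief illustration of the techniques named above (the specific parameter values are arbitrary examples, not the project's settings):

from tensorflow.keras import layers, regularizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation: random shifts, flips and zooms simulate new training images
augmenter = ImageDataGenerator(rotation_range=10, width_shift_range=0.1,
                               height_shift_range=0.1, zoom_range=0.1, horizontal_flip=True)

# Dropout and L2 regularization layers that can be inserted into a model definition
dense = layers.Dense(128, activation="relu", kernel_regularizer=regularizers.l2(1e-4))
dropout = layers.Dropout(0.5)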
5.3 RECOMMENDATIONS
A project of this caliber, although still incomplete, warrants some recommendations to enhance its future development. Considering the methodology employed in this project, the following suggestions are made:
Access to more powerful computational resources, such as cloud-based GPUs or TPUs,
can enable the training of more complex models and faster experimentation cycles.
Collaborating with institutions or leveraging cloud platforms like Google Colab, AWS, or
Azure can provide these resources. This will help in model training and improve model
performance.
I would highly recommend that large enterprises implement this system, because it can monitor and recognize employees' emotional states, allowing timely intervention and support for those experiencing stress, burnout, or other negative emotions. This can lead to improved overall well-being and job satisfaction.
Also, to commercial enterprises, the system can analyze customers' facial expressions and
emotions in real-time while they interact with products or services. This immediate feedback
can be invaluable for understanding customer reactions and making timely improvements. By
identifying customers' emotional responses, businesses can tailor their interactions to better
meet customer needs, leading to a more personalized and satisfying experience.
5.4 CONCLUSION
The development of a mood recognition system using facial expressions and the
FER2013 dataset together with deep learning algorithms presented various challenges,
including data quality issues, class imbalances, and computational constraints. Despite these
difficulties, achieving an accuracy of 58.7% demonstrates the system's potential and lays a
foundation for future enhancements. To improve the system further, recommendations such as
advanced data augmentation, addressing class imbalances, utilizing state-of-the-art
architectures, optimizing hyperparameters, leveraging high-performance computing resources,
employing comprehensive evaluation metrics, and conducting real-world testing should be
considered. By addressing these areas, the performance and reliability of the mood recognition
system can be significantly enhanced, paving the way for more accurate and robust human-
computer interaction applications.
Continued research and development in this field will undoubtedly lead to more
sophisticated and reliable emotion recognition systems, ultimately contributing to
advancements in areas such as mental health monitoring, user experience enhancement, and
human-robot interaction.
REFERENCES
Dorante, M., Kollár, B., Obed, D., Haug, V., Fischer, S., & Pomahac, B. (2020, January). Recognizing Emotional Expression as an Outcome Measure After Face Transplant. Retrieved from ResearchGate: https://www.researchgate.net/publication/338615247_Recognizing_Emotional_Expression_as_an_Outcome_Measure_After_Face_Transplant
Dzedzickis, A., Artūras, K., & Vytautas, B. (2020). Human Emotion Recognition: Review of Sensors and Methods. MDPI Open Access Journals, 20(3).
Ekman, P., & Friesen, W. (1978). Facial Action Coding System. Retrieved from ScienceDirect: https://www.sciencedirect.com/topics/computer-science/facial-action-coding-system
Filippini, C., David, P., Daniela, C., Antonio, M. C., & Arcangelo, M. (2020). Thermal Infrared Imaging-Based Affective Computing and Its Application to Facilitate Human Robot Interaction. MDPI Open Access Journals, 4-10.
Garcia-Garcia, J. M., Victor M. R., P., & Maria D., L. (2017). Emotion Detection: A Technology Review. ACM Proceedings, 1-8.
Gosai, D. D., Himangini J., G., & Hardik S., J. (2018). A Review on a Emotion Detection and Recognization from Text Using. International Journal of Applied Engineering Research, 13(9), 6745-6750.
Seyeditabari, A., Narges, T., & Wlodek, Z. (2018). Emotion Detection in Text: A Review. arXiv:1806.00674.
Trovato, G., Tatsuhiro, K., Nobutsuna, E., Kenji, H., & Atsuo, T. (2012). Development of Facial Expressions Generator for Emotion Expressive Humanoid Robot. Retrieved from Waseda University: https://waseda.elsevierpure.com/en/publications/development-of-facial-expressions-generator-for-emotion-expressiv
Wagh, K. P., & Vasanth, K. (2018). Electroencephalograph (EEG) Based Emotion Recognition System: A Review. Innovations in Electronics and Communication Engineering, 37-59.
Zizhao, D., Gang, W., Shaoyuan, L., Jingting, L., Wenjing, Y., & Su-Jing, W. (2022, January 4). Spontaneous Facial Expressions and Micro-expressions Coding: From Brain to Face. Retrieved from Frontiers: https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2021.784834/full#B47