Facial Sentiment Analysis Using AI Techniques: State-of-the-Art, Taxonomies, and Challenges
Received April 28, 2020, accepted May 7, 2020, date of publication May 11, 2020, date of current version May 26, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.2993803
ABSTRACT With the advancements in machine and deep learning algorithms, various critical real-life applications in computer vision have become possible. One such application is facial sentiment analysis. Deep learning has made facial expression recognition (FER) one of the most trending research fields in computer vision. Recently, deep learning-based FER models have suffered from various technological issues, like under-fitting or over-fitting, caused by insufficient training and expression data. Motivated by these facts, this paper presents a systematic and comprehensive survey of current state-of-the-art Artificial Intelligence techniques (datasets and algorithms) that provide solutions to the aforementioned issues. It also briefly presents a taxonomy of existing facial sentiment analysis strategies. Then, the paper reviews the existing novel machine and deep learning networks proposed by researchers specifically for facial expression recognition on static images, presents their merits and demerits, and summarizes their approaches. Finally, the paper presents the open issues and research challenges for the design of a robust facial expression recognition system.
INDEX TERMS Facial sentiment analysis, machine learning, deep learning, convolutional neural network,
deep belief network, artificial intelligence.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see [Link]
VOLUME 8, 2020 90495
K. Patel et al.: Facial Sentiment Analysis Using AI Techniques: State-of-the-Art, Taxonomies, and Challenges
forest classifiers, artificial neural networks, etc. were used by the researchers to recognize the seven basic emotions, and they claimed good and effective results. Automated human emotion detection is all-important in security and surveillance applications these days. To further improve its performance, researchers are exploring this field further. Various challenges, like occlusion in datasets and over-fitting of models, have to be taken care of while implementing FER. As per the literature explored and the knowledge of the authors, no survey is available that exhaustively compares FER approaches from the perspective of AI. Motivated by this fact, we present a comprehensive survey on FER using Artificial Intelligence (AI) techniques, in which we explore the state-of-the-art machine learning and deep learning (DL) approaches with their merits and demerits.

B. SCOPE OF THE SURVEY
Facial sentiment analysis is one of the most trending topics in the Computer Vision area. A lot of literature has already been published by researchers across the globe in this field, but many researchers are still trying to solve the challenges and issues in FER. Various surveys on sentiment analysis have been published in recent years [7], [15], [26], [27]. These surveys have mainly focused on traditional methods like support vector machines (SVM), decision tree classifiers, and artificial neural networks (ANN). The DL methods [28], [29] have rarely been explored by researchers working in the same field. So, in this paper, we analyzed the surveys on facial sentiment analysis and presented a comparative analysis. For example, Hemalatha and Sumathi [15] surveyed various methods for facial detection, facial feature extraction, and classification for FER, but did not present a proper comparison of the methods considered and the datasets used. Later, the authors in [16] also presented a survey on FER, but they did not mention anything about datasets useful for emotion recognition. Another survey, by Chengeta and Viriri [19], covered various traditional feature extraction techniques like principal component analysis (PCA), Linear Discriminant Analysis (LDA), and Locally Linear Embedding (LLE), after which they proposed an ensemble classifier. They did not compare the advanced DL approaches, which are currently the most novel approaches in FER. Again, Baskar and Kumar [18] also lack an explanation of various DL approaches.

Recently, DL-based FER approaches have been explored in [7], [27], which are detailed surveys but lack a focused discussion of FER. Therefore, in the proposed survey, we make a systematic survey of the various databases used for FER; the various methods for face detection, facial feature extraction, and emotion classification; and the future challenges and current issues in facial sentiment analysis. Our aim is that this survey would be quite beneficial for those who want to explore this field, giving them a complete overview of all the advanced systematic approaches in facial sentiment analysis. Table 2 presents the relative differences between the existing surveys and the proposed survey.

TABLE 2. A relative comparison of the proposed survey with the existing FER surveys.

C. RESEARCH CONTRIBUTIONS
In this paper, we surveyed various existing literature on Facial Sentiment Analysis, focusing on the DL techniques, datasets, and the methodologies used to classify emotions. The following are the crisp contributions of the paper.
• We present an in-depth survey on FER methods and the datasets used. Then, we highlight the advanced methods used for FER and their comparative analysis.
• We present a taxonomy of FER methods based on face detection, feature extraction, and emotion classification.
• Finally, we present the open issues and research challenges in Facial Sentiment Analysis.

D. ORGANIZATION
The structure of the survey is shown in FIGURE 2. Section II focuses on the evolution of facial recognition techniques presented by authors across the globe and the datasets used. It also describes the need for facial detection, dimension reduction, normalization, feature extraction, and emotion classification. In Section III, we highlight the bibliometric analysis and the methodology used for conducting the proposed survey. In Section IV, we discuss the various facial expression databases available for analysis. Section V discusses the proposed taxonomy (facial sentiment analysis taxonomy). In Section VI, we discuss the open issues and research challenges of FER, and finally, Section VII concludes the survey. Table 1 lists all the acronyms used in the paper.

II. BACKGROUND
This section focuses on the background and importance of facial expressions for sentiment analysis. It is divided into four subsections. Firstly, we discuss the evolution time-line of facial recognition methods. Secondly, we discuss the need for Facial Detection, Dimension Reduction, and Normalization for sentiment analysis. In the third subsection, we focus on the need for feature extraction from the face image. Finally, we highlight the need for emotion classification.

A. EVOLUTION TIMELINE
Figure 3 gives a brief overview of the evolutionary time-line of facial sentiment recognition methods given by researchers across the globe, along with the datasets. There exist various algorithms for FER, such as traditional state-of-the-art algorithms and DL-based algorithms, proposed by various researchers up to 2020. Emotion recognition was first stated in the paper by Bassili [30] in 1978, where the authors classified the emotions into six basic gestures: happiness, sadness, fear, surprise, anger, and disgust. Different algorithms (traditional and DL) were used for FER by the authors. For example, Padgett and Cottrell [31] used an ANN for the first time in 1996, followed by SVM [32] in 2000, CNN [33] in 2003, Multi-SVM [34] in 2006, boosted DBN [34] in 2014, RNN [35] in 2015, and PHRNN and MSCNN [36] in 2017. Also, many datasets have been created for training and testing these FER models. The time-line shows the list of datasets per creation year: JAFFE [37] in 1998, CK+ [38] in 2000, MMI [39] in 2002, Oulu-CASIA [40] in 2008, Multi-PIE [41] in 2009, RaFD [42], MUG [43], and TFD [44] in 2010, and FER-2013 [45] in 2013. New datasets continue to become available, supported by DL algorithms, to solve the challenges in FER.

B. NEED FOR FACIAL DETECTION, DIMENSION REDUCTION AND NORMALIZATION
In the FER process, the first pre-requisite step is face detection, which involves detecting a face in the image or frame and removing the insignificant pixels. The face detection algorithm gives its output in the form of the coordinates of a bounding box, which is put over the face. Detecting a face is quite complex, as human faces can be of different sizes and shapes, so the face detection algorithm plays a vital role in this situation. Various algorithms for face detection are available, such as Viola-Jones [46], PCA, LDA, and genetic algorithms. The Viola-Jones algorithm is one of the most widely used algorithms for face detection; it differentiates faces from non-faces. PCA is the other most widely used face detection method. It is used to reduce the image dimensions and has four main parts: feature covariance, eigen decomposition, principal component transformation, and choosing components [47]. Reducing the dimensions from m dimensions to n dimensions, ∀ m > n, does not mean we are losing the properties of the image; rather, it preserves them [48]. After the dimension reduction, normalization can be used to scale up the image.
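The four PCA parts listed above (feature covariance, eigen decomposition, principal component transformation, and choosing components) can be sketched in a few lines. This is a minimal illustration on a hypothetical 2D toy dataset, not the exact procedure of [47]:

```python
import numpy as np

def pca_reduce(X, n):
    """Reduce m-dimensional rows of X to n dimensions (m > n) via PCA."""
    Xc = X - X.mean(axis=0)                 # center the data
    cov = np.cov(Xc, rowvar=False)          # 1) feature covariance
    vals, vecs = np.linalg.eigh(cov)        # 2) eigen decomposition
    order = np.argsort(vals)[::-1]          # rank components by variance
    components = vecs[:, order[:n]]         # 4) choose the top-n components
    return Xc @ components                  # 3) principal component transform

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])
Z = pca_reduce(X, 1)   # project from 2 dimensions down to 1
```

Because only low-variance directions are discarded, the projected data Z retains most of the structure of X, which is the sense in which the image's properties are preserved.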
C. NEED FOR FEATURE EXTRACTION
Facial Expression analysis comprises various methods, such as facial landmark identification, feature extraction, and different feature extraction databases. Facial landmarks are drawn from the facial key points, which are derived from the geometry of the face [16]. Feature Extraction is done after the preprocessing phase [49]. The two methods available for feature extraction are appearance-based extraction and geometric-based extraction. The geometric-based method extracts features like edge features and corner features. Neha et al. [50] analyzed the performance of the Gabor filter feature extraction technique. They also tested the average Gabor filter and compared both filtering techniques to enhance the recognition rate.
• Corners: The corners of an image are a significant property, which can be inferred from the complex objects of the image. Cho et al. [51] developed a corner detection technique which measures the distance and angle between two straight lines.
• Edges: They are one-dimensional features that represent the boundary of an image region.
The second method, the appearance-based method, takes care of the states of different points of the face, such as the position of the eyes and the shape of important points such as the mouth and eyebrows, using the salient point features. The majority of the traditional methods have used the Local Binary Pattern (LBP) as the feature extraction technique, which is a generic framework for the extraction of features from a static image. It converts the most important features of the input image, as mentioned above, into a histogram [52].

D. NEED FOR EMOTION CLASSIFICATION
The third step in FER is Emotion Classification. There are various methods used for the classification of emotions after applying the face detection and feature extraction algorithms. The various classification algorithms include the convolutional neural network (CNN) [53], SVM, and the restricted Boltzmann machine (RBM). The most widely used method for classification is the CNN. It is the most efficient algorithm, as it can be applied directly to the input image, without applying any feature extraction or face detection algorithms, and still achieves better accuracy on the input data [54]. The number of images in the training data set also has a huge impact on the performance.
published till-date. The comparative analysis of the datasets is presented in Table 4. To remove the low-illumination problem on the face, tungsten lights were used.
an input image. It has four stages: Haar-like features, integral graphs, AdaBoost training, and a cascading classifier.
• Haar-Like Features: The Viola-Jones algorithm uses Haar-like features, i.e., a scalar product between the image and some Haar-like templates [82]. As shown in FIGURE 6, edge features, linear features, center features, and diagonal features are the four Haar features used in the Viola-Jones face detection algorithm [83]. There are two regions, as shown in the figure: black-shaded and white-shaded regions. The eigenvalue is calculated using the difference between those two regions; for linear features [83]:

eigenvalue(v) = Σ_white − Σ_black (1)

eigenvalue(v) = Σ_white − 2 × Σ_black (2)

• Integral Graph: As the dimension of the generated Haar features is large, a technique called the integral map can be used to isolate the picture cells, such as the 2D coordinates of the gray-scale picture and the estimations of every pixel point [83]. The procedure to make an integral graph is that each pixel is made equivalent to the total of all pixels above and to the left of the concerned pixel. Hence, the total of all pixels in any rectangular shape can be determined.
• AdaBoost Training: This training algorithm combines weak classifiers and is made to learn multiple times to become good. There are two things that we need to consider for the classification of an image: first, the region of the eyes is darker than the region of the nose and the cheeks; second, the eyes should be darker than the nasal bridge [75].
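The integral-graph construction described above (each cell equals the sum of all pixels above and to the left, inclusive) can be sketched as follows; the recurrence and the constant-time rectangle sum are the standard ones, though the variable names here are illustrative:

```python
def integral_image(img):
    """ii[y][x] = sum of img[0..y][0..x], built in one pass."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in the inclusive rectangle, using at most 4 lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
```

A Haar feature value is then just a difference of such rectangle sums (white region minus black region), matching Eq. (1), with every rectangle sum costing only four table lookups regardless of its size.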
uses the fisher face (an enhancement of the eigenface) for reducing the dimensions of the features and for the identification of the face in an image. The fisher face is usually used when the images have a contrast in illumination. The steps to create the fisher face are the same as for PCA, and it performs better than PCA [86].

B. FEATURE EXTRACTION
This section discusses the various feature extraction techniques and explains how they can be used in FER models. We also compare the various feature extraction techniques and analyze various works done using them, as shown in Table 6.

1) LOCAL BINARY PATTERN
It is a recent texture descriptor that replaces the value of the original pixel of the image with a decimal value and converts it into codes known as LBP codes [97]–[99]. Labels are formed by thresholding the 3 × 3 neighborhood with the central value and considering the result as a binary number. But the basic LBP had a limitation with large-scale structures and was highly sensitive to noise [100], [101]. It is also invariant to the rotations and size of the features, which increase exponentially with the increase in neighbors. A Uniform LBP was proposed that considers the U pattern, which has at most 2 bitwise transitions from 0 to 1. So, various extensions of the LBP were proposed for neighborhoods of any size. A circular neighborhood was proposed with any number of pixels and any radius, represented by the notation (P, R), where P is the number of sampling points on a circle of radius R. There are various applications where LBP is used, such as texture analysis, face analysis, and classification. It codifies the local primitives, including edges, corners, different spots, and flat areas. Nowadays, LBP converts the important pixels of an image into a histogram, as in the Histogram of Oriented Gradients (HoG) approach, which stores the information of the local micro-patterns of the faces.
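The basic 3 × 3 thresholding that produces an LBP code can be sketched as follows; reading the eight neighbors clockwise from the top-left is one common convention, not one prescribed by the text:

```python
def lbp_code(patch):
    """LBP code of a 3x3 patch: threshold the 8 neighbors against the
    center pixel and read the bits as one binary (then decimal) number."""
    center = patch[1][1]
    # clockwise from the top-left neighbor (a common ordering convention)
    neighbors = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                 patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    bits = ['1' if n >= center else '0' for n in neighbors]
    return int(''.join(bits), 2)

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch)
```

Sliding this over every pixel of an image and histogramming the resulting codes yields the LBP feature histogram described above.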
A modified algorithm for LBP was proposed by the authors of [89]. The steps for generating the threshold are as follows.
• The input preprocessed image is divided into 3 × 3 blocks.
• For each block, calculate the minimum and maximum pixel intensity values of the block.
• Now, calculate the threshold value of block B by taking the average of the minimum and maximum values.
• If any element of the block is greater than the threshold, write '1' to it; else write '0'.
• The eight-bit pattern is converted to a decimal number, representing the transformed block B.

2) GABOR FILTER
It extracts both the time and frequency domains [102] of the image. That is, it analyzes whether there is any particular frequency content in the image in a particular direction around the point of analysis. The 2D Gabor Filter is used in the spatial domain. Gabor Filters are quite successful in FER models. Multi-resolution structures are applied to images, which consist of multiple frequencies and multiple orientations. These structures relate Gabor Filters to wavelets [103].
The filters having real and imaginary components represent the orthogonal directions. The equations are shown below [102]:

Ψ_ω,θ(a, b) = (1 / (2π σ_a σ_b)) exp[−(1/2)(a′²/σ_a² + b′²/σ_b²)] exp(jωa′) (3)

a′ = a cos θ + b sin θ (4)

b′ = −a sin θ + b cos θ (5)

where (a, b) is the pixel position in the spatial domain, θ is the orientation of the Gabor filter, ω is the radial central frequency, and σ is the standard deviation of the Gaussian Filter, which means it controls the size of the Gabor envelope.
Liu et al. [87] proposed in their paper the local Gabor filter bank LG (m × n), which spreads all over. It contains the multi-scale information of features of a global filter or of the image, and it also reduces the redundancy in eigenvalues. This reduces the time for extracting the features. [84] used a Gabor Filter bank with eight orientations and five scales. The formed bank is used to filter each of the generated divided images 40 times. This created a computational burden, so they reduced the number of features with dimension reduction techniques. Dimension reduction using PCA is explained under the section on PCA.

3) DISCRETE COSINE TRANSFORM (DCT)
DCT represents a finite sequence of data or feature points as a sum of cosine functions oscillating at different frequencies. It is a way of compressing the data/2D-image without losing its original meaning [104]. In such types of applications (data compression), an input of 8 × 8 size is used for DCT [105] for feature extraction. It has two stages:
• The first stage is to apply DCT on the image.
• The second step is the selection of co-efficients [106]. By applying DCT on a U × V image, a 2D U × V co-efficient matrix is formed.

G_x(0) = (√2 / M) Σ_{m=0}^{M−1} X(m) (6)

G_x(k) = (2 / M) Σ_{m=0}^{M−1} X(m) cos((2m + 1)kπ / 2M), (7)

where k = 1, 2, . . . , (M − 1) (8)

where G_x(k) is the kth DCT co-efficient [107].
Jayalekshmi et al. [75] in their work used DCT over fixed discrete sequences to convert the data into elementary frequency components.

4) SCALE INVARIANT FEATURE TRANSFORM (SIFT)
It transforms the input image data into scale-invariant coordinates relative to the local features, which are stored in a database [108]. SIFT features are highly distinctive, i.e., a single feature can be matched with a large probability against the database. SIFT features are also scale- and rotation-invariant, which means that even if we scale or rotate the image, the features remain preserved. This is useful in FER when a rotated image comes as an input: if we rotate the image, the features are maintained, and we can retrieve them efficiently [109]. The stages of the SIFT procedure are as follows.
• Scale-Space Extrema Detection: The first stage searches all image locations and scales using the Gaussian method.
• Keypoint Localization: It determines the location and scale at each point of the image.
• Orientation Assignment: Rotation invariance is achieved by assigning one or more dominant orientations to each key point.
• Keypoint Descriptor: A descriptor is made to represent each keypoint, which supports the orientation assigned in the preceding stage. It supports the histogram of the gradient within the image. The changes in illumination are scaled back by the descriptor to the key point. When the keypoint descriptors are obtained, they are often used as features or keypoints to solve various problems. More detailed information on SIFT computation can be found in [110], [111].
Kravets et al. [90] proposed the P-SIFT (Parallel SIFT) algorithm, which reduces the computation time and increases the processing speed. In P-SIFT, the problem is divided into sub-tasks, and multiple processors are used for feature extraction. The program reads the input image and generates the key points. After that, it matches the key points with respect to each image in the database. Matching is done using Euclidean distance. The images for which the ratio of the first least distance to the second least distance is less than 0.8 are taken into account.
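The matching rule above (accept a match only when the ratio of the nearest to the second-nearest Euclidean distance falls below 0.8) can be sketched as follows, with short toy vectors standing in for real 128-dimensional SIFT descriptors:

```python
import math

def euclidean(p, q):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def ratio_test_match(query, candidates, threshold=0.8):
    """Return the index of the best match in `candidates`, or None if the
    nearest/second-nearest distance ratio fails the test (ambiguous match)."""
    dists = sorted((euclidean(query, c), i) for i, c in enumerate(candidates))
    (d1, best), (d2, _) = dists[0], dists[1]
    if d2 > 0 and d1 / d2 < threshold:
        return best
    return None

db = [(0.0, 1.0), (5.0, 5.0), (9.0, 0.0)]
m = ratio_test_match((0.1, 1.1), db)    # clearly closest to db[0]
a = ratio_test_match((2.5, 3.0), db)    # equidistant from db[0] and db[1]
```

The ratio test rejects the second query because its two nearest candidates are equally far away, which is exactly the kind of ambiguous correspondence the 0.8 threshold is meant to filter out.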
active feature in a pooling region [134]. The max pooling function is as follows:

a_j = max_{N×N} (a_i^{n×n} u(n, n)) (11)

2) Average Pooling: It applies a window function u(x, y) to the input data and selects the average value for each input data point on the preceding layer's feature map [135], [136]:

act_i = (1 / (M × M)) Σ_{j=1}^{M×M} x_j (12)

Mostly, 2 × 2 pooling is used without overlapping, which means that M in the above equation is 2. Large pooling has an M value of 4, 8, or 16, which always depends on the input image size. So, [136], in their paper, proposed a Multi-activation pooling method in order to satisfy the need for a large pooling region. This method allows the top-p activations to pass through at the pooling rate. Here p indicates the total number of picked activations. If p = M × M, then every activation in the computation contributes to the final output of the neuron [136]. For the random pooling region X_i, we denote the nth-picked
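Equations (11) and (12) amount to sliding a non-overlapping M × M window over the feature map and taking either the maximum or the mean; a minimal sketch for the common 2 × 2 case:

```python
def pool2d(fmap, m=2, mode="max"):
    """Non-overlapping m x m pooling over a 2D feature map (Eq. 11/12)."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for y in range(0, h - m + 1, m):
        row = []
        for x in range(0, w - m + 1, m):
            window = [fmap[y + i][x + j] for i in range(m) for j in range(m)]
            row.append(max(window) if mode == "max"
                       else sum(window) / (m * m))   # Eq. (12): average
        out.append(row)
    return out

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 8]]
```

Max pooling keeps only the strongest activation per window, while average pooling summarizes all M × M activations, which is why the multi-activation method above interpolates between the two by letting the top-p activations contribute.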
softmax function at the output layer. They used batch normalization [154] with each convolutional layer as well as the FC layers. To preserve the data, they used zero-padding in the convolutional layers. They achieved an accuracy of approx. 72.249% with an ensemble of 9 networks.
To solve the problem of poor performance in real applications, caused because stored facial images most of the time show an expression not as a single emotion but as multiple emotions, Gan et al. [126] designed an approach using a CNN and soft labels that associate an image with multiple emotions and expressions. They obtained the soft labels using a constructor involving a 2-step scheme:
1) The initial step is to train a CNN model with hard data labels for supervision and the softmax function for optimization.
2) The second step is to fuse the prediction probabilities of the pre-trained models to get the soft labels [126].
Their architecture is similar to VGG16; however, the last FC layer is adjusted to C-way outputs, where the quantity C is the number of emotion classes.
Zadeh et al. [127] proposed a DL model having a CNN layer and 2 Gabor Filters to classify different human sentiments. This model uses a feature selection method called the Gabor Filter, which is commonly applied for texture outlines. It returns where there is any texture change in the image. Then these features are fed to a CNN (Convolutional Neural Network) for the classification of human sentiment. Their model has the following stages: input images, resize, 1st Gabor Filter, 2nd Gabor Filter, CNN layer, and classification of sentiments. They tested their model on the JAFFE dataset. They also compared the dataset classification using a simple CNN and their model (CNN with 2 Gabor Filters). They trained them for 30 epochs and got an accuracy of 91.16% on the simple CNN and 97.16% on their model [127].

2) SUPPORT VECTOR MACHINE (SVM)
SVM is a classifier [155] that was designed for classifying between two classes. If the SVM has more than two classes, then more than one SVM is to be implemented. There are three methods by which we can implement SVM for more than two classes.
• One versus all: It was proposed in [156]. It constructs k SVM models for training data having k classes. If there are three classes, then SVM is performed three times, once for every class [157].
• One versus one: It was introduced in [156]. This method constructs k(k − 1)/2 classifiers, where two classes at a time are taken to train the model. In this method, SVM is performed between every pair of classes to be classified [157].
• Directed Acyclic Graph SVM (DAGSVM): It was proposed in [158]. Its training phase is similar to the one-versus-one method. The testing phase makes use of a rooted binary DAG having at most k(k − 1)/2 internal nodes and at most k leaves. An advantage of using a DAGSVM is that it generalizes the analysis [158].
The aim of SVM is to identify the maximum-margin plane between the classes. The maximum-margin plane can be obtained from the maximum distance between the positive and the negative margin planes of the two classes, respectively. The distance between the separating plane and the positive margin plane should be equal on both sides.
To solve the problem of recognizing emotions from facial expressions in a simple and speedy manner, Datta et al. [122] presented a classification system that used the concatenation of geometric as well as texture-based features to classify the emotions using SVMs. They used a hierarchical SVM architecture to leverage the benefits of multi-class binary classification. The CK+ dataset was used for classification. They achieved significant enhancements in accuracy using the hybrid SVM features compared to LBP features.
Nuno Lopes et al. [78] in 2018 gave a classification model for FER in the elderly and also presented the differences in FER between the elderly and people of other ages. They used the Support Vector Machine with multi-class classification for classifying the emotions [159]. They proposed two architectures: the first approach removes the wrinkles, nasolabial fold, and other facial features using edge-preserving smoothing techniques, while in the second architecture, they introduced an algorithm from a Microsoft API, which detects the age of the person. The lifespan dataset was used to train and test the multi-class SVM. They used 80% of the images to train the SVM and 20% to test the accuracy of the application. They got an accuracy of 95.24% in the young age group and an accuracy of 90.32% in the elderly age group.
SVM is a linear classifier that can be applied to linearly separable data. But SVM can also take high-dimensional data as input, which most of the time is non-linear data. So a non-linear mapping function is applied in SVM training, which converts the data into linearly separable data in a higher dimension. This function is called a kernel function. There are various kernel functions, but [85], in their paper, used the Radial Basis Function (RBF) kernel. They used the one-versus-one approach in that paper.
Ibrahim Adeyanju et al. [118] proposed a method in which four SVM kernels are used to classify different emotions of faces. They used Radial Basis, Polynomial, Linear, and Quadratic functions as SVM kernels. They tested their model on 467 training and 238 test sets to classify 7 emotions. They got a maximum average accuracy of 86.4% on the RBF kernel, 99.33% on the Quadratic function, 97.65% on the Polynomial, and 97.86% on the Linear kernel.

3) ARTIFICIAL NEURAL NETWORK (ANN)
ANN is inspired by the biological neural networks that constitute the brain [160]. Our brain consists of millions of neurons that form a neural network. These neurons are interconnected with each other and process the signals to/from the brain to the other parts of our body [161]. This type of link is called a synapse. There are approximately 100 billion neurons, interconnected by thousands or more synapses. In ANN, the signal is a real or binary number, and the output of these
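The one-versus-one scheme above trains k(k − 1)/2 pairwise classifiers and lets them vote at test time; this can be sketched as follows, with a hypothetical stand-in pairwise rule in place of real trained SVMs:

```python
from itertools import combinations

def one_vs_one_predict(x, classes, pairwise_predict):
    """Vote among k(k-1)/2 pairwise classifiers; return the winning class.
    pairwise_predict(x, a, b) must return either a or b."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):     # all k(k-1)/2 pairs
        votes[pairwise_predict(x, a, b)] += 1
    return max(classes, key=lambda c: votes[c])

# Stand-in pairwise rule: nearest 1D class center (hypothetical, not an SVM)
centers = {"happy": 0.0, "sad": 1.0, "angry": 2.0}
def nearest_of_pair(x, a, b):
    return a if abs(x - centers[a]) <= abs(x - centers[b]) else b

label = one_vs_one_predict(0.2, list(centers), nearest_of_one := nearest_of_pair)
n_pairs = len(list(combinations(centers, 2)))   # k = 3 -> 3 classifiers
```

With k = 3 emotion classes, exactly k(k − 1)/2 = 3 pairwise classifiers are consulted, matching the count stated for the one-versus-one method.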
The results of their model were better than most of the other
strategies.
FIGURE 12 shows the comparative analysis of the accu-
racies of various state-of-the-art approaches on different
datasets.
should include images of people of all age groups, as different age groups exhibit emotions differently. There are datasets that have images of a particular age group, but no dataset has a mixture of all the age groups [7]. Such a dataset, if developed, would assist research on cross-age, cross-culture, and cross-gender FER.

C. FER ON 3D DATA
The current research mainly focuses on 2D FER data, which faces challenges from illumination factors and pose variations [168]. 3D face shape models are naturally robust to pose variations and illumination factors. [169], in their work, proposed a CNN without facial landmark detection, which estimates expression coefficients from image intensities. Recently, many works have been proposed which combine both 2D and 3D data to improve the accuracy.

D. DIFFERENT MODALITIES IN FER
Facial Expression is only one modality that can be used to recognize human behavior. The combination of other patterns, like infrared images, information captured from 3D models, and physiological data, is a trending research area due to their large complementarity. Reference [170] employed various multi-modal affect recognition techniques.

E. FER ON INFRARED DATA
At present, gray-scale and RGB colors are the trend in deep FER, but they are more vulnerable to lighting effects. However, infrared images record the emotions produced by the skin distribution, which are not sensitive to illumination variations. In 2017, Wu et al. [171] gave a 3D CNN architecture to fuse spatial and temporal features in FER images.

F. VISUALIZATION TECHNIQUES
Adding visualization techniques [172] over the CNN model results in a quantitative analysis of how it contributes to the visualization-based rules of FER and also figures out which part of the face has more discerning information. Its results indicate the activations of filters with a strong correlation to the face mark regions which correspond to a particular Action Unit. In 2016, Mousavi et al. [173] used the concept of visualization techniques and proposed a new visualization technique, LIPNet.

G. OTHER ISSUES
Various other issues have arisen based on the prototypical expression categories, namely the real-versus-fake emotion recognition challenge and the complementary emotion recognition problem. Also, apps for real-time FER are still a challenging task [174]. Many DL techniques have been applied to the above problems.

VII. CONCLUSION
This paper presents a detailed systematic survey to analyze current state-of-the-art approaches for facial emotion recognition in static images and the various parameters that influence the results of these approaches. We have developed a taxonomy based on the different methods used for face detection, feature extraction, and emotion classification. The various facial expression databases used as input for FER are discussed. We have reviewed previous works in this field and concluded that much work has been done in it. We have compared the various detection, extraction, and classification approaches and indicated which approaches are more prominent in achieving better performance with the available computational power. By discussing the current issues and research challenges for the future, we concluded that there is still much research needed in this field, such as FER with 3D face shape models, recognizing emotion in images under occlusion, etc. Real-time FER is still a challenging task. In the future, we would like to survey the FER problem in videos using more advanced DL techniques.
[48] S. Tanwar, T. Ramani, and S. Tyagi, ‘‘Dimensionality reduction using PCA and SVD in big data: A comparative case study,’’ in Future Internet Technologies and Trends, Z. Patel and S. Gupta, Eds. Cham, Switzerland: Springer, 2018, pp. 116–125.
[49] G. Kumar and P. K. Bhatia, ‘‘A detailed review of feature extraction in image processing systems,’’ in Proc. 4th Int. Conf. Adv. Comput. Commun. Technol., Feb. 2014, pp. 5–12.
[50] N. Janu, S. Kumar, and P. Mathur, ‘‘Performance analysis of feature extraction techniques for facial expression recognition,’’ Int. J. Comput. Appl., vol. 166, no. 1, pp. 1–3, 2017.
[51] K. Cho and S. M. Dunn, ‘‘Learning shape classes,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 16, no. 9, pp. 882–888, Sep. 1994.
[52] K.-C. Song, Y.-H. Yan, W.-H. Chen, and X. Zhang, ‘‘Research and perspective on local binary pattern,’’ Acta Automatica Sinica, vol. 39, no. 6, pp. 730–744, Mar. 2014.
[53] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, ‘‘Gradient-based learning applied to document recognition,’’ Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[54] R. S. Jadhav and P. Ghadekar, ‘‘Content based facial emotion recognition model using machine learning algorithm,’’ in Proc. Int. Conf. Adv. Comput. Telecommun. (ICACAT), Dec. 2018, pp. 1–5.
[55] B. Kitchenham, O. Pearl Brereton, D. Budgen, M. Turner, J. Bailey, and S. Linkman, ‘‘Systematic literature reviews in software engineering—A systematic literature review,’’ Inf. Softw. Technol., vol. 51, no. 1, pp. 7–15, Jan. 2009.
[56] P. Mehta, R. Gupta, and S. Tanwar, ‘‘Blockchain envisioned UAV networks: Challenges, solutions, and comparisons,’’ Comput. Commun., vol. 151, pp. 518–538, Feb. 2020.
[57] B. Kitchenham and S. Charters, ‘‘Guidelines for performing systematic literature reviews in software engineering,’’ School Comput. Sci. Math., Keele Univ., Keele, U.K., Tech. Rep. EBSE-2007-01, 2007.
[58] Japanese Female Facial Expressions (JAFFE). Accessed: 1998. [Online]. Available: [Link]
[59] Extended Cohn-Kanade (CK+). Accessed: 2008. [Online]. Available: [Link]
[60] MMI. Accessed: 2005. [Online]. Available: [Link]
[61] Oulu-CASIA. Accessed: Nov. 17, 2011. [Online]. Available: [Link]/CMV/Downloads/Oulu-CASIA/
[62] Multi-PIE. Accessed: Oct. 2009. [Online]. Available: [Link]edu/afs/cs/project/PIE/MultiPie/Multi-Pie/[Link]
[63] Multimedia Understanding Group (MUG). Accessed: Apr. 2010. [Online]. Available: [Link]
[64] Toronto Faces Dataset (TFD). Accessed: Apr. 2005. [Online]. Available: [Link]
[65] Radboud Faces Database (RaFD). Accessed: 2011. [Online]. Available: [Link]
[66] FER-2013. Accessed: 2013. [Online]. Available: [Link]c/challenges-in-representation-learning-facial-expression-recognition-challenge/data
[67] A. Dhall, R. Goecke, S. Lucey, and T. Gedeon, ‘‘Static facial expression analysis in tough conditions: Data, evaluation protocol and benchmark,’’ in Proc. IEEE Int. Conf. Comput. Vis. Workshops (ICCV Workshops), Nov. 2011, pp. 2106–2112.
[68] SFEW (EmotiW). Accessed: 2012. [Online]. Available: [Link]
[69] A. Mollahosseini, B. Hasani, and M. H. Mahoor, ‘‘AffectNet: A database for facial expression, valence, and arousal computing in the wild,’’ IEEE Trans. Affect. Comput., vol. 10, no. 1, pp. 18–31, Jan. 2019.
[70] AffectNet. Accessed: 2017. [Online]. Available: [Link]com/affectnet/
[71] R. Kosti, J. Alvarez, A. Recasens, and A. Lapedriza, ‘‘Context based emotion recognition using EMOTIC dataset,’’ IEEE Trans. Pattern Anal. Mach. Intell., early access, May 14, 2019, doi: 10.1109/TPAMI.2019.2916866.
[72] Context is Important to Recognize Emotions. Accessed: 2020. [Online]. Available: [Link]
[73] M. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, ‘‘Coding facial expressions with Gabor wavelets,’’ in Proc. 3rd IEEE Int. Conf. Autom. Face Gesture Recognit., Apr. 1998, pp. 200–205.
[74] M. F. Valstar and M. Pantic, ‘‘Induced disgust, happiness and surprise: An addition to the MMI facial expression database,’’ Tech. Rep., 2010.
[75] J. Jayalekshmi and T. Mathew, ‘‘Facial expression recognition and emotion classification system for sentiment analysis,’’ in Proc. Int. Conf. Netw. Adv. Comput. Technol. (NetACT), Jul. 2017, pp. 1–8.
[76] H. Li, Z. Lin, X. Shen, J. Brandt, and G. Hua, ‘‘A convolutional neural network cascade for face detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 5325–5334.
[77] H. Ding, S. K. Zhou, and R. Chellappa, ‘‘FaceNet2ExpNet: Regularizing a deep face recognition net for expression recognition,’’ in Proc. 12th IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), May 2017, pp. 118–126.
[78] N. Lopes, A. Silva, S. R. Khanal, A. Reis, J. Barroso, V. Filipe, and J. Sampaio, ‘‘Facial emotion recognition in the elderly using a SVM classifier,’’ in Proc. 2nd Int. Conf. Technol. Innov. Sports, Health Wellbeing (TISHW), Thessaloniki, Greece, Jun. 2018, pp. 1–5.
[79] M. N. Chaudhari, M. Deshmukh, G. Ramrakhiani, and R. Parvatikar, ‘‘Face detection using viola jones algorithm and neural networks,’’ in Proc. 4th Int. Conf. Comput. Commun. Control Autom. (ICCUBEA), Aug. 2018, pp. 1–6.
[80] N. B. Kar, K. S. Babu, A. K. Sangaiah, and S. Bakshi, ‘‘Face expression recognition system based on ripplet transform type II and least square SVM,’’ Multimedia Tools Appl., vol. 78, no. 4, pp. 4789–4812, Feb. 2019.
[81] H. M. Shah, A. Dinesh, and T. S. Sharmila, ‘‘Analysis of facial landmark features to determine the best subset for finding face orientation,’’ in Proc. Int. Conf. Comput. Intell. Data Sci. (ICCIDS), Feb. 2019, pp. 1–4.
[82] Y.-Q. Wang, ‘‘An analysis of the viola-jones face detection algorithm,’’ Image Process. Line, vol. 4, pp. 128–148, Jun. 2014.
[83] W.-Y. Lu and M. Yang, ‘‘Face detection based on viola-jones algorithm applying composite features,’’ in Proc. Int. Conf. Robots Intell. Syst. (ICRIS), Jun. 2019, pp. 82–85.
[84] B. Islam, F. Mahmud, and A. Hossain, ‘‘Facial expression region segmentation based approach to emotion recognition using 2D Gabor filter and multiclass support vector machine,’’ in Proc. 21st Int. Conf. Comput. Inf. Technol. (ICCIT), Dec. 2018, pp. 1–6.
[85] Y. Luo, C.-M. Wu, and Y. Zhang, ‘‘Facial expression recognition based on fusion feature of PCA and LBP with SVM,’’ Optik Int. J. Light Electron Opt., vol. 124, no. 17, pp. 2767–2770, Sep. 2013.
[86] Carnap, Hilbert, Ackermann, Russell, and Whitehead, ‘‘A logical calculus of the ideas immanent in nervous activity,’’ Tech. Rep., Jan. 1970.
[87] S.-S. Liu and Y.-T. Tian, ‘‘Facial expression recognition method based on Gabor wavelet features and fractional power polynomial kernel PCA,’’ in Advances in Neural Networks—ISNN, L. Zhang, B.-L. Lu, and J. Kwok, Eds. Berlin, Germany: Springer, 2010, pp. 144–151.
[88] H.-F. Huang and S.-C. Tai, ‘‘Facial expression recognition using new feature extraction algorithm,’’ ELCVIA Electron. Lett. Comput. Vis. Image Anal., vol. 11, no. 1, p. 41, 2012.
[89] S. Biswas and J. Sil, ‘‘Facial expression recognition using modified local binary pattern,’’ in Computational Intelligence in Data Mining, vol. 2, L. C. Jain, H. S. Behera, J. K. Mandal, and D. P. Mohapatra, Eds. New Delhi, India: Springer, 2015, pp. 595–604.
[90] S. Chickerur, T. Reddy, and O. Shabalina, ‘‘Parallel scale invariant feature transform based approach for facial expression recognition,’’ in Creativity in Intelligent Technologies and Data Science, A. Kravets, M. Shcherbakov, M. Kultsova, and O. Shabalina, Eds. Cham, Switzerland: Springer, 2015, pp. 621–636.
[91] A. Mollahosseini, D. Chan, and M. H. Mahoor, ‘‘Going deeper in facial expression recognition using deep neural networks,’’ in Proc. IEEE Winter Conf. Appl. Comput. Vis. (WACV), Lake Placid, NY, USA, Mar. 2016, pp. 1–10.
[92] N. Mehta and S. Jadhav, ‘‘Facial emotion recognition using log Gabor filter and PCA,’’ in Proc. Int. Conf. Comput. Commun. Control Autom. (ICCUBEA), Aug. 2016, pp. 1–5.
[93] M. Sajjad, A. Shah, Z. Jan, S. I. Shah, S. W. Baik, and I. Mehmood, ‘‘Facial appearance and texture feature-based robust facial expression recognition framework for sentiment knowledge discovery,’’ Cluster Comput., vol. 21, no. 1, pp. 549–567, Mar. 2018.
[94] A. Srivastava, S. Mane, A. Shah, N. Shrivastava, and B. Thakare, ‘‘A survey of face detection algorithms,’’ in Proc. Int. Conf. Inventive Syst. Control (ICISC), Jan. 2017, pp. 1–4.
[95] R. Ravi, S. V. Yadhukrishna, and R. Prithviraj, ‘‘A face expression recognition using CNN & LBP,’’ in Proc. 4th Int. Conf. Comput. Methodologies Commun. (ICCMC), Mar. 2020, pp. 684–689.
[96] A. L. A. Ramos, B. G. Dadiz, and A. B. G. Santos, ‘‘Classifying emotion based on facial expression analysis using Gabor filter: A basis for adaptive effective teaching strategy,’’ in Computational Science and Technology. Singapore: Springer, 2020, pp. 469–479.
[97] T. Ojala, M. Pietikainen, and T. Maenpaa, ‘‘Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971–987, Jul. 2002.
[98] V. Takala, T. Ahonen, and M. Pietikäinen, ‘‘Block-based methods for image retrieval using local binary patterns,’’ in Image Analysis, H. Kalviainen, J. Parkkinen, and A. Kaarna, Eds. Berlin, Germany: Springer, 2005, pp. 882–891.
[99] T. Ahonen, A. Hadid, and M. Pietikainen, ‘‘Face description with local binary patterns: Application to face recognition,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 12, pp. 2037–2041, Dec. 2006.
[100] M. Guo, X. Hou, Y. Ma, and X. Wu, ‘‘Facial expression recognition using ELBP based on covariance matrix transform in KLT,’’ Multimedia Tools Appl., vol. 76, no. 2, pp. 2995–3010, Jan. 2017.
[101] D. Huang, C. Shan, M. Ardabilian, Y. Wang, and L. Chen, ‘‘Local binary patterns and its application to facial image analysis: A survey,’’ IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 41, no. 6, pp. 765–781, Nov. 2011.
[102] K. Verma and A. Khunteta, ‘‘Facial expression recognition using Gabor filter and multi-layer artificial neural network,’’ in Proc. Int. Conf. Inf., Commun., Instrum. Control (ICICIC), Aug. 2017, pp. 1–5.
[103] J. Ilonen, J. Kämäräinen, and H. Kälviäinen, ‘‘Efficient computation of Gabor,’’ Dept. Inf. Technol., Lappeenranta Univ. Technol., Lappeenranta, Finland, Res. Rep. 100.
[104] A. B. Watson, ‘‘Image compression using the discrete cosine transform,’’ Math. J., vol. 4, no. 1, p. 81, 1994.
[105] E. Feig and S. Winograd, ‘‘Fast algorithms for the discrete cosine transform,’’ IEEE Trans. Signal Process., vol. 40, no. 9, pp. 2174–2193, Sep. 1992.
[106] S. Dabbaghchian, A. Aghagolzadeh, and M. S. Moin, ‘‘Feature extraction using discrete cosine transform for face recognition,’’ in Proc. 9th Int. Symp. Signal Process. Appl., Feb. 2007, pp. 1–4.
[107] N. Ahmed, T. Natarajan, and K. R. Rao, ‘‘Discrete cosine transform,’’ IEEE Trans. Comput., vol. 100, no. 1, pp. 90–93, 1974.
[108] D. G. Lowe, ‘‘Distinctive image features from scale-invariant keypoints,’’ Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, Nov. 2004.
[109] D. G. Lowe, ‘‘Object recognition from local scale-invariant features,’’ in Proc. 7th IEEE Int. Conf. Comput. Vis., Sep. 1999, pp. 1150–1157.
[110] N. Dalal and B. Triggs, ‘‘Histograms of oriented gradients for human detection,’’ in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2005, pp. 886–893.
[111] T. Nguyen, E.-A. Park, J. Han, D.-C. Park, and S.-Y. Min, ‘‘Object detection using scale invariant feature transform,’’ in Genetic and Evolutionary Computing. Cham, Switzerland: Springer, 2014, pp. 65–72.
[112] S. Tanwar, J. Vora, S. Kaneriya, S. Tyagi, N. Kumar, V. Sharma, and I. You, ‘‘Human arthritis analysis in fog computing environment using Bayesian network classifier and thread protocol,’’ IEEE Consum. Electron. Mag., vol. 9, no. 1, pp. 88–94, Jan. 2020.
[113] X. Xiong and F. De la Torre, ‘‘Supervised descent method and its applications to face alignment,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 532–539.
[114] H. Bay, T. Tuytelaars, and L. Van Gool, ‘‘Surf: Speeded up robust features,’’ in Proc. Eur. Conf. Comput. Vis. Berlin, Germany: Springer, 2006, pp. 404–417.
[115] B. Islam, F. Mahmud, A. Hossain, P. B. Goala, and M. S. Mia, ‘‘A facial region segmentation based approach to recognize human emotion using fusion of HOG & LBP features and artificial neural network,’’ in Proc. 4th Int. Conf. Electr. Eng. Inf. Commun. Technol. (iCEEiCT), Sep. 2018, pp. 642–646.
[116] P. Liu, S. Han, Z. Meng, and Y. Tong, ‘‘Facial expression recognition via a boosted deep belief network,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 1805–1812.
[117] Y. Lv, Z. Feng, and C. Xu, ‘‘Facial expression recognition via deep learning,’’ in Proc. Int. Conf. Smart Comput., Nov. 2014, pp. 303–308.
[118] I. A. Adeyanju, E. O. Omidiora, and O. F. Oyedokun, ‘‘Performance evaluation of different support vector machine kernels for face emotion recognition,’’ in Proc. SAI Intell. Syst. Conf. (IntelliSys), Nov. 2015, pp. 804–806.
[119] K. Talele, A. Shirsat, T. Uplenchwar, and K. Tuckley, ‘‘Facial expression recognition using general regression neural network,’’ in Proc. IEEE Bombay Sect. Symp. (IBSS), Dec. 2016, pp. 1–6.
[120] P. Khorrami, T. L. Paine, and T. S. Huang, ‘‘Do deep neural networks learn facial action units when doing expression recognition?’’ in Proc. IEEE Int. Conf. Comput. Vis. Workshop (ICCVW), Dec. 2015, pp. 19–27.
[121] G. Wen, Z. Hou, H. Li, D. Li, L. Jiang, and E. Xun, ‘‘Ensemble of deep neural networks with probability-based fusion for facial expression recognition,’’ Cognit. Comput., vol. 9, no. 5, pp. 597–610, Oct. 2017.
[122] S. Datta, D. Sen, and R. Balasubramanian, ‘‘Integrating geometric and textural features for facial emotion classification using SVM frameworks,’’ in Proc. CVIP, 2016, pp. 619–628.
[123] J. Cai, Z. Meng, A. S. Khan, Z. Li, J. O'Reilly, and Y. Tong, ‘‘Island loss for learning discriminative features in facial expression recognition,’’ in Proc. 13th IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), May 2018, pp. 302–309.
[124] P. Dhankhar, ‘‘Resnet-50 and VGG-16 for recognizing facial emotions,’’ Int. J. Innov. Eng. Technol., vol. 13, no. 4, pp. 126–130, 2019.
[125] A. Renda, M. Barsacchi, A. Bechini, and F. Marcelloni, ‘‘Comparing ensemble strategies for deep learning: An application to facial expression recognition,’’ Expert Syst. Appl., vol. 136, pp. 1–11, Dec. 2019.
[126] Y. Gan, J. Chen, and L. Xu, ‘‘Facial expression recognition boosted by soft label with a diverse ensemble,’’ Pattern Recognit. Lett., vol. 125, pp. 105–112, Jul. 2019.
[127] M. M. Taghi Zadeh, M. Imani, and B. Majidi, ‘‘Fast facial emotion recognition using convolutional neural networks and Gabor filters,’’ in Proc. 5th Conf. Knowl. Based Eng. Innov. (KBEI), Feb. 2019, pp. 577–581.
[128] A. Rajendra Kurup, M. Ajith, and M. Martínez Ramón, ‘‘Semi-supervised facial expression recognition using reduced spatial features and deep belief networks,’’ Neurocomputing, vol. 367, pp. 188–197, Nov. 2019.
[129] E. Pranav, S. Kamal, C. Satheesh Chandran, and M. H. Supriya, ‘‘Facial emotion recognition using deep convolutional neural network,’’ in Proc. 6th Int. Conf. Adv. Comput. Commun. Syst. (ICACCS), Mar. 2020, pp. 317–320.
[130] R. Gupta, S. Tanwar, S. Tyagi, and N. Kumar, ‘‘Machine learning models for secure data analytics: A taxonomy and threat model,’’ Comput. Commun., vol. 153, pp. 406–440, Mar. 2020.
[131] M. Mathieu, M. Henaff, and Y. LeCun, ‘‘Fast training of convolutional networks through FFTs,’’ 2013, arXiv:1312.5851. [Online]. Available: [Link]
[132] S. Anwar, K. Hwang, and W. Sung, ‘‘Structured pruning of deep convolutional neural networks,’’ ACM J. Emerg. Technol. Comput. Syst., vol. 13, no. 3, pp. 1–18, May 2017.
[133] D. Mungra, A. Agrawal, P. Sharma, S. Tanwar, and M. S. Obaidat, ‘‘PRATIT: A CNN-based emotion recognition system using histogram equalization and data augmentation,’’ Multimedia Tools Appl., vol. 79, nos. 3–4, pp. 2285–2307, Jan. 2020.
[134] D. Scherer, A. C. Müller, and S. Behnke, ‘‘Evaluation of pooling operations in convolutional architectures for object recognition,’’ in Proc. ICANN, 2010, pp. 92–101.
[135] M. D. Zeiler and R. Fergus, ‘‘Stochastic pooling for regularization of deep convolutional neural networks,’’ 2013, arXiv:1301.3557. [Online]. Available: [Link]
[136] Q. Zhao, S. Lyu, B. Zhang, and W. Feng, ‘‘Multiactivation pooling method in convolutional neural networks for image recognition,’’ Wireless Commun. Mobile Comput., vol. 2018, Jun. 2018, Art. no. 8196906.
[137] C.-L. Zhang, J.-H. Luo, X.-S. Wei, and J. Wu, ‘‘In defense of fully connected layers in visual representation transfer,’’ in Proc. PCM, 2017, pp. 807–817.
[138] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ‘‘ImageNet classification with deep convolutional neural networks,’’ in Proc. NIPS, 2012, pp. 1097–1105.
[139] S. Albawi, T. A. Mohammed, and S. Al-Zawi, ‘‘Understanding of a convolutional neural network,’’ in Proc. Int. Conf. Eng. Technol. (ICET), Aug. 2017, pp. 1–6.
[140] Y. Sun, W. Zhang, H. Gu, C. Liu, S. Hong, W. Xu, J. Yang, and G. Gui, ‘‘Convolutional neural network based models for improving super-resolution imaging,’’ IEEE Access, vol. 7, pp. 43042–43051, 2019.
[141] R. Memisevic, K. R. Konda, and D. Krueger, ‘‘Zero-bias autoencoders and the benefits of co-adapting features,’’ 2014, arXiv:1402.3337. [Online]. Available: [Link]
[142] T. L. Paine, P. Khorrami, W. Han, and T. S. Huang, ‘‘An analysis of unsupervised pre-training in light of recent advances,’’ 2014, arXiv:1412.6597. [Online]. Available: [Link]
[143] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ‘‘ImageNet classification with deep convolutional neural networks,’’ Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017.
[144] A. Gudi, ‘‘Recognizing semantic features in faces using deep learning,’’ 2015, arXiv:1512.00743. [Online]. Available: [Link]
[145] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for large-scale image recognition,’’ 2014, arXiv:1409.1556. [Online]. Available: [Link]
[146] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[147] D. Eigen, J. T. Rolfe, R. Fergus, and Y. LeCun, ‘‘Understanding deep architectures using a recursive convolutional network,’’ 2013, arXiv:1312.1847. [Online]. Available: [Link]
[148] G. Sandbach, S. Zafeiriou, M. Pantic, and L. Yin, ‘‘Static and dynamic 3D facial expression recognition: A comprehensive survey,’’ Image Vis. Comput., vol. 30, no. 10, pp. 683–697, Oct. 2012.
[149] S. Wan and J. K. Aggarwal, ‘‘Spontaneous facial expression recognition: A robust metric learning approach,’’ Pattern Recognit., vol. 47, no. 5, pp. 1859–1868, May 2014.
[150] P. Thakkar, K. Varma, V. Ukani, S. Mankad, and S. Tanwar, ‘‘Combining user-based and item-based collaborative filtering using machine learning,’’ in Information and Communication Technology for Intelligent Systems, S. C. Satapathy and A. Joshi, Eds. Singapore: Springer, 2019, pp. 173–180.
[151] S. Kaneriya, S. Tanwar, S. Buddhadev, J. P. Verma, S. Tyagi, N. Kumar, and S. Misra, ‘‘A range-based approach for long-term forecast of weather using probabilistic Markov model,’’ in Proc. IEEE Int. Conf. Commun. Workshops (ICC Workshops), May 2018, pp. 1–6.
[152] R. Lysiak, M. Kurzynski, and T. Woloszynski, ‘‘Optimal selection of ensemble classifiers using measures of competence and diversity of base classifiers,’’ Neurocomputing, vol. 126, pp. 29–35, Feb. 2014.
[153] B.-K. Kim, S.-Y. Dong, J. Roh, G. Kim, and S.-Y. Lee, ‘‘Fusing aligned and non-aligned face information for automatic affect recognition in the wild: A deep learning approach,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2016, pp. 1499–1508.
[154] S. Ioffe and C. Szegedy, ‘‘Batch normalization: Accelerating deep network training by reducing internal covariate shift,’’ 2015, arXiv:1502.03167. [Online]. Available: [Link]
[155] V. Vapnik, The Nature of Statistical Learning Theory. New York, NY, USA: Springer, 2013.
[156] S. Knerr, L. Personnaz, and G. Dreyfus, ‘‘Single-layer learning revisited: A stepwise procedure for building and training a neural network,’’ in Neurocomputing (NATO ASI Series), vol. F68, F. F. Soulié and J. Hérault, Eds. Berlin, Germany: Springer-Verlag, 1990, pp. 41–50.
[157] C.-W. Hsu and C.-J. Lin, ‘‘A comparison of methods for multiclass support vector machines,’’ IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 415–425, Mar. 2002.
[158] J. C. Platt, N. Cristianini, and J. Shawe-Taylor, ‘‘Large margin dags for multiclass classification,’’ in Proc. NIPS, 1999, pp. 547–553.
[159] S. Tanwar, Q. Bhatia, P. Patel, A. Kumari, P. K. Singh, and W.-C. Hong, ‘‘Machine learning adoption in blockchain-based smart applications: The challenges, and a way forward,’’ IEEE Access, vol. 8, pp. 474–488, 2020.
[160] W. S. McCulloch and W. Pitts, ‘‘A logical calculus of the ideas immanent in nervous activity,’’ Bull. Math. Biol., vol. 52, nos. 1–2, pp. 99–115, Jan. 1990.
[161] R. Gupta, S. Tanwar, S. Tyagi, and N. Kumar, ‘‘Tactile-Internet-based telesurgery system for healthcare 4.0: An architecture, research challenges, and future directions,’’ IEEE Netw., vol. 33, no. 6, pp. 22–29, Nov. 2019.
[162] H. Vachhani, M. S. Obiadat, A. Thakkar, V. Shah, R. Sojitra, J. Bhatia, and S. Tanwar, ‘‘Machine learning based stock market analysis: A short survey,’’ in Innovative Data Communication Technologies and Application, J. S. Raj, A. Bashar, and S. R. J. Ramson, Eds. Cham, Switzerland: Springer, 2020, pp. 12–26.
[163] G. Hinton, ‘‘Deep belief networks,’’ Scholarpedia, vol. 4, no. 5, p. 5947, 2009.
[164] G. E. Hinton, S. Osindero, and Y.-W. Teh, ‘‘A fast learning algorithm for deep belief nets,’’ Neural Comput., vol. 18, no. 7, pp. 1527–1554, Jul. 2006.
[165] M. Schuster and K. K. Paliwal, ‘‘Bidirectional recurrent neural networks,’’ IEEE Trans. Signal Process., vol. 45, no. 11, pp. 2673–2681, Nov. 1997.
[166] A. Mostafa, M. I. Khalil, and H. Abbas, ‘‘Emotion recognition by facial features using recurrent neural networks,’’ in Proc. 13th Int. Conf. Comput. Eng. Syst. (ICCES), Dec. 2018, pp. 417–422.
[167] X. Huang, G. Zhao, W. Zheng, and M. Pietikäinen, ‘‘Towards a dynamic expression recognition system under facial occlusion,’’ Pattern Recognit. Lett., vol. 33, no. 16, pp. 2181–2191, Dec. 2012.
[168] M. Pantic and L. J. M. Rothkrantz, ‘‘Automatic analysis of facial expressions: The state of the art,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1424–1445, Dec. 2000.
[169] F.-J. Chang, A. Tuan Tran, T. Hassner, I. Masi, R. Nevatia, and G. Medioni, ‘‘ExpNet: Landmark-free, deep, 3D facial expressions,’’ in Proc. 13th IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), May 2018, pp. 122–129.
[170] F. Ringeval, M. Pantic, B. Schuller, M. Valstar, J. Gratch, R. Cowie, S. Scherer, S. Mozgai, N. Cummins, and M. Schmitt, ‘‘AVEC 2017: Real-life depression, and affect recognition workshop and challenge,’’ in Proc. 7th Annu. Workshop Audio/Visual Emotion Challenge (AVEC), 2017, pp. 3–9.
[171] Z. Wu, T. Chen, Y. Chen, Z. Zhang, and G. Liu, ‘‘NIRExpNet: Three-stream 3D convolutional neural network for near infrared facial expression recognition,’’ Appl. Sci., vol. 7, no. 11, p. 1184, 2017.
[172] M. D. Zeiler and R. Fergus, ‘‘Visualizing and understanding convolutional networks,’’ in Proc. Eur. Conf. Comput. Vis. Cham, Switzerland: Springer, 2014, pp. 818–833.
[173] N. Mousavi, H. Siqueira, P. Barros, B. Fernandes, and S. Wermter, ‘‘Understanding how deep neural networks learn face expressions,’’ in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2016, pp. 227–234.
[174] I. Song, H.-J. Kim, and P. B. Jeon, ‘‘Deep learning for real-time robust facial expression recognition on a smartphone,’’ in Proc. IEEE Int. Conf. Consum. Electron. (ICCE), Jan. 2014, pp. 564–567.

KEYUR PATEL is currently pursuing the bachelor's degree with Nirma University, Ahmedabad, India. His research interests include computer vision, natural language processing, energy-based models, and reinforcement learning.

DEV MEHTA is currently pursuing the bachelor's degree with Nirma University, Ahmedabad, India. His research interests are machine learning, computer vision, and natural language processing.

CHINMAY MISTRY is currently pursuing the bachelor's degree with Nirma University, Ahmedabad, India. His research interests are machine learning, computer vision, and natural language processing.
RAJESH GUPTA (Student Member, IEEE) received the B.E. degree from the University of Jammu, India, in 2008, and the [Link]. degree from Shri Mata Vaishno Devi University, Jammu, India, in 2013. He is a full-time Ph.D. Research Scholar with the Computer Science and Engineering Department, Nirma University, Ahmedabad, India. He has authored/coauthored 13 publications (including seven articles in SCI indexed journals and six articles in IEEE ComSoc sponsored international conferences). Some of his research findings are published in top-cited journals, such as the IEEE NETWORKS, Computer Communications, Computer and Electrical Engineering (Elsevier), and the International Journal of Communication System (Wiley). His research interests include network security, blockchain technology, 5G communication networks, and machine learning. He is a recipient of the Doctoral Scholarship from the Ministry of Electronics and Information Technology, Govt. of India, under the Visvesvaraya Ph.D. Scheme.

SUDEEP TANWAR (Member, IEEE) received the [Link]. degree from Kurukshetra University, India, in 2002, the [Link]. degree (Hons.) from Guru Gobind Singh Indraprastha University, Delhi, India, in 2009, and the Ph.D. degree with specialization in wireless sensor networks, in 2016. He is an Associate Professor with the Computer Science and Engineering Department, Institute of Technology, Nirma University, Ahmedabad, India. He is a Visiting Professor with Jan Wyzykowski University, Polkowice, Poland, and the University of Pitesti, Pitesti, Romania. He has authored or coauthored more than 130 technical research articles published in leading journals and conferences from the IEEE, Elsevier, Springer, Wiley, and so on. Some of his research findings are published in top-cited journals, such as the IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, the IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, Computer Communication, Applied Soft Computing, the Journal of Network and Computer Application, Pervasive and Mobile Computing, the International Journal of Communication System, Telecommunication System, Computer and Electrical Engineering, and the IEEE SYSTEMS JOURNAL. He has also published six edited/authored books with international/national publishers, such as IET and Springer. He has guided many students leading to M.E./[Link]. degrees and is guiding students leading to Ph.D. degrees. His current interests include wireless sensor networks, fog computing, smart grid, the IoT, and blockchain technology. He was invited as a Guest Editor/Editorial Board Member of many international journals, invited as a keynote speaker in many international conferences held in Asia, and invited as the program chair, the publications chair, the publicity chair, and the session chair in many international conferences held in North America, Europe, Asia, and Africa. He has been awarded the Best Research Paper Awards from the IEEE GLOBECOM 2018, IEEE ICC 2019, and Springer ICRIC-2019. He is an Associate Editor of IJCS (Wiley) and Security and Privacy (Wiley).

NEERAJ KUMAR (Senior Member, IEEE) received the Ph.D. degree in CSE from Shri Mata Vaishno Devi University, Katra, India. He was a Postdoctoral Research Fellow with Coventry University, Coventry, U.K. He is currently a Full Professor with the Department of Computer Science and Engineering, Thapar University, Patiala, India. He is also a Visiting Professor at Coventry University. He has published more than 300 technical research articles in leading journals and conferences from IEEE, Elsevier, Springer, John Wiley, and so on. Some of his research findings are published in top-cited journals, such as the IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, the IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, the IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION, the IEEE TRANSACTIONS ON CLOUD COMPUTING, the IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, the IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, the IEEE Network, the IEEE Communications Magazine, the IEEE WIRELESS COMMUNICATIONS, the IEEE INTERNET OF THINGS JOURNAL, the IEEE SYSTEMS JOURNAL, Future Generation Computing Systems, the Journal of Network and Computer Applications, and Computer Communications. He has guided many Ph.D. and M.E./[Link]. students. His research was supported by funding from Tata Consultancy Service, the Council of Scientific and Industrial Research (CSIR), and the Department of Science and Technology. He was awarded the Best Research Paper Awards from IEEE ICC 2018 and the IEEE SYSTEMS JOURNAL 2018. He is leading the research group Sustainable Practices for Internet of Energy and Security (SPINES), where group members are working on the latest cutting-edge technologies. He is a TPC member and a reviewer of many international conferences across the globe.

AMOUN ALAZAB (Senior Member, IEEE) received the Ph.D. degree in computer science from the School of Science, Information Technology and Engineering, Federation University of Australia. He is a Cyber Security Researcher and a Practitioner with industry and academic experience. He is an Associate Professor with the College of Engineering, IT and Environment, Charles Darwin University, Australia. His research is multidisciplinary, focusing on cyber security and digital forensics of computer systems with an emphasis on cybercrime detection and prevention. He has more than 150 research articles in many international journals and conferences, such as the IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, the IEEE TRANSACTIONS ON INDUSTRY APPLICATIONS, the IEEE TRANSACTIONS ON BIG DATA, the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, COMPUTERS AND SECURITY, and Future Generation Computing Systems. He delivered many invited and keynote speeches, 24 events in 2019 alone. He convened and chaired more than 50 conferences and workshops. He works closely with government and industry on many projects, including the Northern Territory (NT) Department of Information and Corporate Services, IBM, Trend Micro, the Australian Federal Police (AFP), Westpac, and the Attorney General's Department. He is the Founding Chair of the IEEE Northern Territory (NT) Subsection.