Emotion Recognition of Social Media Users Based on Deep Learning
ABSTRACT
Sentiment analysis of social media suffers from several issues: it neglects the
long-distance semantic links between emotional features, fails to capture emotionally
colored feature words effectively, and depends excessively on manual annotation. This
research proposes a user emotion recognition model for the emotional analysis of
microblog public opinion events. Three types of emotional text, “joy,” “anger,” and
“sadness,” are obtained through data collection and preprocessing of the comment text
of microblog public opinion events. Then, an algorithm combining the latent Dirichlet
allocation (LDA) model, an emotion dictionary, and manual annotation is created to
extract emotional feature words. The resulting emotional text is converted into word
vectors using Word2vec. Bidirectional long short-term memory (BiLSTM) gathers the
long-distance semantic information, and a convolutional neural network (CNN) extracts
the text’s key features to complete the emotion classification. The test results
demonstrate an average increase in F1 value of 3.66 percent for six machine learning
models and of 1.84 percent for seven deep learning models. The proposed model
identifies the emotions of social media users better than current machine learning and
deep learning methods.
Subjects Human-Computer Interaction, Data Mining and Machine Learning, Network Science
and Online Social Networks, Text Mining, Sentiment Analysis
Keywords Deep learning, Machine learning, Social media, Sentiment characteristics, Emotional
analysis
Submitted 10 February 2023
Accepted 4 May 2023
Published 14 June 2023
Corresponding author Chen Li, [email protected]
Academic editor Faisal Saeed
DOI 10.7717/peerj-cs.1414
Copyright 2023 Li and Li
Distributed under Creative Commons CC-BY 4.0
OPEN ACCESS
How to cite this article Li C, Li F. 2023. Emotion recognition of social media users based on deep learning. PeerJ Comput. Sci. 9:e1414
http://doi.org/10.7717/peerj-cs.1414
INTRODUCTION
With the rapid proliferation of Web 2.0 and social media, the internet has become a
treasure trove of comment information, containing users’ value tendencies and emotional
coloring on public opinion events, character views, and scenery. This information reflects
the public’s emotions and attitudes towards various phenomena, including joy, anger,
sadness, approval, and criticism. The automatic and expeditious extraction of users’
emotional tendencies from unstructured comments is crucial for dynamically monitoring
the emotional state of public opinion events. Hence, the advent of sentiment analysis is
a much-needed development (Chen, Jin & Lin, 2021). Sentiment analysis, also known
as opinion mining and tendency analysis, aims to analyze, process, reason, and predict
subjective texts with emotional coloring, focusing on the distinguishing features of different
emotional hues, such as positivity or negativity. Moreover, personal emotion, an important
concept closely related to sentiment, plays a significant role (Yang et al., 2020). According
to Liu (2010), an emotional sentence expresses personal feelings, opinions, or beliefs, while
an objective sentence lacks emotion.
Figure 1 Weibo users’ emotional expression. Content source: https://weibo.com. Panels: Like, Happy,
Surprised, Anger, Sadness, and Fear, each with an example post. In expressing feelings, judgments,
appreciation, speculation, and recognition, subjective sentences involve emotions to a certain extent.
In addition, emotion can be conveyed by declaring one’s feelings and thoughts. The concept of sentiment
is close to that of emotion. Opinion intensity is related to the intensity of certain emotions, such as
happiness, surprise, anger, sadness, or fear, as shown in the figure.
Full-size DOI: 10.7717/peerjcs.1414/fig-1
Furthermore, subjective phrases inherently include emotions to some degree when
expressing feelings, assessments, appreciations, hypotheses, and recognitions. Disclosing
one’s feelings and thoughts is another way to express emotions. Subjective emotion and the
idea of emotion are connected. As seen in Fig. 1, the intensity of an opinion is frequently
linked to the intensity of an emotion, such as joy, surprise, wrath, sadness, or fear.
The emergence of social media has made it easier for individuals to share
information. Governments and businesses must also communicate their sentiments and
firsthand knowledge of events. Bollen, Mao & Pepe (2011) conducted a six-dimensional
analysis of public emotions using the Profile of Mood States. Another study
examined the emotional shifts experienced by Twitter users following key events, using
Plutchik’s psychoevolutionary theory of emotion to map eight emotions into four
pairs of emotional polarities (Wang, Wang & Feng, 2016). However, social media generates
widely varying opinion data because of its large user base. Most of the time, the text is
short and packed with remarks and personal feelings. It also contains a lot of noise,
which increases the complexity of text mining research (Liu, Qi & Xu, 2021).
Traditional sentiment analysis approaches extract features from text data to achieve
sentiment classification. Their results are somewhat biased because they do not adequately
consider the semantic relationships between contexts, and therefore cannot reflect the true
feelings of social media users.
RELATED WORKS
Social media users have generated many opinions on public opinion events,
people’s views, and scenery, which makes it possible to understand and deeply mine
users’ information behavior. The core of this research is to analyze the emotions that users
express on social media platforms, that is, emotional analysis. Emotional analysis
mainly includes orientation classification, mood analysis, emotion time series analysis,
subjectivity detection, opinion summarization, opinion retrieval, opinion holder extraction,
irony and sarcasm detection, cross-domain sentiment analysis, and multimodal sentiment
analysis (Zhao, 2021). The most common tasks are emotion classification and mood
analysis. Emotion classification is based on the assumption that an entity or its
aspects and attributes can be assigned an emotional polarity, namely positive, negative,
or neutral. Mood analysis builds on emotional analysis and combines it with
the Profile of Mood States (Li, Lin & Lin, 2018). Based on experimental research on
Facebook users, Kramer, Guillory & Hancock (2014) proposed that emotions on social media
platforms can spread through an emotional contagion mechanism. The study found that
people in a social network environment unconsciously experience the same emotional
states as their friends. Yu & Wang (2015) analyzed Twitter data from the 2014 World Cup
and found that the emotions in users’ tweets were consistent with the actual situation on
the field.
Depending on the feature set, machine learning-based sentiment analysis methods can
be divided into supervised and unsupervised learning techniques. Supervised methods
include the support vector machine (SVM), naive Bayes, and decision tree algorithms,
and need a sufficiently large corpus as support (Wang et al., 2021).
Based on unsupervised learning, the primary methods are unsupervised and
Overall framework
This article proposes an improved LDA and CNN-BiLSTM emotion classification model.
The overall framework is shown in Fig. 2 (Rhanoui et al., 2019; Liu et al., 2020).
The specific implementation process is as follows:
(1) A custom web crawler built with Python and XPath collects the comment
information of microblog social media public opinion events, covering the emotions
“joy,” “anger,” and “sadness,” and stores it in a local CSV file.
(2) The comment text is preprocessed, including Jieba Chinese word segmentation,
stop word filtering, special character deletion, duplicate comment deletion, comment
annotation, etc.
(3) The emotional features of the text are extracted and converted into word vectors,
which are used as the input features of the model.
(4) The CNN-BiLSTM model is constructed: CNN extracts the key features of
the text, and BiLSTM captures the long-distance semantic features. Finally, the Softmax
classifier calculates the sentiment tendency of social media public opinion event
comments to complete the emotion classification. The output results correspond to
“joy,” “anger,” and “sadness,” respectively.
Figure 2 Overall framework of the model. This article proposes an improved LDA and CNN-BiLSTM
emotion classification model. The overall framework is shown.
Full-size DOI: 10.7717/peerjcs.1414/fig-2
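The preprocessing in step (2) can be illustrated with a minimal sketch. It assumes a tiny hypothetical stop-word set and uses a whitespace split as a stand-in for Jieba segmentation, so it shows only the order of operations (special character deletion, duplicate deletion, segmentation, stop word filtering), not the authors' actual code.

```python
import re

# Hypothetical stop-word list; a real run would load a full Chinese stop-word file.
STOP_WORDS = {"the", "is", "a"}

def clean_comment(text):
    """Delete special characters, keeping word characters and whitespace."""
    return re.sub(r"[^\w\s]", "", text).strip()

def tokenize(text):
    """Stand-in tokenizer: whitespace split (the article uses Jieba for Chinese)."""
    return text.lower().split()

def preprocess(comments):
    """Clean, deduplicate, tokenize, and remove stop words from raw comments."""
    seen = set()
    result = []
    for raw in comments:
        cleaned = clean_comment(raw)
        if not cleaned or cleaned in seen:
            continue  # skip empty or duplicate comments
        seen.add(cleaned)
        tokens = [t for t in tokenize(cleaned) if t not in STOP_WORDS]
        result.append(tokens)
    return result

comments = ["The rain is sweet!!!", "The rain is sweet!!!", "Anger burns #@"]
print(preprocess(comments))  # [['rain', 'sweet'], ['anger', 'burns']]
```

Deduplicating on the cleaned text rather than the raw text means that comments differing only in punctuation or symbols are treated as duplicates, which is a design choice of this sketch.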
CNN model
A convolutional neural network (CNN) mainly comprises convolution and pooling layers.
In this article, a three-layer CNN is constructed to extract the key features of comment
text on sentiment events. The convolution layer receives the emotion feature matrix
of size n × d, and the convolution process is shown in Eq. (1).
In the formula, f represents the activation function; the ReLU function is usually
used to accelerate training convergence. $h_d$ represents the feature of comments
on Weibo social media after vector convolution; $w_d$ represents the convolution
kernel of size d; $V_i$ represents the word vector of the input layer; and $b_d$ represents
the bias term. This convolution operation effectively generates local feature sets, as
shown in Eq. (2).
The pooling layer compresses the size of the text feature vectors and the number of model
parameters while retaining the most salient emotional features. Its calculation formula is
shown in Eq. (3).
$$s_i = \max\{H_d\} \tag{3}$$
Filters with convolution kernel sizes of 2, 3, and 4 were constructed to extract key features
of Weibo comment text, and their output vectors were then input into the BiLSTM model.
Figure 3 Process of emotional feature extraction. A feature extraction method based on the LDA model
and emotion dictionary is constructed. The specific implementation process is shown.
Full-size DOI: 10.7717/peerjcs.1414/fig-3
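The convolution and max-pooling steps can be sketched in plain Python. The sketch assumes the common form h = f(w · V + b) applied to each window of consecutive word vectors, with ReLU as f; the toy word vectors, kernel weights, and bias values are hypothetical, and the function names are illustrative rather than taken from the article.

```python
def relu(x):
    """ReLU activation: negative values are clipped to zero."""
    return max(0.0, x)

def conv_features(vectors, kernel, bias):
    """Slide a kernel over windows of consecutive word vectors.

    vectors: list of n word vectors (each a list of floats)
    kernel:  list of d weight vectors, one per word position in the window
    Returns the n - d + 1 local features relu(w . V + b).
    """
    d = len(kernel)
    features = []
    for i in range(len(vectors) - d + 1):
        window = vectors[i:i + d]
        total = sum(w * v
                    for w_row, v_row in zip(kernel, window)
                    for w, v in zip(w_row, v_row))
        features.append(relu(total + bias))
    return features

def max_pool(features):
    """Max pooling keeps the strongest local feature, as in Eq. (3)."""
    return max(features)

# Toy input: four words with two-dimensional vectors (hypothetical values).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # kernel of size 2
feats = conv_features(words, kernel, bias=-0.5)
print(feats, max_pool(feats))  # [1.5, 0.5, 0.5] 1.5

# One pooled value per kernel size 2, 3, and 4 (all-ones kernels, for illustration).
pooled = []
for d in (2, 3, 4):
    ones = [[1.0, 1.0] for _ in range(d)]
    pooled.append(max_pool(conv_features(words, ones, bias=0.0)))
print(pooled)  # [3.0, 4.0, 4.0]
```

Using several kernel sizes gives one pooled feature per n-gram width, which is why the outputs of the size 2, 3, and 4 filters can be concatenated before being passed on.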
BiLSTM model
The bidirectional long short-term memory (BiLSTM) model is a variant of the recurrent
neural network that extracts features in both the forward and backward directions to
capture long-distance dependencies and contextual semantic features. This article uses it
to extract the emotional features of public opinion event reviews.
The network structure of the BiLSTM model is shown in Fig. 4. State is passed along the
sequence to reinforce subject information and to effectively capture emotional feature
words such as “like,” “haha,” “uncomfortable,” and “pray.”
Figure 4 Network structure of the BiLSTM model, showing the forward LSTM and backward LSTM layers.
Here, $\overrightarrow{h_t}$ represents the state of the forward LSTM layer at time t, and
$\overleftarrow{h_t}$ represents the state of the backward LSTM layer at time t. $x_t$
represents the input word vector; $w_1$ to $w_6$ represent weight parameters; f represents
the activation function; and $y_t$ is the final output of the bidirectional LSTM layer.
Finally, the vector obtained by the BiLSTM model is input into the Softmax classifier to
realize the emotion classification. That is, the emotion categories of “joy,” “anger,” and
“sadness” are predicted.
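A bidirectional recurrent layer is commonly written in the following form, which is consistent with the symbols defined above. The assignment of $w_1$ to $w_6$ to individual terms, the output function $g$, and the Softmax input scores $z_c$ are assumptions made for illustration and may differ from the authors' exact notation.

```latex
\begin{aligned}
\overrightarrow{h_t} &= f\left(w_1 x_t + w_2 \overrightarrow{h_{t-1}}\right) \\
\overleftarrow{h_t} &= f\left(w_3 x_t + w_5 \overleftarrow{h_{t+1}}\right) \\
y_t &= g\left(w_4 \overrightarrow{h_t} + w_6 \overleftarrow{h_t}\right) \\
P(c) &= \frac{\exp(z_c)}{\sum_{k \in \{\text{joy},\,\text{anger},\,\text{sadness}\}} \exp(z_k)}
\end{aligned}
```

In this form the forward state depends on the previous time step and the backward state on the next one, so $y_t$ combines context from both sides of position t before the Softmax layer turns the resulting scores into probabilities over the three emotion categories.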
METHODOLOGY
Data acquisition
The experimental data for this research are comments on public opinion events on
Weibo (https://weibo.com), gathered with a web crawler built using Python and XPath.
After data cleaning and preprocessing, 200,000 emotionally colored records were
obtained. Three data sets, one for each emotion (joy, anger, and sadness), were each
randomly split into training, test, and validation sets at a ratio of 3:1:1. Table 1 displays
the distribution of the data.
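A 3:1:1 random split can be sketched as follows. The function name, the fixed seed, and the placeholder comments are hypothetical; the sketch only shows how one emotion's data could be shuffled and divided in that ratio.

```python
import random

def split_3_1_1(items, seed=0):
    """Shuffle items and split them into training, test, and validation sets at 3:1:1."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = n * 3 // 5
    n_test = n // 5
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    valid = shuffled[n_train + n_test:]
    return train, test, valid

# Placeholder data standing in for the comments of one emotion category.
comments = [f"comment_{i}" for i in range(1000)]
train, test, valid = split_3_1_1(comments)
print(len(train), len(test), len(valid))  # 600 200 200
```

Applying the function separately to the joy, anger, and sadness data keeps the 3:1:1 ratio within each emotion category.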
Evaluation index
A confusion matrix is a standard tool for evaluating classifiers on unbalanced data, as
shown in Table 1. From it, classification accuracy, precision, recall, and F1 can be obtained.
For sentiment analysis of Weibo social media comments, F1 and accuracy are used
for experimental evaluation in this article, and the calculation process is shown in Eqs.
(7)–(10).
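$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{7}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{8}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{9}$$
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$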
Among them, precision is the proportion of reviews predicted as a given emotion
category that actually belong to that category. Recall is the proportion of reviews that
actually belong to a category that are correctly predicted as that category. F1 is the
harmonic mean of precision and recall.
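Eqs. (7)–(10) translate directly into code. In the sketch below the function names and the confusion-matrix counts are hypothetical; the final check applies Eq. (9) to the precision (0.8946) and recall (0.8841) reported in the Conclusion.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)                            # Eq. (7)
    recall = tp / (tp + fn)                               # Eq. (8)
    f1 = 2 * precision * recall / (precision + recall)    # Eq. (9)
    accuracy = (tp + tn) / (tp + tn + fp + fn)            # Eq. (10)
    return precision, recall, f1, accuracy

def f1_from(precision, recall):
    """Harmonic mean of precision and recall, Eq. (9)."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for one emotion category treated as the positive class.
precision, recall, f1, accuracy = classification_metrics(tp=80, fp=20, fn=10, tn=90)
print(round(precision, 4), round(recall, 4), round(f1, 4), round(accuracy, 4))
# 0.8 0.8889 0.8421 0.85

# Consistency check against the values reported in the Conclusion.
print(round(f1_from(0.8946, 0.8841), 4))  # 0.8893
```

The second result matches the reported F1 value of 0.8893, so the reported precision, recall, and F1 are mutually consistent under Eq. (9).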
Experimental process
This article used the LDA model and an emotion dictionary to extract emotional feature
words. The n_topic parameter of the LDA model was set to three, corresponding to “joy,”
“anger,” and “sadness,” respectively. This step effectively filters out unnecessary noise
words, so the remaining feature words carry more emotional content, which supports the
subsequent emotion classification by the CNN-BiLSTM model. The specific process is as
follows:
(1) Chinese word segmentation and data cleaning (including stop word filtering and special
character cleaning) are applied so that only feature words with semantic value are retained.
(2) The LDA model is used to extract the emotional feature words of “joy,” “anger,”
and “sadness,” and further emotional feature words from other comments are added by
combining the emotional vocabulary ontology database with manual annotation.
(3) The emotion features are extracted by the CNN-BiLSTM model.
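Step (2) combines three word sources per emotion category. The sketch below shows only that combination and the subsequent filtering of a tokenized comment; it does not run LDA itself. The word lists are hypothetical stand-ins for the topic-model output, the emotion dictionary, and manual annotation, and the function names are illustrative.

```python
def build_feature_lexicon(topic_words, dictionary_words, manual_words):
    """Union the three word sources for each emotion category."""
    lexicon = {}
    for source in (topic_words, dictionary_words, manual_words):
        for emotion, words in source.items():
            lexicon.setdefault(emotion, set()).update(words)
    return lexicon

def keep_feature_words(tokens, lexicon):
    """Keep only tokens that appear in the lexicon of any emotion category."""
    vocabulary = set().union(*lexicon.values())
    return [t for t in tokens if t in vocabulary]

# Hypothetical inputs: top words per LDA topic, dictionary entries, manual additions.
topic_words = {"anger": {"angry", "serious"}, "sadness": {"pray"}}
dictionary_words = {"joy": {"happy"}, "sadness": {"pity"}}
manual_words = {"anger": {"hehe"}}

lexicon = build_feature_lexicon(topic_words, dictionary_words, manual_words)
tokens = ["today", "angry", "hehe", "weather", "pity"]
print(keep_feature_words(tokens, lexicon))  # ['angry', 'hehe', 'pity']
```

Filtering against the merged lexicon is what removes noise words such as "today" and "weather" before the text is vectorized.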
After feature extraction based on the LDA model and emotion dictionary, this article
constructs the CNN-BiLSTM model and carries out the social media sentiment analysis
experiment. The hyper-parameters of the model are shown in Table 2.
In addition, the model is trained for 200 epochs, and a Dropout layer is added to prevent
overfitting. To limit the influence of any single abnormal run, the reported result is the
average of ten experimental runs. The model is also compared with classical machine
learning models (logistic regression, SVM, random forest, KNN, naive Bayes, and
AdaBoost) and deep learning models (including LSTM, BiLSTM, GRU, BiGRU, CNN, and
TextCNN).
Figure 5 Results of emotional feature words extraction. The emotional feature word extraction model
proposed in this article completes the social media sentiment analysis task. Panels: Anger and Sadness.
Feature words shown include “Without,” “Problem,” “Death,” “Real,” “Pathetic,” “Serious,” “Angry,”
“TMD,” “HeHe,” “Vulnerable,” “Pray,” “Distress,” “Silence,” “Blessing,” and “Pity.”
Full-size DOI: 10.7717/peerjcs.1414/fig-5
Figure 6 Comparison of emotion classification of different models. The F1 value of the proposed
model is 0.89, and the accuracy rate is 0.87. The experimental results are better than the existing machine
learning and deep learning models.
Full-size DOI: 10.7717/peerjcs.1414/fig-6
The results show that fusing the LDA model with the emotion dictionary yields
better performance in sentiment analysis of social media reviews. The F1 value increases
by 3.66% for the six machine learning models and by 1.84% for the seven deep learning
models, which shows that effectively extracting emotional feature words can improve the
classification model to a certain extent. The approach supports sentiment analysis of
comments on social media public opinion events, helping to perceive public sentiment
and predict emotional trends. It also exploits the advantages of multilevel and multi-scale
feature extraction networks, extracting word-level, phrase-level, and sentence-level
features to ensure that feature extraction is adequate.
CONCLUSION
This manuscript introduces a model for recognizing the emotions of social media
users, which facilitates the emotional analysis of microblog public opinion events. The
experimental results reveal the superior performance of our approach, as it yields a
precision, recall, F1 value, and accuracy of 0.8946, 0.8841, 0.8893, and 0.8778, respectively,
surpassing the existing machine learning and deep learning models. Notably, the LDA
model and emotion dictionary’s experimental results significantly improved, with an F1
value 3.66% higher than the six machine learning models and 1.84% higher than the
seven deep learning models. In conclusion, our method effectively perceives the emotional
situation of public opinion events in social media and holds substantial research value.
Nonetheless, this study only uses sadness, anger, and joy as the indicators of emotion
analysis, thereby limiting its scope to coarse-grained analysis. These experimental findings
suggest that using CNN alone to extract and learn emotional features is inadequate.
In feature extraction, we recommend combining local feature extraction and global
feature extraction, emphasizing global feature extraction. Future research should analyze
fine-grained emotion characteristics using more complex data annotation.
Funding
This study was supported by the National Social Science Fund of China (Key program)
project number: 21AXW006. The funders had no role in study design, data collection and
analysis, decision to publish, or preparation of the manuscript.
Grant Disclosures
The following grant information was disclosed by the authors:
National Social Science Fund of China (Key program): 21AXW006.
Competing Interests
The authors declare there are no competing interests.
Author Contributions
• Chen Li conceived and designed the experiments, performed the experiments, analyzed
the data, performed the computation work, prepared figures and/or tables, authored or
reviewed drafts of the article, and approved the final draft.
Data Availability
The following information was supplied regarding data availability:
The code is available in the Supplemental File. The data is available at Zenodo: tzw.
(2023). Emotion recognition of social media [Data set]. Zenodo.
https://doi.org/10.5281/zenodo.7622150.
Supplemental Information
Supplemental information for this article can be found online at
http://dx.doi.org/10.7717/peerj-cs.1414#supplemental-information.
REFERENCES
Blei DM, Jordan MI. 2003. Modeling annotated data. In: Proceedings of the 26th
annual international ACM SIGIR conference on research and development in
information retrieval. New York: Association for Computing Machinery, 127–134
DOI 10.1145/860435.860460.
Bollen J, Mao H, Pepe A. 2011. Modeling public mood and emotion: Twitter sentiment
and socio-economic phenomena. In: Proceedings of the 5th international AAAI
conference on weblogs and social media, 450–453.
Che S, Li X. 2021. A comparative analysis of affective utterances in Chinese and American
corporate letters to shareholders from the perspective of evaluation system: Text
Mining Technology Based on emotion dictionary and machine learning Foreign
languages. Journal of Shanghai International Studies University 44(02):50–59 (in
Chinese).
Chen H, Jin S, Lin W. 2021. Quantitative analysis of social media rumor transmission
related to Xinguan epidemic. Computer Research and Development 58(7):1366–1384
(in Chinese).
Kramer ADI, Guillory JE, Hancock JT. 2014. Experimental evidence of massive-scale
emotional contagion through social networks. Proceedings of the National
Academy of Sciences of the United States of America 111(24):8788–8790
DOI 10.1073/pnas.1320040111.
Liu X, Qi R, Xu L. 2021. Sentiment analysis of Russian twitter text with multilevel
features. Minicomputer System 42(06):1176–1183 (in Chinese).
Liu B. 2010. Sentiment analysis and subjectivity. In: Handbook of natural language
processing. 627–666.
Liu ZX, Zhang DG, Luo GZ, Lian M, Liu B. 2020. A new method of emotional analysis
based on CNN–BiLSTM hybrid neural network. Cluster Computing 23:2901–2913
DOI 10.1007/s10586-020-03055-9.
Li R, Lin Z, Lin H. 2018. An overview of text sentiment analysis. Computer Research and
Development 55(01):30–52 (in Chinese).