NLP and Sentiment Analysis
Introduction
Natural Language Processing
Sentiment Analysis
Paper Case Study 1: Sentiment Analysis Techniques
Paper Case Study 2: Natural Language Processing for Sentiment Analysis
Goals for today
Sentiment analysis is used to find the author’s attitude towards something. Sentiment analysis tools categorize pieces of writing as positive, neutral, or negative.
Some tools offer a sentiment score, which helps grade particular emotions.
A sentiment score is a scaling system that reflects the emotional depth of a piece of text. It detects emotions and assigns them a value, for example from 0 to 10, from the most negative to the most positive.
Why is sentiment analysis important?
First of all, sentiment analysis saves time and effort because the process of sentiment extraction is fully automated: an algorithm analyses the sentiment data, so human participation is minimal.
Secondly, sentiment analysis is important because emotions and attitudes towards a topic can become actionable pieces of information useful in numerous areas of business and research. Many industries benefit from knowing the feelings of a target audience towards services, policies, etc. One of the more interesting examples is the Obama administration, which used sentiment analysis to gain insight into the public’s sentiments towards policies before the 2012 election.
And lastly, sentiment analysis is becoming an increasingly popular topic as artificial intelligence, machine learning, and natural language processing technologies are booming.
Applications
Mainstream applications
Review-oriented search engines
Market research (companies, politicians, ...)
Improve information extraction, summarization, and question answering
Discard subjective sentences
Show multiple viewpoints
Improve communication and HCI?
Detect flames in emails and forums
Nudge people to avoid “angry” Facebook posts?
Augment recommender systems: downgrade items that received a lot of negative feedback
Detect web pages with sensitive content inappropriate for ads placement
...
Example (blog post with a mood label):
“Well kids, I had an awesome birthday thanks to you. =D Just wanted to so thank you for coming and thanks for the gifts and junk. =) I have many pictures and I will post them later. hearts”
current mood:
Fine-grained sentiment analysis involves determining the polarity of the opinion. It can be a simple binary positive/negative distinction, or it can use a finer scale (for example, very positive, positive, neutral, negative, very negative), depending on the use case (for example, five-star Amazon reviews).
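A minimal sketch of the five-level scheme, assuming a polarity score in [-1, 1] has already been computed by some upstream analyzer. The thresholds and the name `to_five_class` are illustrative assumptions, not values taken from any particular tool.

```python
def to_five_class(score: float) -> str:
    """Map a polarity score in [-1, 1] to one of five labels (thresholds are assumed)."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    if score <= -0.6:
        return "very negative"
    if score <= -0.2:
        return "negative"
    if score < 0.2:
        return "neutral"
    if score < 0.6:
        return "positive"
    return "very positive"


for s in (-0.9, -0.3, 0.0, 0.4, 0.8):
    print(s, "->", to_five_class(s))
```

In practice the cut-off points would be tuned on labelled data, such as star ratings, rather than fixed by hand.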
Emotion detection is used to identify signs of specific emotional states in the text. Usually, a combination of lexicons and machine learning algorithms determines which emotion is expressed.
Aspect-based sentiment analysis goes deeper. Its purpose is to identify an opinion regarding a specific element of the product, for example the brightness of a smartphone’s flashlight. Aspect-based analysis is commonly used in product analytics to monitor how the product is perceived and what its strong and weak points are from the customer’s point of view.
Intent analysis is all about the action. Its purpose is to determine what kind of intention is expressed in the message. It is commonly used in customer support systems to streamline the workflow.
Issues in aspect-/sentence-oriented SA (1)
Example
“Canon G3 produces great pictures” (dependency: great --mod--> pictures)
Rule: ‘a noun on which an opinion word directly depends through mod is taken as an aspect’; allows extraction in both directions
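The rule above can be sketched as follows. This is a simplified illustration: the dependency triples are written by hand (a real system would take them from a dependency parser), and `OPINION_WORDS` and `extract_aspects` are hypothetical names introduced only for this example.

```python
# Each dependency is (dependent, relation, head, head_pos).
# Hand-written here; a real system would obtain them from a parser.
OPINION_WORDS = {"great", "poor", "excellent"}  # assumed seed lexicon


def extract_aspects(dependencies):
    """Return nouns that an opinion word depends on through the `mod` relation."""
    aspects = []
    for dependent, relation, head, head_pos in dependencies:
        if relation == "mod" and dependent in OPINION_WORDS and head_pos == "NOUN":
            aspects.append(head)
    return aspects


# "Canon G3 produces great pictures"
deps = [
    ("great", "mod", "pictures", "NOUN"),
    ("pictures", "obj", "produces", "VERB"),
    ("G3", "subj", "produces", "VERB"),
]
print(extract_aspects(deps))  # ['pictures']
```

Running the same rule in the other direction (from a known aspect to an unknown modifier) would yield new opinion words, which is what "both directions" refers to.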
Issues in aspect-/sentence-oriented SA (3)
But: domain-dependent:
Movie reviews: movie ~ picture
Camera reviews: movie ~ video; picture ~ photos
Carenini et al (2005): extend dictionary using the corpus
Input: taxonomy of aspects for a domain
similarity metrics defined using string similarity, synonyms and distances measured using WordNet
merge each discovered aspect expression to an aspect node in the taxonomy.
WordNet
Issues in aspect-/sentence-oriented SA (3)
Corpus-based:
learn from labelled examples
Disadvantage: labelled examples are needed (expensive!)
Advantage: captures domain dependence
Natural Language Processing
Speech
Written language
Phonology: sounds / letters / pronunciation
Morphology: the structure of words
Syntax: how these sequences are structured
Semantics: meaning of the strings
Interaction between levels
Issues in Syntax
2. Identify collocations
mother in law, hot dog
Compositional versus non-compositional collocates
Issues in Syntax
Shallow parsing:
“the dog chased the bear”
“the dog” “chased the bear”
subject - predicate
Identify basic structures
NP-[the dog] VP-[chased the bear]
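A toy sketch of the subject/predicate split shown above. It assumes the tokens are already POS-tagged and that the sentence follows the simple pattern "noun phrase, then verb phrase"; the tags and the function name `chunk` are assumptions for illustration, not a general-purpose chunker.

```python
def chunk(tagged):
    """Split a POS-tagged sentence into a subject NP and a predicate VP.

    Assumes everything before the first verb is the NP, and the verb
    plus everything after it is the VP.
    """
    verb_index = next(i for i, (_, tag) in enumerate(tagged) if tag == "VERB")
    noun_phrase = " ".join(word for word, _ in tagged[:verb_index])
    verb_phrase = " ".join(word for word, _ in tagged[verb_index:])
    return f"NP-[{noun_phrase}] VP-[{verb_phrase}]"


tagged = [("the", "DET"), ("dog", "NOUN"), ("chased", "VERB"),
          ("the", "DET"), ("bear", "NOUN")]
print(chunk(tagged))  # NP-[the dog] VP-[chased the bear]
```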
Issues in Semantics
Extract information
Detect new patterns:
Hidden information etc.
Government and military bodies put a lot of money into IE research
Sentiment Analysis Tasks
Simplest task:
Is the attitude of the text positive or negative?
More complex:
Rank the attitude of the text from 1 to 5
Advanced:
Detect the target, source, or complex attitude types
How does Sentiment Analysis work?
Sentiment analysis aims at finding an opinionated point of view and its disposition, highlighting the information of particular interest in the process. It is applied for the following operations:
Find and extract the opinionated data (aka sentiment data) on a specific platform (customer support, reviews, etc.)
Determine its polarity (positive or negative)
Define the subject matter (what is being talked about in general and specifically)
Identify the opinion holder (on its own and in correlation with the existing audience segments)
More specifically, depending on the purpose, a sentiment analysis algorithm can be applied at the following scopes:
Document-level - for the entire text.
Sentence-level - obtains the sentiment of a single sentence.
Sub-sentence (Word) level - obtains the sentiment of sub-expressions within a sentence.
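The three scopes can be sketched with a toy lexicon. The lexicon entries, the simple summing of scores, and the function names are assumptions made for illustration only.

```python
import re

LEXICON = {"great": 1.0, "love": 1.0, "slow": -1.0, "terrible": -1.0}  # assumed toy lexicon


def word_scores(sentence):
    """Word level: score of every lexicon word found in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return {w: LEXICON[w] for w in words if w in LEXICON}


def sentence_score(sentence):
    """Sentence level: sum of the word-level scores."""
    return sum(word_scores(sentence).values())


def document_score(text):
    """Document level: sum of the sentence-level scores."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(sentence_score(s) for s in sentences)


review = "The camera is great. The autofocus is slow. I love it!"
print(word_scores("The camera is great"))  # {'great': 1.0}
print(document_score(review))              # 1.0
```

The document-level score hides the negative sentence about the autofocus, which is why finer scopes are sometimes needed.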
Document-level sentiment analysis
The major challenges in document-level sentiment analysis are cross-domain sentiment analysis and cross-language sentiment analysis. Domain-specific sentiment analysis has been shown to achieve remarkable accuracy, but that accuracy is highly sensitive to the domain.
The feature vector used in these tasks contains a bag of words, which is limited and should be specific to a particular domain. A cross-domain sentiment classifier is applied because it is costly to annotate data for each new domain. Spectral feature alignment, structural correspondence learning, and the sentiment-sensitive thesaurus are three classical techniques. They differ in how they expand feature vectors, how they measure word relatedness, and which classifier they use.
Many methods for cross-domain classification use labeled data, unlabeled data, or both. Hence, the techniques give different results for different domains as well as for different purposes.
Document-level sentiment analysis
Bollegala et al. [1] developed a technique that uses a sentiment-sensitive thesaurus (SST) for cross-domain sentiment analysis. They proposed a cross-domain sentiment classifier built on an automatically extracted sentiment-sensitive thesaurus.
To handle the mismatch between features in cross-domain sentiment classification, they used labeled data from multiple source domains and unlabeled data from target domains to compute the relatedness of features and construct a sentiment-sensitive thesaurus.
The thesaurus is then used to expand feature vectors during training and testing of a binary classifier. A relevant subset of the features is selected using L1 regularization.
[1] Bollegala D, Weir D, Carroll J (2013) Cross-domain sentiment classification using a sentiment sensitive thesaurus. IEEE
Trans Knowl Data Eng 25(8):1719–1731
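The feature-expansion step can be illustrated with a deliberately simplified sketch. This is not the procedure of Bollegala et al.: the thesaurus entries, the relatedness weights, and the `expand` function are hypothetical, and serve only to show how related features from another domain can be appended to a bag-of-words vector.

```python
from collections import Counter

# Hypothetical thesaurus: each feature maps to related features with a relatedness weight.
THESAURUS = {
    "excellent": [("delicious", 0.8), ("great", 0.6)],
    "broken": [("stale", 0.5)],
}


def expand(features: Counter, top_k: int = 2) -> Counter:
    """Append up to `top_k` related features per original feature, weighted by relatedness."""
    expanded = Counter(features)
    for feature, count in features.items():
        for related, weight in THESAURUS.get(feature, [])[:top_k]:
            expanded[related] += count * weight
    return expanded


doc = Counter({"excellent": 2, "book": 1})
print(expand(doc))
```

A classifier trained on book reviews that contain "excellent" can then also respond to "delicious" in a review from a different domain.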
Document-level sentiment analysis
Another document-level task is cross-language sentiment analysis, which has been studied by several researchers. Most of them focus on sentiment classification at the document level.
Xia et al. [1] proposed a three-stage cascade model for the polarity shift problem in document-level sentiment classification. Each document is split into a set of sub-sentences, and a hybrid model combining rules and statistical methods detects explicit and implicit polarity shifts. Then a polarity shift elimination method removes polarity shifts in negations. Finally, different types of polarity shifts are used to train base classifiers.
Li et al. [2] proposed cross-lingual structural correspondence learning (SCL) based on distributed representations of words; it can learn meaningful one-to-many mappings for pivot words using large amounts of monolingual data and a small dictionary.
[1] Xia R, Xu F, Yu J, Qi Y, Cambria E (2016) Polarity shift detection, elimination and ensemble: a three-stage model for document-level sentiment analysis. Inf Process Manag
52(1):36–45
[2] Li N, Zhai S, Zhang Z, Liu B (2017) Structural correspondence learning for cross-lingual sentiment classification with one-to-many mappings. AAAI 2017:3490–3496
Sentence-level sentiment analysis
Document-level sentiment analysis is sometimes too coarse for a given purpose. Much early work on sentence-level analysis focuses on identifying subjective sentences. More complex tasks remain, however, such as dealing with conditional or sarcastic sentences. In such cases, sentence-level sentiment analysis is desirable.
Wu et al. [1] proposed an approach to sentence-level sentiment classification that requires no sentence-level labels. It is a unified framework that incorporates two types of weak supervision, document-level and word-level sentiment labels, to learn the sentence-level sentiment classifier.
[1] Wu F, Zhang J, Yuan Z, Wu S, Huang Y, Yan J (2017) Sentence-level sentiment classification with weak supervision. In:
Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval, 2017.
ACM, pp 973–976
Word-level sentiment analysis
Document-level analysis focuses on classifying the entire document as subjective or objective, and as positive or negative. Sentence-level analysis is more effective than document-level analysis because a document contains both subjective and objective sentences.
The word is the basic unit of language, and the polarity of a word is closely related to the subjectivity of the corresponding sentence or document. A sentence containing an adjective is very likely to be a subjective sentence. In addition, word choice reflects not only the individual’s demographic characteristics, such as gender and age, but also their motivation, personality, social status, and other psychological or social traits. Therefore, the word is the basis of text sentiment analysis.
At present, the commonly used methods are the natural language processing technology-based approach and the machine learning-based approach. For sentiment analysis of micro-blog text, most researchers suggest that a term matching-based technique should be adopted. The emotional term is the link between the emotional orientation of the text and the single word. Each word can be regarded as a collection of certain kinds of viewpoint information, which is a clue to the emotion and subjectivity of the text.
Units of analysis, methods, features
The unit of analysis
community
another person
user / author
document
sentence or clause
aspect (e.g. product feature)
The analysis method
Machine learning
Supervised
Unsupervised
Lexicon-based
Dictionary
Flat
With semantics
Corpus
Discourse analysis
Features
Features:
Words (bag-of-words)
N-grams
Parts-of-speech (e.g. Adjectives and adjective-adverb combinations)
Opinion words (lexicon-based: dictionary or corpus)
Valence intensifiers and shifters (for negation); modal verbs; ...
Syntactic dependency
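A short sketch of the first two feature types, using only the standard library. The whitespace tokenizer and the example sentence are assumptions chosen to keep the example small.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("not good , not good at all")
bag_of_words = Counter(tokens)
bigrams = Counter(ngrams(tokens, 2))

print(bag_of_words["good"])        # 2
print(bigrams[("not", "good")])    # 2
```

The bag-of-words counts "good" twice and loses the negation, while the bigram ("not", "good") keeps it, which is one reason n-grams and negation shifters appear in the feature list.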
Features
RNN
The current approach to sentiment analysis (provided the RNN behaves as expected) suffers because the accuracy of prediction depends heavily on the quality of the encoded data.
For example, one common approach known as one-hot encoding does not provide a reliable encoding, as similar terms that might share contextual meaning are encoded as separate entities.
I like cats. Kittens are really cute.
1 0 0 0 0 0 0 I
0 1 0 0 0 0 0 like
0 0 1 0 0 0 0 cats
0 0 0 1 0 0 0 Kittens
0 0 0 0 1 0 0 are
0 0 0 0 0 1 0 really
0 0 0 0 0 0 1 cute
Other problems related to the common approach come from the type of NN used for prediction. The most common are:
GRADIENT VANISHING
GRADIENT EXPLODING
These problems can be mitigated through the use of the more advanced LSTM model, which allows a NN to “forget” information.
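The one-hot encoding problem described above can be checked directly. This sketch builds the same seven-word vocabulary and shows that the vectors for "cats" and "Kittens" have nothing in common; the helper names are introduced only for this example.

```python
sentence = "I like cats . Kittens are really cute ."

# Build the vocabulary in order of first appearance, skipping punctuation.
vocab = []
for word in sentence.split():
    if word != "." and word not in vocab:
        vocab.append(word)


def one_hot(word):
    """Return a vector with a single 1 at the word's vocabulary index."""
    return [1 if w == word else 0 for w in vocab]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


print(one_hot("cats"))                            # [0, 0, 1, 0, 0, 0, 0]
print(dot(one_hot("cats"), one_hot("Kittens")))   # 0
```

Every pair of distinct words has a dot product of 0, so the encoding treats "cats" and "Kittens" as no more related than "cats" and "really".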
Case Study 1 => Analysis on Tweets
Natural Language Processing for Sentiment Analysis
Introduction
Sentiment Lexicon Model
Tweets Sentiment Analysis
Extract Sentiment Based on Subject
Sentiment Lexicon Model
There are many ways to analyze sentiment, but the most common in use today is the lexicon-based model. This model uses a dictionary of words that are annotated (by humans) with their polarity (good or bad) and strength (how good or how bad). The lexicon-based sentiment analyzer combs through text, picks out specific words or phrases, called tokens, and classifies their polarity and strength to capture the text’s opinion towards its main subject matter.
Paper Abstract Points
Sentiments Classification
Subjectivity Classification
Semantic Association
Polarity Classification
Proposed System
Tweets were extracted from a Twitter database for the experiment. All tweets were manually labelled as positive, negative, or neutral. This set of tweets was used to evaluate the performance of the proposed system, using metrics such as the accuracy and precision of the predicted result.
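The two metrics named above can be computed as follows. The five gold and predicted labels are made up for illustration and are not data from the paper.

```python
def accuracy(gold, predicted):
    """Fraction of tweets whose predicted label equals the manual label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def precision(gold, predicted, label):
    """Of the tweets predicted as `label`, the fraction that truly carry it."""
    gold_for_predicted = [g for g, p in zip(gold, predicted) if p == label]
    if not gold_for_predicted:
        return 0.0
    return sum(g == label for g in gold_for_predicted) / len(gold_for_predicted)


# Illustrative labels only.
gold      = ["positive", "negative", "neutral", "negative", "positive"]
predicted = ["positive", "negative", "negative", "negative", "neutral"]

print(accuracy(gold, predicted))               # 0.6
print(precision(gold, predicted, "negative"))
```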
To present the tweets in a structured manner, some preprocessing was done on the dataset before further analysis by the proposed system. Preprocessing ensures that the tweets are in a formal language format that can be read and understood by a machine. After preprocessing, the sentiment of the tweets can be determined through sentiment classification.
Classification
A data set is a collection of related, discrete items of data that may be accessed individually, in combination, or managed as a whole entity.
A total of 1513 tweets were extracted from Twitter and manually labelled. These tweets contain the keyword ‘Unifi’, which is a telecommunication service in Malaysia. It is also the subject in sentiment classification. There are 345 positive tweets, 641 negative tweets and 531 neutral tweets. Tweets were analyzed by the proposed system to obtain the predicted sentiment.
Analyzing Tweets
Naïve Bayes
Decision Trees
Support Vector Machines
Preprocessing
Subjectivity classification differentiates tweets into subjective and objective. The system scans the tweets word by word and finds the words that carry sentiment. If a word in the tweet carries a positive or negative sentiment weight, the tweet is classified as subjective. Otherwise it is objective, which is also neutral.
For example, “Come and get internet package” or “Come and get new internet package”.
The first tweet does not have any word that carries a sentiment score. It is classified as objective and tagged as neutral. In the second tweet, “new” is a word that has a sentiment score. The tweet is classified as subjective and proceeds to the next step, semantic association.
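The word-by-word scan can be sketched as below. The scores for ‘love’ and ‘better’ are the ones quoted in this case study; the score for ‘new’ is a placeholder assumption, since the slides only say that the word carries a sentiment score.

```python
SENTIMENT_SCORES = {
    "love": 0.625,    # value quoted in the case study
    "better": 0.825,  # value quoted in the case study
    "new": 0.25,      # assumed placeholder value
}


def classify_subjectivity(tweet):
    """Return 'subjective' if any word carries a nonzero sentiment score, else 'objective'."""
    for word in tweet.lower().split():
        if SENTIMENT_SCORES.get(word, 0.0) != 0.0:
            return "subjective"
    return "objective"


print(classify_subjectivity("Come and get internet package"))      # objective
print(classify_subjectivity("Come and get new internet package"))  # subjective
```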
Semantic Association
In semantic association, the sentiment lexicons associated with the subject are identified through the grammatical relationships between the subject and the sentiment lexicons. As tweets are mostly short and straightforward, their grammatical structure is simpler than that of normal text. Sentiment lexicons associated with the subject are mostly adjectives or verbs.
For example, in ‘I love Unifi’, the verb ‘love’ is the sentiment lexicon. According to SentiWordNet, ‘love’ has a positive score of 0.625. Hence, we can conclude that ‘Unifi’ has positive sentiment, and the tweet is classified as positive.
Cont….
For comparative opinions, the position of the subject is very important. For instance, in the tweet ‘Unifi is better than M’, the adjective ‘better’ is found, but there are two subjects, ‘Unifi’ and ‘M’. The subject that appears after the comparative adjective carries the opposite sentiment to the subject that appears before it. In this case, as ‘better’ carries a positive score of 0.825, ‘Unifi’ is classified as positive and ‘M’ as negative.
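A minimal sketch of the comparative rule, assuming the subjects and the comparative adjective have already been identified. The function name and its arguments are hypothetical; only the score 0.825 comes from the example above.

```python
def comparative_sentiment(tokens, subjects, scores):
    """Assign polarity to subjects around a comparative adjective.

    The subject before the adjective takes the adjective's polarity;
    the subject after it takes the opposite polarity.
    """
    adj_index = next(i for i, t in enumerate(tokens) if t in scores)
    polarity = "positive" if scores[tokens[adj_index]] > 0 else "negative"
    opposite = "negative" if polarity == "positive" else "positive"
    result = {}
    for i, token in enumerate(tokens):
        if token in subjects:
            result[token] = polarity if i < adj_index else opposite
    return result


tokens = "Unifi is better than M".split()
print(comparative_sentiment(tokens, {"Unifi", "M"}, {"better": 0.825}))
# {'Unifi': 'positive', 'M': 'negative'}
```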
Alchemy API
Twitter is a very popular social media platform. In this paper, we present the preliminary results of our proposed system, which incorporates NLP techniques to extract the subject from tweets and classifies the polarity of tweets by analyzing the sentiment lexicons associated with the subject.
In the experiments, the proposed system performs better than Alchemy API but still needs to be improved, as SVM performs better. Future work will focus on how to enhance the accuracy of sentiment analysis.
WWW 2018, April 23-27, 2018, Lyon, France
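THE MODEL OF A CAPSULE
[Figure: input X feeding capsules CAP 1 and CAP 2; each capsule produces a representation vector (REP. 1, REP. 2) and a probability (P1, P2) through σ]
Each model will be introduced by itself later within the presentation.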
ATTENTION
The attention layer interprets the content of the encoded instance.
In simple terms, the attention mechanism looks at how well the encoded, weighted terms relate to the given context.
PROBABILITY
The probability that a capsule will activate is given by the product of a learned weight and the representation of the capsule, combined with a learned bias.
These two parameters are set during the supervised training of the model, based on two goals described in the paper.
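One plausible reading of the activation probability is a weighted sum of the capsule representation plus a bias, squashed into (0, 1). The sigmoid used here is an assumption (a common choice for turning a score into a probability), and the weights, bias, and representation are made-up numbers rather than learned values.

```python
import math


def activation_probability(weights, representation, bias):
    """Sigmoid of (weights . representation + bias); the sigmoid is an assumed choice."""
    z = sum(w * r for w, r in zip(weights, representation)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative numbers only: z = 1.0 * 2.0 + (-1.0) * 1.0 + 0.5 = 1.5
p = activation_probability([1.0, -1.0], [2.0, 1.0], 0.5)
print(round(p, 3))
```

During training, the weights and the bias would be adjusted so that the capsule matching the gold sentiment receives the highest probability.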
RECONSTRUCTION
The reconstruction module attempts to replicate the encoded input given to the capsule by using the capsule’s representation.
LEARNING GOAL II
Minimizing the error margin of the reproduction outputted by the matching capsule while maximizing the error margin outputted by the other capsules.
The presented model has been trained and tested on three datasets.
Zeyang Lei, Yujiu Yang, Yi Liu
The two stage process:
Firstly, the model analyses the input and decides which contextual information is relevant to the feeling associated to given input.
Secondly, the model analyses sentence wise structures to form a representation for the given input.
A DEEPER LOOK
The feature maps are then fed to the phrase-level attention layer, which produces a “sentiment-specific” sentence representation.
NOTE: the sentiment-specific sentence representation is fed to a softmax layer in order to predict the degree of each sentiment found within a particular sentence.
In training, the model attempts to minimize a cross-entropy error.
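The softmax layer and the cross-entropy objective mentioned above can be sketched in a few lines. The three input scores are arbitrary illustrative values, not outputs of the presented model.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, gold_index):
    """Negative log-probability assigned to the gold class."""
    return -math.log(probs[gold_index])


probs = softmax([2.0, 0.5, -1.0])   # illustrative scores for three sentiment classes
loss = cross_entropy(probs, 0)      # gold class is the first one
print([round(p, 3) for p in probs], round(loss, 3))
```

The loss shrinks as the probability of the gold class approaches 1, so minimizing it pushes the softmax output towards the labelled sentiment.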
The presented model has been trained and tested on two datasets: Movie Review and Stanford Sentiment Treebank.
Following this procedure, the model scored the highest accuracy of all the tested models.
CONCLUSIONS
The mentioned papers attempt to deal with the
difficulty of translating linguistic information into
usable ML parameters through the use of ATTENTION.
Both suggested models have achieved competitive results.
Thank you.