Supervised ML & Sentiment Analysis
CONTENTS
● Review Supervised ML
● Build your own tweet classifier!
TABLE OF CONTENTS
01 Supervised ML: Training
02 Sentiment Analysis: Use of ML in SA
03 Feature Extraction: Extract Features
04 Logistic Regression: Apply LR for SA
01
SUPERVISED ML
Training
Supervised ML (training)
[Diagram: Features and Parameters feed a Prediction Function that produces an Output; the Cost function compares the Output vs. the Label, and the Parameters are updated to reduce the cost.]
02
SENTIMENT ANALYSIS
Use of ML in SA
Sentiment analysis
Tweet: "I am happy because I am learning NLP"
Positive: 1
Negative: 0
Logistic regression
Sentiment analysis
"I am happy because I am learning NLP" → Train LR → Classify → Positive: 1
Summary
● Features, Labels → Train → Predict
● Extract features → Train LR → Predict sentiment
03
FEATURE EXTRACTION
Extract Features
Outline
● Vocabulary
● Feature extraction
● Sparse representations and some of their issues
Vocabulary
Tweets: [tweet_1, tweet_2, ..., tweet_m]
  "I am happy because I am learning NLP"
  ...
  "I hated the movie"
Vocabulary V: [I, am, happy, because, learning, NLP, ..., hated, the, movie]
Feature extraction
Tweet: "I am happy because I am learning NLP"
Vocabulary: [I, am, happy, because, learning, NLP, ..., hated, the, movie]
Feature vector: [1, 1, 1, 1, 1, 1, ..., 0, 0, 0]
A lot of zeros! That's a sparse representation.
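A minimal Python sketch of this sparse representation (the toy vocabulary is made up for illustration): one entry per vocabulary word, 1 if the word appears in the tweet, 0 otherwise.

vocabulary = ["I", "am", "happy", "because", "learning", "NLP", "hated", "the", "movie"]
tweet = "I am happy because I am learning NLP"
words = tweet.split()
sparse_vector = [1 if word in words else 0 for word in vocabulary]
print(sparse_vector)  # [1, 1, 1, 1, 1, 1, 0, 0, 0] -- mostly zeros once the vocabulary is large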
Problems with sparse representations
Tweet: "I am happy because I am learning NLP"
Feature vector: [1, 1, 1, 1, 1, 1, ..., 0, ..., 0, 0, 0]  (all zeros after the first few entries!)
With a vocabulary of size V, the model has to learn V + 1 parameters, which means:
1. Large training time
2. Large prediction time
Summary
● Vocabulary: set of unique words
● Vocabulary + text → feature vector [1 ... 0 ... 1 ... 0 ... 1 ... 0]
● Sparse representations are problematic for training and prediction times
Negative and Positive Frequencies
Outline
● Populate your vocabulary with a frequency count for each class
Positive and negative counts
Corpus:
  I am happy because I am learning NLP
  I am happy
  I am sad, I am not learning NLP
  I am sad
Vocabulary: [I, am, happy, because, learning, NLP, sad, not]
Positive and negative counts
Positive tweets:
  I am happy because I am learning NLP
  I am happy
Negative tweets:
  I am sad, I am not learning NLP
  I am sad
Positive and negative counts
Positive tweets:
  I am happy because I am learning NLP
  I am happy

Vocabulary   PosFreq (1)
I            3
am           3
happy        2
because      1
learning     1
NLP          1
sad          0
not          0
Positive and negative counts
Negative tweets:
  I am sad, I am not learning NLP
  I am sad

Vocabulary   NegFreq (0)
I            3
am           3
happy        0
because      0
learning     1
NLP          1
sad          2
not          1
Word frequency in classes
freqs: dictionary mapping from (word, class) to frequency

Vocabulary   PosFreq (1)   NegFreq (0)
I            3             3
am           3             3
happy        2             0
because      1             0
learning     1             1
NLP          1             1
sad          0             2
not          0             1
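A rough sketch of how such a freqs dictionary could be built (the build_freqs helper used later may differ in details, e.g. it would preprocess the tweets first):

def build_freqs(tweets, labels):
    freqs = {}
    for tweet, label in zip(tweets, labels):
        for word in tweet.split():            # real code would preprocess first
            pair = (word, label)              # key: (word, class)
            freqs[pair] = freqs.get(pair, 0) + 1
    return freqs

tweets = ["I am happy because I am learning NLP", "I am happy",
          "I am sad, I am not learning NLP", "I am sad"]
labels = [1, 1, 0, 0]
freqs = build_freqs(tweets, labels)
print(freqs[("happy", 1)])  # 2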
Summary
● Divide tweet corpus into two classes: positive and negative
● Count each time each word appears in either class
➔ Feature extraction for training and prediction!
Feature extraction with frequencies
Outline
● Extract features from your frequencies dictionary to create a features vector
Word frequency in classes
freqs: dictionary mapping from (word, class) to frequency

Vocabulary   PosFreq (1)   NegFreq (0)
I            3             3
am           3             3
happy        2             0
because      1             0
learning     1             1
NLP          1             1
sad          0             2
not          0             1
Feature extraction
freqs: dictionary mapping from (word, class) to frequency
Features of tweet m: x_m = [1 (bias), sum of positive frequencies of its words, sum of negative frequencies of its words]
Feature extraction
Tweet: "I am sad, I am not learning NLP"

Vocabulary   PosFreq (1)
I            3
am           3
happy        2
because      1
learning     1
NLP          1
sad          0
not          0

Sum of positive frequencies for the words in the tweet: 3 + 3 + 1 + 1 + 0 + 0 = 8
Feature extraction
Tweet: "I am sad, I am not learning NLP"

Vocabulary   NegFreq (0)
I            3
am           3
happy        0
because      0
learning     1
NLP          1
sad          2
not          1

Sum of negative frequencies for the words in the tweet: 3 + 3 + 1 + 1 + 2 + 1 = 11
Feature extraction
Tweet: "I am sad, I am not learning NLP" → x = [1, 8, 11]  (bias, positive sum, negative sum)
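A hedged sketch of this feature extraction step: [bias, sum of positive frequencies, sum of negative frequencies] over the unique words of one tweet. The freqs values below are copied from the tables above; the actual extract_features helper may differ in details.

freqs = {("I", 1): 3, ("am", 1): 3, ("happy", 1): 2, ("because", 1): 1,
         ("learning", 1): 1, ("NLP", 1): 1,
         ("I", 0): 3, ("am", 0): 3, ("learning", 0): 1, ("NLP", 0): 1,
         ("sad", 0): 2, ("not", 0): 1}

def extract_features(words, freqs):
    pos_sum = sum(freqs.get((w, 1), 0) for w in set(words))  # sum of positive frequencies
    neg_sum = sum(freqs.get((w, 0), 0) for w in set(words))  # sum of negative frequencies
    return [1, pos_sum, neg_sum]                              # 1 is the bias term

words = ["I", "am", "sad", "I", "am", "not", "learning", "NLP"]
print(extract_features(words, freqs))  # [1, 8, 11]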
Summary
● Dictionary mapping (word, class) to frequencies
➔ Cleaning unimportant information from your tweets
Preprocessing
Outline
● Removing stopwords, punctuation, handles and URLs
● Stemming
● Lowercasing
Preprocessing: stop words and punctuation
Tweet: "@Class and @Bishal are tuning a GREAT AI model at https://geeksforgeeks.com!!!"
Stop words: [and, is, are, at, has, for, a]
Punctuation: [ , . : ! " ' ]
Preprocessing: stop words and punctuation
"@Class and @Bishal are tuning a GREAT AI model at https://geeksforgeeks.com!!!"
→ remove stop words →
"@Class @Bishal tuning GREAT AI model https://geeksforgeeks.com!!!"
Preprocessing: stop words and punctuation
"@Class @Bishal tuning GREAT AI model https://geeksforgeeks.com!!!"
→ remove punctuation →
"@Class @Bishal tuning GREAT AI model https://geeksforgeeks.com"
Preprocessing: Handles and URLs
"@Class @Bishal tuning GREAT AI model https://geeksforgeeks.com"
→ remove handles and URLs →
"tuning GREAT AI model"
Preprocessing: Stemming and lowercasing
"tuning GREAT AI model"
Stemming: tune, tuned, tuning → tun
Lowercasing: GREAT, Great, great → great
Preprocessed tweet: [tun, great, ai, model]
Summary
● Stop words, punctuation, handles and URLs
● Stemming
● Lowercasing
● Less unnecessary info → better times
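A rough sketch of this preprocessing pipeline, assuming NLTK is installed and its stopwords corpus has been downloaded (nltk.download('stopwords')); the exact process_tweet helper used later may differ.

import re
import string
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import TweetTokenizer

def process_tweet(tweet):
    tweet = re.sub(r'https?://\S+', '', tweet)        # remove URLs
    tokenizer = TweetTokenizer(preserve_case=False,   # lowercase
                               strip_handles=True)    # remove @handles
    tokens = tokenizer.tokenize(tweet)
    stemmer = PorterStemmer()
    stop = set(stopwords.words('english'))
    return [stemmer.stem(t) for t in tokens           # stem each remaining token
            if t not in stop and t not in string.punctuation]

print(process_tweet("@Class and @Bishal are tuning a GREAT AI model at https://geeksforgeeks.com!!!"))
# roughly ['tune', 'great', 'ai', 'model']; exact stems depend on the stemmer (the slide shows 'tun')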
Putting it all together
Outline
● Generalize the process
● How to code it!
General overview
"I am Happy Because i am learning NLP @deeplearning"
→ Preprocessing → [happy, learn, nlp]
→ Feature Extraction → [1, 4, 2]  (bias, sum of positive frequencies, sum of negative frequencies)
General overview
"I am Happy Because i am learning NLP @geeksforgeeks" → [happy, learn, nlp] → [1, 40, 20]
"I am sad not learning NLP" → [sad, not, learn, nlp] → [1, 20, 50]
... → ... → ...
"I am sad :(" → [sad] → [1, 5, 35]
General overview
X = [[1, 40, 20],
     [1, 20, 50],
     ...,
     [1, 5, 35]]   (one row of features per tweet)
General Implementation

import numpy as np

freqs = build_freqs(tweets, labels)             # Build frequencies dictionary
X = np.zeros((m, 3))                            # Initialize matrix X: m tweets, 3 features each
for i in range(m):                              # For every tweet
    p_tweet = process_tweet(tweets[i])          # Process tweet
    X[i, :] = extract_features(p_tweet, freqs)  # Extract features
Summary
● Implement the feature extraction algorithm for your entire set of tweets
● Almost ready to train!
04
LOGISTIC REGRESSION
Plug the feature vector into the LR model
Outline
● Supervised learning and logistic regression
● Sigmoid function
Overview of logistic regression
[Diagram: Features and Parameters feed the prediction function F (the sigmoid), producing an Output; the Cost compares the Output vs. the Label, and the Parameters are updated.]
Overview of logistic regression
The prediction function is the sigmoid: h(x, θ) = 1 / (1 + e^(-θᵀx)). It maps any real-valued score θᵀx to a value between 0 and 1, interpreted as the probability that the tweet is positive.
Example: "@Class and @Bishal are tuning a GREAT AI model at https://geeksforgeeks.com!!!"
→ preprocessed: [tun, ai, great, model]
→ θᵀx = 4.92, so h(x, θ) ≈ 0.99: the tweet is classified as positive.
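A small sketch of the sigmoid prediction step, assuming a 3-element feature vector [bias, pos_sum, neg_sum] and a parameter vector theta (the values below are made up for illustration):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))          # squashes any score into (0, 1)

def predict(x, theta):
    return sigmoid(np.dot(x, theta))     # h(x, theta)

theta = np.array([3e-4, 1.5e-4, -1e-4])  # illustrative, not the trained values
x = np.array([1, 3476, 245])             # illustrative feature vector
h = predict(x, theta)
print(h, "positive" if h >= 0.5 else "negative")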
Summary
● Sigmoid function: h(x, θ) = 1 / (1 + e^(-θᵀx))
● h(x, θ) ≥ 0.5 → positive
● h(x, θ) < 0.5 → negative
Logistic Regression: Training
Outline
● Review the steps in the training process
● Overview of gradient descent
Training LR
[Plot: cost decreasing over training iterations]
Training LR
1. Initialize parameters θ
2. Classify/predict with the current θ
3. Get gradient and update θ
Repeat steps 2 and 3 until good enough (the cost stops improving).
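A hedged sketch of this gradient-descent loop, assuming a feature matrix X of shape (m, 3) and a label column vector y of shape (m, 1); the actual implementation may differ in details such as the learning rate.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_descent(X, y, alpha=1e-9, num_iters=1000):
    m = X.shape[0]
    theta = np.zeros((X.shape[1], 1))           # 1. initialize parameters
    for _ in range(num_iters):                   # repeat until good enough
        h = sigmoid(X @ theta)                   # 2. classify/predict
        grad = (X.T @ (h - y)) / m               # 3. get gradient
        theta = theta - alpha * grad             # 4. update parameters
    cost = float(-(y.T @ np.log(h) + (1 - y).T @ np.log(1 - h)) / m)
    return theta, cost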
Summary
● Visualize how gradient descent works
● Use gradient descent to train your logistic regression classifier
➔ Compute the accuracy of your model
Logistic Regression: Testing
Outline
● Using your validation set to compute model accuracy
● What the accuracy metric means
Testing logistic regression
● Use the held-out validation set X_val, Y_val and the trained parameters θ
● Predict: pred = (h(X_val, θ) ≥ 0.5), i.e. 1 if the sigmoid output is at least 0.5, else 0
● Accuracy: the fraction of validation examples for which pred equals Y_val
Summary
● Performance on unseen data
● Accuracy
To improve the model: step size, number of iterations, regularization, new features, etc.
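A sketch of the accuracy computation on the validation set, assuming X_val, Y_val and a trained theta in the same shapes used in the training sketch above:

import numpy as np

def test_logistic_regression(X_val, Y_val, theta):
    h = 1 / (1 + np.exp(-(X_val @ theta)))   # sigmoid predictions in (0, 1)
    preds = (h >= 0.5).astype(int)            # threshold at 0.5
    return np.mean(preds == Y_val)            # fraction of correct predictions = accuracy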
Logistic Regression: Cost Function
Outline
● Overview of the logistic cost function, AKA the binary cross-entropy function
Cost function for logistic regression
J(θ) = -(1/m) Σ [ y log h(x, θ) + (1 - y) log(1 - h(x, θ)) ], summed over the m training examples.
The first term, y log h, is active for y = 1 (positive class); the second term, (1 - y) log(1 - h), is active for y = 0 (negative class).
Cost function for logistic regression
Behaviour of the first term, y log h(x, θ):

y   h(x, θ)   y log h(x, θ)
0   any       0
1   0.99      ~0
1   ~0        -inf
Cost function for logistic regression
Behaviour of the second term, (1 - y) log(1 - h(x, θ)):

y   h(x, θ)   (1 - y) log(1 - h(x, θ))
1   any       0
0   0.01      ~0
0   ~1        -inf
Summary
● Strong disagreement = high cost
● Strong agreement = low cost
● Aim for the lowest cost!
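A sketch of the binary cross-entropy cost in NumPy, matching the formula above; h and y are assumed to be arrays of predictions and labels with the same shape.

import numpy as np

def binary_cross_entropy(h, y, eps=1e-15):
    h = np.clip(h, eps, 1 - eps)              # avoid log(0)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

print(binary_cross_entropy(np.array([0.99, 0.01]), np.array([1, 0])))  # strong agreement, low cost
print(binary_cross_entropy(np.array([0.01, 0.99]), np.array([1, 0])))  # strong disagreement, high cost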
ADDITIONAL INFORMATION
Tokenizing: Breaks text into meaningful units (tokens).
Example: "This is an example" → ["This", "is", "an", "example"]

Encoding: Converts tokens into numerical representations (like integer IDs).
Example: ["This", "is", "an", "example"] → [1, 2, 3, 4] (each word is mapped to a unique integer).

Embedding: Maps encoded tokens into continuous vector spaces that capture semantic meanings.
Example: [1, 2, 3, 4] → [[0.1, 0.2, ...], [0.3, 0.1, ...], ...] (each integer is mapped to a vector of real numbers).
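A toy sketch of these three steps; the vocabulary and the embedding values are made up for illustration.

import numpy as np

text = "This is an example"
tokens = text.split()                                    # tokenizing
vocab = {"This": 1, "is": 2, "an": 3, "example": 4}
ids = [vocab[t] for t in tokens]                         # encoding: [1, 2, 3, 4]

rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab) + 1, 8))   # one 8-dimensional vector per ID
embeddings = embedding_table[ids]                        # embedding: shape (4, 8)
print(ids, embeddings.shape)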
THANK YOU!!!