
Supervised ML & Sentiment Analysis
CONTENTS
● Review Supervised ML
● Build your own tweet classifier!

TABLE OF CONTENTS
01 Supervised ML: Training
02 Sentiment analysis: Use of ML in SA
03 Feature Extraction: Extract features
04 Logistic regression: Apply LR for SA
01
SUPERVISED ML
Training
Supervised ML (training)
Features and labels feed a prediction function with parameters; the predicted output is compared with the label by a cost function, and the parameters are updated to reduce the cost.
02
SENTIMENT ANALYSIS
Use of ML in SA
Sentiment analysis
Tweet: "I am happy because I am learning NLP"
Positive: 1
Negative: 0

Sentiment analysis with logistic regression
"I am happy because I am learning NLP" → Train LR → Classify → Positive: 1
Summary
● Features, labels → Train → Predict
● Extract features → Train LR → Predict sentiment
03
FEATURE EXTRACTION
Extract Features
Outline
● Vocabulary

● Feature extraction

● Sparse representations and some of their issues
Vocabulary
Tweets: [tweet_1, tweet_2, ..., tweet_m]
  I am happy because I am learning NLP
  ...
  I hated the movie

Vocabulary: [I, am, happy, because, learning, NLP, ..., hated, the, movie]
Feature extraction
Tweet: I am happy because I am learning NLP

Vocabulary:      [I, am, happy, because, learning, NLP, ..., hated, the, movie]
Feature vector:  [1, 1,  1,     1,       1,        1,   ..., 0,     0,   0]

A lot of zeros! That’s a sparse representation.
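A minimal sketch of this idea (not from the slides; the tiny corpus and the helper name sparse_features are illustrative):

# Build a vocabulary from a tiny corpus and represent one tweet as a binary
# vector with one entry per vocabulary word.
tweets = ["I am happy because I am learning NLP", "I hated the movie"]

vocabulary = []
for tweet in tweets:
    for word in tweet.split():
        if word not in vocabulary:
            vocabulary.append(word)

def sparse_features(tweet, vocabulary):
    words = set(tweet.split())
    return [1 if word in words else 0 for word in vocabulary]

print(vocabulary)
print(sparse_features("I am happy because I am learning NLP", vocabulary))
# For a realistic vocabulary most entries are 0: a sparse representation.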
Problems with sparse representations

I am happy because I am learning NLP
[1, 1, 1, 1, 1, 1, ..., 0, ..., 0, 0, 0]   (almost all zeros!)

1. Large training time
2. Large prediction time
Summary

● Vocabulary: set of unique words

● Vocabulary + text → [1 ... 0 ... 1 .. 0 .. 1 .. 0]

● Sparse representations are problematic for training and prediction times
Negative and Positive Frequencies
Outline
● Populate your vocabulary with a frequency count for each class

Positive and negative counts
Corpus:
  I am happy because I am learning NLP
  I am happy
  I am sad, I am not learning NLP
  I am sad

Vocabulary: [I, am, happy, because, learning, NLP, sad, not]
Positive and negative counts
Positive tweets:
  I am happy because I am learning NLP
  I am happy
Negative tweets:
  I am sad, I am not learning NLP
  I am sad
Positive and negative counts
Positive tweets:
  I am happy because I am learning NLP
  I am happy

Vocabulary    PosFreq (1)
I             3
am            3
happy         2
because       1
learning      1
NLP           1
sad           0
not           0
Positive and negative counts
Negative tweets:
  I am sad, I am not learning NLP
  I am sad

Vocabulary    NegFreq (0)
I             3
am            3
happy         0
because       0
learning      1
NLP           1
sad           2
not           1
Word frequency in classes

Vocabulary    PosFreq (1)    NegFreq (0)
I             3              3
am            3              3
happy         2              0
because       1              0
learning      1              1
NLP           1              1
sad           0              2
not           0              1

freqs: dictionary mapping from (word, class) to frequency
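A minimal sketch of how such a freqs dictionary can be built (an illustrative version of the build_freqs helper used in the implementation slide later; the comma stripping is a stand-in for real preprocessing):

# Map each (word, class) pair to how often it occurs in tweets of that class.
def build_freqs(tweets, labels):
    freqs = {}
    for tweet, label in zip(tweets, labels):
        for word in tweet.replace(",", "").split():   # crude cleanup; real code preprocesses first
            pair = (word, label)
            freqs[pair] = freqs.get(pair, 0) + 1
    return freqs

tweets = ["I am happy because I am learning NLP", "I am happy",
          "I am sad, I am not learning NLP", "I am sad"]
labels = [1, 1, 0, 0]
freqs = build_freqs(tweets, labels)
print(freqs[("happy", 1)])   # 2
print(freqs[("sad", 0)])     # 2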
Summary
● Divide the tweet corpus into two classes: positive and negative

● Count each time each word appears in either class

➔ Feature extraction for training and prediction!


Feature extraction with frequencies
Outline
● Extract features from your frequencies dictionary to create a feature vector
Word frequency in classes

Vocabulary    PosFreq (1)    NegFreq (0)
I             3              3
am            3              3
happy         2              0
because       1              0
learning      1              1
NLP           1              1
sad           0              2
not           0              1

freqs: dictionary mapping from (word, class) to frequency
Feature extraction
freqs: dictionary mapping from (word, class) to frequency

Features of tweet m:  [ bias,  sum of pos. frequencies,  sum of neg. frequencies ]
Feature extraction
Tweet: I am sad, I am not learning NLP

Vocabulary    PosFreq (1)
I             3
am            3
happy         2
because       1
learning      1
NLP           1
sad           0
not           0

Sum of positive frequencies for the words in the tweet: 3 + 3 + 0 + 0 + 1 + 1 = 8
Feature extraction
Tweet: I am sad, I am not learning NLP

Vocabulary    NegFreq (0)
I             3
am            3
happy         0
because       0
learning      1
NLP           1
sad           2
not           1

Sum of negative frequencies for the words in the tweet: 3 + 3 + 2 + 1 + 1 + 1 = 11
Feature extraction
Tweet: I am sad, I am not learning NLP
Feature vector: [1, 8, 11]
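A minimal sketch of how this [1, 8, 11] vector can be computed from the freqs dictionary (an illustrative version of the extract_features helper used later; the freqs values below are copied from the tables above):

import numpy as np

# Feature vector = [bias, sum of positive frequencies, sum of negative frequencies]
def extract_features(words, freqs):
    x = np.zeros(3)
    x[0] = 1                                  # bias term
    for word in words:
        x[1] += freqs.get((word, 1), 0)       # positive class counts
        x[2] += freqs.get((word, 0), 0)       # negative class counts
    return x

# (word, class) -> frequency, as in the tables above
freqs = {("I", 1): 3, ("am", 1): 3, ("happy", 1): 2, ("because", 1): 1,
         ("learning", 1): 1, ("NLP", 1): 1,
         ("I", 0): 3, ("am", 0): 3, ("learning", 0): 1, ("NLP", 0): 1,
         ("sad", 0): 2, ("not", 0): 1}

words = ["I", "am", "sad", "not", "learning", "NLP"]   # unique words of the tweet
print(extract_features(words, freqs))                  # [ 1.  8. 11.]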
Summary

● Dictionary mapping (word, class) to frequencies

➔ Cleaning unimportant information from your tweets
Preprocessing
Outline
● Removing stop words, punctuation, handles and URLs

● Stemming

● Lowercasing
Preprocessing: stop words and punctuation
Tweet: @Class and @Bishal are tuning a GREAT AI model at https://geeksforgeeks.com!!!

Stop words: and, is, are, at, has, for, a
Punctuation: , . : ! “ ‘

After removing stop words:
@Class @Bishal tuning GREAT AI model https://geeksforgeeks.com!!!

After removing punctuation:
@Class @Bishal tuning GREAT AI model https://geeksforgeeks.com
Preprocessing: Handles and URLs
@Class @Bishal tuning GREAT AI model https://geeksforgeeks.com

After removing handles and URLs:
tuning GREAT AI model
Preprocessing: Stemming and lowercasing
tuning GREAT AI model

Lowercasing: GREAT → Great → great
Stemming: tune, tuned, tuning → tun

Preprocessed tweet: [tun, great, ai, model]
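A minimal sketch of this preprocessing pipeline using NLTK (the library choice is an assumption, the slides do not name one; it needs the stopwords corpus downloaded once). The exact stems can differ from the slide, e.g. the Porter stemmer returns "tune" rather than "tun":

import re
import string
from nltk.corpus import stopwords          # requires nltk.download('stopwords')
from nltk.stem import PorterStemmer
from nltk.tokenize import TweetTokenizer

def process_tweet(tweet):
    tweet = re.sub(r'https?://\S+', '', tweet)                    # remove URLs
    tweet = re.sub(r'@\w+', '', tweet)                            # remove handles
    tokens = TweetTokenizer(preserve_case=False).tokenize(tweet)  # lowercase + tokenize
    stop = set(stopwords.words('english'))
    stemmer = PorterStemmer()
    return [stemmer.stem(t) for t in tokens
            if t not in stop and t not in string.punctuation]     # drop stop words and punctuation, then stem

print(process_tweet("@Class and @Bishal are tuning a GREAT AI model at https://geeksforgeeks.com!!!"))
# ['tune', 'great', 'ai', 'model']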
Summary
● Stop words, punctuation, handles and URLs

● Stemming

● Lowercasing
● Less unnecessary info → better training and prediction times
Putting it all together
Outline
● Generalize the process

● How to code it!


General overview
Tweet: I am Happy Because i am learning NLP @deeplearning

Preprocessing → [happy, learn, nlp]

Feature Extraction → [1, 4, 2]
(bias, sum of positive frequencies, sum of negative frequencies)
General overview

I am Happy Because i am learning NLP @geeksforgeeks  →  [happy, learn, nlp]     →  [1, 40, 20]
I am sad not learning NLP                            →  [sad, not, learn, nlp]  →  [1, 20, 50]
...                                                      ...                        ...
I am sad :(                                          →  [sad]                   →  [1, 5, 35]
General overview

X = [[1, 40, 20],
     [1, 20, 50],
     ...,
     [1, 5, 35]]
General Implementation

import numpy as np

freqs = build_freqs(tweets, labels)             # Build frequencies dictionary

X = np.zeros((m, 3))                            # Initialize matrix X (m = number of tweets)
for i in range(m):                              # For every tweet
    p_tweet = process_tweet(tweets[i])          # Process tweet
    X[i, :] = extract_features(p_tweet, freqs)  # Extract features
Summary
● Implement the feature extraction algorithm for your entire set of tweets
● Almost ready to train!
04
LOGISTIC REGRESSION
Plug the feature vector into the LR model
Outline
● Supervised learning and logistic regression

● Sigmoid function
Overview of logistic regression

Features and labels feed the prediction function F = sigmoid (with parameters); the output is compared with the label by a cost function, which updates the parameters.
Overview of logistic regression
[Slides: the sigmoid function and its plot]
Overview of logistic regression

Tweet: @Class and @Bishal are tuning a GREAT AI model at https://geeksforgeeks.com!!!
Preprocessed: [tun, great, ai, model]
The dot product of the parameters and the extracted features is 4.92; sigmoid(4.92) ≈ 0.99, so the tweet is classified as positive.
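A minimal sketch of the sigmoid and of turning the dot product of parameters and features into a label (the theta and x values here are illustrative, not taken from the slides):

import numpy as np

def sigmoid(z):
    # squashes any real number into (0, 1)
    return 1 / (1 + np.exp(-z))

# Illustrative values: x = [bias, pos_sum, neg_sum], theta = model parameters.
x = np.array([1.0, 8.0, 11.0])
theta = np.array([0.0, 0.5, -0.5])

h = sigmoid(np.dot(theta, x))     # predicted probability of positive sentiment
print(h)                          # ~0.18 -> below 0.5 -> negative
print(sigmoid(4.92))              # ~0.99 -> positive, as in the example above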
Summary
● Sigmoid function
● Prediction ≥ 0.5 → positive
● Prediction < 0.5 → negative
Logistic Regression: Training
Outline
● Review the steps in the training process

● Overview of gradient descent


Training LR
[Plot: cost vs. iteration]
Training LR
Initialize parameters → Classify/predict → Get gradient → Update parameters → repeat until good enough
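A minimal sketch of this loop as batch gradient descent on the binary cross-entropy cost (the learning rate, iteration count and labels are illustrative assumptions; the feature rows reuse the [1, 40, 20]-style values from the overview slides):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_descent(X, y, theta, alpha=1e-3, num_iters=1000):
    m = len(y)
    for _ in range(num_iters):
        h = sigmoid(X @ theta)                                     # classify/predict
        cost = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))   # binary cross-entropy
        grad = X.T @ (h - y) / m                                   # get gradient
        theta = theta - alpha * grad                               # update parameters
    return theta, cost

X = np.array([[1, 40, 20], [1, 20, 50], [1, 5, 35]], dtype=float)  # [bias, pos_sum, neg_sum]
y = np.array([1, 0, 0], dtype=float)                               # labels for the three tweets
theta, cost = gradient_descent(X, y, np.zeros(3))
print(theta, cost)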
Summary
● Visualize how gradient descent works
● Use gradient descent to train your logistic regression classifier

➔ Compute the accuracy of your model


Logistic Regression: Testing
Outline
● Using your validation set to compute model accuracy

● What the accuracy metric means


Testing logistic regression
[Slides: predict on the validation set with the trained parameters, threshold the sigmoid output at 0.5, and compare the predictions with the validation labels to compute accuracy]
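A minimal sketch of the accuracy computation (X_val, y_val and theta are illustrative placeholders for your validation data and trained parameters):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def test_logistic_regression(X_val, y_val, theta):
    h = sigmoid(X_val @ theta)            # predicted probabilities
    y_hat = (h >= 0.5).astype(float)      # threshold at 0.5
    return np.mean(y_hat == y_val)        # fraction of correct predictions

X_val = np.array([[1, 30, 10], [1, 4, 25]], dtype=float)   # toy validation features
y_val = np.array([1, 0], dtype=float)                      # toy validation labels
theta = np.array([0.0, 0.1, -0.1])                         # toy trained parameters
print(test_logistic_regression(X_val, y_val, theta))       # 1.0 on this toy data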
Summary
● Performance on unseen data
● Accuracy

To improve the model: step size, number of iterations, regularization, new features, etc.
Logistic Regression: Cost Function
Outline
● Overview of the logistic cost function, AKA the binary cross-entropy function
Cost function for logistic regression
[Slide: the binary cross-entropy cost function]

Cost function for logistic regression
For y = 1: positive class. For y = 0: negative class.
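For reference, the standard binary cross-entropy cost these slides describe (the formula itself appears only as an image in the deck); m is the number of training examples and h is the sigmoid prediction:

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h\big(x^{(i)}, \theta\big) + \big(1 - y^{(i)}\big) \log\big(1 - h(x^{(i)}, \theta)\big) \right]

Only the first term is non-zero when y = 1, and only the second when y = 0.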
Cost function for logistic regression
First term: y * log(prediction)

y    prediction    y * log(prediction)
0    any           0
1    0.99          ~0
1    ~0            -inf

(The leading minus sign in the cost turns -inf into a very large penalty for a confident wrong prediction.)
Cost function for logistic regression
Second term: (1 - y) * log(1 - prediction)

y    prediction    (1 - y) * log(1 - prediction)
1    any           0
0    0.01          ~0
0    ~1            -inf
Cost function for logistic regression
[Slides: plots of the two cost terms as functions of the prediction]
Summary
● Strong disagreement = high cost
● Strong agreement = low cost
● Aim for the lowest cost!


ADDITIONAL INFORMATION
Tokenizing: Breaks text into meaningful units (tokens).
Example: "This is an example" → ["This", "is", "an", "example"]

Encoding: Converts tokens into numerical representations (like integer IDs).


Example: ["This", "is", "an", "example"] → [1, 2, 3, 4] (where each word is mapped to a unique
integer).

Embedding: Maps encoded tokens into continuous vector spaces that capture
semantic meanings.
Example: [1, 2, 3, 4] → [ [0.1, 0.2, ...], [0.3, 0.1, ...], ... ] (each integer is mapped to a vector of
real numbers).
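A minimal sketch of the three steps with a toy vocabulary and a random embedding matrix (sizes and values are illustrative; in practice embeddings are learned):

import numpy as np

text = "This is an example"

# Tokenizing: break the text into tokens.
tokens = text.split()                                   # ['This', 'is', 'an', 'example']

# Encoding: map each token to a unique integer ID.
vocab = {word: idx + 1 for idx, word in enumerate(dict.fromkeys(tokens))}
ids = [vocab[word] for word in tokens]                  # [1, 2, 3, 4]

# Embedding: look up a dense vector for each ID (random here, learned in practice).
embedding_matrix = np.random.default_rng(0).normal(size=(len(vocab) + 1, 4))
vectors = embedding_matrix[ids]                         # shape: (4 tokens, 4 dimensions)
print(tokens, ids, vectors.shape)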
THANK YOU!!!
