ASSIGNMENT-3
Sentiment Analysis of Tweets during Election Campaigns
1. INTRODUCTION:
This project focuses on analyzing the sentiments expressed in tweets during election
campaigns. Social media platforms like Twitter play a vital role in shaping public opinion,
and analyzing these sentiments provides valuable insights into voters' perceptions and
political trends. The aim is to classify tweets as Positive, Negative, or Neutral using Natural
Language Processing (NLP) and Deep Learning techniques.
• Objective: Predict the sentiment of tweets posted during election campaigns.
• Dataset: A collection of tweets related to Indian election campaigns (bjp_tweets.csv).
• Approach: Use a deep learning model (LSTM) to capture the contextual meaning of
tweets and classify their sentiment.
2. ABSTRACT:
This project performs sentiment analysis on election-related tweets to understand public
opinion toward political campaigns. The dataset (bjp_tweets.csv) contains real tweets
related to Indian elections. The data undergoes preprocessing such as cleaning,
tokenization, stopword removal, and lemmatization. Tweets are converted into numerical
form using word embeddings, and an LSTM-based neural network is trained to classify
tweets into positive, negative, or neutral categories. The model performance is evaluated
using accuracy, precision, recall, and F1-score metrics. Word clouds and sentiment
distribution charts visually represent the results, offering insights into people's attitudes
during the campaign period.
3. DATA PREPROCESSING:
Data preprocessing ensures that the model learns effectively from clean and structured
input text.
Steps (a worked example follows this list):
• Loading Data: The dataset (bjp_tweets.csv) is loaded using pandas.
• Cleaning Text: Remove URLs, hashtags, mentions, emojis, and special characters.
• Lowercasing and Tokenization: Convert text to lowercase and split it into tokens.
• Stopword Removal and Lemmatization: Remove common stopwords that carry little
sentiment information, and reduce the remaining tokens to their base (dictionary) form.
• Word Embedding: Convert words into dense vector representations using Tokenizer and
Embedding layers in Keras.
• Train-Test Split: Split data into 80% training and 20% testing sets for model evaluation.
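For illustration, here is what the cleaning steps do to one hypothetical tweet (a minimal sketch; the full pipeline in the PROGRAM section uses NLTK's complete stopword list and a lemmatizer):

import re

raw = "Great rally by @some_leader today!! #Elections2019 https://t.co/xyz"  # hypothetical tweet

text = re.sub(r'http\S+', '', raw)       # drop the URL
text = re.sub(r'@\w+|#\w+', '', text)    # drop the mention and the hashtag
text = re.sub(r'[^a-zA-Z ]', '', text)   # drop punctuation and digits
text = text.lower()
print(text.split())                      # ['great', 'rally', 'by', 'today']

stop_words = {'by'}                      # stand-in for NLTK's English stopword list
print([w for w in text.split() if w not in stop_words])  # ['great', 'rally', 'today']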
4. MODEL DESIGN:
The deep learning model used is a Long Short-Term Memory (LSTM) network, suitable for
sequential text data.
Architecture (a parameter-count sketch follows the list):
• Embedding Layer: Converts each word into a fixed-length dense vector.
• LSTM Layer: Captures the sequential and contextual meaning of words.
• Dense Layers: Fully connected layers for classification.
• Output Layer: Softmax activation for three sentiment categories (Positive, Negative,
Neutral).
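As a quick sanity check on this design, the number of trainable parameters can be computed by hand; a sketch assuming the hyperparameters used in the PROGRAM section (vocabulary 5,000, embedding dimension 128, sequence length 100, 128 LSTM units):

# Hand-computed trainable parameters for the architecture above
embedding = 5000 * 128                     # 640,000
lstm = 4 * (128 + 128 + 1) * 128           # 131,584: 4 gates x (input + recurrent + bias) x units
dense1 = 128 * 64 + 64                     # 8,256
output = 64 * 3 + 3                        # 195
print(embedding + lstm + dense1 + output)  # 780,035, matching model.summary()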
Model Compilation:
• Loss Function: Categorical Crossentropy
• Optimizer: Adam
• Metrics: Accuracy
5. TRAINING PROCESS:
The model is trained on the processed tweet data for 10–15 epochs using mini-batches of
32 samples.
Early stopping is implemented to prevent overfitting by monitoring validation loss. After
training, the model is evaluated on the test dataset, and performance metrics are calculated.
Metrics Evaluated (a computation sketch follows the list):
• Accuracy
• Precision
• Recall
• F1-Score
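For reference, a minimal sketch of computing these four metrics with scikit-learn, macro-averaged over the three classes (y_true and y_pred are the integer class indices produced in the PROGRAM section):

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Macro-averaging weights the three sentiment classes equally
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average='macro', zero_division=0)
print("Accuracy :", round(accuracy_score(y_true, y_pred), 3))
print("Precision:", round(precision, 3), "Recall:", round(recall, 3), "F1:", round(f1, 3))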
Visualizations such as the confusion matrix, sentiment distribution, and word clouds are
generated to interpret the model's results (the plotting code appears at the end of the
PROGRAM section).
PROGRAM:
import pandas as pd
import numpy as np
import re
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.callbacks import EarlyStopping
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the NLTK resources used below (needed only on the first run)
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('vader_lexicon')
# Load dataset
df = pd.read_csv('bjp_tweets.csv')
print(df.head())
# Preprocessing: clean, tokenize, remove stopwords, lemmatize
stop_words = set(stopwords.words('english'))   # build the set once, not per tweet
lemmatizer = WordNetLemmatizer()

def clean_text(text):
    text = re.sub(r'http\S+', '', text)          # remove URLs
    text = re.sub(r'@[A-Za-z0-9_]+', '', text)   # remove mentions
    text = re.sub(r'#[A-Za-z0-9_]+', '', text)   # remove hashtags
    text = re.sub(r'[^a-zA-Z ]', '', text)       # remove emojis, digits, punctuation
    text = text.lower()
    tokens = word_tokenize(text)
    tokens = [lemmatizer.lemmatize(w) for w in tokens if w not in stop_words]
    return ' '.join(tokens)

df['clean_text'] = df['text'].astype(str).apply(clean_text)
# Label tweets using VADER if not labeled
sia = SentimentIntensityAnalyzer()

def get_sentiment(text):
    score = sia.polarity_scores(text)['compound']
    if score >= 0.05:
        return 'Positive'
    elif score <= -0.05:
        return 'Negative'
    else:
        return 'Neutral'

df['label'] = df['clean_text'].apply(get_sentiment)
# Encode labels
le = LabelEncoder()
df['encoded_label'] = le.fit_transform(df['label'])
# Tokenization
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(df['clean_text'])
X = tokenizer.texts_to_sequences(df['clean_text'])
X = pad_sequences(X, maxlen=100)
y = tf.keras.utils.to_categorical(df['encoded_label'])
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build LSTM model
model = Sequential([
    Embedding(5000, 128, input_length=100),         # word index -> 128-d dense vector
    LSTM(128, dropout=0.2, recurrent_dropout=0.2),  # sequential context of the tweet
    Dense(64, activation='relu'),
    Dropout(0.3),
    Dense(3, activation='softmax')                  # Positive / Negative / Neutral
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train model; early stopping on validation loss guards against overfitting (Section 5)
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
history = model.fit(X_train, y_train, epochs=15, batch_size=32,
                    validation_split=0.2, callbacks=[early_stop], verbose=1)
# Evaluate model on the held-out test set
y_pred = np.argmax(model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Classification Report:\n",
      classification_report(y_true, y_pred, target_names=le.classes_))
print("Confusion Matrix:\n", confusion_matrix(y_true, y_pred))
OUTPUT:
The model achieved an accuracy between 80% and 90% on the test data.
The classification report and confusion matrix showed the distribution of correctly and
incorrectly classified tweets.
Word clouds showed that Positive tweets frequently contained terms such as 'development',
'leader', and 'India', while Negative tweets contained terms such as 'corruption', 'fail',
and 'unemployment'.
Sentiment distribution and training accuracy/loss curves were plotted to visualize model
performance.
REFERENCE LINKS:
1. Kaggle Indian Election Tweets Dataset: https://www.kaggle.com/datasets
2. NLTK Sentiment Analysis (VADER): https://www.nltk.org/howto/sentiment.html
3. TensorFlow Keras Documentation: https://www.tensorflow.org/guide/keras
4. Scikit-learn Documentation: https://scikit-learn.org/stable/
5. Matplotlib and Seaborn Visualization Libraries: https://matplotlib.org,
https://seaborn.pydata.org