Understanding Rumor Detection on Social Media

This document discusses research into detecting rumours on social media. It aims to develop a rumour classification system with four components: rumour detection, rumour tracking, rumour stance classification, and rumour veracity classification. The methodology involves collecting tweets from Twitter with an information harvester, storing them in a MongoDB database, preprocessing the data, extracting features, and applying sentiment analysis and machine learning classifiers on an analysis and development platform to detect trends and characteristics of rumours. The document also discusses enhancing the existing workflow and testing models on additional datasets.

Uploaded by

chirag

RUMOUR DETECTION FROM SOCIAL MEDIA

BY:
SOHAM NANDY
SAHIL MAHAJAN
VARDAAN BAJAJ
RUMOURS?
What is a Rumour?

- A rumour is a story or piece of information that may or may not be true, but that people are talking about.

- Two types of rumours:-
  1) Long-standing rumours
  2) Newly emerging rumours
Problem Definition

- Rumours, fortunately or unfortunately, affect us all, and in more ways than we care to remember.

- Despite the increasing use of social media platforms for information and news gathering, their unmoderated nature often leads to the emergence and spread of rumours.

- At the same time, the openness of social media platforms provides opportunities to study how users share and discuss rumours, and to explore how to automatically assess their veracity using natural language processing and data mining techniques.
Problem Definition

- We provide an overview of research into social media rumours with the ultimate goal of developing a rumour classification system that consists of four components:
  1. Rumour detection,
  2. Rumour tracking,
  3. Rumour stance classification, and
  4. Rumour veracity classification.

- We delve into the approaches presented in the scientific literature for the development of each of these four components.
Aim
- This project aims to investigate the characteristics of rumours found on online social networks. These characteristics could include the size and frequency of messages, message propagation through the social network, and the sentence structure of the messages.

- This study seeks to identify the key traits of rumours on online social networks such as Twitter. Automating the identification of rumours is becoming increasingly important, given the rise of the internet's popularity as a source of news and the ever-growing amount of information online.
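As a rough illustration of the "size and frequency" characteristics mentioned above, the following sketch summarises message length and posting rate. The function name `message_stats`, the hourly bucketing, and the sample messages are assumptions made for this example only; they are not taken from the project.

```python
from collections import Counter
from datetime import datetime
from statistics import mean


def message_stats(messages):
    """Summarise size and frequency for a list of (timestamp, text) pairs."""
    lengths = [len(text) for _, text in messages]
    # Bucket messages by the hour in which they were posted.
    per_hour = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts, _ in messages
    )
    return {
        "count": len(messages),
        "mean_length": mean(lengths) if lengths else 0.0,
        "peak_per_hour": max(per_hour.values(), default=0),
    }


# Invented sample data, for illustration only.
sample = [
    (datetime(2020, 1, 1, 9, 5), "Is the bridge really closed?"),
    (datetime(2020, 1, 1, 9, 40), "Bridge closed, avoid the area!"),
    (datetime(2020, 1, 1, 11, 15), "Officials say the bridge is open."),
]
stats = message_stats(sample)
```

Hourly buckets are only one possible granularity; a finer or coarser window may suit a particular dataset better.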
Methodology
Project Planning:-
The project plan must account for all tasks required to complete the project, including the research direction of the project and the dependencies between tasks.

Information Harvester Development:-

The information harvester must be able to collect tweets from Twitter in an automated and consistent manner.

Literature Review:-
A literature review will be undertaken across both Computer Science and the Social Sciences, to gain a mix of insights into how rumours are detected.
Methodology
Feature Selection & Engineering:-
This project will explore datasets gathered by collecting tweets through the Twitter API. Investigative work will be performed to engineer additional features from existing tweet data, such as tweet type and tweet text. This stage will also include manual labelling of tweets to indicate each tweet's class (e.g. is news, is rumour), which will be used as the target label for classification purposes.
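A minimal sketch of engineering features from tweet text might look as follows. The particular features chosen, the function name `tweet_features`, and the example tweet are all assumptions for illustration; the slides do not list the features actually used.

```python
import re


def tweet_features(text, label=None):
    """Derive simple surface features from the text of one tweet."""
    tokens = text.split()
    return {
        "n_chars": len(text),
        "n_words": len(tokens),
        "n_hashtags": sum(t.startswith("#") for t in tokens),
        "n_mentions": sum(t.startswith("@") for t in tokens),
        "n_urls": len(re.findall(r"https?://\S+", text)),
        "has_question": "?" in text,
        # Manually assigned target label, e.g. "rumour" or "news".
        "label": label,
    }


# Invented example tweet, for illustration only.
row = tweet_features(
    "Is it true? #breaking @newsdesk https://example.com/x", label="rumour"
)
```

Each returned dictionary can serve as one row of a feature table, with `label` as the classification target.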

Sentiment Analysis using Machine Learning Techniques:-

Further features will be engineered using sentiment analysis libraries. Machine learning classifiers will then be used to detect key trends in the dataset.
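The slides do not say which classifier is used. As a stand-in, the sketch below implements a small multinomial Naive Bayes text classifier from scratch, with add-one smoothing. The function names `train_nb` and `predict_nb` and the four training sentences are invented for this example; sentiment scores from a library could be added as further features in a fuller version.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Train a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of examples
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy training set, for illustration only.
toy = [
    ("unconfirmed reports say the mayor resigned", "rumour"),
    ("heard the mayor resigned is it true", "rumour"),
    ("official statement confirms new budget", "news"),
    ("council publishes budget report today", "news"),
]
model = train_nb(toy)
prediction = predict_nb(model, "is it true the mayor resigned")
```

A toy set this small only demonstrates the mechanics; a real experiment would need the manually labelled tweets described earlier and a held-out test split.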

Testing, Results, and Discussion:-

The testing phase will report characteristics of the datasets collected and elaborate on the impact
of the findings generated.

Full System Integration:-

Full system integration aims to provide an easy-to-use web interface through which the user can discover insights from the datasets and the experiment results.
Workflow
The general data workflow consists of the following four elements:-

1. Twitter
Twitter is a social network platform where participants can make posts and interact with fellow participants using hashtags, quote retweets, retweets, and comments. The datasets used in this project are based on tweets collected from Twitter.

2. Information Harvester
The Information Harvester collects tweets from Twitter based on search queries supplied by the user.

3. MongoDB Database
The MongoDB Database stores tweets from the Information Harvester. Tweets are put through a data cleaning process and are then imported into the MongoDB Database.

4. Analysis & Development Platform

The Analysis & Development Platform is where all further in-depth analysis and work are performed.
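The import step into MongoDB could be sketched as below. This is an assumption about how the import might work, not the project's actual code. The function `store_tweets` expects any object with a pymongo-style `update_one(filter, update, upsert=True)` method; the `InMemoryCollection` class is a hypothetical stand-in so the example runs without a database server.

```python
def store_tweets(collection, tweets):
    """Upsert cleaned tweets, keyed by tweet ID, into a MongoDB-style collection."""
    stored = 0
    for tweet in tweets:
        fields = {k: v for k, v in tweet.items() if k != "id"}
        # Using the tweet ID as _id means re-importing the same tweet
        # updates the existing document instead of adding a duplicate.
        collection.update_one({"_id": tweet["id"]}, {"$set": fields}, upsert=True)
        stored += 1
    return stored


class InMemoryCollection:
    """Demo stand-in that mimics only the single method used above."""

    def __init__(self):
        self.docs = {}

    def update_one(self, filter, update, upsert=False):
        key = filter["_id"]
        if key in self.docs or upsert:
            self.docs.setdefault(key, {"_id": key}).update(update["$set"])


# With a running server, a real collection could be obtained with, for example,
# pymongo.MongoClient()["rumours"]["tweets"] (database and collection names
# here are hypothetical).
coll = InMemoryCollection()
n = store_tweets(
    coll,
    [
        {"id": 1, "text": "a"},
        {"id": 1, "text": "a (edited)"},
        {"id": 2, "text": "b"},
    ],
)
```

Keying documents by tweet ID makes the import idempotent, which helps when the same archive is processed more than once.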
TWITTER
» 2nd largest social networking site
» 1,300,000,000 Twitter accounts
» 5,000,000 tweets per day
INFORMATION HARVESTER
» Automated 24/7 tweet collection
» Network optimizations
» Duplicate tweet reduction
» Gzipped archives for 90% space savings
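The duplicate reduction and gzipped archiving listed above might be sketched as follows. The JSON Lines layout, the function name `archive_tweets`, and the use of an `id` field as the duplicate key are assumptions for this example. The actual space saving depends on the data, so no ratio is claimed here.

```python
import gzip
import json
import os
import tempfile


def archive_tweets(tweets, path, seen_ids=None):
    """Append unseen tweets to a gzipped JSON Lines file; return the number written."""
    seen_ids = set() if seen_ids is None else seen_ids
    written = 0
    # Mode "at" appends text; gzip compresses the data as it is written.
    with gzip.open(path, "at", encoding="utf-8") as fh:
        for tweet in tweets:
            if tweet["id"] in seen_ids:
                continue  # skip duplicates returned by overlapping queries
            seen_ids.add(tweet["id"])
            fh.write(json.dumps(tweet) + "\n")
            written += 1
    return written


# Invented batch with one duplicate, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "tweets.jsonl.gz")
batch = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}, {"id": 1, "text": "a"}]
written = archive_tweets(batch, path)

with gzip.open(path, "rt", encoding="utf-8") as fh:
    restored = [json.loads(line) for line in fh]
```

Passing the same `seen_ids` set to successive calls would carry the duplicate check across batches.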
DATA PREPROCESSING
1. Decompress archives
2. Remove tweet duplicates
3. Label tweets with tweet types
4. Generate tweet relationship data
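The four steps above can be sketched in a few lines. This example assumes tweets are stored as gzipped JSON Lines and carry Twitter API v1.1-style fields (`retweeted_status`, `is_quote_status`, `quoted_status_id`, `in_reply_to_status_id`); the function names, the type labels, and the edge format are hypothetical.

```python
import gzip
import json


def tweet_type(tweet):
    """Label a tweet using Twitter API v1.1-style fields (an assumption)."""
    if "retweeted_status" in tweet:
        return "retweet"
    if tweet.get("is_quote_status"):
        return "quote"
    if tweet.get("in_reply_to_status_id") is not None:
        return "reply"
    return "original"


def preprocess(archive_bytes):
    """Decompress, de-duplicate, label, and extract relationship edges."""
    lines = gzip.decompress(archive_bytes).decode("utf-8").splitlines()
    unique = {}
    for line in lines:
        if line.strip():
            tweet = json.loads(line)
            unique.setdefault(tweet["id"], tweet)  # keep first copy of each ID
    edges = []  # (source tweet ID, target tweet ID, relationship)
    for tweet in unique.values():
        tweet["type"] = tweet_type(tweet)
        if tweet["type"] == "retweet":
            edges.append((tweet["id"], tweet["retweeted_status"]["id"], "retweet"))
        elif tweet["type"] == "quote" and "quoted_status_id" in tweet:
            edges.append((tweet["id"], tweet["quoted_status_id"], "quote"))
        elif tweet["type"] == "reply":
            edges.append((tweet["id"], tweet["in_reply_to_status_id"], "reply"))
    return list(unique.values()), edges


# Invented archive containing one duplicate retweet.
raw = [
    {"id": 1, "text": "Bridge closed?"},
    {"id": 2, "text": "RT Bridge closed?", "retweeted_status": {"id": 1}},
    {"id": 3, "text": "Source?", "in_reply_to_status_id": 1},
    {"id": 2, "text": "RT Bridge closed?", "retweeted_status": {"id": 1}},
]
archive = gzip.compress("\n".join(json.dumps(t) for t in raw).encode("utf-8"))
tweets, edges = preprocess(archive)
```

The resulting edge list is one simple way to represent tweet relationships; it could feed a later analysis of message propagation.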
Future Scope
As this is only a preliminary and broad study of rumours on online social networks, it can be improved in the following ways:-

1) The existing workflow can be enhanced by:

- Leveraging GPU acceleration to speed up calculations

- Using a distributed database for greater scalability
- Importing and visualizing data in real time

2) Testing can be extended by:

- Evaluating existing models on public datasets (e.g. news datasets)

- Evaluating existing models on other types of text (e.g. articles)
Some Snapshots of the App
Questions?
Thank You
