Real Time Sentiment Analysis Using Twitter Stream API & AWS Kinesis
Armando Padilla
mandopadilla81@gmail.com
A little about me
● NodeJS & AWS enthusiast
● 15 years building technology solutions
● 4 years leading and building distributed and
colocated teams
● BS/MS in Computer Science
● Father & Husband :-)
So, what are we building?
Real Time Sentiment Dashboard using English Tweets.
Architecture & Tech
1 Producer - Takes in Twitter Stream data and sends it to Kinesis.
1 Kinesis Stream - 1 shard to handle incoming flow of data.
1 Consumer - Dashboard.
Tech
Twitter Stream API, AWS Kinesis Streams, NodeJS,
Rickshaw (graphs lib), sentiment (npm package)
Let’s get to it!
Producer
A producer injects data into a Kinesis Stream.
2 ways to inject data:
1. PutRecord - Single record per call.
2. PutRecords - Batched injection.
const AWS = require('aws-sdk');
const kinesis = new AWS.Kinesis({
  accessKeyId: 'ACCESS_KEY',
  secretAccessKey: 'SECRET_KEY',
  region: 'us-east-1'
});
const params = {
  Data: STRING_TO_SAVE,
  StreamName: AWS_STREAM_NAME,
  PartitionKey: PARTITION_KEY_OF_YOUR_CHOOSING,
};
kinesis.putRecord(params, function (err, data) {
  if (err) throw err;
  console.log("data", data); // For logging.
});
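The batched option, PutRecords, can be sketched as follows. This is a minimal sketch, not the deck's code: `buildPutRecordsParams`, `putBatches` and `partitionKeyFor` are hypothetical names, and `kinesis` is assumed to be a client created as in the snippet above. PutRecords accepts at most 500 records per request, so the list is split into chunks.

```javascript
// PutRecords accepts at most 500 records per request.
const MAX_BATCH = 500;

// Split a list of strings into PutRecords parameter objects.
// `partitionKeyFor` is a hypothetical helper that picks a key for each item.
function buildPutRecordsParams(streamName, items, partitionKeyFor) {
  const batches = [];
  for (let i = 0; i < items.length; i += MAX_BATCH) {
    batches.push({
      StreamName: streamName,
      Records: items.slice(i, i + MAX_BATCH).map((item) => ({
        Data: item,
        PartitionKey: partitionKeyFor(item),
      })),
    });
  }
  return batches;
}

// Send every batch. A PutRecords call can partially fail,
// so FailedRecordCount is checked on each response.
function putBatches(kinesis, batches) {
  batches.forEach((params) => {
    kinesis.putRecords(params, (err, data) => {
      if (err) throw err;
      if (data.FailedRecordCount > 0) {
        console.warn('records to retry:', data.FailedRecordCount);
      }
    });
  });
}
```

Batching trades a little latency for far fewer API calls, which matters once the tweet volume grows.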
Twitter Stream Integration
Continuous random stream of live
tweets.
API: statuses/sample
Filters: language=en
const Twitter = require('twitter');
const AWS = require('aws-sdk');
const client = new Twitter({}); // See docs for props to use.
client.stream('statuses/sample', {language: 'en'}, function(stream){
  stream.on('data', function(event){
    const text = event.text;
    // Producer Slide Content Here
  });
  stream.on('error', function(e){
    throw e;
  });
});
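One way to connect the stream handler to the producer is a small mapping function. This is a sketch under assumptions: `toKinesisRecord` is a hypothetical helper, tweets are assumed to carry `text` and `id_str`, and using the tweet id as the partition key is a choice made here for illustration, not something the slides specify. The sample stream also emits non-tweet events (for example delete notices) that have no `text`, so those are skipped.

```javascript
// Turn one stream event into putRecord params, or null if it should be skipped.
function toKinesisRecord(event, streamName) {
  // Events without a text field (e.g. delete notices) are not tweets.
  if (!event || typeof event.text !== 'string') return null;
  return {
    Data: event.text,
    StreamName: streamName,
    // Assumption: the tweet id is used as the partition key.
    PartitionKey: event.id_str || 'unknown',
  };
}

// Inside stream.on('data', ...) this could be used as:
//   const params = toKinesisRecord(event, AWS_STREAM_NAME);
//   if (params) kinesis.putRecord(params, callback);
```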
Kinesis Streams
Collects and processes large streams of
data in real time.
Shards
● Writes: up to 1 MB/sec per shard
● Writes: up to 1,000 records/sec per shard
● Reads: up to 2 MB/sec per shard
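The per-shard limits above translate into a rough shard count for a given workload. A minimal sketch, with `shardsNeeded` as a hypothetical helper and the workload numbers made up for illustration:

```javascript
// Per-shard limits from the slide above.
const WRITE_MB_PER_SEC = 1;
const WRITE_RECORDS_PER_SEC = 1000;
const READ_MB_PER_SEC = 2;

// The tightest of the three limits decides how many shards are needed.
function shardsNeeded(writeMBps, writeRecordsPerSec, readMBps) {
  return Math.max(
    1,
    Math.ceil(writeMBps / WRITE_MB_PER_SEC),
    Math.ceil(writeRecordsPerSec / WRITE_RECORDS_PER_SEC),
    Math.ceil(readMBps / READ_MB_PER_SEC)
  );
}

console.log(shardsNeeded(0.5, 600, 1));  // 1
console.log(shardsNeeded(2.5, 1500, 5)); // 3
```

For the sampled English tweet stream in this deck, 1 shard is what the architecture uses.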
Consumer
Reads data from a stream.
In our example, the consumer is an endpoint used by the front-end graph. It pulls
data from the stream and calculates sentiment.
2 steps:
1. Fetch iterator
2. Fetch data
Consumer - cont.
Fetch Iterator
A shard iterator allows us to pull
data from the stream from a
specific point along the stream.
const AWS = require('aws-sdk');
const kinesis = new AWS.Kinesis({
  accessKeyId: AWS_KINESIS_ACCESS_KEY_ID,
  secretAccessKey: AWS_KINESIS_SECRET_ACCESS_KEY,
  region: AWS_KINESIS_REGION,
});
// Fetch initial iterator.
var params = {
  ShardId: SHARD_ID,
  StreamName: AWS_KINESIS_STREAMNAME,
  ShardIteratorType: 'LATEST' // Only records written after this call.
};
kinesis.getShardIterator(params, (err, data) => {
  if (err) throw err; // No enclosing Promise here, so throw.
  const iterator = data.ShardIterator; // Passed to getRecords as ShardIterator.
});
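If a Promise-based flow is preferred, the callback can be wrapped. A minimal sketch: `fetchShardIterator` is a hypothetical name, and the client is passed in as an argument so the helper works with any object exposing the same callback-style `getShardIterator` method.

```javascript
// Wrap getShardIterator in a Promise that resolves to the iterator string.
function fetchShardIterator(kinesis, streamName, shardId) {
  const params = {
    ShardId: shardId,
    StreamName: streamName,
    ShardIteratorType: 'LATEST', // Only records written after this call.
  };
  return new Promise((resolve, reject) => {
    kinesis.getShardIterator(params, (err, data) => {
      if (err) return reject(err);
      resolve(data.ShardIterator);
    });
  });
}

// Possible usage:
//   fetchShardIterator(kinesis, AWS_KINESIS_STREAMNAME, SHARD_ID)
//     .then((iterator) => { /* call getRecords with it */ });
```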
Consumer - cont.
Fetch Data
With an iterator, fetch the data using
getRecords. The data we need is in the
Records property of the response.
To continue from this point in the
stream on the next call, use the
iterator returned in
NextShardIterator.
params = { ShardIterator: shardIterator };
kinesis.getRecords(params, (err, data) => {
  if (err) throw err;
  if (data.Records) {
    data.Records.forEach((record) => {
      const content = record.Data.toString();
      console.log('content', content); // Logging
    });
  }
  // Get the next iterator.
  if (!data.NextShardIterator) {
    shardIterator = null;
  } else {
    shardIterator = data.NextShardIterator;
  }
});
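The two steps can also be chained into a simple read loop. This is a sketch under assumptions, not the deck's endpoint: `readBatch`, `poll` and `delayMs` are hypothetical names, and the client is passed in as an argument. A shard accepts at most 5 getRecords calls per second, so a pause between calls is included.

```javascript
// Extract tweet text and the next iterator from a getRecords response.
function readBatch(data) {
  const records = data.Records || [];
  return {
    contents: records.map((record) => record.Data.toString()),
    // NextShardIterator comes back null once a closed shard is fully read.
    nextIterator: data.NextShardIterator || null,
  };
}

// Keep reading until there is no next iterator.
// `delayMs` is the pause between calls, e.g. 1000.
function poll(kinesis, shardIterator, onContents, delayMs) {
  kinesis.getRecords({ ShardIterator: shardIterator }, (err, data) => {
    if (err) throw err;
    const batch = readBatch(data);
    onContents(batch.contents);
    if (batch.nextIterator) {
      setTimeout(() => poll(kinesis, batch.nextIterator, onContents, delayMs), delayMs);
    }
  });
}
```

In the dashboard the front end drives the polling instead, passing the iterator back on every request.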
Consumer - Sentiment Add On
Calculate Sentiment - Naive
Implementation
Calculate the average sentiment for the
set of tweets returned.
Each score is moved one step away from zero
(+1 for scores of 0 or above, -1 for negative
scores) so that a score of 0 still counts.
This value is returned to the front end.
// Grab sentiment.
var avgMood = 0;
if (data.Records && data.Records.length != 0) {
  data.Records.forEach((record) => {
    const content = record.Data.toString();
    var mood = sentiment(content).score;
    if (mood >= 0) mood += 1;
    else mood -= 1;
    avgMood += mood;
  });
  // Guarded by the check above, so no division by zero.
  avgMood = avgMood / data.Records.length;
}
// Get the next iterator code here
res.status(200).json({
  nextIterator: shardIterator,
  Mood: avgMood
});
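The averaging step can be isolated as a pure function, which makes it easy to check by hand. A minimal sketch: `averageMood` and `scoreOf` are hypothetical names, and the scores in the example are made up. In the deck `scoreOf` would be something like `(text) => sentiment(text).score`. The exact call style depends on the version of the sentiment package installed.

```javascript
// Average the shifted scores of a list of texts.
// `scoreOf` stands in for the sentiment scoring function.
function averageMood(texts, scoreOf) {
  if (texts.length === 0) return 0;
  const total = texts.reduce((sum, text) => {
    const score = scoreOf(text);
    // Move each score one step away from zero.
    return sum + (score >= 0 ? score + 1 : score - 1);
  }, 0);
  return total / texts.length;
}

// Worked example with made-up scores: 3, 0, -1 become 4, 1, -2,
// which sum to 3 and average to 1.
const fakeScores = { good: 3, meh: 0, bad: -1 };
console.log(averageMood(['good', 'meh', 'bad'], (t) => fakeScores[t])); // 1
```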
Front End Dashboard
Init Rickshaw Graph
var timeInterval = 1000;
var url = "http://localhost:3000/data";
// instantiate our graph!
var graph = new Rickshaw.Graph({
  element: document.getElementById("chart"),
  width: 900,
  height: 500,
  renderer: 'line',
  series: new Rickshaw.Series.FixedDuration([{ name: 'Mood', color: 'steelblue' }],
    undefined, {
      timeInterval: timeInterval,
      maxDataPoints: 100,
    }),
  min: -10,
  max: 10,
});
graph.render();
Make Up
1. Basic HTML
2. Rickshaw to build Graph
3. jQuery to call the API
Front End Dashboard
Fetch Data every X seconds
// Fetch data every X seconds.
var nextIterator = null;
setInterval(function() {
  $.getJSON(url + '?nextIterator=' + encodeURIComponent(nextIterator), function(data) {
    nextIterator = data.nextIterator;
    graph.series.addData({
      Mood: data.Mood
    });
    graph.render();
  });
}, timeInterval);
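One detail of the request URL: on the first call nextIterator is null, and encodeURIComponent(null) produces the string "null", so the server receives nextIterator=null. A possible alternative is to leave the parameter out until an iterator exists. This is a sketch: `buildDataUrl` is a hypothetical helper, and it assumes the endpoint treats a missing parameter as "fetch a fresh iterator".

```javascript
// Build the request URL, adding the iterator only when there is one.
function buildDataUrl(baseUrl, nextIterator) {
  if (!nextIterator) return baseUrl;
  return baseUrl + '?nextIterator=' + encodeURIComponent(nextIterator);
}

// Possible usage inside the interval callback:
//   $.getJSON(buildDataUrl(url, nextIterator), function(data) { ... });
```

Encoding still matters for real iterators, since they are long opaque strings that can contain characters such as + and /.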
Next Steps & Conclusion
1. Move the sentiment analysis to Kinesis Analytics.
2. Store calculated sentiment scores so historical data can be fetched.
3. Update the sentiment algorithm used.
References
App Code
https://github.com/armandopadilla/twitter_sentiment_public
Sentiment Package
https://www.npmjs.com/package/sentiment
AWS Kinesis Docs
https://aws.amazon.com/documentation/kinesis/
Rickshaw Graphs
http://code.shutterstock.com/rickshaw/
