Dr. Diana Maynard
University of Sheffield, UK
Are We Really Interested In
Climate Change?
www.decarbonet.eu
What do people think about climate change?
And how much do
we really know
about it?
How do we know
what's really true?
It's cold in my flat
The DecarboNet Project
●
Scientists predict adverse consequences to our climate unless
stronger actions are taken
●
Collective awareness about many climate change issues is still
problematic
●
We are exposed to vast amounts of conflicting information
●
Hard to know what is accurate and relevant
●
DecarboNet: “A Decarbonisation Platform for Citizen
Empowerment and Translating Collective Awareness into
Behavioural Change”
●
3-year EU project, started October 2013
DecarboNet Objectives
• Raise Individual and Collective
Awareness
• Trigger Behavioural Change and
Foster Social Innovation
• Analyse Behavioural Patterns and
Information Diffusion
Social media analysis for climate change
●
What are people tweeting about?
●
NLP tools for the automatic discovery of new insights, by
automatically extracting information from social media.
●
Extracted information can be linked together to form new facts or to
allow new hypotheses to be explored further.
●
What arguments for and against man-made causes of climate
change develop in social media?
●
What impact does this information have?
●
How do people's opinions change over time?
●
What kinds of topics are most engaging for social media users?
Earth Hour Campaign
represented on Twitter
What is going on here???
Media Watch for Climate Change
How can this be used to raise
awareness of climate change?
●
Organisations need to have a better understanding of public perception
of climate change in order to develop campaigns and strategies.
●
Our technology helps them to understand what the opinions are on
crucial topics and events
●
How are these opinions distributed in relation to demographic
user data?
●
How have these opinions evolved?
●
Who are the opinion leaders, and what is their impact and
influence?
●
Helps them to improve their campaigns, better target them
●
Helps improve both the development and marketing of environment-
related tools and technology by better understanding social perception
and behaviour
●
Hard to get this information by traditional means (YouGov polls etc.)
We are all connected to each other...
● Information,
thoughts and
opinions are
shared prolifically
on the social web
these days
● 72% of online
adults use social
networking sites
Your grandmother is three times as likely to
use a social networking site now as in 2009
Social media
and climate
change are
not just for
young people!
Analysing Social Media is harder than it
sounds
There are lots
of things to
think about!
Not every tweet about
global warming is
relevant
Some tweets are sarcastic
Let's search for keywords like “Arctic”
Oops!
Opinion Mining involves finding out what
people think
●
A simple approach would look for positive and negative words in a tweet.
●
e.g. “Climate change is terrible.” “Recycling is a good way to save the
planet.”
●
We could simply collect lists of positive and negative words and
categorise the tweets accordingly.
●
But it's not as simple as that.
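The word-list approach described above can be sketched in a few lines. This is only an illustration of the naive idea, not the project's method: the two word lists, the `classify_naive` helper and the third example sentence are invented for this sketch.

```python
import re

# Tiny illustrative word lists (assumptions, not a real sentiment lexicon).
POSITIVE = {"good", "great", "love", "applaud", "save"}
NEGATIVE = {"terrible", "bad", "hate", "scam", "stop"}


def classify_naive(tweet: str) -> str:
    """Label a tweet by counting positive and negative words."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify_naive("Climate change is terrible."))
print(classify_naive("Recycling is a good way to save the planet."))
# Made-up example: word counting ignores both the target and the sarcasm.
print(classify_naive("I love how good politicians are at ignoring climate change"))
```

The last call is labelled positive purely because of "love" and "good", which is the kind of error that motivates the more careful processing described later.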
Language in Social Media is complicated
●
Grundman:politics makes #climatechange scientific
issue,people don’t like knowitall rational voice tellin em wat 2do
●
Want to solve the problem of #ClimateChange? Just #vote for a
#politician! Poof! Problem gone! #sarcasm #TVP #99%
●
Human Caused #ClimateChange is a Monumental Scam!
http://www.youtube.com/watch?v=LiX792kNQeE … F**k yes!!
Lying to us like MOFO's Tax The Air We Breath! F**k Them!
Challenges for NLP
●
Noisy language: unusual punctuation, capitalisation, spelling,
use of slang, sarcasm etc.
●
Terse nature of microposts such as tweets
●
Use of hashtags, @mentions etc causes problems for
tokenisation #thisistricky
●
Lack of context gives rise to ambiguities
●
NER performs poorly on microposts, mainly because of linguistic
pre-processing failure
●
Standard NER tools almost halve their performance rate when
used on tweets
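To make the tokenisation problem noted above concrete, here is a minimal sketch of a tweet-aware tokeniser. It is an assumption-laden illustration using a hand-written regular expression, not the tokeniser used in GATE or TwitIE; `TOKEN_RE` and `tokenise` are names invented for this example.

```python
import re

# Alternatives are tried left to right: URLs first, then @mentions and
# #hashtags, then ordinary words, then any remaining single symbol.
TOKEN_RE = re.compile(
    r"https?://\S+"      # URLs kept whole
    r"|[@#]\w+"          # @mentions and #hashtags kept whole
    r"|\w+(?:'\w+)?"     # words, with an optional apostrophe part
    r"|[^\w\s]"          # any other single non-space symbol
)


def tokenise(tweet: str) -> list[str]:
    """Split a tweet into tokens without breaking hashtags or mentions."""
    return TOKEN_RE.findall(tweet)


tweet = "Want to solve the problem of #ClimateChange? Just #vote for a #politician!"
print(tokenise(tweet))
```

A tokeniser written for news text would typically separate `#` from `vote`, which then confuses later steps such as part-of-speech tagging and NER.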
Persons in news articles
Persons in tweets
Lack of context causes ambiguity
Branching out from Lincoln park after dark ...
Hello Russian Navy, it's like the same thing but with glitter!
??
Getting the NEs right can be tricky
Branching out from Lincoln park after dark ...
Hello Russian Navy, it's like the same thing but with glitter!
“Positive” tweets about fracking
●
Help me stop fracking. Sign the petition to David
Cameron for a #frack-free UK now!
●
I'll take it as a sign that the gods applaud my new anti-
fracking country love song.
●
#Cameron wants to change the law to allow #fracking
under homes without permission. Tell him NO!!!!!
●
Clearly, existing tools don't work very well!
Whitney Houston wasn't very popular...
Death confuses opinion mining tools

Opinion mining
tools are good for a
general overview,
but not for some
situations
Text Analysis with GATE
●
GATE is a toolkit for Natural Language Processing (NLP),
developed at the University of Sheffield over 20 years
●
http://gate.ac.uk
●
components for language processing: parsers, machine learning
tools, stemmers, IR tools, IE components for various languages,
opinion mining
●
tools for visualising and manipulating text, annotations,
ontologies, parse trees, etc.
●
various information extraction tools
●
evaluation and benchmarking tools
GATE Components for Opinion Mining
●
TwitIE
●
structural and linguistic pre-processing, specific to
Twitter
●
includes language detection, hashtag retokenisation,
POS tagging, NER
●
Term recognition using TermRaider
●
Sentiment gazetteer lookup
●
JAPE opinion detection grammars
●
Include target and opinion holder detection based on
entities/terms
●
currently positive/negative, extending to emotion
detection (happy/sad/anger/fear etc)
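Hashtag retokenisation, listed above among the TwitIE steps, means breaking a tag such as #thisistricky into its component words. The slides do not say how TwitIE does this, so the following is a generic dictionary-based sketch; `split_hashtag` and the small vocabulary are assumptions made for illustration.

```python
def split_hashtag(tag: str, vocabulary: set[str]) -> list[str]:
    """Split a hashtag into known words; return it whole if no split is found."""
    text = tag.lstrip("#").lower()
    # best[i] holds a segmentation of text[:i], or None if there is none.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            if best[start] is not None and piece in vocabulary:
                candidate = best[start] + [piece]
                # Prefer segmentations made of fewer (longer) words.
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[-1] if best[-1] else [text]


# Hypothetical vocabulary; a real system would use a full word list.
vocabulary = {"this", "is", "tricky", "climate", "change", "his", "trick"}
print(split_hashtag("#thisistricky", vocabulary))
print(split_hashtag("#ClimateChange", vocabulary))
print(split_hashtag("#sarcasm", vocabulary))
```

Once a hashtag is split into words, any sentiment words or terms hidden inside it become visible to the gazetteer lookup.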
Basic approach for opinion finding
●
Find sentiment-containing words in a linguistic relation
with terms/entities (opinion-target matching)
●
life flourishing in Antarctica
●
Dictionaries give a starting score for sentiment words
●
Use a number of linguistic sub-components to deal with
issues such as negatives, adverbial modification, swear
words, conditionals, sarcasm etc.
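Opinion-target matching can be illustrated with a crude stand-in. The approach above relies on linguistic relations between sentiment words and terms or entities; the sketch below replaces that with a simple token-distance window, which is an assumption made only to keep the example short. The `match_opinions` function, the window size and the example dictionaries are all hypothetical.

```python
def match_opinions(tokens, sentiment_scores, targets, window=3):
    """Pair each sentiment word with targets at most `window` tokens away."""
    lowered = [t.lower() for t in tokens]
    pairs = []
    for i, word in enumerate(lowered):
        if word not in sentiment_scores:
            continue
        for j, other in enumerate(lowered):
            # Proximity is used here instead of a real linguistic relation.
            if other in targets and 0 < abs(i - j) <= window:
                pairs.append((word, other, sentiment_scores[word]))
    return pairs


tokens = "life flourishing in Antarctica".split()
print(match_opinions(tokens, {"flourishing": 0.5}, {"life", "antarctica"}))
```

Each resulting triple links a sentiment word, the term it is taken to describe, and the starting score from the dictionary.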
A positive sentiment list
●
awesome category=adjective score=0.5
●
beaming category=adjective score=0.5
●
belonging category=noun score=0.5
●
benefic category=adjective score=0.5
●
benevolently category=adverb score=0.5
●
caring category=noun score=0.5
●
charitable category=adjective score=0.5
●
charm category=verb score=0.5
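Entries like those above can be read into a lookup table. The sketch assumes the whitespace-separated layout shown on the slide, which may differ from the separator used in actual GATE gazetteer list files; `parse_entry` is a name invented for this example.

```python
def parse_entry(line: str) -> dict:
    """Parse 'word feature=value ...' into a dictionary."""
    word, *features = line.split()
    entry = {"word": word}
    for feature in features:
        key, value = feature.split("=", 1)
        # Scores become numbers; other features stay as strings.
        entry[key] = float(value) if key == "score" else value
    return entry


lines = [
    "awesome category=adjective score=0.5",
    "charm category=verb score=0.5",
]
lexicon = {e["word"]: e for e in map(parse_entry, lines)}
print(lexicon["awesome"])
```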
A negative sentiment list
Examples of phrases following the word “go”:
●
down the pan
●
down the drain
●
to the dogs
●
downhill
●
pear-shaped
Opinion scoring
●
Sentiment gazetteers (developed from sentiment words in
WordNet) have a starting “strength” score
●
These get modified by context words, e.g. adverbs, swear
words, negatives and so on
amazing campaign --> really amazing campaign.
good campaign --> not so good campaign
●
Swear words modifying adjectives count as intensifiers
good campaign --> damned good campaign
●
Swear words on their own are classified as negative
Damned politicians.
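The scoring rules above can be sketched as follows. The starting score of 0.5 mirrors the positive sentiment list; the intensifier multiplier of 1.5, the swear-word score of -0.5 and the `score_phrase` helper are assumptions chosen for illustration, not values taken from the GATE grammars.

```python
LEXICON = {"amazing": 0.5, "good": 0.5}        # starting "strength" scores
INTENSIFIERS = {"really": 1.5, "damned": 1.5}  # hypothetical multipliers
NEGATORS = {"not", "never", "no"}
SWEAR_WORDS = {"damned"}
SWEAR_SCORE = -0.5                             # hypothetical


def score_phrase(phrase: str) -> float:
    """Score the first sentiment word, adjusted by the modifiers before it."""
    tokens = [t.strip(".,!?") for t in phrase.lower().split()]
    multiplier, sign, saw_swear = 1.0, 1.0, False
    for token in tokens:
        if token in SWEAR_WORDS:
            saw_swear = True
        if token in NEGATORS:
            sign = -sign
        elif token in INTENSIFIERS:
            multiplier *= INTENSIFIERS[token]
        elif token in LEXICON:
            return sign * multiplier * LEXICON[token]
    # No sentiment word found: a swear word on its own counts as negative.
    return SWEAR_SCORE if saw_swear else 0.0


for phrase in ["amazing campaign", "really amazing campaign.",
               "not so good campaign", "damned good campaign",
               "Damned politicians."]:
    print(phrase, score_phrase(phrase))
```

Note that "damned" acts as an intensifier only when a sentiment word follows it; otherwise it falls through to the negative swear-word score.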
A positive tweet
A negative tweet
A Sarcastic Tweet
A little look at sarcasm
●
Sarcasm is usually about conveying the opposite meaning to the words
we use
●
Frequent hashtags: #sarcasm, #notreally, #whocares, #whoknew, #lol
●
Even knowing it's sarcastic isn't enough unless you know how to interpret
the sarcasm
●
Did you know, trees grow? YEEEEEEE #yee #trees #biodiversity
#wee #igottapee #notreally
●
RT @James_BG: Some lunchtime reading - why golf is an
environmental abomination and MUST BE BANNED
http://t.co/tBs0aGW8bc #notreally.
●
It can be positive as well as negative
●
@jimcramer I hate your energy and knowledge #notreally
Booyah Cramer.
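A very simple way to act on sarcasm hashtags is to flip the polarity of a tweet that carries one. The sketch below does only that and is an assumption for illustration; as noted above, knowing a tweet is sarcastic is not enough, and this sketch makes no attempt to work out which part of the tweet the sarcasm applies to. The choice of tags in `SARCASM_TAGS` and the `adjust_for_sarcasm` helper are hypothetical.

```python
# A subset of the frequent hashtags listed above, chosen for this sketch.
SARCASM_TAGS = {"#sarcasm", "#notreally"}


def adjust_for_sarcasm(tweet: str, polarity: float) -> float:
    """Flip the polarity if the tweet carries a sarcasm hashtag."""
    tags = {t.lower().rstrip(".,!?") for t in tweet.split() if t.startswith("#")}
    return -polarity if tags & SARCASM_TAGS else polarity


tweet = "@jimcramer I hate your energy and knowledge #notreally Booyah Cramer."
# Suppose an earlier step scored "hate" as -0.5 (hypothetical value).
print(adjust_for_sarcasm(tweet, -0.5))
```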
Analysis of the Earth Hour campaign
●
Analysis of hashtags and
topics mentioned
●
The main activities and themes
of the campaign drove most of
the social media conversations
●
Users engaged in the
campaign but did not
necessarily engage with
climate change and
sustainability issues.
●
Lack of correlation between
Durex campaign and climate
change engagement
Engagement Analysis
●
Retweets as the strongest engagement action
●
Identify the characteristics of those tweets that are followed
by
●
an engagement action (retweet)
●
a high level of engagement (high number of retweets)
●
For generating engagement, the content of the tweet is more
relevant than the reputation of the user.
●
This contradicts previous findings in other domains
●
Posts generating attention are slightly longer, easier to read,
have positive sentiment, mention other users and repeat
terminology from other posts
●
People are perhaps bored of hearing doom and gloom about
how the world is going to end?
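The characteristics mentioned above (length, mentions of other users and so on) can be turned into simple features for each tweet. This is a minimal sketch under stated assumptions: the feature names and the `engagement_features` helper are invented here, average word length is only a rough proxy for readability, and the study's actual feature set is not given in the slides.

```python
import re


def engagement_features(tweet: str) -> dict:
    """Compute a few surface features that could be compared with retweet counts."""
    words = re.findall(r"[A-Za-z']+", tweet)
    return {
        "length_chars": len(tweet),
        "num_words": len(words),
        "num_mentions": len(re.findall(r"@\w+", tweet)),
        "num_hashtags": len(re.findall(r"#\w+", tweet)),
        # Rough readability proxy (assumption): shorter words read more easily.
        "avg_word_length": sum(map(len, words)) / len(words) if words else 0.0,
    }


print(engagement_features("RT @James_BG: Some lunchtime reading #notreally"))
```

Features like these, together with a sentiment score, could then be compared between tweets that were retweeted and tweets that were not.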
Summary
●
Tools for social media analysis could be very useful to
companies and organisations involved in climate change
●
Understanding engagement and user impact of campaigns,
products etc.
●
Also useful to the general public to understand issues better
●
While retweets etc are a useful indicator of engagement, the
key is understanding content of social media posts
●
For this we need in-depth analysis of many language issues
●
NLP can help us understand what is really going on!
Some final thoughts on climate change
Acknowledgements and further Information
●
Research partially supported by the European Union/EU under the
Information and Communication Technologies (ICT) theme of the
7th Framework Programme for R&D (FP7) DecarboNet (610829)
http://www.decarbonet.eu
●
GATE website http://gate.ac.uk
●
Opinion mining demo: http://demos.gate.ac.uk/arcomem/opinions/
●
Diana Maynard, Gerhard Gossen, Marco Fisichella, Adam Funk. Should I
care about your opinion? Detection of opinion interestingness and
dynamics in social media. Journal of Future Internet, Special Issue on
Archiving Community Memories, 2014.
●
Diana Maynard and Mark A. Greenwood. Who cares about sarcastic
tweets? Investigating the impact of sarcasm on sentiment analysis. Proc.
of LREC 2014, Reykjavik, Iceland, May 2014.
This document does not represent the opinion of the European Community, and the
European Community is not responsible for any use that might be made of its content
