Introduction to emotion detection
Tyler Schnoebelen
Stanford University
Welcome to the electronic copy of my
presentation!
• I’ve put a lot of stuff in the notes fields, so make
sure to check them out.
• Unfortunately, most of the audio/video
embedded in this presentation will only play if
you have a recent version of PowerPoint.
– If you can’t play the audio/video but would like to
hear it, please email me at tylers at stanford dot edu
and I’ll make it available.
• Please do let me know about any thoughts or
questions!
Goals
1. Reasons to do emotion detection
2. Understanding the main findings (cues)
3. Overview of limitations
4. Finding new ways forward
– Voice quality
– Using change over time
– A new measurement of speech tempo
– Adding lexical content (won’t do much of this
here, though)
WHY DETECT EMOTION?
It looks like you’re full of
rage and indignation.
Would you like to:
[ ] Sound even angrier
[ ] Sound less angry
Also: we aren’t robots
• Linguistics is about
understanding human
beings.
• To understand human
beings is to understand
the variety and
complexity of emotional
experiences they have.
• Linguistics can offer a lot
by showing how linguistic
resources are used in
creating and coping with
these experiences.
WHAT ARE WE DETECTING?
What are we detecting?
Juslin and Laukka (2003)
The basic patterns
• Anger, fear, happiness
– Fast speech rate
– High voice intensity (mean)
– High voice intensity variability
– High-frequency energy
– High F0
• Sadness, tenderness
– Pretty much the opposite of above:
• Slow speech rate, low voice intensity, low voice intensity variability, low high-frequency energy, low F0
• F0 variability?
– Anger and happiness are high
– Fear DOES NOT pattern with them; there’s low F0 variability, as with
tenderness and sadness
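A minimal base-R sketch (not from the talk) of how these cue dimensions can be summarized, assuming frame-level F0 and intensity tracks and a syllable count have already been extracted elsewhere (e.g., with Praat or the de Jong & Wempe script cited later); all names are hypothetical:

# Summarize the cue dimensions above from frame-level tracks (hypothetical inputs)
cue_summary <- function(f0_hz, intensity_db, n_syllables, duration_sec) {
  voiced <- f0_hz > 0                              # unvoiced frames often coded as 0
  c(speech_rate    = n_syllables / duration_sec,   # fast in anger/fear/happiness
    f0_mean        = mean(f0_hz[voiced]),          # high in anger/fear/happiness
    f0_sd          = sd(f0_hz[voiced]),            # high in anger/happiness, low in fear
    intensity_mean = mean(intensity_db),           # high in anger/fear/happiness
    intensity_sd   = sd(intensity_db))             # variable in anger/fear/happiness
}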
CONFRONTING LIMITATIONS
A closer look
• Of the 104 studies J&L looked at from 1900 to June 2002, they found:
– 87% of studies used actors (90 studies)
– 12% used natural speech samples (12 studies)
• Mostly from fear expressions in aviation accidents
For example
• Emotional Prosody Speech and Transcripts
Corpus (EPSaT)
What about naturalistic speech?
Clavel et al (2011)
• A little more naturalistic—film clips involving fear
• 7 hours, 400 speakers
• 3 labelers
• 40 features selected for voiced, 40 for unvoiced:
– Prosodic (pitch)
– Voice quality (jitter, shimmer)
– Spectral and cepstral features (MFCC and Bark band
energy-related measures)
Clavel et al (2011)—Main findings
• Most useful in voiced (all higher in fear than in neutral):
– F0 mean
– Spectral centroid mean
– Jitter
– Note: Intensity was not useful even after normalization (diversity of audio sources)
• It did help for some subtypes of fear, though, such as “panic”
• Most useful in unvoiced:
– Bark band energy-related > MFCC
– Harmonics-to-noise ratio (HNR)
– Unvoiced rate (proportion of unvoiced frames in a segment)
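Since the slide defines it, a one-line base-R sketch of the unvoiced rate, assuming a (hypothetical) logical vector marking which frames are voiced:

# Unvoiced rate = proportion of unvoiced frames in a segment
unvoiced_rate <- function(voiced_frames) mean(!voiced_frames)
unvoiced_rate(c(TRUE, TRUE, FALSE, TRUE, FALSE))   # 0.4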
Sobol-Shikler (2011)
http://www.jkp.com/mindreading/demo/content/dswmedia/MRF_Load.html
Sobol-Shikler (2011)
• 7000 sentences, 2 languages
– Mind Reading database
• Teaches autistic children how to recognize emotions
• 4400 recorded sentences, 10 UK English speakers
• Emotions labeled by 10 people
– Doors corpus
• Humans gambled for 15 minutes—2 repeated sentences (“open this door”, “close door”)
• 100 sentences for each participant (~27 participants)
– 100 spontaneous utterances from 25 speakers
– 9 emotional categories (joyful, absorbed, sure, stressed, excited, opposed,
interested, unsure, thinking)
• 173 features extracted
– F0
– Energy
– Tempo
– Harmonics
– Spectral content
Sobol-Shikler (2011)—Main findings
• To distinguish any pair of emotions, you only
need about 10 features
• But to classify everything, you need 166 of the
173
• Able to classify Hebrew and English affective
states using training on the other language
• About 76% accuracy in detection, which is
comparable to what most studies report
Laukka et al (2011)
• 200 utterances from 64 speakers to a voice-controlled travel service in Sweden
– 20 subjects judged them for:
• Irritation
• Resignation
• Neutrality
• Emotional intensity
• 73 acoustic measures
– Pitch, intensity, formant, voice source, temporal
– 23 selected for further analysis
Laukka et al (2011)—Main findings
• Irritated speech
– Higher F0 mean, first quantile of F0, and fifth quantile of F0 (essentially F0 min and max)
– Higher mean intensity and fifth quantile of intensity; lower percentage of frames with
intensity rise
– Higher median bandwidth of F2
• Resigned speech
– Lower F0 mean
– Smaller F0 standard deviation
– Lower mean intensity, first quantile of intensity and fifth quantile of intensity
– Smaller intensity standard deviation
• Both
– Slower speech rate (mean syllable duration)
– F0 and intensity cues were strongest
• Other
– Maybe an effect for H1MA3 (a measure of spectral tilt at higher formant frequencies), which should be large for breathy and small for creaky voice
• Irritation <-> creaky
– Jitter may be linked to irritation, too (jitter often goes with roughness and breathiness)
Different objects of study
• The emotions that are studied do change when
one switches to naturalistic data. While acted
data tends to investigate rage and sorrow,
naturalistic data tends to feature irritation and resignation
– Ang, Dhillon, Krupski, Shriberg, & Stolcke, 2002; Benus, Gravano, & Hirschberg, 2007; Laukka, Neiberg, Forsell, Karlsson, & Elenius, 2011
• Of course that’s partly because the easiest
naturalistic corpora are human-computer
interactions…
Other naturalistic projects
• Work on naturalistic corpora has increased
through the efforts of the HUMAINE project,
which serves as a repository of emotion
corpora (Douglas-Cowie et al., 2007).
• Most of the data here is from talk shows.
More cues? New cues?
• Modern methods pretty much extract as many features as
they can. For example:
– 46 acoustic features were extracted in Grimm, Kroschel, Mower,
& Narayanan (2007)
– 73 in Laukka, Neiberg, Forsell, Karlsson, & Elenius (2011)
– 87 features in Ververidis, Kotropoulos, & Pitas (2004)
– 100 features in Amir & Cohen (2007)
– 116 in Vidrascu & Devillers (2008)
– 173 features in Sobol-Shikler (2011)
– 534 features for voiced content and 518 for unvoiced content in Clavel, Vasilescu, Devillers, Richard, & Ehrette (2008)
– 1,280 features were extracted in Vogt & André (2005).
NEW WAYS FORWARD
Voice quality
• Voice quality was among the most neglected cues, as Scherer (1986) pointed out
• Things hadn’t improved that much in 2003, when Juslin and Laukka published
• But there have been more studies since:
– Amir & Cohen, 2007
– Campbell, 2004
– Drioli, Tisato, Cosi, & Tesser, 2003
– Fernandez & Picard, 2005
– Gobl & Ní Chasaide, 2000, 2003
– Johnstone & Scherer, 1999
– Laukkanen, Vilkman, Alku, & Oksanen, 1997
– Lugger & Yang, 2007
– Monzo, Alías, Iriondo, Gonzalvo, & Planet,
2007
– Nwe, Foo, & De Silva, 2003
– Yang & Lugger, 2010
Campbell (2004)
• One Japanese woman
who wore a
microphone for two
years
• 13,604 usable
utterances
• Here we see that
breathiness and pitch
are controlled
separately
Voice quality
• Voice quality measures are squirrellier than
people would like, so they are often labeled by
hand
– Tense, harsh
– Whisper
– Creaky
– Breathy
– (Modal)
What about…
http://www.stanford.edu/~tylers/misc/turk/96_A_a.wav
Speaker A: I'm totally getting like his wit and
giving it back {0.7s pause} to him. It's awesome.
Like it has taken a really long time, {breath} but
like I finally get him like as good as he gets me.
The utterance in Praat
How intense is this utterance?
What about pitch?
• Given the literature, we would expect a happy utterance to have a high pitch.
• BUT! Speaker A is using low pitch to indicate excitement, as with the awesome
utterance.
– This is opposite of the prediction that most researchers would make, but there are actually
several instances of it—low pitch enthusiasm seems to be part of A’s emotional style.
[Chart: mean pitch in Hz (y-axis roughly 170–230) across the pre-wedding, wedding, relationship, awesome-utterance, and post-relationship sections]
Style
What else do we notice?
• A: I'm totally getting like his wit and giving it
back {0.7s pause} to him. It's awesome. Like it
has taken a really long time, {breath} but like I
finally get him like as good as he gets me.
Section             Expected “like”s   Observed “like”s
Pre-wedding         42                 38
Wedding             9                  5
Relationship        15                 31
Post-relationship   10                 2
Change
• Normalization lets you see how things differ from the average
• Which is good
• But it misses larger units
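A base-R sketch of what normalization means here, on a hypothetical per-utterance data frame: z-scoring a cue within speaker versus within speaker-and-section (the column names and values are made up):

z <- function(x) (x - mean(x)) / sd(x)

# Hypothetical per-utterance data: mean F0, speaker, and section of the conversation
d <- data.frame(f0      = c(210, 225, 190, 205, 120, 118, 140, 135),
                speaker = rep(c("A", "B"), each = 4),
                section = rep(c("wedding", "relationship"), times = 4))

d$f0_speaker_norm <- ave(d$f0, d$speaker, FUN = z)             # speaker-level normalization
d$f0_section_norm <- ave(d$f0, d$speaker, d$section, FUN = z)  # speaker-by-section normalization

Normalizing within sections of talk rather than only within speaker is one way to keep those larger units in view.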
Speech rate
(all rates in syllables/second)
                                 Avg    3-9 syll.   10-19 syll.   20+ syll.
Speaker A
  Overall (31 utterances)        5.19   4.92        5.17          5.47
  Relationship section (18)      5.40   5.54        5.26          5.46
  Non-relationship section (13)  4.91   3.84        4.91          5.47
Speaker B
  Overall (25 utterances)        5.19   4.99        5.38          5.47
  Relationship section (12)      4.92   4.86        4.87          5.19
  Non-relationship section (13)  5.45   5.18        5.59          5.65
Tempo varies
Who…
I asked who uses tempo even within an utterance…
who?
We’ve got to get Spock to Vulcan!
Tunnel of terror
And some others
William Shatner impersonation
Leonard Nimoy’s Mr. Spock is even
A cue from the Internet
Burstiness
• Variance / (syllables * 0.5)
– Variance captures the dispersion of the data
– The denominator (half the syllable count) scales this by utterance length
– The bigger the ratio, the more the utterance is characterized by clusters (“bursts”); see the sketch below
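A sketch of this measure in base R; the slide doesn’t spell out what the variance is computed over, so this assumes per-syllable durations (in seconds) within the utterance:

# Burstiness = variance / (number of syllables * 0.5), per the formula above
burstiness <- function(syllable_durations) {
  var(syllable_durations) / (length(syllable_durations) * 0.5)
}

burstiness(rep(0.2, 10))                    # perfectly even tempo -> 0
burstiness(c(rep(0.1, 5), rep(0.3, 5)))     # clustered ("bursty") timing -> larger ratio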
Burstiness and emotionality
• 48 Americans judged the emotional intensity of
228 utterances
– Utterances taken from 8 episodes, focusing on:
• Captain Kirk
• Mr. Spock
• Lt. Sulu
• Dr. (Bones) McCoy
– Each utterance judged by 3-5 people
– Scores were normalized per judge and then averaged
– Top 30, bottom 30 and 63 randomly chosen in
between were analyzed for speech rate and burstiness
– Restricted to utterances that were at least 5 syllables
Emotional speech in Star Trek is bursty
speech
Better than speech rate
• Among factors tested:
– Burstiness
– Speech rate
– Syllable count
– Interactions among these
• Only burstiness is significant in a simple linear (ordinary least squares) regression model (p ≈ 0.0125)
– But note that the r-squared isn’t all that great: 0.05044
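A sketch of this fixed-effects-only comparison; the data frame and column names are stand-ins with simulated values, not the actual Star Trek ratings:

set.seed(1)
trek <- data.frame(Speaker    = sample(c("Kirk", "Spock", "Bones", "Sulu"), 123, replace = TRUE),
                   Burstiness = runif(123, 0, 0.3),
                   SpeechRate = runif(123, 3, 6),
                   Syllables  = sample(5:30, 123, replace = TRUE))
trek$Emotionality <- rnorm(123)   # stand-in for the normalized judge ratings

ols <- lm(Emotionality ~ Burstiness * SpeechRate * Syllables, data = trek)
summary(ols)   # in the real data, only Burstiness was significant (p ~ 0.0125, R-squared ~ 0.05)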
Mixed model
• A better approach is to use a mixed model, where speaker is a random effect.
– This allows us to see that Kirk and Bones use burstiness, while Sulu and Spock don’t.
• Kirk 0.4371045
• Bones 0.1710811
• Sulu -0.1518260
• Spock -0.4563595
Spock in reversal
Bursty, but not emotional
Emotional, but not bursty
Emotionality by Burstiness and Speaker

AIC   BIC   logLik deviance REMLdev
341.4 352.7 -166.7 336.7    333.4

Random effects:
 Groups   Name        Variance Std.Dev.
 Speaker  (Intercept) 0.21810  0.46701
 Residual             0.87055  0.93303
Number of obs: 123, groups: Speaker, 4

Fixed effects:
            Estimate Std. Error t value
(Intercept) -0.07646    0.27625 -0.2768
Burstiness   7.07226    3.07077  2.3031

> pvals.fnc(data.lmer)$fixed
            Estimate MCMCmean HPD95lower HPD95upper  pMCMC Pr(>|t|)
(Intercept)  -0.0765  -0.0845    -0.8558     0.6649 0.8174   0.7824
Burstiness    7.0723   7.1217     1.0825    13.1220 0.0194   0.0230
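The output above looks like lme4 output (with p-values from languageR’s pvals.fnc, which only works with older versions of lme4). A minimal sketch of the kind of call that could produce it, reusing the hypothetical trek data frame from the earlier sketch; the formula is an assumption based on the random-intercept structure shown:

library(lme4)
data.lmer <- lmer(Emotionality ~ Burstiness + (1 | Speaker), data = trek)
summary(data.lmer)
ranef(data.lmer)$Speaker   # per-speaker adjustments (Kirk/Bones vs. Sulu/Spock in the talk)

# MCMC-based p-values as on the slide; pvals.fnc() is from languageR and
# requires an older lme4:
# library(languageR); pvals.fnc(data.lmer)$fixed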
Intraspeaker variation
Background: The Brunswikian Lens Model
• The lens model is used in several fields to study how observers correctly and incorrectly use objective cues to perceive physical or social reality
[Diagram: cues mediate between the physical or social environment and the observer (organism)]
• Cues have a probabilistic (uncertain) relation to the actual objects
• The same cue can signal several objects in the environment
• Cues are (often) redundant
Slide from Tanja Baenziger
[Figure labels: Overwhelmed, Other-oriented, Overwhelming, Authoritative, Persuading]
Indexical fields
• Variables aren’t fixed but are located in a
“constellation of ideologically related
meanings” (Eckert 2008)
• Diverse but not unconstrained
Summary
• Reasons to do emotion detection
– Call center automation
– Detecting stress/anxiety
– Helping people communicate better
– Terrorism/law
– Progress in detection means progress in synthesis
• Understanding the main findings (cues)
– Speech rate
– Pitch (mean and variability)
– Intensity (mean and variability)
– High frequency energy
– Review of Clavel et al (2011), Sobol-Shikler (2011), and Laukka et al (2011)
• Overview of limitations
– Non-naturalistic data
– Few labelers
– Contextlessness
Summary
• Finding new ways forward
– Voice quality
– Using change over time
• Contrasts
– Low-pitch-but-happy “awesome” utterance
• Changes
– Normalization 1.0=gender
– Normalization 2.0=speaker
– Normalization 3.0=sections of talk
» Shriberg and colleagues’ work on “hot spot” detection may be the most relevant here
– Burstiness (Captain Kirk)
– Embracing indeterminacy (the pitch of “awesome”, indexical fields, etc.)
– Adding linguistic content (not discussed so far—Q&A?)
To learn more…
• I’ve put together a lot of essays and reading notes
about language and emotion here:
– http://www.stanford.edu/~tylers/emotions.shtml
• Among top overviews are:
– Cowie & Cornelius, 2003
– Juslin & Laukka, 2003
– Russell, Bachorowski, & Fernández-Dols, 2003
– Scherer, 1986, 2003
– Schroder, 2004
– Ververidis & Kotropoulos, 2006
Thank you
Works cited (1/5)
• Amir, N., & Cohen, R. (2007). Characterizing Emotion in the Soundtrack of an Animated Film: Credible or
Incredible? Affective Computing and Intelligent Interaction, 148–158.
• Ang, J., Dhillon, R., Krupski, A., Shriberg, E., & Stolcke, A. (2002). Prosody-based automatic detection of annoyance
and frustration in human-computer dialog. In Seventh International Conference on Spoken Language Processing.
• Banse, R., & Scherer, K. (1996). Acoustic profiles in vocal emotion expression. Journal of personality and social
psychology, 70(3), 614–636.
• Benus, S., Gravano, A., & Hirschberg, J. (2007). Prosody, emotions, and…‘whatever’. In Proceedings of International
Conference on Speech Communication and Technology (pp. 2629–2632).
• Campbell, N. (2004). Accounting for voice-quality variation. In Speech Prosody 2004, International Conference.
• Clavel, C., Vasilescu, I., Devillers, L., Richard, G., & Ehrette, T. (2008). Fear-type emotion recognition for future
audio-based surveillance systems. Speech Communication, 50(6), 487–503.
• Cowie, R., & Cornelius, R. R. (2003). Describing the emotional states that are expressed in speech. Speech
Communication, 40(1-2), 5–32.
• Devillers, L., & Campbell, N. (2011). Special issue of Computer Speech and Language on. Computer Speech & Language, 25(1), 1–3. doi:10.1016/j.csl.2010.07.002
• Douglas-Cowie, E., Cowie, R., Sneddon, I., Cox, C., Lowry, O., McRorie, M., Martin, J. C., et al. (2007). The HUMAINE
database: Addressing the collection and annotation of naturalistic and induced emotional data. Affective
computing and intelligent interaction, 488–500.
• Drioli, C., Tisato, G., Cosi, P., & Tesser, F. (2003). Emotions and voice quality: experiments with sinusoidal modeling.
Proceedings of VOQUAL, 27–29.
Works cited (2/5)
• Eckert, P. (2008). Variation and the indexical field. Journal of Sociolinguistics, 12(4),
453–476.
• Fernandez, R., & Picard, R. W. (2005). Classical and novel discriminant features for
affect recognition from speech. In Ninth European Conference on Speech
Communication and Technology.
• Gobl, C., & Ní Chasaide, A. (2000). Testing affective correlates of voice quality
through analysis and resynthesis. In ISCA Tutorial and Research Workshop (ITRW)
on Speech and Emotion.
• Gobl, C., & Ní Chasaide, A. (2003). The role of voice quality in communicating
emotion, mood and attitude. Speech Communication, 40(1-2), 189–212.
• Grimm, M., Kroschel, K., Mower, E., & Narayanan, S. (2007). Primitives-based
evaluation and estimation of emotions in speech. Speech Communication, 49(10-
11), 787–800.
• Huson, D., D. Richter, C. Rausch, T. Dezulian, M. Franz and R. Rupp. (2007). Dendroscope: An interactive viewer for large phylogenetic trees. BMC Bioinformatics 8:460, 2007, software freely available from www.dendroscope.org
• de Jong, N. H., and T. Wempe. (2009). Praat script to detect syllable nuclei and
measure speech rate automatically. Behavior research methods, 41(2), 385.
Works cited (3/5)
• Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129(5), 770–814.
• Kendall, T. (2010). Language Variation and Sequential Temporal Patterns of Talk. Linguistics
Department, Stanford University: Palo Alto, CA. February.
• Kendall, T. (2009). Speech Rate, Pause, and Linguistic Variation: An Examination Through the
Sociolinguistic Archive and Analysis Project, Doctoral Dissertation. Durham, NC: Duke University.
• Laukka, P., Neiberg, D., Forsell, M., Karlsson, I., & Elenius, K. (2011). Expression of affect in spontaneous speech: Acoustic correlates and automatic detection of irritation and resignation. Computer Speech & Language, 25(1), 84–104. doi:10.1016/j.csl.2010.03.004
• Laukkanen, A. M., Vilkman, E., Alku, P., & Oksanen, H. (1997). On the perception of emotions in
speech: the role of voice quality. Logopedics Phonatrics Vocology, 22(4), 157–168.
• Lugger, M., & Yang, B. (2007). The relevance of voice quality features in speaker independent
emotion recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing,
2007. ICASSP 2007 (Vol. 4, pp. 17-20).
• Monzo, C., Alías, F., Iriondo, I., Gonzalvo, X., & Planet, S. (2007). Discriminating expressive speech
styles by voice quality parameterization. In Proc. of the 16th International Congress of Phonetic
Sciences (ICPhS) (pp. 2081–2084).
• Nwe, T. L., Foo, S. W., & De Silva, L. C. (2003). Speech emotion recognition using hidden Markov
models. Speech communication, 41(4), 603–623.
• Russell, J. A., Bachorowski, J., & Fernández-Dols, J. (2003). Facial and Vocal Expressions of Emotion.
Annual Review of Psychology, 54(1), 329-349. doi:10.1146/annurev.psych.54.101601.145102
Works cited (4/5)
• Scherer, K. (1979). Nonlinguistic vocal indicators of emotion and psychopathology. Emotions in
personality and psychopathology, 493–529.
• Scherer, K. (1986). Vocal affect expression: A review and a model for future research. Psychological
bulletin, 99(2), 143–165.
• Scherer, K. (1999). Appraisal theory. In T. Dalgleish & M. Power (Eds.), Handbook of cognition and
emotion (pp. 637–663). John Wiley and Sons Ltd.
• Scherer, K. (2003). Vocal communication of emotion: a review of research paradigms. Speech
Communication, 40, 227-256.
• Scherer, K. R. (2005). What are emotions? And how can they be measured? Social Science
Information, 44(4), 695.
• Scherer, K. and J. Oshinsky. (1977). Cue utilization in emotion attribution from auditory stimuli.
Motiv. Emot. 1, 331–346.
• Schnoebelen, T. (2009). The social meaning of tempo.
http://www.stanford.edu/~tylers/notes/socioling/Social_meaning_tempo_Schnoebelen_3-23-
09.pdf
• Schnoebelen, T. (2010). The structure of the affective lexicon. California Universities Semantics and
Pragmatics Workshop (CUSP). Stanford University, October 15, 2010.
http://linguistics.stanford.edu/documents/cusp3-schnoebelen.pdf
• Schnoebelen, T. (2010). Variation in speech tempo: Capt. Kirk, Mr. Spock, and all of us in between.
NWAV, San Antonio, TX.
http://www.stanford.edu/~tylers/notes/socioling/NWAV_Capt_Kirk_Mr_Spock_rest_of_us_burstin
ess_11-4-10.pptx
Works cited (5/5)
• Schroder, M. (2004). Speech and emotion research: an overview of research frameworks and a
dimensional approach to emotional speech synthesis (Ph. D thesis). Saarland University.
• Schröder, M., Cowie, R., Douglas-Cowie, E., Westerdijk, M., & Gielen, S. (2001). Acoustic correlates
of emotion dimensions in view of speech synthesis. In Seventh European Conference on Speech
Communication and Technology.
• Sobol-Shikler, T. (2011). Automatic inference of complex affective states. Computer Speech & Language, 25(1), 45–62. doi:10.1016/j.csl.2009.12.005
• Ververidis, D., & Kotropoulos, C. (2006). Emotional speech recognition: Resources, features, and
methods. Speech Communication, 48(9), 1162–1181.
• Ververidis, D., Kotropoulos, C., & Pitas, I. (2004). Automatic emotional speech classification. In IEEE
International Conference on Acoustics, Speech, and Signal Processing, 2004.
Proceedings.(ICASSP'04).
• Vidrascu, L., & Devillers, L. (2008). Anger detection performances based on prosodic and acoustic
cues in several corpora. In Programme of the Workshop on Corpora for Research on Emotion and
Affect (pp. 13-16).
• Vogt, T., & André, E. (2005). Comparing feature sets for acted and spontaneous speech in view of
automatic emotion recognition. In 2005 IEEE International Conference on Multimedia and Expo (pp.
474–477).
• Yang, B., & Lugger, M. (2010). Emotion recognition from speech signals using new harmony
features. Signal Processing, 90(5), 1415–1423.
Appendix
What’s an emotion?
• Kleinginna & Kleinginna (1981) reviewed 92 definitions
of emotion (and 9 “skeptical statements”). Here’s their
proposal:
– Emotion is a complex set of interactions among subjective
and objective factors, mediated by neural/hormonal
systems, which can (a) give rise to affective experiences
such as feelings of arousal, pleasure/displeasure; (b)
generate cognitive processes such as emotionally relevant
perceptual effects, appraisals, labeling processes; (c)
activate widespread physiological adjustments to the
arousing conditions; and (d) lead to behavior that is often,
but not always, expressive, goal-directed, and adaptive.
(K&K 1981: 355)
Basic assumptions
• Facial and vocal changes occur everywhere and are
coordinated with the sender's psychological state
– Most people can infer something from changes
– So why can’t computers?
• Insistence that there are fixed links between
facial/vocal expression and emotions is misplaced.
(Kappas 2002: 10, Russell et al 2003).
• And of course, the receiving side "is more than a reflex-
like decoding of a message" (Russell et al 2003: 331).
– That is, everything is contextualized! Change!
Kendall (2009)
[Figure: speech rate and pause results broken down by region, ethnicity, gender, age, utterance length, and median pause]
Social meaning
• Speech rates are not stable by demographic
category
• They vary all over the place
• Conveying and creating identities and
attitudes
Accommodation by ethnicity
Speakers A & B (relationship
conversation)
Speaker A (relationship conversation)
Speaker B (relationship conversation)
Ang et al (2002): detecting annoyance
• “Annoyed” was labeled 7.62% of the time; interlabeler agreement (grouping frustrated and annoyed) is 71%, with a Kappa of .47, which is not super great but better than many others
• Frustration
– Longer durations
– Slower speaking
– High values for a number of F0 pitch features
– Repeats and corrections
Ang et al ‘02 Conclusions
• Emotion labeling is a complex decision task
• Cases that labelers independently agree on are classified
with high accuracy
– Extreme emotion (e.g. ‘frustration’) is classified even more
accurately
• Classifiers rely heavily on prosodic features, particularly
duration and stylized pitch
– Speaker normalizations help
• Two nonprosodic features are important: utterance
position and repeat/correction
– Language model is an imperfect surrogate feature for the
underlying important feature repeat/correction
Slide from Shriberg, Ang, Stolcke
Example 3: “How May I Help You”℠ (HMIHY)
• Giuseppe Riccardi, Dilek Hakkani-Tür, AT&T Labs
• Liscombe, Riccardi, Hakkani-Tür (2004)
• Each turn in 20,000 turns (5690 dialogues) annotated for 7
emotions by one person
– Positive/neutral, somewhat frustrated, very frustrated, somewhat angry,
very angry, somewhat other negative, very other negative
– The distribution was heavily skewed (73.1% labeled positive/neutral)
– So classes were collapsed to negative/nonnegative
• Task is hard!
– Subset of 627 turns labeled by 2 people: kappa .32 (full set) and .42
(reduced set)!
Slide from Jackson Liscombe
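Kappa figures like the .32/.42 above are typically Cohen’s kappa; a base-R sketch of computing it from two labelers’ turn-level labels (toy labels, not the HMIHY annotations):

cohen_kappa <- function(labels1, labels2) {
  tab <- table(labels1, labels2)   # assumes both labelers use the same label set
  po  <- sum(diag(tab)) / sum(tab)                      # observed agreement
  pe  <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2  # agreement expected by chance
  (po - pe) / (1 - pe)
}

l1 <- c("neg", "neg", "nonneg", "nonneg", "nonneg", "neg")
l2 <- c("neg", "nonneg", "nonneg", "nonneg", "nonneg", "neg")
cohen_kappa(l1, l2)   # about 0.67 on this toy example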
Lexical Features
• Language Model (ngrams)
• Examples of words significantly correlated with negative user state (p<0.001):
– 1st person pronouns: ‘I’, ‘me’
– requests for a human operator: ‘person’, ‘talk’,
‘speak’, ‘human’, ‘machine’
– billing-related words: ‘dollars’, ‘cents’
– curse words: …
Slide from Jackson Liscombe
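One way to get lists like this is a per-word association test between word presence and turn label; a toy base-R sketch (using Fisher’s exact test here, which is not necessarily the test used in the HMIHY work):

turns    <- c("i want to talk to a person", "how many dollars do i owe",
              "what is my balance", "store hours please")
negative <- c(TRUE, TRUE, FALSE, FALSE)

tokens <- strsplit(turns, " ")
vocab  <- unique(unlist(tokens))

word_p <- sapply(vocab, function(w) {
  in_turn <- sapply(tokens, function(t) w %in% t)
  tab <- table(factor(in_turn, c(FALSE, TRUE)), factor(negative, c(FALSE, TRUE)))
  fisher.test(tab)$p.value
})
head(sort(word_p))   # with real data, words like "person" or "dollars" should surface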
It can be tremendously useful
• To look at confusion matrices (which emotions
are mistaken for which other emotions the
most?)
• The patterns of confusion are not just a
measure of how good/bad people did—
• They are informative!
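In R, a confusion matrix is just a cross-tabulation of intended and perceived labels (toy labels here):

intended  <- c("anger", "anger", "fear", "fear", "happiness", "sadness", "sadness")
perceived <- c("anger", "fear",  "fear", "anger", "happiness", "tenderness", "sadness")
conf <- table(intended, perceived)
conf
prop.table(conf, margin = 1)   # which emotions get mistaken for which, per intended emotion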
Typical sentiment analysis
“subjective”
– positive: love, wonderful, best
– negative: bad, stupid, waste
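The word lists above boil down to lexicon counting; a toy base-R sketch using exactly the six words on the slide:

positive <- c("love", "wonderful", "best")
negative <- c("bad", "stupid", "waste")

sentiment_score <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  sum(words %in% positive) - sum(words %in% negative)
}

sentiment_score("It's awesome, I love it")   #  1
sentiment_score("What a stupid waste")       # -2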
Pennebaker
Word choice matters
• Evidence suggests that people’s physical and
mental health are correlated with the words
they use
– Gottschalk & Gleser (1969); Rosenberg & Tucker (1978); Stiles (1992)
• “Word use is a meaningful marker and
occasional mediator of natural social and
personality processes.”
– (Pennebaker et al 2003: 548)
Not markers…
• Words occur socially
(even when the speaker
is alone).
• So interlocutors aren’t
just listening for
meaning, they are
constructing and
imposing it.
Not markers…makers
Words can be infected…
• Semantic
dynamics,
changes of
meaning
• E.g., semantic
contempt can creep into things from their use
(Kaplan 1999)
Baggage