Event Extraction
Bonan Min
Some slides are based on class materials from Ralph Grishman
Event Extraction: An Example
[Figure: annotated example sentence; labels shown: Place, Time]
Preprocessing
◦ Tagging named entities, mentions and value mentions (e.g., time)
Event Extraction
◦ Event detection: detect and classify event mentions
◦ Argument Extraction: attach event arguments Who, When (Time), and Where (Place)
Event Extraction
Scenario Template (MUC: Message Understanding Conference)
◦ The scenario template task was originally the IE task for the MUC evaluations
◦ Identify participants, locations, dates etc. of a class of events -- a naval engagement, a
terrorist incident, a joint venture.
◦ A single template included related information, such as an attack and its effects; this led to
some relatively complex templates
◦ With later MUCs (6 and 7), the task narrowed to single events or closely
related events -- executive succession, rocket launchings
For the ACE evaluations, this became the event extraction task.
An event is
◦ a specific occurrence involving participants.
◦ something that happens.
◦ frequently described as a change of state.
This is broadly defined, so we need an inventory of types.
ACE Events
Event Type    Event Subtype
Life          Be-Born, Marry, Divorce, Injure, Die
Movement      Transport
Transaction   Transfer-Ownership, Transfer-Money
Business      Start-Org, Merge-Org, Declare-Bankruptcy, End-Org
Conflict      Attack, Demonstrate
Contact       Meet, Phone-Write
Personnel     Start-Position, End-Position, Nominate, Elect
Justice       Arrest-Jail, Release-Parole, Trial-Hearing, Charge-Indict, Sue, Convict,
              Sentence, Fine, Execute, Extradite, Acquit, Appeal, Pardon
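The inventory above can be written down as a small lookup table. This is only a sketch: the names ACE_EVENT_TYPES and all_labels are illustrative, and the "Type.Subtype" label format simply follows labels such as Conflict.Attack used elsewhere in these slides.

```python
# ACE event types and subtypes, copied from the table above.
ACE_EVENT_TYPES = {
    "Life": ["Be-Born", "Marry", "Divorce", "Injure", "Die"],
    "Movement": ["Transport"],
    "Transaction": ["Transfer-Ownership", "Transfer-Money"],
    "Business": ["Start-Org", "Merge-Org", "Declare-Bankruptcy", "End-Org"],
    "Conflict": ["Attack", "Demonstrate"],
    "Contact": ["Meet", "Phone-Write"],
    "Personnel": ["Start-Position", "End-Position", "Nominate", "Elect"],
    "Justice": [
        "Arrest-Jail", "Release-Parole", "Trial-Hearing", "Charge-Indict",
        "Sue", "Convict", "Sentence", "Fine", "Execute", "Extradite",
        "Acquit", "Appeal", "Pardon",
    ],
}


def all_labels(inventory):
    """Flatten the inventory into 'Type.Subtype' classification labels."""
    return [f"{etype}.{sub}" for etype, subs in inventory.items() for sub in subs]


LABELS = all_labels(ACE_EVENT_TYPES)
print(len(LABELS))  # number of subtype labels in the table
```

A classifier for event detection would typically predict one of these labels (or "no event") for each candidate anchor.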
S1. Six murders occurred in France, including the assassination of Bob and the killing of Joe.
S2. Six men were murdered, including Bob (in Paris) and Joe (in Reims).
Conflict.Attack Life.Die
Event Anchors/Triggers
Many Events anchor on a single verb or noun
Six murders occurred in France, including the
assassination of Bob and the killing of Joe.
Transaction.Transfer-Ownership
[His brother] bought [him] [a new car] with [$20,000].
Buyer: His brother; Beneficiary: him; Artifact: a new car; Price: $20,000
*All event types have Place and Time arguments.
An alternative set (in CAMEO):
• Agent
• Patient
• Location
• Time
Event Arguments: Attributes (ACE)
These argument slots can be filled by certain Values identified within the
scope of the Event
Event-specific attributes
S2. Six men were murdered, including Bob (in Paris) and Joe (in Reims).
Anchor      Role       Mention/Value
murdered    Victim     six men, Bob, Joe
            Location   Paris, Reims
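One possible in-memory representation of the S2 analysis above. The class and field names (Argument, EventMention, fillers) are hypothetical, and labeling "murdered" as Life.Die is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Argument:
    role: str  # e.g., "Victim", "Location"
    text: str  # the mention or value filling the role


@dataclass
class EventMention:
    anchor: str      # the trigger word
    event_type: str  # "Type.Subtype" label (assumed here)
    arguments: list = field(default_factory=list)

    def fillers(self, role):
        """Return the texts of all arguments with the given role."""
        return [a.text for a in self.arguments if a.role == role]


s2_event = EventMention(
    anchor="murdered",
    event_type="Life.Die",
    arguments=[
        Argument("Victim", "six men"),
        Argument("Victim", "Bob"),
        Argument("Victim", "Joe"),
        Argument("Location", "Paris"),
        Argument("Location", "Reims"),
    ],
)

print(s2_event.fillers("Victim"))
```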
Event Properties: Polarity and Genericity
Polarity
◦ An Event is NEGATIVE when it is explicitly indicated that the Event did not
occur. All other Events are POSITIVE.
His wife was sitting on the backseat and was not hurt.
They backed out of the purchase at the last minute.
Genericity
◦ An Event is SPECIFIC if it is understood as a singular occurrence at a
particular place and time, or a finite set of such occurrences. All other Events
are GENERIC.
Tense
◦ FUTURE is used for those Events that have not yet occurred at the textual anchor time.
He plans to meet with lawmakers from both parties.
When he's born, he'll be named after his father.
◦ PRESENT is used for those Events that occur at the textual anchor time.
The airline is in the midst of a major aircraft purchase from Airbus Industries.
Event Properties: Modality
An Event is ASSERTED when the author or speaker refers to it as though it were a real occurrence.
He traveled to Houston in late September.
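The four event properties can be sketched as enumerations. Assumptions: the class names are illustrative; PAST and UNSPECIFIED (tense) and OTHER (modality) do not appear on these slides and are taken here as the remaining values of each property; the defaults are arbitrary conveniences.

```python
from dataclasses import dataclass
from enum import Enum


class Polarity(Enum):
    POSITIVE = "POSITIVE"
    NEGATIVE = "NEGATIVE"


class Genericity(Enum):
    SPECIFIC = "SPECIFIC"
    GENERIC = "GENERIC"


class Tense(Enum):
    PAST = "PAST"                # assumed, not on the slides
    FUTURE = "FUTURE"
    PRESENT = "PRESENT"
    UNSPECIFIED = "UNSPECIFIED"  # assumed, not on the slides


class Modality(Enum):
    ASSERTED = "ASSERTED"
    OTHER = "OTHER"              # assumed, not on the slides


@dataclass(frozen=True)
class EventProperties:
    # Defaults are arbitrary conveniences for this sketch.
    polarity: Polarity = Polarity.POSITIVE
    genericity: Genericity = Genericity.SPECIFIC
    tense: Tense = Tense.UNSPECIFIED
    modality: Modality = Modality.ASSERTED


# "His wife was sitting on the backseat and was not hurt."
not_hurt = EventProperties(polarity=Polarity.NEGATIVE)

# "He plans to meet with lawmakers from both parties."
# Treating a planned event as non-asserted is an assumption.
plans_to_meet = EventProperties(tense=Tense.FUTURE, modality=Modality.OTHER)
```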
... the Boston Marathon terror attack. On April 15, 2013, double bombings near
the finish line of the Boston Marathon killed three people and injured at least 264.
... The bombs exploded 12 seconds apart near the marathon's finish line on
Boylston Street.
Event Extraction Pipeline (Stages)
Anchor identification → Argument identification → Property assignment → Event coreference
Anchor identification
◦ finding event anchors (the basis for event mentions) in text and assigning them
an event type
Argument identification
◦ determining which entity mentions, timexes, and values are arguments of each
event mention
Property assignment
◦ determining the values of the modality, polarity, genericity, and tense
properties for each event mention
Event coreference
◦ determining which event mentions refer to the same event
David Ahn. 2006. The stages of event extraction
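The data flow through the four stages can be sketched as a chain of functions. This is a toy illustration only, not Ahn's system: every function name, the one-entry lexicon, and the rules inside each stage (lexicon lookup, attach-all arguments, a "not" check, grouping by type) are placeholders standing in for trained classifiers.

```python
def identify_anchors(tokens, lexicon):
    """Stage 1: find anchor tokens and assign each an event type."""
    return [{"anchor": i, "type": lexicon[tok.lower()]}
            for i, tok in enumerate(tokens) if tok.lower() in lexicon]


def identify_arguments(events, entities):
    """Stage 2 (toy): attach every entity mention to every event."""
    for ev in events:
        ev["arguments"] = list(entities)
    return events


def assign_properties(events, tokens):
    """Stage 3 (toy): NEGATIVE polarity if 'not' directly precedes the anchor."""
    for ev in events:
        i = ev["anchor"]
        negated = i > 0 and tokens[i - 1].lower() == "not"
        ev["polarity"] = "NEGATIVE" if negated else "POSITIVE"
    return events


def resolve_coreference(events):
    """Stage 4 (toy): group event mentions that share an event type."""
    clusters = {}
    for ev in events:
        clusters.setdefault(ev["type"], []).append(ev)
    return list(clusters.values())


def extract_events(tokens, entities, lexicon):
    """Run the stages in order; each stage consumes the previous output."""
    events = identify_anchors(tokens, lexicon)
    events = identify_arguments(events, entities)
    events = assign_properties(events, tokens)
    return resolve_coreference(events)


tokens = "His wife was not hurt".split()
clusters = extract_events(tokens, ["His wife"], {"hurt": "Life.Injure"})
```

Because each stage only sees the output of the one before it, an error made early (for example, a missed anchor) cannot be repaired later.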
Event Extraction Pipeline (Stages)
Anchor identification → Argument identification → Property assignment → Event coreference
Features for anchor identification:
Lexical features: full word, lowercase word, lemmatized word, POS tag,
depth of word in parse tree
WordNet features: synsets for words that are nouns, verbs, adjectives, or
adverbs
Left context (e.g., 3 words): lowercase, POS tag
Right context (e.g., 3 words): lowercase, POS tag
Dependency features: labels of the dependency relations, dependent/head
words, or both. An example is <*, subj, killed>
Related entity features:
◦ Entity types, constituent head words, etc.
◦ Often used in conjunction with dependency relations
◦ E.g., "Hamas attacked Israeli army" vs. "Trump attacked CNN"
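A minimal sketch of the lexical and context features for one candidate anchor. Assumptions: tokens and POS tags are supplied by the caller (the tags below are hand-assigned), the function name and feature keys are illustrative, and the lemma, parse-depth, WordNet, dependency, and entity features are left out because they need a lemmatizer, parser, or entity tagger.

```python
def anchor_features(tokens, pos_tags, i, window=3):
    """Lexical and left/right context features for the token at index i."""
    feats = {
        "word": tokens[i],
        "lower": tokens[i].lower(),
        "pos": pos_tags[i],
    }
    for k in range(1, window + 1):
        for side, j in (("left", i - k), ("right", i + k)):
            if 0 <= j < len(tokens):  # skip positions outside the sentence
                feats[f"{side}{k}_lower"] = tokens[j].lower()
                feats[f"{side}{k}_pos"] = pos_tags[j]
    return feats


tokens = ["Hamas", "attacked", "Israeli", "army"]
pos_tags = ["NNP", "VBD", "JJ", "NN"]  # hand-assigned for illustration
feats = anchor_features(tokens, pos_tags, 1)
```

The resulting dictionary can be passed to any classifier that accepts sparse string-valued features.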
Features For Argument Identification
Hamas launched an attack.
Anchor word of event mention: full, lower-case, POS tag, and depth in parse
tree
Event type of event mention, e.g., Conflict.Attack
Constituent head word of entity mention: full, lowercase, POS tag, and
depth in parse tree
Entity type and mention type (name, pronoun, other NP) of entity mention
Dependency path between anchor word and constituent head word of
entity mention
◦ Often expressed as a sequence of labels, of words, and/or of POS tags
◦ Often used in conjunction with entity types to reduce sparsity
◦ E.g., ORG.Military subj launched obj attack
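The dependency-path feature can be sketched as follows. Assumptions: the parse of "Hamas launched an attack." is written by hand as head indices and relation labels, the function names are illustrative, and the entity type ORG.Military is supplied by the caller to replace the entity's head word.

```python
def path_to_root(i, heads):
    """Return token indices from i up to the root (whose head is -1)."""
    path = [i]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def dependency_path(src, dst, tokens, heads, labels, src_type=None):
    """Label/word sequence from token src to token dst through the tree."""
    up = path_to_root(src, heads)
    down = path_to_root(dst, heads)
    common = next(n for n in up if n in down)  # lowest common ancestor
    parts = [src_type or tokens[src]]
    for n in up[:up.index(common)]:            # climb from src
        parts += [labels[n], tokens[heads[n]]]
    for n in reversed(down[:down.index(common)]):  # descend to dst
        parts += [labels[n], tokens[n]]
    return " ".join(parts)


tokens = ["Hamas", "launched", "an", "attack"]
heads = [1, -1, 3, 1]                     # hand-written parse
labels = ["subj", "root", "det", "obj"]   # relation of each token to its head
path = dependency_path(0, 3, tokens, heads, labels, src_type="ORG.Military")
print(path)  # ORG.Military subj launched obj attack
```

Replacing the entity word with its type, as in the last bullet, lets the same feature fire for any military organization rather than only for "Hamas".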
Features For Assigning Attributes
Train a separate classifier for each attribute
◦ Genericity, modality, and polarity are each binary classification tasks
◦ Tense is a multi-class task.
Features are similar to those for the anchor identification task, except for
the lemmatized anchor word and the WordNet features
Features For Event Coreference
Number, heads, and roles of candidate arguments that are not anaphor arguments
Number, heads, and roles of anaphor arguments that are not candidate arguments
Heads and roles of arguments shared by candidate and anaphor in different roles
Candidate modality value + anaphor modality value
◦ also for polarity, genericity, and tense
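A sketch of these pairwise features for one (candidate, anaphor) pair of event mentions. Assumptions: each mention is a dictionary whose "args" maps an argument head to its role, the feature names are illustrative, and the example mentions and their roles are hypothetical.

```python
def coref_pair_features(cand, ana):
    """Pairwise features comparing a candidate and an anaphor event mention."""
    c, a = cand["args"], ana["args"]
    only_c = {h: r for h, r in c.items() if h not in a}
    only_a = {h: r for h, r in a.items() if h not in c}
    diff_role = {h: (c[h], a[h]) for h in c if h in a and c[h] != a[h]}
    feats = {
        "n_cand_only": len(only_c),
        "cand_only": sorted(only_c.items()),
        "n_ana_only": len(only_a),
        "ana_only": sorted(only_a.items()),
        "shared_diff_role": sorted(diff_role.items()),
    }
    # Conjoin the property values of the two mentions.
    for prop in ("modality", "polarity", "genericity", "tense"):
        feats[f"{prop}_pair"] = f"{cand[prop]}+{ana[prop]}"
    return feats


props = {"modality": "ASSERTED", "polarity": "POSITIVE",
         "genericity": "SPECIFIC", "tense": "PAST"}
candidate = {"args": {"Boston Marathon": "Place"}, **props}
anaphor = {"args": {"Boston Marathon": "Place", "three people": "Victim"}, **props}
feats = coref_pair_features(candidate, anaphor)
```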
Convolutional Neural Networks for Event Extraction
CNN for event classification and argument identification
◦ WE: word embeddings
◦ PF: position embedding features
◦ EF: event embedding features
◦ Local features: embeddings of the trigger, the argument, and the words around them (e.g., left 3 words,
right 3 words). These local features are concatenated with the max-pooled output to form the input to the
dense network, which makes the prediction
[Figure: CNN architecture. Word embeddings → word-level features (WE, PF, EF) → CNN filters → max-pool → vector concatenation → NN output. Example input: "focuses on relief and recovery activities in the non-conflict affected states ," with a trigger and a candidate argument marked.]
Benefits: generalization over words; avoids (extensive) feature engineering
Chen et al., Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks. ACL-IJCNLP 2015.
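A rough sketch of two ingredients named above: position features and pooling. The slide shows a single max-pool; the cited paper's title refers to dynamic multi-pooling, which is sketched here under assumptions: the feature map of one filter is split into three segments at the trigger and candidate-argument positions, each boundary token is placed in the segment to its left, and an empty segment yields 0.0. The function names and the numbers in the example are illustrative.

```python
def relative_positions(n_tokens, index):
    """Position feature: distance of every token from the token at index."""
    return [i - index for i in range(n_tokens)]


def max_pool(feature_map):
    """Plain max-pooling: one value per filter for the whole sentence."""
    return max(feature_map)


def dynamic_multi_pool(feature_map, trigger_idx, argument_idx):
    """Max-pool each of the three segments cut at trigger and argument."""
    first, second = sorted((trigger_idx, argument_idx))
    segments = [
        feature_map[:first + 1],
        feature_map[first + 1:second + 1],
        feature_map[second + 1:],
    ]
    # Assumption: an empty segment contributes 0.0.
    return [max(seg) if seg else 0.0 for seg in segments]


# Hypothetical output of one convolution filter over a 6-token sentence.
feature_map = [0.1, 0.7, 0.2, 0.9, 0.3, 0.4]
pooled = dynamic_multi_pool(feature_map, trigger_idx=1, argument_idx=3)
positions = relative_positions(len(feature_map), 1)
```

Compared with a single maximum, the three pooled values keep some information about where in the sentence, relative to the trigger and the argument, the filter responded.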
Problems with Pipeline Models
Anchor identification → Argument identification → Property assignment → Event coreference
◦ Because cameraman is far away from fired, a model that uses only local features may fail to
recognize cameraman as the Target argument of fired.
◦ Can we generate global features to encode that a Victim argument for the Die event is
often the Target argument for the Attack event in the same sentence?
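One way such a global feature could look, as a sketch: count the entities that fill the Victim role of a Die event and the Target role of an Attack event in the same sentence. The function name and the dictionary layout are assumptions; the roles and event types come from the bullet above.

```python
def victim_is_target(events):
    """Count entities that are both a Die Victim and an Attack Target."""
    victims = {head for ev in events if ev["type"] == "Life.Die"
               for role, head in ev["args"] if role == "Victim"}
    targets = {head for ev in events if ev["type"] == "Conflict.Attack"
               for role, head in ev["args"] if role == "Target"}
    return len(victims & targets)


# Hypothesized joint analysis of one sentence.
sentence_events = [
    {"type": "Life.Die", "args": [("Victim", "cameraman")]},
    {"type": "Conflict.Attack", "args": [("Target", "cameraman")]},
]
score = victim_is_target(sentence_events)
```

A joint model can score whole-sentence analyses with features like this one, so that evidence for the Die event supports the distant Target argument of the Attack event.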
Joint Trigger and Argument Extraction via RNN
Thien Huu Nguyen, Kyunghyun Cho and Ralph Grishman. Joint Event Extraction via Recurrent Neural Networks. NAACL-HLT 2016.