Semantic Role Labeling
Instead of training a single-stage classifier as in Fig. 19.5, the node-level classification task
can be broken down into multiple steps:
1. Pruning: Since only a small number of the constituents in a sentence are
arguments of any given predicate, many systems use simple heuristics to prune
unlikely constituents.
2. Identification: a binary classification of each node as either an argument to be labeled or NONE.
3. Classification: a 1-of-N classification of all the constituents that were labeled
as arguments by the previous stage.
The separation of identification and classification may lead to better use of features (different
features may be useful for the two tasks) or to computational efficiency.
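The three stages can be sketched as a small pipeline. This is a minimal illustration rather than a reference implementation: the function label_arguments, the Node tuple, and the toy stage functions are all hypothetical, and the toy data is only loosely modeled on the example sentence. In a real system the last two stages would be trained classifiers.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a parse-tree node: (phrase type, text span).
Node = Tuple[str, str]


def label_arguments(
    nodes: List[Node],
    keep: Callable[[Node], bool],
    is_argument: Callable[[Node], bool],
    classify_role: Callable[[Node], str],
) -> List[Tuple[Node, str]]:
    """Run pruning, identification, and classification in sequence."""
    labeled = []
    for node in nodes:
        # Stages 1 and 2: cheap filters; anything rejected is labeled NONE.
        if not keep(node) or not is_argument(node):
            labeled.append((node, "NONE"))
            continue
        # Stage 3: the 1-of-N classifier only sees the survivors.
        labeled.append((node, classify_role(node)))
    return labeled


# Toy stand-ins (assumptions) for a pruning heuristic and two classifiers.
TOY_ROLES = {"The San Francisco Examiner": "ARG0", "a special edition": "ARG1"}
calls: List[Node] = []  # records which nodes reach the role classifier


def toy_keep(node: Node) -> bool:
    return node[0] in {"NP", "PP", "S"}  # prune word-level nodes


def toy_is_argument(node: Node) -> bool:
    return node[1] in TOY_ROLES


def toy_classify_role(node: Node) -> str:
    calls.append(node)
    return TOY_ROLES[node[1]]


nodes = [
    ("NP", "The San Francisco Examiner"),
    ("DT", "The"),
    ("NP", "a special edition"),
    ("NN", "edition"),
    ("NP", "noon"),
]
result = label_arguments(nodes, toy_keep, toy_is_argument, toy_classify_role)
for node, label in result:
    print(f"{label:5s} {node}")
```

Of the five candidate nodes, only two reach the role classifier, which is where the efficiency gain comes from when that classifier is the most expensive stage.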
Global Optimization
The classification algorithm of Fig. 19.5 classifies each argument separately (‘locally’),
making the simplifying assumption that each argument of a predicate can be
labeled independently. This assumption is false; there are interactions between arguments that
require a more ‘global’ assignment of labels to constituents. For example,
constituents in FrameNet and PropBank are required to be non-overlapping. More
significantly, the semantic roles of constituents are not independent. For example,
PropBank does not allow multiple identical arguments; two constituents of the same
verb cannot both be labeled ARG0.
Role labeling systems thus often add a fourth step to deal with global consistency
across the labels in a sentence. For example, the local classifiers can return a list of
possible labels associated with probabilities for each constituent, and a second-pass
Viterbi decoding or re-ranking approach can be used to choose the best consensus
label. Integer linear programming (ILP) is another common way to choose a solution
that conforms best to multiple constraints.
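To make the idea of a global pass concrete, the sketch below takes per-constituent label probabilities from a local classifier and picks the highest-scoring joint assignment in which no core role is repeated. It uses brute-force enumeration as a simple stand-in for Viterbi decoding, re-ranking, or ILP, so it is only practical for a handful of constituents. The function names, the CORE_ROLES set, and the probabilities are illustrative assumptions, and the non-overlap constraint is not modeled.

```python
from itertools import product
from math import prod
from typing import Dict, List, Sequence, Tuple

# Assumption: the roles that may appear at most once per predicate.
CORE_ROLES = {"ARG0", "ARG1", "ARG2", "ARG3", "ARG4", "ARG5"}


def repeats_core_role(labels: Sequence[str]) -> bool:
    """True if some core role is assigned to more than one constituent."""
    core = [label for label in labels if label in CORE_ROLES]
    return len(core) != len(set(core))


def best_consistent_assignment(
    local_probs: List[Dict[str, float]],
) -> Tuple[str, ...]:
    """Return the most probable joint labeling with no repeated core role.

    local_probs[i] maps each candidate label of constituent i to the
    probability assigned by the local classifier.
    """
    best, best_score = None, -1.0
    for labels in product(*(dist.keys() for dist in local_probs)):
        if repeats_core_role(labels):
            continue
        # Joint score: product of the independent local probabilities.
        score = prod(dist[label] for dist, label in zip(local_probs, labels))
        if score > best_score:
            best, best_score = labels, score
    if best is None:
        raise ValueError("no assignment satisfies the constraint")
    return best


# Made-up probabilities: locally, both constituents prefer ARG0.
local_probs = [
    {"ARG0": 0.7, "ARG1": 0.2, "NONE": 0.1},
    {"ARG0": 0.5, "ARG1": 0.4, "NONE": 0.1},
]
print(best_consistent_assignment(local_probs))
```

Here the purely local decision would label both constituents ARG0. The constrained search instead keeps ARG0 for the first constituent and falls back to ARG1 for the second, since that pair has the highest joint probability among the consistent assignments.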
Features for Semantic Role Labeling
Most systems use some generalization of the core set of features introduced by
Gildea and Jurafsky (2000). Common basic feature templates (demonstrated on
the NP-SBJ constituent The San Francisco Examiner in Fig. 19.5) include:
• The governing predicate, in this case the verb issued. The predicate is a crucial feature since
labels are defined only with respect to a particular predicate.
• The phrase type of the constituent, in this case, NP (or NP-SBJ). Some semantic roles tend
to appear as NPs, others as S or PP, and so on.
• The headword of the constituent, Examiner. The headword of a constituent
can be computed with standard head rules, such as those given in Chapter 12
in Fig. ??. Certain headwords (e.g., pronouns) place strong constraints on the
possible semantic roles they are likely to fill.
• The headword part of speech of the constituent, NNP.
• The path in the parse tree from the constituent to the predicate. This path is
marked by the dotted line in Fig. 19.5. Following Gildea and Jurafsky (2000),
we can use a simple linear representation of the path, NP↑S↓VP↓VBD. ↑ and
↓ represent upward and downward movement in the tree, respectively. The path is very useful
as a compact representation of many kinds of grammatical
function relationships between the constituent and the predicate.
• The voice of the clause in which the constituent appears, in this case, active
(as contrasted with passive). Passive sentences tend to have strongly different
linkings of semantic roles to surface form than do active ones.
• The binary linear position of the constituent with respect to the predicate,
either before or after.
• The subcategorization of the predicate, the set of expected arguments that
appear in the verb phrase. We can extract this information by using the phrase-structure rule
that expands the immediate parent of the predicate: VP → VBD
NP PP for the predicate in Fig. 19.5.
• The named entity type of the constituent.
• The first word and the last word of the constituent.
The following feature vector thus represents the first NP in our example (recall
that most observations will have the value NONE rather than, for example, ARG0,
since most constituents in the parse tree will not bear a semantic role):
ARG0: [issued, NP, Examiner, NNP, NP↑S↓VP↓VBD, active, before, VP → VBD NP PP,
ORG, The, Examiner]
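Two of these features, the path and the subcategorization, are computed directly from the tree. The sketch below shows one way to do so on a skeleton of the example parse. The Node class and the functions tree_path and subcategorization are hypothetical helpers written for this illustration, not part of any particular parsing library, and the tree keeps only the category labels needed for the two features.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A minimal parse-tree node (illustrative only)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def ancestors(node: Node) -> List[Node]:
    """The node itself followed by its ancestors up to the root."""
    chain = [node]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain


def tree_path(constituent: Node, predicate: Node) -> str:
    """Linear path from constituent up to the lowest common ancestor,
    then down to the predicate, e.g. NP↑S↓VP↓VBD."""
    up = ancestors(constituent)
    down = ancestors(predicate)
    depth_in_down = {id(n): j for j, n in enumerate(down)}
    for i, node in enumerate(up):
        if id(node) in depth_in_down:
            j = depth_in_down[id(node)]
            break
    else:
        raise ValueError("nodes are not in the same tree")
    up_labels = [n.label for n in up[: i + 1]]         # includes the top node
    down_labels = [n.label for n in reversed(down[:j])]
    return "↑".join(up_labels) + "".join("↓" + lab for lab in down_labels)


def subcategorization(predicate: Node) -> str:
    """The phrase-structure rule expanding the predicate's parent."""
    parent = predicate.parent
    if parent is None:
        raise ValueError("predicate has no parent")
    return f"{parent.label} → {' '.join(c.label for c in parent.children)}"


# Skeleton of the example parse: S -> NP VP, VP -> VBD NP PP.
subject = Node("NP")
verb = Node("VBD")
tree = Node("S", [subject, Node("VP", [verb, Node("NP"), Node("PP")])])

print(tree_path(subject, verb))   # NP↑S↓VP↓VBD
print(subcategorization(verb))    # VP → VBD NP PP
```

Finding the lowest common ancestor by comparing the two ancestor chains keeps the code short. For large trees with many candidate constituents, one would more likely cache the predicate's chain once per predicate.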
Other features are often used in addition, such as sets of n-grams inside the
constituent, or more complex versions of the path features (the upward or downward
halves, or whether particular nodes occur in the path).
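These additional features are simple to derive from what is already available. As a rough sketch, assuming the path is stored in the linear string form shown earlier and the constituent is available as a list of tokens, the helpers below (split_path and ngrams, both hypothetical names) split a path into its upward and downward halves and enumerate the n-grams inside a constituent.

```python
from typing import List, Tuple


def split_path(path: str) -> Tuple[List[str], List[str]]:
    """Split a linear path into its upward and downward halves.

    The highest node on the path is counted with the upward half.
    """
    up, sep, down = path.partition("↓")
    up_nodes = up.split("↑")
    down_nodes = down.split("↓") if sep else []
    return up_nodes, down_nodes


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-grams of the token list, in order."""
    return [tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]


up_half, down_half = split_path("NP↑S↓VP↓VBD")
print(up_half, down_half)           # ['NP', 'S'] ['VP', 'VBD']
print("VP" in up_half + down_half)  # does a VP node occur on the path?
print(ngrams(["The", "San", "Francisco", "Examiner"], 2))
```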
It’s also possible to use dependency parses instead of constituency parses as the
basis of features, for example using dependency parse paths instead of constituency