
Incrementality in Deterministic Dependency Parsing

Joakim Nivre
School of Mathematics and Systems Engineering
Växjö University
SE-35195 Växjö
Sweden
[email protected]

Abstract

Deterministic dependency parsing is a robust and efficient approach to syntactic parsing of unrestricted natural language text. In this paper, we analyze its potential for incremental processing and conclude that strict incrementality is not achievable within this framework. However, we also show that it is possible to minimize the number of structures that require non-incremental processing by choosing an optimal parsing algorithm. This claim is substantiated with experimental evidence showing that the algorithm achieves incremental parsing for 68.9% of the input when tested on a random sample of Swedish text. When restricted to sentences that are accepted by the parser, the degree of incrementality increases to 87.9%.

1 Introduction

Incrementality in parsing has been advocated for at least two different reasons. The first is mainly practical and has to do with real-time applications such as speech recognition, which require a continually updated analysis of the input received so far. The second reason is more theoretical in that it connects parsing to cognitive modeling, where there is psycholinguistic evidence suggesting that human parsing is largely incremental (Marslen-Wilson, 1973; Frazier, 1987).

However, most state-of-the-art parsing methods today do not adhere to the principle of incrementality, for different reasons. Parsers that attempt to disambiguate the input completely — full parsing — typically first employ some kind of dynamic programming algorithm to derive a packed parse forest and then apply a probabilistic top-down model in order to select the most probable analysis (Collins, 1997; Charniak, 2000). Since the first step is essentially nondeterministic, this seems to rule out incrementality at least in a strict sense. By contrast, parsers that only partially disambiguate the input — partial parsing — are usually deterministic and construct the final analysis in one pass over the input (Abney, 1991; Daelemans et al., 1999). But since they normally output a sequence of unconnected phrases or chunks, they fail to satisfy the constraint of incrementality for a different reason.

Deterministic dependency parsing has recently been proposed as a robust and efficient method for syntactic parsing of unrestricted natural language text (Yamada and Matsumoto, 2003; Nivre, 2003). In some ways, this approach can be seen as a compromise between traditional full and partial parsing. Essentially, it is a kind of full parsing in that the goal is to build a complete syntactic analysis for the input string, not just identify major constituents. But it resembles partial parsing in being robust, efficient and deterministic. Taken together, these properties seem to make dependency parsing suitable for incremental processing, although existing implementations normally do not satisfy this constraint. For example, Yamada and Matsumoto (2003) use a multi-pass bottom-up algorithm, combined with support vector machines, in a way that does not result in incremental processing.

In this paper, we analyze the constraints on incrementality in deterministic dependency parsing and argue that strict incrementality is not achievable. We then analyze the algorithm proposed in Nivre (2003) and show that, given the previous result, this algorithm is optimal from the point of view of incrementality. Finally, we evaluate experimentally the degree of incrementality achieved with the algorithm in practical parsing.

2 Dependency Parsing

[Figure 1, not reproducible here, shows a labeled dependency graph for the Swedish sentence "På 60-talet målade han djärva tavlor som retade Nikita Chrusjtjov." (In the-60's painted he bold pictures which annoyed Nikita Chrustjev.), with each word tagged for part of speech (PP NN VB PN JJ NN HP VB PM PM) and each arc labeled with a grammatical function such as adv, pr, sub, obj, att or id.]

Figure 1: Dependency graph for Swedish sentence

In a dependency structure, every word token is dependent on at most one other word token, usually called its head or regent, which means that the structure can be represented as a directed graph, with nodes representing word tokens and arcs representing dependency relations. In addition, arcs may be labeled with specific dependency types. Figure 1 shows a labeled dependency graph for a simple Swedish sentence, where each word of the sentence is labeled with its part of speech and each arc labeled with a grammatical function.

In the following, we will restrict our attention to unlabeled dependency graphs, i.e. graphs without labeled arcs, but the results will apply to labeled dependency graphs as well. We will also restrict ourselves to projective dependency graphs (Mel'cuk, 1988). Formally, we define these structures in the following way:

1. A dependency graph for a string of words W = w1 · · · wn is a labeled directed graph D = (W, A), where

   (a) W is the set of nodes, i.e. word tokens in the input string,
   (b) A is a set of arcs (wi, wj) (wi, wj ∈ W).

   We write wi < wj to express that wi precedes wj in the string W (i.e., i < j); we write wi → wj to say that there is an arc from wi to wj; we use →∗ to denote the reflexive and transitive closure of the arc relation; and we use ↔ and ↔∗ for the corresponding undirected relations, i.e. wi ↔ wj iff wi → wj or wj → wi.

2. A dependency graph D = (W, A) is well-formed iff the five conditions given in Figure 2 are satisfied.

The task of mapping a string W = w1 · · · wn to a dependency graph satisfying these conditions is what we call dependency parsing. For a more detailed discussion of dependency graphs and well-formedness conditions, the reader is referred to Nivre (2003).
Unique label   (wi →r wj ∧ wi →r′ wj) ⇒ r = r′
Single head    (wi → wj ∧ wk → wj) ⇒ wi = wk
Acyclic        ¬(wi → wj ∧ wj →∗ wi)
Connected      wi ↔∗ wj
Projective     (wi ↔ wk ∧ wi < wj < wk) ⇒ (wi →∗ wj ∨ wk →∗ wj)

Figure 2: Well-formedness conditions on dependency graphs
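
These conditions translate directly into executable checks. The sketch below is our own illustration, not code from the paper: it verifies the unlabeled conditions of Figure 2 for a graph over token positions 0..n-1, given as a set of (head, dependent) arcs. The Unique label condition only becomes relevant for labeled graphs and is therefore omitted.

def is_well_formed(n, arcs):
    """Check the Figure 2 conditions for an unlabeled dependency graph."""

    def reaches(i, j):
        """Reflexive-transitive arc relation: w_i ->* w_j."""
        stack, seen = [i], {i}
        while stack:
            v = stack.pop()
            if v == j:
                return True
            for (h, d) in arcs:
                if h == v and d not in seen:
                    seen.add(d)
                    stack.append(d)
        return False

    # Single head: no token has two distinct heads.
    dependents = [d for (_, d) in arcs]
    if len(dependents) != len(set(dependents)):
        return False

    # Acyclic: never w_i -> w_j together with w_j ->* w_i.
    if any(reaches(j, i) for (i, j) in arcs):
        return False

    # Connected (w_i <->* w_j for all pairs): for a single-headed acyclic
    # graph over n tokens this holds iff there are exactly n - 1 arcs.
    if len(arcs) != n - 1:
        return False

    # Projective: every arc dominates all tokens strictly between its ends.
    for (h, d) in arcs:
        for j in range(min(h, d) + 1, max(h, d)):
            if not (reaches(h, j) or reaches(d, j)):
                return False
    return True

# Example: the chain 0 -> 1 -> 2 is a well-formed projective tree.
assert is_well_formed(3, {(0, 1), (1, 2)})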

3 Incrementality in Dependency Parsing

Having defined dependency graphs, we may now consider to what extent it is possible to construct these graphs incrementally. In the strictest sense, we take incrementality to mean that, at any point during the parsing process, there is a single connected structure representing the analysis of the input consumed so far. In terms of our dependency graphs, this would mean that the graph being built during parsing is connected at all times. We will try to make this more precise in a minute, but first we want to discuss the relation between incrementality and determinism.

It seems that incrementality does not by itself imply determinism, at least not in the sense of never undoing previously made decisions. Thus, a parsing method that involves backtracking can be incremental, provided that the backtracking is implemented in such a way that we can always maintain a single structure representing the input processed up to the point of backtracking. In the context of dependency parsing, a case in point is the parsing method proposed by Kromann (2002), which combines heuristic search with different repair mechanisms.

In this paper, we will nevertheless restrict our attention to deterministic methods for dependency parsing, because we think it is easier to pinpoint the essential constraints within a more restrictive framework. We will formalize deterministic dependency parsing in a way which is inspired by traditional shift-reduce parsing for context-free grammars, using a buffer of input tokens and a stack for storing previously processed input. However, since there are no nonterminal symbols involved in dependency parsing, we also need to maintain a representation of the dependency graph being constructed during processing.

We will represent parser configurations by triples ⟨S, I, A⟩, where S is the stack (represented as a list), I is the list of (remaining) input tokens, and A is the (current) arc relation for the dependency graph. (Since the nodes of the dependency graph are given by the input string, only the arc relation needs to be represented explicitly.) Given an input string W, the parser is initialized to ⟨nil, W, ∅⟩ and terminates when it reaches a configuration ⟨S, nil, A⟩ (for any list S and set of arcs A). The input string W is accepted if the dependency graph D = (W, A) given at termination is well-formed; otherwise W is rejected.

In order to understand the constraints on incrementality in dependency parsing, we will begin by considering the most straightforward parsing strategy, i.e. left-to-right bottom-up parsing, which in this case is essentially equivalent to shift-reduce parsing with a context-free grammar in Chomsky normal form. The parser is defined in the form of a transition system, represented in Figure 3 (where wi and wj are arbitrary word tokens):

1. The transition Left-Reduce combines the two topmost tokens on the stack, wi and wj, by a left-directed arc wj → wi and reduces them to the head wj.

2. The transition Right-Reduce combines the two topmost tokens on the stack, wi and wj, by a right-directed arc wi → wj and reduces them to the head wi.

3. The transition Shift pushes the next input token wi onto the stack.

The transitions Left-Reduce and Right-Reduce are subject to conditions that ensure that the Single head condition is satisfied. For Shift, the only condition is that the input list is non-empty.
Initialization   ⟨nil, W, ∅⟩

Termination      ⟨S, nil, A⟩

Left-Reduce      ⟨wj wi|S, I, A⟩ → ⟨wj|S, I, A ∪ {(wj, wi)}⟩   ¬∃wk (wk, wi) ∈ A

Right-Reduce     ⟨wj wi|S, I, A⟩ → ⟨wi|S, I, A ∪ {(wi, wj)}⟩   ¬∃wk (wk, wj) ∈ A

Shift            ⟨S, wi|I, A⟩ → ⟨wi|S, I, A⟩

Figure 3: Left-to-right bottom-up dependency parsing
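
As a concrete rendering of Figure 3, the transitions can be written as functions on (stack, input, arcs) configurations. This is our own sketch, not code from the paper; the stack is a Python list whose last element is the top.

def has_head(w, arcs):
    """True if w already has a head in A (the Single head precondition)."""
    return any(d == w for (_, d) in arcs)

def left_reduce(stack, buf, arcs):
    """Combine the two topmost tokens by a left arc wj -> wi (wi below wj)
    and reduce them to the head wj."""
    *rest, wi, wj = stack
    assert not has_head(wi, arcs)
    return rest + [wj], buf, arcs | {(wj, wi)}

def right_reduce(stack, buf, arcs):
    """Combine the two topmost tokens by a right arc wi -> wj
    and reduce them to the head wi."""
    *rest, wi, wj = stack
    assert not has_head(wj, arcs)
    return rest + [wi], buf, arcs | {(wi, wj)}

def shift(stack, buf, arcs):
    """Push the next input token onto the stack (input must be non-empty)."""
    return stack + [buf[0]], buf[1:], arcs

def parse(words, transitions):
    """Run a transition sequence from <nil, W, {}>; choosing the sequence
    plays the role of the conflict-resolution mechanism discussed below."""
    stack, buf, arcs = [], list(words), set()
    for t in transitions:
        stack, buf, arcs = t(stack, buf, arcs)
    return stack, buf, arcs

# The tree in which a heads both b and c (structure (2) of Figure 4 below),
# parsed with the stack never holding more than two tokens:
_, _, arcs = parse("abc", [shift, shift, right_reduce, shift, right_reduce])
assert arcs == {("a", "b"), ("a", "c")}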

As it stands, this transition system is nondeterministic, since several transitions can often be applied to the same configuration. Thus, in order to get a deterministic parser, we need to introduce a mechanism for resolving transition conflicts. Regardless of which mechanism is used, the parser is guaranteed to terminate after at most 2n transitions, given an input string of length n. Moreover, the parser is guaranteed to produce a dependency graph that is acyclic and projective (and satisfies the single-head constraint). This means that the dependency graph given at termination is well-formed if and only if it is connected.

We can now define what it means for the parsing to be incremental in this framework. Ideally, we would like to require that the graph (W − I, A) is connected at all times. However, given the definition of Left-Reduce and Right-Reduce, it is impossible to connect a new word without shifting it to the stack first, so it seems that a more reasonable condition is that the size of the stack should never exceed 2. In this way, we require every word to be attached somewhere in the dependency graph as soon as it has been shifted onto the stack.

We may now ask whether it is possible to achieve incrementality with a left-to-right bottom-up dependency parser, and the answer turns out to be no in the general case. This can be demonstrated by considering all the possible projective dependency graphs containing only three nodes and checking which of these can be parsed incrementally. Figure 4 shows the relevant structures, of which there are seven altogether.

[Figure 4, not reproducible here, shows the seven projective dependency structures over the three nodes a, b, c. In structure (1) the arcs form the chain a → b → c; in structures (6–7) the first two tokens are not connected by a single arc.]

Figure 4: Projective three-node dependency structures

We begin by noting that trees (2–5) can all be constructed incrementally by shifting the first two tokens onto the stack, then reducing – with Right-Reduce in (2–3) and Left-Reduce in (4–5) – and then shifting and reducing again – with Right-Reduce in (2) and (4) and Left-Reduce in (3) and (5). By contrast, the three remaining trees all require that three tokens are shifted onto the stack before the first reduction. However, the reason why we cannot parse the structure incrementally is different in (1) compared to (6–7).

In (6–7) the problem is that the first two tokens are not connected by a single arc in the final dependency graph. In (6) they are sisters, both being dependents on the third token; in (7) the first is the grandparent of the second. And in pure dependency parsing without nonterminal symbols, every reduction requires that one of the tokens reduced is the head of the other(s). This holds necessarily, regardless of the algorithm used, and is the reason why it is impossible to achieve strict incrementality in dependency parsing as defined here. However, it is worth noting that (2–3), which are the mirror images of (6–7), can be parsed incrementally, even though they contain adjacent tokens that are not linked by a single arc. The reason is that in (2–3) the reduction of the first two tokens makes the third token adjacent to the first. Thus, the defining characteristic of the problematic structures is that precisely the leftmost tokens are not linked directly.

The case of (1) is different in that here the problem is caused by the strict bottom-up strategy, which requires each token to have found all its dependents before it is combined with its head. For left-dependents this is not a problem, as can be seen in (5), which can be processed by alternating Shift and Left-Reduce. But in (1) the sequence of reductions has to be performed from right to left as it were, which rules out strict incrementality. However, whereas the structures exemplified in (6–7) can never be processed incrementally within the present framework, the structure in (1) can be handled by modifying the parsing strategy, as we shall see in the next section.

It is instructive at this point to make a comparison with incremental parsing based on extended categorial grammar, where the structures in (6–7) would normally be handled by some kind of concatenation (or product), which does not correspond to any real semantic combination of the constituents (Steedman, 2000; Morrill, 2000). By contrast, the structure in (1) would typically be handled by function composition, which corresponds to a well-defined compositional semantic operation. Hence, it might be argued that the treatment of (6–7) is only pseudo-incremental even in other frameworks.
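
The exhaustive case analysis behind Figure 4 can be checked mechanically. The following self-contained sketch is our addition (helper names are ours): it enumerates all rooted dependency trees over three tokens, keeps the projective ones, and searches for a Shift/Left-Reduce/Right-Reduce sequence that builds each tree while the stack never exceeds two tokens. It confirms that there are seven projective structures and that exactly four of them, (2)–(5), can be parsed incrementally.

from itertools import product

def acyclic(arcs):
    """No token is its own transitive head."""
    head = {d: h for (h, d) in arcs}
    for start in head:
        seen, v = set(), start
        while v in head:
            if v in seen:
                return False
            seen.add(v)
            v = head[v]
    return True

def projective(arcs):
    """Every arc spanning a token must dominate it (cf. Figure 2)."""
    head = {d: h for (h, d) in arcs}
    def dominates(i, j):
        while True:
            if i == j:
                return True
            if j not in head:
                return False
            j = head[j]
    return all(dominates(h, j) or dominates(d, j)
               for (h, d) in arcs
               for j in range(min(h, d) + 1, max(h, d)))

def trees(n=3):
    """All rooted dependency trees over positions 0..n-1."""
    for root in range(n):
        others = [i for i in range(n) if i != root]
        for heads in product(range(n), repeat=len(others)):
            arcs = frozenset(zip(heads, others))
            if acyclic(arcs):
                yield arcs

def bottom_up_incremental(arcs, n=3):
    """Can the Figure 3 system derive `arcs` with stack size <= 2?"""
    def search(stack, i, found):
        if i == n and found == arcs:
            return True
        if i < n and len(stack) < 2 and search(stack + [i], i + 1, found):
            return True          # Shift, capped at two tokens on the stack
        if len(stack) == 2:
            lo, hi = stack       # positions are shifted in order, so lo < hi
            if (hi, lo) in arcs and search([hi], i, found | {(hi, lo)}):
                return True      # Left-Reduce: hi -> lo
            if (lo, hi) in arcs and search([lo], i, found | {(lo, hi)}):
                return True      # Right-Reduce: lo -> hi
        return False
    return search([], 0, frozenset())

proj = [t for t in trees() if projective(t)]
print(len(proj))                                    # 7, as in Figure 4
print(sum(bottom_up_incremental(t) for t in proj))  # 4, i.e. only (2)-(5)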
Before we leave the strict bottom-up approach, it can be noted that the algorithm described in this section is essentially the algorithm used by Yamada and Matsumoto (2003) in combination with support vector machines, except that they allow parsing to be performed in multiple passes, where the graph produced in one pass is given as input to the next pass.[1] The main motivation they give for parsing in multiple passes is precisely the fact that the bottom-up strategy requires each token to have found all its dependents before it is combined with its head, which is also what prevents the incremental parsing of structures like (1).

[1] A purely terminological, but potentially confusing, difference is that Yamada and Matsumoto (2003) use the term Right for what we call Left-Reduce and the term Left for Right-Reduce (thus focusing on the position of the head instead of the position of the dependent).

4 Arc-Eager Dependency Parsing

In order to increase the incrementality of deterministic dependency parsing, we need to combine bottom-up and top-down processing. More precisely, we need to process left-dependents bottom-up and right-dependents top-down. In this way, arcs will be added to the dependency graph as soon as the respective head and dependent are available, even if the dependent is not complete with respect to its own dependents. Following Abney and Johnson (1991), we will call this arc-eager parsing, to distinguish it from the standard bottom-up strategy discussed in the previous section.

Using the same representation of parser configurations as before, the arc-eager algorithm can be defined by the transitions given in Figure 5, where wi and wj are arbitrary word tokens (Nivre, 2003):

1. The transition Left-Arc adds an arc wj → wi from the next input token wj to the token wi on top of the stack and pops the stack.

2. The transition Right-Arc adds an arc wi → wj from the token wi on top of the stack to the next input token wj, and pushes wj onto the stack.

3. The transition Reduce pops the stack.

4. The transition Shift pushes the next input token wi onto the stack.

The transitions Left-Arc and Right-Arc, like their counterparts Left-Reduce and Right-Reduce, are subject to conditions that ensure that the Single head constraint is satisfied, while the Reduce transition can only be applied if the token on top of the stack already has a head. The Shift transition is the same as before and can be applied as long as the input list is non-empty.
Initialization   ⟨nil, W, ∅⟩

Termination      ⟨S, nil, A⟩

Left-Arc         ⟨wi|S, wj|I, A⟩ → ⟨S, wj|I, A ∪ {(wj, wi)}⟩   ¬∃wk (wk, wi) ∈ A

Right-Arc        ⟨wi|S, wj|I, A⟩ → ⟨wj|wi|S, I, A ∪ {(wi, wj)}⟩   ¬∃wk (wk, wj) ∈ A

Reduce           ⟨wi|S, I, A⟩ → ⟨S, I, A⟩   ∃wj (wj, wi) ∈ A

Shift            ⟨S, wi|I, A⟩ → ⟨wi|S, I, A⟩

Figure 5: Left-to-right arc-eager dependency parsing
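
Again as our own illustrative sketch (not the paper's code), the four transitions of Figure 5 with their preconditions, using the same (stack, input, arcs) representation as before, stack top last:

def has_head(w, arcs):
    """True if w already has a head in A."""
    return any(d == w for (_, d) in arcs)

def left_arc(stack, buf, arcs):
    """<wi|S, wj|I, A> -> <S, wj|I, A + {(wj, wi)}>, if wi has no head."""
    wi, wj = stack[-1], buf[0]
    assert not has_head(wi, arcs)
    return stack[:-1], buf, arcs | {(wj, wi)}

def right_arc(stack, buf, arcs):
    """<wi|S, wj|I, A> -> <wj|wi|S, I, A + {(wi, wj)}>, if wj has no head."""
    wi, wj = stack[-1], buf[0]
    assert not has_head(wj, arcs)
    return stack + [wj], buf[1:], arcs | {(wi, wj)}

def reduce_(stack, buf, arcs):
    """<wi|S, I, A> -> <S, I, A>, if wi already has a head."""
    assert has_head(stack[-1], arcs)
    return stack[:-1], buf, arcs

def shift(stack, buf, arcs):
    """<S, wi|I, A> -> <wi|S, I, A>."""
    return stack + [buf[0]], buf[1:], arcs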

Comparing the two algorithms, we see that the Left-Arc transition of the arc-eager algorithm corresponds directly to the Left-Reduce transition of the standard bottom-up algorithm. The only difference is that, for reasons of symmetry, the former applies to the token on top of the stack and the next input token instead of the two topmost tokens on the stack. If we compare Right-Arc to Right-Reduce, however, we see that the former performs no reduction but simply shifts the newly attached right-dependent onto the stack, thus making it possible for this dependent to have right-dependents of its own. But in order to allow multiple right-dependents, we must also have a mechanism for popping right-dependents off the stack, and this is the function of the Reduce transition. Thus, we can say that the action performed by the Right-Reduce transition in the standard bottom-up algorithm is performed by a Right-Arc transition in combination with a subsequent Reduce transition in the arc-eager algorithm. And since the Right-Arc and the Reduce can be separated by an arbitrary number of transitions, this permits the incremental parsing of arbitrarily long right-dependent chains.

Defining incrementality is less straightforward for the arc-eager algorithm than for the standard bottom-up algorithm. Simply considering the size of the stack will not do anymore, since the stack may now contain sequences of tokens that form connected components of the dependency graph. On the other hand, since it is no longer necessary to shift both tokens to be combined onto the stack, and since any tokens that are popped off the stack are connected to some token on the stack, we can require that the graph (S, AS) should be connected at all times, where AS is the restriction of A to S, i.e. AS = {(wi, wj) ∈ A | wi, wj ∈ S}.

Given this definition of incrementality, it is easy to show that structures (2–5) in Figure 4 can be parsed incrementally with the arc-eager algorithm as well as with the standard bottom-up algorithm. However, with the new algorithm we can also parse structure (1) incrementally, as is shown by the following transition sequence:

⟨nil, abc, ∅⟩
↓ (Shift)
⟨a, bc, ∅⟩
↓ (Right-Arc)
⟨ba, c, {(a, b)}⟩
↓ (Right-Arc)
⟨cba, nil, {(a, b), (b, c)}⟩
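
Replayed with the sketch above (our illustration), the derivation looks as follows; note that after every step the tokens on the stack form a single connected component, so the connectedness condition defined above holds throughout:

config = shift([], list("abc"), set())     # <a, bc, {}>
config = right_arc(*config)                # <ba, c, {(a, b)}>
config = right_arc(*config)                # <cba, nil, {(a, b), (b, c)}>
assert config == (["a", "b", "c"], [], {("a", "b"), ("b", "c")})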
We conclude that the arc-eager algorithm is optimal with respect to incrementality in dependency parsing, even though it still holds true that the structures (6–7) in Figure 4 cannot be parsed incrementally. This raises the question how frequently these structures are found in practical parsing, which is equivalent to asking how often the arc-eager algorithm deviates from strictly incremental processing. Although the answer obviously depends on which language and which theoretical framework we consider, we will attempt to give at least a partial answer to this question in the next section. Before that, however, we want to relate our results to some previous work on context-free parsing.

First of all, it should be observed that the terms top-down and bottom-up take on a slightly different meaning in the context of dependency parsing, as compared to their standard use in context-free parsing. Since there are no nonterminal nodes in a dependency graph, top-down construction means that a head is attached to a dependent before the dependent is attached to (some of) its dependents, whereas bottom-up construction means that a dependent is attached to its head before the head is attached to its head. However, top-down construction of dependency graphs does not involve the prediction of lower nodes from higher nodes, since all nodes are given by the input string. Hence, in terms of what drives the parsing process, all algorithms discussed here correspond to bottom-up algorithms in context-free parsing. It is interesting to note that if we recast the problem of dependency parsing as context-free parsing with a CNF grammar, then the problematic structures (1) and (6–7) in Figure 4 all correspond to right-branching structures, and it is well-known that bottom-up parsers may require an unbounded amount of memory in order to process right-branching structure (Miller and Chomsky, 1963; Abney and Johnson, 1991).

Moreover, if we analyze the two algorithms discussed here in the framework of Abney and Johnson (1991), they do not differ at all as to the order in which nodes are enumerated, but only with respect to the order in which arcs are enumerated; the first algorithm is arc-standard while the second is arc-eager. One of the observations made by Abney and Johnson (1991) is that arc-eager strategies for context-free parsing may sometimes require less space than arc-standard strategies, although they may lead to an increase in local ambiguities. It seems that the advantage of the arc-eager strategy for dependency parsing with respect to structure (1) in Figure 4 can be explained along the same lines, although the lack of nonterminal nodes in dependency graphs means that there is no corresponding increase in local ambiguities. Although a detailed discussion of the relation between context-free parsing and dependency parsing is beyond the scope of this paper, we conjecture that this may be a genuine advantage of dependency representations in parsing.
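
Putting the pieces together, a deterministic parser is obtained by adding a conflict-resolution mechanism (in the experiments of the next section, a trained classifier) that selects exactly one transition per configuration. A schematic driver, again our own sketch built on the transitions above:

def parse_deterministic(words, predict):
    """Drive the arc-eager transitions deterministically: `predict` maps a
    configuration to one applicable transition; a real guide only ever
    proposes transitions whose preconditions hold."""
    stack, buf, arcs = [], list(words), set()
    while buf:                       # stop at a configuration <S, nil, A>
        t = predict(stack, buf, arcs)
        stack, buf, arcs = t(stack, buf, arcs)
    return arcs                      # W is accepted iff (W, A) is well-formed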
5 Experimental Evaluation

In order to measure the degree of incrementality achieved in practical parsing, we have evaluated a parser that uses the arc-eager parsing algorithm in combination with a memory-based classifier for predicting the next transition. In experiments reported in Nivre et al. (2004), a parsing accuracy of 85.7% (unlabeled attachment score) was achieved, using data from a small treebank of Swedish (Einarsson, 1976), divided into a training set of 5054 sentences and a test set of 631 sentences. However, in the present context, we are primarily interested in the incrementality of the parser, which we measure by considering the number of connected components in (S, AS) at different stages during the parsing of the test data.

The results can be found in Table 1, where we see that out of 16545 configurations used in parsing 613 sentences (with a mean length of 14.0 words), 68.9% have zero or one connected component on the stack, which is what we require of a strictly incremental parser. We also see that most violations of incrementality are fairly mild, since more than 90% of all configurations have no more than three connected components on the stack.

Connected     Parser configurations
components    Number    Percent
    0           1251        7.6
    1          10148       61.3
    2           2739       16.6
    3           1471        8.9
    4            587        3.5
    5            222        1.3
    6             98        0.6
    7             26        0.2
    8              3        0.0
   ≤1          11399       68.9
   ≤3          15609       94.3
   ≤8          16545      100.0

Table 1: Number of connected components in (S, AS) during parsing
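
The quantity tabulated above, the number of connected components in (S, AS), can be computed directly from a configuration. A possible sketch (our own, assuming the same (stack, buffer, arcs) representation as before and tokens that are unique identifiers, e.g. positions):

def stack_components(stack, arcs):
    """Number of connected components in (S, A_S), where A_S restricts
    the arc set to tokens currently on the stack."""
    parent = {w: w for w in stack}

    def find(w):                     # union-find with path halving
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    on_stack = set(stack)
    for (h, d) in arcs:
        if h in on_stack and d in on_stack:
            parent[find(h)] = find(d)
    return len({find(w) for w in stack})

# A configuration counts as incremental when this value is 0 or 1
# (0 for an empty stack), which is the criterion behind Table 1.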

Many violations of incrementality are caused by sentences that cannot be parsed into a well-formed dependency graph, i.e. a single projective dependency tree, but where the output of the parser is a set of internally connected components. In order to test the influence of incomplete parses on the statistics of incrementality, we have performed a second experiment, where we restrict the test data to those 444 sentences (out of 613) for which the parser produces a well-formed dependency graph. The results can be seen in Table 2. In this case, 87.1% of all configurations in fact satisfy the constraints of incrementality, and the proportion of configurations that have no more than three connected components on the stack is as high as 99.5%. It seems fair to conclude that, although strict word-by-word incrementality is not possible in deterministic dependency parsing, the arc-eager algorithm can in practice be seen as a close approximation of incremental parsing.
Connected     Parser configurations
components    Number    Percent
    0            928        9.2
    1           7823       77.8
    2           1000       10.0
    3            248        2.5
    4             41        0.4
    5              8        0.1
    6              1        0.0
   ≤1           8751       87.1
   ≤3           9999       99.5
   ≤6          10049      100.0

Table 2: Number of connected components in (S, AS) for well-formed trees

6 Conclusion

In this paper, we have analyzed the potential for incremental processing in deterministic dependency parsing. Our first result is negative, since we have shown that strict incrementality is not achievable within the restrictive parsing framework considered here. However, we have also shown that the arc-eager parsing algorithm is optimal for incremental dependency parsing, given the constraints imposed by the overall framework. Moreover, we have shown that in practical parsing, the algorithm performs incremental processing for the majority of input structures. If we consider all sentences in the test data, the share is roughly two thirds, but if we limit our attention to well-formed output, it is almost 90%. Since deterministic dependency parsing has previously been shown to be competitive in terms of parsing accuracy (Yamada and Matsumoto, 2003; Nivre et al., 2004), we believe that this is a promising approach for situations that require parsing to be robust, efficient and (almost) incremental.

Acknowledgements

The work presented in this paper was supported by a grant from the Swedish Research Council (621-2002-4207). The memory-based classifiers used in the experiments were constructed using the Tilburg Memory-Based Learner (TiMBL) (Daelemans et al., 2003). Thanks to three anonymous reviewers for constructive comments on the submitted paper.

References

Steven Abney and Mark Johnson. 1991. Memory requirements and local ambiguities of parsing strategies. Journal of Psycholinguistic Research, 20:233–250.

Steven Abney. 1991. Parsing by chunks. In Principle-Based Parsing, pages 257–278. Kluwer.

Eugene Charniak. 2000. A maximum-entropy-inspired parser. In Proceedings of NAACL-2000.

Michael Collins. 1997. Three generative, lexicalised models for statistical parsing. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pages 16–23, Madrid, Spain.

Walter Daelemans, Sabine Buchholz, and Jorn Veenstra. 1999. Memory-based shallow parsing. In Proceedings of the 3rd Conference on Computational Natural Language Learning (CoNLL), pages 77–89.

Walter Daelemans, Jakub Zavrel, Ko van der Sloot, and Antal van den Bosch. 2003. TiMBL: Tilburg Memory Based Learner, version 5.0, reference guide. Technical Report ILK 03-10, Tilburg University, ILK.

Jan Einarsson. 1976. Talbankens skriftspråkskonkordans. Lund University.

Lyn Frazier. 1987. Syntactic processing: Evidence from Dutch. Natural Language and Linguistic Theory, 5:519–559.

Matthias Trautner Kromann. 2002. Optimality parsing and local cost functions in Discontinuous Grammar. Electronic Notes in Theoretical Computer Science, 52.

William Marslen-Wilson. 1973. Linguistic structure and speech shadowing at very short latencies. Nature, 244:522–533.

Igor Mel'cuk. 1988. Dependency Syntax: Theory and Practice. State University of New York Press.

George A. Miller and Noam Chomsky. 1963. Finitary models of language users. In R. D. Luce, R. R. Bush, and E. Galanter, editors, Handbook of Mathematical Psychology, Volume 2. Wiley.

Glyn Morrill. 2000. Incremental processing and acceptability. Computational Linguistics, 26:319–338.

Joakim Nivre, Johan Hall, and Jens Nilsson. 2004. Memory-based dependency parsing. In Proceedings of the 8th Conference on Computational Natural Language Learning (CoNLL), pages 49–56.

Joakim Nivre. 2003. An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pages 149–160.

Mark Steedman. 2000. The Syntactic Process. MIT Press.

Hiroyasu Yamada and Yuji Matsumoto. 2003. Statistical dependency analysis with support vector machines. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pages 195–206.
