Goldberg, AE (to appear) in Constructions and Frames
The constructionist framework is more relevant than ever, due to efforts by a broad
range of researchers across the globe, a steady increase in the use of corpus and
experimental methods among linguists, consistent findings from laboratory
phonology and sociolinguistics, and striking advances in transformer-based large
language models. These advances promise exciting developments and a great deal
more clarity over the next decade. The constructionist approach rests on two
interrelated but distinguishable tenets: a recognition that constructions pair form
with function at varying levels of specificity and abstraction, and a recognition that
our knowledge and use of language are dynamic and usage-based.
1. Introduction
I use the term constructionist approach to emphasize two claims (Goldberg 2006).1 First,
language is composed of a dynamic network of CONSTRUCTIONS, at varying levels of
complexity and abstraction, which pair each form with a conventional range of functions.
Equally important, languages are learned or CONSTRUCTED on the basis of the linguistic input
witnessed, together with general cognitive, pragmatic and processing factors. These and several
other basic tenets of the constructionist approach are stated below:
1 I do not attempt to organize the full range of research that falls under the heading
“Construction Grammar” or “constructionist.” But see eloquent and thoughtful descriptions of
the landscape available elsewhere (Gonzálvez-García and Butler 2006; Hoffmann and Trousdale
2013; Ungerer and Hartmann 2023).
o Gossip construction
It’s nice of you to be here; It was stupid of me.
o the Xer, the Yer construction
The bigger they come, the harder they fall.
Constructions that structure discourse (and lexically specified instances in italics):
o Information questions
What does that mean?
o Polarity questions
Does it matter? Is that a thing?
o Relative clauses
things you can do
o Passives
Mistakes were made
1.2. Definition
My understanding of constructions has evolved as I’ve gained a better appreciation of human
memory, learning, and the brain. Rather than treating abstract constructions as reified entities
that exist independently of their instantiations, I now find the description in (8) more accurate:
(8) A construction is an emergent cluster of lossy (imperfect) memory traces that are aligned
within our high-dimensional conceptual space on the basis of shared form, function, and
contextual dimensions (Goldberg 2019:7).
The definition of construction in (8) is based on evidence of the usage-based nature of our
knowledge of language, very briefly reviewed in the following section.
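To make the idea in (8) concrete, the following toy sketch (my own illustration, not an implementation from Goldberg 2019) represents lossy memory traces as vectors in a shared space and treats a construction as an emergent cluster of traces that lie near one another. The phrases, dimensions, and threshold are all invented for illustration:

    import math

    # Invented "form/function/context" coordinates for four memory traces.
    traces = {
        "hazard a guess":      [0.9, 0.8, 0.1],
        "hazard a prediction": [0.9, 0.7, 0.2],
        "kick the bucket":     [0.1, 0.9, 0.8],
        "kick the pail":       [0.2, 0.9, 0.7],
    }

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    # Traces whose similarity exceeds a threshold form one emergent cluster.
    THRESHOLD = 0.95
    items = list(traces)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            sim = cosine(traces[a], traces[b])
            if sim > THRESHOLD:
                print(f"{a!r} and {b!r} cluster together (cos = {sim:.2f})")

On this toy data, the two hazard traces cluster together and the two kick traces cluster together, while cross-cluster similarities fall below the threshold; a real ConstructionNet would of course involve vastly higher-dimensional, partially overlapping representations.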
within each community of language users (see Croft, this volume; van Trijp, this volume). Not
every constructionist emphasizes the usage-based nature of language. Indeed, this aspect was only
in my own peripheral vision early on (e.g., Goldberg 1995:135). A far deeper appreciation of
statistical information and discourse factors came into focus by the time I wrote Constructions at
Work (Goldberg 2006), as I interacted with and read more work by colleagues such as Ron
Langacker, Joan Bybee, Liz Bates, Wallace Chafe, Jeff Elman, Knud Lambrecht, Mike Tomasello,
the other authors and editors of this volume and others.
But the usage-based nature of language is tacitly endorsed by nearly every psychologist and
machine learning expert, as well as by those of us who explicitly describe our perspective as usage-
based (Abbot-Smith and Tomasello 2010; Ambridge and Lieven 2011; Arnon and Snider 2010;
Boas 2008; Diessel and Hilpert 2016; Dunn 2019; Kidd, Lieven, and Tomasello 2010; Hilpert 2015;
Ibbotson 2022). The frequencies of constructions and the frequencies of their subparts
simultaneously influence language processing and language change (Baayen and del Prado Martin
2005; Bybee 2010; Gries 2010; Goldberg and Lee 2021; Gries and Hilpert 2010; Traugott and
Trousdale 2013). And relationships among constructions and the forms of constructions are
shaped by users’ goals and conversational demands over diachronic time (e.g., Francis and
Michaelis 2017; Du Bois 2014; Givón 2014).
Since new information is related to old information, constructional generalizations emerge as
clusters of related instances within the high-dimensional network embedded in each brain, with
its nearly 100 billion neurons and roughly 60 trillion connections. As discussed at some length in
Goldberg (2019), memory traces that cluster together to form constructions involve partially
overlapping patterns of connections. Our brain’s incredibly rich network is dynamic: each person’s
constructicon (or ConstructionNet) is shaped by millions of exposures to language (Beckner et al.
2009; Bybee 2010; McClelland et al. 2010; Gries and Hilpert 2008; Perek 2015; Traugott 2014).
ConstructionNets continue to change as speakers are exposed to new contexts, new semi-idiomatic
expressions (Ok, boomer; living one’s best life; I did a thing), new individuals, different dialects
and/or new languages. Several foundational aspects of memory and learning are relevant to the
dynamic nature of ConstructionNets (Goldberg 2019:6):
• Speakers balance the need to be expressive and efficient while conforming to the
conventions of their speech communities.
• Our memory is vast but imperfect: memory traces are partially abstract (“lossy”).2
• Lossy memories are aligned when they share relevant aspects of form and function,
resulting in overlapping, emergent clusters of representations: Constructions
• New information is related to old information, resulting in a rich network of
constructions
• During production, multiple constructions are activated. If they cannot combine, they compete.
2 Representations are “lossy,” a term from computer science, in the sense that they are not fully
specified in all detail. Models are also lossy representations, a point we return to in section 4.1.
The phrase hazard a guess occurs only 187 times in the billion-word Corpus of Contemporary
American English (COCA; Davies 2008), but the transitional probability of a guess following
hazard used as a verb is very high: p(a guess | hazard) = .49 in COCA. The implicit awareness
of the lexically filled construction, plainly evident even in a free recall task, for at least half of
adult English speakers,3 coexists with the recognition of the construction’s individual lexemes,
guess and hazard.4 The word guess is particularly transparent in the construction and is
3 This figure is based on a fill-in-the-blank task included in a Prolific survey taken by 20
English-speaking adults.
4 The verb to hazard is semantically related to its more common use as a noun. When one hazards
a guess, there is typically little evidence on which to base the guess, which makes the guessing
somewhat fraught or hazardous. The verb to hazard is sometimes used to imply some
communicative signal other than a guess. In this case, it retains the implication that doing so
makes the speaker vulnerable, as in the attested examples from COCA:
(a) I hazarded that maybe it was glamorous living in exile with a tennis legend.
(b) She hazarded a backward glance.
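For readers who want the arithmetic behind a transitional probability like p(a guess | hazard) = .49 made explicit, the following minimal sketch divides the count of hazard followed by a guess by the count of hazard overall. The four-sentence "corpus" is invented for illustration; the paper's figure comes from COCA, which is not queried here:

    corpus = [
        "i would hazard a guess that it rains",
        "they hazard a guess about the outcome",
        "we hazard the attempt anyway",
        "do not hazard your savings",
    ]

    verb = "hazard"
    continuation = ("a", "guess")

    n_verb = 0      # tokens of the verb "hazard"
    n_followed = 0  # tokens of "hazard" immediately followed by "a guess"

    for sentence in corpus:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == verb:
                n_verb += 1
                if tuple(tokens[i + 1:i + 3]) == continuation:
                    n_followed += 1

    # Transitional probability: count(hazard a guess) / count(hazard)
    print(f"p(a guess | {verb}) = {n_followed / n_verb:.2f}")  # 0.50 on this toy corpus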
In this way, hazard a guess has a highly specific form and function, yet it can also be extended
on occasion. Notably, because the statistics are so skewed toward guess, the meaning of “guess”
tends to be implied even when other terms are used, as in the attested example in (14), which
makes the fact that the prediction is a guess explicit in the second clause:
(14) “I would hesitate to hazard a particular percentage, but I would guess that…”
5 The paraphrase, She baked something for him, on the other hand, can alternatively be
interpreted to mean that she baked something on his behalf (to be given to someone else), or
even that she baked something intending to throw it at him.
way into the room), and the construction implies that a real or metaphorical path is created
(i.e., made). The implication that a path is created is what imbues the construction with its
interpretation of self-propelled motion despite difficulty or obstacles (Goldberg 1995).
or toward an event, the certainty of the information being conveyed, or the relative status of
speaker and hearer, to name just a few of the many functions signaled by constructions.
Although I sometimes use simple decompositional representations to capture the type of
general and abstract meanings associated with common argument structure meanings in English
(e.g., CAUSE-MOVE), I do this only in an effort to provide a representation that may be
comprehended at a glance. This is possible for highly frequent constructions because they
generalize across many, varied instances, so their meanings are necessarily very general. I also
sometimes employ grammatical terms such as N (noun) or V (verb), but this again is only intended
to provide information via a shorthand that I assume is familiar to readers. Each time I employ
a formal notation, I feel unsatisfied and humbled. Syntactic terms such as noun, subject, passive
do not refer to consistent categories across different languages (e.g., Croft 2001; Fried 1994;
LaPolla 1993), nor even within a single language (e.g., Croft 2001; Culicover 1999; Goldberg 2006;
Ross, 1973).
That said, I value the fact that other contributors to this volume have developed
formalisms that suit their intended goals. Proponents of Sign-Based Construction Grammar
provide a unification-based symbolic formalism for the sake of explicitness and to offer a common
descriptive language, and this is clearly valuable (Michaelis, this volume; Trousdale and Michaelis
2010; Boas and Sag 2012; Bergen and Chang 2005). As Michaelis (this volume) describes it, the
use of a “rigid” formalism can provide a “way of seeing.” Likewise, Fluid Construction Grammar
offers a fully implemented computational tool that can be used to test the compatibility of
representations in a way that captures the interactive online nature of language processing as a
means of communication (Steels and de Beule 2006; van Trijp 2014; van Trijp,
this volume). To its enduring credit, Fluid Construction Grammar has made enormous efforts and
taken great strides in grounding meaning in the goals of agentive actors in real and computational
situations.
(15) Wasn’t it rather McIlroy who seemed never to be outdriven when playing in contention?
(16) Wasn't it actually Everett who consistently demonstrated remarkable linguistic skills,
effortlessly speaking multiple languages?
What is most interesting about the utterance in (16) (and presumably the one in [15]) is its
complex interpretation. Due to the combination of constructions it employs (see i-ix below),
example (16) conveys a hedged assertion, namely that the speaker believes the polyglot at issue
is Everett; it also presupposes that someone had incorrectly suggested (or thought) that a
different person was a remarkable polyglot. Example (16) combines the following constructions:
(i) Polar interrogative construction (yes/no question), which can be used as a rhetorical question
to imply the associated assertion is false because it literally questions the veracity of the
assertion. Since the associated assertion in this case is negated, example (16) implies the
negation is false: i.e., Everett was the polyglot. The polar interrogative construction
includes:
a. Subj-aux inversion construction (wasn’t it)
(ii) It-cleft construction (It BE ___ <relative clause>), which presupposes the content of the
relative clause and puts the head noun in focus. Here, Everett is in focus, and that someone
else has been suggested as the polyglot is presupposed. The it-cleft construction includes:
a. Relative clause construction: here, a subject-oriented non-restrictive relative
clause (since Everett is interpreted as the subject argument of the relative clause)
(iii) A negative clitic (n’t), which presupposes the relevance of the positive counterpart (e.g.,
Horn, 2010; Lakoff 2014)
(iv) The focus element (actually) emphasizes the function of the it-cleft. It treats Everett as
the focus and implicitly corrects a mistaken belief (whether previously asserted or
presupposed), in this case that someone other than Everett was the polyglot.
(v) Several instances of the noun phrase construction (it, Everett, remarkable linguistic
skills, multiple languages)
(vi) A verb phrase adjunct which is discontinuous from what it modifies (effortlessly
speaking multiple languages modifies Everett, not skills) [aka a “dangling participle”]
(vii) Two lexical adverbs modifying different verb phrases (consistently, effortlessly)
(viii) Morphological inflection constructions (V-ing, N-s, V-past)
(ix) Lexical items, each associated with related words as well as its own range of functions:
was, not, it, actually, who, consistently, demonstrated, remarkable, linguistic, skills, effortlessly,
speaking, multiple, languages
The list in (i)-(ix) clarifies that the utterance in (16) is a combination of many constructions,
but few readers are likely to find the list itself particularly illuminating. And I fully empathize
with them. Not only do (i)-(ix) fail to do justice to any of the constructions involved (dozens of
papers have been written on each), but listing constructions obscures the fact that each exists as
part of a network of related items: the polar interrogative construction in (i) is related to the
tag question construction (e.g., wasn’t it?) and to information (wh-) questions. The subject-
auxiliary inversion construction is in reality a family of constructions in English (e.g., Goldberg 2006).
The it-cleft construction is related to the presentational relative clause construction (e.g., there
was a guy who) and to wh-clefts (Kim & Michaelis, 2020). It can also be used as a scene-setting
device, with information structure quite distinct from that described in (ii): e.g., It was 1967
when young people from around the world were drawn to San Francisco by the promise of
peace, love, and understanding [COCA, 1997, SPOK]. Subject-oriented relative clauses are
related to non-subject oriented relative clauses. And obviously, the lexical item was is related to
were, be and is; the adverb effortlessly is related to the lexemes effortful, effort, and to the
morphological constructions N-less, Adj-ly.
Moreover, none of the labels or features used in (i)-(ix) captures their usage-based nature.
There are no uniform tests that hold for all words we generally call adjectives in English, let alone
any tests that apply to all adjectives in all languages (Croft, 2001). While there are constructions
with comparable functions across languages (Croft 2022) and while each construction tends to be
motivated rather than random, the specifics of each construction are not strictly predictable (e.g.,
Lambrecht, 1994). The need to explain the usage-based motivation and complexity of every
feature and construction leaves me wary of symbolic formalisms.
ChatGPT and GPT-4 are “generative pre-trained transformer” models, a type of
large language model (LLM), and they are able to produce and comprehend language to a
degree that has stunned me. This is a good place to note that generative in “generative pre-
trained transformer” simply means that the models generate novel outputs; it is unrelated to
generative linguistics, and generative linguists are generally skeptical, arguably naively so (e.g.,
Chomsky et al., 2023). In fact, as described below, LLMs share far more with usage-based
constructionist approaches to language than with traditional generative approaches (Weissweiler et
al. 2023). Table 2 provides six striking parallels, with the final one, new to ChatGPT and
GPT-4, being quite profound: the new models are specifically trained to be helpful to human
users (section 4.6). There are, to be sure, profound differences in how the parallels arise. Each is
discussed briefly in turn below.
same process when an easier solution is evident. In these contexts, children recognize that there
is a conventional or “correct” way to perform the activity and they conform their behavior
accordingly (Gergely, Bekkering, & Király 2002; Horner & Whiten, 2005). Humans also naturally
segment the natural world into meaningful units in vision, memory, and language (Chater
2018). Humans naturally construe parts of a scene that move together as parts of the same entity
(Ostrovsky 2009), we come to recognize parts with relatively high transitional probabilities as units
(Saffran et al. 1996), and we understand contiguous words that combine to form a coherent unit
to be a semantic constituent.
Finally, there is evidence that humans spontaneously predict the next word that will be
uttered while comprehending language (Kutas and Federmeier 2011). For instance, the N400
ERP component, detectable from EEG recordings on the scalp while people listen to text,
correlates quite well with how predictable each word is in context (e.g., Nieuwland and Van
Berkum 2006). Less predictable words result in a higher-amplitude N400, and highly predictable
words result in a negligible N400. Yet, despite the efficacy of predict-the-next-word training, it
is unlikely to be sufficient to explain the dramatic improvement in ChatGPT, because it has
been used for decades.
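To make the predict-the-next-word objective concrete, here is a minimal sketch (my own illustration; real LLMs use deep transformer networks, not count tables) in which a toy bigram model assigns each word a probability given the previous word, along with its surprisal (negative log probability), the quantity that predictability effects such as the N400 are reported to track:

    import math
    from collections import Counter

    text = "the dog chased the cat and the cat chased the dog".split()

    # "Training": estimate p(next | current) from bigram and unigram counts.
    bigrams = Counter(zip(text, text[1:]))
    unigrams = Counter(text[:-1])  # count only tokens that have a successor

    def prob(current, nxt):
        return bigrams[(current, nxt)] / unigrams[current]

    # Surprisal (-log2 p) of each word given its immediate context.
    for current, nxt in zip(text, text[1:]):
        p = prob(current, nxt)
        print(f"p({nxt!r} | {current!r}) = {p:.2f}, "
              f"surprisal = {-math.log2(p):.2f} bits")

A model trained this way is rewarded for assigning high probability (low surprisal) to each attested next word, which is the decades-old objective referred to above; the recent leap in performance evidently required more than this objective alone.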
4.3. Complex dynamic network of constructions at varying levels of abstraction and complexity
LLMs include far more layers, with exponentially more nodes and connections than earlier
connectionist models. This accounts for their characterization as “deep learning” models
(Graves, Mohamed, and Hinton 2013). And ChatGPT was trained on 300 billion words of text
scraped from the internet in all languages found online. The massive amount of input allows it
to learn the thousands of collocations, idioms, and semi-idiosyncratic constructions within the
vast training data, a key hallmark of usage-based constructionist approaches. The compression
involved requires a rich network of conventional constructions to partially share representational
structure with related constructions, in the same spirit as the clustering described in section 2.
To be clear, ChatGPT and GPT-4 receive orders of magnitude more input than any
human receives in their entire lifetime. And no human can learn any new language only by
scanning text or by listening to the radio, even for a billion years. We would not be able to
glean any meaning whatsoever. Humans, however, have access to real or imagined grounding in
various contexts, and importantly, humans are good at understanding the intentions of others
(Tomasello, 2003; 2010). I had been skeptical that models trained only on text could converse
coherently with humans, but ChatGPT has proven my intuition wrong. How is this possible?
to want water, juice, or milk, while if an adult at a bar asks for a drink, it is far more likely to be
a Manhattan or a Mojito. Today’s GPT models have no access to non-linguistic contexts, but
they make use of a large amount of linguistic context: each prediction is conditioned on thousands
of words of the preceding text.
and assume others are: relevant, truthful, brief, and mannerful. The goal of being helpful is mighty
close to being cooperative: one would seem to need to be relevant, generally honest, allow for turn-
taking (one way to interpret the idea of being “brief”), and be appropriate in context (or
“mannerful”). In fact, the assumption that others are being helpful or cooperative is a well-
recognized prerequisite for natural language, present in young children (Tomasello 2010).
In fact, what I find most intriguing about the latest LLMs is the way they succeed as well
as they do. The assumption that others intend to communicate cooperatively and the inclination
to conform to relatively arbitrary social norms are prerequisites for languages (and complex
cultures) to emerge in humans. This combination of prerequisites explains why none of our primate
cousins are able to learn a language anywhere near as complex as the natural language of humans
(Tomasello, 2010; 2019).
8. GPTs at work
In what follows, I include a series of representative responses to prompts I provided to GPT-4 in
March of 2023. Like humans, GPT models are not deterministic. What follows includes
responses to the first (and only) time I provided each prompt. Your results will vary. While GPT-4
is remarkably impressive overall, an illustrative instance in which it fails in an illuminating way is
included as well (Figure 9).
In another test of GPT-4’s ability to make appropriate social inferences, I asked, “When
Tim’s husband said he was at the gym all morning, Tim turned red. What might have
happened?” GPT-4’s response was again remarkably appropriate (Figure 3).
I tested the model on whether it could supply reasonable and distinct inferences when given
single-word utterances, fire! vs. coffee! GPT-4’s unedited and appropriate responses are
provided in Figure 4 (graciously overlooking my typo [the want]):
Figure 4: GPT-4’s markedly distinct and highly appropriate interpretations of the single-word
utterances “fire!” and “coffee!”
Figure 5: GPT-4’s interpretations of “she sneezed the foam off the cappuccino” and “three
computers ago,” which require constructional meaning.
Using corpus analysis, survey data, and cross-linguistic comparison, we provide motivation for
the form and function of phrases that are treated syntactically as if they were words. In
particular, we argue that novel uses of the PAL construction are ideally suited for conveying
what comedians call “observational humor.” The reasoning for this is sketched in Table 3.
We confirmed the hypothesized function of the PAL construction with survey data that asked
participants to compare pairs of sentences that either included a PAL or a non-PAL paraphrase
(Shirtz & Goldberg, to appear). Results showed that the sentences with PALs implied more
shared background between speaker and listener and were judged to be more witty and more
sarcastic than non-PALs. With this as background, I asked GPT-4 what the following means:
“I’m officially ‘slows down at all of the yellow traffic lights’ years old.” (Shirtz & Goldberg, to
appear). Remarkably, GPT-4 recognized the “humorous” flourish of the PAL construction
(Figure 6) and interpreted the phrase accurately:
Figure 6: Probe of the PAL construction (see Shirtz & Goldberg, to appear)
8.5. Over-reliance on associations can lead GPT models (and humans) astray
Insight into how ChatGPT worked comes from a series of examples posted on Twitter by
@PaulMainwood (2/22/23).6 In one, Mainwood cleverly provides ChatGPT with a twist on a
widely shared riddle, written by undergraduate students at Boston University in 2008 and
intended to highlight implicit sexism. A version of the familiar riddle follows in (21):
(21) Familiar riddle (included in training data): “A boy was rushed to the hospital after
a terrible car crash in which his father was killed. The surgeon looks at the boy and says
‘I can’t operate: he’s my son.’ How is this possible?”
People who hear the riddle for the first time are sometimes flummoxed until it is revealed that
the surgeon is the boy’s mother. Mainwood explains a different situation to ChatGPT. It is not
a riddle at all but is strongly but vaguely reminiscent of the original riddle. He stated that the
man at the wheel was the child’s biological father and that the surgeon is the child’s adoptive
father. Strikingly undeterred, ChatGPT blindly forged ahead and provided the standard answer
to the standard riddle, incongruously responding, “The surgeon is the boy’s mother.” I tried
the same prompt on GPT-4 and it performed similarly, assuredly but incorrectly stating that
the surgeon was the boy’s adoptive mother (Figure 9).
6 [Link] The examples recall Searle’s famous Chinese Room argument (Searle 1980).
GPT-4’s response is obviously wrong, but it associates the given prompt with a specific
context (the standard riddle) undoubtedly encountered in its training. This type of error is
potentially interesting. We humans are also prone to context-based errors, which have been
described as a result of “good-enough” processing (Christianson 2016; Ferreira, Bailey, and
Ferraro 2002; Goldberg and Ferreira 2022). For example, when asked, How many pairs of
animals did Moses take on the ark?, people commonly fail to notice that Moses, rather than
Noah, is referred to in the question.7 Similarly, students are often misled by math and physics
word problems that vary from the specific types of content they had been previously exposed to
(Bassok 1990). In fact, when I shared Figure 9 with two quite brilliant colleagues, each made
the same error that ChatGPT and GPT-4 did, by failing to notice that the prompt was not the
standard riddle.
In the narrow domain of human natural language production and comprehension, GPT
models available now, in 2023, make every phrase structure grammar and every syntactic parser
that came before look like line drawings next to Gaudí’s Sagrada Familia. These models have
limitations in terms of novel spatial reasoning, complex math problems, and descriptions of
world events not included in their training data. Like la Sagrada Familia, the models are works in
progress. Advancements will continue. It is up to humans to put the models to work in ways
that benefit humanity. And it will be left to cognitive scientists and linguists to explore how
they work.
7 Unsurprisingly, since the example is a classic example of good-enough processing, GPT-4 was
not fooled by this particular question, responding “Moses did not bring animals onto an ark; it
was Noah who brought animals onto the ark according to the biblical story found in the Book of
Genesis...”
9. Looking ahead
I fully agree with Martin Hilpert that the future of construction grammar is in excellent hands.
Researchers ought to follow their own curiosity, wherever it takes them. But before closing, I
offer what I personally take to be the most promising directions for new work over the coming
decade.
o GPT models put on full display the power of usage-based constructionist models. Systematic
investigation of such models will likely help us better understand parallels and differences
with human language and cognition (Hawkins et al. 2020; Mahowald forthcoming; McCoy et
al. 2021; Piantadosi, 2023). At the very least, since we know that context always matters, we
need to move away from static representations of the ConstructionNet and embrace dynamic
models to the extent possible (see also Barak and Goldberg 2017; Dasgupta et al. 2022; van
Trijp 2014, 2015; Steels and DeBeule, 2006).
o Field work will always be highly valuable and large-scale cross-linguistic comparisons will
help us better understand shared aspects of our semantic and pragmatic construal of the
world, as well as the processing pressures that result in languages patterning as they do
(Bohnemeyer et al. 2007; Croft 2001; Croft 2022; Haspelmath 2010; Kemmerer 2011; Majid et
al. 2004).
o A fuller, deeper appreciation of information structure and lexical semantics can unlock
puzzles that have long been assumed to require syntactic stipulations, including island
constraints, scope, anaphora, and binding (Ackerman and Nikolaeva 2014; Cole, Hermon, and
Yanti 2014; Culicover and Jackendoff 2005; Cuneo and Goldberg 2022; Francis and Michaelis
2017; Goldberg and Michaelis 2017; Israel 2001; Namboodiripad et al. 2022).
o Laboratory phonology and sociolinguistics are thriving subfields of linguistics. Each field has
long provided compelling evidence for the usage-based approach to language. Researchers
equipped to hypothesize and test potential parallels between phonological and grammatical
phenomena will be in a position to offer a coherent and insightful perspective across subareas
(e.g., Bybee 2010; Docherty and Foulkes 2014; Harmon and Kapatsinski 2016).
o We ought not feel constrained to focus only on traditional questions. For instance, emotion
drives most everything we do, so it is worthwhile to better understand its role in
communication and language (Citron and Goldberg 2014; Foolen 2012). We also need to
incorporate constraints and implications of communicative gestures (Congdon et al. 2018;
Khasbage et al. 2022; Steen and Turner 2012; Willems and Hagoort 2007), and conversational
dynamics to more accurately understand natural language (Du Bois, Kumpf, and Ashby
2003; Hopper and Thompson 1980; Stephens, Silbert, and Hasson 2010).
10. Conclusion
Let’s allow ChatGPT to have the final word, offered in the style of Ovid (Left, Figure 10) and
Dr. Seuss (Right, Figure 10).
Acknowledgements
I thank the other contributors to this volume for helpful feedback, particularly Bill Croft and
the editors for reviewing an earlier version of this paper. I am also grateful to Arielle Belluck for
her expert editing.
References
Abbot-Smith, K., & Tomasello, M. (2010). The influence of frequency and semantic similarity
on how children learn grammar. First Language, 30(1), 79–101.
[Link]
Ackerman, F., & Nikolaeva, I. (2014). Descriptive typology and linguistic theory: A study in the
morphosyntax of relative clauses. Stanford: CSLI Publication.
Ambridge, B., & Lieven, E. V. M. (2011). Child language acquisition: Contrasting theoretical
approaches. Cambridge: Cambridge University Press.
Arnon, I., & Christiansen, M. H. (2017). The role of multiword building blocks in explaining
L1–L2 differences. Topics in Cognitive Science, 9(3), 621–636.
Arnon, I., & Snider, N. (2010). More than words: Frequency effects for multi-word phrases.
Journal of Memory and Language, 62(1), 67–82.
Baayen, R. H., & del Prado Martin, F. M. (2005). Semantic density and past-tense formation in
three Germanic languages. Language, 81(3), 666–698.
Christiano, P. F., Leike, J., Brown, T., Martic, M., Legg, S., & Amodei, D. (2017). Deep
reinforcement learning from human preferences. In Advances in Neural Information
Processing Systems. Vol. 30. Curran Associates, Inc.
Christiansen, M. H., & Chater, N. (2022). The language game: How improvisation created
language and changed the world. Hachette UK.
Christianson, K. (2016). When language comprehension goes wrong for the right reasons: Good
enough, underspecified, or shallow language processing. Quarterly Journal of
Experimental Psychology, 69(5), 817–828.
[Link]
Christianson, K., & Ferreira, F. (2005). Conceptual accessibility and sentence production in a
free word order language (Odawa). Cognition 98(2), 105–135.
[Link]
Citron, F. M. M., & Goldberg, A. E. (2014). Metaphorical sentences are more emotionally
engaging than their literal counterparts. Journal of Cognitive Neuroscience, 26(11),
2585–2595. [Link]
Cole, P., Hermon, G., & Yanti (2014). The grammar of binding in the languages of the world:
Innate or learned? Cognition, 141, 138–160.
[Link]
Congdon, E. L., Novack, M. A., Brooks, N., Hemani-Lopez, N., O’Keefe, L., & Goldin-
Meadow, S. (2018). Better together: Simultaneous presentation of speech and gesture in
math instruction supports generalization and retention. Learning and Instruction, 50,
65–74. [Link]
Croft, W. (2001). Radical construction grammar. Oxford University Press.
Croft, W. (2022). Morphosyntax: Constructions of the world’s languages. Cambridge University
Press.
Culicover, P. W. (1999). Syntactic nuts: Hard cases, syntactic theory and language acquisition.
Oxford: Oxford University Press.
Culicover, P. W., & Jackendoff, R. (2005). Simpler syntax. Oxford: Oxford University Press.
Cuneo, N., & Goldberg, A. E. (2022). Islands effects without extraction: The discourse function
of constructions predicts island status. Proceedings of the Cognitive Science Society.
Dasgupta, I., Lampinen, A. K., Chan, S. C. Y., Creswell, A., Kumaran, D., McClelland, J. L., &
Hill, F. (2022). Language models show human-like content effects on reasoning. arXiv.
[Link]
Davies, Mark (2008). The Corpus of Contemporary American English (COCA): One Billion
Words, 1990-2019.
Diessel, H. (2019). Usage-based construction grammar. In E. Dąbrowska & D. Divjak (Eds.),
Cognitive linguistics: A survey of linguistic subfields (pp. 50–80). Berlin: De Gruyter Mouton.
Diessel, H., & Hilpert, M. (2016). Frequency effects in grammar. In Oxford Research
Encyclopedia of Linguistics. [Link]
Docherty, G. J., & Foulkes, P. (2014). An evaluation of usage-based approaches to the
modelling of sociophonetic variability. Lingua, 142, 42–56.
Steels, L., & de Beule, J. (2006). A (very) brief introduction to fluid construction grammar.
Proceedings of the Third Workshop on Scalable Natural Language Understanding, 73–80.
New York City, New York.
Steen, F. F., & Turner, M. (2012). Multimodal construction grammar. SSRN Electronic
Journal, no. 2010, 255–274. [Link]
Stephens, G. J., Silbert, L. J., & Hasson, U. (2010). Speaker–listener neural coupling underlies
successful communication. Proceedings of the National Academy of Sciences, 107(32),
14425–14430. [Link]
Suttle, L., & Goldberg, A. E. (2011). The partial productivity of constructions as
induction. Linguistics, 49(6), 1237–1269.
Tomasello, M. (2010). Origins of human communication. Cambridge: MIT Press.
Tomasello, M. (2019). Becoming human: A theory of ontogeny. Harvard University Press.
Traugott, E. C. (2014). Toward a constructional framework for research on language change.
Cognitive Linguistic Studies, 1(1), 3–21. [Link]
Traugott, E. C., & Trousdale, G. (2013). Constructionalization and constructional changes (Vol.
6). Oxford University Press.
Trips, C., & Kornfilt, J. (2017). Further investigations into the nature of phrasal compounding.
Zenodo. [Link]
Ungerer, T. (2022). Extending structural priming to test constructional relations: Some
comments and suggestions. Yearbook of the German Cognitive Linguistics Association,
10(1), 159–182.
Ungerer, T., & Hartmann, S. (2023). Constructionist approaches: Past, present, future.
PsyArXiv. [Link]
van Dis, E. A. M., Bollen, J., Zuidema, W., van Rooij, R., & Bockting, C. L. (2023). ChatGPT:
Five priorities for research. Nature, 614(7947), 224–226. [Link]
van Trijp, R. (2014). Long-distance dependencies without filler-gaps: A cognitive-functional
alternative in fluid construction grammar. Language and Cognition, April, 1–29.
[Link]
van Trijp, R. (2015). Towards bidirectional processing models of sign language: A
constructional approach in fluid construction grammar. Proceedings of the
EuroAsianPacific Joint Conference on Cognitive Science, 1, 668–673.
Vig, J. (2019). A multiscale visualization of attention in the transformer model. arXiv.
[Link]
Warneken, F., & Tomasello, M. (2007). Helping and cooperation at 14 months of
age. Infancy, 11(3), 271–294.
Weissweiler, L., He, T., Otani, N., Mortensen, D. R., Levin, L., & Schütze, H. (2023).
Construction grammar provides unique insight into neural language models. arXiv preprint
arXiv:2302.02178.
Willems, R. M., & Hagoort, P. (2007). Neural evidence for the interplay between language,
gesture, and action: A review. Brain and Language, 101(3), 278–289.