Original Article

Does ChatGPT Write Like a Student? Engagement Markers in Argumentative Essays

Written Communication
Abstract
ChatGPT has created considerable anxiety among teachers concerned
that students might turn to large language models (LLMs) to write their
assignments. Many of these models are able to create grammatically accurate
and coherent texts, thus potentially enabling cheating and undermining
literacy and critical thinking skills. This study seeks to explore the extent to
which LLMs can mimic human-produced texts by comparing essays by ChatGPT
and student writers. By analyzing 145 essays from each group, we focus on
the way writers relate to their readers with respect to the positions they
advance in their texts by examining the frequency and types of engagement
markers. The findings reveal that student essays are significantly richer in the
quantity and variety of engagement features, producing a more interactive
and persuasive discourse. The ChatGPT-generated essays exhibited fewer
engagement markers, particularly questions and personal asides, indicating
the model's limitations in building interactional arguments. We attribute the patterns
in ChatGPT’s output to the language data used to train the model and its
underlying statistical algorithms. The study suggests a number of pedagogical
implications for incorporating ChatGPT in writing instruction.
Keywords
ChatGPT, argumentative writing, reader engagement, academic interaction
1 School of Foreign Languages, Beihang University, Beijing, China
2 School of Education and Lifelong Learning, University of East Anglia, Norwich, Norfolk, UK
Corresponding Author:
Feng (Kevin) Jiang, School of Foreign Languages, Beihang University, #37 Xueyuan Street,
Haidian District, Beijing, 100191, China.
Email: [email protected]
Introduction
The use of computational technologies in language learning is nothing new.
For decades, conversational agents, commonly referred to as chatbots, have
been applied to support teaching of foreign languages (Ji et al., 2023; Rudolph
et al., 2023). However, recent advancements in machine learning have led to
the development of large language models (LLMs), such as ChatGPT (Chat
Generative Pre-trained Transformer), which can produce humanlike text by
responding to user queries. LLMs like ChatGPT use sophisticated algorithms
and natural language processing to simulate conversational interactions and
generate various forms of content. They cannot, of course, engage in human-
like cognitive processes or possess understanding, intention, or awareness
(e.g., Byrd, 2023; Gallagher, 2023). The apparent “intelligence” exhibited by
these models is thus an emergent property of statistical patterns identified in
their training data, rather than a result of humanlike reasoning or comprehen-
sion. Therefore, in this study we use the term “LLM” to describe these com-
putational models and to refer to programs like ChatGPT.
Reactions to the role of such LLMs in language learning and student writ-
ing are mixed. For some, the language modeling capabilities of ChatGPT
can help scaffold students’ language study (e.g., Kasneci et al., 2023; Kohnke
et al., 2023). Others express concern that “the eerily humanlike chatbot”
(Satariano & Kang, 2023) might make it difficult to distinguish GPT-
generated and human-authored texts in assessing students’ writing (Revell
et al., 2024). The possible temptation to rely on LLMs when writing assign-
ments can undermine students’ development of critical thinking, problem-
solving and literacy skills, as Laquintano et al. (2023) have argued. Although
tools such as GPTZero and AICheatCheck have been developed to detect AI
involvement in writing, these seem currently unable to make a reliable dis-
tinction (Adeshola & Adepoju, 2024; Scarfe et al., 2024).
We approach this question by exploring the performance of the model
when asked to complete a writing task previously done by students. We focus
on engagement expressions, which refers to the way “writers relate to their
readers with respect to the positions advanced in the text” (Hyland, 2005c, p.
176). The analysis seeks to offer textual evidence to help identify ChatGPT-
generated texts and to offer support for L2 students and teachers seeking to
use the tool in classrooms. Before describing the methods and results, we
introduce ChatGPT, argumentative writing, and reader engagement.
array of topics (e.g., Reddit, Wikipedia, the New York Times, etc.). It is designed
to learn the statistical patterns and relationships between words and phrases
in collections of texts (Kabir, 2022) and represents a significant advance in
natural language processing and artificial intelligence. A form of LLM,
ChatGPT uses specialized algorithms to find patterns within data sequences
in order to respond to user prompts with images, texts, or videos created by
artificial intelligence.
Essentially, the model generates responses through a sophisticated sequence-
processing mechanism, wherein it analyses the input text word by word, “pre-
dicting the next word in the sequence based on the context of the words that
come before it” (Kumar, 2023). This predictive process is iterative; each newly
predicted word serves as input for subsequent predictions, continuing until the
desired textual output is achieved. Rather than processing language on a clause-
by-clause basis, then, ChatGPT calculates probabilities across vast spans of
text in its training data. This approach allows it to generate coherent text but
means it does not “understand” context in the way humans do. Instead, it pro-
duces text based on statistical patterns learned from its training data, which can
lead to both impressively fluent output and unexpected limitations in adapting
to specific contexts or tasks (Byrd, 2023; OpenAI, 2023).
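This predictive loop can be illustrated in a few lines of code. The sketch below is purely illustrative rather than a description of ChatGPT's actual implementation: it uses the openly available GPT-2 weights (via the Hugging Face transformers library) as a stand-in for the model, and the prompt and generation length are invented for the example.

    # A minimal sketch of autoregressive next-word prediction, with GPT-2
    # standing in for ChatGPT (assumes torch and transformers are installed).
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    ids = tokenizer.encode("As we can imagine, computers", return_tensors="pt")
    for _ in range(20):                        # predict 20 further tokens
        logits = model(ids).logits[0, -1]      # scores for the next token only
        probs = torch.softmax(logits, dim=-1)  # turn scores into probabilities
        next_id = torch.multinomial(probs, 1)  # sample one token from them
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)  # feed it back in
    print(tokenizer.decode(ids[0]))

Each pass through the loop conditions only on the tokens generated so far, which is the iterative prediction process described above.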
ChatGPT therefore offers the potential to provide a range of applications
in language learning from natural language processing to conversation gen-
eration, language translation, text summarization, grammar correction, para-
phrasing, and more (Bin-Hady et al., 2023; Laquintano et al., 2023; Pack &
Maloney, 2023). Yet it is seen by students and researchers as most useful
when assisting with writing: to plan, write, edit, and polish academic texts
(Ingley & Pack, 2023; Nordling, 2023). Here, ChatGPT can serve as an initial
sounding board for brainstorming ideas (Su et al., 2023) and provide correc-
tive feedback on students’ writing assignments (Godwin-Jones, 2022). Its
potential to assist writing rests on a capacity to generate writing that is “typi-
cally coherent and grammatically correct” (Barrot, 2023, p. 2) and to refine
the tone and style of a text (Ji et al., 2023). Based on these studies, it can be
argued that ChatGPT is particularly useful for non-native English speakers to
improve their academic writing skills.
However, despite these advantages, there are ongoing concerns, particu-
larly about bias, hallucinations, and sycophancy (e.g., Santurkar et al., 2023;
Zhang et al., 2023). In educational contexts, most serious is perhaps the dif-
ficulty of distinguishing “whether a text is machine- or human-generated,
presenting an additional major challenge to teachers and educators” (Kasneci
et al., 2023, p. 6). Thus, comparing undergraduate exam scripts generated by
ChatGPT to those by students, for example, Scarfe et al. (2024) found that
94% of AI submissions were undetected by readers. Even trained linguists
are not particularly effective in spotting the differences, with an average total
positive identification rate of only 38.9% (Casal & Kessler, 2023).
One possible difference is the extent of interactional involvement that
ChatGPT invests in the texts it creates. So, while ChatGPT can generate
seemingly reasoned and contextually appropriate text, it lacks an inherent
understanding of audience. Unlike human writers, who develop a mental
model of their readers and adjust their writing accordingly (Hyland & Jiang,
2023), ChatGPT does not possess an intrinsic awareness of who might be
reading its output. This limitation is a consequence of the fact that the model
is trained on huge amounts of texts from diverse registers and genres, each
with its own purposes, structures, and audiences (Milano, 2023). This process
bleaches out any specific audience and means that the model operates with a
“generic” target reader. Consequently, any audience-specific features in
ChatGPT-generated text, particularly engagement markers, are incidental,
reflecting patterns in the training data rather than a clear consideration of
readers’ needs. This “audience blindness” can result in output that, while
grammatically conventional and topically relevant, may lack the nuanced
involvement features that characterize effective human writing (Markey
et al., 2024; Jiang & Hyland, 2025).
To put this plainly, the absence of a built-in model of audience means that
ChatGPT cannot automatically adapt its rhetorical style, tone, or level of
detail to suit different reader groups, nor can it anticipate and address poten-
tial reader questions or objections without specific prompting. Audience
awareness is central to academic writing, and control of interpersonal ele-
ments can be crucial to successful persuasion (Hyland, 2005a; Su et al.,
2023). Such dialogic aspects of argument not only involve conveying an
appropriate stance but also acknowledging and addressing the role of readers
(Hyland, 2004; Shahriari & Shadloo, 2019). Recent studies, however, sug-
gest that ChatGPT-generated essays “exhibit reduced involvement and inte-
gration compared to their human counterparts” (Berber Sardinha, 2024, p. 9),
and “often read as dialogically closed, ‘empty,’ and ‘fluffy’” (Markey et al.,
2024, p. 571). These findings, based on Biber’s (1988) multidimensional
analysis, provide useful insights into the general characteristics of LLM-
generated text. However, while this approach offers a broad perspective on
textual features, it does not specifically focus on the nuances of interpersonal
interaction within the essays. As such, there remains a need for more targeted
research to systematically compare the interpersonal elements in LLM-
generated and human-written essays.
Our study sets out to provide textual evidence for human-AI differences in
this regard by exploring the extent to which ChatGPT can generate argumenta-
tive content with the same form, frequency, and function of reader engagement
as undergraduates. Our comparison provides insights into the development of
more nuanced writing instruction methods that leverage the strengths of LLMs
while addressing their limitations. Beyond this, our study points to broader
issues around the limitations of the data sets this particular bot was trained on
and the difficulties of designing prompts that communicate a context for any
desired text.
soft disciplines. Comparisons have also been made in the ways patterns of
engagement vary by genre and language, revealing how writers shape their
texts to the expectations of different audiences. Therefore, Hyland (2004)
found differences between expert texts and undergraduate dissertations and
between popular and professional science articles (Hyland, 2010). In addi-
tion, while context and national culture can influence the use of engage-
ment, L1 transfer and L2 proficiency may also have some bearing
(Lafuente-Millán, 2014).
Typically, the way writers articulate arguments and initiate social engage-
ment is shaped by their understanding of “audience.” In academic contexts
this is rarely a real person but an abstraction conjured up by the writer and
based on his or her knowledge of the community for which the text is writ-
ten. Thus, audience comprises the writer’s perception of the external cir-
cumstances that define a rhetorical context and influence the specific textual
conventions employed. Hence, Park (1982), for example, argues that audi-
ence exists in the writer’s mind and shapes a text as “a complex set of con-
ventions, estimations, implied responses and attitudes” (p. 251).
Writers therefore navigate the complexities of engaging audience by
drawing on the rhetorical and structural conventions of the genre and by
ways of crafting arguments that are recognized and valued within their dis-
ciplinary communities. Hyland (2005c) argues that there are five main
ways that authors overtly intrude into their texts to connect with readers
directly. At certain points, writers acknowledge an active audience using
the following.
As seen in Table 1, these features are the most explicit means at the
writer’s disposal to recognize their readers in the text, to acknowledge their
expectations of inclusion, and to respond to their possible objections and
alternative interpretations (Hyland, 2005c). While inclusion of readers
might sometimes be based on tacit assumptions and expressed implicitly
through, say, choice of method, theory, or data, explicit engagement fea-
tures help to concretize the ways that writers intervene to “engage actively
or position readers, focusing their attention, recognizing their uncertainties,
including them as discourse participants and guiding them to interpreta-
tions” (Hyland, 2001, p. 552). They, therefore, carry important rhetorical
meanings while managing the impression readers get of the writer (Hyland
& Jiang, 2016).
With the growing influence of LLMs on academic writing, what is miss-
ing from these studies is the question of whether ChatGPT can produce texts
with the same degree of nuance and variability of reader engagement.
Table 1. Engagement Features.

Reader mentions: They bring readers into a discourse, normally through second person pronouns, particularly inclusive we, which identifies the reader as someone who shares similar ways of seeing to the writer. Example (1): As we can imagine, this has had a tremendous influence on sales in places such as fast-food restaurants where beefburgers are the main item on the menu. (Student essay)1

Questions: They invite direct collusion because they address the reader as someone with interest in the issue the question raises and the good sense to follow the writer's response to it. Example (2): Can we expect a scientist to bear this additional burden for the whole world? In truth no, it is unreasonable. (Student essay)

Appeals to shared knowledge: They are explicit signals asking readers to recognize something as familiar, apparent, or accepted. Example (3): Traditionally, participating in a lottery involved purchasing a physical ticket from an authorized retailer. (ChatGPT essay)

Directives: They are instructions to the reader, mainly expressed through imperatives (such as consider, note), obligation modals (need to, should), and predicative adjectives (it is important to understand . . .), which direct readers (a) to another part of the text or another text, (b) to carry out some action in the real world, or (c) to interpret an argument in certain ways. Example (4): As IVF technologies continue to advance and become more integrated into the fabric of society, it is vital to consider the demographic trends they influence. (ChatGPT essay)

Personal asides: They are brief interjections where the writer speaks directly to the reader, often to share a personal thought, comment, or anecdote. These asides can create a conversational tone, add personality to the writing, and help to engage the reader. Example (5): Many constitutional problems still block our road to Europe, as well as people's attitudes—we in Britain rather enjoy being an island and not attached to the continent—witness the opposition to the Channel Tunnel. (Student essay)
Methodology
Data Collection
As outlined above, we set out to compare the argumentative essays generated
by ChatGPT with those written by British university students. For the latter,
we drew upon the Louvain Corpus of Native English Essays (LOCNESS), a
collection of texts written by British and American university students.2 From
this corpus, we extracted 145 argumentative essays written by second-year
Data Analysis
The two corpora were part-of-speech tagged by TagAnt (Anthony, 2014) and
then searched for the engagement features described in Hyland (2005c) using
AntConc (Anthony, 2022). Overall, we examined about 100 different items of
reader engagement provided by Hyland (2005c) and Hyland and Jiang (2016)
(see Appendix 1), and manually checked each concordance to establish that
the feature was performing an engagement function by addressing readers
directly. Most obviously, this involved eliminating non-addressee modals
(every scientist should be a good judge) and interjections that were not per-
sonal asides (the problem is obvious—there are too many cars on Britain’s
roads). In addition, some features were easily located through a corpus word
search (we, of course) while others entailed a regular expression search
(imperatives, it is adj to + verb).
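To make the procedure concrete, the sketch below (in Python) runs the two regular expressions listed in Appendix 1 over a line of POS-tagged text in the word_TAG format that TagAnt outputs. It is our illustration of the search logic rather than the AntConc software itself, and the tagged sentence is an invented sample.

    import re

    # An invented sample in the word_TAG format produced by TagAnt.
    tagged = ("This_DT is_VBZ clear_JJ ._SENT Consider_VV the_DT evidence_NN ._SENT "
              "It_PP is_VBZ important_JJ to_TO understand_VV it_PP ._SENT")

    # Directive pattern "it is ADJ to VERB", as given in Appendix 1
    # (run case-insensitively so sentence-initial "It" is also caught).
    it_adj_to_v = re.compile(r"it_PP\sis_VBZ\s\w*_JJ\sto_TO\s\w*_VV", re.IGNORECASE)

    # Imperatives: a base-form verb (_VV) directly after an opening bracket
    # or a sentence-boundary tag (the dot is escaped here for safety).
    imperative = re.compile(r"(\(_\(\s|\._SENT\s)\w*_VV")

    for m in it_adj_to_v.finditer(tagged):
        print("it is ADJ to V:", m.group(0))  # It_PP is_VBZ important_JJ ...
    for m in imperative.finditer(tagged):
        print("imperative:", m.group(0))      # ._SENT Consider_VV

Each automatic hit located in this way was then, as noted above, checked manually in its concordance line to confirm an engagement function.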
The two authors worked independently and coded a random sample of
10% of engagement expressions, achieving an inter-rater reliability of 97%.
Disagreements were resolved through discussion and consideration of other
examples. For instance, disagreement arose coding the phrase “you might
wonder why . . .” in the introduction of an essay. The second author argued
that this phrase should be classified as a reader mention, while the first author
felt it should be considered a rhetorical question, which typically serves a
different function of engagement. After discussing the context and reviewing
similar examples, we agreed to code it as a reader mention and concluded that
the primary function of the phrase was to directly address the reader and to
anticipate their thoughts.
We normalized the results to 1,000 words to compare the use of engage-
ment across the two corpora and determined statistical significances in these
differences by applying log-likelihood (LL) tests using Rayson’s (2016) cal-
culator. We followed the suggestion in that paper that an LL score of 3.8 or
higher is significant at a cut-off p value of 0.05. We also considered the effect
size for log-likelihood tests (%DIFF), which indicates the percentage of the
difference between the two normalized frequencies (see Gabrielatos, 2018,
for more information about %DIFF).
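For readers who wish to replicate these statistics, the calculations behind such calculators can be written out directly. The sketch below implements the standard two-corpus log-likelihood and %DIFF formulas under our reading of Rayson (2016) and Gabrielatos (2018); the frequencies and corpus sizes in the final lines are invented for illustration.

    import math

    def log_likelihood(a, b, c, d):
        """Two-corpus log-likelihood: a, b = raw frequencies of a feature
        in corpora of c and d running words."""
        e1 = c * (a + b) / (c + d)   # expected frequency in corpus 1
        e2 = d * (a + b) / (c + d)   # expected frequency in corpus 2
        ll = 0.0
        if a > 0:
            ll += a * math.log(a / e1)
        if b > 0:
            ll += b * math.log(b / e2)
        return 2 * ll

    def percent_diff(a, b, c, d):
        """%DIFF effect size: percentage difference between the two
        normalized frequencies, with corpus 2 as the reference."""
        n1, n2 = a / c * 1000, b / d * 1000   # per 1,000 words
        return (n1 - n2) / n2 * 100

    # Invented counts: 1,200 hits in a 70,000-word corpus versus
    # 400 hits in a 74,000-word corpus.
    print(round(log_likelihood(1200, 400, 70000, 74000), 2))  # > 3.8 => p < 0.05
    print(round(percent_diff(1200, 400, 70000, 74000), 2))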
amounting to 16.99 cases per 1,000 words. This shows significantly less use
of engagement markers by ChatGPT in creating argumentative essays
(LL = 471.98, %DIFF = 68.23, p < 0.001). Interestingly, Jiang and Hyland
(2024) similarly identified significantly fewer 3-word stance bundles (e.g., it
is possible, this never is, in my opinion) in the ChatGPT essays than used by
human writers. Clearly, this does not tell us a great deal about the quality of
the essays per se, as more interactional devices do not necessarily mean more
effective texts. Hyland (2004) and Jiang and Ma (2018), for instance, found
far fewer uses of engagement in student than professional writing. However,
significantly fewer markers of engagement reveal a distinctive characteristic
of the LLM texts and indicate a gulf in the interactional positions taken in
the two corpora.
As we have mentioned, a writer’s rhetorical investment in engagement
contributes to the impression of reader-awareness and recipient design in a
text and helps construct an effective line of reasoning, establishing a connec-
tion with readers as in (6) and (7):
(6) We all feel that we have a divine right to be on the road. Why?
(Student essay)
(7) As we navigate this digital social landscape, it is crucial to foster
digital literacy and etiquette to ensure that our online interactions are
respectful, authentic, and enriching. (ChatGPT essay)
We should also point out here that the normalized frequency of engagement
in both student and ChatGPT-generated essays is higher than that reported
by Hyland and Jiang (2016) for research articles over time. The fact that
LLMs such as ChatGPT “learn” from a wide range of registers and genres
gives them the appearance of a broad understanding of context and an adapt-
ability to different writing styles and genres (Milano, 2023; Wolfram, 2023).
This adaptability allows them to tailor essays to a broader audience, using a
more accessible tone compared to the more formal conventions of research
articles. Our comparison with student writers, however, shows the program
was unable to mirror the engaging tone of the student texts.
Table 3 shows the distribution of engagement features in the two cor-
pora, and we can see here that the overall percentages of reader mention
and directives in the ChatGPT texts align quite closely with the students’
choices. This distribution also corresponds with frequencies for these
items in the argumentative essays by EFL students studied by Yoon
(2021).
Table 3. Frequency of Engagement in the Two Corpora (Normed Frequency and Percentage of Total).

               Raw   Normed   SD     DP     %        Raw   Normed   SD     DP     %
we/our/us      152   2.09     0.08   0.12   95.00    721   9.24     0.21   0.25   90.24
you/your       8     0.11     0.01   0.15   5.00     78    1.00     0.02   0.30   9.76
This rhetorical work is key to argumentative essays, which aim to persuade readers of a point of view by addressing them directly. Reader mentions and directives therefore play a significant role in distinguishing successful from unsuccessful essays (Lee & Deakin, 2016).
Following these features, preferences differ with knowledge appeals more
frequent in the ChatGPT-generated essays and questions in the student essays.
Below, we discuss these results in more detail.
separation of writer and reader, “marking out the differences and perhaps
emphasizing the writer’s relatively junior status compared with the teacher/
reader” (Hyland, 2005b, p. 369). Novice writers are often uncertain about
engaging their readers in such an explicitly direct and personal way. They
perhaps recognize it as characterizing more intimate registers, too personal or
informal for academic writing (Hyland & Jiang, 2017) while their textbooks,
style guides and teachers generally advise them to avoid it.
Perhaps as a consequence of these personal implications, students mainly
used you and your with a broader semantic reference, referring to people in
general (similar to the indefinite pronoun one) rather than specific discourse
participants:
(10) It is said that you can meet people through computers and have “rela-
tionships”. (Student essay)
(11) For instance, spelling is no longer as important as it was as you can
simply use a “spellcheck” to correct your English, which is absurd.
(Student essay)
Here, you and your carry a more encompassing meaning than rhetorically
focusing on an individual reader, seeking instead to engage with readers by
recruiting them into a world of shared experiences. Interestingly, this rhetori-
cal use was not found in the ChatGPT essays.
Inclusive we, on the other hand, implies a shared understanding and col-
lective goal. Although it is undoubtedly dialogic by considering the readers’
perspective on an issue, we addresses readers from a position of authority,
steering them through an argument toward a preferred conclusion. Reader
pronouns therefore assert both authority and collegiality; facilitating a dia-
logue intended to persuade readers to agree with the author’s claims as in (12)
and (13). This is perhaps why this form of reader mention dominates the two
corpora.
(14) If Britain were to join “The Single Market”, because of our well-
known independence and head-strength, would we not just be “rock-
ing the bout” so to say? (Student essay)
(15) Should they have the right to “buy” themselves a baby? I think so.
(Student essay)
Whatever the sense questions carry, they all invite direct collusion since the
reader is addressed as someone with an interest in the issue raised by the
question, the ability to recognize the value of asking it, and the good sense to
follow the writer’s response to it (Hyland, 2002). Questions, then, are the
strategy of dialogic involvement par excellence, serving up an invitation for
readers to orientate themselves in a certain way to the argument presented
and to enter a frame of discourse where they can be led to the writer’s view-
point (Hyland, 2002). The ChatGPT essays, in contrast, contained very few
questions, rhetorical or otherwise, and the model appears to have limitations
in accurately identifying and interpreting such questions. Curry et al. (2024), for
example, observed that ChatGPT sometimes fabricated questions in their cor-
pus by adding question marks or question tags to declarative statements,
resulting in inaccurate questions.
In addition, we see a considerable percentage of questions in the student
essays combined with inclusive we pronouns as writers interjected questions
on behalf of the intelligent reader who is brought into the text through this
shared exploration of the topic (16 and 17).
(16) But we ought to ask ourselves “What happens when the computer-
orientated world collapses?” We would then have to use our brains.
(Student essay)
(17) But are we right to blame him? Let us consider that he has discovered
a cure for cancer as a result of genetic engineering. (Student essay)
Readers are brought to agreement with the writer through the sleight of hand
of building on what the writer suggests is already implicitly agreed. By
explicitly referring to this assumed agreement, writers construct themselves
and their reader as fellow travellers on the path of knowledge. Interestingly,
knowledge appeals account for a higher percentage in the ChatGPT-generated
essays, which indicates the model’s use of its ability to access vast amounts
(20) Soon, of course, this will become even less of a barrier with the
completion of the “Channel Tunnel”. (Student essay)
(21) The main disadvantage with the railways is as the rail service and the
bus service are normally owned by different companies . . .
(Student essay)
                           Normed   SD     DP     %        Normed   SD     DP     %
Logical reasoning          0.00     0.00   0.00   0.00     0.54     0.02   0.35   33.87
Tradition and typicality   1.27     0.03   0.14   98.92    0.74     0.02   0.28   46.77
Routine conditions         0.01     0.00   0.18   1.08     0.31     0.01   0.32   19.35
(23) The need to develop new markets has become pressing, though it is a
challenging prospect given the established tastes and demands of
traditional European markets. (ChatGPT essay)
(24) Traditionally, the concept of family was often narrowly defined: a
heterosexual couple with the ability to conceive naturally.
(ChatGPT essay)
(25) This will apparently extend our free market economy to the whole
of Europe, or at least to those countries who participate.
(Student essay)
(26) An obvious problem with a single Europe of course would be the
language barrier, should we learn a common language?
(Student essay)
In each case, there is a clear reader-oriented focus as the writer signals a rec-
ognition of the dialogic dimension of argumentative writing, intervening to
direct the reader to some action or understanding.
Table 6 shows that modals are the preferred form in both the ChatGPT and
student texts. Signaling what the writer believes is either necessary or desirable,
they carry a less imposing and commanding force than forms such as
imperatives (Hyland, 2001; Jiang & Ma, 2018). As seen from the extracts, the
obligation is typically tempered with less imposing modals such as need to,
have to, often combined with inclusive we, especially in the students’ essays.
We can also see that besides obligation modals and complement to-clauses,
students also make use of imperative and inclusive let-imperative forms. As
Hyland and Jiang (2016) note, these two options impose far less on readers
while bringing them into the process of considering and interpreting data as a
partner:
(36) all of which seems rather un-DEMOCRATIC (and I use the term in
its correct meaning) that is, taking power away from the people.
(Student essay)
(37) When this tunnel, which will run under the water between England
and France, is finally completed (hopefully in the near future) it
will be much easier to travel to and from Europe. (Student essay)
(38) Although, there are plans to create computers which can program
themselves, (which I, personally, feel is a very dangerous idea) the
human brain still very much controls the computer and still the ability
to end the existence of computers at any given moment; thankfully, a
power computers do not have over humans. (Student essay)
(39) Computers have been used as a means of keeping records, they have
all but superseded handwritten text, (in a few decades people may
well be faced with a computer screen and keyboard in their
General Studies exam), they are used to transfer money across the
globe, even to create artwork and to entertain. (Student essay)
Discussion
This comparative analysis of ChatGPT-generated essays and those written by
British students reveals significant differences in the use of engagement
markers and rhetorical strategies to involve readers. The findings indicate
that the students use overwhelmingly more engagement features, although
the LLM-generated essays demonstrate a recognition of context and adapt-
ability in mimicking various writing styles and genres.
Our findings align with recent research in the field of LLM-generated text
analysis. Markey et al. (2024), for example, observed that LLM-generated
texts exhibit reduced involvement and integration compared to their human
counterparts. Similarly, Berber Sardinha (2024) found LLM-generated texts
to be more informationally dense but less interactive than human-authored
texts. Our results also support Jiang and Hyland’s (2024; 2025) findings that
ChatGPT-generated responses often lack the stance features characteristic of
human writing. Collectively, these studies provide insights that can inform
the detection of LLM-generated writing in educational settings, where Scarfe
et al. (2024) have demonstrated the increasing sophistication of LLMs in
mimicking certain aspects of academic writing.
The stark differences in engagement we have found here indicate
ChatGPT’s inability to model an audience or anticipate reader needs.
ChatGPT does not operate by guessing the possible occurrence of items at
a clause-by-clause level but by calculating probabilities across a massive
span of text in its training data, rendering it sluggish to respond to context.
The much narrower standard deviations and smaller dispersion of propor-
tions for features in our ChatGPT data (see Table 3) show the relative lack
of variance in the LLM’s use of these features. This result aligns with those
of other studies that have quantitatively compared LLM-generated to
human-written writing (e.g., Jiang & Hyland, 2024; Markey et al., 2024).
Thus, the LLMs’ reliance on statistical patterns, rather than an understand-
ing of reader expectations, results in texts that, while coherent and gram-
matical, lack the interactive and persuasive qualities that characterize
successful argument. These are texts that are less effective in building rap-
port, addressing potential counterarguments, and guiding readers through
complex ideas. Our findings, then, underscore the importance of human
input in crafting engaging and audience-aware texts, especially in contexts
where reader buy-in is paramount.
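The dispersion statistic (DP) reported in our tables can itself be computed simply. The sketch below assumes DP here is the standard deviation-of-proportions measure, treating each essay as a corpus part; the counts in the final lines are invented to show how evenly spread versus clumped distributions score.

    def deviation_of_proportions(freqs, part_sizes):
        """DP: 0 = a feature spread perfectly evenly across corpus parts
        (here, essays); values near 1 = occurrences clumped in few texts."""
        total_f = sum(freqs)
        total_w = sum(part_sizes)
        return 0.5 * sum(abs(f / total_f - w / total_w)
                         for f, w in zip(freqs, part_sizes))

    # Invented counts over four essays of equal length: an even spread
    # versus the same total concentrated in a single essay.
    print(deviation_of_proportions([5, 5, 5, 5], [500, 500, 500, 500]))   # 0.0
    print(deviation_of_proportions([20, 0, 0, 0], [500, 500, 500, 500]))  # 0.75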
Conclusion
Although providing important textual evidence of the rhetorical differences
between ChatGPT and human writing, one shortcoming of our study is its focus
on interactional elements of academic writing, a feature of argument in
which the program might be expected to have limitations. We are also aware
that undergraduate students are not expert writers and might potentially over-
use engagement markers while the caliber of the data used to train ChatGPT
is a constraint on its responses. Although the model is trained on a sizable
amount of text data, these data may only be broadly indicative of how lan-
guage is used in this context as the training data may be skewed toward cer-
tain registers, demographics, or subject areas.
It must be said, however, that we were very impressed by the ability of the
large language model to generate a series of extended and coherent responses
to the prompts we gave it, and by the statistical procedure it uses to organize
and present points in a logical sequence. Nevertheless, our findings indicate
that ChatGPT is less adept at injecting the text with a personal touch. It still
lacks the ability to adopt a strong perspective on a topic and to engage in
persuasive interactions to carry it through, thus neglecting aspects of argu-
ment that are highly valued in academic writing. This takes nothing away
from our positive assessment of the essays it generated, nor are we undervaluing the obvious power and affordances of ChatGPT for writing assistance.
Appendix 1. Engagement Items Searched

Reader mentions: your, you, one's, the reader, we, us, our, reader

Questions: ?

Appeals to shared knowledge: apparently, as a rule, common, commonly, conventional, conventionally, established, familiar, normally, obvious, obviously, of course, prevailing, prevalent, routinely, traditional, traditionally, typical, typically, usual

Directives: add, allow, analyse, analyze, apply, arrange, assess, assume, calculate, choose, classify, compare, connect, consider, consult, contrast, define, demonstrate, determine, develop, do not, employ, ensure, estimate, evaluate, find, follow, go, has to, have to, imagine, increase, input, insert, integrate, key, let, let's, let us, look at, mark, measure, mount, must, need to, note, notice, observe, order, ought, ought to, pay, picture, prepare, recall, recover, refer, regard, remember, remove, review, see, select, set, should, show, state, suppose, take, think about, think of, turn, use

Directives (regular expression queries):
it is adj. to V.: it_PP\sis_VBZ\s\w*_JJ\sto_TO\s\w*_VV
imperatives: (\(_\(\s|._SENT\s)\w*_VV

Personal asides: incidentally, by the way, (, —
Funding
The authors received no financial support for the research, authorship, and/or publica-
tion of this article.
ORCID iD
Feng (Kevin) Jiang https://orcid.org/0000-0001-7369-9498
Notes
1. All the examples are taken from our corpora of argumentative essays by British
university students and ChatGPT discussed below.
2. https://www.learnercorpusassociation.org/
References
Adeshola, I., & Adepoju, A. P. (2024). The opportunities and challenges of ChatGPT
in education. Interactive Learning Environments, 32(10), 6159–6172. https://doi.org/10.1080/10494820.2023.2253858
Anthony, L. (2014). TagAnt (Version 1.2.0) [Computer software]. Waseda University.
http://www.antlab.sci.waseda.ac.jp
Anthony, L. (2022). AntConc (Version 4.0.11) [Computer software]. https://www.
laurenceanthony.net/software
Barrot, J. S. (2023). Using ChatGPT for second language writing: Pitfalls and poten-
tials. Assessing Writing, 57, Article 100745.
Berber Sardinha, T. (2024). AI-generated vs human-authored texts: A multidimen-
sional comparison. Applied Corpus Linguistics, 4(1), Article 100083.
Biber, D. (1988). Variation across speech and writing. Cambridge University Press.
Bin-Hady, W. R. A., Al-Kadi, A., Hazaea, A., & Ali, J. K. M. (2023). Exploring
the dimensions of ChatGPT in English language learning: A global perspective.
Library Hi Tech. Advance online publication. https://doi.org/10.1108/LHT-05-2023-0200
Byrd, A. (2023). Truth-telling: Critical inquiries on LLMs and the corpus texts that
train them. Composition Studies, 51(1), 135–142.
Casal, J. E., & Kessler, M. (2023). Can linguists distinguish between ChatGPT/AI and
human writing? A study of research ethics and academic publishing. Research
Methods in Applied Linguistics, 2(3), Article 100068.
Curry, N., Baker, P., & Brookes, G. (2024). Generative AI for corpus approaches to
discourse studies: A critical evaluation of ChatGPT. Applied Corpus Linguistics,
4(1), Article 100082.
Jiang, F., & Hyland, K. (2024). Does ChatGPT argue like students? Bundles in argu-
mentative essays. Applied Linguistics, amae052.
Jiang, F. K., & Hyland, K. (2025). Rhetorical distinctions: Comparing metadiscourse
in essays by ChatGPT and students. English for Specific Purposes, 79, 17–29.
Jiang, F., & Ma, X. (2018). ‘As we can see’: Reader engagement in PhD candida-
ture confirmation reports. Journal of English for Academic Purposes, 35, 1–15.
https://doi.org/10.1016/j.jeap.2018.05.003
Kabir, A. A. (2022). Learn ChatGPT: The future of learning. Self-published.
Kasneci, E., Sessler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F.,
Gasser, U., Groh, G., Günnemann, S., Hüllermeier, E., Krusche, S., Kutyniok, G.,
Michaeli, T., Nerdel, C., Pfeffer, J., Poquet, O., Sailer, M., Schmidt, A., Seidel,
T., . . . Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges
of large language models for education. Learning and Individual Differences,
103, 1–9. https://doi.org/10.1016/j.lindif.2023.102274
Kohnke, L., Moorhouse, B. L., & Zou, D. (2023). ChatGPT for language
teaching and learning. RELC Journal, 54(2), 537–550. https://doi.org/10.1177/00336882231162868
Koubaa, A. (2023). GPT-4 vs. GPT-3.5: A concise showdown. TechRxiv. Preprint.
https://doi.org/10.36227/techrxiv.22312330.v1
Koutsantoni, D. (2004). Attitude, certainty and allusions to common knowledge in
scientific research articles. Journal of English for Academic Purposes, 3(2), 163–
182. https://doi.org/10.1016/j.jeap.2003.08.001
Kumar, D. V. (2023). How “ChatGPT” seems to look behind: Working of ChatGPT.
https://www.kaggle.com/discussions/general/373463
Lafuente-Millán, E. (2014). Reader engagement across cultures, languages and con-
texts of publication in business research articles. International Journal of Applied
Linguistics, 24(2), 201–223. https://doi.org/10.1111/ijal.12019
Laquintano, T., Schnitzler, C., & Vee, A. (2023). TextGenEd: Teaching with text
generation technologies. WAC Clearinghouse.
Lee, J. J., & Deakin, L. (2016). Interactions in L1 and L2 undergraduate student
writing: Interactional metadiscourse in successful and less-successful argu-
mentative essays. Journal of Second Language Writing, 33, 21–34. https://doi.org/10.1016/j.jslw.2016.06.004
Liu, Z., Xu, A., Zhang, M., Mahmud, J., & Sinha, V. (2017). Fostering user
engagement: Rhetorical devices for applause generation learnt from ted talks.
Proceedings of the International AAAI Conference on Web and Social Media,
11(1), 592–595.
Markey, B., Brown, D. W., Laudenbach, M., & Kohler, A. (2024). Dense and dis-
connected: Analyzing the sedimented style of ChatGPT-generated text at scale.
Written Communication, 41(4), 571–600.
Martin, J. R., & White, P. R. R. (2005). The language of evaluation: Appraisal in
English. Palgrave Macmillan.
McGrath, L., & Kuteeva, M. (2012). Stance and engagement in pure mathematics
research articles: Linking discourse features to disciplinary practices. English for
Specific Purposes, 31(3), 161–173. https://doi.org/10.1016/j.esp.2011.11.002
Milano, E. (2023). How to ask ChatGPT anything and get the right answers: Learn to
prompt AI LLMs effectively. Self-published.
Nordling, L. (2023). How ChatGPT is transforming the postdoc experience. Nature,
622(7983), 655–657. https://doi.org/10.1038/d41586-023-03235-8
OpenAI. (2023). ChatGPT: Optimizing language models for dialogue. https://openai.
com/blog/chatgpt
Pack, A., & Maloney, J. (2023). Using generative artificial intelligence for language
education research: Insights from using OpenAI’s ChatGPT. TESOL Quarterly,
57(4), 1571–1582. https://doi.org/10.1002/tesq.3253
Park, D. B. (1982). The meanings of “audience.” College English, 44(3), 247–257.
Qiu, X., & Jiang, F. (2021). Stance and engagement in 3MT presentations: How
students communicate disciplinary knowledge to a wide audience. Journal of
English for Academic Purposes, 51, 1–12.
Rayson, P. (2016). Log-likelihood spreadsheet. http://ucrel.lancs.ac.uk/llwizard.html
Revell, T., Yeadon, W., Cahilly-Bretzin, G., Clarke, I., Manning, G., Jones, J., Mulley,
C., Pascual, R., Bradley, N., Thomas, D., & Leneghan, F. (2024). ChatGPT ver-
sus human essayists: An exploration of the impact of artificial intelligence for
authorship and academic integrity in the humanities. International Journal for
Educational Integrity, 20(1), 1–18. https://doi.org/10.21203/rs.3.rs-3483059/v1
Rudolph, J., Tan, S., & Tan, S. (2023). ChatGPT: Bullshit spewer or the end of tradi-
tional assessments in higher education? Journal of Applied Learning & Teaching,
6(1), 1–22. https://doi.org/10.37074/jalt.2023.6.1.9
Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023).
Whose opinions do language models reflect? In A. Krause, E. Brunskill, K. Cho,
B. Engelhardt, S. Sabato, & J. Scarlett (Eds.), Proceedings of the 40th interna-
tional conference on machine learning (Vol. 202, pp. 29971–30004). JMLR.
Sarrion, E. (2023). Exploring the power of ChatGPT. Apress. https://doi.org/10.1007/978-1-4842-9529-8
Satariano, A., & Kang, C. (2023). How nations are losing a global race to tackle AI’s
harms. The New York Times. https://www.nytimes.com/2023/12/06/technology/ai-regulation-policies.html?searchResultPosition=3
Scarfe, P., Watcham, K., Clarke, A., & Roesch, E. (2024). A real-world test of artifi-
cial intelligence infiltration of a university examinations system: A “Turing Test”
case study. PLoS One, 19(6), Article e0305354.
Shahriari, H., & Shadloo, F. (2019). Interaction in argumentative essays: The case of
engagement. Discourse and Interaction, 12(1), 96–110. https://www.ceeol.com/search/article-detail?id=834675
Su, Y., Lin, Y., & Lai, C. (2023). Collaborating with ChatGPT in argumentative writ-
ing classrooms. Assessing Writing, 57, Article 100752. https://doi.org/10.1016/j.asw.2023.100752
Thompson, G. (2001). Interaction in academic writing: Learning to argue with the
reader. Applied Linguistics, 22(1), 58–78.
Wilson, F., Child, S., & Suto, I. (2017). Assessing the transition between school and
university: Differences in assessment between A level and university in English.
Arts and Humanities in Higher Education, 16(2), 188–208.
Wolfram, S. (2023). What is ChatGPT doing and why does it work? Self-published.
Yoon, H.‑J. (2021). Interactions in EFL argumentative writing: Effects of topic, L1
background, and L2 proficiency on interactional metadiscourse. Reading and
Writing, 34(3), 705–725. https://doi.org/10.1007/s11145-020-10085-7
Zhang, M., Press, O., Merrill, W., Liu, A., & Smith, N. A. (2023). How language
model hallucinations can snowball. arXiv Preprint. arXiv:2305.13534.
Author Biographies
Feng (Kevin) Jiang is a Professor of applied linguistics in the School of Foreign
Languages at Beihang University, China and gained his PhD under the supervision of
Professor Ken Hyland at the University of Hong Kong. His research interests include
disciplinary discourse, corpus studies, and academic writing, and his publications
have appeared in most major applied linguistics journals.
Ken Hyland is a Professor of Applied Linguistics in Education at the University of
East Anglia. He has published over 300 articles and 35 books on academic discourse
with over 97,000 citations on Google Scholar. A collection of his work was published
as the Essential Hyland (Bloomsbury, 2018).