“… stock-taking of technological developments that have been and will be shaping the way
interpreting is practiced and future technology-using professionals are educated to enable
communication in a variety of settings.”
Franz Pöchhacker, University of Vienna, Austria
THE ROUTLEDGE HANDBOOK OF
INTERPRETING, TECHNOLOGY AND AI
This handbook provides a comprehensive overview of the history, development, use, and
study of the evolving relationship between interpreting and technology, addressing the
challenges and opportunities brought by advances in AI and digital tools.
Encompassing a variety of methods, systems, and devices applied to interpreting as a
field of practice as well as a study discipline, this volume presents a synthesis of current
thinking on the topic and an understanding of how technology alters, shapes, and enables
the interpreting task. The handbook examines how interpreting has evolved through the
integration of both purpose-built and adapted technologies that support, automate, or
even replace (human) interpreting tasks and offers insights into their ethical, practical, and
socio-economic implications. Addressing both signed and spoken language interpreting
technologies, as well as technologies for language access and media accessibility, the book
draws together expertise from varied areas of study and illustrates overlapping aspects of
research.
Authored by a range of practising interpreters and academics from across five continents,
this is the essential guide to interpreting and technology for advanced students and
researchers of interpreting as well as for professional interpreters.
Elena Davitti is Associate Professor of Translation Studies at the Centre for Translation
Studies at the University of Surrey, Co-Director of the Leverhulme Doctoral Network
‘AI-Enabled Digital Accessibility’ (ADA), and Co-Editor of the journal Translation,
Cognition & Behavior.
Sabine Braun is Professor of Translation Studies and the Director of the Centre for
Translation Studies at the University of Surrey, Co-Director of the Surrey Institute for
People-Centred AI, and Director of the Leverhulme Doctoral Network ‘AI-Enabled Digital
Accessibility’ (ADA).
ROUTLEDGE HANDBOOKS IN TRANSLATION
AND INTERPRETING STUDIES
List of contributors x
Acknowledgments xiv
Introduction 1
Elena Davitti, Tomasz Korybski and Sabine Braun
PART I
Technology-enabled interpreting 9
1 Telephone interpreting 11
Raquel Lázaro Gutiérrez
2 Video-mediated interpreting 30
Sabine Braun
PART II
Technology and interpreter training 121
PART III
Technology for (semi-)automating interpreting workflows 179
PART IV
Technology in professional interpreting settings 227
PART V
Current issues and debates 303
Index 423
CONTRIBUTORS
the European Network of Public Service Interpreting and Translation and previously
worked as a sworn interpreter.
Jérôme Devaux is a senior lecturer in French and translation studies at the Open University
(UK). His research interest lies at the intersection of interpreting studies, technology,
and social justice. He has published papers on various topics, including the interpreter’s
role(s) and technologies, legal interpreting, ethics, and interpreting training.
Claudio Fantinuoli is a researcher at the University of Mainz, CTO at KUDO Inc., and
consultant for international organisations. He works in the area of natural language
processing applied to human and machine interpreting (computer-assisted interpreting,
speech recognition, speech translation).
Wojciech Figiel is an assistant professor at the Institute of Applied Linguistics, University
of Warsaw. He has authored numerous publications on accessibility, the sociology of
translation, and disability studies. In particular, he has researched the accessibility of
translational professions for blind and low-sighted persons.
Haris Ghinos is the CEO of ELIT Language Services, vice president of the International
Association of Conference Interpreters (AIIC), and project leader of ISO 23155:2022
on conference interpreting services. Haris is also a consultant interpreter with Calliope
Interpreters and a former staff interpreter with SCIC, the European Commission. He
holds degrees in physics, war studies, international politics, and finance.
Deborah Giustini is an assistant professor in intercultural communication at HBKU and
a research fellow in interpreting studies at KU Leuven. Her research explores the digi-
talisation of multilingual work. She serves on the IATIS executive council and the edito-
rial boards of Interpreting & Society, Sociology, The British Journal of Sociology, and
Sociological Research Online.
Tomasz Korybski is an assistant professor at the Institute of Applied Linguistics, University
of Warsaw; a visiting researcher at the Centre for Translation Studies, University of Sur-
rey; and a conference interpreter/translator with over 20 years’ experience. His research
interests include the evaluation of interpreting quality and the applicability of AI-based
solutions in the provision of interpreting services.
Raquel Lázaro Gutiérrez is an associate professor and the director of the Department of
Modern Philology at the University of Alcalá. She is a member of the FITISPos-UAH
Research Group and the vice president of ENPSIT. She has been PI of several projects, such
as ‘Corpus pragmatics and telephone interpreting’ (2023–2026) and MHEALTH4ALL
(2022–2025).
Christopher Mellinger is an associate professor at the University of North Carolina at Char-
lotte. He is the co-author of Quantitative Research Methods in Translation and Interpret-
ing Studies and the editor of The Routledge Handbook of Interpreting and Cognition. He
also serves as a co-editor of the journal Translation and Interpreting Studies.
Constantin Orăsan is a professor of language and translation technologies at the Centre for
Translation Studies, University of Surrey. He has over 25 years of experience in natural
language processing and artificial intelligence. His current research focuses on the use of
large language models and automatic speech recognition in translation and interpreting.
Marc Orlando is a full professor of translation and interpreting studies and the programme
director in the Department of Linguistics at Macquarie University, Sydney. He sits on
the editorial boards of The Interpreters’ Newsletter and Interpreting and Society. His
research investigates synergies between T&I practice, research, and training, with a
focus on digital technologies.
Verónica Pérez Guarnieri, PhD, is a lead auditor and standards expert with over a decade of
experience. She authored ISO 18841 and convenes ISO TC37/SC5/WG2. As a consult-
ant interpreter and translator, she has lectured globally on standardisation and certifica-
tion and helped establish mirror committees and language industry standards in several
countries.
Bianca Prandi is a postdoctoral assistant at the University of Innsbruck. She researches
human–machine interaction in interpreting and has published on computer-assisted
interpreting and cognition. She has collaborated with the Universities of Vienna, Trieste, and
Bologna. She is a member of the EST and of the scientific board of TransActions.
Mariachiara Russo is a professor of Spanish language and interpretation at the Department
of Interpreting and Translation of the University of Bologna–Forlì Campus. She is also the
coordinator of the European Parliament Interpreting Corpus (EPIC), a co-coordinator of
the EU-funded project SHIFT in Orality (https://site.unibo.it/shiftinorality/en), and the
co-creator of UNIC (Unified Interpreting Corpus; http://unic.dipintra.it).
Anja Rütten holds a professorship at the University of Applied Sciences in Cologne, Ger-
many, focusing on knowledge management, terminology, and the use of technologies in
conference interpreting. She has been working for the institutional and private market
for over 20 years and is a member of AIIC’s AI workstream.
Francesco Saina is an Italian linguist, translator, and interpreter with English, French, and
Spanish. A university lecturer, he collaborates on academic and industrial research on
translation and interpreting technology and natural language processing. His work on
the applications of digital technology to the language professions has been published and
presented at international conferences.
Kilian Seeber is a professor of interpreting at the University of Geneva’s Faculty of Transla-
tion and Interpreting, where he serves as the vice dean and as the programme director
for the Master of Advanced Studies in Interpreter Training. He is a principal
investigator at LaborInt and at InTTech.
Diana Singureanu is a researcher at the Centre for Translation Studies, University of Sur-
rey. She has investigated video-mediated interpreting (VMI) in court settings and helped
develop VMI standards and interpreter training through the EU-WEBPSI and EmpASR
projects. Recently, she was awarded a Leverhulme fellowship to explore machine inter-
preting in legal contexts.
Nicoletta Spinolo is an assistant professor in the Department of Interpreting and Transla-
tion, Bologna University. Her research interests include interpreter education, Italian–
Spanish interpreting, and interpreting technologies. With Agnieszka Chmiel, she was
a co-PI in the AIIC-funded ‘Inside the Virtual Booth’ project on the impact of remote
interpreting environments on interpreters’ experience and performance.
Cihan Ünlü is a researcher at İstanbul Yeni Yüzyıl University, Türkiye. His work focuses
on computer-assisted translation, interpreting technologies, machine translation, and
human–computer interaction. He is also pursuing a doctoral degree in translation stud-
ies at Boğaziçi University, Istanbul, where he specialises in interpreting technologies.
Camilla Warnicke is an associate professor, a certified interpreter of Swedish and Swedish
Sign Language, and a deaf-blind interpreter working at Stockholm University’s Institute
for Interpreting and Translation Studies and the Sign Language and Deafblind Inter-
preter Program at Fellingsbro Folk High School, Örebro. She is affiliated with Örebro Uni-
versity’s School of Behavioural, Social, and Legal Sciences.
ACKNOWLEDGMENTS
The editors extend their heartfelt thanks to the many contributors from around the world
who generously shared their invaluable insights, expertise, and perspectives. Their contribu-
tions have enriched this volume, ensuring its depth, broad relevance, and forward-looking
angle. Capturing a rapidly evolving topic is no easy task, yet our authors have engaged with
this challenge thoughtfully, helping to create a resource that reflects the dynamic nature of
the field. We are deeply grateful for their dedication and collaboration throughout this pro-
cess. We would also like to express our deep gratitude to the reviewers for their thorough
feedback, which has significantly shaped the quality of this volume and sparked an inspiring
exchange of ideas, even before its publication. A special thank-you goes to Megan Stock-
well, Education and Language Solutions Specialist at Megan Stockwell Language Solu-
tions, for her meticulous editing and proofreading, which ensured clarity and consistency
throughout the chapters. All remaining errors and inconsistencies are entirely our own.
INTRODUCTION
Elena Davitti, Tomasz Korybski and Sabine Braun
This book sets the ambitious goal of providing a comprehensive overview of the evolution,
application, and study of interpreting and technology in the AI era, covering various tech-
nologies regularly encountered in the field, the ways in which they have been integrated into
a variety of settings and workflows, as well as the issues arising as a result of this integra-
tion. Bringing together contributions from authors in 15 countries across five continents,
the volume addresses a wide range of methods, systems, and devices used both in interpret-
ing practice and research, while also engaging with emerging critical issues and debates. At
present, there is no comprehensive synthesis of interpreting and technology that addresses
all these areas concurrently.
In our increasingly technologised world, exploring the intersection between interpreting
practice, research, and technology is vital not only to capture the ways in which an increas-
ing range of digital technologies – from information and communication technologies and
platforms to data-driven language technologies, including AI – have altered real-time multi-
lingual and accessible communication and the interpreters’ tasks across different modes and
settings but also to understand the evolving trends that will continue to shape the interpret-
ing profession. While technology has played a role in interpreting since the 1920s, when it
paved the way for simultaneous interpreting, it has more recently started to truly permeate
the field, in many different forms and with more diversified uses and applications emerging
at a much faster pace than in the past. The rise of AI-powered language technology has
significantly accelerated this process, creating new possibilities for delivering interpreting
services across different settings and modalities via enhanced human–machine interaction,
and even going so far as to purportedly replace the human at the core of these practices.
In this respect, there has been a notable shift from technology developed specifically for
interpreting purposes (e.g. simultaneous interpreting consoles) to technology developed for
other purposes and subsequently adapted for interpreting. These include telephone and
videoconferencing systems, tablets, and portable equipment. Additionally, experimen-
tal efforts now aim to integrate AI-driven technology, such as automatic speech recogni-
tion and machine translation, within interpreting workflows not only to provide support
but also to introduce new hybrid practices of real-time speech-to-text/speech which were
previously unavailable. New research has begun to explore the possibilities afforded by
technology from different perspectives, thus developing new lines of enquiry and illustrat-
ing the expanding role of technology in professional contexts.
This technological upheaval has intensified the need to categorise and distinguish
between different types of technology based on their functions when applied to interpreting:
technology opening up new ways of delivering interpreting services, such as distance inter-
preting; technology performing an assistive role to the interpreter’s task, with ramifications
for the ways in which interpreters prepare for work and the quality of service rendered;
technology semi-automating interpreting workflows, enabling new hybrid modalities that
cross the boundaries between interpreting and other translation-related practices; but also
technology designed to replace human interpreters, which is gaining momentum even though
such systems are at varying stages of development.
New platforms and ‘solutions’ are continuously being developed and refined to address
the diverse needs for multilingual support in our globalised era. The rapid pace of these
developments challenges practitioners and industry stakeholders to keep up and adapt to an
ever-evolving landscape. Despite increased mutual collaboration, research is also struggling
to keep up with the pace of the industry, and perceptions related to the implementation,
use, and adaptation required by these solutions vary widely among different stakeholders.
Professional and international organisations are now developing guidelines and standards
to account for the shifting paradigm occasioned by the inclusion of technology in the work
of interpreters. Moreover, the close intersection between technology and interpreting is
leading to a shift in pedagogy, not only in terms of how training is delivered, but also in
the skills required of interpreters hoping to enter the profession. This dynamic environment
requires continuous learning, upskilling, flexibility, and engagement with new tools and
methodologies to ensure that interpreting practices remain relevant and effective amidst the
technological advances transforming the field.
At this juncture, it is crucial to reflect upon and reassess how technology is applied to and
integrated into professional interpreting practice and training as a contribution to securing
the profession’s long-term viability. Technological advancements have undoubtedly opened
up numerous possibilities, but they have also introduced significant challenges that must
be addressed. While halting technological progress is not feasible and outright opposition
would be outdated and shortsighted, the risks associated with the unmonitored adoption of
new technologies must be carefully evaluated. A balanced approach is required – one that
neither glorifies nor vilifies technology but instead carefully considers its affordances and
constraints. This nuanced perspective must weigh the pros and cons of technology within
specific contexts, ensuring that adoption is thoughtful, responsible and tailored to the needs
of various circumstances, avoiding blanket judgments.
A new approach is thus essential – one that integrates technology as part of a broader
solution for inclusive multilingual communication and that harnesses the benefits of tech-
nology while minimising its risks through ethical design. This is particularly true in the
context of (generative) language AI, where the focus must be on developing it safely to cre-
ate content that serves users with diverse linguistic, cultural, sensory, and cognitive needs.
The key ethical principles guiding this approach must include human-centric development,
inclusivity, fairness, accountability, sustainability, and transparency.
Building on these premises and considering the broad scope of enquiry into interpreting
technology and AI and their impact on the profession, there is now a pressing need for a
clear mapping of the current state of the art to gain a comprehensive overview of this rap-
idly evolving field. This handbook draws on literature in the field of interpreting and related
disciplines to synthesise current thinking and examine how technology alters, shapes, and
enables interpreting practice. The volume covers both spoken and signed language inter-
preting technologies, as well as technologies for language access and media accessibility,
highlighting overlapping aspects of research on these topics. The inclusion of authors from
various relevant backgrounds and specialisations, with many being both practising inter-
preters and academics, allows for in-depth insight into these technologies and surrounding
debates. The volume is organised into five parts that give space to both industry and
academic stakeholders, so as to cover the key arguments around the complex intersec-
tion between interpreting and technology.
Part I, ‘Technology-enabled interpreting’, is dedicated to a range of interpreting modali-
ties that, over time, have facilitated the delivery of interpreting services, including the
modalities now known as distance interpreting. Each of the chapters in this section presents
an overview of the design and development of the underlying technology, the contexts of its
use, and its applications, along with critical issues and emerging trends. These chapters aim
to summarise current interdisciplinary research on each topic while identifying potential
areas for further enquiry. Chapter 1, by Lázaro Gutiérrez, explores telephone interpreting
as both a professional practice and a research area. It examines the evolution of the service,
highlighting its benefits and challenges for interpreters and users, and addresses some key
research issues in this field as well as future prospects driven by technological advances.
In Chapter 2, on video-mediated interpreting, Braun traces the evolution of this distance
interpreting modality and its current applications across different settings, exploring key
research topics in VMI, such as interpreting quality and interactional aspects, interpreter
and user perceptions, human factors, and working conditions. The chapter also discusses
opportunities from integrating AI-powered tools into VMI platforms, the role of audio-
visual communication technology in interpreter education, and training interpreters specifi-
cally for VMI. Chmiel and Spinolo address remote simultaneous interpreting in Chapter 3,
highlighting the shift to platform-based interpreting and related interface design issues;
discuss key issues, including sound quality, cognitive load, stress, teamwork, and multimo-
dality; and conclude by examining future possibilities through the lens of recent AI-related
developments. In Chapter 4, Warnicke explores video relay service, a bimodal interpreting
modality that enables interpreting between a person who uses a signed language via video
link and a speaking participant via telephone. This modality relies on technological devices
such as videophones and telephones to enable and shape the interaction. The chapter pro-
vides an overview of the service and its regulatory provisions and discusses the main implica-
tions of its use. The remaining chapters in Part I address further technologies that
have been used and/or adapted to enable the delivery of interpreting services in different
ways, such as in tour guide systems, digital pens and tablets for SimConsec, and speech
recognition technology for SightConsec. In Chapter 5, Korybski explores the evolution and
application of portable interpreting equipment over the past century, tracing the techno-
logical advancements that have shaped this technology and highlighting significant mile-
stones and innovations. Building on existing research in the area, the chapter then examines
the primary contexts and modalities in which portable interpreting equipment is utilised,
while also addressing the inherent limitations, including technical, acoustic, and ethical
constraints, and other user accessibility issues. Additionally, the chapter explores the role
of portable interpreting equipment in the training of novice interpreters and offers a look
forward into future application contexts in a highly technologised environment. Chapter 6,
contributed by Ünlü, focuses on technology-enabled consecutive interpreting driven by
advancements in speech technologies, computing power and hardware, and generative AI.
The chapter addresses the technologisation of consecutive interpreting by exploring the
development and implementation of computer-assisted interpreting tools tailored for this
mode and the functionalities and impact of hybrid modalities, digital pen–supported tools,
and automatic speech recognition–assisted solutions. Concluding this part of the volume,
Chapter 7 is devoted to tablet interpreting, a relatively new modality that has come to the
fore with the increasingly widespread adoption of mobile technological devices and the dig-
italisation of most interpreting-related processes and workflows. Saina explores the use of
tablets as a substitute for personal computers and laptops, both in interpreter preparation
and during interpreting assignments, pointing out the unstructured deployment and diver-
sified usage by practitioners. Building on the limited work on new and hybrid interpreting
modalities enabled by using tablets, he reports on early experiences of tablet interpreting
in professional practice and interpreter training and outlines possible future directions in
tablet interpreting research.
Part II, ‘Technology and interpreter training’, addresses technologies designed to sup-
port some aspects of the interpreting task with a view to ensuring quality. In Chapter 8,
Prandi provides an overview of computer-assisted interpreting
(CAI) tools and of training in their use, focusing on their evolution, application, and impact
on interpreter performance and training. To orient future investigations, the chapter scruti-
nises the existing body of research, spotlighting key enquiries and empirical approaches and
examining the impact of tool use on interpreters’ performance and cognitive processes, as
well as questions of system performance and usability in the context of the recent advances
in AI. Chapter 9 shifts the focus to the use of digital pens for interpreter training. Orlando
discusses how this technology, originally investigated, trialled, and recommended for use
in various fields of education since the early 2000s, made its appearance in interpreting
training only from 2010, particularly in the area of note-taking for consecutive interpret-
ing. The chapter reviews training initiatives undertaken on digital pens in interpreter edu-
cation and discusses the relevance of this technology in relation to more recent tools and
systems. In Chapter 10, Amato, Russo, Carioli, and Spinolo address technology for train-
ing in conference interpreting, providing an overview of the development and applications
of computer-assisted interpreter training (CAIT) tools from the late 1990s to today. The
authors highlight how such tools are used by trainers and how they assist trainees when
practising and honing their skills, as well as their perceived user-friendliness. In light of
the impact that information and communication technologies and AI have on interpreting
trainees, this chapter concludes by emphasising the need to include training in CAI and
CAIT tools, alongside soft skills training, in interpreting curricula.
Part III of the volume, ‘Technology for (semi-)automating interpreting workflows’,
focuses on increasingly automated solutions to support multilingual communication in real
time. This part specifically examines the intersection of interpreting with AI, natural lan-
guage processing, and (neural) machine translation. It first covers hybrid workflows for
real-time speech-to-text communication that rely on varying levels of human–machine inter-
action and explores fully automated machine interpreting, often termed ‘speech translation’
in other fields. In Chapter 11, on technology for hybrid modalities, Davitti examines the
transformative impact of AI-driven technology on interpreting-related workflows, focusing
on the emergence of practices combining speech recognition and machine translation. The
chapter highlights the high demand for real-time speech-to-text interlingual services and the
need to reconceptualise traditional interpreting practices accordingly. It then provides an
overview of five key workflows representing new forms of human–AI interaction (HAII),
exploring the collaborative dynamics at play, the need for new skill sets, and the chal-
lenges of ensuring accuracy and reliability, particularly in high-stakes scenarios. Despite
the scarcity of comparative studies on these workflows, the chapter identifies and critically
reviews current research themes and challenges, including opportunities for upskilling lan-
guage professionals to expand their service offerings. In Chapter 12, Fantinuoli focuses on
machine interpreting as a form of automatic speech translation that has the potential to
overcome language barriers in real time but presents a number of challenges, including the
risks associated with providing real-time language mediation without human experts in the
loop and questions of responsible and ethical use. After exploring its evolution, challenges,
and potential future applications, the chapter discusses key issues and explores relevant
technological approaches, while also addressing the ethical questions that arise from the
development of artificial interpreting systems.
After grouping specific technologies for interpreting according to their main functions
and addressing each of them individually (Parts I, II, and III), the last two parts of the
volume adopt a different approach. Part IV, ‘Technology in professional interpreting set-
tings’, takes as a point of departure specific contexts in which interpreters regularly work
and the role that technologies play in these settings. Each of these settings has been an
area of enquiry in interpreting studies, but the structure and approach to these chapters
emphasise the ways in which technology has altered the work of the interpreter, and the
outcomes of its implementation in the respective setting. Interpreting in medical or legal
settings, for instance, regularly relies on several of the technologies presented in Parts I and
II and (more recently) III of the volume. Chapters in Part IV allow authors to synthesise
scholarship relative to interpreting technologies in the specific domain or setting to pre-
sent a comprehensive overview of their use, contexts of application, current practices, and
issues specific to these domains. These chapters help situate technology squarely within the
interpreter’s work and integrate the discussion of technology and interpreting rather than
treating each in isolation. Part IV starts with Chapter 13 on conference settings, where
Seeber argues that conference interpreting, given its status and wide recognition as a highly
professionalised practice, has also been the test bed for many new technological develop-
ments. In particular, the high expectations in terms of accuracy and completeness, as well
as the seemingly ever-increasing speed and density of statements delivered at international
conferences, have fostered the integration of technology in this setting, with a view to com-
pensating for the limitations of the human processor. This chapter considers how different
technologies have been developed, introduced, and used in conference interpreting settings
and what part of the process they are likely to have impacted. Chapter 14 shifts the focus
to healthcare settings, which are increasingly being influenced by technology, ranging from
distance interpreting via video link and telephone to machine translation. Yet whereas dis-
tance interpreting in this setting has already been explored to some extent, the impact of
the most recent technological advancements on the quality and organisation of healthcare
remains largely unexplored. In this chapter, De Boe presents examples of various types of
technology-based healthcare communication, discussing current practices and issues associ-
ated with their use, as identified through empirical research, and concluding with a brief
outlook for near-future developments and directions for research, calling for a greater focus
on users and heightened attention to ethical issues raised regarding the use of technol-
ogy in healthcare communication. In Chapter 15, on legal settings, Devaux discusses how
audio- and video-mediated interpreting has altered the way multilingual legal proceedings
are conducted. More specifically, it examines the effect technology has on the legal inter-
preter’s working environment, the interpreting process, and the interpreter’s training. To
this end, it goes beyond distance interpreting by discussing other emerging technologies,
from computer-assisted interpreting tools to machine interpreting. By shedding light on
the transformative role of technology in legal interpreting, this chapter provides a founda-
tion for understanding the current state of technology-driven changes and raises considera-
tions for the future of the legal interpreting profession. Finally for this section, Chapter 16
explores the multifaceted world of immigration, asylum, and refugee settings. Singureanu
and Braun provide a comprehensive overview of current practice and research relating to
the use of distance interpreting modalities during asylum interviews, health assessment of
refugees and reception centres, as well as of existing yet currently limited and fragmented
guidelines and training for these practices. In addition, it outlines the emerging uses of other
technologies, such as crowdsourcing platforms and automated services, in such sensitive
and high-stakes contexts.
Part V, ‘Current issues and debates’, concludes the volume by covering a comprehen-
sive range of topics that have been investigated in relation to interpreting and technology.
Each chapter focuses on one of these topics, discussing it in relation to one or more of the
technologies presented in previous parts, based on existing debates and studies. Chapters
in this part are split between chapters exploring theoretical constructs that are at the core
of debates around interpreting and technology (Chapters 17–19) and chapters address-
ing issues related to the profession (Chapters 20–22). Each contribution reviews relevant
theoretical backgrounds in addition to the current literature on the topic in interpreting
studies, addressing how technology has enabled and/or altered our understanding and
approach to these topics in the practice and theory of interpreting, followed by a discus-
sion of critical issues and emerging trends. To start with, Chapter 17, on quality-related
aspects, reviews this central concept within interpreting studies and its crucial relevance
for building a deeper understanding of the influence of different technology-related modali-
ties of interpreting on current and future interpreting practice. In addition, it considers the
uses of technology (automated measures) in the process of interpreting quality assessment
itself. Davitti, Korybski, Orăsan, and Braun synthesise how quality can be evaluated and
measured and consider the added complexity brought by technology. The chapter demon-
strates how technology integration in different interpreting practices makes the process of
assessing quality even more intricate, yet increasingly needed, while also outlining the ben-
efits and drawbacks of technology-driven quality assessment. Chapter 18 explores ethical
aspects in relation to the use and impact of interpreting technologies. Giustini focuses on
this interrelation between the use and impact of technologies, as a matter of ‘technoethics’,
highlighting how the technological tools available in the interpreting profession and the
industry are connected to moral and socio-economic issues, such as employment, working
conditions, and potential automation and substitution of human labour; confidentiality,
training, and corporate ownership of data; bias and linguistic diversity; and technology
use in crisis-prone interpreting settings. It concludes by arguing that while the usefulness
and viability of interpreting technologies should not be negated, there is a necessity for
increased awareness, inclusive guidelines, regulation, and stakeholder collaboration to pro-
mote their fair and ethical deployment. In Chapter 19, Mellinger presents an overview
of current scholarship on cognitive aspects that has been conducted at the intersection of
technology and interpreting, highlighting studies that have explored various aspects of inter-
preter cognition as they relate to technology use in the practice and process of interpreting.
The chapter also reviews several key topics associated with technology use, including cog-
nitive load and cognitive effort, cognitive ergonomics, and human–computer interaction,
as well as more situated and contextualised approaches to interpreter cognition. The chap-
ter concludes with a brief discussion of open questions in the field related to interpreting
technologies and cognition – namely, big data and interpreting ethics. Chapter 20 shifts
the focus onto international and professional standards, discussing what has been done in
relation to different modalities to regulate the growing intersection between interpreting
and technology, and what still needs to be addressed. In this chapter, Pérez Guarnieri and
Ghinos address the question of why standards are essential in the field of interpreting,
exploring the evolution and significance of interpreting and related technology standards
in the context of cross-cultural communication. The chapter outlines the historical roots of
these standards, their adaptability across various specialisations, and the numerous benefits
they bring to the interpreting profession. It highlights the crucial role these benchmarks
play in enhancing service quality, protecting interpreters and users, fostering professional
development and trust, and offering valuable insights for individuals and organisations
involved in interpreting communicative events. Additionally, it provides insights into the
development of specific ISO standards, including how standards can contribute to quality
in distance interpreting. Chapter 21 is devoted to workflows and working models. Rütten
explores the impact of technology on the workflow and working models of interpreting, with
a particular focus on conference interpreting. She argues that technological advances have
improved efficiency but increased interpreters’ cognitive load through ‘simultanification’ –
performing multiple tasks simultaneously. While technology simplifies data access and
knowledge acquisition, it also risks overreliance and information overload. Moreover,
while it enhances access to semiotic information, it may distance interpreters from the
communicative contexts in which they work. Additionally, technology blurs traditional
task boundaries, allowing interpreters to handle pre- and post-task activities during the
assignment itself. From a business perspective, it improves client matching and introduces
micropayments and more technical subjects. The volume concludes with Chapter 22, where
Figiel discusses the intersection between ergonomics and accessibility in an evolving context
of technologies for (conference) interpreting. The chapter defines the notions of ergonom-
ics, user experience, usability, and accessibility and applies these to the discussion of both
historical and current issues relating to interpreter workstations and workflow. The chap-
ter places special emphasis on the accessibility of the solutions and practices discussed,
examining the impact that the rise of distance interpreting has on ergonomics, focusing on
areas such as cognitive load, acoustic challenges, and issues with the interfaces of simul-
taneous interpreting delivery platforms. It also covers aspects of ergonomics related to
speech-to-text interpreting and interpreter training.
The volume addresses several types of readership. The first comprises students and researchers
interested in interpreting and, more broadly, in technologies for multilingual communica-
tion access. There are indeed several university courses that include an interpreting and
technology module within their offer, which is testament to the growing importance of
this intersection within the broader field of interpreting studies. This handbook can serve
as a complement to existing handbooks and encyclopaedias on interpreting studies and as
the go-to reference on interpreting technology. The grouping of chapters into five parts,
with symmetrical chapter structures within each part as far as possible, makes the volume
approachable and readable, and the two-pronged approach – that is, presenting both tech-
nology as the specific object of study and the ways in which it has enabled and shaped the
interpreting task, process, and workflow, with its socio-economic implications – makes it
a useful resource for students and researchers of interpreting alike. While advanced under-
graduate and graduate students can use the volume to become familiar with the scholarship
on the topic, researchers can also use it as a starting point to examine interpreting technol-
ogy and its impact on a range of topics. A second, yet equally important, readership comprises
professional interpreters. There is a growing number of professional interpreters interested
in the use and development of technologies for interpreting. For instance, there are working
groups dedicated specifically to the development of international standards on interpreting
technology in a variety of settings, interest groups, and divisions in professional organisa-
tions, as well as a number of podcasts, social media groups, and feeds (e.g. #terptech) dedi-
cated to the topic. As such, professionals working in the field and the agencies who engage
with them will likely find the volume to be of interest. In some respects, this volume could
help bridge the professional/academic divide that often limits professional engagement in
academic discourse.
We hope this comprehensive overview of the intersection between interpreting and tech-
nology in the AI era serves as a valuable resource for anyone interested in this evolving field,
to either deepen their knowledge or approach it for the first time. While it inevitably offers
only a snapshot of the current landscape, which is in a constant state of flux, it highlights
key developments, issues, and ongoing debates. By presenting a range of perspectives, we
encourage readers to form their own informed opinions on different aspects of the complex
relationship between interpreting and technology. Despite concerns about AI’s potential
to replace interpreting, this volume highlights the complex nature of their relationship,
emphasising the need for varied and nuanced solutions based on context and the careful
consideration of different factors. It underscores both the opportunities and limitations
of technology in interpreting. Ultimately, we hope readers find the volume insightful and
enjoyable, while recognising that despite the changes brought about by technology, inter-
preting plays and will continue to play a crucial role in facilitating real-time multilingual
communication across all areas of life.
PART I
Technology-enabled interpreting
1
TELEPHONE INTERPRETING
Raquel Lázaro Gutiérrez
1.1 Introduction
Telephone interpreting (TI) refers to the interpretation, over the phone, of spontaneous speech
produced by two or more speakers who use two different languages. It is a modality of remote or
distance interpreting that takes place in consecutive or dialogue mode (Ruiz Mezcua, 2018)
and is popular in public and private service provision (Lázaro Gutiérrez and Nevado Llo-
pis, 2022). In relation to Braun’s (2015, 2019) taxonomy of modes of interaction between
technology and interpreting, TI belongs to technology-mediated interpreting, which refers
to technologies used to deliver remote or distance interpreting services.
According to Fantinuoli (2018, 4), remote interpreting is ‘a broad concept which is com-
monly used to refer to forms of interpreter-mediated communication delivered by means
of information and communication technology’. Similar definitions have been provided by
Braun (2019, 2024), using the more recent term ‘distance interpreting’. Telephone inter-
preting, video-mediated interpreting (see Braun, this volume), and remote simultaneous
interpreting (see Chmiel and Spinolo, this volume) all fall under this umbrella term. Inter-
preting delivered over video link is gaining momentum as technology continues to progress
(Lázaro Gutiérrez and Nevado Llopis, 2022). These advances are particularly evident in
conference settings with remote simultaneous interpreting. However, telephone interpret-
ing remains the most popular modality of distance interpreting in service provision settings,
with significant investment made in this area (Hickey and Hynes, 2023), particularly after
the COVID-19 pandemic.
In terms of participant configuration (or ‘constellation’, following Pöchhacker, 2020),
telephone interpreting occurs when any of the participants involved in the interaction
(including the interpreter) connects through audio link. The most common situations
include a remote interpreter working with an on-site service provider and an end user, or a
three-way call in which all the participants connect via audio link (Rosenberg, 2007). The
latter configuration has also been termed ‘teleconference interpreting’ (Braun, 2019, 272).
A less-typical constellation is described by Spinolo et al. (2018), where the service provider
is in the same location as the interpreter and contacts the end user via audio link. To these,
a configuration that has received little attention in research to date ought to be added: when
end users are in the same location as the interpreter and connect to service providers via
audio link. Despite being under-researched and potentially uncommon in formal settings
(W. Zhang et al., 2024), this configuration is more frequent in informal interpreting con-
texts. It can be illustrated by end users bringing their own interpreter, who, in these cases, is usu-
ally a relative or a friend of theirs. Typically, these informal interpreter-mediated encounters
occur in the frame of service provision over the phone, for instance, when an emergency
service is called or when appointments for different public services (such as a medical con-
sultation) are arranged telephonically.
This chapter starts with a conceptualisation of TI. Section 1.2 provides definitions and
refers to the evolution of TI services in the market. Section 1.3 is devoted to the current practice
of and research into TI, including its characteristics, advantages, and challenges for profes-
sionals (interpreters) and users. Section 1.4 deals with future avenues for TI. This includes
aspects related to ergonomics, cognition, and working conditions, as well as the peculiari-
ties of human–machine interaction and the move towards automation.
1.2 Evolution
TI is a modality of distance interpreting often used in bilateral or dialogue interpreter–
mediated interaction. This has been explored in research on public service interpreting and
business interpreting settings (Lázaro Gutiérrez and Nevado Llopis, 2022). Another name
for telephone interpreting is ‘over-the-phone interpreting’, or OPI. Typically, a telephone
interpreter–mediated interaction in a service provision setting will include three partici-
pants: the interpreter, a service provider, and an end user. Even though telephone interpret-
ing could simply be defined as ‘bilateral interpretation over the phone’ (Andres and Falk,
2009), it has been widely acknowledged that it also possesses characteristics that go beyond
bilateral interpreting (González Rodríguez, 2017, 2020).
The shift towards remote delivery is a disruptive change in interpreting, and human
beings are often resistant to change. However, this is not the first time the interpreting pro-
fession has experienced such technological disruption. One of the most significant changes
was the introduction of simultaneous interpreting and the use of electro-acoustic technol-
ogy (Baigorri Jalón, 2014). This new mode of interpreting was initially called ‘telephonic’,
as it created distance between interpreters and end users, relying solely on oral communi-
cation and limiting visual cues. The technological shift provoked by the introduction of
electro-acoustic technology might also be at the core of ongoing debates about the visibility
of interpreters (Pöchhacker, 2020).
In fact, the etymology of ‘telephone’ refers to the transmission of sounds over a dis-
tance. In recent years, there have been attempts to complement or even substitute TI with
video-mediated interpreting (see, for example, Lázaro Gutiérrez and Nevado Llopis (2022)
for a discussion of this trend), which can be seen as a natural evolution from TI. For
instance, despite the differences between the two modalities, they share many elements,
including the transmission of sound across a distance. Incidentally, this is reflected in the
more recent umbrella term ‘distance’ interpreting. Spinolo (2022) describes distance inter-
preting (including TI and video-mediated modalities) as a phenomenon that was accel-
erated by the pandemic. Even now that pandemic social restrictions have been relaxed,
demand for distance interpreting is still growing, although at a slower pace. Technological
improvements, such as the availability of higher bandwidth in public services, have made it
easier not only to implement TI more widely but also to complement it with video-mediated
interpreting when needed (Spinolo, 2022). This is also due to a disruptive change in work
practices in general, which favours telecommuting across sectors, as well as to changes in
social behaviour, with clients and users preferring to obtain services over the phone rather
than through on-site visits and interactions. Many telephone interpreters
started their careers as on-site interpreters. Accepting remote assignments was something
many telephone interpreters felt obliged to do because of changing market conditions. The
COVID-19 pandemic accelerated this process and transformed not only the interpreting
industry but also the way in which service provision occurs. Nowadays, some interpreters
have greater expertise in telephone assignments than in on-site assignments.
For sign language interpreting, telephones alone are not sufficient to meet all users’ needs,
and video-mediated interpreting has brought about real change. However, in the early days
of remote sign language interpreting, Deaf individuals wishing to communicate with hear-
ing people at a distance used teletypewriters (TTY) or a TDD (telecommunications device
for the deaf), also known as ‘text telephones’. These devices functioned like a telegraph,
transmitting signals from one device to another using the phone line. TDDs could display
text on a screen or print out written messages. Public services (mostly healthcare systems)
offered a human relay for those who did not own a TDD. In this system, the message was
communicated orally over the phone to an operator, who would codify (write) the message
on the TDD before sending it to the Deaf user.
More recently, market analysts (Hickey and Hynes, 2023) have highlighted two differ-
ent phenomena that influence the interpreting industry. On the one hand, their data points
to interpreters’ perception of having suffered poor working conditions in public service
settings for decades. The move to TI could be considered the final nail in the coffin for
interpreters. This may even lead to an abandonment of the profession altogether. On the
other hand, this situation also opens the way for younger interpreters, who ‘flourished during the
pandemic and now prefer remote assignments over on-site ones’ (Hickey and Hynes, 2023).
This explains the persistent shortage of talent in on-site public service interpreting, which has
intensified during this decade.
However, disadvantages have also been described. These include increased difficulties
in coordinating talk (Wadensjö, 1999; Hsieh, 2006; Oviatt and Cohen, 1992), technical
issues related to sound quality and the connectivity of the line (Kelly, 2008; Lee, 2007), an
inadequate use of the technology by main speakers (Lázaro Gutiérrez, 2021), lack of brief-
ing (Lee, 2007), and scarce specialisation (Heh and Qian, 1997; Gracia-García, 2002). This
section focuses on two of the main, most often cited challenges of telephone interpreting:
the lack of visual context (multimodal input) and the coordination of discourse, as well as
on research around quality.
These two challenges are highly intertwined. Coordinating discourse, already complex
in on-site bilateral interpreting, becomes even more difficult over the phone. In this modal-
ity, the lack of visual cues further constrains turn-taking, adding to the complexity. Turn
exchange is normally performed unconsciously by primary speakers in monolingual inter-
actions, and its timing and orchestration rely on non-verbal information, such as
intonation or body language (Drew and Heritage, 2006). These non-verbal elements are
inaccessible during telephone-interpreted interactions. Training in discourse coordination
abilities has been acknowledged as essential (Fernández Pérez, 2015). In fact, this has
been the focus of research, although, at times, simulated conversations are used instead of
authentic data (De Boe, 2023).
The lack of visual context is not always perceived as a burden that impairs interpreters’
performance and communication. For example, if the sound conditions are appropri-
ate, interpreters can deploy strategies to obtain information from paralanguage (tone of
voice, breathing patterns, inflection, pitch, and volume) (Ko, 2006; Kelly, 2008; Crezee,
2013; Cheng, 2015). Since contextualisation at the beginning of the interaction is key,
telephone interpreters have been reported to interact with main speakers in order to obtain
key information that can help them better understand the scenario. This can be consid-
ered as ‘leaving aside the traditional invisible role’ of the interpreter (Lázaro Gutiérrez and
Cabrera Méndez, 2019).
Other advantages of TI include avoiding visual distractors, such as
non-verbal language (Mikkelson, 2003; Lee, 2007) or unpleasant input (the sight of blood
or injuries). Distance also keeps interpreters away from unpleasant smells or possible risks
or harm (e.g. contagion in cases of medical interpreting, or aggression in conflict situations)
(Lázaro Gutiérrez, 2021). Some interpreters have reported an increased feeling of neutral-
ity when performing over the phone (Lee, 2007) and reduced interference with patients’
privacy (Lázaro Gutiérrez, 2021).
In order to overcome the most important constraints brought about by the lack of visual
context without losing the advantages of telephone interpreting, video-mediated interpret-
ing is seen as the most logical, convenient solution. For example, Spinolo (2022, 7) points
to the appearance of ‘video-mediated dialogue interpreting’ as an alternative to (or comple-
ment for) telephone interpreting in service provision settings. However, although technol-
ogy advances rapidly and users progressively develop technological competences, many
organisations are still not equipped to incorporate video-mediated interpreting into their
procedures. This could be due to fixed institutional practices, financial reasons, or the fact
that telephones are, undoubtedly, more accessible and portable than other devices.
One could argue that videoconferences can also be carried out via smartphone. However,
if video-mediated interpreting were to be used instead of TI, with the aim of having access
to visual information, it should be kept in mind that smartphones do not offer the required
technical characteristics, in terms of screen size or video quality. In fact, Moser-Mercer
(2005) stated that it was essential for interpreters’ visual needs to be addressed in order
to achieve proper working conditions that would, in turn, lead to quality interpreting per-
formance. Additionally, Sultanic (2022) mentioned the reduced sizes of screens and poor
lighting as challenges for remote interpreters in videoconferences caused by smartphones.
These challenges result in rendering non-verbal cues barely apparent, which, in turn, leads
to misunderstanding.
In any case, according to Hickey and Hynes (2023), and leaving out pandemic figures,
which reflect a dramatic increase in telephone and video-mediated interpreting relative to on-site
interpreting, predictions for the years to come indicate a decrease of between 2 and 3% for TI,
in favour of video-mediated and remote simultaneous interpreting. As technology advances
and society at large gets more used to it, remote modalities of interpreting will include more
and more input for interpreters. This, in turn, implies offering a multimedia and multimodal
set-up, and video is therefore the next logical step to complement audio input.
is motivated by the distance between the interpreters and the main participants in the inter-
action (who may also be in separate locations from one another). In the case of telephone
interpreting, communication management is even more challenging due to the lack of visual
information, as outlined previously, which prevents interpreters from ascertaining interac-
tional cues from body language (Peng et al., 2023).
When referring to instances of remote communication, such as remote interpreting, Spi-
nolo (2022) also points to a dual socio-cognitive element: the feelings of presence and
alienation. Traditional views of the interpreter's role emphasise invisibility, complete neutrality and impartiality, and the translation of everything that is said. However, in reality,
interpreters interact with the parties they interpret for and build rapport (De Boe, 2020)
throughout the conversation. Rapport is necessary in human interaction and facilitates
communication. However, several authors have pointed to difficulties in building rapport
for remote interpreters (including telephone interpreters). This can lead to an increased feel-
ing of alienation, in comparison to on-site interpreters (Moser-Mercer, 2005; Mouzourakis,
2006; Price et al., 2012).
Comparative studies on the characteristics of interpreter-mediated discourse have
revealed that the interpreter’s turn-taking and the coordination work are more complex
during telephone interpreting compared to on-site interpreting (Sultanic, 2022). For
instance, Wadensjö (1999) noticed the increased presence of overlaps and interruptions
over the phone. Furthermore, Braun (2015) stated that overlapping, which is more fre-
quent in remote modalities, may lead to incomplete renditions. Braun and Davitti (2020)
also studied on-site, telephone, and video-interpreted interactions to create categories that
allow us to analyse remote modalities. With a pedagogical aim, they focused on turn-taking
(managing conversation openings, turn shifts, closings) among other challenging aspects
for interpreters.
Whatever the modality of dialogue interpreting, turn-taking is always challenging. For
instance, participants may feel impatient and start talking before the interpreter finishes
their rendition. Similar challenges occur when personal or cultural turn-taking patterns
or practices differ between participants in the interaction. Likewise, the main speaker(s)
may not always allow interpreters to provide their interpreted rendition because they have
already understood the message themselves. Other challenges occur in remote modalities,
both in video and TI. Of particular interest are three-way calls, especially when end users
are located in a noisy or uncomfortable environment. As seen in Section 1.3.1, video inter-
preting seems to be a suitable solution to minimise the impact of the lack of visual context,
although this does not entirely resolve this issue. The same turn-taking issues also occur
with video interpreting and are particularly challenging when the interpreter has to manage
both visual context and turn-taking. Sultanic (2022) surveyed remote (video) interpreters
about the causes of difficulties in turn-taking and compared their remarks to TI. In her
study, ‘[b]oth providers and interpreters reported that the ability to clearly see the par-
ticipants on video made it easier to engage in dialogue and anticipate each speaker’s turn’
(p. 93). However, video-mediated interpreting cannot solve all the turn-taking issues that
appear in TI. It can also add challenges, such as technical issues related to lags or screens
freezing.
The use of the third person instead of the first is also reported as occurring more fre-
quently in TI. This has the effect of making interpreters more visible and active in the
coordination of turns. Several authors have also identified more frequent intervention by
interpreters to avoid miscommunication and overlap (Lee, 2007; Oviatt and Cohen, 1992).
In fact, Oviatt and Cohen (1992) and Kelly (2008) suggest that this shift in the use of personal pronouns may be due to the lack of visual cues. Similarly, in a survey-based study, Wang (2021) found that, despite a tendency to use the first person, telephone interpreters feel they have to switch to the third person to meet clients' needs and to compensate for clients' lack of experience in communicating through an interpreter. Clients were also reported
to have shifted between second- and third-person pronouns when addressing their inter-
locutors, thus further complicating the task of telephone interpreters.
The coordination role of dialogue interpreters can at times be considered as a step away
from the classical role of an ‘invisible’ interpreter. In remote assignments (as highlighted in
Section 1.3.1 and the beginning of Section 1.3.2), this coordination role goes even further.
Sultanic (2022) refers to instances in which interpreters take on the role of ‘technicians’
and give instructions to the primary speakers about how to use technology in remote com-
munication. These results were due to the fact that the interpreters in Sultanic’s study had
more experience in the use of the technology and therefore found themselves in the position
of advice-provider to the main speakers (particularly to patients) about how to use it effec-
tively. Interestingly enough, in this same study, Sultanic also found that some interpreters
felt the need to provide more cultural explanations when interpreting remotely than they would on-site.
(Stengers et al., 2023). This range of technology can cause stress and anxiety to individual
interpreters and deeply concern professional associations (Lázaro Gutiérrez, 2021).
Nevertheless, several authors have attested to a slower introduction of technology within
interpreting compared with translation and mainly refer to ‘computer-assisted interpreting’
(CAI) tools (Tripepi-Winteringham, 2010; Drechsel, 2015; Moser-Mercer,
2015; Fantinuoli, 2017 – see also Prandi, this volume). That being said, X. Zhang et al.
(2023) state that the COVID-19 pandemic accelerated the digitalisation of interpreting and
note that ‘radical changes’ occurred in less than one year. They highlight that this amount
of change would have most ‘probably taken five to six years, under normal conditions’.
Besides this crisis-provoked evolution in interpreting services, the development of technology to support interpreting has advanced largely thanks to academic research; as Fantinuoli (2023) states, this could be due to the limited size of the interpreting industry, which is oriented towards a relatively small number of users. For example, while TI has not been the main focus of CAI development, current projects, such as PRAGMACOR,2 include 'designing CAI tools adapted to TI needs' within their objectives. Although these tools remain in development, telephone interpreters can already use the CAI tools developed for conference interpreting to prepare their own assignments, since the preparation process is similar to that of conference interpreters (Stengers et al., 2023).
1.4.2 Training
An array of training has been developed for telephone interpreters over recent decades.
Examples include numerous research projects, such as the SHIFT in Orality project (Amato et al., 2018), which placed remote interpreting at the core of its research objectives.
As a result, training in telephone interpreting has become more common and more stand-
ardised. This has allowed telephone interpreters to increase their skill set and move towards
video remote interpreting or other remote simultaneous modalities.
Training is acknowledged as being the ‘main pillar’ for quality in TI services (Kelly,
2008). Nevertheless, early research on TI in Spain criticised the scarcity of TI-specific training and guidelines, pointing to the lack of courses and educational resources for interpreters, as well as of clear protocols and guidelines for service providers (García Luque, 2009; Murgu and Jiménez, 2011; Prieto, 2008; Martínez-Gómez, 2008). Outside of Spain, Verrept, in Belgium (2011), and Ozolins, in Australia (2011), also identified a need for interpreters to have access to supplementary training in order to 'make adequate use of equipment', 'solve technological issues', and 'improve performance'. Taking a different approach, Hlavac (2013) also alludes to this, suggesting that TI be included in training programmes for interpreters.
Despite researchers widely acknowledging the lack of guidelines, Kelly's (2008) guide represents a remarkable example of training material for telephone interpreters. It includes workplace guidelines, a full chapter on ethics, and scenar-
ios for practice. Similarly, in Spain, Fernández Pérez also designed and published training
activities. These were based on both role-plays (2015) and the particular skills telephone
interpreters need to acquire to fulfil their main tasks: discourse coordination and transla-
tion (2012). The TI characteristics that Fernández Pérez identified include lack of visual
information, increased access to interpreters in a short period of time for a larger number
of users from different backgrounds, and use of technical equipment. These characteristics
originate from a study she carried out to identify and classify the features of TI.
In general, training in TI is provided by TI companies, often in cooperation with universi-
ties. TI companies tend to visit universities and provide training in the form of seminars and
workshops or as part of initial and ongoing training for their workers (Lázaro Gutiérrez,
2021). Training tends to focus on the use of specific protocols, basic information about the
field in which work will be conducted (e.g. healthcare, road assistance, social welfare), ethi-
cal issues, and technology.
of under-remuneration. Telephone interpreters are paid for the minutes worked, rather than for completed assignments or working days (although some TI companies do establish a minimum payment for an interpreted telephone call).
The question of ergonomics has recently been investigated by scholars in relation to dis-
tance interpreting. Although these studies focus mainly on simultaneous conference interpreting, their results are also applicable to TI because they refer to aspects that concern remoteness and telework. For instance, Ziegler and Gigliobianco (2018) and Spinolo (2022) suggest
that an interpreter’s workstation should mimic on-site conditions and be in a quiet location.
They say that interpreters should use good-quality headphones and microphones and work
at a desk to allow for comfortable note-taking and consultation of sources. Scholarly work
has also highlighted the importance of including interpreters in the design of user-centred
technologies (Mellinger, 2023). Current studies aiming at the development of CAI tools
for telephone interpreters, such as those conducted by the group FITISPos-UAH, use two
separate cohorts of interpreters in the research (students and professionals) and include
ergonomic testing and acceptance analysis.
Human–machine interaction and cognitive ergonomics (O’Brien, 2012) also contrib-
ute to the literature regarding remote interpreters’ working conditions. After acknowledg-
ing that distance interpreting increases cognitive load and leads to screen fatigue, stress,
isolation, and alienation, Liu (2022) reflects on the need for further training for remote
interpreters. They highlight the need for interpreters to know how to use remote interpret-
ing platforms, how to set up their home office, how to communicate efficiently with rel-
evant stakeholders, and how to fight for good working conditions. Remote interpreters are
expected to perform effectively, despite exposure to sudden, loud noises (which can cause
acoustic shocks), despite difficulties in collaborating with colleague interpreters, and with
limited contextual information about an assignment. Training is therefore deemed essential,
but many of the trainers themselves have limited experience of working remotely. Fur-
thermore, many have negative views of distance interpreting and see it as a provisional contingency for times of crisis only. This may be one of the reasons that TI companies tend to provide TI training themselves, either via university workshops as extra-curricular activities, as training for new interpreter colleagues, or as part of the selection and recruitment
process (Lázaro Gutiérrez and Cabrera Méndez, 2021b).
Besides aspects related to working conditions and training, the human–machine inter-
play in interpreting has also been examined in order to account for the relationship that the
different actors in interpreter-mediated interactions establish with technology (Mellinger
and Pokorn, 2018). The focus of studies has been on the way interpreters ‘use’ technology.
This includes examining interpreters’ approaches and attitudes towards the present and
potential future use of technology (Mellinger and Hanson, 2018; Stengers et al., 2023).
Mellinger (2019) focuses on how the use of technology can influence the cognitive processes
that underpin interpreting. They also draw on new research methods, such as close discourse analysis, (automated) corpus analysis of interpreters' performance, and eye tracking to
examine interpreters’ behaviour (Mellinger, 2023). As Mellinger (2023: 203) states, ‘each
technological configuration situates and embeds interpreters in a specific context, ulti-
mately shaping their experience and cognitive interaction with the environment’. Whereas
the use of technology might imply an increased cognitive load for interpreters, they have
been observed to be able to externalise their cognitive efforts to technology (Mellinger,
2023). This suggests that technology can play an important role in the preparation phase
for telephone interpreters. Similarly, in the future, one can imagine that a user-centred
platform will be designed to assist interpreters during assignments. Current research by the
FITISPos-UAH group also makes use of eye trackers and learning analytics to examine the
cognitive workload and telephone interpreters’ focus of attention while using CAI during
assignments. This research is part of a technological development project that is in progress.
It is hoped that the project will offer finalised, tested products by the end of 2028.3
et al., 2021). In these contexts, although technologically possible, the use of machine inter-
preting should be discouraged (see Fantinuoli, this volume).
End users, who are also clients or buyers of interpretation services, also play an important
role in the transformation of the market. Downie (2023: 288) claims that it is important
for society to stop viewing human and machine interpreting as ‘rivals, bidding for the same
clients and the same work’. Both interpreting solutions can be complementary in a wider
panorama of language access and communication services. Machine interpreting can be use-
ful without the presence of a human interpreter, but it can equally act to augment a human
interpreter's capabilities. Recent literature describes conversational settings in which TI is available but interactants prefer not to use it, opting instead for communication through a friend, relative, or colleague via 'translanguaging' ('the mixed use of elements of all the languages known by the speakers to convey meaning and build communication'; Vogel and García, 2017) (Lázaro Gutiérrez and Tejero González, 2017). In addition, primary speakers sometimes use technology, such as machine translation, to assist them, as reported in a study by Lázaro Gutiérrez and Tejero González (2022). If an application for automated bilateral interpreting were developed, it could coexist with TI.
However, wider awareness about the consequences of the use of machine interpreting
without an interpreter in the loop is required. End users need to be made aware of the
characteristics and implications of the speech events and situations which require linguistic
mediation, and when it is appropriate to use machine interpreting, TI, or on-site interpret-
ing. In the medical field, where TI is popular, guidelines have recently been published by the Interpreting SAFE AI Task Force, constituted under the auspices of the NCIHC (the National Council on Interpreting in Health Care, based in the USA and a reference for healthcare interpreting worldwide), to guide the responsible use of AI-based technology in interpreting and multilingual communication. Although initial conversations, debates, and work groups focused on bilateral interpreting, including on-site and TI, task force activities soon expanded to cover 'conference, medical, legal, educational, business and other settings' (https://safeaitf.org/mission/). Outside of conferences, all other interpreting settings tend to prefer bilateral
interpretation, and particularly TI (Hickey and Hynes, 2023).
In any case, the machine will not replace the human interpreter completely, as humans
will always establish a utilitarian relationship with it, as demonstrated earlier. Ideally, inter-
preters and machines will ‘conform a sort of partnership’ (Downie, 2023) to shape the most
accepted solution for solving multilingual communication problems in our modern societies
(Monzó Nebot, 2009).
1.5 Conclusion
TI has been used for many decades. It constitutes a fast and simple alternative to on-site bilat-
eral interpreting. In times when multilingual communication is frequent and the acknowl-
edgement of linguistic and cultural differences is spreading, interpretation is increasingly
used to communicate in bilateral encounters. Public and private service provision sits at
the core of this phenomenon, together with population movements and telecommunica-
tion advancements. A shortage of trained, professional interpreters, in both widespread
languages and languages of lesser diffusion, drives the creation and growth of a globalised market in interpreting services. Such a market allows interpreters all around the world to access assignments in remote locations, thereby making the most of their abilities and language knowledge.
Working remotely and embracing TI imply a disruptive challenge for on-site bilateral
interpreters, many of whom have not been trained in the most technologised modalities of interpreting (i.e. simultaneous conference interpreting, traditionally performed from a booth). Reluctance to use technologies in all phases of the interpreting assignment is a significant reason for the avoidance of TI. However, the demand for TI is increasing, even
when other modalities of remote interpreting are appearing and improving. Supplying
interpretation services in many different languages and covering a wide schedule are easier
with TI than with on-site bilateral interpreting. TI represents a simpler form of interpreting
in all respects and can therefore be seen as more accessible. Telephone technology is
reliable and cheap, and for these reasons, it can be found all around the world. Humans
have grown familiar with the use of a telephone from an early age. As a result, communicat-
ing via a telephone interpreter is usually no more complicated for end users than doing so
via an on-site interpreter. The advantages that TI represents for end users (such as imme-
diacy and reduced costs) go hand in hand with the benefits that it has for interpreters
(access to more assignments, more possibilities for work–life balance, increased privacy,
and reduced exposure to risks), while acknowledging the challenges it also implies.
However, in general, new work patterns also present risks for workers. Fierce competi-
tion, alienation, and the threat of substitution by a machine affect not only interpreters but
also workers in many other sectors. In fact, technological advances are always two-sided.
While humans are eager to experiment with the benefits provided by technological aids,
we also fear the changes that they bring to our professional practices in the long run. TI will soon be pushed towards video remote technology. In addition, remote interpreters may also find themselves having to use CAI tools during the preparation and performance phases of assignments, and even to make themselves available to clients. Although embracing
technology is a frequent attitude amongst interpreters (e.g. Stengers et al., 2023), this does
not come without extra effort. Exhibiting open-mindedness, proactivity, and creativity will
allow current telephone interpreters to not only remain in the market but also access new
settings and scenarios, as the complexity of multilingual communication is acknowledged
worldwide.
Telephone interpreters should seek training opportunities to assist with constant adapta-
tion to a changing market, but it is also important that updated training is made available
to them. In the same vein, in order for CAI tools to be seen as attractive by telephone interpreters, technology developers should aim for a more ergonomic design. TI will undoubtedly remain and coexist with other remote modalities of interpreting, since it represents simplicity for end users. The work environment of telephone interpreters will continue to evolve, and interaction with technology will remain a key issue in this domain, although technology will hopefully also provide assistance to interpreters.
Notes
1 www.anao.gov.au/work/performance-audit/management-interpreting-services (accessed 22.8.2024).
2 https://pragmacor.web.uah.es/. Ref. PID2021-127196NA-I00. Corpus pragmatics and tele-
phone interpreting: analysis of face-threatening acts. Funded by the Spanish National Research
Agency (accessed 22.8.2024).
3 INNOVATRAD-CM, Ref. PHS-2024/PH-HUM-52, Artificial intelligence and human-machine
interaction: Research and innovation in real-time discourse generation, interpretation and transla-
tion. Funded by Comunidad de Madrid.
References
Amato, A., Spinolo, N., González Rodríguez, M.J., 2018. Handbook of Remote Interpreting – SHIFT
in Orality. University of Bologna, Bologna.
Andres, D., Falk, S., 2009. Remote and Telephone Interpreting. In Andres, D., Pöllabauer, S., eds.
Spürst Du, wie der Bauch rauf-runter? Fachdolmetschen im Gesundheitsbereich/Is Everything All
Topsy Turvy in Your Tummy? Martin Meidenbauer, Munich, 9–27.
Azarmina, P., Wallace, P., 2005. Remote Interpretation in Medical Encounters: A Systematic Review.
Journal of Telemedicine and Telecare 11, 140–145.
Baigorri Jalón, J., 2014. Interpreters at the Edges of the Cold War. In Fernández Ocampo, A.,
Wolf, M., eds. Framing the Interpreter: Towards a Visual Perspective. Routledge, London,
163–171.
Braun, S., 2015. Remote Interpreting. In Mikkelson, H., Jourdenais, R., eds. The Routledge Hand-
book of Interpreting. Routledge, New York, 352–367.
Braun, S., 2019. Technology and Interpreting. In O’Hagan, M., ed. The Routledge Handbook of
Translation and Technology. Routledge, New York, 271–288.
Braun, S., 2024. Distance Interpreting as a Professional Profile. In Massey, G., Ehrensberger-Dow, M.,
Angelone, E., eds. Handbook of the Language Industry. Mouton de Gruyter, Berlin, 449–472.
Braun, S., Davitti, E., 2020. A Multidisciplinary Methodological Framework. In Iglesias, E.F., Rod-
ríguez, M.J.G., Russo, M., eds. Telephone Interpreting/L’interpretazione telefonica. Bononia Uni-
versity Press (BUP), Bologna, 30–38.
Cheng, Q., 2015. Examining the Challenges for Telephone Interpreters in New Zealand (PhD the-
sis). URL https://openrepository.aut.ac.nz/bitstream/handle/10292/9250/ChengQ.pdf?sequence=
3&isAllowed=y (accessed 12.9.2024).
Corpas Pastor, G., 2018. Tools for Interpreters: The Challenges That Lie Ahead. Current Trends in
Translation Teaching and Learning, 157–182.
Corpas Pastor, G., Fern, L., 2016. A Survey of Interpreters’ Needs and Practices Related to Language
Technology. Universidad de Málaga, Málaga.
Corpas Pastor, G., Gaber, M., 2020. Remote Interpreting in Public Service Settings: Technology, Per-
ceptions and Practice. SKASE Journal of Translation and Interpretation 13(2), 58–68.
Crezee, I., 2013. Introduction to Healthcare for Interpreters and Translators. John Benjamins,
Amsterdam.
De Boe, E., 2020. Remote Interpreting in Dialogic Settings. In Salaets, H., Brône, G., eds. Linking
Up with Video: Perspectives on Interpreting Practice and Research. John Benjamins Publishing
Company, Amsterdam, 79–106.
De Boe, E., 2023. Remote Interpreting in Healthcare Settings. Peter Lang, Bern.
Dong, J., Turner, G., 2016. The Ergonomic Impact of Agencies in the Dynamic System of Interpreting
Provision: An Ethnographic Study of Backstage Influences on Interpreter Performance. Translation
Spaces 5(1), 97–123. URL https://doi.org/10.1075/ts.5.1.06don
Downie, J., 2023. Where Is It All Going? Technology, Economic Pressures and the Future of Inter-
preting. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Current and Future
Trends. John Benjamins Publishing Company, Amsterdam, 277–301.
Drechsel, A., 2015. The Tablet Interpreter. Lean Publishing, Canada.
Drew, P., Heritage, J., eds., 2006. Conversation Analysis, vol. I. Sage Publications Ltd., London.
Fantinuoli, C., 2017. Computer-Assisted Preparation in Conference Interpreting. Translation and
Interpreting 9(2), 24–37.
Fantinuoli, C., 2018. Computer-Assisted Interpreting: Challenges and Future Perspectives. In Durán,
I., Pastor, G.C., eds. Trends, E-Tools and Resources for Translators and Interpreters. Koninklijke
Brill NV, Leiden, 153–174.
Fantinuoli, C., 2023. Towards AI-Enhanced Computer-Assisted Interpreting. In Corpas Pastor, G.,
Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John Benjamins Publish-
ing Company, Amsterdam, 46–71.
Fantinuoli, C., Dastyar, V., 2022. Interpreting and the Emerging Augmented Paradigm. Interpreting
and Society 2(2), 185–194.
Fantinuoli, C., Marchesini, G., Landan, D., Horak, D., 2022. KUDO Interpreter Assist: Automated
Real-Time Support for Remote Interpretation. ArXiv, abs/2201.01800.
Fernández Pérez, M.M., 2012. Identificación de Las Destrezas de La Interpretación Telefónica. Uni-
versidad de La Laguna, La Laguna.
Fernández Pérez, M.M., 2015. Designing Role-Play Models for Telephone Interpreting Training.
MonTI. Monographs in Translation and Interpreting, 259–279.
Fors, J., 1999. Perspectives on Remote Public Service Interpreting. In Álvarez Lugrís, A., Fernández
Ocampo, A., eds. Quality Issues in Remote Interpreting. Servizo de Publicacións da Universidade
de Vigo, Vigo, 114–116.
García Luque, F., 2009. La interpretación telefónica en el ámbito sanitario. Realidad social y reto
pedagógico. Redit 3, 18–30.
Gentile, A., Ozolins, U., Vasilakakos, M., 1996. Liaison Interpreting: A Handbook. Melbourne Uni-
versity Press, Melbourne.
González Rodríguez, M.J., 2017. La conversación telefónica monolingüe, su futuro inmediato y su
representación en ámbito judicial-policial. In San Vicente, F., Capanaga, P., Bazzocchi, G., eds.
ORALITER: Formas de comunicación presencial y a distancia. BUP, Bologna, 197–222.
González Rodríguez, M.J., 2020. La interpretación a distancia y su formación: La experiencia de la
Shift Summer School y cómo crear la 'virtualidad necesaria' en el aula. In Spinolo, N., Amato, A., eds. inTRAlinea Special Issue: Technology in Interpreter Education and Practice, 1–8.
Gracia-García, R.A., 2002. Telephone Interpreting: A Review of Pros and Cons. In Brennan, S., ed. Pro-
ceedings of the 43rd Annual Conference. American Translators Association, Alexandria, VA, 195–216.
Heh, Y., Qian, H., 1997. Over-the-Phone Interpretation: A New Way of Communication Between
Speech Communities. In Jérôme-O’Keeffe, M., ed. Proceedings of the 38th Annual Conference.
American Translators Association, Alexandria, VA, 51–62.
Hewitt, W.E., 1995. Court Interpretation: Model Guides for Policy and Practice in the State Courts.
National Center for State Courts.
Hickey, S., Hynes, R., 2023. The 2023 Nimdzi Interpreting Index: The Ranking of the Top 34 Larg-
est Interpreting Service Providers. URL https://www.nimdzi.com/nimdzi-100-top-lsp/#thenimdzi-
100-ranking (accessed 14.9.2024).
Hlavac, J., 2013. A Cross-National Overview of Translator and Interpreter Certification Procedures.
Translation and Interpreting 5(1), 32–65.
Hornberger, J., Gibson Jr, C. D., Wood, W., Dequeldre, C., Corso, I., Palla, B., Bloch, D.A., 1996.
Eliminating Language Barriers for Non-English-Speaking Patients. Medical Care 34(8), 845–856.
Hsieh, H., 2006. Understanding Medical Interpreters: Reconceptualizing Bilingual Health Communi-
cation. Health Communication 20(2), 177–186.
Iglesias Fernández, E., Ouellet, M., 2018. From the Phone to the Classroom: Categories of Problems
for Telephone Interpreting Training. The Interpreters’ Newsletter 23, 19–44.
Jaime Pérez, A., 2015. Remote Interpreting in Public Services: Developing a 3G Phone Interpreting
Application. In Lázaro Gutiérrez, R., Sánchez Ramos, M.M., Vigier Moreno, F.J., eds. Investi-
gación Emergente En Traducción e Interpretación. Comares, Granada, 73–82.
Jones, D., Gill, P., 1998. Breaking Down Language Barriers: The NHS Needs to Provide Accessible
Interpreting Services for All. BMJ (Clinical Research Ed.) 316(7143), 1476.
Kelly, N., 2008. Telephone Interpreting: A Comprehensive Guide to the Profession. Trafford, Victo-
ria, BC.
Kerremans, K., Cox, A., Stengers, H., Lázaro Gutiérrez, R., Rillof, P., 2018. On the Use of Tech-
nologies in Public Service Interpreting and Translation Settings. In Read, T., Montaner, S., Sedano,
B., eds. Technological Innovation for Specialized Linguistic Domains. Éditions universitaires euro-
péennes, Mauritius, 57–68.
Kerremans, K., Lázaro Gutiérrez, R., Stengers, H., Cox, A., Rillof, P., 2019. Technology Use by Pub-
lic Service Interpreters and Translators: The Link Between Frequency of Use and Forms of Prior
Training. FITISPos International Journal 6(1), 107–122.
Ko, L., 2006. The Need for Long-Term Empirical Studies in Remote Interpreting Research: A Case Study
of Telephone Interpreting. Linguistica Antverpiensia, New Series – Themes in Translation Studies 5.
Kurz, I., 1999. Remote Conference Interpreting: Assessing the Technology. Anovar/anosar estudios de
traducción e interpretación I, 114–116.
Lázaro Gutiérrez, R., 2014. Use and Abuse of an Interpreter. In Valero-Garcés, C., ed. (RE)Visit-
ing Ideology and Ethics in Situations of Conflict. Servicio de Publicaciones de la Universidad de
Alcalá, Alcalá de Henares.
Lázaro Gutiérrez, R., 2020. Fidelidad, confidencialidad y empatía en consultas médicas con víctimas
de violencia de género mediadas por un intérprete. In Pinazo, E.P., ed. Interpreting in a Changing
World: New Scenarios, Technologies, Training Challenges and Vulnerable Groups. Peter Lang,
Berlin, 49–64.
Lázaro Gutiérrez, R., 2021. Remote (Telephone) Interpreting in Healthcare Settings. In Susam-Saraeva,
Ş., Spišiaková, E., eds. The Routledge Handbook of Translation and Health. Routledge, London,
216–231.
Lázaro Gutiérrez, R., 2022. Self-Care for Interpreters. In Porlán Moreno, R., Arnedo Villaescusa, C.,
eds. Interpreting in the Classroom: Tools for Teaching. UCO Press, Córdoba, 137–154.
Lázaro Gutiérrez, R., Cabrera Méndez, G., 2019. Chapter 2. Context and Pragmatic Meaning in
Telephone Interpreting. In Garcés-Conejos Blitvich, P., Fernández-Amaya, L., Hernández-López,
M.O., eds. Technology Mediated Service Encounters. John Benjamins Publishing Company,
Amsterdam, 45–68.
Lázaro Gutiérrez, R., Cabrera Méndez, G., 2021a. Development of Technological Competences:
Remote Simultaneous Interpreting Explored. Translating and the Computer 43.
Lázaro Gutiérrez, R., Cabrera Méndez, G., 2021b. How COVID-19 Changed Telephone Interpreting
in Spain. International Journal of Translation and Localization 8(2), 137–155.
Lázaro Gutiérrez, R., Cabrera Méndez, G., 2023. Hazard Communication Through Telephone
Interpreters During the Pandemic in Spain: The Case of COVID-19 Tracer Calls. The Translator
28(3), 1–15.
Lázaro Gutiérrez, R., Cabrera Méndez, G., 2024. Widening the Scope of Interpreting in Conflict Set-
tings: A Description of the Provision of Interpreting During the 2021 Afghan Evacuation to Spain.
In Declercq, C., Kerremans, K., eds. The Routledge Handbook of Translation, Interpreting and
Crisis. Routledge, London and New York, 172–186.
Lázaro Gutiérrez, R., Iglesias-Fernández, E., Cabrera-Méndez, G., 2021. Ethical Aspects of Tel-
ephone Interpreting Protocols. Verba Hispanica 29(1), 137–156. URL https://doi.org/10.4312/
vh.29.1.137-156
Lázaro Gutiérrez, R., Nevado Llopis, A., 2022. Remote Interpreting in Spain After the Irruption of
COVID-19: A Mapping Exercise. Hikma 21(2), 211–230.
Lázaro Gutiérrez, R., Ross, C., under review. The Telephone Interpreter and the Machine: An Explor-
atory Study into the Potential of Adapting CAI-Tools to Dialogue Interpreting. Target.
Lázaro Gutiérrez, R., Tejero González, J.M., 2017. Interculturalidad y Mediación Cultural en el
Ámbito Sanitario. Descripción de la implementación de un programa de mediación intercultural
en el Servicio de Salud de Castilla-La Mancha. Panace@ XVIII(46), Segundo semestre, 97–107.
Lázaro Gutiérrez, R., Tejero González, J.M., 2022. Challenging Ideologies and Fostering Intercultural
Competence: The Discourses of Healthcare Staff About Linguistic and Cultural Barriers, Interpret-
ers, and Mediators. In Määttä, S.K., Hall, M.K., eds. Mapping Ideology in Discourse Studies. De
Gruyter Mouton, Berlin and Boston, 223–246.
Lee, J., 2007. Telephone Interpreting Seen from the Interpreters’ Perspective. Interpreting 9(2),
231–252.
Liu, J., 2022. The Impact of Technology on Interpreting: An Interpreter and Trainer’s Perspective.
International Journal of Chinese and English Translation and Interpreting (IJCETI) 1(8).
Mack, G., 2001. Conference Interpreters on the Air: Live Simultaneous Interpreting on Italian Tel-
evision. In Gambier, Y., Gottlieb, H., eds. (Multi) Media Translation: Concepts, Practices, and
Research. John Benjamins Publishing Company, Amsterdam and Philadelphia, 125–132.
Martínez-Gómez, A., 2008. La interpretación telefónica en los servicios de atención al inmigrante de
Castilla-La Mancha. In Valero Garcés, C., Pena Díaz, C., Lázaro Gutiérrez, R., eds. Investigación
y práctica en traducción e interpretación en los servicios públicos: Desafíos y alianzas. Servicio de
Publicaciones de la Universidad de Alcalá, Alcalá de Henares, 338–353.
Mellinger, C.D., 2019. Computer-Assisted Interpreting Technologies and Interpreter Cognition:
A Product and Process-Oriented Perspective. Tradumàtica 17, 33–44.
Mellinger, C.D., 2023. Embedding, Extending, and Distributing Interpreter Cognition with Tech-
nology. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Current and Future
Trends. John Benjamins Publishing Company, Amsterdam, 195–216.
Mellinger, C.D., Hanson, T.A., 2018. Interpreter Traits and the Relationship with Technology and
Visibility. Translation and Interpreting Studies 13(3), 366–394.
Mellinger, C.D., Pokorn, N.K., 2018. Community Interpreting, Translation and Technology. Transla-
tion and Interpreting Studies 13(3), 337–341.
Mikkelson, H., 2003. Telephone Interpreting: Boon or Bane? In González, L.P., ed. Speaking in
Tongues: Language Across Contexts and Users. Universitat de València, Valencia, 251–269.
Mintz, D., 1998. Hold the Phone: Telephone Interpreting Scrutinized. Proteus 7(1), 1–5.
Monzó Nebot, E., 2009. Legal and Translational Occupations in Spain: Regulation and Specialization
in Jurisdictional Struggles. In Sela-Sheffy, R., Shlesinger, M., eds. Profession, Identity and Status,
Special Issue of Translation and Interpreting Studies 4(2), 134–154.
Moser-Mercer, B., 2005. Remote Interpreting: The Crucial Role of Presence. Bulletin VALS-ASLA
81, 73–97.
Moser-Mercer, B., 2015. Pedagogy. In Pöchhacker, F., Grbic, N., Mead, P., Setton, R., eds. Routledge
Encyclopedia of Interpreting Studies. Routledge, London, 303–306.
Mouzourakis, P., 2006. Remote Interpreting: A Technical Perspective on Recent Experiments. Inter-
preting 8(1), 45–66.
Murgu, D., Jiménez, S., 2011. La formación de un intérprete telefónico. In Valero Garcés, C., ed. Tra-
ducción e interpretación en los servicios públicos en un mundo INTERcoNEcTado. Universidad
de Alcalá Servicio de Publicaciones, Alcalá de Henares, 214–219.
O’Brien, S., 2012. Translation as Human-Computer Interaction. Translation Spaces 1(1), 101–122.
Oviatt, S., Cohen, P., 1992. Spoken Language in Interpreted Telephone Dialogues. Computer Speech
and Language 6, 277–302.
Ozolins, U., 2011. Telephone Interpreting: Understanding Practice and Identifying Research Needs.
Translation and Interpreting 3(1), 33–47.
Paneth, E., [1957] 2002. An Investigation into Conference Interpreting. In Pöchhacker, F., Shlesinger,
M., eds. The Interpreting Studies Reader. Routledge, London, 31–41.
Peng, K., Mo, A., Liu, M., 2023. Interacting Modalities in the Teletherapeutic Triad and Interpreter’s
Coping Tactics. In Corpas Pastor, G., Hidalgo Ternero, C.M., eds. Proceedings of the International
Workshop on Interpreting Technologies SAY-IT 2023, 5–7 June | Malaga. Incoma, Varna, 61–67.
Phelan, M., 2001. The Interpreter’s Resource. Multilingual Matters, Manchester.
Phillips, C., 2013. Remote Telephone Interpretation in Medical Consultations with Refugees:
Meta-Communications About Care, Survival and Selfhood. Journal of Refugee Studies 26(4),
505–523.
Pöchhacker, F., 2020. "Going Video": Mediality and Multimodality in Interpreting. In Salaets, H., Brône, G., eds. Linking Up with Video. John Benjamins Publishing Company, Amsterdam, 13–45.
Price, E.L., Pérez-Stable, E.J., Nickleach, D., López, M., Karliner, L.S., 2012. Interpreter Perspectives
of In-Person, Telephonic, and Videoconferencing Medical Interpretation in Clinical Encounters.
Patient Education and Counseling 87(2), 226–232.
Prieto, M.N., 2008. La interpretación telefónica en los servicios sanitarios públicos. Estudio del caso:
el servicio de “conversación a tres” del Hospital Carlos Haya de Málaga. In Valero Garcés, C.,
Pena Díaz, C., Lázaro Gutiérrez, R., eds. Investigación y práctica en traducción e interpretación
en los servicios públicos: desafíos y alianzas. Servicio de Publicaciones de la Universidad de Alcalá,
Alcalá de Henares, 369–384.
Rodriguez, S., Gretter, R., Matassoni, M., Alonso, A., Corcho, O., Rico, M., Daniele, F., 2021. SmarT-
erp: A CAI System to Support Simultaneous Interpreters in Real-Time. In Mitkov, R., Sosoni, V.,
Giguère, J.C., Murgolo, E., Deysel, E., eds. Proceedings of the Translation and Interpreting Tech-
nology Online Conference. INCOMA Ltd., 102–109.
Rosenberg, B.A., 2007. A Data Driven Analysis of Telephone Interpreting. In Wadensjö, C., Dim-
itrova, B.E., Nilsson, A., eds. The Critical Link 4: Professionalisation of Interpreting in the Com-
munity. John Benjamins Publishing Company, Amsterdam, 65–77.
Roy, C., 2000. Interpreting as a Discourse Process. Oxford University Press, Oxford.
Roziner, I., Shlesinger, M., 2010. Much Ado About Something Remote: Stress and Performance in
Remote Interpreting. Interpreting 12(2), 214–247.
Ruiz Mezcua, A., 2018. General Overview of Telephone Interpretation (TI): A State of the Art. In
Ruiz Mezcua, A., ed. Approaches to Telephone Interpretation: Research, Innovation, Teaching
and Transference. Peter Lang, Bern, 9–17.
Saint-Louis, L., Friedman, E., Chiasson, E., Quessa, A., Novaes, F., 2003. Testing New Technologies
in Medical Interpreting. Cambridge Health Alliance, Somerville, MA.
Serrano, O.J., 2020. Foto fija de la interpretación simultánea remota al inicio del 2020. Revista Tra-
dumàtica. Tecnologies de la Traducció 17, 59–80.
Spinolo, N., 2022. Remote Interpreting. In Franco Aixelá, J., Muñoz Martín, R., eds. ENTI (Ency-
clopaedia of Translation and Interpreting). AIETI (Asociación Ibérica de Estudios de Traducción
e Interpretación).
Spinolo, N., Bertozzi, M., Russo, M., 2018. Basic Tenets and Features Characterising Telephone-and
Video-Based Remote Communication in Dialogue Interpreting. In Amalia, A., Spinolo, N.,
González Rodríguez, M.J., eds. Handbook of Remote Interpreting – SHIFT in Orality. AMS Acta,
Bologna, 12–25.
Stengers, H., Lázaro Gutiérrez, R., Kerremans, K., 2023. Public Service Interpreters’ Perceptions and
Acceptance of Remote Interpreting Technologies in Times of a Pandemic. In Corpas Pastor, G.,
Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John Benjamins Pub-
lishing Company, Amsterdam, 109–141.
Sultanic, I., 2022. Interpreting in Pediatric Therapy Settings During the COVID-19 Pandemic: Ben-
efits and Limitations of Remote Communication Technologies and Their Effect on Turn-Taking
and Role Boundary. FITISPos-IJ 9(1), 78–101.
Tripepi-Winteringham, S., 2010. The Usefulness of ICTs in Interpreting Practice. The Interpreters’
Newsletter 15, 87–99.
Valero Garcés, C., Lázaro Gutiérrez, R., Del Pozo Triviño, M., 2015. Interpretar en casos de violencia
de género en el ámbito médico. In Toledano Buendía, C., del Pozo Triviño, M., eds. Interpretación
en Contextos de Violencia de Género. Tirant Lo Blanch, Valencia.
Verrept, H., 2011. Intercultural Mediation Through the Internet in Belgian Hospitals. Proceedings of
the 4th International Conference on Public Service Interpreting and Translation.
Vidal, M., 1998. Telephone Interpreting: Technological Advance or Due Process Impediment? Proteus
7(3), 1–6.
Vogel, S., García, O., 2017. Translanguaging. In Noblit, G., Moll, L., eds. Oxford Research Encyclo-
pedia of Education. Oxford University Press, Oxford.
Wadensjö, C., 1998. Interpreting as Interaction. Addison Wesley Longman, London and New York.
Wadensjö, C., 1999. Telephone Interpreting & the Synchronization of Talk in Social Interaction. The
Translator 5(2), 247–264.
Wang, J., 2021. “I Only Interpret the Content and Ask Practical Questions When Necessary.” Inter-
preters’ Perceptions of Their Explicit Coordination and Personal Pronoun Choice in Telephone
Interpreting. Perspectives: Studies in Translation Theory and Practice 29(4), 625–642.
Zhang, W., Davitti, E., Braun, S., 2024. Charting the Landscape of Remote Medical Interpreting: An
International Survey of Interpreters Working in Remote Modalities in Healthcare Services. Per-
spectives: Studies in Translation Theory and Practice, 1–26.
Zhang, X., Corpas Pastor, G., Zhang, J., 2023. Videoconference Interpreting Goes Multimodal. In
Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John
Benjamins Publishing Company, Amsterdam, 169–194.
Ziegler, K., Gigliobianco, S., 2018. Present? Remote? Remotely Present! New Technological
Approaches to Remote Simultaneous Conference Interpreting. In Fantinuoli, C., ed. Interpreting
and Technology. Language Science Press, Berlin, 119–139.
2
VIDEO-MEDIATED
INTERPRETING
Sabine Braun
2.1 Introduction
Video-mediated interpreting (VMI), a term coined in the AVIDICUS projects (Braun, 2016),
refers to all modalities of interpreting that use audiovisual telecommunications technology
to enable the delivery of interpreting services where the interpreter(s) and at least one of the
primary participants are in separate locations. Alongside telephone-mediated interpreting
(see Lázaro Gutiérrez, this volume), VMI represents a key modality of distance interpreting
(DI; Braun, 2024).
The concept of DI likely originated in Germany in the 1950s, championed by Fredo
Nestler, a German interpreter and inventor. Nestler envisioned a centralised telephone inter-
preting service in Europe, where interpreters would be connected to international telephone
calls to provide simultaneous interpreting (Nestler, 1957). Although his system was never
implemented as a telephone service, the German ViKiS project later tested the feasibility
of a VMI service that incorporated a simultaneous interpreter into two-way video calls
between business clients, using an adapted videoconferencing system with additional audio
channels (Braun, 2001, 2004).
The configurations that are now more commonly associated with the term VMI are
those designed for consecutive/dialogue interpreting settings, using standard videoconfer-
encing systems. Such VMI configurations emerged from the 1990s onwards in the context
of public service interpreting, for example, in bilingual court proceedings, and later in the
context of healthcare communication. In courts, the adoption of videoconferencing tech-
nology to connect remote defendants or witnesses to a courtroom has created a need to
integrate interpreters into these video links when the defendant or witness did not speak
the language of the court. In healthcare settings, the driver for the use of videoconferencing
technology to deliver interpreting services has been the need to optimise access to interpret-
ers in increasingly multilingual societies.
In the context of multilingual conference interpreting, various configurations of VMI
in simultaneous mode were tested by UNESCO in the 1970s, laying the foundations for
remote simultaneous interpreting (RSI) (see Chmiel and Spinolo, this volume). Unlike stand-
ard VMI configurations, which are based on mutual visibility of all participants, including
the interpreter, RSI is an asymmetric configuration with additional audio channels where
the primary participants are visible to the interpreter, but not vice versa, in line with the
convention for simultaneous (conference) interpreting. While early RSI solutions involved
interpreters working in traditional interpreting booths, technological advances, acceler-
ated by the virtualisation of interpreting services during the COVID-19 pandemic, led to
the development of platform-based RSI solutions using simultaneous interpreting delivery
platforms (SIDPs). Another configuration of VMI, known as video relay service (VRS),
emerged to mediate communication between a deaf person and a hearing person who are
not co-located. Both connect to a sign language interpreter: the deaf person by video link,
and the hearing person by telephone (see Warnicke, this volume).
The shift towards online work during the COVID-19 pandemic contributed to the
expansion of VMI from a hitherto relatively marginal practice of interpreting remotely
for individual clients or specific events to a much more widespread modality of inter-
preting in fully virtual and/or complex hybrid event configurations. While interpreters
have often been critical of VMI, the COVID-19 pandemic has highlighted its benefits
in meeting linguistic demand, and the rapidly increasing exposure to VMI (or DI more
broadly) has enabled interpreters to adapt and develop new competencies, beginning to
change perceptions of VMI. At the same time, the many new and untested configura-
tions of working that emerged during the pandemic have also created new and poten-
tially more challenging working conditions, such as fully virtual meetings with many
participant sites. These new conditions have led to an increased need for interpreters
to work online for extended periods, use cloud-based or software-based interpreting
platforms, combine multiple modes of communication (audio-only/video), and/or use
multiple devices to connect with clients and fellow interpreters. The implications for the
quality and effectiveness of communication in VMI, along with other factors, have yet
to be fully addressed.
This chapter provides an overview of VMI as a growing professional practice and area
of research within interpreting studies. Section 2.2 defines key concepts and outlines the
primary configurations of VMI. Section 2.3 briefly traces the historical development of
VMI, while Section 2.4 examines its practices and representation in research across vari-
ous interpreting fields. Section 2.5 reviews specific research topics related to VMI. Finally,
Section 2.6 concludes with a brief outlook on potential future trends and developments.
Differentiating between VRI and VCI helps clarify the distinct contexts in which
interpreters work with audiovisual communication technology. Conceptually, this dis-
tinction brings out the reasons why interpreters encounter different configurations. From
a practical point of view, the distinction reflects the different types of events interpreters
are engaged in. The VRI configuration applies to in-person events, where the technology
is used to connect interpreters to the physical venue in which the primary participants
are located. By contrast, VCI configurations arose from the need to integrate interpret-
ers into hybrid or fully virtual events, such as conferences with remote speakers, court
hearings with remote witnesses, or consultations between a lawyer in their law firm and
a client in prison. Initially, the VCI configuration saw interpreter(s) commonly being
co-located with the primary participants at one of the primary participant sites, but
as hybrid and virtual event formats became more frequent and grew more complex
and diverse, particularly because of the virtualisation of human interaction during the
COVID-19 pandemic, interpreters increasingly joined these events from separate sites,
such as an interpreting hub or their home office. This development may have blurred
the boundaries between VRI and VCI. In addition, the different configurations share
many features, such as spatial separation of at least some of the participants. These
factors might suggest that the distinctions between different configurations are less rel-
evant today. However, they remain important not only for conceptual clarity but also
because, as we will see later in this chapter, different configurations can impact the
interpreter-mediated communication in different ways.
A further conceptual distinction should be drawn based on the medium used for ser-
vice delivery – audio-only (e.g. telephone or audio connections via web conferencing plat-
forms) versus audio-video – highlighting the difference between telephone-mediated and
video-mediated interpreting. However, interpreters also work in hybrid, mixed-media sce-
narios, such as when a video link is unavailable for certain participants in a virtual event
(Zhang et al., 2024) or when participants choose to disable their video feed during an event.
These practices have further diversified the configurations of DI.
In addition, some configurations of VMI are inherently asymmetrical in terms of media
use. For instance, DI in conference interpreting settings – normally referred to as remote
simultaneous interpreting (RSI) regardless of the distribution of primary participants and
interpreters relative to one another – typically involves video feeds from participants to
interpreters, while the audience receives only an audio feed, reflecting traditional confer-
ence interpreting norms (see also Chmiel and Spinolo, this volume). Similarly, video relay
services (VRS), which are used to facilitate communication between deaf and hearing indi-
viduals, connect an interpreter by video link to the deaf person (to provide sign language
interpreting) and by telephone to the hearing person (Warnicke and Plejert, 2012; Napier
et al., 2018b; Warnicke, this volume).
This initial overview highlights the need to consider additional parameters in devel-
oping a comprehensive taxonomy of VMI. Firstly, the different configurations of VMI
cannot be fully described without drawing on the parameters that traditionally apply to
interpreter-mediated communicative events, such as the setting (e.g. conference, business,
legal, health- and social care, humanitarian), the type of event (monologic/dialogic, bi/mul-
tilingual), and the mode of interpreting used. The standard application of VMI typically
involves dialogic, bilingual interactions supported by consecutive or dialogue interpreting.
However, the asymmetrical modalities outlined earlier, that is, RSI and VRS, also exhibit
characteristics of VMI.
first commercial RSI solutions were entirely audio-based, today’s RSI solutions normally
provide one or more video feeds from the venue and/or from individual participants to the
interpreter.
Meanwhile, in the justice sector, the increasing use of videoconferencing technology in
courts led to a steady increase in VCI in video links between courts and remote witnesses
and/or defendants (Braun and Taylor, 2012b). An early implementation of VRI was in the
9th Judicial Circuit in Florida, which introduced this service in 2007. The videoconferenc-
ing platform allows interpreters to switch between consecutive and simultaneous interpret-
ing to mimic the conventions of traditional court interpreting (i.e. consecutive into the
official language of the court, and simultaneous out of that language). The Metropolitan
Police Service in London introduced VRI in 2011, with interpreters working in consecu-
tive mode from central remote interpreting hubs linked to London police stations (Braun
and Taylor, 2012b). In the healthcare sector, VRI began to replace telephone-mediated
interpreting in the 2010s (Marshall et al., 2019), but VRI still appears to be less common
than telephone interpreting in many healthcare settings. Many of the developments in VMI
outlined here took place in relation to both spoken language interpreting and sign language
interpreting (Napier et al., 2018a).
Various technologies have been used for VMI over the years. The early UNESCO tri-
als used satellite technology, but transmission was too slow. The digital telephone net-
work (ISDN) used from the 1990s onwards did not have sufficient bandwidth to provide
adequate audio and video quality for interpreting. The introduction of broadband internet,
along with improved technical standards for audio and video transmission, combined with
high-quality equipment, for example, in dedicated hubs for interpreters, made VMI a more
technically feasible option. However, the recent shift towards home/mobile working, using
cloud-based video platforms, has resulted in less control over the technical environment,
leading to variations in sound and image quality and potentially worsening working condi-
tions for interpreters (Buján and Collard, 2022).
Overall, regardless of the set-up or technological basis employed, VMI has been associ-
ated with a range of technical challenges, many of which persist to some extent. Insufficient
audio quality for interpreting purposes is perhaps the most frequently reported issue. Devel-
opers of videoconferencing systems have typically prioritised video quality over audio,
deeming the latter adequate for monolingual communication even though it is not always suf-
ficient for interpreters’ specific needs regarding source speech comprehension. Additional
challenges include latency in audio and video transmission, lack of synchronicity between
audio and video feeds, and issues with system and network stability.
Perhaps unsurprisingly, given these challenges, interpreters have often been reluctant to
use VMI (Braun and Taylor, 2012b; Mouzourakis, 2006). In recent years, however, there
appears to have been a shift in attitudes, with an increased focus on the benefits of VMI
(e.g. Seeber et al., 2019; Corpas Pastor and Gaber, 2020). One possible explanation is the
growing exposure of interpreters to all modalities of DI, which has allowed interpreters to
gain first-hand experience, develop strategies to cope with different DI modalities, and iden-
tify benefits, particularly during the COVID-19 pandemic. Nimdzi (2023) highlighted in its
Interpreting Index that DI now accounts for nearly 50% of the interpreting market in the
post-pandemic era, with VMI (in the narrower sense, that is, bilingual consecutive/dialogue
interpreting) at 20%, RSI at 16%, and telephone-mediated interpreting at 13%. This marks
a significant increase from pre-pandemic market shares of 10% for telephone-mediated
interpreting, 7% for VMI, and 3% for RSI.
(2006) argued that recurring physiological and psychological issues observed across stud-
ies were more attributable to the overarching condition of remoteness than to any specific
technical factor. Professional conference interpreters, particularly within the International
Association of Conference Interpreters (AIIC), strongly opposed RSI in the 2000s, deeming
remote interpreting unacceptable (AIIC, 2000).
The shortcomings of RSI identified in the early studies, such as the absence of a direct
view of the speaker and the resulting feelings of remoteness or alienation, prompted efforts
to optimise the technical set-up of RSI, including the use of multiple large screens to provide
interpreters with detailed views of delegates (Ziegler and Gigliobianco, 2018). Addition-
ally, new ISO standards were introduced, defining minimum requirements for audio quality
and related parameters (e.g. ISO 20108:2017 for the quality and transmission of sound and
image input in simultaneous interpreting, and ISO 20109:2016 for equipment; see Pérez
Guarnieri and Ghinos, this volume).
The idea of implementing physical RSI hubs equipped with advanced conferencing tech-
nology and soundproof interpreting booths became popular from the mid-2010s, particu-
larly in major cities. These hubs aimed to reduce interpreter travel while offering organisers
and interpreters greater control over technical conditions (Ziegler and Gigliobianco, 2018).
A study on RSI at the 2014 FIFA World Cup in Brazil highlighted benefits, such as dedicated work-
spaces, opportunities for team collaboration, and improved interpreter well-being, resulting
in more favourable attitudes towards RSI compared to earlier findings (Seeber et al., 2019).
However, as outlined in Section 2.2.2, RSI has undergone a further significant evolu-
tion since the late 2010s. In addition to the rise of physical hubs offering booth-based RSI,
software-based virtual RSI platforms also emerged. An early evaluation by the European
Commission’s Directorate-General for Interpreting found these platforms generally suit-
able for conference interpreting (DG SCIC, 2019). Around the same time, AIIC revised its stance on RSI, publishing guidelines in 2018 that recognised RSI as a new reality and established minimum requirements
for its use (AIIC, 2018).
It is unclear whether physical RSI hubs equipped with high-end technology and easily
accessible to interpreters would have achieved widespread adoption without the COVID-19
pandemic or whether they would have coexisted with software-based RSI platforms. The
pandemic significantly accelerated the adoption of software-based RSI, though a 2020
survey of over 800 conference interpreters found a majority believed their performance
and working conditions to be poorer in platform-based RSI compared to on-site inter-
preting (Buján and Collard, 2022). Despite these challenges, over 40 RSI platform pro-
viders emerged or expanded during the pandemic, introducing features such as improved
interpreter collaboration tools, AI-assisted transcription, term extraction (Fantinuoli et al.,
2022; Rodríguez González et al., 2023), and audio quality enhancements.
Nevertheless, generic conferencing platforms such as Zoom continue to dominate RSI
assignments due to organiser preferences, offering only basic interpreting functionality
(Buján and Collard, 2022; Saeed et al., 2023). A recent development to address this gap is
the integration of RSI platforms as plug-in solutions with generic conferencing platforms,
enhancing interpreter functionality. Post-pandemic, RSI platform providers are also shift-
ing their business models, transforming platforms into marketplaces for interpreters and
clients, alongside offering automated interpreting options.
In conclusion, RSI has evolved substantially in recent years, with the pandemic acting
as a catalyst for the development of software-based platforms. Although concerns about
performance and working conditions persist, innovative features and new business models
suggest the RSI market remains dynamic and continues to adapt.
over recent years (Braun et al., 2023), telephone interpreting remains a common modality
in healthcare settings (De Boe, this volume).
When using VRI in healthcare settings, interpreters often work from home or from hubs
located in large hospitals or operated by remote interpreting service providers (Zhang et al.,
2024). The expansion of healthcare consultations by video link during the pandemic further increased the demand for VCI, especially for configurations in which interpreters work from their own location, creating three-way communication links between
healthcare providers, patients, and interpreters, sometimes combining audio-only and video
connections (Zhang et al., 2024). An earlier pilot of this configuration revealed high satis-
faction rates among patients (Schulz et al., 2015).
The effects of these relatively recent VCI configurations on interpreter-mediated interac-
tions remain underexplored and warrant further investigation. By contrast, VRI has been
studied extensively in health sciences, especially to assess its feasibility within healthcare
workflows. This research has primarily focused on the effectiveness of, and satisfaction
with, VRI compared to on-site interpreting, often sidestepping questions of interpreting quality, particularly accuracy, and its impact on patient outcomes. An early systematic review
concluded that remote interpreting by telephone and video link is as acceptable as on-site
interpreting for patients and doctors – and, to a slightly lesser extent, for interpreters – with
similar accuracy levels reported across remote and on-site modalities (Azarmina and Wallace, 2005).
However, no formal interpreting quality assessment had been undertaken in the reviewed
studies.
Studies examining interpreting quality in healthcare settings have typically relied on
self-reported perceptions from patients and providers, despite their limited ability to evaluate
quality accurately (see Section 2.5.2). More attention has been given to interactional issues,
with recent research highlighting differences in communication dynamics and complexity
between on-site interpreting and VRI (Hansen, 2020; Klammer and Pöchhacker, 2021).
Interestingly, while perception-based studies suggest that VRI is comparable to on-site inter-
preting, interactional studies reveal it can be more complex and challenging to navigate (see
Section 2.5.3). The links between interactional complexity, interpreting quality, and patient
outcomes remain largely unexplored and require further systematic investigation.
VMI, revealing a general preference for on-site interpreting among interpreters (Azarmina
and Wallace, 2005; Locatis et al., 2010; Price et al., 2012). Yabe (2020) found that both
healthcare providers and deaf and hard-of-hearing patients preferred on-site interpret-
ing for critical care to ensure effective and accurate communication. Price et al. (2012)
emphasised the influence of healthcare communication genres on interpreters' perceptions
of different interpreting modalities, with remote modalities perceived as less satisfactory
for patient assessment scenarios than on-site interpreting due to challenges in building rap-
port with remote participants. Healthcare interpreters surveyed by Zhang et al. (2024)
highlighted frequent technical and logistical challenges in telephone- and video-mediated
interpreting. Issues such as poor sound quality, limited visual cues, lack of briefing,
and restricted non-verbal communication negatively affected interpreters’ effectiveness.
Telephone-mediated interpreting was perceived to be particularly difficult in complex medi-
cal settings involving multiple speakers or emotionally charged communication, such as
delivering bad news. While VMI was also seen as impacting interaction and communica-
tion negatively, it was perceived as more effective than telephone-mediated interpreting for
handling complex healthcare interactions.
In the field of legal interpreting, an early survey of 150 legal interpreters in Europe
revealed generally negative attitudes towards, and low acceptance of, VMI (Braun and Tay-
lor, 2012c). However, further analysis by country highlighted significant differences, with
particularly negative attitudes in the UK compared to more positive views in continental
Europe (Braun, 2018). These differences were attributed to several factors. First, the tech-
nology used for VMI played a role. The UK, an early adopter of videoconferencing in legal
proceedings, relied on ISDN-based legacy equipment implemented from the 1990s onwards
(see Section 2.4.2) until well into the 2000s. This equipment provided poor sound quality
and only offered limited options to interact with the technology. In contrast, many continen-
tal European countries, as later adopters, implemented more modern, internet-based vide-
oconferencing systems that interpreters found more acceptable. Second, the socio-economic context of interpreting in the UK is likely to have contributed to the negative perceptions. The expansion of VMI coincided with debates about cost-cutting in public service interpreting and an emerging trend towards outsourcing interpreting services to a private contractor, which lowered remuneration, worsened working conditions, and nurtured a perception among interpreters of being undervalued within the UK justice system.
An interview-based study of 17 legal interpreters in the UK revealed mixed feelings about
VMI, with the interpreters recognising benefits such as improved safety and cost savings
but noting drawbacks such as altered communicative dynamics and reliance on technology
(Devaux, 2018, this volume).
A more recent survey of public service interpreters working in different sectors about their views on both telephone-mediated and video-mediated interpreting yielded more positive results: the interpreters showed some appreciation for the benefits of these modalities, such as the comfort of working from home, despite concerns about stress and quality (Corpas Pastor
and Gaber, 2020). Some of these results were echoed in an international survey of sign
language interpreters, which found that the interpreters’ experience of both VRI and VRS
is overall satisfactory, but that interactional and technological issues make VMI more chal-
lenging than working on-site (Napier et al., 2017). The interpreters in Napier et al.’s survey
highlighted benefits for both themselves and for the deaf community as the service users.
This is corroborated by Singureanu et al. (2023) and Zhang et al. (2024), who similarly
found that interpreters often evaluate VMI from the perspective of whether it brings benefits for the service users.
In conference settings, two early studies comparing on-site interpreting with VRI (RSI)
reported high stress levels and low acceptance among interpreters (Moser-Mercer, 2003;
Roziner and Shlesinger, 2010). However, more recent findings suggest a shift in attitudes,
with the 22 participating interpreters working at an RSI hub during the World Cup in Bra-
zil reporting that they were satisfied with their performance and that they felt an increase
in psychological well-being after gaining experience with RSI (Seeber et al., 2019). Con-
versely, a COVID-19 pandemic survey of over 800 conference interpreters showed largely
negative attitudes and low satisfaction with RSI, likely influenced by the limited time to
adjust to working largely online at the time of the survey and by a fragmented client base
resulting from this (Buján and Collard, 2022). These perceptions were corroborated in a
number of local surveys during the pandemic (see Chmiel and Spinolo, this volume).
on-site interpreting in police interviews concluded that telephone interpreting was inferior
to the other two modalities, while VRI and on-site interpreting showed minimal differences
(Hale et al., 2022).
This overview shows that research on VMI has yielded mixed results. On the one hand,
RSI studies have observed a striking gap between objective measures of interpreting quality
and interpreters’ subjective perceptions. While objective evaluations found minimal differ-
ences between on-site interpreting and RSI, interpreters’ subjective quality perceptions were
substantially lower for RSI. Roziner and Shlesinger (2010) suggested that interpreters’ per-
ceptions might have been influenced by their dissatisfaction with RSI. On the other hand,
studies in conference and legal settings have produced differing conclusions about VMI,
which may be attributed to variations in study design, methods of quality assessment (see
Davitti et al., this volume), and the nature of the interpreting contexts (e.g. monologic ver-
sus dialogic interactions). These differences complicate efforts to draw broad conclusions
about the quality of VMI compared to on-site interpreting.
While quality is a critical factor in evaluating the viability of VMI, other considerations,
such as ergonomic, psychological, and physiological factors, also play a significant role in
shaping interpreters’ experiences and well-being. Studies on RSI indicate that interpret-
ers face greater stress and discomfort in remote settings compared to on-site interpreting
(Buján and Collard, 2022; Moser-Mercer, 2003; Roziner and Shlesinger, 2010), potentially
due to the reduced sense of presence associated with technology-mediated communication.
Research in human–computer interaction (Luff et al., 2003) suggests that diminished pres-
ence can negatively impact user experience. Similarly, Moser-Mercer (2005) posited that
distance interpreting might hinder interpreters’ ability to process information and construct
mental representations of situations, contributing to stress and fatigue. This has sparked
discussions on the role of cognitive load in VMI (Zhu and Aryadoust, 2022; see also Chmiel
and Spinolo, this volume).
To enhance the VMI experience, further research is needed to address these challenges.
For instance, while audio quality is widely recognised as vital to interpreting quality and
user experience, the influence of other factors, such as the visual interface design of vide-
oconferencing and RSI platforms, remains less understood. Initial studies on RSI platform
interfaces suggest that interpreters tend to favour information-rich, feature-packed inter-
faces offered by bespoke platforms. However, much of their work is currently conducted on
generic platforms like Zoom, which employ minimalist interfaces that may negatively affect
both interpreting quality and user experience (Saeed et al., 2023).
communication. Short and his colleagues considered the capability of different communi-
cation technologies to create and maintain a sense of presence, that is, a feeling of being
physically present with others. They concluded that technologies offering fewer non-verbal
and visual cues diminish this sense of presence, making interactions feel more impersonal
and potentially affecting communication quality. Conversely, media that transmit more
visual and auditory cues create a stronger sense of presence and are seen as supporting more
effective communication.
However, studies exploring the visual ecology of VMI, including spatial arrangements and the use of visual and embodied resources, suggest that VMI is visually more complex than on-site interpreting and can affect communication. Although video links provide visual
cues, as captured by cameras, these cues offer only a partial view of the remote partici-
pants’ communicative context and are often less effective than the cues available in on-site
interpreting (Braun, 2004, 2012, 2017; Davitti and Braun, 2020). For instance, the way the
camera frames the remote site can obscure important visual details, such as hand gestures,
or introduce distractions, such as someone entering the room or off-camera activities in
the remote space. If the interpreter's hands are not visible, participants may fail to realise that the interpreter is taking notes or reading from them, potentially misinterpreting this as uncertainty and thereby losing trust in the interpreter. The distance between remote par-
ticipants and the camera, along with the size of the screen on which they are viewed, can
also lead to important cues being missed. For example, if a participant is far away from
the camera or viewed on a small screen, facial expressions and small hand gestures may go
unnoticed. This issue is particularly relevant when interpreting for older individuals whose
facial expressions may be harder to read (Gilbert et al., 2022). Equally important, the view-
ing distance and angle of the screen showing remote participants can influence how they are
perceived (Benforado, 2010; Singureanu et al., 2023).
Further evidence for the lower effectiveness of visual cues in VMI compared to on-site
interpreting comes from recent research on interaction in VMI. Hansen (2024) examined
interpreter-initiated repair sequences in VMI during healthcare encounters, highlighting
that while non-verbal repair initiators are less intrusive for the participants than verbal
utterances, the visual constraints of VMI may prevent non-verbal repairs from being per-
ceived by the remote participants. Exploring how gaze works for turn-taking in VMI,
Vranjes (2024) found that while gaze patterns in VMI are similar to those in face-to-face
interactions, their effectiveness in coordinating turns is reduced, suggesting that interpret-
ers may need to adopt more explicit turn-taking strategies, such as prolonged gaze at the
screen. De Boe (2024) compared interactional aspects in VRI, telephone-mediated, and
on-site interpreting, arguing that the remote modalities require more grounding to ensure
mutual understanding between participants than on-site interpreting and that VRI leads to
turn-taking issues and ineffective gestures.
Specifically in situations of VRI, achieving the triangular positioning typical of on-site
dialogue interpreting has been shown to be more challenging, making it harder for the inter-
preter to perceive embodied cues from the co-present interlocutors and to build rapport with
them, and vice versa (Davitti and Braun, 2020; Gilbert et al., 2022; Hansen, 2020; Klammer
and Pöchhacker, 2021). Participants’ awareness that the video camera only captures part of
their environment may prompt adjustments in their positioning relative to the videoconfer-
encing equipment, such as sitting closer together than in in-person interactions (Braun et al.,
2018; Klammer and Pöchhacker, 2021). However, in legal proceedings, such adjustments risk
undermining perceptions of the interpreter’s impartiality (Licoppe et al., 2018).
Furthermore, different configurations of VMI have been shown to alter turn-taking pat-
terns and fragment communication. For example, the use of VRI in police interviews was
found to affect the interpreter’s understanding of the communicative intentions behind some
of the police officers’ statements, leading to misrepresentation of those intentions (Braun,
2017). Similarly, in video-mediated court proceedings where the minority-language speaker
participates remotely while the interpreter is co-located with other participants in court,
research has highlighted instances where the sequential order of turns was altered. This
occurred because participants present in court failed to notice requests from the remote
minority-language speaker to continue speaking after the first part of their utterance had
been interpreted (Licoppe et al., 2018).
Equally important, this configuration has also been shown to affect the mode of inter-
preting, which, in turn, influences the dynamics of the proceedings (Singureanu et al.,
2023). When the minority-language speaker attends court remotely and the interpreter
is in court, simultaneous interpreting is not feasible unless the videoconferencing system
provides additional audio channels. In such situations, interpreters have been observed to
adopt alternative strategies, especially alternating between simultaneous interpreting, by
briefly speaking over the participants in court, and consecutive interpreting, making use of brief pauses in the participants' speech. However, these coping strategies can have an impact on the
court proceedings, potentially leading to errors or loss of information, and contribute to
increased stress for interpreters (Singureanu et al., 2023). Ethnographic research in legal
settings also highlights the negative impact of imbalanced participant distributions, such as
when the defendant is the only remote participant, on their ability to follow court proceed-
ings and intervene (Ellis, 2004; Fowler, 2018).
Research on the interactional aspects of VCI configurations with multiple sites remains
limited. Rosenberg (2007) suggested that interaction in three-way telephone links, where the
primary interlocutors and the interpreter are all located separately, may be less problematic
than remote interpreting via telephone, where the primary interlocutors are co-located, as
the three-way link can place participants on a more equal footing. However, Braun (2004,
2007), in her analysis of interaction in VCI configurations where the primary participants
were in different locations and interpreters worked from a third site, providing simultane-
ous interpreting, concluded that this set-up creates its own interactional challenges, particu-
larly requiring greater coordination efforts from the interpreter. Despite identifying these
challenges, interaction-focused research on VMI has also underscored the potential for
adaptation to VMI, which will be explored in the next section.
to mitigate the perceived lack of presence (Braun, 2017). Cavents and De Wilde’s (2024)
finding that the face work carried out by both the primary participants and the interpreter
to maintain mutual rapport was directed primarily at protecting the others' faces rather than their own may also serve as evidence of strategic adaptation to bridge the perceived
gap in presence in VMI.
However, one setting in particular has required wide-ranging adaptation strategies from
interpreters, namely, VCI in court, when defendants attend by video link and interpreters are located in the courtroom. As explained in Section 2.4.2, this configuration makes it impossible
to provide whispered simultaneous interpreting to the defendant, which leads to problems
when court participants do not pause to let the interpreter deliver consecutive interpretation.
Interpreters have been observed adopting strategies such as delivering simultaneous inter-
preting in a normal voice – despite the potential disruption – or using brief pauses in speech
to deliver short target-language segments. They also alternated between the two methods
to align with the pace of proceedings (Singureanu et al., 2023). Emotional intelligence may
influence these strategic choices and adaptation abilities (Singureanu, 2023).
In relation to VMI in healthcare settings and conference interpreting, respectively, recent
surveys indicate adaptations to the physical separation from the participants and/or booth-
mates, such as using additional devices for text communication or muted video chats with
interpreter colleagues (Zhang et al., 2024, for healthcare settings; Buján and Collard, 2022;
Chmiel and Spinolo, 2022, for conference settings). However, the effectiveness of these
strategies warrants further investigation. Moser-Mercer (2005) suggests that experienced
interpreters might struggle to adapt to remote interpreting due to reliance on automated
processes, whereas novice interpreters, especially those trained in new modalities, may
adapt more readily.
requires detailed research into the variables that shape interpreters’ experiences and atti-
tudes towards VMI. In addition, emerging AI-powered language technologies may alleviate
challenges such as high cognitive load and fatigue and contribute to sustaining interpreting
quality in VMI. Initial research on integrating AI-powered automatic speech recognition
into RSI and into VMI workflows in healthcare and legal interpreting has shown prom-
ising results (e.g. Rodríguez González et al., 2023; Tan et al., 2024; Tang et al., 2024,
respectively).
Another aspect highlighted by the growth of VMI is that technology-mediated environ-
ments require adaptation on the part of both interpreters and users of interpreting ser-
vices who interact with these environments (Davitti and Braun, 2020). This will need to be
addressed through training and upskilling of both interpreters and service users in continu-
ing professional development programmes as well as being reflected in institutional VMI
policies. As a complementary development, further effort is required in terms of standardi-
sation, which has so far mainly addressed RSI in the context of conference interpreting (see
Pérez Guarnieri and Ghinos, this volume). Exceptions are the German DIN 8578 standard,
which covers requirements and recommendations for consecutive distance interpreting, and
the minimum standards for VMI developed in the EU WEBPSI project (see Sin-
gureanu and Braun, this volume).
References
AIIC, 2000. Code for the Use of New Technologies in Conference Interpreting. AIIC Technical and
Health Committee. URL https://web.archive.org/web/20020429100556/www.aiic.net/ViewPage.
cfm?page_id=120 (accessed 10.10.2024).
AIIC, 2018. AIIC Position on Distance Interpreting. AIIC Executive Committee. URL https://aiic.org/
document/4837/AIIC_position_on_TFDI_05.03.18.pdf (accessed 10.10.2024).
Anttila, A., Rappaport, D.I., Tijerino, J., Zaman, N., Sharif, I., 2017. Interpretation Modalities Used
on Family-Centered Rounds: Perspectives of Spanish-Speaking Families. Hospital Pediatrics 7(8),
492–498. URL https://doi.org/10.1542/hpeds.2016-0209
Azarmina, P., Wallace, P., 2005. Remote Interpretation in Medical Encounters: A Systematic Review.
Journal of Telemedicine and Telecare 11(3), 140–145. URL https://doi.org/10.1258/135763305
3688679
Balogh, K., Hertog, E., 2012. AVIDICUS Comparative Studies – Part II: Traditional, Videoconference
and Remote Interpreting in Police Interviews. In Braun, S., Taylor, J., eds., Videoconference and
Remote Interpreting in Legal Proceedings. Intersentia, Antwerp, 119–136.
Benforado, A., 2010. Frames of Injustice: The Bias We Overlook. Indiana Law Journal 85(4), 1333.
URL www.repository.law.indiana.edu/ilj/vol85/iss4/8
Böcker, M., Anderson, D., 1993. Remote Conference Interpreting Using ISDN Videotelephony: A
Requirements Analysis and Feasibility Study. Proceedings of the Human Factors and Ergonomics
Society 1(3), 235–239. URL https://doi.org/10.1177/154193129303700305
Braun, S., 2001. ViKiS – Videokonferenz mit integriertem Simultandolmetschen für kleinere und
mittlere Unternehmen. In Beck, U., Sommer, W., eds. Proceedings of LearnTec 2001: European
Congress and Trade Fair for Educational and Information Technology, 9th, Karlsruhe, Germany.
Karlsruhe, 263–273.
Braun, S., 2004. Kommunikation unter widrigen Umständen? Einsprachige und gedolmetschte Kom-
munikation in der Videokonferenz. Gunter Narr, Tübingen.
Braun, S., 2007. Interpreting in Small-Group Bilingual Videoconferences. Interpreting 9(1), 21–46.
URL https://doi.org/10.1075/intp.9.1.03bra
Braun, S., 2012. Recommendations for the Use of Video-Mediated Interpreting in Criminal Proceed-
ings. In Braun, S., Taylor, J., eds. Videoconference and Remote Interpreting in Criminal Proceed-
ings. Intersentia, Antwerp, 301–328.
Braun, S., 2013. Keep Your Distance? Remote Interpreting in Legal Proceedings. Interpreting 15(2),
200–228. URL https://doi.org/10.1075/intp.15.2.03bra
Braun, S., 2014. Comparing Traditional and Remote Interpreting in Police Settings: Quality and
Impact Factors. In Viezzi, M., Falbo, C., eds., Traduzione e interpretazione per la società e le
istituzioni. Edizioni Università, Trieste, 1–12.
Braun, S., 2015. Remote Interpreting. In Jourdenais, R., Mikkelson, H., eds. The Routledge Hand-
book of Interpreting. Routledge, London, 352–367.
Braun, S., 2016. The European AVIDICUS Projects: Collaborating to Assess the Viability of
Video-Mediated Interpreting in Legal Proceedings. European Journal of Applied Linguistics 4(1),
173–180. URL https://doi.org/10.1515/eujal-2016-0002
Braun, S., 2017. What a Micro-Analytical Investigation of Additions and Expansions in Remote
Interpreting Can Tell Us About Interpreters’ Participation in a Shared Virtual Space. Journal of
Pragmatics 107, 165–177. URL https://doi.org/10.1016/j.pragma.2016.09.011
Braun, S., 2018. Video-Mediated Interpreting in Legal Settings in England. Translation and Interpret-
ing Studies 13(3), 393–420. URL https://doi.org/10.1075/tis.00022.bra
Braun, S., 2024. Distance Interpreting as a Professional Profile. In Massey, G., Ehrensberger-Dow,
M., Angelone, E., eds. Handbook of the Language Industry: Contexts, Resources and Profiles. De
Gruyter, Berlin, 449–472. URL https://doi.org/10.1515/9783110716047-020
Braun, S., Al Sharou, K., Temizöz, Ö., 2023. Technology Use in Language-Discordant Interpersonal
Healthcare Communication. In The Routledge Handbook of Public Service Interpreting. Rout-
ledge, London, 89–105. URL https://doi.org/10.4324/9780429298202-8
Braun, S., Balogh, K., 2015. Bilingual Videoconferencing in Legal Proceedings: Findings from the
AVIDICUS Projects. In Proceedings of the Conference: Electronic Protocol – a Chance for Trans-
parent and Fast Trial. Polish Ministry of Justice, Warsaw, 21–34.
Braun, S., Davitti, E., Dicerto, S., 2018. Video-Mediated Interpreting in Legal Settings: Assessing
the Implementation. In Napier, J., Skinner, R., Braun, S., eds. Here or There: Research on Inter-
preting via Video Link. Gallaudet University Press, Washington, DC, 144–180. URL https://doi.
org/10.2307/j.ctv2rh2bs3.9
Braun, S., Taylor, J., eds., 2012a. Videoconferencing and Interpreting in Criminal Proceedings.
Intersentia, Antwerp.
Braun, S., Taylor, J., 2012b. Video-Mediated Interpreting: An Overview of Practice and Research.
In Braun, S., Taylor, J., eds. Videoconference and Remote Interpreting in Criminal Proceedings.
Intersentia, Antwerp, 33–68.
Braun, S., Taylor, J., 2012c. Video-Mediated Interpreting in Criminal Proceedings: Two European
Surveys. In Braun, S., Taylor, J., eds. Videoconference and Remote Interpreting in Criminal Pro-
ceedings. Intersentia, Antwerp, 69–98.
Braun, S., Taylor, J., 2012d. AVIDICUS Comparative Studies – Part I: Traditional Interpreting and
Remote Interpreting in Police Interviews. In Braun, S., Taylor, J.L., eds. Videoconference and
Remote Interpreting in Criminal Proceedings. Intersentia, Antwerp, 99–117.
Braun, S., Taylor, J., Miler-Cassino, J., Rybińska, Z., Balogh, K., Hertog, E., Vanden Bosch, Y., Rombouts,
D., 2012. Training in Video-Mediated Interpreting in Criminal Proceedings. In Braun, S., Taylor, J., eds.,
Videoconference and Remote Interpreting in Criminal Proceedings. Intersentia, Antwerp, 233–288.
Buján, M., Collard, C., 2022. Remote Simultaneous Interpreting and COVID-19: Conference
Interpreters’ Perspective. In Liu, K., Cheung, A., eds. Translation and Interpreting in the Age of
COVID-19. Springer, Singapore, 133–150. URL https://doi.org/10.1007/978-981-19-6680-4_7
Cavents, D., De Wilde, J., 2024. Face-Work in Video Remote Interpreting: A Multimodal
Micro-Analysis. In de Boe, E., Vranjes, J., Salaets, H., eds. Interactional Dynamics in Remote
Interpreting: Micro-Analytical Approaches. Routledge, London, 155–175. URL https://doi.org/10.4324/9781003267867-8
Chmiel, A., Spinolo, N., 2022. Testing the Impact of Remote Interpreting Settings on Interpreter
Experience and Performance: Methodological Challenges Inside the Virtual Booth. Transla-
tion, Cognition and Behavior 5(2), 250–274. URL https://doi.org/10.1075/tcb.00068.chm
Corpas Pastor, G., Gaber, M., 2020. Remote Interpreting in Public Service Settings: Technology, Per-
ceptions and Practice. SKASE Journal of Translation and Interpretation 13(2), 58–78.
Davitti, E., Braun, S., 2020. Analysing Interactional Phenomena in Video Remote Interpreting in Col-
laborative Settings: Implications for Interpreter Education. The Interpreter and Translator Trainer
14(3), 279–302. URL https://doi.org/10.1080/1750399X.2020.1800364
De Boe, E., 2020. Remote Interpreting in Healthcare Settings: A Comparative Study on the Influence
of Telephone and Video Link Use on the Quality of Interpreter-Mediated Communication. Uni-
versity of Antwerp, Antwerp.
De Boe, E., 2024. Synchronization of Interaction in Healthcare Interpreting by Video Link and Tel-
ephone. In de Boe, E., Vranjes, J., Salaets, H., eds. Interactional Dynamics in Remote Interpreting:
Micro-Analytical Approaches. Routledge, London, 22–41.
Devaux, J., 2018. Technologies and Role-Space: How Videoconference Interpreting Affects the Court
Interpreter’s Perception of Her Role. In Fantinuoli, C., ed. Interpreting and Technology. Language
Science Press, Berlin, 91–117.
DG SCIC, 2019. Interpreting Platforms. Consolidated Test Results and Analysis. European Commission
Directorate General for Interpretation (DG SCIC). URL https://knowledge-centre-interpretation.
education.ec.europa.eu/sites/default/files/interpreting_platforms_-_consolidated_test_results_
and_analysis_-_def.pdf (accessed 23.5.2023).
Ellis, S.R., 2004. Videoconferencing in Refugee Hearings. Ellis Report to the Immigration and Ref-
ugee Board Audit and Evaluation Committee. URL www.irb-cisr.gc.ca/Eng/transp/ReviewEval/
Pages/Video.aspx#analysis (accessed 3.10.2022).
Fantinuoli, C., Marchesini, G., Landan, D., Horak, L., 2022. KUDO Interpreter Assist: Automated
Real-Time Support for Remote Interpretation. arXiv:2201.01800 [cs.CL]. URL https://doi.
org/10.48550/arxiv.2201.01800
Fowler, Y., 2018. Interpreted Prison Video Link: The Prisoner’s Eye View. In Napier, J., Skinner, R.,
Braun, S., eds. Here or There: Research on Interpreting via Video Link. Gallaudet University Press,
Washington, DC, 183–209. URL https://doi.org/10.2307/j.ctv2rh2bs3.10
Gilbert, A.S., Croy, S., Hwang, K., LoGiudice, D., Haralambous, B., 2022. Video Remote Interpret-
ing for Home-Based Cognitive Assessments. Interpreting. International Journal of Research and
Practice in Interpreting. John Benjamins Publishing Company 24(1), 84–110. URL https://doi.
org/10.1075/intp.00065.gil
Hale, S., Goodman-Delahunty, J., Martschuk, N., Lim, J., 2022. Does Interpreter Location Make a
Difference? Interpreting 24(2), 221–253. URL https://doi.org/10.1075/intp.00077.hal
Hansen, J.P.B., 2020. Invisible Participants in a Visual Ecology: Visual Space as a Resource for Organ-
ising Video-Mediated Interpreting in Hospital Encounters. Social Interaction. Video-Based Studies
of Human Sociality 3(3). URL https://doi.org/10.7146/si.v3i3.122609
Hansen, J.P.B., 2024. Interpreters’ Repair Initiators in Video-Mediated Environments. In de Boe,
E., Vranjes, J., Salaets, H., eds. Interactional Dynamics in Remote Interpreting: Micro-Analytical
Approaches. Routledge, London, 91–112.
Jumpelt, R.W., 1985. The Conference Interpreter's Working Environment Under the New ISO and IEC
Standards. Meta 30(1), 82–90. URL https://doi.org/10.7202/003278AR
Klammer, M., Pöchhacker, F., 2021. Video Remote Interpreting in Clinical Communication: A
Multimodal Analysis. Patient Education and Counseling 104(12), 2867–2876. URL https://doi.
org/10.1016/j.pec.2021.08.024
Licoppe, C., Verdier, M., 2013. Interpreting, Video Communication and the Sequential Reshaping of
Institutional Talk in the Bilingual and Distributed Courtroom. International Journal of Speech,
Language and the Law 20(2), 247–275. URL https://doi.org/10.1558/ijsll.v20i2.247
Licoppe, C., Verdier, M., Veyrier, C.A., 2018. Voice, Power and Turn-Taking in Multilingual, Con-
secutively Interpreted Courtroom Proceedings with Video Links. In Napier, J., Skinner, R., Braun,
S., eds. Here or There: Research on Interpreting via Video Link. Gallaudet University Press, Wash-
ington, DC, 299–322. URL https://doi.org/10.2307/j.ctv2rh2bs3.14
Lion, K.C., Brown, J.C., Ebel, B.E., Klein, E.J., Strelitz, B., Gutman, C.K., Hencz, P., Fernandez, J.,
Mangione-Smith, R., 2015. Effect of Telephone vs Video Interpretation on Parent Comprehension,
Communication, and Utilization in the Pediatric Emergency Department a Randomized Clinical Trial.
JAMA Pediatrics 169(12), 1117–1125. URL https://doi.org/10.1001/jamapediatrics.2015.2630
Locatis, C., Williamson, D., Gould-Kabler, C., Zone-Smith, L., Detzler, I., Roberson, J., Maisiak, R.,
Ackerman, M., 2010. Comparing In-Person, Video, and Telephonic Medical Interpretation. Journal
of General Internal Medicine 25(4), 345–350. URL https://doi.org/10.1007/s11606-009-1236-x
Luff, P., Heath, C., Kuzuoka, H., Hindmarsh, J., Yamazaki, K., Oyama, S., 2003. Fractured Ecologies:
Creating Environments for Collaboration. Human – Computer Interaction 18(1–2), 51–84. URL
https://doi.org/10.1207/S15327051HCI1812_3
Marshall, L.C., Zaki, A., Duarte, M., Nicolas, A., Roan, J., Colby, A.F., Noyes, A.L., Flores, G.,
2019. Promoting Effective Communication with Limited English Proficient Families: Implementa-
tion of Video Remote Interpreting as Part of a Comprehensive Language Services Program in a
Children’s Hospital. Joint Commission Journal on Quality and Patient Safety 45(7), 509–516.
URL https://doi.org/10.1016/j.jcjq.2019.04.001
Miler-Cassino, J., Rybińska, Z., 2012. AVIDICUS Comparative Studies – Part III: Traditional Inter-
preting and Videoconference Interpreting in Prosecution Interviews. In Braun, S., Taylor, J., eds.
Videoconference and Remote Interpreting in Criminal Proceedings. Intersentia, Antwerp, 117–136.
Moser-Mercer, B., 2003. Remote Interpreting: Assessment of Human Factors and Performance
Parameters. Communicate! AIIC (Summer 2003-Being There), 1–25.
Moser-Mercer, B., 2005. Remote Interpreting: Issues of Multi-Sensory Integration in a Multilingual
Task. Meta 50(2), 727–738. URL https://doi.org/10.7202/011014ar
Mouzourakis, P., 1996. Videoconferencing. Interpreting 1(1), 21–38. URL https://doi.org/10.1075/
intp.1.1.03mou
Mouzourakis, P., 2006. Remote Interpreting: A Technical Perspective on Recent Experiments. Inter-
preting 8(1), 45–66. URL https://doi.org/10.1075/intp.8.1.04mou
Napier, J., 2012. Here or There? An Assessment of Video Remote Signed Language Interpreter-Mediated
Interaction in Court. In Braun, S., Taylor, J., eds. Videoconference and Remote Interpreting in
Criminal Proceedings. Intersentia, Antwerp, 145–185.
Napier, J., Skinner, R., Braun, S., eds., 2018a. Here or There: Research on Interpreting via Video
Link. Gallaudet University Press, Washington, DC. URL https://doi.org/10.2307/j.ctv2rh2bs3
Napier, J., Skinner, R., Braun, S., 2018b. Interpreting via Video Link: Mapping of the Field. In
Napier, J., Skinner, R., Braun, S., eds. Here or There: Research on Interpreting via Video Link.
Gallaudet University Press, Washington, DC, 11–35. URL https://doi.org/10.2307/j.ctv2rh2bs3.4
Napier, J., Skinner, R., Turner, G.H., 2017. “It’s Good for Them but Not so for Me”: Inside the
Sign Language Interpreting Call Centre. Translation and Interpreting 9(2), 1–23. URL https://doi.
org/10.12807/ti.109202.2017.a01
Nestler, F., 1957. Tel-Interpret: Begründung und Grundlagen eines deutschen Telefon-Dolmetschdienstes.
Lebende Sprachen 2(1), 21–23.
Nimdzi, 2023. Nimdzi Interpreting Index: Ranking of Top Interpreting Service Providers. URL www.
nimdzi.com/interpreting-index-top-interpreting-companies/ (accessed 10.10.2024).
Ozolins, U., 2011. Telephone Interpreting: Understanding Practice and Identifying Research Needs.
Translation & Interpreting 3(2), 33–47.
Price, E.L., Pérez-Stable, E.J., Nickleach, D., López, M., Karliner, L.S., 2012. Interpreter Per-
spectives of In-Person, Telephonic, and Videoconferencing Medical Interpretation in Clinical
Encounters. Patient Education and Counseling 87(2), 226–232. URL https://doi.org/10.1016/j.
pec.2011.08.006
Rodríguez González, E., Ahmed Saeed, M., Korybski, T., Davitti, E., Braun, S., 2023. Reimagining
the Remote Simultaneous Interpreting Interface to Improve Support for Interpreters. In Ferreiro
Vázquez, Ó., Moutinho Pereira, A.T.V., Gonçalves Araújo, S.L., eds. Technological Innovation for
Language Learning, Translation and Interpreting. Peter Lang, Berlin, 227–246.
Rosenberg, B.A., 2007. A Data Driven Analysis of Telephone Interpreting. In Wadensjö, C., Dim-
itrova, B.E., Nilsson, A.-L., eds. The Critical Link 4 Professionalisation of Interpreting in the Com-
munity. Selected Papers from the 4th International Conference on Interpreting in Legal, Health
and Social Service Settings, Stockholm, Sweden, 20–23 May 2004. John Benjamins, Amsterdam,
65–76. URL https://doi.org/10.1075/btl.70.09ros
Roziner, I., Shlesinger, M., 2010. Much Ado About Something Remote. Interpreting 12(2), 214–247.
URL https://doi.org/10.1075/intp.12.2.05roz
Saeed, M.A., Rodríguez González, E., Korybski, T., Davitti, E., Braun, S., 2023. Comparing Interface
Designs to Improve RSI Platforms: Insights from an Experimental Study. In Orǎsan, C., Mitkov,
R., Corpas Pastor, G., Moni, J., eds. International Conference on Human-Informed Translation
and Interpreting Technology (HiT-IT 2023), Naples, Italy, 7–9.7.2023, 147–156. URL https://
hit-it-conference.org/wp-content/uploads/2023/07/HiT-IT-2023-proceedings.pdf
Schulz, T.R., Leder, K., Akinci, I., Ann Biggs, B., 2015. Improvements in Patient Care: Videoconfer-
encing to Improve Access to Interpreters During Clinical Consultations for Refugee and Immigrant
Patients. Australian Health Review 39(4), 395–399. URL https://doi.org/10.1071/AH14124
Seeber, K.G., Keller, L., Amos, R., Hengl, S., 2019. Expectations vs. Experience. Interpreting 21(2),
270–304. URL https://doi.org/10.1075/intp.00030.see
Short, J., Williams, E., Christie, B., 1976. The Social Psychology of Telecommunications. John Wiley,
New York.
Singureanu, D., 2023. Managing the Demands of Video-Mediated Court Interpreting: Strategies and
the Role of Emotional Intelligence (dissertation), University of Surrey, Surrey. URL https://doi.
org/10.15126/thesis.900664
Singureanu, D., Hieke, G., Gough, J., Braun, S., 2023. 'I am His Extension in the Courtroom.' How
Court Interpreters Cope with the Demands of Video-Mediated Interpreting in Hearings with
Remote Defendants. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Cur-
rent and Future Trends. John Benjamins, Amsterdam, 72–108. URL https://doi.org/10.1075/
ivitra.37.04sin
Tan, S., Orăsan, C., Braun, S., 2024. Integrating Automatic Speech Recognition into Remote Health-
care Interpreting: A Pilot Study of its Impact on Interpreting Quality. Proceedings of Translating and
the Computer 2024 (TC46), 175–191. URL https://asling.org/tc46/wp-content/uploads/2025/03/
TC46-proceedings.pdf
Tang, W., Singureanu, D., Wang, F., Orăsan, C., Braun, S., 2024. Integrating Automatic Speech Rec-
ognition in Remote Interpreting Platforms: An Initial Assessment. Presentation at the CIOL Inter-
preters Day, 16.3.2024, London.
Terry, M., Johnson, S., Thompson, P., 2010. Virtual Court Pilot Outcome Evaluation. Min-
istry of Justice (UK) Research Series 21/10. URL https://assets.publishing.service.gov.uk/
media/5a7b66ff40f0b6425d592eaf/virtual-courts-pilot-outcome-evaluation.pdf
UNESCO, 1976. A Teleconference Experiment: A Report on the Experimental Use of the Sympho-
nie Satellite to Link UNESCO Headquarters in Paris with the Conference Centre in Nairobi.
UNESCO, Paris.
Vranjes, J., 2024. Where to Look? On the Role of Gaze in Regulating Turn-Taking in Video Remote
Interpreting. In de Boe, E., Vranjes, J., Salaets, H., eds. Interactional Dynamics in Remote Inter-
preting: Micro-Analytical Approaches. Routledge, London, 113–134.
Warnicke, C., Plejert, C., 2012. Turn-Organisation in Mediated Phone Interaction Using Video
Relay Service (VRS). Journal of Pragmatics 44(10), 1313–1334. URL https://doi.org/10.1016/j.
pragma.2012.06.004
Yabe, M., 2020. Healthcare Providers’ and Deaf Patients’ Interpreting Preferences for Critical Care
and Non-Critical Care: Video Remote Interpreting. Disability and Health Journal 13(2), 100870.
URL https://doi.org/10.1016/j.dhjo.2019.100870
Zhang, W., Davitti, E., Braun, S., 2024. Charting the Landscape of Remote Medical Interpreting: An
International Survey of Interpreters Working in Remote Modalities in Healthcare Services. Per-
spectives 1–26. URL https://doi.org/10.1080/0907676X.2024.2382488
Zhu, X., Aryadoust, V., 2022. A Synthetic Review of Cognitive Load in Distance Interpreting:
Toward an Explanatory Model. Frontiers in Psychology 13, 3535. URL https://doi.org/10.3389/
fpsyg.2022.899718
Ziegler, K., Gigliobianco, S., 2018. Present? Remote? Remotely Present! New Technological
Approaches to Remote Simultaneous Conference Interpreting. In Fantinuoli, C., ed. Interpreting
and Technology. Language Science Press, Berlin, 119–139.
3
REMOTE SIMULTANEOUS
INTERPRETING
Agnieszka Chmiel and Nicoletta Spinolo
3.1 Introduction
According to one of the earliest definitions, remote simultaneous interpreting (RSI) is under-
stood as ‘any form of simultaneous interpreting where the interpreter works away from the
meeting room either through a video-conferencing set-up or through a cabled arrangement
close to the meeting facilities, either in the same building or at a neighboring location’
(Moser-Mercer, 2003, 1). RSI is a modality of interpreting that belongs to distance interpreting, a term that may be considered a hypernym for all forms of technology-mediated interpreting (Braun, 2019). Looking at the phenomenon from the point of view of the
participants’ location, Braun (2019, 272) identifies remote interpreting as ‘the situation in
which the interpreter is in a physical space, different from that of the conference venue’,
and teleconference interpreting as ‘the situation in which the whole event is virtual and
therefore participants are connected at a distance, with the interpreter either at one client
location or working from a different venue’. Following this principle, Braun makes a fur-
ther differentiation based on the source of input for the interpreter. This can be audio-only
(audioconference interpreting, when the interpreter is located with one or some of the par-
ticipants; audio remote interpreting, when the interpreter is located remotely) or audio
and video (videoconference interpreting, when the interpreter is located with one or some
of the participants; video remote interpreting, when the interpreter is located remotely).
In a follow-up classification, based on participant location and aimed at considering
post-pandemic modifications in distance interpreting configurations, Braun (2024, 452)
employs the label ‘remote interpreting’ for situations in which clients are all located in the
same venue, and ‘virtual interpreting’ for fully virtual events.
Remote simultaneous interpreting is not a recent phenomenon; its first use, organised by UNESCO, dates back to 1976. Similar experiments were subsequently organ-
ised by the United Nations in the 1970s and the 1980s (Braun, 2015, 346). These initial
instances of remote interpreting in the simultaneous mode, alongside those that followed in
subsequent years, employed traditional booths and interpreting consoles placed away from
the conference venue (Mouzourakis, 2006).
DOI: 10.4324/9781003053248-5
Seminal research on remote interpreting at that time (Moser-Mercer, 2005b; Roziner and
Shlesinger, 2010) also focused on situations where interpreters, although located away from
the conference venue, worked from interpreting booths, sharing the same physical space
with their boothmates, interpreting team, and technical support and using their habitual
interpreting console and equipment. This modality of interpreting was then already con-
sidered ‘a seemingly inevitable shift . . . in Brussels and beyond’ (Roziner and Shlesinger,
2010, 215).
In recent years, with a considerable boost from the COVID-19 pandemic, the shift towards web-based communication has led to RSI becoming more firmly understood as cloud-based simultaneous interpreting (Braun, 2019; Saeed et al., 2022).
In this chapter, we will discuss RSI by presenting key terms and concepts (Section 3.2),
an overview of the communicative contexts in which it can be used (Section 3.3), and of
the development of RSI-related technology (Section 3.4). We will then move on to present-
ing feedback from practitioners (Section 3.5) and discussing critical issues (Section 3.6),
such as sound quality, cognitive load and performance, teamwork and boothmate presence,
stress, and multimodality. Finally, we will present concluding remarks and discuss emerg-
ing trends (Section 3.7).
non-RSI-specific platforms usually include RSI as a further option on the platform. These
have a lean, basic interpreter interface, and there is no actual soft console containing multi-
ple controls for the interpreter (the most popular example, at the time of writing, is Zoom).
In contrast, RSI-specific platforms (or ‘RSI bespoke platforms’, as termed by Saeed, 2023,
12) tend to have slightly more complex interpreter interfaces and soft consoles. These make
multiple options and configurations available for interpreters. For example, besides provid-
ing the platform service, these can also provide technical and logistic support before, dur-
ing, and after the event (examples, at the time of writing, are KUDO, Interprefy, Interactio).
However, it is not only the equipment and technology that are undergoing a terminological update. The extensive diffusion of RSI has brought radical changes to what Pöchhacker (2004, 13) termed the 'constellation' of interaction in the event. While further details will be
discussed in greater depth in Section 3.3, it is worth noting here that this has led to new
terminology being used to describe interaction with boothmates and the rest of the inter-
preting and technical team, who are either co-located or not (AIIC TFDI, 2020).
Although they do not pertain exclusively to the realm of RSI, the concepts of (social)
presence, immersion, and flow are tightly linked to remote communication in general and,
as a consequence, are often mentioned with respect to RSI. Lee (2004, 45) defines social
presence as ‘a psychological state in which virtual (para-authentic or artificial) social actors
are experienced as actual social actors in either sensory or nonsensory ways’ and states that
‘[s]ocial presence occurs when technology users do not notice the para-authenticity of medi-
ated humans and/or the artificiality of simulated nonhuman social actors’; a lack of sense
of presence in a remote interaction leads to a feeling of alienation in participants (Mouzou-
rakis, 2006). Saeed et al. (2022) identify two further concepts linked to presence: 'immersion' and 'flow'. They note that the concept of 'immersion'
appears to overlap with that of ‘presence’, since it refers to ‘the feeling of “existing” within
a virtual world’ (Saeed et al., 2022, 218). The concept of ‘flow’ refers, more broadly, to a
mental state of full immersion and concentration on a task (Saeed et al., 2022, 220).
different access to event information. This includes access to audio and video input and
view of the speakers, slides, and feedback from the audience. As opposed to a fully virtual
constellation, this unequal footing may make the challenges faced by interpreters less vis-
ible, since these are not shared by all participants. Such challenges may include connection
issues, misuse of equipment (microphones, cameras, etc.), poor and incomplete visuals, and
in most cases, no view of the audience to gain feedback.
A third possible event type is hybrid. Here, certain participants (speakers, moderators,
members of the audience) are on-site, while some others are online. Since a hybrid situa-
tion is a partially on-site, partially online event, it may present the challenges of both of the aforementioned modalities. In a hybrid event, the view of the speakers and slides will vary depending on whether they are presenting from home or from the venue. In addition, technical issues may arise either with online presenters or on-site.
The implications of the different kinds of events for interpreters are multiple. The type of event (virtual, hybrid, on-site with only the interpreters online) and the interpreter location directly influence the interpreters' working conditions, role, and sense of social presence
(Lee, 2004 and Section 3.2) in the interaction.
Working locations can also vary and have important ramifications for the interpreters'
work and practice in RSI. While it is evident that, in RSI, interpreters can work from virtu-
ally anywhere, a broad distinction can be made as to whether interpreters work from home
or their private office or from an interpreting hub (a facility ‘equipped with interpretation
booths and remote interpretation equipment. The interpreter is co-located with at least
some other interpreters from the team’ (Buján and Collard, 2022, 139)). Working ‘from
a hub’ therefore implies being co-located with a boothmate, the interpreting team as a
whole, and technical support, while ‘working from home or a private office’ implies being
responsible not only for interpreting but also for technical equipment, and for interaction
with boothmates, colleagues, and technicians, from a distance. Finally, a third possibility is
a mixed remote constellation, where an interpreter (or part of an interpreting team) works
from a hub while others work remotely from home. Interpreting hubs are being set up by
large interpreting companies and agencies, as well as some institutions. One such example
is the European Parliament, where, due to travel restrictions imposed by COVID-19, del-
egates often took the floor remotely, and interpreters would occupy one booth per person
and communicate with their boothmates through the booth window (Jayes, 2023).
Chaves (2020) and Cheung (2022, 115) summarise the main features that differentiate
‘working from home’ from ‘working from a hub’. The overall picture emerging from these
observations is that hubs offer on-site technical support, co-location with boothmate and
easier handover, connection safety and stability, use of hard consoles (although this might
not always be the case), and soundproof settings (although, again, this may vary from hub
to hub). A home setting, on the other hand, makes the interpreter responsible for techni-
cal set-up and management and may present less network stability, increased difficulty in
communication and handover with boothmates, and greater difficulty in accessing remote
technical support (when such support is provided).
However, survey research suggests that, currently, while interpreting hubs do exist, most
interpreters work either from home or from their private office (Buján and Collard, 2022;
Chmiel and Spinolo, 2022). This may be due to multiple reasons, the most likely being that
there is no hub option provided, or that hubs are far from the interpreters’ locations. Also,
some interpreters might prefer to work from their own, personalised workstation and may
see the advantage of less travelling, a reduced carbon footprint, an easier work–life balance,
and even the possibility of accepting more assignments, without travel days in between
events (Mayub Rayaa and Martin, 2022).
Whatever the interpreter's choice of location (home or hub), it has an impact not only on the mode of interaction with their colleagues and other stakeholders but also on their workstation set-up and their ability to customise it to their individual needs. There
are, as a matter of fact, multiple possible configurations of an interpreter’s workstation.
Interpreters can use a single or a double screen, one or two computers, or other devices for
documentation and interaction (Spinolo and Chmiel, 2023).
The question of ‘interpreter location’ is closely linked with ‘boothmate location’, since
‘hub-based’ interpreting usually implies a co-located boothmate, while ‘home-based’ inter-
preting usually means a non-co-located boothmate. In the former case, boothmate support
and interaction can occur in the ‘traditional’ way, by means of prompts written on paper,
gestures, or short oral communications with muted microphones. In the latter case, when
working with a non-co-located boothmate, options for boothmate interaction are multiple.
According to survey research (Chmiel and Spinolo, 2022), most interpreters seem to prefer
using an external chat on a device, different from the platform within which they are per-
forming RSI, to communicate with boothmates. Others prefer connecting with boothmates
via video call on a separate device. Alternatively, others use the chat embedded in their RSI
platform, or an external chat on the same device. Less frequently, interpreters may also communicate with one another via audio or muted video calls on a separate device. While various modes of interaction with boothmates have already been
observed (see, for instance, Chmiel and Spinolo, 2022), research on the impact of booth-
mate location and mode of interaction is still in its infancy (see Section 3.6.4), and results
are only just beginning to emerge.
Research on remote interpreting in the booth has reported an increased sense of alien-
ation for interpreters who are working with traditional equipment and co-located with
the team (Moser-Mercer, 2005b; Roziner and Shlesinger, 2010; Seeber et al., 2019). In a
home-based setting, without a co-located boothmate, this feeling of alienation is bound to
persist or increase. This sense of alienation could be offset by providing appropriate visual
input (Seeber et al., 2019) or by designing interfaces that improve immersion and general
interpreter experience. However, as explained earlier, given the fast evolution of the technology, coupled with the acceleration triggered by the COVID-19 pandemic, the situation is changing quickly, and further research is needed to produce findings that remain relevant to current practice.
Interpreters’ access to visual information from the event is an obvious and important
consequence of the distribution of event participants across the physical or virtual space.
In the case of an on-site event, where only the interpreters are located remotely, interpret-
ers may have different visual access to the event, in terms of speaker view, slide view, or
view of the room and audience. As Seeber (2022) points out, the human field of vision extends 120 degrees both horizontally and vertically, although the eye focuses best on elements at the centre of that field. With this in mind, Seeber (2022) explains, it is easy to understand that even a state-of-the-art technological set-up can hardly offer interpreters the same visuals as an on-site conference setting. This is particularly relevant because, according to research on visual attention distribution during RSI, the speaker frame occupies a large share of professionals' visual attention even when slides are being presented and attract their own share of attention (Chmiel et al., under review).
Interface design and user experience in RSI are also crucial when it comes to providing
an enhanced sense of presence. Saeed et al. (2022) used the focus group methodology to
elicit interpreters’ preferences in terms of interface design. This was followed up (Saeed,
2023) by an experimental study using a validated usability questionnaire, the UEQ (User
Experience Questionnaire; Laugwitz et al., 2008), post-task questionnaires, and follow-up
interviews. Frittella (2023) used post-task questionnaires and interviews to assess the usa-
bility of an RSI platform (SmarTerp) providing AI-based CAI support. Chmiel et al. (under
review) used the UEQ (Laugwitz et al., 2008) to assess user experience in different RSI
configurations (see more in Section 3.4).
To illustrate, besides not having the venue in direct sight, interpreters working in RSI
often use a so-called ‘soft console’ (see Section 3.2 and ISO 20539:2023). This means all
their booth controls are on-screen, rather than on a physical console with buttons and switches. However, hard consoles can at times be integrated and used within RSI platforms
(Fan, 2022). The difference between using a ‘hard’ or a ‘soft’ console does not only lie in
the fact that soft console controls are on-screen together with all other visual input from
the conference (video stream, slides, event chat, booth chat, etc.); this differentiation also
implies variation in location, number, functioning, and visualisation of controls. This some-
times varies noticeably from one platform to another and thus requires a certain degree of
adjustment from professionals when dealing with a new platform or switching regularly
between platforms. This may be particularly true of RSI-specific platforms, which offer a wider variety of options for interpreters and therefore have more functions and controls than non-RSI-specific platforms.
An important element of the interpreter interface relates to boothmate communication. This, too, varies significantly from one platform to another. It can range
from platforms without a specific communication channel for boothmates (e.g. in many
non-RSI-specific platforms) to platforms which are developing virtual booths, either as a
part of the SIDP itself or as a separate backchannel injecting booth sound into non-RSI-specific platforms. In such virtual booths (such as GTBooth), interpreters are not only able
to use a chat feature to communicate but can also see and talk to their boothmates without
being heard by conference participants. This, therefore, allows them to use both gestures
and voice as means of communication.
Interface-related issues are currently being investigated by researchers, and results are
starting to be published. Saeed et al. (2022) elicited opinions and preferences on RSI inter-
faces from two focus groups. The first included three professionals, and the second four
trainee interpreters. Results on visual preferences pointed to a desire for further stand-
ardisation of interfaces to reduce excessive variation (in line with results from Spinolo
and Chmiel, 2023). Results also indicated a feeling of information overload and potential
fatigue relating to visual information that is not regularly used during the interpreting pro-
cess (e.g. controls and buttons). Additionally, results showed a desire for easier communi-
cation with boothmates, through a video feed and a shared notepad and chat. In general,
the authors detected a preference for what they defined as a 'minimal' interface. Based
on the focus groups, Saeed (2023) proceeded to carry out an experimental study with 29
participants to explore how different visual elements of interfaces can support practition-
ers. The author manipulated the following elements: interface (minimal vs maximal, using
mock-up interfaces with no prior usability testing), speaker view (close-up vs gesture view),
and CAI support (ASR vs no ASR), and aimed at exploring the impact of different configu-
rations on user experience and at identifying the most effective way of displaying visual
information to interpreters. Surprisingly, and somewhat in contrast with results from the
focus groups, UEQ scores (Laugwitz et al., 2008) hint at a better overall user experience
with the maximal interface, rather than the minimal. In addition, analysis of post-task
questionnaires completed by participants highlights high individual variation in preference
and, consequently, a desire for interface customisability (in line with results from Spinolo
and Chmiel, 2023).
A more recent development in RSI technology is the integration of AI-powered inter-
preter support within the platform (see, for example, Rodríguez et al., 2021, and Fantinu-
oli et al., 2022). This can come in the form of either a running automatic transcript of the speech or a targeted selection of known problem triggers (Gile, 1995/2009), such as named entities, numbers, and specialised terminology (see Section 3.7).
The picture stemming from interpreter feedback shows that RSI is viewed as a challenge
but is also embraced as an opportunity. Practitioners have demonstrated great flexibility and adaptability in response to the changing nature of the interpreting market.
Improvements have been made in the area of technological factors that influence
sound quality. In September 2020, Zoom introduced a High-Fidelity Music Mode feature
to replace narrow-band audio with higher-clarity broadband audio in its VoIP (Voice over Internet Protocol, a technology standard for transmitting voice over the internet) connection (Zoom Blog, 2022). This development has been seen as both 'promising' (Caniato,
2020b) and ‘challenging’ because it might backfire if clients do not use this feature with
‘good equipment’ (Flerov, 2021).
Experimental studies on the impact of sound quality for interpreters have been scarce.
However, one large-scale study involving more than 80 interpreters examined the effect of sound quality on cognitive load and performance in RSI (Seeber, 2022; Seeber and Pan, 2022). The study manipulated sound quality by using two frequency ranges: 125–15,000 Hz for the high-quality condition and 300–3,400 Hz for the low-quality condition. Results indicated a detrimental effect of low sound quality: interpreters reported increased cognitive load and higher levels of frustration in this condition, and their interpreting performances were judged as 'worse' by independent evaluators.
interpreting quality for lexically dense and fast speeches, with a delivery rate exceeding 140
words per minute (Rodríguez González et al., 2023b).
3.6.3 Stress
Stress is another critical issue that has been examined in the context of RSI. Two semi-
nal studies comparing on-site and remote simultaneous interpreting (Moser-Mercer, 2003;
Roziner and Shlesinger, 2010) in a within-subject design (i.e. comparing the same interpret-
ers performing both tasks) collected both objective and subjective measurements of stress.
The former included salivary cortisol measurements, while the latter included responses to
validated tests. Both studies showed no effect of task on objectively measured stress levels.
Despite numerical differences, interpreters in the study by Moser-Mercer (2003) did not
self-report higher stress levels when working in RSI. However, a larger-sample study by
Roziner and Shlesinger (2010) did show that RSI was indeed perceived as more stressful.
Stressors that were identified as being more significant in RSI included difficulty of delivery
and text, visibility, and lack of feedback from the audience. On the other hand, length of
turn, booth conditions, and technical equipment were judged as being equally stressful in
both remote and on-site interpreting. Interpreters also reported experiencing more exhaus-
tion, cognitive fatigue, and burnout after working in RSI. Both of these studies were con-
ducted in a hub, and it is not known whether these results can be generalised to RSI in a
home office scenario. In line with the findings of the two studies mentioned, Chmiel et al.
(under review) found no differences in self-reported anxiety levels, depending on whether
the boothmate was co-located or non-co-located. However, the study did not include an
on-site condition for comparison. Interestingly, a survey and interview-based study on RSI
by Seeber et al. (2019) found that interpreters' impressions varied: half of their sample judged RSI as more stressful than on-site interpreting, and half perceived it as less stressful.
presence was not found to influence mental demand (understood as the amount of thinking
involved in a task) or temporal demand (time pressure).
One of the most common forms of collaboration in the booth is handover (Seresi and
Láncos, 2022), that is, reassigning the role of active interpreter from one booth partner to
another. Handover may be achieved through eye contact or gestures. These procedures can
fail when performing RSI with a non-co-located boothmate (unless the interpreters are in
a virtual booth and can see each other). Thus, interpreters have turned to using particular
messages or icons in chat applications or agreeing on a predefined time for handover (Seresi
and Láncos, 2022). So far, a single experimental study on RSI has focused on the problem
of handover, confirming such difficulties (Matsushita et al., 2022).
3.6.5 Multimodality
Human–computer interaction is at the core of RSI and increases the multimodal nature of
this type of interpreting. This is paradoxical because, at first glance, remote interpreting
may seem to provide the interpreter with fewer sources of input to process compared to
on-site interpreting. In RSI, input comes predominantly from a screen (or multiple screens)
and a headset, while on-site interpreting is rife with various input channels (visual infor-
mation from the conference room, conference slides, interpreters’ own computer screens,
documentation in the booth, written prompts from the boothmate, and auditory input
both from the floor and the boothmate). However, although reduced to a two-dimensional
screen, information input in RSI is far more complex, as interpreters must juggle screens,
devices, applications, platforms, and systems. This constitutes a challenge and creates a
need for ‘multisensory integration to construct meaning’ (Moser-Mercer, 2005a). How-
ever, when faced with such a challenge, interpreters have been found to show flexibility
and adaptability. When asked how challenging the use of multiple input channels is, interpreters rate attending to multiple channels of information (the speaker's audio, a chat between delegates, communication with the boothmate, written questions from the audience, presentation visuals) as moderately problematic (scoring 5.33 out of 7) and using multiple programmes, applications, systems, and devices as slightly less problematic (scoring 5.0 out of 7) (Spinolo and Chmiel, 2023). Interpreters seem to prefer lean interfaces with a
clearer view of the speaker and their body language (Rodríguez González et al., 2023a).
However, a more fragmented distribution of visual attention over various chat boxes and panels has not been found to lead to poorer performance or higher self-reported cognitive load (Chmiel et al., under review). Recent studies (Frittella, 2023; Saeed et al.,
2022) might help optimise the multimodal working environment in RSI in the future (see
Section 3.3).
3.7 Conclusion
This chapter has presented the main concepts related to RSI, including contexts of use and
technologies used in RSI. It also focused on feedback from practitioners and critical issues
such as sound quality, cognitive load, stress, teamwork, and multimodality. As mentioned,
the interpreting profession is now experiencing disruptive, extensive changes as a result
of technological developments, which are bound to continue and to keep changing the practice of RSI. It is only natural that RSI, a modality of interpreting so dependent on cloud-based infrastructure, absorbs new solutions and benefits from new features.
Although not exclusive to RSI, a trend towards interpreter augmentation, very often
applied to RSI tools, is emerging. AI-powered, computer-assisted interpreting tools are
increasingly being included in RSI platforms in the form of interpreter support (see also
Prandi, this volume). This is particularly noticeable in relation to known problem-triggering items such as numbers and named entities, and in prompting interpreters with specialised terminology via ASR (Defrancq and Fantinuoli, 2021; Frittella, 2023; Prandi, 2023; Saeed
et al., 2022).
Zhang et al. (2023) sketch future scenarios and propose various tools that could be
used to ensure better user experience and support interpreters working remotely. These
include video summarisation (providing a short update to boothmates resuming work
after a break), keyword spotting (identification of keywords in the source text rather than
displaying verbatim transcripts), face and gesture detection and recognition, and interac-
tion management to avoid overlapping speech (for instance, in Q&A sessions). It is also
possible to envisage meetings of the future being interpreted, in some language combina-
tions, via machine interpreting (see Fantinuoli, this volume) or using machine interpreting
post-editing (MIPE) (i.e. using automatically generated, machine-translated subtitles as a
source of interpreted text, to be read out following online editing). A further and more
recent evolution of AI support for interpreters is the use of augmented reality tools to offer
ASR support (Gieshoff and Schuler, 2022). This practice is likely to grow stronger in the
near future and can be applied to both RSI and other simultaneous interpreting contexts.
In this scenario, interpreters wear augmented reality glasses and see AI-generated prompts as part of their 3D visual field.
All these emerging trends constitute potential further areas of enquiry. Additionally, due
to the unprecedented recent developments in generative artificial intelligence, the disruptive
impact of this technology on interpreting will surely be researched in the months and years
to come.
References
AIIC EU ND – EU Negotiating Delegation, 2022. Resolution on Sound Quality. AIIC Website.
URL https://aiic.org/document/10590/Resolution%20on%20Auditory%20Health%20and%20
Sound%20Quality%20v2.pdf (accessed 9.3.2024).
AIIC TFDI – Taskforce on Distance Interpreting, 2020. AIIC COVID-19 Distance Interpreting Rec-
ommendations for Institutions and DI Hubs. AIIC Website. URL https://aiic.org/document/4839/
AIIC%20Recommendations%20for%20Institutions_27.03.2020.pdf (accessed 9.3.2024).
AIIC THC – Technical and Health Committee, 2019. Technical Study on Transmission of Sound and
Image Through Cloud-Based Systems for Remote Interpreting in Simultaneous Mode (Remote
Simultaneous Interpreting – RSI). AIIC Website. URL https://aiic.org/document/4862/Report_
technical_study_RSI_Systems_2019.pdf (accessed 9.3.2024).
Braun, S., 2015. Remote Interpreting. In Pöchhacker, F., ed. The Routledge Encyclopedia of Interpret-
ing Studies, 1st ed. Routledge, Abingdon, 346–348. URL https://doi.org/10.4324/9781315678467
Braun, S., 2019. Technology and Interpreting. In O’Hagan, M., ed. The Routledge Handbook
of Translation and Technology, 1st ed. Routledge, Abingdon, 271–288. URL https://doi.
org/10.4324/9781315311258-19.
Braun, S., 2024. Distance Interpreting as a Professional Profile. In Massey, G., Ehrensberger-Dow,
M., Angelone, E., eds. Handbook of the Language Industry: Contexts, Resources and Profiles. de
Gruyter, Berlin, 449–472. URL https://doi.org/10.1515/9783110716047-020.
Buján, M., Collard, C., 2022. Remote Simultaneous Interpreting and COVID-19: Conference Inter-
preters’ Perspective. In Liu, K., Cheung, A.K.F., eds. Translation and Interpreting in the Age of
COVID-19. Springer, New York, 133–150. URL https://doi.org/10.1007/978-981-19-6680-4_7.
Gile, D., 1995/2009. Basic Concepts and Models for Interpreter and Translator Training. John
Benjamins, Amsterdam. URL https://doi.org/10.1075/btl.8
ISO, 2022. ISO 24019:2022 Simultaneous Interpreting Delivery Platforms. Requirements and
Recommendations. URL www.iso.org/standard/80761.html (accessed 9.3.2024).
ISO, 2023. ISO 20539:2023 Translation, Interpreting and Related Technology: Vocabulary. URL www.iso.org/standard/81379.html (accessed 9.3.2024).
Jayes, T., 2023. Conference Interpreting and Technology: An Institutional Perspective. In Corpas Pas-
tor, G., Defrancq, B., eds. Interpreting Technologies: Current and Future Trends. John Benjamins,
Amsterdam, 217–240. URL https://doi.org/10.1075/ivitra.37.09jay.
Laugwitz, B., Held, T., Schrepp, M., 2008. Construction and Evaluation of a User Experience Ques-
tionnaire. USAB 5298, 63–76. URL https://doi.org/10.1007/978-3-540-89350-9_6
Lee, K.M., 2004. Presence, Explicated. Communication Theory 14(1), 27–50. URL https://doi.
org/10.1111/j.1468-2885.2004.tb00302.x
Matsushita, K., 2022. How Remote Interpreting Changed the Japanese Interpreting Industry: Find-
ings from an Online Survey Conducted During the COVID-19 Pandemic. INContext: Studies in
Translation and Interculturalism 2(2), 167–185. URL https://doi.org/10.54754/incontext.v2i2.22
Matsushita, K., Yamada, M., Ishizuka, H., 2022. How Multiple Visual Input Affects Interpreting Per-
formance in Remote Simultaneous Interpreting (RSI): An Experimental Study. A Conference Pres-
entation at the Third HKBU International Conference on Interpreting, Hong Kong, 7–9.12.2022.
Mayub Rayaa, B., Martin, A., 2022. Remote Simultaneous Interpreting: Perceptions, Practices
and Developments. The Interpreters’ Newsletter 27, 21–42. URL https://doi.org/10.13137/
2421-714X/34390
Moser-Mercer, B., 2003. Remote Interpreting: Assessment of Human Factors and Performance
Parameters. AIIC Webzine, Summer, 1–17.
Moser-Mercer, B., 2005a. Remote Interpreting: Issues of Multi-Sensory Integration in a Multilingual
Task. Meta 50(2), 727–738. URL https://doi.org/10.7202/011014ar
Moser-Mercer, B., 2005b. Remote Interpreting: The Crucial Role of Presence. Bulletin vals-asla 81,
73–97.
Mouzourakis, P., 1996. Videoconferencing: Techniques and Challenges. Interpreting: International
Journal of Research and Practice in Interpreting 1(1), 21–38.
Mouzourakis, P., 2006. Remote Interpreting: A Technical Perspective on Recent Experiments. Inter-
preting: International Journal of Research and Practice in Interpreting 8(1), 45–66. URL https://
doi.org/10.1075/intp.8.1.04mou
Pöchhacker, F., 2004. Introducing Interpreting Studies. Routledge, Abingdon.
Prandi, B., 2023. Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on
Terminology. Language Science Press, Berlin. URL https://zenodo.org/record/7143056
Przepiórkowska, D., 2021. Adapt or Perish: How Forced Transition to Remote Simultaneous Inter-
preting During the COVID-19 Pandemic Affected Interpreters’ Professional Practices. Między
Oryginałem a Przekładem 27(4(54)), 137–159. URL https://doi.org/10.12797/MOaP.27.2021.54.08
Rodríguez González, E., Saeed, M.A., Korybski, T., Davitti, E., Braun, S., 2023a. Reimagin-
ing the Remote Simultaneous Interpreting Interface to Improve Support for Interpreters. In
Ferreiro-Vázquez, Ó., Correia, A., Araújo, S., eds. Technological Innovation Put to the Service
of Language Learning, Translation and Interpreting: Insights from Academic and Professional
Contexts. Peter Lang, Lausanne, 227–246.
Rodríguez, S., Gretter, R., Matassoni, M., Alonso, A., Corcho, O., Rico, M., Falavigna, D., 2021.
SmarTerp: A CAI System to Support Simultaneous Interpreters in Real-Time. In Mitkov, R.,
Sosoni, V., Giguère, J.C., Murgolo, E., Deysel, E., eds. Proceedings of the Translation and Inter-
preting Technology Online Conference. INCOMA Ltd., 102–109.
Rodríguez González, E., Saeed, M.A., Korybski, T., Davitti, E., Braun, S., 2023b. Assessing the Impact
of Automatic Speech Recognition on Remote Simultaneous Interpreting Performance Using the
NTR Model. Say It Again. International Workshop on Interpreting Technologies, Malaga, Spain.
Roziner, I., Shlesinger, M., 2010. Much Ado About Something Remote: Stress and Performance in
Remote Interpreting. Interpreting: International Journal of Research and Practice in Interpreting,
12(2), 214–247. URL https://doi.org/10.1075/intp.12.2.05roz
Saeed, M.A., 2023. Exploring the Visual Interface in Remote Simultaneous Interpreting (PhD thesis).
University of Surrey. URL https://doi.org/10.15126/thesis.901059
Saeed, M.A., González, E.R., Korybski, T., Davitti, E., Braun, S., 2022. Connected Yet Distant: An
Experimental Study into the Visual Needs of the Interpreter in Remote Simultaneous Interpreting.
In Kurosu, M., ed. Human-Computer Interaction: User Experience and Behavior, Vol. 13304.
Springer, Berlin, 214–232. URL https://doi.org/10.1007/978-3-031-05412-9_16
Seeber, K., 2011. Cognitive Load in Simultaneous Interpreting: Existing Theories – New Models.
Interpreting 13(2), 176–204. URL https://doi.org/10.1075/intp.13.2.02see
Seeber, K., 2022. When Less Is Not More: Sound Quality in Remote Interpreting. UN Today web-
site. URL https://untoday.org/when-less-is-not-more-sound-quality-in-remote-interpreting/
(accessed 9.3.2024).
Seeber, K., Pan, D., 2022. Audio Quality in Remote Interpreting. A Conference Presentation at the
Third HKBU International Conference on Interpreting, Hong Kong, 7–9.12.2022.
Seeber, K.G., Keller, L., Amos, R., Hengl, S., 2019. Expectations vs. Experience: Attitudes Towards
Video Remote Conference Interpreting. Interpreting: International Journal of Research and Prac-
tice in Interpreting 21(2), 270–304. URL https://doi.org/10.1075/intp.00030.see
Şengel, Z., 2022. Zooming in: Interpreters Perspective Towards Remote Simultaneous Interpret-
ing (RSI) Ergonomics. Çeviribilim ve Uygulamaları Dergisi. Journal of Translation Studies 33,
169–190.
Seresi, M., Láncos, P.L., 2022. Teamwork in the Virtual Booth – Conference Interpreters’ Experiences
with RSI Platforms. In Liu, K., Cheung, A.K.F., eds. Translation and Interpreting in the Age of
COVID-19. Springer, Berlin, 181–196.
Spinolo, N., 2022. Remote Interpreting. In Franco Aixelá, J., Muñoz Martín, R., eds. ENTI (Enci-
clopedia de traducción e interpretación). AIETI, Asociación Ibérica de Estudios de Traducción e
Interpretación. URL https://zenodo.org/records/6370665
Spinolo, N., Chmiel, A., 2023. Final Report – AIIC Research Grant 2020 Inside the Virtual Booth:
The Impact of Remote Interpreting Settings on Interpreter Experience and Performance. Unpub-
lished report.
Zhang, X., Corpas Pastor, G., Zhang, J., 2023. Videoconference Interpreting Goes Multimodal. Some
Insights and a Tentative Proposal. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies –
Current and Future Trends. John Benjamins, Amsterdam, 169–194.
Zhu, X., Aryadoust, V., 2022. A Synthetic Review of Cognitive Load in Distance Interpreting:
Toward an Explanatory Model. Frontiers in Psychology 13, 899718. URL https://doi.org/10.3389/
fpsyg.2022.899718
Ziegler, K., Gigliobianco, S., 2018. Present? Remote? Remotely Present! New Technological
Approaches to Remote Simultaneous Conference Interpreting. In Fantinuoli, C., ed. Interpret-
ing and Technology. Language Science Press, Berlin, 119–139. URL https://doi.org/10.5281/
ZENODO.1493299
Zoom Blog, 2022. High-Fidelity, Professional-Grade Audio on Zoom. Zoom blog. URL https://blog.
zoom.us/high-fidelity-music-mode-professional-audio-on-zoom/ (accessed 9.3.2024).
4
VIDEO RELAY SERVICE
Camilla Warnicke
4.1 Introduction
Technology has advanced significantly in recent decades. For instance, the ability to make real-time video calls is now a widespread phenomenon. One can contrast this with 1968, when Stanley Kubrick's groundbreaking film 2001: A Space Odyssey astonished audiences by depicting the 'futuristic concept' of communication through sound and video using long-distance videophones. Technology has since evolved to the point that remote communication via video is possible in real life. Today, what used to be considered 'sci-fi-like interaction' has become commonplace for many
people in the world. The possibility for users to see each other while communicating from a
distance has brought about tremendous changes for several societal groups. One such group
is deaf signers, who can now rely on a service that allows them to contact speaking parties through interpreter mediation, known as 'video relay service' (VRS). VRS provides calls between two primary participants: a person using a signed language via videophone, interacting with a person speaking on a 'regular' telephone or smartphone. The interaction between the primary participants is mediated by an interpreter. From their position, often in a call centre, the interpreter observes the signing party on a computer screen and listens to the speaking party via a headset. During this interaction, however, the two primary participants cannot see or hear
each other (see Figure 4.1).
Before the advent of videophones, deaf signers had to meet face-to-face in order to
sign with each other. In fact, it was not until the 1960s that the signing community received
accessible telephone technology. The first example of distance interaction with deaf signers
in real time took place via the exchange of texts produced by a teletypewriter (TTY, or tele-
communications device for the deaf: TDD). TTY and TDD produce typed versions of a spo-
ken language (Brunson, 2011) and bridge the remote communication gap between (voice)
telephone users and those with hearing or speech disabilities. Later evolutions allowed for
text messages to be sent between mobile phones, via a short message service (SMS). It is
worth noting that signing interaction via videophone was launched in the 1990s, whereas
organised interpreting services via videophone for deaf people (VRS) have only been an
accessible option since 1996 (Haualand, 2012).
Today, VRS is provided as a regular service in several countries around the world. The first regular VRS was launched in Europe, in Sweden, in 1996 (Warnicke, 2017). Over time, VRS came into use in other countries, being launched, for example, in the United States in 2000 (Chang and Russell, 2022) and in Thailand in 2011 (Thailand Communications Relay Service Center, 2021). However, most countries in Africa and South America
do not yet have a regular VRS.
VRS contributes to providing greater access and equality for deaf people, via technical
developments and the provision of interpreting. As a result, VRS has made deaf signers' access to societal services more comparable to that of hearing individuals
(Haualand, 2014). The technical set-up of a VRS is, however, not a ‘quick fix’ to improve
inclusion for deaf people in society (cf. De Meulder and Haualand, 2021).
After this brief overview of how VRS has emerged, Section 4.2 will provide a criti-
cal analysis of practices involved in VRS. It will focus on modalities and contrast spoken
and signed language. Subsequently, it will make a link to cognate practices and describe
telephone and video-mediated interactions. Lastly, Section 4.2 will present the procedures
required to make interpreter-mediated phone calls via a VRS. In Section 4.3, discussion
will focus on regulatory considerations relating to interpreters’ billable time and emergency
calls, in order to illustrate differences between VRSs in different nations. Section 4.4 will
conclude the chapter by discussing potential future developments in this domain.
interpreter. The next sections will individually address and analyse modalities, media and
institutional interaction relating to VRS. The different aspects will then be brought together
to discuss interpreter-mediated calls conducted via a VRS.
4.2.1 Modalities
VRS involves both a signed and a spoken language. It is therefore a bimodal and intercul-
tural practice. The two languages involved differ in modality, in the way they are produced
and perceived. Spoken languages rely on oral speech and auditory perception. Signed lan-
guages, on the other hand, rely on gestural-visual resources, including signs, gestures, and
facial expressions.
Signed languages are structured using conventional linguistic units, indicating that they
are naturally occurring human languages (Stokoe, 2005), distinct from spoken languages
(Brennan, 1990). However, not all signed languages are the same across the globe. As with
spoken languages, there is a wide variety of signed languages across different countries and
regions. In some cases, more than one official signed language can be used within a country
(see, for example, Brentari, 2010). Although signed languages have different signs, some
cultural expressions and elements of interaction will be the same across signed languages.
One example is how a person is named or labelled. While names can be finger-spelled,
letter by letter, in signed languages, it is also common in the deaf community to use a sign
for each person’s name (Börstell, 2017; Supalla, 1990). Another common denominator
across all signed languages is that they have no written form. This means that written text
(often the national spoken official language) constitutes a second language for deaf signers.
To understand the complexities of using a signed language over videophone, via a VRS, one needs to consider the grammatical structure of signed languages. Signs can be categorised according to where on the body a sign is produced, the handshape used, or the movement of the sign. Phonological segments are attributed to each sign, in relation to its movement and hold (Liddell, 1984). These different aspects give the sign its meaning: any change in one of them can change the meaning of the sign.
Signed languages encompass manual signs made by the hands and non-manual signs (Lid-
dell, 1977). Non-manual signs include facial expressions and lip movements that may cor-
respond to spoken words. In addition, signed languages rely on various resources to convey
various linguistic elements. Examples include sentence or clause type (e.g. interrogative
clauses, relative clauses; cf. Sandler and Lillo-Martin, 2006). Although it could be difficult
to identify and decode some aspects of signs via visual media, such as videophones, both
the VRS interpreter and the signing party need to be able to perceive these aspects for com-
munication with the speaking party to take place.
guidelines and norms and by the circumstances and functions of the media used for it (cf.
Heritage and Clayman, 2010). VRS involves interaction both over telephone and video-
phone. Section 4.2.2.1 describes spoken interaction over the telephone, and Section 4.2.2.2 describes the specificities of videophone interaction among deaf signers.
1074). This is a rather uncommon way for signers to pose a question during face-to-face
interaction, but it is seen as a feasible and functional way to do so during video-based inter-
action. Moreover, for a signer to direct a question to one of the two users of a videophone,
they may also have to compensate by using the name sign of their intended addressee. This is
an issue that is normally handled by gaze or by pointing towards a reference in face-to-face
interaction (cf. Keating et al., 2008, 1175–1176).
It is important to remember that participants do not share the same physical space dur-
ing video-mediated conversation. To some extent, using a video link fuses proximity and
distance between them. A shared space appears, which can be considered a ‘novel space’ (cf.
Warnicke and Broth, 2023). This novel space challenges boundaries between what is real
and what is virtual. Consider a scenario where two signing people interact via videophone,
sharing what can be called a ‘front stage’ (Goffman, 2016). This front stage encompasses
everything that is visible to the other person via the webcam. However, not everything that is taking place in the surrounding (real) environment is captured by the webcam. The shared, virtual space captured by the webcam is thus distinct from the 'real' space.
Furthermore, in the real space, the signer can communicate with another individual in the
same physical location, out of view of the webcam. As a result, interaction can take place in
both the virtual space (as captured by the webcam) and the real space (in the near physical
environment, outside of webcam view). However, in video-mediated interaction, it is common for participants to describe who can see the interaction from the real space and who is following what
is being signed in the video call (Keating et al., 2008; Keating and Mirus, 2003).
while the use of the headset is key for the interpreter to hear the speaking party, evidence
also shows that its presence has implications for the interaction between the interpreter and the signing party.
A common procedure in VRS occurs when the interpreter answers an incoming call from
either a signer’s videophone or a speaking party’s telephone. A traditional way of answering
a call is with a greeting (Warnicke, 2021). In this initial phase of the call, the interpreter
receives a telephone number or a videophone email address with which to contact the called party on behalf of the caller. The interpreter dials the number or the email address. When the called party
answers, the interpreter may have to explain the VRS. Following this, the two primary par-
ticipants can begin their interaction, mediated by the interpreter (Warnicke, 2021). These
phases of the call vary between countries, as do the regulations. These differences are elabo-
rated in Section 4.3.1 (‘billable time’).
Interpreting in VRS is carried out in the simultaneous mode. This requires the inter-
preter to interpret into the target language what is signed or said in the source language,
while processing the incoming source language (cf. Leeson, 2005; Riccardi, 2005; Russell,
2005). As explained in Section 4.2.1, the VRS setting involves a range of modalities that are produced and perceived via different means of communication, that is, a gestural-visual signed language and an oral-auditory spoken language. The different modalities present in VRS
facilitate simultaneous interpreting, as the languages used do not interfere with each other.
In contrast to many simultaneous interpreters working in spoken language communication,
VRS interpreters interpret bidirectionally, that is, to and from both signed language and
spoken language.
Signed language can present language-related challenges for interpreters in a VRS. As a
nationwide service, VRS spans a variety of regions and may therefore include regional and
cultural variants of signs that may be unfamiliar to interpreters (Palmer et al., 2012). These
variants may be difficult to render simultaneously without asking for clarification. Further-
more, the use of personal signs for names, which is a cultural and common way for signers
to refer to a name or a person (see Section 4.2.1), could also present an issue for interpreters
(cf. Börstell, 2017; Supalla, 1990). In addition, in trilingual calls (e.g. involving a signed
language, English, and Spanish), challenges include how to render finger-spelled names for
the signing party on the videophone, and how to determine the correct pronunciation, which can differ between English and Spanish (Treviño and Quinto-Pozos, 2018).
The interpreter is also responsible for turn organisation, making decisions about who is
interpreted and when (Warnicke and Plejert, 2012). Turn organisation is managed exclu-
sively by the interpreter, as the signing and speaking participants cannot see or hear each
other. The interpreter can manage the organisation of turns using strategies such as antici-
pating an upcoming utterance or providing expanded or reduced renditions (Warnicke
and Plejert, 2012). Other forms of coordination performed by the interpreter include using
embodied resources, such as hand and body movements, and gaze signals to communicate
with the signing party, and audible signals such as humming to the hearing participant
(Marks, 2018; Warnicke and Plejert, 2012). Due to the division of spaces in VRS, and for
effective communication to take place, it is important that the interpreter inform the pri-
mary participants about what is happening in the novel space between the ‘caller’ or the
‘called’ party and the interpreter. This presents an issue for the interpreter as s/he needs to
define the situation by explaining what is going on. For example, the interpreter will need
to mention if the signing party moves away from the videophone at any point, or if there
some surface similarities, differences persist (Haualand, 2012). One difference that affects
the organisation of VRS is the underlying reason that each nation provides the service. For
instance, in Sweden, it is the need for increased accessibility that drives the provision of
VRS. However, in the United States, civil rights considerations led to the service. Finally,
in Norway, the service is organised as an extension of the regular sign language interpret-
ing service (Haualand, 2012). Each nation’s motive derives from its respective country’s
political, financial, and social structures that are embedded in various networks of actors.
These underlying structures shape each nation’s VRS in terms of organisational structure
and practical performance, that is, the level of interaction a user can have with the service.
As a result, the platforms and technical devices used to provide the service do not follow
identical standards all over the world.
Each country’s national VRS can be mandated by governmental authorities and/or be
managed by independent VRS companies. For instance, in Sweden, only one VRS company
is nominated to provide the service, following state procurement. The United States, by contrast, has multiple VRS providers. In Norway, VRS is provided by local county govern-
ments. Despite some similarities between national services, differences remain substantial.
Regulatory considerations directly impact interpreters’ work and have been identified as
being a considerable stress factor (Bower, 2015; Chang and Russell, 2022). In addition,
regulations have implications within the calls themselves. Examples include differences
between company billing practices (see Section 4.3.1) and the provision of emergency calls
(see Section 4.3.2).
during the calls, via the interpreter’s computer (Brunson, 2011, 55). A scenario from the
United States described interpreters being alerted by a flashing light if the connection
between primary participants was not established within 30 seconds. The monitoring can also
affect the interpreters’ salaries, as the VRS company can control how quickly they con-
nect calls, how much billable time they charge, and by extension, how much money they
generate for the company (cf. Brunson, 2011). Such regulations may lead to calls being set up
quickly, with little preparatory interaction. In VRSs where interpreters are not monitored
and where they are able to interact with callers before interpretation begins, the situation is
less stressful (cf. Warnicke, 2017). A less-stressful situation may provide interpreters with
a greater opportunity to offer higher-quality service. To sum up, enabling interpreters to
prepare for a call before the primary participants are connected could ensure better quality
interpretation. Although billable time may initially seem a small detail, it has great practical consequences for both the interpreter and the interaction.
4.4 Conclusion
The situation regarding VRS around the world is still evolving (cf. Warnicke and Granberg,
2022). Western countries, such as Sweden and the United States, have been providing a
VRS for a quarter of a century, although opening hours, organisation, and devices differ
between the two countries. VRS provision was also forced to expand within a short time due to COVID-19 (De Meulder et al., 2021; Warnicke and Matérne, 2024a, 2024b).
However, some countries have no organised sign language interpreting service at all. Never-
theless, VRS could present a means of solving issues relating to sign language interpreting,
particularly the limited availability of interpreters. Factors such as large geographical areas
with low interpreter coverage, lack of interpreting training, and the need for interpreters to
travel long distances for assignments often contribute to this scarcity. In poor and troubled
areas, where movement is difficult, such as in countries with poorly developed infrastruc-
ture and in war zones, VRS could be a way to provide an accessible interpreting service.
Internet access and smartphones are commonly available in developing countries; however, signers around the world still need the option of making calls via a VRS in their own country.
VRS could place signing people in developing countries on more equal terms with others, in
accordance with global sustainability goals (United Nations, 2015) and the United Nations
Convention on the Rights of Persons with Disabilities (CRPD; United Nations, 2006).
For services that are already in place, technical evolutions could provide a more developed service in the future. One opportunity that is already a reality on some VRS platforms is the option for both primary participants and the interpreter to share visual, auditory, and text input during the call; this is referred to as total conversation (EENA, 2023). Although some VRSs have the technical means to provide total conversation calls, both primary participants also need a visual link for this to work. To the author's knowledge, this is not yet the case in emergency call centres. Total conversation could be seen as one way to optimise emergency calls via VRS. For a caller in a precarious situation, it can be difficult to convey clearly what is needed (e.g. whether they need an ambulance or a fire brigade). For the interpreter, producing a correct rendition is highly demanding yet crucial in such a stressful situation, and all the more difficult in a two-dimensional space (Skinner et al., 2021). A VRS that enables total conversation could give cues that facilitate decisions about how to handle an alarm, which can potentially save lives. A total conversation mode could be a good complement as a default in VRS calls in general and in emergency calls in particular. However, although there have been positive developments in the design of emergency calls in VRS, no country yet offers direct calls using total conversation by default, that is, a call in which the interpreter, the signing help-seeker, and the switchboard operator at the emergency centre can all see and hear each other as well as exchange text.
to keep in mind as VRS is evolving is, however, that even small details, as has been shown,
can have big consequences. The future of interaction within VRS depends on ongoing
development and further advancements.
References
Alley, E., 2014. Who Makes the Rules Anyway? Reality and Perception of Guidelines in Video Relay
Service Interpreting. Interpreters Newsletter 19, 13–26.
Bakker, A.B., Van Veldhoven, M., Xanthopoulou, D., 2010. Beyond the Demand-Control Model.
Journal of Personnel Psychology 9(1), 3–16.
Börstell, C., 2017. Types and Trends of Name Signs in the Swedish Sign Language Community. SKY
Journal of Linguistics 30.
Bower, K., 2015. Stress and Burnout in Video Relay Service (VRS) Interpreting. Journal of Interpretation 24(1).
Brennan, M., 1990. Word Formation in BSL (PhD thesis). Stockholm University, Sweden.
Brentari, D., 2010. Sign Languages. Cambridge University Press, Cambridge.
Brunson, J.L., 2011. Video Relay Service Interpreters: Intricacies of Sign Language Access. Gallaudet
University Press.
Chang, S., Russell, D., 2022. Coming Apart at the Screens: Canadian Video Relay Interpreters and
Stress. Journal of Interpretation 30(1), 1–18.
De Meulder, M., Haualand, H., 2021. Sign Language Interpreting Services: A Quick Fix for Inclusion?
Translation and Interpreting Studies 16(1), 19–40.
De Meulder, M., Pouliot, O., Gebruers, K., 2021. Remote Sign Language Interpreting in Times of
COVID-19. University of Applied Sciences, Utrecht.
Drew, P., Heritage, J., 1992. Analyzing Talk at Work: An Introduction. In Drew, P., Heritage, J., eds.
Talk at Work: Interaction in Institutional Settings. Cambridge University Press, Cambridge, 3–65.
EENA, 2023. Implementation of RTT and Total Conversation in Europe. URL https://eena.org/
knowledge-hub/documents/rtt-and-tc-implementation-in-europe/ (accessed 25.1.2024).
Goffman, E., 2016. The Presentation of Self in Everyday Life. Routledge.
Haualand, H., 2012. Interpreting Ideals and Relaying Rights. University of Oslo, Oslo.
Haualand, H., 2014. Video Interpreting Services: Calls for Inclusion or Redialling Exclusion? Ethnos
79(2), 287–305.
Heritage, J., Clayman, S., 2010. Talk in Action: Interactions, Identities, and Institutions. Wiley-
Blackwell, Chichester.
Hopper, R., 1992. Telephone Conversation, Vol. 724. Indiana University Press.
Hoza, J., 2022. Team Interpreting. In Stone, C., Adam, R., Müller de Quadros, R., Rathmann, C., eds.
The Routledge Handbook of Sign Language Translation and Interpreting. Routledge, 162–178.
Keating, E., Edwards, T., Mirus, G., 2008. Cybersign and New Proximities: Impacts of New Com-
munication Technologies on Space and Language. Journal of Pragmatics 40(6), 1067–1081.
Keating, E., Mirus, G., 2003. American Sign Language in Virtual Space: Interactions Between Deaf
Users of Computer-Mediated Video Communication and the Impact of Technology on Language
Practices. Language in Society 32(5), 693–714.
Kubrick, S., Producer, Director, 1968. 2001: A Space Odyssey. Stanley Kubrick Productions.
Leeson, L., 2005. Making the Effort in Simultaneous Interpreting: Some Considerations for Signed
Language Interpreters. In Janzen, T., ed. Topics in Signed Language Interpreting: Theory and Prac-
tice. John Benjamins Publishing Company, Amsterdam, 51–68.
Liddell, S.K., 1977. An Investigation into the Syntactic Structure of American Sign Language. Univer-
sity of California, San Diego.
Liddell, S.K., 1984. Think and Believe: Sequentiality in American Sign Language. Language 60(2),
372–399.
Liddell, S.K., 2003. Sources of Meaning in ASL Classifier Predicates. Psychology Press.
Linell, P., 1998. Approaching Dialogue: Talk, Interaction and Contexts in Dialogical Perspectives.
John Benjamins Publishing.
Marks, A., 2018. “Hold the Phone!” Turn Management Strategies and Techniques in Video Relay
Service Interpreted Interaction. Translation & Interpreting Studies: The Journal of the American
Translation & Interpreting Studies Association 13(1), 87–109.
Napier, J., Skinner, R., Turner, G.H., 2017. “It’s Good for Them but Not so for Me”: Inside the Sign
Language Interpreting Call Centre. Translation & Interpreting 9(2), 1–23.
Palmer, J.L., Reynolds, W., Minor, R., 2012. “You Want What on Your Pizza!?” Videophone
and Video-Relay Service as Potential Influences on the Lexical Standardization of American Sign
Language. Sign Language Studies 12(3), 371–397.
Peterson, R., 2011. Profession in Pentimento. In Nicodemus, B., Swabey, L., eds. Advances in Inter-
preting Research: Inquiry and Action. John Benjamins Publishing, Amsterdam, 199–223.
Riccardi, A., 2005. On the Evolution of Interpreting Strategies in Simultaneous Interpreting. Meta
50(2), 753–767.
Roman, G.A., Samar, G., 2015. Workstation Ergonomics Improves Posture and Reduces Musculo-
skeletal Pain in Video Interpreters. Journal of Interpretation 24(1), 1–19.
Roy, C.B., 2002. The Problem with Definitions, Descriptions, and the Role Metaphors of Interpreters.
In Pöchhacker, F., Shlesinger, M., eds. The Interpreting Studies Reader. Routledge, London and
New York, 344–353.
Russell, D., 2005. Consecutive and Simultaneous Interpreting. In Janzen, T., ed. Topics in Signed
Language Interpreting. John Benjamins Publishing, 136–164.
Sandler, W., Lillo-Martin, D., 2006. Sign Language and Linguistic Universals. Cambridge University
Press, Cambridge.
Skinner, R., 2023. Would You Like Some Background? Establishing Shared Rights and Duties in
Video Relay Service Calls to the Police. Interpreting and Society 3(1), 46–74.
Skinner, R.A., 2020. Approximately There – Positioning Video-Mediated Interpreting in Frontline
Police Services. International Journal of Interpreter Education 82.
Skinner, R.A., Napier, J., Fyfe, N.R., 2021. The Social Construction of 101 Non-Emergency Video Relay
Services for Deaf Signers. International Journal of Police Science & Management 23(2), 145–156.
Stokoe, W.C., 2005. Sign Language Structure: An Outline of the Visual Communication Systems of
the American Deaf. Journal of Deaf Studies and Deaf Education 10(1), 3–37.
Supalla, S.J., 1990. The Arbitrary Name Sign System in American Sign Language. Sign Language
Studies 67(1), 99–126.
Thailand Communication Relay Service Center, 2021. TTRS Center. URL www.ttrs.or.th (accessed
25.1.2024).
Treviño, R., Quinto-Pozos, D., 2018. Name Pronunciation Strategies of ASL-Spanish-English Trilin-
gual Interpreters During Mock Video Relay Service Calls. Translation and Interpreting Studies
13(1), 71–86.
United Nations, 2006. The United Nations Convention on the Rights of Persons with Disabilities.
(CRPD) (A/RES/61/106). United Nations, New York.
United Nations, 2015. The Sustainable Development Goals. URL www.un.org/sustainabledevelopment/
sustainable-development-goals/ (accessed 25.1.2024).
Wadensjö, C., 2014. Interpreting as Interaction. Routledge.
Warnicke, C., 2017. Tolkning vid förmedlade samtal via Bildtelefoni.net – Interaktion och gemensamt
meningsskapande [The Interpreting of Relayed Calls Through the Service Bildtelefoni.net – Inter-
action and the Joint Construction of Meaning] (PhD thesis). URL https://oru.diva-portal.org/
smash/get/diva2:1089956/FULLTEXT01.pdf.
Warnicke, C., 2018. Co-Creation of Communicative Projects Within the Swedish Video Relay Inter-
preting Service. In Napier, J., Skinner, R., Braun, S., eds. Here or There: Research on Interpreting
via Video Link. Gallaudet University Press, Washington, DC, 210–229.
Warnicke, C., 2019. Equal Access to Make Emergency Calls: A Case for Equal Rights for Deaf Citi-
zens in Norway and Sweden. Social Inclusion 7(1), 173–179.
Warnicke, C., 2021. Signed and Spoken Interaction at a Distance: Interpreter Practices to Strive for
Progressivity in the Beginning of Calls via the Swedish Video Relay Service. Interpreting 23(2),
296–320.
Warnicke, C., Broth, M., 2023. Embodying Dual Actions as Interpreting Practice: How Interpreters
Address Different Parties Simultaneously in the Swedish Video Relay Service. Translation and
Interpreting Studies 18(2), 191–212.
Warnicke, C., Granberg, S., 2022. Interpreter Mediated Interaction Between People Using a Signed
Respective Spoken Language on a Distance in Real Time – a Scoping Review. BMC Health Services
Research 22(387).
Warnicke, C., Matérne, M., 2024a. Sign Language Interpreters’ Experiences of Remote Interpreting
in Light of COVID-19 in Sweden. Interpreting and Society. URL https://journals.sagepub.com/
doi/10.1177/27523810241239779
Warnicke, C., Matérne, M., 2024b. Regulation, Modification, and Evolution of Remote Sign Lan-
guage Interpreting in Sweden – a Service in Progress. BMC Health Services Research 24, 1431.
URL https://doi.org/10.1186/s12913-024-11907-y
Warnicke, C., Plejert, C., 2012. Turn-Organisation in Mediated Phone Interaction Using Video Relay
Service (VRS). Journal of Pragmatics 44(10), 1313–1334.
Warnicke, C., Plejert, C., 2016. The Positioning and Bimodal Mediation of the Interpreter in a Video
Relay Interpreting (VRI) Service Setting. Interpreting: International Journal of Research & Prac-
tice in Interpreting 18(2), 198–230.
Warnicke, C., Plejert, C., 2018. The Headset as an Interactional Resource in a Video Relay Interpret-
ing (VRI) Setting. Interpreting: International Journal of Research & Practice in Interpreting 20(2),
285–308.
Warnicke, C., Plejert, C., 2021. The Use of the Text-Function in Video Relay Service Calls. Text and
Talk 41(3), 391–416.
Wessling, D.M., Shaw, S., 2014. Persistent Emotional Extremes and Video Relay Service Interpreters.
Journal of Interpretation 23(1), 1–21.
5
PORTABLE INTERPRETING EQUIPMENT
Tomasz Korybski
5.1 Introduction
A significant part of this volume inevitably focuses on the most recent technological devel-
opments that have either already disrupted the way interpreting is delivered (such as remote
simultaneous interpreting delivery platforms; see Chmiel and Spinolo, this volume) or are
bound to impact interpreting, such as machine interpreting or automatic speech recogni-
tion. While these developments are undoubtedly of huge importance, a volume devoted to interpreting and technology must strive to offer as comprehensive a picture as possible of all technological solutions used by interpreters. From today's standpoint, some of these may
appear low-tech. However, they still have a place in the interpreter’s toolkit and continue
to successfully serve professionals in circumstances where cloud-based solutions and even
conventional conference equipment simply cannot work. This chapter focuses on one such
solution: portable interpreting equipment, that is, small pocket-sized kits that typically func-
tion based on radio frequencies. This equipment is composed of a transmitter (or multiple
transmitters) and multiple receivers and allows users to communicate at a (relatively short)
distance, either within the same language or across languages. Alternative names include
the widely used ‘tour guide system’ (as the technology has been used beyond interpreting,
by guides communicating with groups during guided tours), the French word bidule (‘the
thingie’), the German word Flüsterkoffer (‘the whispering briefcase’, as the portable sets
are usually carried in a briefcase which doubles as a charger, and the predominant modal-
ity of interpretation via such systems resembles whispered interpreting), the term ‘infoport
systems’, and even ‘boothless interpreting systems’. This chapter will use the generic term
‘portable interpreting equipment’ interchangeably with ‘tour guide systems/sets’ and ‘bid-
ule’, as these three terms appear to be the most widely applied in literature. Tour guide
systems are used primarily in guided tours, site visits, museums, conferences, and other
events where spoken information needs to be conveyed live, to one person or a group of
people, and where portability as well as mobility are vital. During assignments, the system’s
transmitter device is worn by the speaker (a tour guide, an interpreter, a host, etc.), and
receiver devices equipped with corded or cordless earphones are used by the recipient/s. The
transmitter emits audio signals through a selected channel. These signals convey spoken
within a lanyard, thus allowing the receiver to be worn around the neck, leaving the hands
free to operate the device. The transmitters, however, were larger, even though the technology needed to reduce their size dramatically was already available, thanks to the Motorola corporation.
As is often the case with rapid technological development, this company’s links with the
military fuelled the creation of portable radio communication systems before and during
World War II. What would later become known as the popular walkie-talkie was originally a military-grade mobile radio frequency communication system called the Motorola SCR (Signal Corps Radio), developed by the engineer Henryk Magnuski and his team.1
The SCR-300 represented a significant advancement in military communication as the first
portable FM backpack radio, which allowed for reliable, secure communication over a
range of distances and was critical for military operations. In post-war years, as industries
adopted military technology for civilian use, the concept of portable radio frequency com-
munication was embraced and further developed by early producers of portable tour guide
systems. After the war, as the global economy recovered, and a plethora of political events
aimed at restoring world peace, there was a burgeoning interest in enhancing visitor experi-
ences at trade fairs, museums, and historical sites. This sparked the development of tour
guide technology, offering new ways to engage and inform visitors.
Initial systems were rudimentary, relying on portable radios with fixed-frequency trans-
mitters and receivers. While innovative for their time, these early devices were plagued with
interference issues – linked to analogue sound transmission – and had limited operational
range. These factors restricted the widespread adoption and effectiveness of these early
devices, yet they still offered both live communication and communication across languages.
The AIIC’s archives also provide evidence of portable equipment being used after the war,
with the system used by Thadé Pilley and Frank Barker as a prime example (Keiser, 2004).
We know that Pilley and Barker used some form of portable equipment to offer simultane-
ous interpretation, predominantly in Africa,2 but there is no certainty as to the exact tech-
nology used. It has been hypothesised that they used a set of radio-frequency transmitters and receivers, but, given accounts of cabling being carried by interpreters, it is more likely that the equipment was a portable, reduced-size version of the Filene–Finlay Hush-a-Phone (AIIC, 2019). Nevertheless, what is important is that the interpreting community
embraced technology early on, and that the concept of ‘mobile interpreting’ had already
found its initial adopters in the decade following World War II.
Returning to tour guide systems and fast-forwarding several decades, important strides
were made in battery technology towards the end of the 20th century. This facilitated the
creation of more portable, user-friendly systems, which enabled longer usage periods and
greater flexibility. It is worth noting that, currently (as of 2024), modern lightweight Li-Ion
batteries can power a digital tour guide transmitter or receiver for more than 20 hours. This
renders all-day assignments more comfortable and removes the need to remind users to save
battery life or charge their receivers during breaks. Aside from extended battery life, further
positive changes were brought about thanks to the transition from analogue to digital tech-
nology in the 1990s: the new digital devices offered superior sound quality and enhanced
ease of use. At the beginning of the 21st century, new advancements in portable tour guide
systems were implemented: a widespread uptake of wireless technology, including UHF and
VHF frequencies, allowed guides or interpreters to communicate effortlessly with multiple
participants via microphones and wireless receivers.
This analogue-to-digital shift was important for the user experience. While analogue and
digital radio transmitters both send audio signals through the air, they do this in different
ways. Analogue radio encodes the sound directly onto a continuously varying radio wave. This can result in sound becoming 'fuzzy' or containing static. This is especially the case if the receiver is far from the transmitter (the further from the transmitter, the worse the sound quality becomes), or where there is some sort of interference (e.g. a physical obstacle or competing radio signals) along its path.
In contrast, digital radio technology converts sound into digital data, much like a computer
file, which it then sends through the air. This method ensures clearer, more consistent sound,
with far less static and interference. The sound remains clear until the signal becomes too
weak, at which point it stops rather than gradually degrading. This technological advance
has proven important and particularly advantageous for modern tour guide systems. The
sound received through such digital equipment is clear, even in the presence of background
noise or when a group of recipients is spread out during interpreted field visits or on-site
training, in noisy environments. The quality of digital radio is also far more consistent than
that of older analogue devices, and analogue crackling noises or sudden sound drops are
eliminated through digital processing. Additionally, digital systems can handle multiple
channels easily, allowing different groups working in different language pairs to communicate in the same location without interference.
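To make the contrast concrete, here is a minimal Python sketch (a toy model under stated assumptions, not a description of any particular product) of the behaviour described above: analogue audio quality degrading gradually with distance, versus a digital link that stays clear until the signal drops below a decoding threshold and then cuts out.

```python
# Toy model of the analogue vs digital behaviour described above.
# Assumptions (illustrative only): signal strength falls off with distance,
# analogue audio quality tracks signal strength continuously, while a digital
# link delivers full quality above a decoding threshold and nothing below it.

def signal_strength(distance_m: float, max_range_m: float = 150.0) -> float:
    """Relative signal strength between 0 and 1, falling off with distance."""
    return max(0.0, 1.0 - distance_m / max_range_m)

def analogue_quality(distance_m: float) -> float:
    """Analogue audio degrades gradually: more static the weaker the signal."""
    return signal_strength(distance_m)

def digital_quality(distance_m: float, threshold: float = 0.2) -> float:
    """Digital audio stays clear until the signal is too weak, then cuts out."""
    return 1.0 if signal_strength(distance_m) >= threshold else 0.0

for d in (10, 50, 100, 130, 160):
    print(f"{d:>3} m  analogue={analogue_quality(d):.2f}  digital={digital_quality(d):.2f}")
```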
When considering typical specifications of a mid-range digital UHF (ultra-high fre-
quency) tour guide transmitter, one might think of a small, lightweight device resembling a wallet or a pack of credit cards, with a selection of the following features: it may weigh around 50–70 g, have a working range of around 150 m in line of sight, offer over 30 channels, and run for more than 15 hours of continuous work after 2–4 hours of charging. It may also feature Bluetooth and Wi-Fi technology for
easier integration with smartphones and other personal devices. Nowadays, standard
specifications also include digital transmission modulation with noise reduction technol-
ogy, interference-free channel search function, and remote channel change in receivers.
Additional accessories are also typically available. These include convenient bulk charg-
ers (which resemble a small briefcase), tailored bags, matching handheld wireless micro-
phones, etc. Therefore, from today’s perspective, it seems fair to say that what began as a
simple concept, based on analogue radio technology with hardware limitations, has, over
a century, morphed into a relatively advanced, highly portable device, based on digital
signal processing technology (DSP). Today’s devices can be deemed ‘reliable’ in terms of
sound quality and battery life and ‘flexible enough’ to work in configurations with other
modern devices, including smartphones, wireless equipment, and digital recorders. This
translates into devices that are more convenient to use when providing live interpreting
in a range of multilingual settings, such as seminars, lectures, international conferences,
site visits, facility tours, cultural tours, and business meetings. Still, the impressive tech-
nological progress described earlier has not eliminated all shortcomings that are inherent
to interpreting via tour guide systems, the most important of which are discussed later in
this chapter (Section 5.4).
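Purely as an illustration of the kind of spec sheet sketched above, the following short Python snippet models a hypothetical mid-range transmitter; the figures are the indicative ones quoted in this section, not the specifications of any real product.

```python
from dataclasses import dataclass

@dataclass
class TourGuideTransmitterSpec:
    """Hypothetical spec sheet for a mid-range digital UHF tour guide transmitter."""
    weight_g: int = 60                 # roughly 50-70 g
    range_m: int = 150                 # working range in line of sight
    channels: int = 32                 # over 30 selectable channels
    battery_life_h: float = 15.0       # continuous operation per charge
    charging_time_h: float = 3.0       # roughly 2-4 hours of charging
    wireless: tuple = ("UHF", "Bluetooth", "Wi-Fi")
    features: tuple = (
        "digital modulation with noise reduction",
        "interference-free channel search",
        "remote channel change in receivers",
    )

spec = TourGuideTransmitterSpec()
print(f"{spec.channels} channels, {spec.range_m} m range, {spec.battery_life_h} h battery")
```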
challenges, such as the strain on vocal cords during continued whispered assignments
(although simultaneous interpreting with portable systems need not always require the
interpreter to whisper).
More recent and comprehensive presentations of this modality are offered by Baxter
(2015) and Porlán Moreno (2019). The former refers to the apparent marginalisation of
bidule interpreting and positions it within volunteer and activist interpretation. Meanwhile,
the latter presents a broad overview of the applications of tour guide interpreting.
Porlán Moreno (2019, 56) suggests that portable systems may work well for small groups,
visits, and environments where setting up traditional simultaneous interpretation booths
is impractical. However, the author also emphasises the apparent misuse of such systems.
This, in turn, creates challenges for interpreters and increases the need for interpreter train-
ing programmes and associations to clearly define appropriate environments within which
the bidule can be used, in addition to defining its limitations.
Some descriptions of bidules and their use, with comments, come directly from interpret-
ing service providers (e.g. Magalhães, 2016; Sekikawa, 2008). These provide a balanced
presentation of the 'pros and cons' of this modality of service delivery. Stahl (2010) remarks that tour guide interpreting is an 'orphan' of research in interpreting studies. Indeed, well-structured research, based on empirical data, is limited to MA-level
dissertation projects, such as Bidone (2017) and Panna (2017). In addition to offering a
broader evidence-based analysis of the challenges associated with tour guide interpreting
and its positioning in interpreter training, both MA theses report on challenges encountered
by a particular group of trainee interpreters while performing an actual field assignment.
There is, therefore, ample opportunity to fill the existing research gap. This will be dis-
cussed in more detail in Section 5.7 of this chapter.
Another significant constraint linked to the use of tour guide systems, as reported by
Diriker (2015), is the way the interpreter uses their voice. While working with bidule, with-
out a soundproof booth or even a sound shield, it is necessary for the interpreter to consider
their immediate environment and lower their voice as a result. This may prove overly taxing
and lead to disruptions in delivery, with the interpreter’s control over their own product
suffering due to having to speak at a lower volume. In turn, this could impact the overall
perception of interpretation quality in such scenarios.
Furthermore, effective planning is required in relation to the interpreter/s’ physical posi-
tioning while providing their services via portable sets. Planning is particularly important
in the case of indoor events. When booths are unavailable, interpreters tend to work at
a distance from both the speaker and the audience in order to reduce potential acoustic
interference. However, the interpreters still need to be able to hear the speech from the floor
clearly (if no separate audio feed is available). They also need to be able to see the room
dynamics, since visual aids (presentation slides, etc.) and cues provide useful indications to
meaning during interpretation. This element requires appropriate attention when training
interpreters. Similarly, in real-life settings, it is a compulsory part of preparation for any
assignment and contributes to the interpreter’s overall workload.
A further aspect of interpreting that is crucial to consider prior to any assignment which
requires tour guide sets in indoor venues is incoming sound management. Aside from man-
aging the aforementioned issue of incoming and outgoing sound quality, before an assign-
ment, the interpreter must ascertain whether floor sound will be conveyed directly to their earphones or headset, or whether sound will need to be picked up via the venue's
loudspeakers. The former solution can be afforded by the venue’s technicians if they can
provide a spare outgoing audio channel for the interpreter. Alternatively, a combination of
two transmitters, operating simultaneously, can be used: one transmitter for the speaker,
another for the interpreter (who will also have a receiver set to the speaker’s channel, differ-
ent to the channel used by the audience). Recently, certain manufacturers have offered more
advanced, bidirectional devices, for this purpose,3 thus resolving the challenge created by
dynamic exchanges involving two languages, as is the case of Q&A sessions, for example.
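A rough sketch of the two-transmitter channel plan described above may make the configuration easier to visualise; the channel numbers and device labels below are invented for the example and do not correspond to any specific product.

```python
# Hypothetical channel plan for the two-transmitter configuration described above.
# Channel numbers and device labels are illustrative, not vendor-specific.

channel_plan = {
    "speaker_transmitter":     {"sends_on": 1, "worn_by": "speaker"},
    "interpreter_receiver":    {"listens_on": 1, "worn_by": "interpreter"},  # floor feed
    "interpreter_transmitter": {"sends_on": 2, "worn_by": "interpreter"},
    "audience_receivers":      {"listens_on": 2, "worn_by": "audience"},     # interpretation
}

def audio_path(plan: dict) -> list[str]:
    """Trace who hears whom, to sanity-check the channel assignment."""
    return [
        f"speaker -> channel {plan['speaker_transmitter']['sends_on']} -> interpreter",
        f"interpreter -> channel {plan['interpreter_transmitter']['sends_on']} -> audience",
    ]

for hop in audio_path(channel_plan):
    print(hop)
```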
Whatever the hardware used, one can observe that the alternative scenario further
increases the interpreter's workload: they will need to work alongside the venue's techni-
cians to equip the speaker with the transmitter/transceiver, brief the speaker on how to use
it, mount the transmitter’s microphone, and ensure the equipment is passed over to the next
speaker if needed. Therefore, it is no wonder that the interpreting community, including
researchers, has reservations about tour guide interpreting. However, despite these addi-
tional extra steps, it may still be worth the effort, as interpreting with portable sets is still
regarded as a less-strenuous alternative to whispered interpreting (Pöchhacker 2009). Nev-
ertheless, reservations are reflected in existing standards, guidelines, and comments in both
research and professional publications and will be discussed in more detail in the following
section.
to a number of requirements, including the characteristics of the event (e.g. a factory visit
where mobility is paramount), duration (ideally no longer than two hours), a small number of participants, availability of two-way equipment (to ensure clear source audio for the interpreter), and compliance of such equipment with the relevant standards, such as the IEC 914 standard (AIIC, 2002).
Setton and Dawrant (2016, 19) also refer to the use of portable sets as a ‘problematic’
phenomenon due to ‘inadequate sound quality and acoustic isolation’. They distinguish
interpreting in this mode from what they call ‘real SI’. Certainly, both the AIIC guidelines
and the terminology used by the aforementioned authors are dictated by a concern for the
comfort and convenience of the interpreter and a desire to uphold the standards of the
interpreting profession. As Magalhães (2016) notes, the interpreter’s agreement to use port-
able interpreting equipment should never be at the expense of the standards that the inter-
preting community has worked hard to establish for decades. Portable units ought to serve
as a streamlined alternative (predominantly to consecutive interpreting) in settings where
regular booth equipment cannot be installed. In fact, there is already anecdotal evidence of
how tour guide systems can be used in settings that employed consecutive interpretation in
the past, with positive results in high-stakes environments (Shermet, 2019).
Another issue relating to standards centres on the question of the number of interpreters
required to deliver the service using portable units. In reality, as client decisions are often
cost-driven, there may be an expectation to cut costs further, employing just one interpreter
to work with portable units. Baxter (2015) describes ‘boothless simultaneous interpreting’
with portable equipment, and whispered interpreting, as alternatives that are preferred by
activist and non-governmental event organisers with limited budgets. In addition, Baxter
refers to interpreters delivering such services in both the singular and the plural. The expec-
tation for an interpreter to single-handedly deliver an assignment through a tour guide
system may materialise when portable units are used as a ‘faster’ alternative to consecutive
interpretation, since there is a general expectation that such tasks would be performed by a single consecutive interpreter in the given setting. It is therefore essential that working standards are upheld if the commissioned simultaneous shifts exceed 20 minutes in duration. In fact,
during assignments that use tour guide sets, interpreters’ shifts may require more frequent
handovers, due to the potentially more demanding acoustic environment. This is because
interpreter fatigue resulting from lack of soundproofing and exposure to background noise
from the audience may lead to compromised quality. Consequently, appropriate rest peri-
ods are mandatory. It is crucial to make this requirement known to both clients and event
organisers wherever possible, as well as address it during training. This is an aspect that the
next section will discuss in more detail.
statistic still shows that approximately 1 in every 20 working days will involve an assign-
ment using a tour guide system. This seems like reason enough to place tour guide interpret-
ing in curricula – especially since the same source also reports a growing trend in the uptake of tour guide systems for interpretation, as confirmed by nearly 50% of respondents. In
addition, researchers must also bear in mind that these figures already date from a decade
ago and involve an elite group of interpreters. One can reasonably assume that among
non-associated freelancers who often combine interpreting with other linguistic professions
and work for a variety of clients, including non-institutional clients, NGOs, or corporate
clients with limited budgets, these figures could be higher. In addition to the need to pro-
vide all-around training which covers the widest possible spectrum of interpreting modes
and modalities, this point serves as justification for including bidule training in interpreter
training curricula. The following sections offer practical elements that could be considered
during tour guide interpreter training.
the speakers/audience on how to use the equipment, and even ensuring uninterrupted use
and troubleshooting the equipment during the assignment. This difference highlights that, in order to prepare trainees for tour guide set assignments, training should include both technical and organisational tasks that are rarely required of an interpreter in a typical, technically supported booth-based conference interpreting setting.
Another skill for working with tour guide sets is the ability to interpret simultaneously
despite a lack of source audio fed directly to the interpreter’s headphones. This is a conten-
tious issue. For example, according to the relevant AIIC (2002) guidelines, interpreters should reject any jobs where audio needs to be picked up from the floor without a direct audio feed to the interpreter's ears. There are strong reasons for this standard to be upheld. However, many anecdotal accounts suggest that it is not realistic to always expect ideal source audio
during an assignment. As a result, the responsibility rests with the interpreter to choose
whether to accept the job, with all its challenges, or to decline it. Consequently, due to the
range of different interpreting scenarios and settings where tour guide systems can be used,
it is challenging for interpreter trainers to draw a line of acceptability in the ill-defined area
of ‘satisfactory vs non-satisfactory’ sound quality. What remains certain, however, is that
by the end of their course, students should be aware of the impact that acoustic limitations
can have on the quality of their product, and the additional burden on the interpreter that
this modality of delivery causes.
Interpreter training curricula should place a strong emphasis on both the ethical and
workload considerations that surround the delivery of interpreting services with portable
sets. A well-trained interpreter should be able to transparently present the advantages and
disadvantages of bidule interpreting vis-à-vis regular interpreting to the client. They should
also be capable of highlighting issues like feasibility, impact on quality, increased workload,
shift work, and technical and organisational tasks linked to the delivery. Such an approach
will facilitate a comprehensive analysis of requirements that are unique to each assign-
ment. Consequently, this approach would also reinforce a standardised methodology for a
non-standard delivery of interpreting services (however paradoxical that may sound).
5.7 Conclusion
As interpreting undergoes a dynamic transformation, caused predominantly by the arrival
of remote interpreting, automatic speech recognition, and AI/MI technologies (see contri-
butions by Prandi, Chmiel and Spinolo, Fantinuoli, and Ünlü, this volume), the question of
whether bidule interpreting is ‘future-proof’ is highly relevant. To answer this, it is necessary
to first consider the widest possible spectrum of interpreting assignments currently available
to service providers. In some settings, including remote interpreting of conferences, webi-
nars, town hall meetings, lectures, and training courses, the impact of new technology and
the potential for (semi-)automation is already clearly visible. However, many other settings
remain, where the physical presence of a human interpreter will continue to be the preferred
choice or simply a necessity. Field trips in remote areas, facility tours where mobility is
crucial, interpreting in crisis settings where internet access is limited or non-existent, clas-
sified negotiations at risk of cyberattacks, small- to medium-size group events for clients in
small premises or with limited budgets, or contingency-type interpreting when technology
has failed are just a few examples of situations where modern digital radio tour guide sets
can serve as a valuable companion for interpreters at work. If their inherent limitations and
impact on the interpreter's output are both thoroughly understood and weighed on firm ethical grounds, these devices are, and will most probably remain, a useful tool in the
flexible interpreter’s toolbox.
Furthermore, there is a need for more concrete justification regarding the presentation of
this modality of interpreting as ‘marginal’. Interpreting with portable equipment remains
an understudied area. Firstly, there is a considerable research gap concerning the actual
share of bidule assignments, their types, and their characteristics. Secondly, there is a lack
of knowledge about the impact that this non-standard equipment (in varying configura-
tions) may have on the quality of interpreting. Experimental research in this area is there-
fore necessary. Thirdly, interpreter fatigue and workload in this modality have only been
described vaguely, and researchers lack concrete data to substantiate the relevant claims made in the literature. Fourthly, as much criticism of tour guide systems is founded on claims relating
to inferior sound quality and impractical channel management, the impact of the recent
technological strides on the user experience has yet to be assessed. In addition, the key func-
tionalities to be investigated in this regard include the bidirectionality of equipment and its
noise-cancellation features. In a similar vein, more technically oriented acoustic research is
required, as is further investigation into, and comparisons between, the quality of incoming
sound across different interpreting modalities. These are just the main research strands linked to bidule interpreting; they can certainly be narrowed down to more specific
research projects that will further our understanding of how this modality of interpreting
can be used.
Finally, thanks to recent strides in technology and the arrival of AI, a further poten-
tial avenue for application and development of portable systems for interpreting is their
combination with automatic speech recognition (ASR) and natural language processing
(NLP) technology. For instance, combining ASR with noise cancellation could allow port-
able equipment to be used to form alternative workflows for interpreters. To illustrate,
interpreters who are working in acoustically substandard environments could be provided
with live speech-to-text transcripts, which would aid interpretation. The size and interfaces
of existing tour guide kits facilitate this application. Similarly, further solutions may prove
useful for simultaneous interpreting in general (including RSI); for example, automatic and
low-latency summarisation applications could also help interpreters partially overcome the
contextual challenges that are present in tour guide interpreting.
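Purely as an illustrative sketch of the kind of workflow envisaged here (all function bodies are placeholder stubs rather than an existing tool or API), a live transcript-and-summary aid for an interpreter working over portable equipment might be wired together roughly as follows.

```python
# Illustrative sketch only: a hypothetical pipeline feeding an interpreter
# working over portable equipment with a live transcript and a rolling summary.
# The ASR and summarisation steps below are placeholder stubs, not real APIs.

from dataclasses import dataclass

@dataclass
class TranscriptSegment:
    start_s: float
    text: str

def transcribe_chunk(audio_chunk: bytes, start_s: float) -> TranscriptSegment:
    """Stub for an ASR call; a real system would run a speech recognition model here."""
    return TranscriptSegment(start_s=start_s, text="<recognised speech>")

def summarise(segments: list[TranscriptSegment], max_words: int = 30) -> str:
    """Stub for low-latency summarisation of the last few transcript segments."""
    recent = " ".join(s.text for s in segments[-5:])
    return " ".join(recent.split()[:max_words])

def assist(audio_chunks: list[bytes]) -> None:
    """Feed the interpreter a live transcript plus a rolling summary."""
    history: list[TranscriptSegment] = []
    for i, chunk in enumerate(audio_chunks):
        segment = transcribe_chunk(chunk, start_s=i * 5.0)  # e.g. 5-second chunks
        history.append(segment)
        print(f"[{segment.start_s:>5.1f}s] {segment.text}")
        print(f"  summary so far: {summarise(history)}")

assist([b""] * 3)  # dummy audio chunks, for demonstration only
```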
Future research will reveal the extent to which a concept born nearly 100 years ago and
now in its most modern guise can work alongside the most recent developments in AI and
NLP if managed and controlled by well-trained human professionals.
Notes
1 Scr300.org (accessed 8.7.2024).
2 https://bootheando.com/2013/03/08/la-historia-de-la-interpretacion-simultanea-de-la-mano-
de-ted-pilley/ (accessed 10.7.2024).
3 Examples include products from Williams AV and Okayo (respectively: https://williamsav.com/
product/dlt-400/, www.okayo.com/en/product-281287/Full-duplex-Communication-System-Wave
TEAMS.html, accessed 7.9.2024).
References
AIIC, 2002. Text on Bidule. AIIC.net. URL https://aiic.net/page/633/text-on-bidule/lang/1
AIIC, 2014. Basic Texts. Code of Professional Ethics. AIIC.net. URL http://aiic.net/p/6724
AIIC, 2019. Birth of a Profession. The First Sixty-Five Years of the AIIC.
Baigorri-Jalón, J., Mikkelson, H., Olsen, B.S., eds., 2014. From Paris to Nuremberg: The Birth of
Conference Interpreting. John Benjamins Pub Co., Amsterdam and Philadelphia.
Baxter, R.N., 2015. A Discussion of Chuchotage and Boothless Simultaneous as Marginal and Unor-
thodox Interpreting Modes. The Translator 22, 59–71. URL https://doi.org/10.1080/13556509.2015.1072614
Bidone, A., 2017. Dolmetschen mit Flüsterkoffer: Eine Feldstudie (MA dissertation). University of
Vienna. Available online at phaidra.univie.ac (accessed 5.8.2024).
Diriker, E., 2015. Simultaneous Interpreting. In Pöchhacker, F., ed. Routledge Encyclopedia of
Interpreting Studies. Routledge, London and New York.
Gile, D., 2009. Basic Concepts and Models for Interpreter and Translator Training, revised ed.
Benjamins Translation Library, John Benjamins.
Jones, R., 2002. Conference Interpreting Explained. St. Jerome Publishing.
Kalina, S., Ziegler, K., 2015. Technology. In Pöchhacker, F., ed. Routledge Encyclopedia of Interpret-
ing Studies. Routledge, London, 410–412.
Keiser, W., 2004. L’interprétation de conférence en tant que profession et les précurseurs de l’Association
internationale des interprètes de conférence (AIIC) 1918–1953. Meta 49(3), 579–608. URL https://
doi.org/10.7202/009380ar
Magalhães, E., 2016. Portable Interpreting Equipment. What to Get and Why. URL https://ewandro.
com/portable/
Panna, S., 2017. Die Praxis des Flüsterdolmetschens: Eine qualitative und quantitative Studie (MA
dissertation). University of Vienna. Available at: https://phaidra.univie.ac.at/ (accessed 5.8.2024).
Pöchhacker, F., 2009. Conference Interpreting: Surveying the Profession. The Journal of the American
Translation and Interpreting Studies Association 4(2), 172–186. URL https://doi.org/10.1075/
tis.4.2.02poc
Porlán Moreno, R., 2019. The Use of Portable Interpreting Devices: An Overview. Revista Trad-
umàtica. Tecnologies de la Traducció 17, 45–58. URL https://doi.org/10.5565/rev/tradumatica.233
Sekikawa, F., 2008. Über Sinn und Unsinn einer Personenführungsanlage. URL www.sekikawa.de/
pdf-files/pfa_d.PDF (accessed 10.8.2024).
Setton, R., Dawrant, A., 2016. Conference Interpreting: A Complete Course. Benjamins Translation
Library: BTL, John Benjamins. URL https://doi.org/10.1075/btl.120
Shermet, S., 2019. A Quieter Revolution in Diplomatic Interpretation. ATA Website. URL www.
ata-divisions.org/ID/a-quieter-revolution-in-diplomatic-interpretation/ (accessed 13.7.2024).
Stahl, J., 2010. Flüsterdolmetschen – ein Waisenkind der Forschung. In Stahl, J., ed. Translatologické
reflexie. Iris, Bratislava, 55–67.
6
TECHNOLOGY-ENABLED CONSECUTIVE INTERPRETING
Cihan Ünlü
6.1 Introduction
Over the past two decades, various tailor-made technological provisions have been cre-
ated for interpreters. This can be attributed to the broader accessibility of technology for professional use, significant improvements in the robustness and quality of speech technologies, the dramatic growth in computing power, advances in deep learning, and the prevalence of generative artificial intelligence. These provisions aim to offer both tools and
solutions to improve interpreters’ work efficiency. As a broader concept, technology in
interpreting has become a major topic and subject of enquiry in both academia and indus-
try. The intersection of technology and interpreting has thus also become an attractive,
albeit young, field of study and has given rise to a proliferation of human–machine interac-
tion and product- and process-oriented studies in the field of translation studies. The tech-
nologisation of interpreting is discussed by many, particularly in a period when the number
of remote interpreting (RI) platforms has multiplied, when machine interpreting (MI) has
become prevalent in the language industry, and when natural language–based AI technolo-
gies have achieved a level of maturity sufficient to help interpreters complete certain
subtasks before, during, and after assignments.
Recent progress in speech technologies (automatic speech recognition, speech transla-
tion, speech synthesis) alongside state-of-the-art deep learning models in natural language
processing have paved the way for the utilisation of and research on ‘process-oriented’
interpreting technologies (Fantinuoli, 2018). Accordingly, computer-assisted interpret-
ing (CAI) tools have emerged. These are dedicated to ‘assist[ing] professional interpret-
ers in at least one of the several sub-processes of interpreting’. Examples of sub-processes
include ‘knowledge acquisition and management, lexicographical memorization, real-time
terminology access, and so on’ (Fantinuoli, 2023, 46). In the past decade, these tools and
technologies have been implemented in various settings and discussed academically in terms
of usability (Frittella, 2023), terminology (Gacek, 2015; Biagini, 2015), cognitive processes
(Prandi, 2023, this volume), and socio-cognitive (4EA) approaches (Mellinger, 2023),
among others. Interpreting is not only technology-enabled but also technology-mediated.
The sharp increase in demand for and provision of distance interpreting has made remote
simultaneous interpreting (RSI) more prevalent, and technologisation more relevant for
practitioners. In addition, the platformisation of RSI has also led more interpreters with
technology literacy to work as independent contractors. Such independent contractors have
been found to have an increased reliance on digital platforms, which mediate their jobs and
terms (Giustini, 2024; Giustini, this volume). Furthermore, the shift to a digital workspace
has also made it necessary to integrate technology into the interpreter’s workflow. This
technological change requires additional empirical studies to assess how best to prevent
risks and to establish which benefits these technologies can offer in terms of streamlining the entire pro-
cess. Therefore, there has been an observable increase in the design and deployment of
‘setting-oriented’ (Fantinuoli, 2018) technologies as commercial RSI products with CAI
functions (Fantinuoli et al., 2022; Frittella and Rodríguez, 2022) and as products designed
for interpreter training (Arriaga-Prieto et al., 2023; Baselli, 2023).
Recently, the intersection of automatic speech recognition (ASR) and AI-based informa-
tion retrieval models has been observed as a game changer for process-oriented CAI tools,
particularly in SI. Several studies have explored the possibilities of using ASR technology
as an automated querying system (Hansen-Schirra, 2012; Fantinuoli, 2017). Other studies
have explored the possibilities of using ASR technology to enhance CAI tools in the context
of problem triggers during practice (Defrancq and Fantinuoli, 2021; Rodríguez et al., 2021;
Pisani and Fantinuoli, 2021), assisting in the preparatory needs of interpreters (Gaber et al.,
2020), and supporting interpreters by transcribing source speeches (Cheung and Tianyun,
2018; Wang and Wang, 2019; Rodríguez González et al., 2023).
In academia, there has been a clear interest and abundance of empirical research on CAI
in the last decade. However, the focus has predominantly been on simultaneous rather than
consecutive interpreting. A closer glance at the literature shows the majority of the func-
tionalities and designs of the tools, as well as future tool proposals, focus on simultaneous
interpreting. That is why most of the aforementioned studies measure SI performance as their main outcome variable. There is a need for more robust empirical studies that explore human–computer interaction within CAI across various language pairs and modalities. Along with market demand, it might be the case that the preference for SI over CI is
influenced by the inherent characteristics of CI and the diverse environments it is applied
in. Contrary to simultaneous conference interpreting, CI often lacks a stable set-up. This
may create the impression that technological aids for CI are less feasible and potentially riskier.
Consequently, when considering the implementation of technology in CI, the focus shifts
to the specific technological tools and their functions that are pertinent to various sub-
tasks within the interpreting process. Although overshadowed by SI-related CAI research,
computer-assisted consecutive interpreting is considered to be a new and vibrant field, with
many experimental studies yielding promising results and insights for process-oriented
interpreting studies.
Indeed, the development of text processing and language understanding on the software side and the introduction of new ergonomic and user-friendly tools on the hardware side have shaped potential technological support in CI. Although traditionally considered a non-technical field, CI lends itself to the integration of a variety of
tools and technologies. These tools aid in the different stages of the interpreting process.
This chapter will focus on the literature surrounding technology aids in CI, with a particular
lens on hybrid modalities, types of equipment, and ASR-assisted solutions. The chapter will
revisit studies conducted so far and analyse the goals, methodologies, and conclusions of
empirical studies that deal with the use of technology in consecutive interpreting in detail.
Section 6.2 provides a general overview of the interplay between CI and technology and
draws connections between various innovations that have been designed and conceptual-
ised for the CI process. Subsequent subsections outline these innovations in the following
broad categories: hybrid modalities, digital pen–supported modalities, and ASR-assisted
modalities. These subsections also provide a detailed review of the literature with empiri-
cal studies conducted so far. Finally, Section 6.3 draws a conclusion based on the current
status, limitations, and future studies.
phases of effort models (Gile, 2009), respectively. Despite the distinct operational dif-
ferences between simultaneous and consecutive modes, technological solutions in both
modes aim to facilitate the core work of interpreters by reducing their cognitive load
during information retrieval and enhancing performance quality. In the literature, there
appears to be a handful of studies tackling the question of whether state-of-the-art tech-
nology can enhance CI. There are currently four themes under the umbrella of 'technology use in CI' that lend themselves to a broad review. These include hybrid interpreting
modalities, digital pens, tablet interpreting, and the new ASR-integrated approaches (see
also contributions by Davitti and Orlando, this volume). Rather than viewing these as
isolated technological applications for CI, these can be understood as interconnected
technological innovations that share functionalities across either the comprehension and
note-taking phase or the production and delivery phase. Hybrid modalities include the
practice of simultaneous-consecutive, which hypothetically allows interpreters to manage
cognitive loads better and potentially avoid traditional note-taking (Ferrari, 2002, 2007;
Pöchhacker, 2007; Hamidi, 2006; Hamidi and Pöchhacker, 2007). Smartpens (Orlando,
2014, 2016, and this volume) are high-tech electronic devices equipped with a micro-
phone, speaker, and infrared camera. Another innovation is tablet interpreting (Rosado,
2013; Drechsel and Goldsmith, 2016; Goldsmith and Holley, 2015, 2018; Altieri, 2020).
This provides improved handwriting recognition and portability for the note-taking pro-
cess. Lastly, recent studies focused on computer-assisted CI explore ASR (Ünlü, 2023) and respeaking (Chen and Kruger, 2023), examining the feasibility of using certain speech technologies to enhance performance. The following subsections will elaborate on
these solutions under the following categories: hybrid modalities, digital pen–supported
modalities, and lastly, ASR-assisted modalities, with a detailed review of stand-alone com-
mercialised and non-commercialised tools available.
conducted further tests within what became DG Interpretation, so as to refine the technique
(Ferrari, 2001, 2002). Subsequent to Ferrari’s initial findings, John Lombardi (2003) and
Erik Camayd-Freixas (2005), both US court interpreters, highlighted the potential ben-
efits of using digital voice recorders in their reviews and evaluations. Lombardi’s informal
evaluation of the ‘digital recorder–assisted consecutive’ (DRAC) method detailed the func-
tionality, advantages, and limitations of this approach (2003). Camayd-Freixas conducted
a more structured experiment at Florida International University involving 24 advanced
interpreting students and early-career professionals (Camayd-Freixas, 2005; Hamidi and
Pöchhacker, 2007). The experiment, in which both groups served as their own controls,
centred on the accuracy of interpreting. The unit of analysis was defined as ‘the percentage
of words missed in each statement’. This provided a quantitative measure of interpreter
performance (Camayd-Freixas, 2005, 43). Accuracy was tested using digital recorders ver-
sus traditional note-taking. Results indicated superior accuracy and completeness when using the recorder, particularly as speech length increased. Overall, these studies concluded that such technol-
ogy not only enhances the quality of interpreting in terms of accuracy but also retains the
original’s intonation and ‘liveliness’ (Camayd-Freixas, 2005, 42) more faithfully. Avoiding
note-taking allowed interpreters to focus more intensively on listening and comprehending
the source material.
A couple of years later, drawing on Hamidi’s thesis (2006), Hamidi and Pöchhacker
(2007) tested the simultaneous-consecutive mode on three experienced professional interpret-
ers and assessed their performances using both methods through transcript analysis,
self-assessment, and audience response. The authors, who named it ‘SimConsec’, reported
that the participants showed more fluent delivery, had closer source–target correspondence,
and had fewer prosodic deviations (2007, 14). In later years, more comparative studies employed mixed methodologies. Similar to Hamidi and Pöchhacker's findings, Hawel (2010)
and Orlando (2014) also note significant improvements in the quality of expression, includ-
ing closer correspondence between source and target texts, fewer prosodic deviations, and
enhanced fluency of delivery. Chitrakar (2016), Mielcarek (2017), and Svoboda (2020)
observe fewer linguistic and conceptual errors and a higher level of detail retention. Experi-
menting with the Chinese–English language pair, Ma (2022) reported that the new mode
enhanced interpreting quality by improving accuracy, information completeness, and logi-
cal clarity, while reducing the memory load and stress on participants. Pöchhacker (2015)
acknowledges the intent to eliminate traditional note-taking with this method. However, he
also states how interpreters may still utilise note-taking as a memory aid, particularly with
the advent of digital pen technology that integrates recording and note-taking in one device
(e.g. smartpens). This will be discussed later on in this chapter.
However, certain drawbacks and negative anecdotal evidence concerning the feasibility
of the mode are also apparent, namely, that interpreters typically focus on the audio playback, which risks impairing proper eye contact and other forms of non-verbal communication with the audience (Orlando, 2014; Ma, 2022). For Orlando (2014), this could be
mitigated if interpreters are made aware of this drawback and trained to proactively engage
with the audience. The other possible disadvantage is the extended duration of delivery,
which may make it difficult to adhere to the norm that consecutive interpretations should
not be longer than the duration of the original speech. Studies also report scepticism among
professional interpreters regarding the practical applicability of this mode. For example,
Hiebl (2011) reported that some participants had concerns about its use in professional
practice. However, it is possible to say that the views ‘usually range from ambivalent to
positive’ (Orlando and Hlavac, 2020, 11), both in the eyes of the audience (Svoboda, 2020)
and the practitioners (Orlando, 2014).
Further empirical research is indeed required to find out whether this method effectively
enhances the accuracy and fluency of the rendition, and to determine the best types of equip-
ment as well as the most effective contexts for its application. Admittedly, despite the positive
conclusions of the studies implemented over the past two decades, simultaneous-consecutive
is not widely adopted in professional settings, nor has it been comprehensively addressed
in training programmes. A systematic integration of simultaneous-consecutive into training
has been a much-discussed topic in academia (Orlando, 2016), but there is a clear need for further research on when to begin training in the simultaneous-consecutive mode within interpreting courses (Orlando and Hlavac, 2020). Aside from the question of its popularity among practitioners, the interpreter training literature offers limited examples and insufficient data on how best to implement this mode. Such shortcomings remain to be resolved.
educational context. The earliest initiatives are Orlando’s studies (2010, 2015, 2016),
which introduced smartpens in classroom settings to facilitate the self-evaluation of stu-
dents’ note-taking practices. A limited number of studies report different pedagogical activi-
ties using the smartpen (Kellet Bidoli, 2016; Kellet Bidoli and Vardè, 2016; Romano, 2018),
with a range of conclusions drawn. For example, the smartpen is seen as being beneficial
for allowing students to revisit their notes repeatedly. This enhances their understanding of
their own note-taking habits and helps improve accuracy in interpreting (Romano, 2018).
As a result, a collaborative learning environment is created (Orlando, 2015; Kellet Bidoli,
2016), which allows a tangible trace of otherwise elusive mental processes to be captured and evaluated (Kellet Bidoli and Vardè, 2016). Nevertheless, more extensive research is needed to validate the effectiveness of smartpens in training.
Aside from the benefits that smartpens and digital pens provide in terms of the cognitive
aspects of the process, as well as enriching process-oriented research in CI, these devices
were also used and tested in the hybrid mode of simultaneous-consecutive for the recording
phase. Several studies employed explanatory research methodologies to analyse the practi-
cality of smartpens in simultaneous-consecutive. For example, Hiebl (2011) aimed to com-
pare CI with simultaneous-consecutive. Four interpreting students and three professional
interpreters interpreted three speeches from Italian to German using either mode. The results
showed that difficult texts were often chosen for simultaneous-consecutive interpreting,
but participants generally preferred traditional CI. Orlando (2014) tested four professional
interpreters who performed CI in both traditional consecutive and the hybrid mode. Their
level of accuracy was measured based on the units of meaning being correctly conveyed.
This was shown to be higher for participants who performed simultaneous-consecutive
using a smartpen. Orlando also reported that there were fewer disfluencies or hesitations
in the hybrid mode compared to the traditional mode. This may be due to having access to
the speech content for a second time through the playback feature of the digital pen (2014,
48). Similarly, Mielcarek (2017) tested the two different modes with four participants. The
study found better interpreting performance in simultaneous-consecutive with smartpens
compared to both traditional CI and simultaneous-consecutive with ordinary digital voice
recorders. In another comparative study by Svoboda (2020), seven interpreters performed
CI both in traditional consecutive mode and in simultaneous-consecutive with a smartpen.
The interpretations were assessed by an independent audience and through video-based
analysis for ‘source–target correspondence’ (p. 47). Findings indicated that despite the tech-
nological support provided by the smartpen, the audience generally preferred traditional
consecutive interpreting over simultaneous-consecutive with a smartpen. However, the data
analysis showed that simultaneous-consecutive with a smartpen improved the accuracy of
source–target correspondence compared to traditional consecutive interpreting. Özkan’s
study (2020) found a positive trend in favour of simultaneous-consecutive with a smartpen,
but interpreters preferred tablets and styluses over smartpens.
ASR technology rely on the generated transcript and essentially sight-translate the text. The
rationale behind this is that a sufficiently precise and accurate ASR model can enhance per-
formance by presenting the automatic transcription and/or machine-translated target text
for a smooth rendition. This new modality can be referred to as ‘consecutive interpreting
with text’ or ‘sight-consecutive’ (Ünlü, 2023, 23). Relying fully on a real-time transcript
generated by an ASR tool can occasionally be problematic for many reasons, particularly
in a real setting. The verbatim output is one such reason. For example, having access to
the entire ASR-generated source text, combined with machine translation, might compel
the interpreter to be more thorough and comprehensive, thus spending additional time on their renditions (Ünlü, 2023, 96). Another challenge is related to the inherent
process of text generation through an ASR pipeline. Recent progress in deep learning has
made it possible, albeit mostly for high-resource languages, to run multilingual speech rec-
ognition models with high accuracy and low word-error rate. However, ASR is still bound
to certain technical issues and inherent pitfalls. The technical limitations of ASR-assisted
interpreting, whether SI or CI, include, among many others, microphone malfunctions,
software glitches, connectivity issues, and confidentiality issues. Transcribing speech accu-
rately can also be challenging due to various factors. These include the type of speech used
(casual or formal), variations in the speaker’s voice, and ambiguity caused by homonyms
(Fantinuoli, 2017). Additionally, misrecognition of word boundaries can also contribute
to errors in the output of ASR. Therefore, it is crucial for the user to have an ergonomic
experience with the tool(s) and to know the weaknesses, strengths, and risks of having such
assistance.
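Since word-error rate (WER) is the usual yardstick behind phrases such as 'low word-error rate', the following self-contained Python sketch shows the generic computation (word-level edit distance divided by reference length); it is the standard textbook definition, not the evaluation routine of any particular CAI tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(substitution, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution ('bidule' -> 'bundle') in a six-word reference: WER of roughly 0.17.
print(word_error_rate("the interpreter used the bidule system",
                      "the interpreter used the bundle system"))
```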
As for the tools and technologies used in empirical research, the literature shows that
the researchers who conducted empirical studies on adopting ASR in CI mostly used
stand-alone ASR products that are not designed as CAI tools. However, two bespoke CAI
tools, designed specifically for interpreting scenarios, are now available on the market.
These publicly available products are Sight-Terp and Cymo Note. With their customised
interfaces, both aim to enhance the capabilities of standard note-taking applications by
incorporating AI-based functionalities like named entity recognition, segmentation, anno-
tation, and summarisation. The following subsections will focus on the studies that have
been conducted so far on the use of speech technologies (ASR and speech translation) in CI,
both in terms of non-bespoke ASR solutions and stand-alone CAI tools.
organisation names; person names; dates; numerical data, for example, percentage, ordinal
numbers, temperature; location names; and currency data) are synchronously highlighted
to ease the ‘reading from notes’ effort (Ünlü, 2023, 73). The validity of this has not yet
been tested. Moreover, Sight-Terp incorporates an optional digital note-taking application,
supporting features like scratch-out erasing and drawing lines with a stylus like the Apple
Pencil or Samsung S Pen. This feature allows the user to combine ASR support with a tablet interpreting (Goldsmith, 2018) experience. However, the use of a digital notepad
with ASR or ST has not been tested in this study.
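As a generic illustration of the kind of entity highlighting described above (a sketch using the open-source spaCy library, not the actual implementation of Sight-Terp or Cymo Note), named entities such as organisations, dates, and figures can be pulled out of a transcript as follows.

```python
# Generic sketch of entity highlighting in an ASR transcript, using spaCy.
# This is illustrative only and not the implementation of any specific CAI tool.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

transcript = ("The delegation from Acme Corp visited Geneva on 12 March "
              "and pledged 2.5 million dollars, a 30% increase.")

doc = nlp(transcript)
for ent in doc.ents:
    # e.g. ORG, GPE, DATE, MONEY, PERCENT: the kinds of items an interpreter
    # would typically want flagged while reading from a live transcript
    print(f"{ent.label_:>8}  {ent.text}")
```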
Ünlü’s study used a within-subjects design to compare interpreting performance with
and without the tool, focusing particularly on accuracy and fluency variables. A repeated
measures design was used, involving pre-tests (without technological aid) and post-tests
(with Sight-Terp). Twelve novice interpreters participated, providing a within-subjects
comparison across conditions. The procedure involved initial pre-tests using traditional
note-taking methods, followed by training on the Sight-Terp tool, and subsequent post-tests
using the tool. It is important to note for this experiment that the Sight-Terp interface also
displays an instant MT output of the source transcription. The participants were therefore free to make use of either the output of ASR (source language transcription) or the output of ST
during the assignment.
Results indicated that interpreters demonstrated increased accuracy when using
Sight-Terp compared to when they did not use technological aids. While Sight-Terp
improved accuracy, its use also led to an increase in disfluencies, such as pauses, hesitations,
repetitions, stuttering, and false starts. Using Sight-Terp also resulted in longer durations
of delivery. This was attributed either to interpreters taking additional time to process the information provided by the tool or to their being tempted to take a more meticulous approach to rendering the ASR/ST output fully in the target language. Participants generally found the
Sight-Terp interface to be user-friendly, and the tool itself useful for their interpreting work.
However, some challenges were noted, such as minor ASR errors and segmentation issues.
Another CAI tool offering ASR-based functionalities is Cymo Note, a software designed
as an interpretation assistant by Cymo Inc., a Beijing-based company. Cymo Note incorpo-
rates a third-party ASR with MT developed to augment both consecutive and simultaneous
interpreting performance. Cymo Note offers two display options: displaying the transcrip-
tion on a full screen, from which the interpreter makes use of the notes, or dividing the
screen into two halves, one for the transcript and the other for digital note-taking. Gener-
ally speaking, Cymo Note has similar functionalities to Sight-Terp, including automatic
transcription of speech in multiple languages, inline highlighting of important terms or
numbers, and on-demand machine translation of selected texts. In addition, Cymo
Note makes digital note-taking possible during the speech recognition session. The tool
also allows for customisation via a proprietary algorithm, annotation of notes directly
on transcripts, and local data storage for confidentiality (Goldsmith, 2023). Cymo Note is
tailored for various interpreting scenarios, including remote, on-site, and hybrid settings. It
does not save data to the cloud, ensuring user privacy (Zhu, 2023). Currently, there are no
studies that empirically test Cymo Note in CI.
Following Ünlü’s call to conduct empirical studies for other language pairs, Dellantonio
(2023) experimented with Sight-Terp with the participation of six Italian-speaking female
interpreting students from the University of Innsbruck. Participants interpreted two texts
from German to Italian, one with Sight-Terp’s assistance and one without, in consecu-
tive mode. Two speeches on climate change, each made up of 15 syntactically complex
passages, were used as the stimuli. Performances were recorded for accuracy analysis. After
the experiment, participants completed a questionnaire where the researcher explored their
perceptions, opinions, and reflections on using Sight-Terp (2023, 99). A particularly nota-
ble aspect of Dellantonio's study is its analysis of how the tool aids interpreters in
tackling syntactically complex passages. The ST feature was disabled, and only the ASR
output was used; in other words, the machine-generated translation of the source speech
was not available for reference during the experiment. Unlike Ünlü (2023), who analysed
the renditions based on ‘units of meaning’ (Seleskovitch and Lederer, 1989), the perfor-
mances of the participants were analysed for accuracy and completeness using a rating
scale based on Tiselius (2009). The results indicate that the participants found the tool
particularly helpful for handling long or complex sentences. Syntactic differences between
the source and target language in particular were perceived as a difficulty and led the
participants to rely on Sight-Terp's transcription (Dellantonio, 2023).
Another study on the use of Sight-Terp in CI is Michele Restuccia’s MA thesis (2023), which
similarly employs an experimental set-up to explore how interpreters interact with the tool,
and its impact on their performance. Six students nearing completion of their master’s degree
in conference interpreting participated in the study. Participants interpreted two speeches from
German to Italian, one in traditional consecutive interpreting mode (pen-and-paper notes),
and the other using Sight-Terp, with the order randomised. To prepare the participants and
reduce their unfamiliarity with technology-assisted CI, Restuccia provided them with an
11-hour seminar on AI-assisted interpreting preparation using NotionAI and consecutive
interpreting with Sight-Terp. The data collection methods included the screen recording
(of Sight-Terp-generated transcription), audio recordings of the renditions, a questionnaire
(31 questions), and lastly, a focus group discussion to explore the participants’ strategies,
challenges, and opinions on Sight-Terp and ASR-assisted CI. Contrary to initial assumptions,
the availability of ASR transcription did not enhance listening comprehension. The
trainee interpreters who relied solely on the transcript reported feeling distracted and found it
challenging to analyse the speech in detail (Restuccia, 2023). In line with the findings of
Ünlü's study (2023), consecutive interpreting using Sight-Terp was found to take generally
longer than the performances using traditional techniques (pen and paper). This is likely due
to the added cognitive effort of monitoring both the transcript and their notes, or the reliance
on sight translation of a raw transcription, leading to a longer rendition. Furthermore, par-
ticipants developed various strategies for interacting with Sight-Terp: Some used it primarily
for confirming notes, some relied solely on the transcript for sight translation, while others
used it as a supplementary resource during rendition (Restuccia, 2023).
The existing body of literature in Western academic publications suggests a scarcity of
empirical studies that focus on the use of ASR/CAI within CI. However, a quick search of
other databases, including China National Knowledge Infrastructure (CNKI), Wanfang
database (www.wf.pub), and Korean Studies Information Service System (KISS), reveals
a growing body of research conducted at both master’s and PhD levels on the application
of ASR in CI. It is also worth noting that the English-language empirical studies on CAI,
for both SI and CI, focus primarily on the adoption of InterpretBank and Dragon
NaturallySpeaking. In contrast, Chinese-language studies encompass a wide range of
software, including iFlytek ASR, iFlynote, or iFlytek Interpreting Assistant (Guo et al.,
2023, 93). The studies on ASR usage in CI in CNKI and other related non-European data-
bases provide empirical evidence for different language pairs. Lee’s experimental study
(2021, 2022) investigated ASR assistance in CI using Microsoft Azure’s speech-to-text.
Twenty-two professional interpreters (19 females, 3 males) of various ages and experience
levels were asked to interpret the first half of the material presented to them, without using
ASR support. The second half of the material was interpreted with ASR-generated tran-
scripts displayed on-screen. The experiment involved three rounds of varying text length
and content complexity (a lecture, a presentation, a conference call). Data were collected
through a questionnaire administered to all participants. This post-experiment
questionnaire revealed that interpreters generally expressed a positive view of using ASR
(2022, 950); in particular, shorter texts with numerous numerical items were received
more favourably. The study also reported that interpreters changed their approach to using
the ASR output as they became more familiar with it over the three experimental rounds.
In other words, repeated exposure appeared to lead to adjustments in their need for ASR
reference. Examples of such adjustments include checking the ASR output to confirm their
understanding or to fill in gaps in their notes, and using it for specific types of information
(such as numbers or names). Some interpreters shifted their note-taking to focus on captur-
ing only the most essential ideas, knowing ASR could provide more detailed information
(2022, 948). Li (2016) and Zhang (2020) focused on the overall accuracy of ASR software,
like Dragon Naturally Speaking, and its impact on (consecutive) interpreting performance.
While Li (2016) found that ASR assists in improving interpreting performance, issues with
strong accents and rapid speech were notable. Similarly, Zhang (2020) reported a modest
improvement in fidelity but observed a decrease in fluency and time management when using
ASR technology (Li, H. Y., 2016; Zhang, 2020). Xin’s (2023) study specifically examined
the translation of proper nouns in Japanese–Chinese CI, finding that ASR technology
particularly aids with longer or more complex nouns. However, it also introduces chal-
lenges such as psychological dependency and potential errors from low recognition accu-
racy (Xin, 2023). Qin (2021) adopted a similar methodology to assess performance across
different text lengths and language directions. The findings suggested that high-accuracy
ASR technology consistently improves interpreting quality, with more pronounced benefits for
longer texts. Bu (2021) tested 20 graduate students interpreting speeches by Jack Ma on
education and AI, with and without iFlyNote speech recognition software. The findings
suggest that while the software can reduce repetition and self-correction in formal materi-
als, it generally lowers accuracy, efficiency, and fluency in interpreting, especially with col-
loquial materials. Li (2023) explored the efficiency of interpreters using iFlyRec software to
replace traditional note-taking tasks. The results showed significant improvements in inter-
preting scores with high speech recognition accuracy, underlining the importance of mate-
rial difficulty and individual differences in the effectiveness of ASR technology (Li, 2023).
Overall, the studies cited above generally indicate a positive trend toward integrating
ASR technology in CI. However, they also highlight the need for improvements in software
accuracy and adaptation to specific interpreting challenges. At the same time, studies in
this area share some methodological limitations. One such limitation is that the sample
sizes are not large enough to allow for the generalisability of the findings. Another is
that, with the exception of Lee (2022), studies usually recruit trainees rather than professional
interpreters. The studies also approach the analysis of rendition accuracy and quality from
very different frameworks and viewpoints, including linguistic micro-analysis of source–target
correspondence, perceived quality measured through surveys, and interrater agreement.
This further underlines the importance of developing robust quality-assessment frameworks
for empirical data analysis in ASR-assisted CI.
As to the application of CACI in the real world, the scenarios would be different in
remote and on-site interpreting. In remote CI (referring to cases in which the inter-
preter is located remotely from the other parties), the interpreter simply needs to mute
herself on the conference software in Phase I and unmute in Phase II. In on-site CI, the
interpreter would need a stenomask – a microphone built into a padded, sound-proof
enclosure that fits over the speaker’s mouth – a device used by court reporters in
creating proceedings via SR. On the one hand, the steno-mask ensures SR quality in
noisy environments; on the other hand, it silences the user’s voice so that it does not
interfere with the surrounding environment, and in our case, the conference where
the interpreting takes place.
(Chen and Kruger, 2023, 8)
The study involved six undergraduate students majoring in English who had experience in
conventional CI. The participants underwent a ten-week online and offline training
programme on CACI. However, the authors did not provide specific details about the
programme, which leaves it unclear whether the participants received training in
respeaking. iFlytek was used for ASR, while Baidu Translate was used
for MT. The results showed that the participants delivered more fluently with CACI than in
conventional CI, which led the authors to consider the method effective, provided that
proper training in CACI is given (2023). It is important to acknowledge the limitations of this method
in terms of certain practical aspects. For example, wearing a stenomask for extended peri-
ods can be uncomfortable and cause fatigue, especially for interpreters working on lengthy
assignments. The use of a stenomask for CACI, together with the multi-step nature of the method,
results in a trade-off between improved speech recognition quality and potential drawbacks
relating to communication, comfort, technical dependency, and psychological impact.
6.3 Conclusion
This chapter takes stock of the designs and implementations of various technologies applied
in CI, with a broad review of academic research in practice and training. Although CI has
traditionally been viewed as a non-technical field, it does have the potential to incorporate
various tools and technologies and enhance different phases of the interpreting process. It
is evident that the interplay between technology and consecutive interpreting occurs within
different methods and approaches. Although the number of studies remains very limited,
the application of technology in CI through different approaches has been experimentally tested
in various geographical contexts, with different language pairs and directionalities. These
applications and experiments have involved equipment such as digital pens/smartpens, ASR
models, and tablets, as well as hybrid modalities (simultaneous-consecutive and CACI). The themes
of recent studies also show that advancements in speech recognition have increased interest
among scholars and students in CAI and ASR-integrated technologies applied to interpreting.
Despite these efforts, challenges and limitations in research persist and require further
discussion. A common limitation appears to be the subjects involved
in the experiments, who are mostly trainee interpreters or new graduates. Studies involv-
ing professionals in ecologically valid experimental set-ups should eventually bridge this
gap. More studies based on practical experiments and empirical evidence are needed, par-
ticularly for ASR-assisted CI. Moreover, studies that examine the cognitive loads involved
in ASR-assisted CI are still very limited in number.
A general outlook on the field shows that technology in CI has not been widely adopted
in training or professional settings. This reflects its modest popularity and the lack of sys-
tematic integration into interpreter training curricula. Despite the recognised advantages
and the growing priority given to digital literacy and ICT skills in higher education, the
integration of these technologies in interpreting classrooms remains limited and should be
addressed. Nevertheless, the growing momentum in research will surely open new areas for
broader generalisability. The availability of various technologies and of multimodal func-
tions presents potential paths for future academic studies on CAI applied in CI.
On the technical side, recent advancements in speech recognition technology have led to
significant progress (Baevski et al., 2020; Barrault et al., 2023a), making it applicable across
a wide range of uses. State-of-the-art foundation models, such as Whisper (Radford et al.,
2023), Google USM (Zhang et al., 2023), w2v-BERT 2.0 v1 (Barrault et al., 2023a), and
w2v-BERT 2.0 v2 (Barrault et al., 2023b), have gained widespread popularity and are used
extensively in both academia and industry. Their robust performance, reflected in lower
word error rates, is very promising for addressing the ASR-related issues and challenges present
in ASR-assisted CI. Thus, related CAI tools might soon benefit from these advanced foun-
dation models in a fast cloud environment. Furthermore, the emergence of large language
models (LLMs), with their remarkable ability to understand, analyse, and generate text on the
basis of vast amounts of pre-trained knowledge, has revolutionised both academic natural
language processing (NLP) research and industrial products. Since LLMs capture complex
linguistic patterns, semantic relationships, and contextual cues, they can produce
high-quality summaries that rival those crafted by humans. This capability is particularly
advantageous in ASR-assisted CI, where the transcription of source speech can be dense and
complex. That is to say, LLMs can effectively distil the main arguments, points, and ideas
from the transcription and provide concise information. As of now, direct integration of an
LLM or LLM-in-the-loop approach in an ASR pipeline for CI remains unexplored.
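As a purely illustrative sketch of such an LLM-in-the-loop approach, which, as noted above, has not yet been explored or validated for CI, the following combines a locally run Whisper model with a generic chat-completion endpoint to distil one source-speech segment into a few key points. The model names, prompt wording, and audio file name are assumptions made for illustration, not a tested CAI design.

```python
# Illustrative sketch only: an LLM-in-the-loop pipeline that transcribes a
# consecutive-interpreting source segment with Whisper and asks an LLM to
# distil it into a few key points. The model names, prompt, and output
# format are assumptions for illustration, not a tested CAI tool design.
import whisper                      # openai-whisper package
from openai import OpenAI           # any OpenAI-compatible endpoint

asr_model = whisper.load_model("base")          # small local ASR model
llm = OpenAI()                                  # reads API key from the environment


def transcribe(audio_path: str) -> str:
    """Return the raw ASR transcript of one source-speech segment."""
    return asr_model.transcribe(audio_path)["text"]


def summarise_for_interpreter(transcript: str, max_points: int = 5) -> str:
    """Ask the LLM for a terse list of main ideas, numbers, and names."""
    prompt = (
        f"Summarise the following speech segment in at most {max_points} "
        "short bullet points for a consecutive interpreter. Keep all "
        "numbers, named entities, and dates exactly as given.\n\n"
        f"{transcript}"
    )
    response = llm.chat.completions.create(
        model="gpt-4o-mini",                    # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    text = transcribe("segment_01.wav")         # hypothetical audio file
    print(summarise_for_interpreter(text))
```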
References
Albl-Mikasa, M., 2017. Notation Language and Notation Text: A Cognitive-Linguistic Model of
Consecutive Interpreting. In Someya, Y., ed. Consecutive Notetaking and Interpreter Training.
Routledge, London, 71–117.
Altieri, M., 2020. Tablet Interpreting: Étude expérimentale de l’interprétation consécutive sur tab-
lette. The Interpreters’ Newsletter 25, 19–35. URL https://doi.org/10.13137/2421-714X/31235
Arriaga-Prieto, C., Villamayor, I., Serrano Leiva, A., Cascallana Rodriguez, A., Rodríguez, S., Pozo
Huertas, A., Alonso González, A., 2023. Smarterp Educational: A Virtual Laboratory to Train
Simultaneous Interpreting. In Proceedings of the 15th International Conference on Education and
New Learning Technologies (EDULEARN23), IATED Academy, Palma, Spain, 3257–3264. URL
https://doi.org/10.21125/edulearn.2023.0900
Baevski, A., Zhou, Y., Mohamed, A., Auli, M., 2020. wav2vec 2.0: A Framework for Self-Supervised
Learning of Speech Representations. In Proceedings of the 34th Conference on Neural Informa-
tion Processing Systems, 12449–12460.
Barrault, L., Chung, Y.-A., Cora Meglioli, M., Dale, D., Dong, N., Duquenne, P.-A., Elsahar, H.,
Gong, H., Heffernan, K., Hoffman, J., Klaiber, C., Li, P., Licht, D., Maillard, J., Rakotoarison,
A., Ram Sadagopan, K., Wenzek, G., Ye, E., Akula, B., Chen, P.-J., El Hachem, N., Ellis, B.,
Mejia Gonzalez, G., Haaheim, J., Hansanti, P., Howes, R., Huang, B., Hwang, M.-J., Inaguma,
H., Jain, S., Kalbassi, E., Kallet, A., Kulikov, I., Lam, J., Li, D., Ma, X., Mavlyutov, R., Peloquin,
B., Ramadan, M., Ramakrishnan, A., Sun, A., Tran, K., Tran, T., Tufanov, I., Vogeti, V., Wood,
C., Yang, Y., Yu, B., Andrews, P., Balioglu, C., R. Costa-jussà, M., Celebi, O., Elbayad, M., Gao,
C., Guzmán, F., Kao, J., Lee, A., Mourachko, A., Pino, J., Popuri, S., Ropers, C., Saleem, S.,
Schwenk, H., Tomasello, P., Wang, C., Wang, J., Wang, S., 2023a. Seamless M4T-Massively Multi-
lingual & Multimodal Machine Translation. arXiv preprint arXiv:2308.11596. URL https://arxiv.
org/abs/2308.11596
Barrault, L., Chung, Y.-A., Coria Meglioli, M., Dale, D., Dong, N., Duppenthaler, M., Duquenne,
P.-A., Ellis, B., Elsahar, H., Haaheim, J., Hoffman, J., Hwang, M.-J., Inaguma, H., Klaiber, C.,
Kulikov, I., Li, P., Licht, D., Maillard, J., Mavlyutov, R., Rakotoarison, A., Ram Sadagopan, K.,
Ramakrishnan, A., Tran, T., Wenzek, G., Yang, Y., Ye, E., Evtimov, I., Fernandez, P., Gao, C.,
Hansanti, P., Kalbassi, E., Kallet, A., Kozhevnikov, A., Mejia Gonzalez, G., San Roman, R., Touret,
C., Wong, C., Wood, C., Yu, B., Andrews, P., Balioglu, C., Chen, P.-J., R. Costa-jussà, M., Elbayad,
M., Gong, H., Guzmán, F., Heffernan, K., Jain, S., Kao, J., Lee, A., Ma, X., Mourachko, A., Pelo-
quin, B., Pino, J., Popuri, S., Ropers, C., Saleem, S., Schwenk, H., Sun, A., Tomasello, P., Wang,
C., Wang, J., Wang, S., Williamson, M., 2023b. Seamless: Multilingual Expressive and Streaming
Speech Translation. arXiv preprint arXiv:2312.05187. URL https://arxiv.org/abs/2312.05187
Baselli, V., 2023. Developing a New CAI-Tool for RSI Interpreters’ Training: A Pilot Study. In Orăsan,
C., Mitkov, R., Corpas Pastor, G., Monti, J., eds. Proceedings of the International Conference
on Human-Informed Translation and Interpreting Technology (HiT-IT 2023), Naples, Italy.
INCOMA Ltd., Shoumen, Bulgaria, 157–166.
Biagini, G., 2015. Glossario cartaceo e glossario elettronico durante l’interpretazione simultanea: uno
studio comparativo (Unpublished MA dissertation). Università di Trieste.
Bu, X., 2021. A Report of Automatic Speech Recognition’s Impacts on Chinese-Japanese Simultane-
ous Interpreting of Numbers: A Case Study of iFlyrec (Unpublished MA dissertation). Dalian
University of Foreign Languages.
Camayd-Freixas, E., 2005. A Revolution in Consecutive Interpretation: Digital Voice-Recorder-Assisted
CI. The ATA Chronicle 3, 40–46.
Chen, S., Kruger, J.-L., 2023. The Effectiveness of Computer-Assisted Interpreting: A Preliminary
Study Based on English-Chinese Consecutive Interpreting. Translation and Interpreting Studies
18(3). URL https://doi.org/10.1075/tis.21036.che
Cheung, A., Tianyun, L., 2018. Automatic Speech Recognition in Simultaneous Interpreting: A New
Approach to Computer-Aided Interpreting. In Proceedings of Ewha Research Institute for Transla-
tion Studies International Conference. Ewha Womans University.
Chitrakar, R., 2016. Tehnološko podprto konsekutivno tolmačenje [Technologically Supported Con-
secutive Interpreting] (Unpublished PhD thesis). University of Ljubljana.
Defrancq, B., Fantinuoli, C., 2021. Automatic Speech Recognition in the Booth: Assessment of Sys-
tem Performance, Interpreters’ Performances and Interactions in the Context of Numbers. Target
33(1), 73–101.
Dellantonio, E., 2023. Utilizzo del CAI Tool Sight-Terp in interpretazione consecutiva: Impiego del
CAI tool per la risoluzione di passaggi sintatticamente complessi nella combinazione linguistica
tedesco-italiano (Unpublished MA dissertation). Leopold-Franzens-Universität Innsbruck.
Drechsel, A., Goldsmith, J., 2016. Tablet Interpreting: The Evolution and Uses of Mobile Devices in
Interpreting. In Proceedings of CUITI Forum 2016.
Fantinuoli, C., 2017. Speech Recognition in the Interpreter Workstation. In Proceedings of the Trans-
lating and the Computer 39. Editions Tradulex, Geneva, 25–34.
Fantinuoli, C., 2018. Computer-Assisted Interpreting: Challenges and Future Perspectives. In
Durán, I., Corpas, G., eds. Trends in E-Tools and Resources for Translators and Interpreters. Brill,
153–174.
Fantinuoli, C., 2023. Towards AI-Enhanced Computer-Assisted Interpreting. In Corpas Pastor, G.,
Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John Benjamins Pub-
lishing Company, 47–72. URL https://doi.org/10.1075/ivitra.37.01orl.
Fantinuoli, C., Marchesini, G., Landan, D., 2022. Interpreter Assist: Fully-Automated Real-Time
Support for Remote Interpretation. In Proceedings of Translator and Computer 53 Conference.
Ferrari, M., 2001. Consecutive Simultaneous? SCIC News 26, 2–4.
Ferrari, M., 2002. Traditional vs. Simultaneous consecutive. SCIC News 29, 6–7.
Ferrari, M., 2007. Simultaneous Consecutive Revisited. SCIC News 124. URL https://iacovoni.files.
wordpress.com/2009/01/simultaneousconsecutive-2.pdf (accessed 9.3.2024).
Frittella, F.M., 2023. Usability Research for Interpreter-Centred Technology: The Case Study of
SmarTerp. Language Science Press.
Frittella, F.M., Rodríguez, S., 2022. Putting SmartTerp to Test: A Tool for the Challenges of Remote
Interpreting. NContext 2(2), 137–166. URL https://doi.org/10.54754/incontext.v2i2.21
Gaber, M., Corpas Pastor, G., Omer, A., 2020. Speech-to-Text Technology as a Documentation Tool
for Interpreters: A New Approach to Compiling an Ad Hoc Corpus and Extracting Terminology
from Video-Recorded Speeches. TRANS. Revista De Traductología 24, 263–281. URL https://doi.
org/10.24310/TRANS.2020.v0i24.7876
Gacek, M., 2015. Softwarelösungen für DolmetscherInnen (Unpublished MA dissertation). Univer-
sity of Vienna.
Gile, D., 1995. Basic Concepts and Models for Interpreter and Translator Training. John Benjamins,
Amsterdam.
Gile, D., 2001. The Role of Consecutive in Interpreter Training: A Cognitive View. Communicate 14.
URL http://aiic.net/p/377
Gile, D., 2009. Basic Concepts and Models for Interpreter and Translator Training, revised ed. John
Benjamins, Amsterdam. URL https://doi.org/10.1075/btl.8
Gillies, A., 2019. Consecutive Interpreting: A Short Course. Routledge, London.
Giustini, D., 2024. “You Can Book an Interpreter the Same Way You Order Your Uber”: (Re)Inter-
preting Work and Digital Labour Platforms. Perspectives 1–19. URL https://doi.org/10.1080/090
7676X.2023.2298910
Goldsmith, J., 2018. Tablet Interpreting: Consecutive Interpreting 2.0. Translation and Interpreting
Studies 13(3), 342–365. URL https://doi.org/10.1075/tis.00020.gol
Goldsmith, J., 2023. Cymo Note: Speech Recognition Meets Automated Note-Taking. The Tool Box
Journal 23(2), 34. URL www.internationalwriters.com/toolkit/current.html
Goldsmith, J., Holley, J., 2015. Consecutive Interpreting 2.0: The Tablet Interpreting Experience
(Unpublished MA dissertation). University of Geneva.
Gomes, M., 2002. Digitally Mastered Consecutive: An Interview with Michele Ferrari. Lingua Franca:
Le Bulletin de l’interpretation au Parlement Européen 5/6, 6–10.
Guo, M., Han, L., Anacleto, M.T., 2023. Computer-Assisted Interpreting Tools: Status Quo and
Future Trends. Theory and Practice in Language Studies, 13(1), 89–99.
Hamidi, M., 2006. Simultanes Konsekutivdolmetschen. Ein experimenteller Vergleich im Sprachen-
paar Französisch-Deutsch (Unpublished MA dissertation). University of Vienna.
Hamidi, M., Pöchhacker, F., 2007. Simultaneous Consecutive Interpreting: A New Technique Put to
the Test. Meta 52(2), 276–289. URL https://doi.org/10.7202/016070ar
Hansen-Schirra, S., 2012. Nutzbarkeit von Sprachtechnologien für die Translation. trans-kom 5(2),
211–226.
Hawel, K., 2010. Simultanes versus klassisches Konsekutivdolmetschen: Eine vergleichende textuelle
Analyse (Unpublished MA dissertation). University of Vienna.
Hiebl, B., 2011. Simultanes Konsekutivdolmetschen mit dem Livescribe Echo Smartpen (Unpublished
MA dissertation). University of Vienna.
Kellet Bidoli, C.J., 2016. Traditional and Technological Approaches to Learning LSP in Italian to
English Consecutive Interpreter Training. In Garzone, G., Heaney, D., Riboni, G., eds. Focus on
LSP Teaching: Developments and Issues. LED, 103–126.
Kellet Bidoli, C.J., Vardè, S., 2016. Digital Pen Technology and Consecutive Note-Taking in the Class-
room and Beyond. In Zehnalová, J., Molnár, O., Kubánek, M., eds. Interchange Between Lan-
guages and Cultures: The Quest for Quality. Palacký University, 131–148.
Lee, J.-R.-A., 2021. Preliminary Research on the Application of Automatic Speech Recognition in
Interpretation. The Journal of Humanities and Social Science 21 12(5), 2407–2422. https://doi.
org/10.22143/HSS21.12.5.170
Lee, J.-R.-A., 2022. A Case Study on the Usability of Automatic Speech Recognition as an Auxil-
iary Tool for Consecutive Interpreting. The Journal of Humanities and Social Science 21 13(4),
937–952. URL https://doi.org/10.22143/HSS21.13.4.66
Li, F.Z., 2023. The Impact of Speech Recognition on the Efficiency of Interpreters in Consecu-
tive Interpreting (Unpublished MA dissertation). Inner Mongolia University. URL https://doi.
org/10.27224/d.cnki.gnmdu.2023.001088
Li, H.Y., 2016. The Auxiliary Role of Speech Recognition Technology in Consecutive Interpreting
(Unpublished MA dissertation). Sichuan Foreign Languages University. URL https://kns.cnki.net/
KCMS/detail/detail.aspx?dbname=CMFD201701&filename=1016072373.nh
Lombardi, J., 2003. DRAC Interpreting: Coming Soon to a Courthouse Near You? Proteus 12(2), 7–9.
Ma, Z., 2022. A Comparative Study of Student Interpreters’ Performance in EC Simultaneous-
Consecutive and Consecutive Interpreting Modes (Unpublished MA dissertation). Beijing Foreign
Studies University. URL https://doi.org/10.26962/d.cnki.gbjwu.2022.000760
Mellinger, C.D., 2023. Embedding, Extending, and Distributing Interpreter Cognition with Tech-
nology. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Current and Future
Trends. John Benjamins, 195–216. URL https://doi.org/10.1075/ivitra.37.08mel
Mielcarek, M., 2017. Das simultane Konsekutivdolmetschen (Unpublished MA dissertation). Univer-
sity of Vienna.
Orlando, M., 2010. Digital Pen Technology and Consecutive Interpreting: Another Dimension in
Note-Taking Training and Assessment. The Interpreters’ Newsletter 15, 71–86.
Orlando, M., 2014. A Study on the Amenability of Digital Pen Technology in a Hybrid Mode of
Interpreting: Consec-Simul with Notes. Translation and Interpreting 6(2), 39–54.
Orlando, M., 2015. Implementing Digital Pen Technology in the Consecutive Interpreting Classroom.
In Andres, D., Behr, M., eds. To Know How to Suggest . . . Approaches to Teaching Conference
Interpreting. Frank & Timme, 171–199.
Orlando, M., 2016. Training 21st Century Translators and Interpreters: At the Crossroads of Prac-
tice, Research and Pedagogy. Frank & Timme.
Orlando, M., 2023. Using Smartpens and Digital Pens in Interpreter Training and Interpreting
Research. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Current and Future
Trends. John Benjamins Publishing Company, 6–26. URL https://doi.org/10.1075/ivitra.37.01orl
Orlando, M., Hlavac, J., 2020. Simultaneous-Consecutive in Interpreter Training and Interpreting
Practice: Use and Perceptions of a Hybrid Mode. The Interpreters’ Newsletter 25, 1–17.
Özkan, C.E., 2020. To Use or Not to Use a Smartpen: That Is the Question. An Empirical Study on
the Role of Smartpen in the Viability of Simultaneous-Consecutive Interpreting (Unpublished MA
dissertation). Ghent University.
Pisani, E., Fantinuoli, C., 2021. Measuring the Impact of Automatic Speech Recognition on Number
Rendition in Simultaneous Interpreting. In Wang, C., Zheng, B., eds. Empirical Studies of
Translation and Interpreting, 1st ed. Routledge, London.
Pöchhacker, F., 2007. Going Simul? Technology-Assisted Consecutive Interpreting. Forum 5(2),
101–124.
Pöchhacker, F., 2015. Simultaneous Consecutive. In Pöchhacker, F., ed. Encyclopedia of Interpreting
Studies. Routledge, London, 381–382.
Pöchhacker, F., 2016. Introducing Interpreting Studies, 2nd ed. Routledge, London.
Prandi, B., 2023. Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on
Terminology. Language Science Press.
Qin, X., 2021. Experimental Report on the Impact of Speech Recognition on Consecutive Interpret-
ing (Unpublished MA dissertation). Southwest University of Finance and Economics. URL https://
wf.pub/thesis/article:D02418773
Radford, A., Kim, J.W., Xu, T., Brockman, G., McLeavey, C., Sutskever, I., 2023. Robust Speech Rec-
ognition via Large-Scale Weak Supervision. In Proceedings of the 40th International Conference
on Machine Learning, 28492–28518.
Restuccia, M., 2023. Interpretazione di conferenza e IA: Studio sperimentale su Sight-Terp e la con-
secutiva assistita (Unpublished MA dissertation). Università degli Studi di Trieste.
Rodríguez, S., Gretter, R., Matassoni, M., Falavigna, D., Alonso, A., Corcho, O., Rico, M., 2021.
SmarTerp: A CAI System to Support Simultaneous Interpreters in Real-Time. In Proceedings of
Triton 2021, 102–109.
Rodríguez González, E., Saeed, M.A., Korybski, T., Davitti, E., Braun, S., 2023. Assessing the Impact
of Automatic Speech Recognition on Remote Simultaneous Interpreting Performance Using the
NTR Model. In Corpas Pastor, G., Hidalgo-Ternero, C.M., eds. International Workshop on Inter-
preting Technologies – SAY IT AGAIN 2023. Málaga, Spain, 2–3.11.2023.
Romano, E., 2018. Teaching Note-Taking to Beginners Using a Digital Pen. Między Oryginałem a
Przekładem 24(42), 9–16.
Romero-Fresco, P., 2011. Subtitling Through Speech Recognition: Respeaking. Routledge, Manchester.
Rosado, T., 2013. Note-Taking with iPad: Making Our Life Easier. The Professional Interpreter
(Blog). URL http://rpstranslations.wordpress.com/2013/05/28/note-taking-with-ipad-making-our-
life-easier-2/ (accessed 6.3.2024).
Rozan, J., 1956. La prise de notes en interprétation consécutive. Libraire de l’Université Georg, Geneva.
Seleskovitch, D., Lederer, M., 1989. Pédagogie raisonnée de l’interprétation. OPOCE/Didier Erudition.
Svoboda, S., 2020. SimConsec: The Technology of a Smartpen in Interpreting (Unpublished MA dis-
sertation). Palacký University Olomouc.
Tiselius, E., 2009. Revisiting Carroll’s Scales. In Angelelli, C.V., Jacobson, H.E., eds. American Trans-
lators Association Scholarly Monograph Series, Vol. XIV. John Benjamins Publishing Company,
95–121.
Ünlü, C., 2023. Automatic Speech Recognition in Consecutive Interpreter Workstation: Computer-Aided
Interpreting Tool ‘Sight-Terp’ (Unpublished MA dissertation). Hacettepe University.
Wang, X., Wang, C., 2019. Can Computer-Assisted Interpreting Tools Assist Interpreting? Translet-
ters. International Journal of Translation and Interpreting 3, 109–139.
Xin, X.Y., 2023. The Impact of Speech Recognition Software on Japanese-Chinese Consecutive Inter-
preting with a Focus on Proper Nouns (Unpublished MA dissertation). Dalian Foreign Studies
University. URL https://wf.pub/thesis/article:D03183164
Zhang, P., 2020. Empirical Study on the Impact of Speech Recognition Software on the Performance
of English-Chinese Consecutive Interpreters (Unpublished MA dissertation). Inner Mongolia Uni-
versity. URL https://wf.pub/thesis/article:D02007520
Zhang, Y., Han, W., Qin, J., Wang, Y., Bapna, A., Chen, Z., Chen, N., Li, B., Axelrod, V., Wang, G.,
Meng, Z., Hu, K., Rosenberg, A., Prabhavalkar, R., S. Park, D., Haghani, P., Riesa, J., Perng, G.,
Soltau, H., Strohman, T., Ramabhadran, B., Sainath, T., Moreno, P., Chiu, C.-C., Schalkwyk, J.,
Beaufays, F., Wu, Y., 2023. Google USM: Scaling Automatic Speech Recognition Beyond 100 Lan-
guages. arXiv preprint arXiv:2303.01037. URL https://arxiv.org/abs/2303.01037
Zhu, M., 2023. Workflow for the Tech-Augmented Interpreter with Cymo Note. In 2023 Innovation
in Interpreting Summit. URL https://learn.techforword.com/courses/2012632/lectures/45432025
(accessed 9.3.2024).
7
TABLET INTERPRETING
Francesco Saina
7.1 Introduction
Since the beginning of the diffusion of modern digital tablets in the early 2010s and within
the broader framework of the digitalisation of most interpreting-related processes (Saina,
2021a), interpreters have started to explore ways to incorporate such devices into their work in
support of their activity (Drechsel et al., 2018). To date, these exploratory approaches have
involved practitioners more than academic researchers.
Digital technology is increasingly embraced in daily life for managing tasks as well
as externalising memory and cognitive effort (Grinschgl and Neubauer, 2022). In this context,
tablet interpreting can be defined as 'the use of tablets to support one or more
aspects of interpreting and its associated activities’ (Goldsmith, 2023). Tablet interpret-
ing can be framed in the broader context (and growing domain of interest and study) of
computer-assisted interpreting (CAI) or, more widely, machine- or digitally assisted inter-
preting, in consideration of the differing interfaces and interactive natures of tablets.
Despite hesitations regarding the adoption of digital technology, as discussed in interpret-
ing studies literature (Tripepi Winteringham, 2010), both practice and recent research have
started to show how such tools can aid the interpreting activity in various
ways. Examples include dedicated terminology management tools and real-time AI-enabled
assistance software (see Prandi, this volume).
Despite developing scholarly attention towards interpreting technology and its impli-
cations for interpreting modalities, techniques, processes, and cognition, publications on
tablet interpreting tend to primarily consist of grey literature (i.e. informal texts, such as
web blog or social media posts by practitioners reflecting on their experience with the use
of tablets). Furthermore, there are currently fewer than a dozen academic publications on the
topic (Goldsmith, 2023). Existing research on tablet interpreting in academic settings is
based on limited samples of participants. These are frequently not representative of the pro-
fession and subject to significant methodological constraints, as will be detailed in the fol-
lowing sections. Moreover, most work on tablet interpreting is product-oriented, namely,
comparing device models and operating systems, considering tablet accessories (styluses,
keyboards), or assessing the usefulness and usability of specific applications.
The development and embedding into tablets of systems based on artificial intelligence
(AI) and, particularly, natural language processing (NLP), such as speech recognition and
speech-to-text technology or machine translation (MT) and multilingual support, appear to
be opening the way (primarily in the professional practice) to new and hybrid interpreting
modalities. These topics will be discussed in the following sections, as well as in other chap-
ters of the present volume (see contributions to this volume by Davitti, Fantinuoli, Prandi,
and Ünlü). While these practical uses have already been in place in the real world, albeit
unevenly, according to experience reported (mostly informally) by practising interpreters,
tablet usage is slowly starting to make its way into academic environments and formal
training institutions.
This contribution provides an overview of the first years of tablet interpreting practice
(Section 7.2) and research (Section 7.3). It describes the background and present research
gaps in the field and reports on the (still limited) adoption of tablet interpreting in pro-
fessional practice across different interpreting modalities (Subsections 7.3.1 to 7.3.3), as
documented by existing publications. In addition, this chapter highlights the implementa-
tions of tablet interpreting practice in interpreter training (Section 7.4) and finally outlines
directions for future investigation (Section 7.5).
documents and glossaries, or even access to the internet. Tablets can also be utilised when
designing and accomplishing educational activities in interpreting training settings. While
most digital devices could be used for the same purposes, tablets undoubtedly represent a
practical and manageable complement for interpreters on the move (Drechsel and Gold-
smith, 2016).
Tablets can even be used to perform and deliver distance (consecutive and simul-
taneous) interpreting services. Indeed, most videoconferencing and remote interpreting
platforms or systems can be accessed on tablets as well, both via web pages and with
dedicated applications. Furthermore, the latest, most advanced tablets are now set to
offer the same level of stable connectivity, robust performance, and equipment quality (in
terms of CPU, RAM, and network capabilities) as recommended or required for distance
interpreting services. While they do not appear to be used to perform interpreting itself
as yet, the function and power of portable and mobile devices are anticipated to continue
advancing in the future, thus eventually making this option no longer impractical or
uncommon.
Finally, tablets consistently benefit from accessibility features, such as speech
synthesis and customisable text size to aid screen reading. This has the potential to enable
interpreters with impairments to access and participate in the profession more easily
than they could have in the past. However, even though tablets have revolutionised the
market of augmentative and alternative communication (AAC) tools to a certain extent
by making portable, powerful devices accessible to wider audiences, advances in gesture
recognition and other sign language technologies are still meagre. As a result, tablets have
as yet found little representation in sign language interpreting research and practice.
and Cecchi, 2023). Tablets have been identified as providing a backchannel for communica-
tion between boothmates in distance interpreting settings.
An unpublished MA thesis study on mobile devices for simultaneous interpreting (Paone,
2016) appears to be the only work available that focuses on the use of tablets in this inter-
preting modality. Conducted by surveying 21 Austrian conference interpreters, the study
had participants reporting that they used tablets to perform tasks related to their simultaneous
interpreting assignments. These tasks included looking up terminology, taking
notes in the booth, and recording and listening back to their own renditions for personal
reference.
Thus, in simultaneous interpreting studies, the existing gap in academic research on the
use of tablets – which is mostly documented in grey literature by practitioners reflecting on
their exposure to such devices – is even more evident.
Another novel hybrid modality, which combines consecutive interpreting with sight
translation based on an automatically generated real-time speech transcription, is referred
to as ‘SightConsec’ (see also Orlando and Ünlü, this volume). In this case, too, the inter-
preter can take complete or reduced notes (as compared to ‘traditional’ consecutive) before
rendering the message by reading from, or referring to, the transcription. In this instance,
as emerging research is starting to report, a subsequent raw machine translation step could
be added as reference for the interpreter.
Generating running transcripts and draft translations is also becoming increasingly easy
on tablets, thanks to the integration of automatic speech recognition (ASR), MT, and mul-
tilingual support into digital and mobile devices. While no experimental or exploratory
research has yet been conducted academically into this mode (either with or without tab-
lets), dedicated commercial products and applications, specifically designed for interpreters,
are already available on the market. Initial tests of this modality could, for instance, com-
pare sight translation (based on a text transcription) with the simultaneous-oriented variant
(based on an audio recording). Researchers could then assess the potential impact of the
different backup channels on the interpreter’s consecutive performance. Another avenue of
research could also be to assess whether a combination of the two backup channels could
be a viable and effective option for interpreters.
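To illustrate the optional raw MT reference step mentioned above, the sketch below assumes that the running transcript already arrives as text segments from an ASR component and generates an on-demand draft translation of a selected segment with an open-source MT model. The model and the English–Italian pair are illustrative assumptions rather than features of any existing product.

```python
# Rough illustration of the optional raw-MT reference step in a
# SightConsec-style workflow: the running ASR transcript is assumed to be
# available as text segments, and the interpreter can request a draft
# machine translation of any segment as an additional backup channel.
# The MT model and the English-Italian pair are illustrative assumptions.
from transformers import pipeline

# Open-source MT model for the assumed language pair.
mt = pipeline("translation", model="Helsinki-NLP/opus-mt-en-it")

# Running transcript as it might arrive from an ASR component (assumed).
running_transcript: list[str] = []


def add_segment(segment: str) -> None:
    """Append a newly recognised segment to the running transcript."""
    running_transcript.append(segment)


def draft_translation(index: int = -1) -> str:
    """Return a raw MT draft of one transcript segment (default: latest),
    for the interpreter to consult alongside notes and the transcript."""
    segment = running_transcript[index]
    return mt(segment)[0]["translation_text"]


if __name__ == "__main__":
    add_segment("Global temperatures rose by 1.2 degrees over the last decade.")
    print(running_transcript[-1])
    print(draft_translation())
```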
However, despite having been explored in professional settings, both these hybrid
modalities are still far from being consolidated and used in regular practice in interpreting.
This is in spite of the potential usefulness and support they provide to interpreters working
from spoken into signed languages.
Digital mobile devices, including tablets, now hold the processing power and capacities
to support interpreting preparation, performance, and quality control (and even provide a
platform for the actual delivery of interpreting). Consequently, their applications and the
differences between these and other devices (mainly laptops and computers) need to be ade-
quately explored and studied. In interpreting technology research in general, there is scant
evidence of how interpreting work could be optimised using multiple inputs and external
prompts, be this via a laptop or a tablet. This perspective could also enable comparisons to
be made between device categories and their ergonomics (workstation set-up and space saving),
and could help establish whether significant benefits are associated with one device over
others, or with particular settings (for instance, in person vis-à-vis remotely). This would help better
define the appropriate scope and the best-suited scenarios for applications of these devices.
Following the trajectory of other studies on CAI tools (Frittella, 2023), further research
could also investigate how user interfaces of dedicated tablet applications and systems are
designed in order to create optimal interpreter–device interaction. In particular, an applica-
tion designed and developed specifically for consecutive interpreting could feature dedicated
note-taking functionalities. These could include customisable shortcuts
for frequently used symbols, signs, abbreviations, and terminology or background pop-up
windows containing knowledge that interpreters may need to repeatedly recall during their
rendition (such as names, terms, or other relevant information).
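By way of illustration only, such a shortcut functionality could be represented as simply as in the following sketch, in which all codes, expansions, and function names are hypothetical examples rather than features of an existing application.

```python
# Purely hypothetical sketch of a customisable shortcut store that a
# consecutive-interpreting note-taking application could offer: short codes
# typed (or handwritten and recognised) by the interpreter expand into
# frequently used symbols, abbreviations, or terminology. All entries and
# function names are illustrative, not taken from any existing tool.
shortcuts: dict[str, str] = {
    "gov": "government",
    "intl": "international",
    "UNSC": "United Nations Security Council",
}


def add_shortcut(code: str, expansion: str) -> None:
    """Register a personal shortcut, e.g. while preparing an assignment."""
    shortcuts[code] = expansion


def expand(note_line: str) -> str:
    """Expand registered shortcuts in a line of notes, longest codes first
    so that shorter codes do not pre-empt longer ones."""
    for code in sorted(shortcuts, key=len, reverse=True):
        note_line = note_line.replace(code, shortcuts[code])
    return note_line


if __name__ == "__main__":
    add_shortcut("EC", "European Commission")
    print(expand("EC and UNSC gov reps discuss intl sanctions"))
```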
More technical trials could also study how NLP-based systems, such as ASR, could be
used by interpreters to assess and improve (both in real time and retrospectively) the qual-
ity of their renditions in professional settings. Similarly, the use of technology such as face
detection and eye tracking could potentially compensate for some of the detected shortcomings
of consecutive and tablet interpreting. To illustrate, interpreters could be prompted to look
up if they are found to be gazing too much at the screen, thereby improving eye contact.
Also worth further investigation is interpreting technique, specifically with regard to
consecutive interpreting. Studies could focus on potential variations in note-taking method
and style across tablet-enabled modalities (consecutive on a tablet, SimConsec, and Sight-
Consec). Other studies could also explore latency. Across the range of
interpreting modes, some research has already examined the maximum acceptable
latency of the automated output of tools (Fantinuoli and Montecchio, 2022)
for this output to be cognitively adequate and effectively useful for interpreters. However, more
can be done. Similarly, research focusing on the interpreter’s overreliance on these systems
(Defrancq and Fantinuoli, 2021) is also minimal and can be further explored.
long-established work processes and habits, shortage of evidence- and research-based ben-
efits, and insufficient extended specific training or upskilling programmes.
In the area of interpreter training (and continued professional development), research
could focus more precisely on the acquisition of new skills and competencies that are required
for interpreters to integrate and benefit from the use of tablets in interpreting. Research in this
area could also explore the learning curve related to the use of this technology and whether
this has had an impact on its limited adoption thus far compared to other interpreter tools.
In this vein, thanks to a more comprehensive body of investigation into interpreter–
tablet interaction, new interpreting-specific skills could also emerge. Potential skills include
the competence to best leverage different devices or even adapt interpreting styles and pro-
cesses to the resources being used (e.g. by adjusting note-taking or décalage to the latency of
a tool) or the expertise and critical thinking required to decide which tasks can be optimised
through digital technology (and in which scenarios). An education-oriented
research path such as this could also guide instructional design proposals and assist in pro-
viding effective training on interpreting with these devices.
an assignment by studying documents and texts on a tablet – may change over time. For
example, interpreters could be studied as they become increasingly better acquainted with the
use of a digital device in their daily lives. Alternatively, studies could focus on the evolution
of the technological medium and how it influences comprehension and knowledge retention
over time.
7.6 Conclusion
While interest in digital technology in interpreting is increasing gradually, the use of tablets
still appears to be a niche area in both research and professional practice. There are currently
limited academic publications on the subject (in addition to a few anecdotal accounts),
and only restricted circles of practitioners resort to these devices in their regular work
routines. This chapter has aimed to provide a comprehensive overview of the early years
of tablet interpreting, from its exploratory uses in professional practice (Section 7.2) as a
portable and convenient device to support interpreter preparation and performance, to
research (Section 7.3) on its application in consecutive, simultaneous, and hybrid interpret-
ing modalities (with all its constraints and limitations) and its use in support of interpreter
training (Section 7.4).
Amidst the growing relevance of AI and the level of interest regarding the impact it
may have on interpreting, any AI-based application that is addressed to interpreters (from
real-time, ASR-generated transcriptions to instant access to terminological databases) can
be easily made available on tablets. In addition, the level of portability and multifunction-
ality that tablets possess makes them ideal tools for the dynamic environments in which
interpreters often operate. However, potential associated hurdles (such as information over-
load or excessive reliance on system outputs) require further investigation. Additionally, the
potential impact of tablet use on an interpreter's cognitive load remains
relatively unexplored. It could be hypothesised that the combination of multiple inputs and
functions could exacerbate interpreters' mental strain when they are already focusing on trying
to convey accurate messages across languages.
Human-specific aspects of interpreting and cross-language communication should not be
neglected either. Nuances such as cultural peculiarities, contextual subtleties, and elements
beyond verbal language are essential for interpreting, multilingual communication, and
exchanges among people. These may be at risk if technology-driven efficiency is prioritised
over higher-level human connection. Therefore, while AI-equipped tablets in interpreting
can indeed offer promising benefits in terms of support and accuracy, they also present
critical challenges. These need to be carefully considered in order to ensure proper and
favourable integration of tablets into interpreting practices. Nevertheless, these assump-
tions, common to digitally assisted interpreting on any device, can only be assessed and
scrutinised through empirical, evidence-based research.
In conclusion, despite its potential capabilities, tablet interpreting does not yet constitute
an established space in the interpreting field. This view takes into consideration the limited
amount of investigation that has taken place in this area so far, as well as the sporadic
reports written over recent years, which attest to a burgeoning interest in this technology,
even though it was not specifically developed for interpreting purposes. However, it remains to be
said that tablets possess the potential to be widely adopted by the profession and may prove
beneficial for interpreting in the future.
References
Altieri, M., 2020. Tablet Interpreting: Étude expérimentale de l’interprétation consécutive sur tab-
lette. The Interpreters’ Newsletter 25, 19–35.
Arumí, M., Sánchez-Gijón, P., 2019. La presa de notes amb ordinadors convertibles en l’ensenyament-
aprenentatge de la interpretació consecutiva. Resultats d’un estudi pilot en una formació de màster.
Revista Tradumàtica. Tecnologies de la Traducció 17, 128–152.
Bertozzi, M., Cecchi, F., 2023. Simultaneous Interpretation (SI) Facing the Zoom Challenge:
Technology-Driven Changes in SI Training and Professional Practice. In Corpas Pastor, G.,
Hidalgo-Ternero, C.M., eds. Proceedings of the International Workshop on Interpreting Tech-
nologies JUST SAY-IT 2023. Shoumen, Incoma, 32–40.
Clark, A., Chalmers, D., 1998. The Extended Mind. Analysis 58(1), 7–19.
Corpas Pastor, G., Fern, L.M., 2016. A Survey of Interpreters’ Needs and Practices Related to Language
Technology. Universidad de Málaga, Malaga. URL www.lexytrad.es/assets/Corpas-Fern-2016.pdf
(accessed 29.9.2024).
Defrancq, B., Fantinuoli, C., 2021. Automatic Speech Recognition in the Booth: Assessment of Sys-
tem Performance, Interpreters’ Performances and Interactions in the Context of Numbers. Target
33(1), 73–102.
Drechsel, A., Bouchard, M., Feder, M., 2018. Inter-Institutional Training Cooperation on the Use of
Tablets in Interpreting. CLINA 4(1), 105–114.
Drechsel, A., Goldsmith, J., 2016. Tablet Interpreting: The Use of Mobile Devices in Interpreting. In
Forstner, M., Lee-Jahnke, H., eds. CIUTI-Forum 2016: Equitable Education Through Intercul-
tural Communication: Role and Responsibility for Non-State Actors. Frankfurt, Peter Lang.
Drechsel, A., Goldsmith, J., 2020. The Tablet Interpreting Manual: A Beginner’s Guide. URL www.
techforword.com/ (accessed 29.9.2024).
Fantinuoli, C., 2017. Computer-Assisted Preparation in Conference Interpreting. Translation & Inter-
preting 9(2), 24–37.
Fantinuoli, C., Montecchio, M., 2022. Defining Maximum Acceptable Latency of AI-Enhanced CAI
Tools. URL https://doi.org/10.48550/arXiv.2201.02792
Frittella, F.M., 2021. Computer-Assisted Conference Interpreter Training: Limitations and Future
Directions. Journal of Translation Studies 1(2), 103–142.
Frittella, F.M., 2023. Usability Research for Interpreter-Centred Technology: The Case Study of
SmarTerp. Language Science Press, Berlin.
Goldsmith, J., 2017. A Comparative User Evaluation of Tablets and Tools for Consecutive Interpret-
ers. In Esteves-Ferreira, J., Macan, J.M., Mitkov, R., Stefanov, O.M., eds. Proceedings of the 39th
Conference Translating and the Computer. Tradulex, Geneva, 40–50.
Goldsmith, J., 2018. Tablet Interpreting: Consecutive Interpreting 2.0. Translation and Interpreting
Studies 13(3), 342–365.
Goldsmith, J., 2023. Tablet Interpreting: A Decade of Research and Practice. In Corpas Pastor,
G., Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John Benjamins,
27–45.
Grinschgl, S., Neubauer, A.C., 2022. Supporting Cognition with Modern Technology: Distributed
Cognition Today and in an AI-Enhanced Future. Frontiers in Artificial Intelligence 5. URL https://
doi.org/10.3389/frai.2022.908261
Hamidi, M., Pöchhacker, F., 2007. Simultaneous Consecutive Interpreting: A New Technique Put to
the Test. Meta 52(2), 276–289.
Liu, Z., 2005. Reading Behavior in the Digital Environment: Changes in Reading Behavior Over the
Past Ten Years. Journal of Documentation 61(6), 700–712.
McLuhan, M., 1962. The Gutenberg Galaxy: The Making of Typographic Man. University of Toronto
Press, Toronto.
McLuhan, M., 1964. Understanding Media: The Extensions of Man. McGraw-Hill, New York.
Moser-Mercer, B., 2015. Technology and Interpreting: New Opportunities Raise New Questions.
URL https://oeb.global/oeb-insights/interpreting-technology/ (accessed 29.9.2024).
Ong, W.J., 1982. Orality and Literacy: The Technologizing of the Word. Methuen, London.
Orlando, M., 2016. Training 21st Century Translators and Interpreters: At the Crossroads of Prac-
tice, Research and Pedagogy. Frank & Timme GmbH.
Orlando, M., 2023. Using Smartpens and Digital Pens in Interpreter Training and Interpreting
Research: Taking Stock and Looking Ahead. In Corpas Pastor, G., Defrancq, B., eds. Interpreting
Technologies – Current and Future Trends. John Benjamins, 6–26.
Paone, M.D., 2016. Mobile Geräte beim Simultandolmetschen mit besonderem Bezug auf Tablets.
University of Vienna, Vienna. Unpublished.
Risku, H., Rogl, R., 2020. Translation and Situated, Embodied, Distributed, Embedded and Extended
Cognition. In Alves, F., Jakobsen, A., eds. The Routledge Handbook of Translation and Cognition.
Routledge, London, 478–499.
Saina, F., 2021a. Technology-Augmented Multilingual Communication Models: New Interaction
Paradigms, Shifts in the Language Services Industry, and Implications for Training Programs. In
Turchi, M., Fantinuoli, C., eds. Proceedings of the 1st Workshop on Automatic Spoken Language
Translation in Real-World Settings (ASLTRW). Association for Machine Translation in the Ameri-
cas, 49–59. URL https://aclanthology.org/2021.mtsummit-asltrw.5 (accessed 29.9.2024).
Saina, F., 2021b. Remote Interpreting: Platform Testing in a University Setting. In Mitkov, R.,
Sosoni, V., Giguère, J.C., Murgolo, E., Deysel, E., eds. Proceedings of the Translation and
Interpreting Technology Online (TRITON) Conference, 57–67. URL https://doi.org/10.26615/978-954-452-071-7_007
Saina, F., 2022. Scenari didattici della mediazione linguistica nell’era digitale. In Petrocchi, V., ed.
Spunti e riflessioni per una didattica della traduzione e dell’interpretariato nelle SSML. Configni
(Rieti), CompoMat, 107–119.
Sandrelli, A., 2015. Computer-Assisted Interpreter Training. In Pöchhacker, F., ed. Routledge Ency-
clopedia of Interpreting Studies. Routledge, London, 75–77.
Sprevak, M., 2019. Extended Cognition. In Crane, T., ed. The Routledge Encyclopedia of Philosophy.
Routledge, London. URL https://dx.doi.org/10.4324/9780415249126-V049-1
Tripepi Winteringham, S., 2010. The Usefulness of ICTs in Interpreting Practice. The Interpreters’
Newsletter 15, 87–99.
Wang, Y., Tian, Y., Jiang, Y., Yu, Z., 2023. The Acceptance of Tablet for Note-Taking in Consecutive
Interpreting in a Classroom Context: The Students’ Perspectives. Forum for Linguistic Studies
5(2). URL https://doi.org/10.59400/fls.v5i2.1862
PART II
8.1 Introduction
Computer-assisted interpreting (CAI) can be defined as the use of technological applications,
such as terminology management and automatic speech recognition (ASR), on hardware,
such as laptops or tablets, to provide support to interpreters for one or more sub-processes
in their workflows. The use of technologies as support (Braun, 2019) is but one of the
potential applications of technology to interpreting: Other areas are technology-enabled
interpreting (see Part I of this volume) and technology (semi-)automating interpreting
workflows (see Part III of this volume). These additional applications are closely linked
with CAI: For instance, support is often achieved through automation. This is increasingly
true when it comes to artificial intelligence (AI), as the same technologies can be used both
to help interpreters and to automate interpretation, as is the case for ASR and machine
translation (MT). Another example of the growing interplay between CAI and other tech-
nological applications in interpreting is the inclusion of CAI functionalities, either for the
automation of subtasks in the preparation phase or for live support during the assignment,
in remote simultaneous interpreting (RSI) platforms (e.g. Rodríguez et al., 2021; Fantinuoli
et al., 2022).
The definition of CAI is therefore broad and can essentially be taken to mean
‘technology-supported interpreting’. The term CAI tool, however, has been used both in a
broad and in a narrow sense in scholarship. In some publications (e.g. Costa et al., 2014), it
indicates any piece of software interpreters can use as support in their workflows. Based on
this definition, even unit converters and terminology management programs for translators
could be considered CAI tools. Other scholars, particularly those carrying out empirical
research on this subset of interpreting technologies, define CAI tools more narrowly as
all sorts of computer programs and mobile applications specifically designed and
developed to assist interpreters in at least one of the different sub-processes of inter-
pretation, for example, knowledge acquisition and management, lexicographic mem-
orisation, terminology access, and so forth.
(Fantinuoli, 2021, 512)
This definition stresses the bespoke nature of these tools, often created by interpreters
for interpreters. An attempt to bridge these divergent definitions following a functional
approach is offered by Guo et al. (2023, 91), who describe CAI tools as
pieces of computer software, mobile phone applications, or digital devices that can
be used during the interpreting process to reduce the cognitive stress that interpreters
face and to enhance overall processing capacity. They are an integral part of the inter-
preting process. They are also directly linked to and might positively affect the cog-
nitive processes that underlie the task of interpreting by reducing working-memory
stress, eliminating production difficulties, and such.
This definition is in line with Will’s (2020) classification of CAI tools according to whether
they provide support for the in-process phase (primary CAI tools) or the pre-process phase
of interpreting (secondary CAI tools) or for both (integrated CAI tools). Guo et al.’s defini-
tion can cover both primary and integrated CAI tools, but not secondary CAI tools, which
leaves out a considerable number of applications developed for interpreters’ advance
preparation.
To avoid any terminological confusion and clearly identify the type of applications
which will be discussed in this chapter, CAI tools will henceforth be defined as those applications capable of providing support pre-, in-, peri-, and/or post-process (see Kalina, 2005) and that have been specifically developed with the goal of optimising interpreters’ workflows
and cognitive processes, increasing their productivity and ultimately improving the quality
of their interpretation.
This chapter first discusses CAI tools from the point of view of their historical develop-
ment (Section 8.2), tracing their evolution, providing an overview of the current landscape,
and discussing their uptake and reception among professionals. With the goal of orienting
future investigations, Section 8.3 spotlights the main areas of enquiry in empirical research
on CAI tools, that is, interpreter-centric research (Section 8.3.1) and research focusing on
the evaluation of CAI systems (Section 8.3.2). This central section offers a critical overview
of where we stand in terms of our understanding of interpreter–tool interaction, of the
implications of CAI tool use for interpreters’ cognitive processes and performance, of the
affordances and limitations of bespoke technologies for interpreters, and of the method-
ologies we adopt to investigate such topics. Training on and with CAI tools is discussed
in Section 8.4, which highlights current issues and open questions. The chapter concludes
by exploring potential future advancements in CAI, considering the recent progress in AI
technology.
optimise their preparation workflows, rationalise their processes, and in turn, alleviate some
of the effort resulting from a highly demanding cognitive activity, such as interpreting. Stoll
(2009) suggested that CAI tools should aim to ‘move cognition ahead’ of the interpreting
task, by allowing interpreters to pre-process their preparation material, anticipating some
of the challenges they may encounter during the interpreting phase. When receiving the
text of a speech in advance, for instance, they could annotate the document with difficult
terminology, highlight names and numbers, and do so in a dedicated digital environment
that would facilitate this kind of preparatory work.
Interpreters have specific needs which partly differ from those of translators. They work
under high time pressure and can conduct terminological work mainly before the assign-
ment. They require highly personalised tools, offering speed of consultation and intuitive
navigation, allowing terminology lookup and updating in the booth (Rodríguez and Sch-
nell, 2009). Rütten (2004, 2007) sketched the ideal architecture of a tool for interpreters
which would support working with documents, extracting terminology, creating and stor-
ing glossaries, efficiently accessing terminological resources, and training. Starting from a
central homepage, interpreters would be able to access the different modules. This proposed
structure highlights how CAI tools were initially meant primarily for preparation, specifi-
cally for terminology work and glossary creation and management. The first tools were
therefore developed with these aims in mind and for conference interpreters working in the
simultaneous mode. At that time, simultaneous interpreting already saw the use of laptops
in the booth, whereas the use of tablets for consecutive interpreting (see, for example, Gold-
smith, 2018) was first envisaged only several years later. Additionally, the reduced visibility of interpreters working in the simultaneous mode may foster the use of technology as support to a greater extent than in consecutive interpreting.
thanks to dedicated interfaces for terminology extraction (e.g. Boothmate),4 and present an
advanced search algorithm for manual queries in the booth. When using these tools, inter-
preters need only type a few letters to start a query, without having to press the Enter key
and worry about typing mistakes.
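To make this kind of interaction more concrete, the following minimal sketch (in Python, using only the standard library's difflib module) shows how a query-as-you-type lookup with some tolerance for typing mistakes could work. The glossary entries and function names are invented for illustration and do not reproduce the search algorithm of any particular tool.

```python
import difflib

# A toy bilingual glossary (illustrative entries only).
GLOSSARY = {
    "photovoltaic cell": "cella fotovoltaica",
    "feed-in tariff": "tariffa di riacquisto",
    "grid parity": "parità di rete",
    "inverter": "inverter",
}

def lookup(partial_query: str, max_hits: int = 3):
    """Return glossary entries matching a partial, possibly misspelled query.

    Called on every keystroke, so no Enter key is needed: prefix matches are
    returned first, then close (typo-tolerant) matches found via difflib.
    """
    q = partial_query.lower().strip()
    if not q:
        return []
    prefix_hits = [t for t in GLOSSARY if t.startswith(q)]
    fuzzy_hits = difflib.get_close_matches(q, GLOSSARY.keys(), n=max_hits, cutoff=0.6)
    # Preserve order, drop duplicates, cap the number of suggestions.
    ordered = list(dict.fromkeys(prefix_hits + fuzzy_hits))[:max_hits]
    return [(term, GLOSSARY[term]) for term in ordered]

# Simulated keystrokes: a typo ("fotovoltaic") still retrieves the entry.
for typed in ["pho", "fotovoltaic"]:
    print(typed, "->", lookup(typed))
```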
The true innovation in supporting technology for interpreters comes from a third gen-
eration of CAI tools which rely heavily on support through automation, often including AI
applications. This covers all phases of the workflow: For instance, interpreters can auto-
matically extract terminology from documents within a few seconds, generate terminology
lists by providing the tool with a URL or domain-related keywords, and machine-translate
such lists or automatically summarise texts. Particularly prominent in third-generation tools
is the use of ASR (Fantinuoli, 2017b) and named entity recognition (NER) to automatically
prompt interpreters for common problem triggers, such as numbers, specialised terminol-
ogy, and named entities (Gile, 2009). InterpretBank5 is currently the only stand-alone CAI
tool in this category offering all these functions. Sight-Terp6 can also be regarded as a
third-generation CAI tool but does not offer preparation-related functions and only focuses
on the in-process phase.
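As a rough illustration of this kind of in-process prompting, the sketch below (Python) scans an incoming ASR transcript segment for numbers and previously prepared glossary terms and turns them into prompts that could be displayed to the interpreter. The term list and the regular expression are simplified assumptions; actual tools rely on trained ASR and NER models rather than pattern matching.

```python
import re

# Illustrative term list an interpreter might have prepared for the assignment.
GLOSSARY_TERMS = {"gross domestic product": "prodotto interno lordo",
                  "inflation rate": "tasso di inflazione"}

NUMBER_PATTERN = re.compile(r"\b\d[\d,.]*\s*(?:%|per cent|million|billion)?", re.IGNORECASE)

def extract_prompts(asr_segment: str):
    """Return (label, content) prompts for numbers and known terms in an ASR segment."""
    prompts = []
    for match in NUMBER_PATTERN.finditer(asr_segment):
        prompts.append(("NUMBER", match.group().strip()))
    lowered = asr_segment.lower()
    for term, translation in GLOSSARY_TERMS.items():
        if term in lowered:
            prompts.append(("TERM", f"{term} -> {translation}"))
    return prompts

segment = "The inflation rate fell to 3.2 % while gross domestic product grew by 1.5 billion euros."
for label, content in extract_prompts(segment):
    print(f"[{label}] {content}")
```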
Some authors suggest including a fourth generation of tools, distinguishing cloud-based applications from desktop-based ones. The addition of a fourth generation may
indeed be warranted by the fact that interpreters working for RSI platforms can now use
integrated CAI tools, as is the case for SmarTerp&Me (Rodríguez et al., 2021) and KUDO
Interpreter Assist (Fantinuoli et al., 2022). Figure 8.1 sums up the available CAI tools, the
phases they support, and the integrated technologies.
use of technology, which sometimes include CAI tools among the types of applications sur-
veyed. At present, however, we can mostly formulate hypotheses as to interpreters’ actual
use of CAI tools and their attitudes towards these applications and, at best, draw tentative
conclusions. Obtaining a clear picture of CAI tool use is complicated by the current dearth
of data. The picture of an interpreter only modestly using supporting technologies, and in
particular, CAI tools, emerges from data collected several years ago, before the COVID-19
pandemic, which shifted much of interpreting online, with an increasing integration of
technology into interpreters’ workflows. For this reason, the surveys often cited to paint this picture may reflect outdated views and facts about interpreting technologies in general
and CAI tools in particular. Even before the pandemic, the usefulness of the data collected
for inferring interpreters’ adoption of CAI tools was limited by a certain lack of clarity as to
what was considered a CAI tool or which technologies were under investigation.
The seemingly limited uptake of CAI tools by interpreters is often attributed to an
attitude of general scepticism (see, for example, Tripepi Winteringham, 2010). How-
ever, interpreters’ motivations for rejecting technologies are rarely surveyed and often
stem from scholars’ anecdotal observations or suppositions. Some exceptions are the sur-
veys conducted by Mellinger and Hanson (2018), Deysel (2023) and Fan (2024). The first
pulled together several validated instruments to investigate potential relationships between
interpreters’ communication apprehension, visibility, and personal technology adoption
propensity. The survey highlighted how additional factors, such as the interpreting setting,
interpreters’ self-perception of their role, and even the availability of technologies devel-
oped to support interpreting, may explain interpreters’ attitudes towards the uptake of
technologies beyond their personal inclinations. The survey conducted by Fan (2024) also
highlights similar mediating factors, such as interpreters’ concern about the reliability of
such tools. In her survey, Deysel (2023) specifically explored interpreters’ concerns about
technology. While her investigation did not only pertain to CAI tools, it is useful as it
highlighted how interpreters worry about tools potentially interfering with their cognitive
processes, proving distracting, and requiring excessive attention or processing capacity.
Additional reasons for the limited acceptance of CAI tools may be a lack of satisfaction with currently available functionalities, as the survey by Corpas Pastor and Fern (2016) on terminology management tools would suggest. Qualitative data from empirical studies may
also help get a clearer picture of the reasons behind interpreters’ limited uptake of CAI
tools. Several study participants report being sometimes distracted by the tools (e.g. Prandi,
2015b, 2023; Desmet et al., 2018) and underline the importance of getting accustomed to
the tool to make the best use of it (e.g. Defrancq and Fantinuoli, 2020; Pisani and Fantinu-
oli, 2021). Most importantly, they observe that for a tool to be considered useful and sat-
isfactory, it must be dependable, help them achieve better performance than they would without it, and
ideally offer features tailored to their own needs (e.g. Frittella, 2023; Ünlü, 2023a).
Despite these initial insights, we have only scratched the surface of the complex rela-
tionship between interpreters and CAI tools. While we may hypothesise that interpreters’
views of technology have changed and will continue to evolve in the coming years, further
dedicated investigation is needed to understand the prevalence and perception of tools spe-
cifically created for interpreters and to be able to draw conclusions which may inform CAI
tool development.
Although the first generation of CAI tools was primarily built to provide support for the
preparation phase of an interpreting assignment, research on computer-assisted interpret-
ers’ preparation is still rather scarce.
The idea of using technology, primarily from the area of corpus linguistics and natural
language processing, to rationalise interpreters’ glossary work can be summed up in Fan-
tinuoli’s (2017a) corpus-driven interpreters’ preparation (CDIP). Fantinuoli observed that
interpreters prepare for their assignments under high time pressure, often under suboptimal
conditions, as they lack relevant preparation materials. Furthermore, they are called upon
to facilitate communication on specialised subjects, in which they are often not experts
themselves, and to acquire knowledge and specialised terminology on a variety of topics. To
overcome these limitations, interpreters can use dedicated tools to create ad hoc corpora of
domain-relevant documents and explore such collections of texts to reconstruct the under-
lying conceptual systems and extract terminology.
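The frequency-based reasoning behind this kind of term extraction can be sketched as follows (Python, standard library only), assuming an ad hoc corpus is already available as plain-text files. The toy reference frequency list and the simple 'weirdness' ratio are assumptions for illustration; real tools use considerably more sophisticated extraction methods.

```python
import re
from collections import Counter
from pathlib import Path

# Very small general-language frequency list standing in for a reference corpus.
GENERAL_FREQ = Counter({"the": 100000, "of": 60000, "energy": 800, "cell": 500})

def term_candidates(corpus_dir: str, top_n: int = 20):
    """Rank single-word term candidates from an ad hoc corpus.

    Words that are much more frequent in the domain corpus than in the general
    reference list (a simple 'weirdness' ratio) float to the top.
    """
    domain_freq = Counter()
    for path in Path(corpus_dir).glob("*.txt"):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        domain_freq.update(re.findall(r"[a-zà-ÿ][a-zà-ÿ-]{3,}", text))
    scored = {
        word: count / (GENERAL_FREQ.get(word, 0) + 1)  # +1 avoids division by zero
        for word, count in domain_freq.items()
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

# Usage (hypothetical folder of preparation documents):
# print(term_candidates("preparation_documents/"))
```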
Empirical work by Xu and Sharoff (2014) and Xu (2018) explored the question of
whether corpus-based preparation provides advantages to interpreters as compared to
manual preparation. In a preliminary study contrasting three term extraction tools for
English and Chinese, Xu and Sharoff (2014) found that trainees in the experimental
group saved, on average, half the time compared to the students who extracted terms
manually. Students in the experimental group further observed that the term list ori-
ented them as to the terms and concepts worth prioritising when preparing. In a subse-
quent study involving 22 Chinese students divided into a control and a test group, Xu
(2018) found that the test group, working with a corpus creation tool, an automatically
extracted term list, and a concordancer, achieved significantly better results with regard
to terminological accuracy in SI, number of omissions, preparation time required, and
post-task recall of terms. This study has the merit of linking the preparation process to
the interpreting task, showing that the potential benefits of CDIP go beyond time savings
for interpreters.
It should be noted, however, that the manual and the automatic terminological prepa-
ration approaches are not entirely comparable, as they may pursue different goals. While
reading preparation documents and manually extracting terminology helps interpreters
gain domain knowledge and engage with the subject matter, automatic approaches allow
them to process large amounts of textual data and inevitably entail a focus on the special-
ised language used. The two approaches, then, can be integrated to achieve a thorough and
extensive level of preparation.
Future research may choose to investigate other aspects of computer-assisted prepara-
tion beyond corpus creation and terminology extraction, for instance, addressing the use of
speech recognition in the documentation phase (e.g. Gaber et al., 2020). CAI tools offer a
plethora of options to speed up glossary creation (see Section 8.2.2 for a detailed descrip-
tion). Aspects such as the impact of automatic glossary creation on interpreters’ learning
processes in the preparation phase and the subsequent effects on their interpretation, as well
as interpreters’ perception of automatically generated terminological resources, are yet to
be widely explored empirically but may yield useful insights for CAI tool development and
training.
Much of the concern expressed by interpreters about CAI tools’ ability to support them
revolves around the limited cognitive resources they have available for any additional
activity during interpreting, particularly in the simultaneous mode. It is only natural that
empirical research on CAI tools has focused mainly on the in-process phase and on the
simultaneous mode, which imposes the additional constraint that the interpreter’s pace is
largely determined by the speaker. Research on number interpreting with CAI tool support
gained momentum after Fantinuoli (2017b) proposed the integration of ASR into CAI tools
to prompt interpreters for common problem triggers (but see also Hansen-Schirra, 2012,
for a first theorisation of CAI-ASR integration). With staggering improvements in ASR
performance (see Section 8.3.2.1), using this technology to aid interpreters when faced with
numbers, the ‘problem trigger par excellence’ (Frittella, 2019), seems feasible and has been
the object of several studies.
When examining empirical research on ASR support for numbers, an evident method-
ological issue emerges: ‘Number interpreting’ is conceptualised differently in the studies
conducted, as observed by Frittella (2022b), limiting the alignment of research methods,
the comparability of studies, and the interpretation of results. Most studies conducted
on number interpreting with CAI tool support focus on the interpretation of the number
word, without expanding the analysis to other elements of discourse (e.g. only check-
ing whether the number ‘150’ is interpreted correctly instead of ‘150 km’). This narrow
focus allows for a direct assessment of whether interpreters are able to pick up the sug-
gestions offered by the tool but does not tell us how the input provided is integrated into
the interpreter’s rendition, or how such integration may affect the overall process. One
example is the study conducted by Desmet et al. (2018). In an experiment involving ten
Dutch students, the authors mocked up a system automatically prompting interpreters
for numbers of different magnitude and complexity, using PowerPoint slides time-aligned
with the speech. Evaluating the percentage of numbers correctly interpreted by the study
participants as compared to the condition without support, they identified statistically
significant accuracy gains of around 30%, especially for complex numbers and decimals,
and almost 90% fewer approximations. Defrancq and Fantinuoli (2020) and Pisani and
Fantinuoli (2021) also focused on the number word, using, however, a real application.
Both studies supported the initial findings by Desmet et al. (2018), identifying accu-
racy gains of 11.5–44.2% and 25%, respectively, for the CAI tool condition, although
Defrancq and Fantinuoli (2020) observed that differences were statistically significant
only for two out of six study participants.
Overcoming the narrow focus of these initial studies, Frittella (2022a, 2022b, 2023)
argued in favour of a more nuanced and holistic approach to research on computer-assisted
number interpreting. Her analysis expanded the scope by examining the entire numerical
information unit. For instance, the sentence ‘Chinese export value decreased by 3 billion US
dollars in 2019’ can be analysed by considering the rendition of the numeral (3 billion), the
referent (export value), the unit of measurement (US dollars), the relative value (decreased),
the time reference (in 2019), and the geographical location (in China) (Frittella, 2022a,
91–92). This approach allowed her to uncover issues which do not emerge when exclusively
looking at the number word, such as severe semantic errors.
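Frittella's component-based analysis lends itself to a simple annotation scheme. The sketch below (Python) encodes a numerical information unit with the components listed above and checks, by naive string matching, which of them survive in a rendition; the data structure and the matching rule are illustrative assumptions, not her published protocol, under which such judgements are made manually.

```python
from dataclasses import dataclass, asdict

@dataclass
class NumericalInformationUnit:
    numeral: str            # e.g. "3 billion"
    referent: str           # e.g. "export value"
    unit: str               # e.g. "US dollars"
    relative_value: str     # e.g. "decreased"
    time_reference: str     # e.g. "in 2019"
    location: str           # e.g. "China"

def score_rendition(unit: NumericalInformationUnit, rendition: str) -> dict:
    """Mark each component as rendered (1) or missing (0) via naive string matching."""
    rendition_lower = rendition.lower()
    return {field: int(value.lower() in rendition_lower)
            for field, value in asdict(unit).items()}

source_unit = NumericalInformationUnit("3 billion", "export value", "US dollars",
                                       "decreased", "in 2019", "China")
# Paraphrases ("fell") are missed by string matching, which is why human
# annotation remains necessary for this kind of analysis.
print(score_rendition(source_unit, "Chinese exports fell by 3 billion in 2019"))
```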
Also recognising that local accuracy gains should not come at the expense of overall
interpreting quality, Defrancq et al. (2024) reanalysed the data from a previous study
(Defrancq and Fantinuoli, 2020). Evaluating participants’ renditions for accuracy and
acceptability beyond individual items (numbers), they did not find a significant effect of ASR on overall quality; since accuracy on numbers was nevertheless higher with tool support, they concluded that ASR had, on balance, a positive effect on performance.
While the initial findings foreground the potential of CAI tools to improve interpreters’ accu-
racy in the rendition of numbers, the studies reviewed also highlight issues which deserve
further exploration. A prominent negative trend identified in several studies is a certain
overreliance on the tool for support, with interpreters’ performance dropping in case of tool
failure (Defrancq and Fantinuoli, 2020; Frittella, 2023). At the same time, Defrancq and
Fantinuoli (2020) postulate a potentially beneficial psychological effect (see also Van Cau-
wenberghe, 2020) due to the mere presence of the ASR-CAI tool, which may be perceived
as a safety net by interpreters, lowering the stress related to the interpreting task.
Despite CAI tools’ rapid evolution, many of the functions they offer still revolve around
terminology work. This concerns not only glossary preparation ahead of the interpreting
assignment but also the possibility to retrieve terminology quickly and effectively from
terminological databases while interpreting. This can be done by integrating ASR, as in the
case of numbers, but also by using the advanced search algorithm provided by some appli-
cations to facilitate manual queries.
Initial studies looked at in-process terminology support by examining participants’ man-
ual queries (Biagini, 2015; Prandi, 2015a, 2015b). Contrasting CAI tool queries with
paper glossaries, Biagini (2015) found that participants achieved greater accuracy in the
CAI condition. Broadening the scope of the analysis to the sentences including the target
terms, he also found fewer non-strategic omissions when the tool (InterpretBank) was
used. Prandi (2015a, 2015b) contrasted two groups of students with differing levels of
exposure to the tool, finding overall high levels of accuracy, particularly for those who
had practiced more often and started developing their own strategies for an effective
interaction with the tool.
While in these initial studies the experimental materials were speeches characterised by
high terminological density, without further control of variables such as the distribution
of terms in the speech or the frequency of the terms selected as stimuli, the studies con-
ducted by Prandi (2017, 2018, 2023) pursued higher experimental control allowing for
an analysis focused on the specialised terms selected as stimuli. Adopting methods from
sentence processing research (Seeber and Kerzel, 2012; see also Keating and Jegerski,
2015), she prepared three ad hoc speeches in which she controlled for the terms’ frequency,
their level of morphological complexity (unigrams, bigrams, and trigrams equally distrib-
uted in the speech), and their position in the sentence. Additionally, the target sentences
containing the specialised terms were preceded and followed by generic sentences without
[I]n spite of the promise that many of these technologies hold and the media hype
around them, very little empirical evidence exists on the effectiveness of the technolo-
gies in assisting the workflow and sub-processes of interpreting.
Therefore, it appears essential to experimentally probe some of the assumptions which ini-
tially prompted the development of CAI tools, that is, that technology support may alleviate
interpreters’ cognitive effort in rendering problem triggers such as specialised terminology
and numbers. At the same time, the cognitive load imposed by CAI may be higher due to
the additional input provided through ASR, for instance, or to the additional resources
necessary for manually querying the tool. The postulated additional cognitive load may
also arise due to the effortful allocation of attention to multiple, multimodal informa-
tion sources, as interpreters must split their attention between the auditory and the visual
input.
CAI is a complex object of study. Therefore, one concern of researchers is establish-
ing a suitable methodology for the exploration of questions linked with interpreters’
cognitive processes, in addition to the already more established research foci discussed
in the previous sections. Prandi’s (2023) PhD project aimed to establish a methodology
for studying computer-assisted simultaneous interpreting (CASI). She combined multiple
methods, adopting performance, subjective, and behavioural measures. Her goal was
to explore whether CAI tools, especially those with integrated ASR, could reduce inter-
preters’ cognitive effort while interpreting specialised terminology and help them focus
on the speaker, the primary source of information. Her findings pointed to significantly lower cognitive effort in the ASR condition and suggested that it was easier for
participants to allocate attention to the speaker when a CAI tool was used. This would
indicate an advantage provided by bespoke tools for interpreters, while non-bespoke
tools resulted in equal time spent on tool and speaker, with frequent attention switching.
The study made it possible to explore the benefits and shortcomings of using methods such as
accuracy ratings, fixation-based measures, and qualitative questionnaires. Other meas-
ures could be explored in future studies, such as disfluencies (see, for example, Gieshoff,
2021a, 2021b) or vocal correlates of arousal (e.g. Scherer, 1989). A first step in this direc-
tion was taken by Defrancq et al. (2024), who explored mean fundamental frequency (F0)
as a cognitive load indicator in interpreters for the first time. The authors’ assumption
that ASR may lead to additional load was, however, not substantiated by F0 data. This
warrants further research on other aspects of F0, such as standard deviation, peaks, and
ranges (Defrancq et al., 2024, 54).
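To make the measure concrete, mean F0 can be estimated from a recording of the interpreter's delivery with standard pitch-tracking tools. The sketch below uses the librosa library's pyin tracker purely as an illustrative assumption (Defrancq et al. do not necessarily use this toolchain) and also returns the standard deviation and range mentioned as candidates for further research.

```python
import numpy as np
import librosa

def f0_statistics(audio_path: str):
    """Estimate mean, standard deviation and range of F0 (in Hz) for a recording."""
    y, sr = librosa.load(audio_path, sr=None)
    f0, voiced_flag, voiced_prob = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
    )
    voiced_f0 = f0[~np.isnan(f0)]  # keep voiced frames only
    return {
        "mean_f0": float(np.mean(voiced_f0)),
        "sd_f0": float(np.std(voiced_f0)),
        "range_f0": float(np.max(voiced_f0) - np.min(voiced_f0)),
    }

# Usage (hypothetical file name):
# print(f0_statistics("interpreter_rendition.wav"))
```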
Chen and Kruger (2023) reported on a study on computer-assisted consecutive interpreting (CACI) involving respeaking and live
post-editing of the machine-translated transcript. CACI was compared with conventional
consecutive interpreting in an experiment involving six students working in the English–
Chinese pair who had been trained in the new mode. As this is one of the few studies openly addressing both the product and the process of CACI, it is presented here together with its findings on accuracy and fluency. To measure the impact
on the study participants’ cognitive load, the authors used the NASA Task Load Index
(NASA-TLX; Hart and Staveland, 1988). Quality was measured both in general terms
with a rubric-based rating scale by five raters and in more depth with propositional rat-
ing. For propositional rating, the speeches were divided into units and scored either 0
or 1, depending on whether the target text matched the source. Fluency was assessed by
automatically calculating unfilled pauses and manually counting filled pauses. Chen and
Kruger expected higher accuracy and lower cognitive load in the CACI condition due to
the reduced pressure on interpreters’ working memory. For cognitive load, their hypoth-
esis was confirmed, but only in the L1–L2 direction (Chinese–English), while higher
accuracy and fewer unfilled pauses were found in the CACI condition. A replication of
the study with a larger sample of 13 students (Chen and Kruger, 2024a) yielded similar
results for cognitive load, but also for fluency and target language quality, which were,
however, significantly better only in the L1–L2 direction. Overall quality was also sig-
nificantly higher for the CACI condition, although more markedly so in the L1–L2 direction. These
findings suggest that cognitive load in CACI is modulated by directionality and overall
provide further evidence in favour of CAI also for the consecutive mode.
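As an illustration of how such product measures can be operationalised, the sketch below (Python) derives a propositional accuracy score from binary unit ratings and counts unfilled pauses above a silence threshold. The 0.25-second cut-off and the data structures are assumptions for illustration, not Chen and Kruger's exact protocol.

```python
def propositional_accuracy(unit_scores: list[int]) -> float:
    """Proportion of source-text units (each scored 0 or 1) matched in the target text."""
    return sum(unit_scores) / len(unit_scores)

def unfilled_pauses(silence_durations_s: list[float], threshold_s: float = 0.25) -> int:
    """Count silent stretches longer than the threshold (0.25 s is an assumed cut-off)."""
    return sum(1 for d in silence_durations_s if d >= threshold_s)

# Example: 14 of 20 propositions rendered; 3 silences exceed the threshold.
print(propositional_accuracy([1] * 14 + [0] * 6))   # 0.7
print(unfilled_pauses([0.1, 0.3, 0.8, 0.2, 0.26]))  # 3
```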
Process research may also help better understand how interpreters interact with support-
ing tools, which can have important repercussions for training. For instance, Chen and Kru-
ger (2024b) conducted an eye tracking study to investigate how trainees allocate attention
during CACI. Participants’ dwell time, fixation durations, and saccade lengths suggested
that they focused more on listening and respeaking during phase 1. However, monitoring
the source text led to better respeaking quality. Greater reliance on the MT text was found
for phase 2, which led to better quality, but only in the L2 (English) to L1 (Chinese) direction, a finding that requires further investigation.
To sum up, while process-oriented research on CAI is still in its infancy, it has the poten-
tial to offer valuable insights into CAI, integrating product-oriented findings and deepen-
ing our understanding of technology’s impact on interpreters’ cognition. The distraction, the high effort involved in coordinating attention, and the effortful visual search reported in these studies offer additional research questions worth investigating, making process-oriented research on CAI a fruitful
field of enquiry. Additionally, studying CAI may add to our knowledge of the interpreting
process, while possibly attracting the interest of cognitive psychologists, especially as con-
cerns the processing of multimodal input.
validated. The following sections report on research about system performance and the
tools’ usability.
LATENCY
System latency refers to the delay with which the output of the ASR process is presented to
interpreters on the screen. Keeping latency low is of major importance for interpreters, as
high latency may pose an excessive strain on working memory and exacerbate issues related
to the coordination of auditory and visual-verbal input. The main assumption is that if sys-
tem latency can fit within interpreters’ ear–voice span (EVS), it may be perceived as accept-
able and allow for the integration of the tools’ suggestions in the interpretation. However,
interpreters adjust their EVS constantly, for instance, by shortening it to a minimum in the
case of question-and-answer sessions or heated debates, or for the interpretation of num-
bers, and so even very low system latency may be insufficient to cater to such specific needs.
Being aware of what can be reasonably expected of the tool is essential for interpreters to
make strategic choices in terms of supporting technologies.
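The underlying reasoning reduces to a simple timing comparison: a prompt is only usable if it appears on screen before the interpreter reaches the corresponding item in their rendition. The sketch below (Python) makes this explicit with illustrative values; actual EVS varies continuously, as noted above.

```python
def prompt_in_time(system_latency_s: float, ear_voice_span_s: float) -> bool:
    """True if the ASR/CAI prompt appears before the interpreter needs the item.

    The item is heard at t = 0, displayed at t = system_latency_s, and (on average)
    produced by the interpreter at t = ear_voice_span_s.
    """
    return system_latency_s < ear_voice_span_s

# With an average EVS of ~3 s, a 1.6 s latency is usable; with EVS shortened
# to ~1 s (e.g. for numbers in a fast question-and-answer exchange), it is not.
print(prompt_in_time(1.6, 3.0))  # True
print(prompt_in_time(1.6, 1.0))  # False
```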
Current research findings suggest that CAI tools’ latency may be sufficiently low to fit
within interpreters’ EVS. In their study on ASR support for the interpretation of numbers,
Defrancq and Fantinuoli (2020) found sufficiently low system latency, below the crucial
threshold of 2.5 to 3 sec reported in the literature. They conclude that ‘provided interpret-
ers maintain an average EVS, the number is readable in its final version before interpret-
ers reach the point at which they would deliver it’ (Defrancq and Fantinuoli, 2020, 89).
Sufficiently low latency was also found by Van Cauwenberghe (2020) when using Inter-
pretBank as support for the interpretation of specialised terminology, although very high
latencies (up to about 11 sec) were occasionally observed for the terms at the beginning
of the speech, suggesting that the CAI tool needed some time to warm up, offering better
performance downstream. Satisfactory latency was also found by Fantinuoli et al. (2022),
with an average of 1.6 sec (range: 1.1–2.3 sec). A study by Fantinuoli and Montecchio
(2023) aimed to define the maximum acceptable latency for an ASR-CAI tool. To explore
this question, they compared increasingly high latencies, from 1 to 5 sec, and analysed the
effects on interpreters’ accuracy and fluency of rendition. Their study suggested that inter-
preters may be able to cope with latencies of up to 3 sec, corroborating the findings from
previous studies. While these results are encouraging, it should be noted that they represent
the system performance under controlled laboratory conditions, and further tests in real-life
settings may reveal additional shortcomings.
PRECISION
The precision achieved by the ASR module of a third-generation CAI tool has been an important concern ever since the use of this technology for interpreter support was first hypothesised.
High precision and recall are essential for an effective integration of ASR suggestions into
the interpreters’ rendition. Precision is defined as ‘the fraction of relevant instances among
the retrieved instances’, while recall is ‘the fraction of relevant instances that have been
retrieved over the total amount of relevant instances present in the speech’ (Fantinuoli,
2017b, 30). Precision should be prioritised over recall to produce relevant results.
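The definitions cited above translate directly into the standard formulas. The following sketch (Python) computes precision, recall, and F1 for a toy set of tool suggestions against a set of relevant items; the data are invented for illustration.

```python
def precision_recall_f1(suggested: set[str], relevant: set[str]):
    """Compute precision, recall and F1 for a set of tool suggestions.

    precision = |suggested ∩ relevant| / |suggested|
    recall    = |suggested ∩ relevant| / |relevant|
    F1        = harmonic mean of precision and recall
    """
    true_positives = len(suggested & relevant)
    precision = true_positives / len(suggested) if suggested else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

# Toy example: the tool displays 4 items, 3 of which are among the 5 relevant ones.
suggested = {"3.2 %", "grid parity", "feed-in tariff", "solar"}
relevant = {"3.2 %", "grid parity", "feed-in tariff", "inverter", "net metering"}
print(precision_recall_f1(suggested, relevant))  # (0.75, 0.6, ~0.667)
```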
A series of benchmark tests on CAI tools with integrated ASR reveals satisfactory results.
For instance, testing an ASR prototype, Fantinuoli (2017b) found that the word error rate (WER) dropped from 10.92% to 5.04% after the system was trained on a specialised glossary. The system reached an F1 value, which considers both precision and recall, of 0.97 for
terms and of 1 for numbers. Encouraging results were also obtained for SmarTerp, both for
ASR and for NER, especially after an adaptation stage. Performance was found to be mod-
ulated by language (Rodríguez et al., 2021). The live support function of KUDO Interpreter
Assist was also recently tested (Fantinuoli et al., 2022), with encouraging results, such as an
F1 score of around 98% and good performance both with and without fine-tuning. NER
reached peaks of 100% for precision, recall, and F1 score, especially with fine-tuning.
The results reported in these studies suggest that ASR is mature enough to help interpret-
ers, although systems do not perform equally well on all problem triggers: While perfor-
mances are very good for numbers, NER is still a challenging task for machines (see, for
example, Gaido et al., 2021). Further testing is needed, including under real-life conditions and on a wider variety of languages, especially low-resource ones. Failing to account for these
modulating factors may yield a non-representative picture of the current level of devel-
opment of ASR technology and perpetuate disparities between interpreters working with
major languages and those providing services for minority languages. Furthermore, evaluations of system accuracy should not be limited to in-process support but should also cover the performance of
systems automatically generating glossaries to help interpreters in the preparation phase.
While results in terms of the quality and relevance of the glossary items retrieved and
the quality of MT are satisfactory (e.g. Fantinuoli et al., 2022), further investigations are
needed on this topic.
8.3.2.2 Usability
Research on the usability of CAI tools is arguably at the intersection of interpreter-centric
and system-centric research, as it concerns both human factors and system-related issues.
Usability studies foreground the non-negligible role of design in the complex interpreter–tool equation. Far from being a marginal concern, design represents a defining factor, because ‘details matter in the
design of any user interface, where seemingly small features can significantly impact users’
performance’ (Frittella, 2023, 151).
Currently, usability testing on stand-alone CAI tools is missing, with design features
being discussed only marginally in empirical research on CAI. Only two research projects
have adopted a usability perspective so far, exploring which design features may facilitate
or hinder interpreters’ interaction and satisfaction with the tools: Saeed et al. (2022), who
studied how the integration of ASR into RSI platforms may support source text comprehen-
sion, and Frittella (2023), who addressed the impact on delivery accuracy.
Research on the usability of CAI tools adopts methods from user experience research
and human–computer interaction, representing a much-needed novelty in research on inter-
preters’ supporting tools. CAI tools are usually considered intuitive and user-friendly, but
this assumption has not yet been fully explored empirically. Saeed et al. (2022) found that
reduced visual information in RSI interfaces promotes a state of flow, facilitating the inter-
preting task. The mocked-up ASR feature presented interpreters with the live-generated full
transcript of the source speech. Using a convergent mixed-method design gathering per-
formance and subjective (perception) data through SI tests, post-task questionnaires, and
semi-structured interviews, Frittella’s (2023) study probed the soundness of the SmarTerp
RSI-CAI tool design. Her findings revealed the importance of customisation to tailor the
tool’s appearance to the user’s needs, as excessive or unnecessary input can be disruptive.
Usability studies are useful not only for identifying general principles for effective tool design but also for providing tangible suggestions. For instance, they reveal that interpreters prefer see-
ing the ASR suggestions at the bottom of the screen, where they are used to reading subtitles
(Saeed et al., 2022). They also bring up a number of open questions, for instance, about the
most effective way of suggesting problem triggers to interpreters, or about the language in
which terms and acronyms should be presented (Frittella, 2023).
elements of technological competence (Wang and Li, 2022), should be derived from empiri-
cal research (Rodriguez et al., 2022). Providing trainees with transferable knowledge on
the technologies seems essential in the face of rapid technological advances (Fantinuoli and
Prandi, 2018; Defrancq, 2023). Specifically, trainees should be aware of how technolo-
gies work and develop critical judgment to be able to assess the tools’ performance, the
impact of CAI tools on their own performance, and the implications of technology use,
for instance, when ASR is involved. As Defrancq (2023, 305) observes, training on ethics
becomes even more important when technology is involved.
Empirical process-oriented research can provide support in developing students’ critical
thinking but also help scholars and trainers identify the skills needed for an effective and
efficient use of CAI tools. This would include not only mere operational skills, as most CAI tools have intuitive interfaces, but also, and especially, strategic knowledge of how to integrate the interaction with CAI tools into the already-complex interpreting process. This goes beyond individual, in-process cognitive processes and extends to all phases of interpreting, and to
the constellation of interpreter teams and supporting applications. At the time of writing, only one empirical study had been conducted with a focus on the inclusion of CAI tools in interpreting curricula (Prandi, 2015a, 2015b). Despite its exploratory nature, it revealed the
multifaceted nature of the interaction with CAI tools: Issues such as overreliance, distribution
of attention, the practical arrangement of the tool and other support materials in the booth,
and in-team coordination were also identified in subsequent empirical research, not only
on students, but also on practising interpreters. As described in Rodriguez et al. (2022, 85),
which references a PhD study by Frittella (2024), educational research methods may contrib-
ute to helping define how, what, and when to teach, as this approach involves research-based
interventions consisting of ‘(1) the identification of an educational need based on research, (2)
the design of the intervention, (3) its development and (4) evaluation to improve the solution,
on the one hand, and contribute to theoretical understanding, on the other’.
Not only are we still far from defining what the content of CAI tool training should be,
but teaching methods are also still under discussion. As mentioned earlier, experience-based
training is rarely possible in this area of interpreting technologies. Fantinuoli and Prandi
(2018) suggest that training on interpreting technologies should follow the constructivist
approach proposed by Kiraly (2000). While trainees should acquire theoretical knowledge
on the technologies, practical exposure to an increasingly complex use of CAI tools should
promote the development of relevant skills. The authors offer a series of practical sugges-
tions to help guide students in acquiring said competences. Defrancq (2023) argues that
training on technologies, and therefore on CAI tools, should not be confined to stand-alone
modules, as seems to be the case for some training institutions (see Prandi, 2020), but
should rather be routinely integrated horizontally into interpreter training. This is no
small feat, and it is possible that both dedicated modules aimed at providing the necessary
knowledge and the integration of technologies into regular interpreting classes may prove
beneficial in preparing the future generation of interpreters for the evolving interpreting mar-
kets. Further educational research on CAI tools may help universities substantiate their
training.
8.5 Conclusion
Despite current shortcomings and our still limited understanding of its implications,
CAI holds exciting prospects for improving interpreters’ work. Thanks to the staggering
progress made by AI, it is safe to assume that CAI tools are poised for significant advance-
ments in the coming years. Broadening the scope of research, improving tool design, and
devising impactful training approaches are goals worth pursuing, as is a reflection on a
potential broader application of AI in the area of supporting technologies for interpret-
ers. For instance, research on CAI may explore the possibility for interpreters to leverage
AI beyond traditional applications, to generate speeches for training, or to automatically
evaluate performances (Ünlü, 2023b), and investigate what this might mean for training.
The considerable investments in machine interpreting propelled by RSI platform provid-
ers may bring about benefits for interpreters, as the underlying technologies, such as ASR,
are also key components of next-generation CAI tools. This represents a shift compara-
ble to the industry’s investment in CAT tools but also brings about similar risks, such as
the exclusion of interpreters from the development process. Research on CAI tools should
therefore also elucidate such risks, expanding its scope to the broad implications of the
‘technological turn’ (Fantinuoli, 2019) for the profession.
In the future, the current shortcomings of CAI tools may be addressed by further testing
predictive approaches (see, for example, Vogler et al., 2019) for the automatic identification
of elements likely to be left untranslated by interpreters, thus providing more targeted sup-
port and alleviating the additional cognitive load imposed on interpreters by the tool’s
presence. Working towards interpreter augmentation (Fantinuoli and Dastyar, 2022), one
avenue currently being explored by researchers is the use of augmented reality (AR) to
alleviate negative split attention effects due to the interpreter having to attend to multiple
sources of information for support (Gieshoff et al., 2024). However, AR is not automati-
cally synonymous with augmentation. The very concept of augmented interpretation is still ill-defined, and what exactly counts as augmentation remains fluid.
In interpreting, augmentation could take various forms. One promising approach is offered
by research conducted on augmented cognition systems, as recently discussed for translation by O’Brien (2023). Her reflections on the implications of cognitive augmentation for
translators may be extrapolated to interpreting, urging future research to scrutinise how far
interpreter support may be pushed, whether the move towards augmented interpretation may
result in more interpreter-centric applications, and what this might entail for our conceptuali-
sation of cognition in interpreting and for the future of the profession. Being able to ask bold
questions will be essential to help the field navigate this ever-evolving landscape.
Notes
1 http://fourwillows.com/interplex.html (accessed 7.11.2024).
2 https://www.flashterm.eu/index.html (accessed 17.2.2025).
3 www.glossarmanager.de/ (accessed 7.11.2024).
4 https://interpretershelp.com/ (accessed 7.11.2024).
5 https://interpretbank.com/site/ (accessed 7.11.2024).
6 www.sightterp.net/ (accessed 10.7.2024).
References
Berber-Irabien, D.-C., 2010. Information and Communication Technologies in Conference Interpret-
ing (PhD thesis). Universitat Rovira i Virgili. URL http://hdl.handle.net/10803/8775 (accessed
9.10.2024).
Biagini, G., 2015. Glossario cartaceo e glossario elettronico durante l’interpretazione (MA thesis).
Università di Trieste.
Bowker, L., 2022. Computer-Assisted Translation and Interpreting Tools. In Zanettin, F., Run-
dle, C., eds. The Routledge Handbook of Translation and Methodology. Routledge, Oxon and
New York, 392–409. URL https://www.taylorfrancis.com/chapters/edit/10.4324/9781315158945-28/
computer-assisted-translation-interpreting-tools-lynne-bowker
Braun, S., 2019. Technology and Interpreting. In O’Hagan, M., ed. The Routledge Handbook of Transla-
tion and Technology. Routledge, London, 271–288. URL https://www.taylorfrancis.com/chapters/
edit/10.4324/9781315311258-19/technology-interpreting-sabine-braun?context=ubx.
Chen, S., Kruger, J.-L., 2023. The Effectiveness of Computer-Assisted Interpreting: A Preliminary
Study Based on English-Chinese Consecutive Interpreting. Translation and Interpreting Studies
18(3), 399–420. URL https://doi.org/10.1075/tis.21036.che
Chen, S., Kruger, J.-L., 2024a. A Computer-Assisted Consecutive Interpreting Workflow: Training
and Evaluation. The Interpreter and Translator Trainer 18(3), 380–399. URL https://doi.org/10/
gt3736
Chen, S., Kruger, J.-L., 2024b. Visual Processing During Computer-Assisted Consecutive Interpret-
ing: Evidence from Eye Movements. Interpreting 26(2), 231–252. URL https://doi.org/10/gt3738
Corpas Pastor, G., Fern, L.M., 2016. A Survey of Interpreters’ Needs and Practices Related to Lan-
guage Technology. Universidad de Malaga.
Costa, H., Corpas Pastor, G., Durán Muñoz, I., 2014. A Comparative User Evaluation of Terminol-
ogy Management Tools for Interpreters. In Drouin, P., Grabar, N., Hamon, T., Kageura, K., eds.
Proceedings of the 4th International Workshop on Computational Terminology (Computerm).
Association for Computational Linguistics and Dublin City University, Dublin, Ireland, 68–76.
URL https://doi.org/10.3115/v1/W14-4809
Defrancq, B., 2023. Technology in Interpreter Education and Training: A Structured Set of Proposals.
In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Current and Future Trends.
John Benjamins Publishing Company, Amsterdam (IVITRA Research in Linguistics and Litera-
ture, 37), 302–319. URL https://doi.org/10.1075/ivitra.37.12def
Defrancq, B., Fantinuoli, C., 2020. Automatic Speech Recognition in the Booth: Assessment of Sys-
tem Performance, Interpreters’ Performances and Interactions in the Context of Numbers. Target
33(1), 73–102. URL https://doi.org/10.1075/target.19166.def
Defrancq, B., Snoeck, H., Fantinuoli, C., 2024. Interpreters’ Performances and Cognitive Load in
the Context of a CAI Tool. In Deane-Cox, S., Böser, U., Winters, M., eds. Translation, Interpret-
ing and Technological Change: Innovations in Research, Practice and Training. Bloomsbury Aca-
demic, London (Bloomsbury Advances in Translation), 38–58.
Desmet, B., Vandierendonck, M., Defrancq, B., 2018. Simultaneous Interpretation of Numbers and
the Impact of Technological Support. In Fantinuoli, C., ed. Interpreting and Technology. Language
Science Press, Berlin (Translation and Multilingual Natural Language Processing, 11), 13–27. URL
https://doi.org/10.5281/zenodo.1493291
Deysel, E., 2023. Investigating the Use of Technology in the Interpreting Profession: A Comparison
of the Global South and Global North. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Tech-
nologies – Current and Future Trends. John Benjamins Publishing Company, Amsterdam (IVITRA
Research in Linguistics and Literature, 37), 142–168. URL https://doi.org/10.1075/ivitra.37.06dey
Fan, D.C., 2024. Conference Interpreters’ Technology Readiness and Perception of Digital Technologies. Interpreting 26(2), 178–200. URL https://doi.org/10.1075/intp.00110.fan
Fantinuoli, C., 2017a. Computer-Assisted Preparation in Conference Interpreting. Translation and
Interpreting 9(2), 24–37. URL https://doi.org/10.12807/ti.109202.2017.a02
Fantinuoli, C., 2017b. Speech Recognition in the Interpreter Workstation. In Esteves-Ferreira, J.,
Macan, J., Mitkov, R., Stefanov, O.-M., eds. Proceedings of the 39th Conference Translating and
the Computer. Editions Tradulex, London, 25–34. URL https://www.asling.org/tc39/wp-content/
uploads/TC39-proceedings-final-1Nov-4.20pm.pdf
Fantinuoli, C., 2019. The Technological Turn in Interpreting: The Challenges That Lie Ahead. In
Baur, W., Mayer, F., eds. Proceedings of the Conference Übersetzen und Dolmetschen 4.0. – Neue
Wege im digitalen Zeitalter. BDÜ Fachverlag, Bonn, 334–354.
Fantinuoli, C., 2021. Conference Interpreting and New Technologies. In Albl-Mikasa, M., Tiselius,
E., eds. The Routledge Handbook of Conference Interpreting. Routledge, London, 508–522. URL
https://doi.org/10.4324/9780429297878-44
Fantinuoli, C., Dastyar, V., 2022. Interpreting and the Emerging Augmented Paradigm. Interpreting
and Society 2(2), 185–194. URL https://doi.org/10.1177/27523810221111631
Fantinuoli, C., Montecchio, M., 2023. Defining Maximum Acceptable Latency of AI-Enhanced CAI
Tools. In Ferreiro Vázquez, Ó., Correia, A., Araújo, S., eds. Technological Innovation Put to the
Service of Language Learning, Translation and Interpreting: Insights from Academic and Profes-
sional Contexts. Peter Lang, Berlin (Lengua, Literatura, Traducción, 2), 213–225.
Fantinuoli, C., Prandi, B., 2018. Teaching Information and Communication Technologies: A Pro-
posal for the Interpreting Classroom. Trans-kom: Journal of Translation and Technical Commu-
nication Research 11(2), 162–182. URL https://www.trans-kom.eu/bd11nr02/trans-kom_11_02_
02_Fantinouli_Prandi_Teaching.20181220.pdf
Fantinuoli, C., Marchesini, G., Landan, D., Horak, L., 2022. KUDO Interpreter Assist: Automated
Real-Time Support for Remote Interpretation. In Esteves-Ferreira, J., Mitkov, R., Recort Ruiz, M.,
Stefanov, O.-M., Chambers, D., Macan, J., Sosoni, V., eds. Proceedings of the 43rd Conference
Translating and the Computer. Editions Tradulex, Geneva, 68–77. URL https://www.tradulex.
com/varia/TC43-OnTheWeb2021.pdf.
Frittella, F.M., 2019. “70.6 Billion World Citizens”: Investigating the Difficulty of Interpreting Num-
bers. The International Journal of Translation and Interpreting Research 11(1), 79–99. URL
https://doi.org/10.12807/ti.111201.2019.a05
Frittella, F.M., 2022a. ASR-CAI Tool-Supported SI of Numbers: Sit Back, Relax and Enjoy Interpret-
ing? In Esteves-Ferreira, J., Mitkov, R., Recort Ruiz, M., Stefanov, O.-M., Chambers, D., Macan,
J., Sosoni, V., eds. Proceedings of the 43rd Conference Translating and the Computer. Editions
Tradulex, Geneva, 88–102. URL https://www.tradulex.com/varia/TC43-OnTheWeb2021.pdf
Frittella, F.M., 2022b. CAI Tool-Supported SI of Numbers: A Theoretical and Methodological
Contribution. International Journal of Interpreter Education 14(1), 32–56. URL https://doi.
org/10.34068/ijie.14.01.05
Frittella, F.M., 2023. Usability Research for Interpreter-Centred Technology: The Case Study of
SmarTerp. Language Science Press, Berlin (Translation and Multilingual Natural Language Pro-
cessing, 21). URL https://doi.org/10.5281/zenodo.7376351
Frittella, F.M., 2024. Computer-Assisted Interpreting: Cognitive Task Analysis and Evidence-Informed
Instructional Design Recommendations (PhD thesis). University of Surrey. URL https://doi.org/
10.15126/thesis.901410
Gaber, M., Corpas Pastor, G., Omer, A., 2020. Speech-to-Text Technology as a Documentation Tool
for Interpreters: A New Approach to Compiling an Ad Hoc Corpus and Extracting Terminology
from Video-Recorded Speeches. TRANS. Revista de Traductología 24, 263–281. URL https://doi.
org/10.24310/TRANS.2020.v0i24.7876
Gaido, M., Rodríguez, S., Negri, M., Bentivogli, L., Turchi, M., 2021. Is “Moby Dick” a Whale or a
Bird? Named Entities and Terminology in Speech Translation. In Moens, M.-F., Huang, X., Specia,
L., Yih, S. W.-T., eds. Proceedings of the 2021 Conference on Empirical Methods in Natural Lan-
guage Processing, 1707–1716. URL https://doi.org/10.18653/v1/2021.emnlp-main.128
Ghent University, 2021. Ergonomics for the Artificial Booth Mate (EABM). URL www.eabm.ugent.
be/survey/ (accessed 7.10.2024).
Gieshoff, A.C., 2021a. Does It Help to See the Speaker’s Lip Movements? An Investigation of Cog-
nitive Load and Mental Effort in Simultaneous Interpreting. Translation, Cognition & Behavior
4(1), 1–25. URL https://doi.org/10.1075/tcb.00049.gie
Gieshoff, A.C., 2021b. The Impact of Visible Lip Movements on Silent Pauses in Simultaneous Inter-
preting. Interpreting 23(2), 168–191. URL https://doi.org/10.1075/intp.00061.gie
Gieshoff, A.C., Schuler, M., Jahany, Z., 2024. The Augmented Interpreter: An Exploratory Study of
the Usability of Augmented Reality Technology in Interpreting. Interpreting 26(2), 282–315. URL
https://doi.org/10.1075/intp.00108.gie
Gile, D., 2009. Basic Concepts and Models for Interpreter and Translator Training. John Benja-
mins Publishing Company, Amsterdam (Benjamins Translation Library, 8). URL https://doi.
org/10.1075/btl.8
Goldsmith, J., 2018. Tablet Interpreting: Consecutive Interpreting 2.0. Translation and Interpreting
Studies 13(3), 342–365. URL https://doi.org/10.1075/tis.00020.gol
Guo, M., Han, L., Anacleto, M.T., 2023. Computer-Assisted Interpreting Tools: Status Quo and Future Trends.
Theory and Practice in Language Studies 13(1), 89–99. URL https://doi.org/10.17507/tpls.1301.11
Hansen-Schirra, S., 2012. Nutzbarkeit von Sprachtechnologien für die Translation. Trans-kom: Journal
of Translation and Technical Communication Research 5(2), 211–226. URL https://www.trans-kom.
eu/bd05nr02/trans-kom_05_02_02_Hansen-Schirra_Sprachtechnologien.20121219.pdf
Hart, S.G., Staveland, L.E., 1988. Development of NASA-TLX (Task Load Index): Results of Empir-
ical and Theoretical Research. In Hancock, P.A., Meshkati, N., eds. Advances in Psychology.
North-Holland, Amsterdam, 139–183. URL https://doi.org/10.1016/S0166-4115(08)62386-9
Jiang, H., 2013. The Interpreter’s Glossary in Simultaneous Interpreting: A Survey. Interpreting 15(1),
74–93. URL https://doi.org/10.1075/intp.15.1.04jia
Kalina, S., 2005. Quality Assurance for Interpreting Processes. Meta: Translators’ Journal 50(2),
768–784. URL https://doi.org/10.7202/011017ar
Keating, G.D., Jegerski, J., 2015. Experimental Designs in Sentence Processing Research: A Meth-
odological Review and User’s Guide. Studies in Second Language Acquisition 37(1), 1–32. URL
https://doi.org/10.1017/S0272263114000187
Kiraly, D., 2000. A Social Constructivist Approach to Translator Education: Empowerment from
Theory to Practice. St. Jerome Publishing, Manchester/Northampton.
Mellinger, C.D., 2019. Computer-Assisted Interpreting Technologies and Interpreter Cognition:
A Product and Process-Oriented Perspective. Revista Tradumàtica. Tecnologies de la Traducció
17, 33–44. URL https://doi.org/10.5565/rev/tradumatica.228
Mellinger, C.D., Hanson, T.A., 2018. Interpreter Traits and the Relationship with Technology and Visibil-
ity. Translation and Interpreting Studies 13(3), 366–392. URL https://doi.org/10.1075/tis.00021.mel
O’Brien, S., 2023. Human-Centered Augmented Translation: Against Antagonistic Dualisms. Per-
spectives 32(3), 391–406. URL https://doi.org/10.1080/0907676X.2023.2247423
Pisani, E., Fantinuoli, C., 2021. Measuring the Impact of Automatic Speech Recognition on Number Rendition
in Simultaneous Interpreting. In Wang, C., Zheng, B., eds. Empirical Studies of Translation and Interpreting:
The Post-Structuralist Approach. Routledge, New York, 181–197. URL https://www.taylorfrancis.com/
chapters/edit/10.4324/9781003017400-14/measuring-impact-automatic-speech-recognition-number-
rendition-simultaneous-interpreting-elisabetta-pisani-claudio-fantinuoli?context=ubx
Prandi, B., 2015a. L’uso di InterpretBank nella didattica dell’interpretazione: Uno studio esplorativo (MA
thesis). Università di Bologna. URL https://amslaurea.unibo.it/id/eprint/8206 (accessed 7.10.2024).
Prandi, B., 2015b. The Use of CAI Tools in Interpreters’ Training: A Pilot Study. In Esteves-Ferreira,
J., Macan, J., Mitkov, R., Stefanov, O.-M., eds. Proceedings of the 37th Conference Translating
and the Computer. Translating and the Computer. AsLing, London, 48–57. URL https://aclanthol-
ogy.org/2015.tc-1.8
Prandi, B., 2017. Designing a Multimethod Study on the Use of CAI Tools During Simultaneous Inter-
preting. In Esteves-Ferreira, J., Macan, J., Mitkov, R., Stefanov, O.-M., eds. Proceedings of the 39th
Conference Translating and the Computer: Translating and the Computer. AsLing, Geneva, 76–88.
URL www.asling.org/tc39/wp-content/uploads/TC39-proceedings-final-1Nov-4.20pm.pdf
Prandi, B., 2018. An Exploratory Study on CAI Tools in Simultaneous Interpreting: Theoretical
Framework and Stimulus Validation. In Fantinuoli, C., ed. Interpreting and Technology. Language
Science Press, Berlin (Translation and Multilingual Natural Language Processing, 11), 29–59. URL
https://doi.org/10.5281/zenodo.1493293
Prandi, B., 2020. The Use of CAI Tools in Interpreter Training: Where Are We Now and Where Do
We Go from Here? inTRAlinea, 1–10. URL https://www.intralinea.org/specials/article/2512
Prandi, B., 2023. Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on
Terminology. Language Science Press, Berlin (Translation and Multilingual Natural Language Pro-
cessing, 22). URL https://doi.org/10.5281/zenodo.7143056
Rodríguez, N., Schnell, B., 2009. A Look at Terminology Adapted to the Requirements of Inter-
pretation. Language Update 6(1), 21–25. URL https://www.noslangues-ourlanguages.gc.ca/fr/
favourite-articles/terminology-adapted-requirements-interpretation
Rodríguez, S., Frittella, F.M., Okoniewska, A.M., 2022. A Paper on the Conference Panel “In-Booth
CAI Tool Support in Conference Interpreter Training and Education”. In Esteves-Ferreira, J., Mit-
kov, R., Recort Ruiz, M., Stefanov, O.-M., Chambers, D., Macan, J., Sosoni, V., eds. Proceedings
of the 43rd Conference Translating and the Computer. Editions Tradulex, Geneva, 78–87. URL
www.tradulex.com/varia/TC43-OnTheWeb2021.pdf (accessed 7.10.2024).
Rodríguez, S., Gretter, R., Matassoni, M., Falavigna, D., Alonso, Á., Corcho, O., Rico, M., 2021.
SmarTerp: A CAI System to Support Simultaneous Interpreters in Real-Time. In Mitkov, R.,
Sosoni, V., Giguère, J. C., Murgolo, E., Deysel, E., eds. Proceedings of the Translation and Inter-
preting Technology Online Conference. Online: INCOMA Ltd., 102–109. URL https://doi.org/10
.26615/978-954-452-071-7_012
Rodríguez Melchor, M.D., Horváth, I., Ferguson, K., eds., 2020. The Role of Technology in Confer-
ence Interpreter Training. Peter Lang, New York.
Rütten, A., 2004. Why and in What Sense Do Conference Interpreters Need Special Software? Lin-
guistica Antverpiensia, New Series – Themes in Translation Studies 3, 167–178. URL https://doi.
org/10.52034/lanstts.v3i.110
Rütten, A., 2007. Informations- und Wissensmanagement im Konferenzdolmetschen. Peter Lang,
Berlin (Sabest. Saarbrücker Beiträge zur Sprach- und Translationswissenschaft, 15).
Saeed, M. A., Rodríguez González, E., Korybski, T., Davitti, E., Braun, S., 2022. Connected Yet
Distant: An Experimental Study into the Visual Needs of the Interpreter in Remote Simultaneous
Interpreting. In Kurosu, M., ed. Human-Computer Interaction: User Experience and Behavior.
Springer International Publishing, Cham (Lecture Notes in Computer Science, 13304), 214–232.
URL https://doi.org/10.1007/978-3-031-05412-9_16
Scherer, K.R., 1989. Vocal Correlates of Emotional Arousal and Affective Disturbance. In Wagner, H.,
Manstead, A., eds. Handbook of Social Psychophysiology. John Wiley & Sons, Chichester, 165–197.
Seeber, K.G., Kerzel, D., 2012. Cognitive Load in Simultaneous Interpreting: Model Meets Data. Inter-
national Journal of Bilingualism 16(2), 228–242. URL https://doi.org/10.1177/1367006911402982
Smith, P.L., Little, D.R., 2018. Small Is Beautiful: In Defense of the Small-N Design. Psychonomic
Bulletin & Review 25(6), 2083–2101. URL https://doi.org/10.3758/s13423-018-1451-8
Stoll, C., 2009. Jenseits simultanfähiger Terminologiesysteme: Methoden der Vorverlagerung und
Fixierung von Kognition im Arbeitsablauf professioneller Konferenzdolmetscher. WVT, Wissen-
schaftlicher Verlag Trier, Trier (Heidelberger Studien zur Übersetzungswissenschaft, 13).
Tripepi Winteringham, S., 2010. The Usefulness of ICTs in Interpreting Practice. The Interpreters’
Newsletter 15, 87–99. URL http://hdl.handle.net/10077/4751
Ünlü, C., 2023a. Automatic Speech Recognition in Consecutive Interpreter Workstation: Computer-
Aided Interpreting Tool ‘Sight-Terp’/Otomatik konuşma tanıma sistemlerinin ardıl çeviride
kullanılması: Sight-Terp (MA thesis). Hacettepe Üniversitesi.
Ünlü, C., 2023b. InterpreTutor: Using Large Language Models for Interpreter Assessment. In Orăsan,
C., Mitkov, R., Corpas Pastor, G., Monti, J., eds. International Conference on Human-Informed
Translation and Interpreting Technology (HiT-IT 2023). INCOMA Ltd., Naples, 78–96. URL
https://doi.org/10.26615/issn.2683-0078.2023_007
Van Cauwenberghe, G., 2020. Étude expérimentale de l’impact d’un soutien visuel automatisé sur la
restitution de terminologie spécialisée (MA thesis). Universiteit Ghent. https://lib.ugent.be/catalog/
rug01:002862551
Vogler, N., Stewart, C., Neubig, G., 2019. Lost in Interpretation: Predicting Untranslated Termi-
nology in Simultaneous Interpretation. In Burstein, J., Doran, C., Solorio, T., eds. Proceedings
of the 2019 Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for
Computational Linguistics, Minneapolis, 109–118. URL https://doi.org/10.18653/v1/N19-1010
Wan, H., Yuan, X., 2022. Perceptions of Computer-assisted Interpreting Tools in Interpreter Educa-
tion in Chinese Mainland: Preliminary Findings of a Survey. International Journal of Chinese and
English Translation & Interpreting 1, 1–28. URL https://doi.org/10.56395/ijceti.v1i1.8
Wang, H., Li, Z., 2022. Constructing a Competence Framework for Interpreting Technologies, and
Related Educational Insights: An Empirical Study. The Interpreter and Translator Trainer 16(3),
367–390. URL https://doi.org/10.1080/1750399X.2022.2101850
Will, M., 2020. Computer Aided Interpreting (CAI) for Conference Interpreters. Concepts, Con-
tent and Prospects. ESSACHESS-Journal for Communication Studies 13(25), 37–71. URL https://
www.essachess.com/index.php/jcs/article/view/480
Xu, R., 2018. Corpus-Based Terminological Preparation for Simultaneous Interpreting. Interpreting
20(1), 29–58. URL https://doi.org/10.1075/intp.00002.xu
Xu, R., Sharoff, S., 2014. Evaluating Term Extraction Methods for Interpreters. In Drouin, P., Grabar,
N., Hamon, T., Kageura, K., eds. Proceedings of the 4th International Workshop on Computa-
tional Terminology (Computerm). Association for Computational Linguistics and Dublin City
University, Dublin, 86–93. URL https://doi.org/10.3115/v1/w14-4811
9
DIGITAL PENS FOR
INTERPRETER TRAINING
Marc Orlando
9.1 Introduction
The term ‘digital pen technology’ was first used in the context of interpreter training in
2010 by the present author and solely referred to a smartpen used with a notepad (also
called pen-and-paper technology). Over the last decade, ‘digital pen’ has also been used
to refer to a stylus used with a tablet computer or a touchscreen device. The information
presented and discussed in this chapter will deal exclusively with digital pens/smartpens as
mobile devices combined with a notepad. It is worth noting, however, that several studies
dedicated to the use of styluses with tablet computers or touchscreen devices have been
carried out to establish their relevance to note-taking for consecutive interpreting (Altieri,
2020; Arumí and Sánchez-Gijón, 2019; Drechsel and Goldsmith, 2016; Goldsmith, 2023).
The chapter will also focus only on the use of such technology in interpreter training (for
detailed information on the use of digital pen technology – including tablet computers – in
interpreting training and research, see Orlando, 2023).
Smartpens belong to the category of mobile computing platforms and are input devices
which capture the handwriting of a user and convert analogue information into digi-
tal data. Depending on the model, they can have additional features, like an integrated
camera/text scanner and/or a microphone/audio recorder. They offer advanced process-
ing power, memory for handwriting capture and for audio or visual feedback, as well as
additional applications. Smartpens have been investigated, trialled, and recommended for
use in various fields of education since the early 2000s, such as in design education, educa-
tion, engineering, health, and allied health (Boyle, 2012; Dawson et al., 2010; Grieve and
McGee-Lennon, 2010; Maldonado et al., 2006). It is only from 2010 that they appeared in
interpreter training and, in particular, in the area of note-taking for consecutive interpret-
ing. The technology has been used either to assess trainees’ interpreting performances and
provide interactive and dynamic feedback through peer or self-assessments, or to develop
process-oriented pedagogical activities that underpin the acquisition of metacognitive skills
and competence. Their innovative features have been praised by interpreting students and
instructors as they offer new opportunities to delve into the intricacies of note-taking for
consecutive interpreting. This chapter aims at reviewing and presenting training initiatives
Livescribe released its first smartpen (Pulse) in 2008, a more advanced model than any
of its predecessors as it combined an infrared digital camera, augmented paper, and audio
recording capability. Originally conceived to assist students or secretaries in retrieving notes taken during lectures or meetings, it was subsequently used for research activi-
ties in various fields, such as education, engineering, health and allied health, or science
(Orlando, 2016). With advanced processing power, audio and visual feedback, memory for
handwriting capture, audio recording, and a few other additional applications, the Live-
scribe smartpen soon appeared as the ideal digital tool to explore the process of note-taking
in consecutive interpreting in an innovative manner.
As the audio and video data captured by the pen are synchronised and can be uploaded to
any computer or played back instantly on digital devices, such as tablets or smartphones,
instructors and trainees can view the ‘live’ notes taken by an interpreter and pinpoint their
qualities or defects in direct relation to the source speech and the interpretation (for a com-
prehensive technical overview, see Orlando, 2023).
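To make this synchronisation concrete, the sketch below shows one way such pencast data could be modelled. It is an illustrative TypeScript example only: the names, types, and helper function are assumptions made for this chapter and do not reflect Livescribe's actual file format or API.

```typescript
// Illustrative data model (not Livescribe's actual format): each stroke is
// time-stamped against the audio recording, so notes can be replayed in sync
// with the source speech and queried by time window.

interface Point {
  x: number;          // position on the page (arbitrary units)
  y: number;
  tMs: number;        // time offset from the start of the audio recording
}

interface Stroke {
  points: Point[];
}

interface Pencast {
  audioUrl: string;   // recorded source speech (and/or rendition)
  strokes: Stroke[];
}

/** Return the strokes written between two moments of the source speech,
 *  e.g. to see what was noted while a specific figure was being uttered. */
function strokesBetween(pencast: Pencast, fromMs: number, toMs: number): Stroke[] {
  return pencast.strokes.filter((s) =>
    s.points.some((p) => p.tMs >= fromMs && p.tMs <= toMs)
  );
}

// Example: which notes were taken during the 10th–15th second of the speech?
// const lateNotes = strokesBetween(myPencast, 10_000, 15_000);
```

Because every stroke carries a time offset into the audio, a trainer could, for instance, retrieve exactly the notes written while a problematic number or name was being uttered.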
These features have appealed to researchers interested in collecting process-oriented data
for research purposes (Mellinger, 2022) with higher ecological validity and have led to
research projects focusing primarily on cognitive processes at play in note-taking (Chen,
2017, 2020; Kellett Bidoli and Vardè, 2016) or on hybrid modes of interpreting (Hiebl,
2011; Mielcarek, 2017; Orlando, 2014; Özkan, 2020; Svoboda, 2020). As proposed by
Mellinger (2022), much more could be investigated, since digital pen technology allows the collection of data that can be analysed in line with cognitive research on interpreting. Researchers
with an interest in enhancing interpreter training in note-taking have also recommended the
use of digital pen technology (Chen, 2017; Kellett Bidoli and Vardè, 2016; Orlando, 2010,
2015b). As experiments and trials of the tool carried out so far have shown, the system can
‘certainly be a very valuable aid to training in consecutive with notes’ (Setton and Dawrant, 2016, 198) and ‘an invaluable tool for teachers and students’ (Gillies, 2019, 225).
information and of issues with notes layout, the ability to track the sequencing of notes
throughout a speech, the possibility to view and discuss playbacks collectively and establish
cross-fertilisation processes in peer assessment and group work, or the improvements of
note-taking personal conventions over time (Orlando, 2015b, 184–192).
Kellett Bidoli (University of Trieste) also reported using Livescribe smartpens when teaching consecutive interpreting and noted how the technology has opened new horizons by providing
trainers with ‘new innovative tools to teach with and evaluate consecutive in class’ (Kellett Bidoli, 2016, 116). She praised the fact that the synchronised audio and filmed notes can
be uploaded and shared with the group immediately after performing, allowing trainees
to follow her observations and comments on a single student’s notes and add their own
suggestions. The exercise makes it ‘possible to observe, trace and count features of the
interpreted text which, during normal oral critique sessions in class, are inaccessible to
such a high degree of accuracy’. Observations focus on a student’s notes in relation to the
correct use of terminology in the interpretation, as well as ‘other features like ear-voice
span, additions, corrections, hesitations, false starts, repetitions, figures of speech, names,
facts and figures’ (Kellett Bidoli, 2016, 118). Such cross-fertilisation of ideas allows students to ‘quickly pick up new solutions and dispel any doubts’ (Kellett Bidoli, 2016, 117). In a different project carried out with Vardè, they pointed out that the data captured by the pen make it possible ‘to return to any section of a speech and see the notes and/or listen to the SL [source language] over and over to unravel the process, highlight mistakes and good or bad choices which otherwise would go unnoticed’ (Kellett Bidoli and Vardè, 2016, 144).
The synchronised audio and video provided unique insights into lag, false starts, hesitations,
additions, or corrections. Finally, the authors also noted the interactional benefits gained as
‘the job of collecting notes is fast, they can be observed in all their dynamicity and, together
with synchronization of the SL, much can be gleaned in the classroom making students
active participants’ (Kellett Bidoli and Vardè, 2016, 144).
What prompted Romano to implement the use of smartpens in consecutive interpreting
training at Innsbruck University was a sense of dissatisfaction when teaching note-taking,
in particular, the difficulty of identifying ‘what went wrong during the note-taking phase’ of a
substandard interpretation (Romano, 2018, 9). Asking students ‘to copy their notes from
their notebooks to the blackboard’ as a sharing activity was an impractical solution and ‘a
waste of precious time’, while with digital pens, ‘one gains valuable time that can be used
for in-depth analysis of the notes’ (Romano, 2018, 11). Working with a cohort of 52 stu-
dents, she decided to use the technology following the pedagogical scenarios recommended
by Orlando (2010) and concluded that it was particularly useful and helpful in one area
that ‘tends to be neglected in consecutive interpreting while being of paramount importance
in simultaneous interpreting, namely decalage’ (Romano, 2018, 13). She praised the storage
capacity of pens allowing students ‘to go back to previous notes and monitor how their
note-taking technique has evolved over time’ (Romano, 2018, 12), as well as the ease of use
and the potential for more interactivity in the classroom.
Whether digital pens are used at the very start of training (Romano) or at a later advanced
stage to assess progress and efficiency (Orlando), they can benefit trainees throughout the
duration of their course and later in their professional practice.
All the examples reported here concur on the many pedagogical and metacognitive benefits such technology brings: access to ‘live’ notes makes it possible to identify at once, and better understand, which parts of the source speech were misunderstood, not memorised, or missed, and how long the lag/décalage is. The possibility for students to visualise the note-taking process and better identify their own strengths and weaknesses, to share ideas and draw inspiration from other students and trainers, and to better understand and analyse what can go wrong if they take excessive or disorganised notes is a further incentive to use such technology. Various pedagogical activities and sequences can
be developed and implemented to allow students and trainers to identify issues in the
note-taking technique of a trainee (e.g. self- or peer evaluation), but also to develop per-
sonalised and effective remediation strategies through cross-fertilisation (Orlando, 2016,
117–121). Despite these indisputable advantages, it is regrettable that so few initiatives have been reported so far. It is possible, though, that other trainers have been using the technology in their interpreting classrooms. In their survey of 60 interpreter trainers teaching within the EMCI (European Masters in Conference Interpreting) consortium, Riccardi et al. (2020, 31) noted that 9 of them (15%) use smartpens in their classes. Unfortunately, it appears that none of them have reported on their activities and findings in publications.
and learning benefits of using technologies on a more systematic basis (Frittella, 2021;
Orlando, 2019).
Similar to what Frittella recommends (2021) regarding computer-assisted interpreter
training (CAIT) and the use of technology in interpreter education, the use of digital pen
technology and smartpens in the consecutive interpreting classroom should be envisaged
and implemented on the basis of its intended pedagogical purpose. As discussed by Ahrens
and Orlando (2021), a favoured approach in interpreter education aims at putting the student at the centre of the teaching/learning act. Any such constructivist intention
relies on various pedagogical elements that will assist trainees in their learning process:
metacognitive strategies, process-oriented and product-oriented evaluations, or feedback
mechanisms, among others. In the teaching of note-taking in interpreting, the use of smart-
pens has been a crucial technological advance to achieve such objectives and should be
advocated for on a broader scale.
• The Livescribe range, requiring microdotted augmented paper: Pulse and Echo, with
built-in microphones and syncing possible with desktop via USB; Sky WiFi, Livescribe
3, Aegir, Symphony, and Echo 2, with audio syncing possible via a digital device paired
with Bluetooth and the Livescribe application. Audio files are stored separately and recorded as sharable pencasts synced with the notes. It is worth noting that, at the time of writing, most models except the Symphony and Echo 2 appear to have been discontinued, though some may still be available for purchase.
• The Neo models Smartpen M1+ and Smartpen N2 are likewise used with augmented paper and the versatile Neo Studio app.
• Moleskine Pen+ Ellipse, which offers an audio recording option on a paired phone, tab-
let, or computer via its application and works with specific Moleskine+ notebooks.
• The SyncPen NEWYES second-generation smartpen works both on paper and on a dedicated tablet and can be paired with any digital device via its app.
Could such a tool be a substitute for traditional pen and paper or for smartpens? It might
become the ‘next-level’ tool for those trainees/practitioners who already use touchscreen
devices or tablets. For those who still prefer relying on a pen and a notepad and are not
comfortable using a stylus, this might be a step too far. Time will tell.
9.5 Conclusion
The use of smartpens in interpreter training over the last ten years has been reviewed
in this chapter. Smartpens with video and audio recording features used with notepads, also known as pen-and-paper technology, still allow note-taking with pen and paper and offer undeniable advantages. Initiatives implemented in the consecutive interpreting classroom, albeit still too few, have demonstrated how access to ‘live’ notes offers invaluable and unmatched pedagogical opportunities to dissect trainees’ note-taking process and
to provide personalised remediation for the issues they may encounter. Trainers and train-
ees who have trialled smartpens praise in particular the synchronisation of handwritten
notes with the audio recording of the source speech, as well as the change of dynamics and
the increased engagement of students during and outside class time thanks to the potential
for more exchange of ideas.
As was noted, digital pen technology and smartpens are still underutilised in interpreter
training, even though a lot more could still be done to teach interpreting students to develop
appropriate note-taking systems that would ultimately improve the quality of their inter-
preting performances. The reasons for such underutilisation are probably multiple and include the budget needed to acquire pens and to replace the microdotted paper (though it can now be printed out from the Livescribe desktop software at no cost), the perceived complexity of setting up and using the tool, the lack of technological support available for trainers at their university, and the reluctance of professionals to use such technology (Orlando,
2023). As new digital tools and systems emerge, the future and the relevance of smart-
pens may also be questioned. Though the fact that they are used as ‘normal’ pens with
a notepad seemed to have been a positive characteristic at first (Maldonado et al., 2006;
Orlando, 2010), the paperless digital devices and web-based technologies currently avail-
able for note-taking and audio recording may be more convenient and appealing to current
and future generations of educators and students, who may no longer use pens and paper.
In any case, to see traditional teaching methods evolve and various technologies that aim
to enhance the practice of interpreting (be it digital pens and smartpens, automatic speech
recognition tools, or any future relevant technology) be implemented in interpreter educa-
tion and be more widespread and more democratised than they are today, a shift would
need to occur in interpreting programmes and in universities.
To make sure future graduates are trained to respond to the contemporary demands of
our discipline and our profession, and to fulfil their digital literacy obligations by exposing
students to a variety of technological tools, interpreting departments would need to build
capacity and ensure that trainers and researchers in education are made aware of the exist-
ence of such tools and systems, have access to them, and are trained in interpreting didactics
to gain the knowledge to develop purposeful pedagogical activities with them. As philoso-
pher and educational reformer John Dewey once put it, ‘if we teach today as we taught
yesterday, we rob our children of tomorrow’ (Dewey, 1897). If we want to see interpreting
education and practice remain relevant, a thorough focus on ongoing technological devel-
opments and changes, and on their impact on the interpreting world, is essential.
Notes
1 Live prompting CAI tools – a market snapshot – Dolmetscher-wissen-alles.de (sprachmanagement.net)
(accessed 1.7.2024).
2 www.techforword.com/resources (accessed 12.9.2024).
References
Ahrens, B., Orlando, M., 2021. Note-Taking for Consecutive Conference Interpreting. In Albl-Mikasa,
M., Tiselius, E., eds. The Routledge Handbook of Conference Interpreting. Routledge, London
and New York, 34–48.
Altieri, M., 2020. Tablet Interpreting: Étude expérimentale de l’interprétation consécutive sur tab-
lette. The Interpreters’ Newsletter 25, 19–35.
Arumí, M., Sánchez-Gijón, P., 2019. La toma de notas con ordenadores convertibles en la
enseñanza-aprendizaje de la interpretación consecutiva. Resultados de un estudio piloto en una
formación de master. Tradumàtica 17, 128–152. URL https://doi.org/10.5565/rev/tradumatica.234
Boyle, J.R., 2012. Note-Taking and Secondary Students with Learning Disabilities: Challenges and
Solutions. Learning Disabilities Research & Practice 27(2), 90–104.
Chen, S., 2017. Note-Taking in Consecutive Interpreting: New Data from Pen Recording. The Inter-
national Journal for Translation and Interpreting Research 9(1), 4–23.
Chen, S., 2020. The Process of Note-Taking in Consecutive Interpreting: A Digital Pen Recording
Approach. Interpreting 22(1), 117–139.
Chen, S., Kruger, J.L., 2023. The Effectiveness of Computer-Assisted Interpreting: A Preliminary
Study Based on English-Chinese Consecutive Interpreting. Translation and Interpreting Studies
18(3), 399–420. URL https://doi.org/10.1075/tis.21036.che
Cheung, A.K.F., Li, T., 2022. Machine Aided Interpreting: An Experiment of Automatic Speech Rec-
ognition in Simultaneous Interpreting. Translation Quarterly 104, 1–20.
Cymo Note, 2023. URL www.cymo.io/en/documentation/note/index.html
Darden, V., 2019. Educator Perspectives on Incorporating Digital Citizenship Skills in Interpreter
Education (PhD thesis). Walden University.
Dawson, L., Plummer, V., Weeding, S., Harlem, T., Ribbons, B., Waterhouse, D., 2010. Build-
ing a System for Managing Clinical Pathways Using Digital Pens. URL https://ro.uow.edu.au/
infopapers/1471
Defrancq, B., Fantinuoli, C., 2020. Automatic Speech Recognition in the Booth. Target 33(1), 1–30.
URL https://doi.org/10.1075/target.19166.def
Dewey, J., 1897. My Pedagogic Creed. E. L. Kellog & Co., New York.
Dimond, T., 1957. Devices for Reading Handwritten Characters. Proceedings from the Eastern Joint
Computer Conference 232–237. URL https://doi.org/10.1145/1457720.1457765
Drechsel, A., Goldsmith, J., 2016. Tablet Interpreting: The Evolution and Uses of Mobile Devices in
Interpreting. URL http://independent.academia.edu/adrechsel
Frittella, F.M., 2021. Computer-Assisted Conference Interpreter Training: Limitations and Future
Directions. Journal of Translation Studies 2, 103–142.
Gillies, A., 2019. Consecutive Interpreting: A Short Course. Routledge, London and New York.
Goldsmith, J., 2018. Tablet Interpreting: Consecutive 2.0. Translation and Interpreting Studies 13(3),
342–365.
Goldsmith, J., 2023. Tablet Interpreting: A Decade of Research and Practice. In Corpas Pastor, G.,
Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. IVITRA Research in
Linguistics and Literature, 37. John Benjamins, Amsterdam and Philadelphia, 27–45. URL https://
doi.org/10.1075/ivitra.37.02gol
Grieve, C.R., McGee-Lennon, M., 2010. Digitally Augmented Reminders at Home. URL www.cs.stir.
ac.uk/~kjt/research/match/resources/documents/grieve-reminders.pdf (accessed 12.9.2024).
Hiebl, B., 2011. Simultanes Konsekutivdolmetschen mit dem Livescribe™ Echo™ Smartpen [Simultaneous Consecutive Interpreting with the Livescribe™ Echo™ Smartpen] (MA disserta-
tion). University of Vienna.
Kellett Bidoli, C.J., 2016. Traditional and Technological Approaches to Learning LSP in Italian to
English Consecutive Interpreter Training. In Garzone, G., Heaney, D., Riboni, G., eds., Focus on
LSP Teaching: Developments and Issues. LED, Milan, 103–126.
Kellett Bidoli, C.J., Vardè, S., 2016. Digital Pen Technology and Consecutive Note-Taking in the Class-
room and Beyond. In Zehnalová, J., Molnár, O., Kubánek, M., eds. Interchange Between Lan-
guages and Cultures: The Quest for Quality. Palacký University, Olomouc, 131–148.
Maldonado, H., Lee, B., Klemmer, S., 2006. Technology for Design Education: A Case Study. In
Proceedings of the Conference on Computer Human Interaction: CHI 2006. Montreal, Canada,
1067–1072.
Mellinger, C.D., 2022. Cognitive Behaviour During Consecutive Interpreting: Describing the
Note-Taking Process. The International Journal of Translation and Interpreting Research 14(2),
103–119.
Mielcarek, M., 2017. Das simultane Konsekutivdolmetschen [Simultaneous Consecutive Interpreting]
(MA dissertation). University of Vienna.
Nguyen, N.P.H., 2006. Note Taking and Sharing with Digital Pen and Paper: Designing for
Practice-Based Teacher Education (MA dissertation). Trondheim University of Science and
Technology.
Orlando, M., 2010. Digital Pen Technology and Consecutive Interpreting: Another Dimension in
Note-Taking Training and Assessment. The Interpreters’ Newsletter 15, 71–86.
Orlando, M., 2014. A Study on the Amenability of Digital Pen Technology in a Hybrid Mode of
Interpreting: Consec-Simul with Notes. The International Journal of Translation and Interpreting
Research 6(2), 39–54.
Orlando, M., 2015a. Digital Pen Technology and Interpreter Training, Practice and Research: Status
and Trends. In Ehrlich, S., Napier, J., eds. Interpreter Education in the Digital Age. Gallaudet
University Press, Washington, DC, 125–152.
Orlando, M., 2015b. Implementing Digital Pen Technology in the Consecutive Interpreting Class-
room. In Andres, D., Behr, M., eds. To Know How to Suggest . . . Approaches to Teaching Confer-
ence Interpreting. Frank & Timme, Berlin, 171–199.
Orlando, M., 2016. Training 21st Century Translators and Interpreters: At the Crossroads of Prac-
tice, Research and Pedagogy. Frank & Timme, Berlin.
Orlando, M., 2019. Training and Educating Interpreter and Translator Trainers as
Practitioners-Researchers-Teachers. The Interpreter and Translator Trainer 13(3), 216–232. URL
https://doi.org/10.1080/1750399X.2019.1656407
Orlando, M., 2023. Using Smartpens and Digital Pens in Interpreter Training and Interpreting
Research: Taking Stock and Looking Ahead. In Corpas Pastor, G., Defrancq, B., eds. Interpreting
Technologies – Current and Future Trends. IVITRA Research in Linguistics and Literature, 37.
John Benjamins, Amsterdam and Philadelphia, 6–26. URL https://doi.org/10.1075/ivitra.37.01orl
Özkan, C.E., 2020. To Use or Not to Use a Smartpen: That Is the Question. An Empirical Study on
the Role of Smartpen in the Viability of Simultaneous-Consecutive Interpreting (MA dissertation). Ghent University.
Pöchhacker, F., 2016. Introducing Interpreting Studies, 2nd ed. Routledge, London and New York.
Pogue, D., 2008. Gadget Fanatics, Take Note. The New York Times, 8.5.2008. URL www.nytimes.
com/2008/05/08/technology/personaltech/08pogue.html (accessed 12.9.2024).
Prandi, B., 2020. The Use of CAI Tools in Interpreter Training: Where We Are Now and Where Do
We Go from Here. inTRAlinea Special issue: Technology in Interpreter Education and Practice.
URL www.intralinea.org/specials/article/the_use_of_cai_tools_in_interpreter_training
Riccardi, A., Ceňková, I., Tryuk, M., Maček, A., Pelea, A., 2020. Survey of the Use of New Technolo-
gies in Conference Interpreting Courses. In Rodriguez Melchor, M.D., Horváth, I., Ferguson, K.,
eds. The Role of Technology in Conference Interpreter Training. Peter Lang, Oxford, 7–42.
Romano, E., 2018. Teaching Note-Taking to Beginners Using a Digital Pen. Między Oryginałem a
Przekładem 24(42), 9–16.
Sandrelli, A., 2015. Becoming an Interpreter: The Role of Computer Technology. MonTI Special Issue
2, 111–138.
Setton, R., Dawrant, A., 2016. Conference Interpreting: A Trainer’s Guide. John Benjamins, Amster-
dam and Philadelphia.
Svoboda, S., 2020. SimConsec: The Technology of a Smartpen in Interpreting (MA dissertation).
Palacký University Olomouc.
Techforword, 2023. Cymo Note: Speech Recognition Meets Automated Note-Taking. URL www.tech-
forword.com/blog/cymo-note-speech-recognition-meets-automated-note-taking (accessed 12.9.2024).
10
TECHNOLOGY FOR
TRAINING IN CONFERENCE
INTERPRETING1
Amalia Amato, Mariachiara Russo, Gabriele Carioli
and Nicoletta Spinolo
10.1 Introduction
The turn of the 20th century marked a watershed in interpreter-mediated communication.
The advent of technologies applied to conference interpreting has profoundly affected the way in which a speaker’s message can be put across in another language. Firstly, an
IBM system enabled simultaneous interpreting. This mode was officially launched during
the Nuremberg Trials (Baigorri-Jalón, 2000) and remains the standard interpreting mode
in multilingual conferences and international organisations today. Secondly, video-based
conference interpreting systems have enabled remotely interpreted communication to take
place on an unprecedented scale.
Technology affordances have had a remarkable impact not only on conference inter-
preting but also on interpreter education. In this field, technological tools have been
developed to meet training demands that have evolved over the years. These tools are
known as computer-assisted interpreter training tools, or CAIT tools (Fantinuoli, 2023;
Prandi, 2020).
This chapter provides an overview of the fast-paced evolution of CAIT tools for educa-
tional purposes (Section 10.2). It then discusses two examples of specifically developed tools
for online training and self-training (Sections 10.3 and 10.4 and related subsections). In light
of technological advances in interpreting, Section 10.5 deals with the need to include soft
skills training in interpreting curricula and presents an exploratory study on the use of an
AI-enhanced CAI tool in an educational setting. Section 10.6 offers some concluding remarks.
These first multilingual tools, the use of which was confined to their institutions of
origin, set the stage for the abundant online resources of the present day. Examples of cur-
rently available resources include the Interpreter Training Resources2 website, developed
by Andrew Gillies, or specific sections of the Knowledge Centre on Interpretation.3 Exam-
ples of accessible multilingual speech repositories include Speechpool, conceived by Sophie
Llewellyn Smith;4 the EU-funded ORCIT;5 and the multilayered Speech Repository, created
by the DG SCIC of the EU Commission.
Another training demand which emerged during this time concerned the need to effi-
ciently organise self-study activities for interpreter trainees. Sandrelli (2005) and Merlini
(1996) created two original pieces of software: Blackbox and InterprIT, respectively. These
examples of software innovation resulted from the collaboration between these two inter-
preting scholars and IT engineers. Blackbox included structured materials and exercises,
along with the possibility for users to record student performance and receive trainer feed-
back. From an educational prototype, Blackbox became a fully-fledged, somewhat popular
commercial product (Sandrelli, 2005). Although designed specifically for consecutive inter-
preting practice (Merlini, 1996; Gran et al., 2002), InterprIT never reached the commercial
production stage, unfortunately.
The gap that these early innovative training tools began to fill in interpreter education is bridged further today by the rise of virtual learning environments. Here, multiple training
resources can be stored, accessed, and exchanged. Examples include the general-purpose
platform Moodle, which can be used profitably to train interpreters (Kajzer-Wietrzny and
Tymczynska, 2014; Russo and Spinolo, 2022). Inspired by the concept of Moodle, Bertozzi
(2024), at the Department of Interpreting and Translation of the University of Bologna at
Forlì, has designed a self-training platform specifically for interpreter education. The plat-
form consists of several modules which both recap basic theoretical concepts of interpreter
training and provide a wide-ranging supply of principled training materials and evaluation
tools.
More recent developments in technology for interpreter education also appear in the field
of dialogue interpreting. These include computer-generated 3D virtual environments that
simulate credible business and community contexts, without exposing trainees to the stress-
ful conditions of real professional settings. These tools are the result of two EU-funded pro-
jects led by the University of Surrey: IVY (Interpreting in Virtual Reality; Braun and Slater,
2014; Ritsos et al., 2013) and EVIVA (Evaluating the Education of Interpreters and Their
Clients through Virtual Learning Activities) (Braun et al., 2014, this volume). In particular,
IVY was implemented in Second Life (SL) and includes virtual locations in which both
trainee interpreters and interpreting clients can practice, individually or collaboratively, via
virtual representations of themselves (avatars).
Another need arising in interpreter education concerns terminology retrieval and man-
agement, glossary compilation and ‘on-the-job’ consultation, term memorisation, and doc-
umentation. Meeting this demand has led to the second wave of technologies for interpreter
education. Whereas some proposed technological solutions have remained at master-thesis
level without any commercial development thus far (e.g. Pollice, 2016), others have enjoyed
broader diffusion, such as InterpretBank, developed by Fantinuoli (2023), which has become a widely used CAIT tool. Examples of other commercial CAIT tools with similar features include Interpreter’s Help and Interplex (see Fantinuoli, 2023).
A wider overview of the main CAIT tools developed during the second wave and, more
importantly, how and when to use them to promote guided student autonomy and more
As a result, DIT has also developed two innovative open-source CAIT tools: InTrain and
ReBooth. Their development was based on fruitful collaboration between an IT specialist
and an interpreting trainer/researcher (Carioli and Spinolo, 2019, 2020). Mastering these
skills requires a long, self-paced learning curve in a student-centred, collaborative envi-
ronment; the availability of peers; trainer feedback; and adequate physical spaces where
to practise and sit exams: These have always been considered prerequisites in interpreter
education institutions, until the outbreak of the COVID-19 pandemic, which put all ten-
ets upside down. In particular, online training had become a necessity, and therefore,
InTrain, which had been developed to facilitate students’ self-study practice if no physical
booths were available (e.g. if the booth-equipped lab was already booked), and ReBooth,
which was developed during the pandemic, proved indispensable to support interpreter
education.
The following sections will describe both CAIT tools: InTrain for online peer-to-peer
interpreting practice (Section 10.3), and ReBooth for online interpreter training and evalua-
tion (Section 10.4). Section 10.5 will review training-related features and the user-friendliness
of SmarTerp – an AI-enhanced CAI tool in its first release – in order to highlight how CAIT
tools assist and are used by interpreter trainers and trainees to practise and hone the afore-
mentioned interpreting skills.
10.3.1 Overview
InTrain (which stands for INterpreter TRAINing) is an open-source HTML5 WebRTC
web-based application for interpreter (self-)training. It was conceived in 2019 at the Uni-
versity of Bologna’s Department of Interpreting and Translation (DIT) by Carioli and Spi-
nolo and designed and developed by Carioli. This online tool allows students to practise
(either alone or with a tutor) and improve their simultaneous and consecutive interpreting
skills, as well as reflect on their practice in terms of interpreting process and product. The
rationale behind InTrain was to provide a minimal yet flexible application that would allow
groups of three students to work together. When working as a trio, one student acts as a
speaker, another as assessor and session administrator (who can also share a video or audio
file to interpret) and the third acts as an interpreter. As a result, this tool allows students to
practise their skills independently. However, tutors could also easily use InTrain to work
with two students, without requiring technical supervision or intervention.
When a trainer uses InTrain with a small group of students in simultaneous mode, there
are two possible constellations:
1. The trainer acts as a ‘pure’ listener and/or assessor, while one student acts as the speaker
and a second one interprets (three people involved).
2. The trainer can share a video to be interpreted and act as listener and/or assessor while
one student interprets (two people involved).
When working in InTrain alongside a tutor in consecutive mode, two students can listen
to the speech and take notes. One student will then deliver his/her interpretation, while
the other checks his/her notes or assesses the first student’s delivery. The objective behind
using InTrain in this way is to focus on practising interpreting techniques alongside self- or
other-initiated reflections and assessment. InTrain also aims to allow students to work on
a specific skill or ability related to interpreting. Consequently, the tool includes features
that enable users to organise activities in relation to a specific purpose. Examples include
working on a specific task or problem trigger in order to complement activities covered in
the classroom.8
connection (PC, tablet, etc.), headphones with a built-in microphone, and a webcam (which
may also be built into the device).
At first access, the screen resembles Figure 10.1.
Before beginning a training session, each user/student must choose a role to enable one
participant to run the session, while the remaining two participants act as interpreter or
speaker, respectively (see Section 10.3.1). The objective behind this is for the session to
unfold smoothly and prepare all student participants for their role in training, with or with-
out the presence of a trainer. Before starting an InTrain training session, the three student
participants agree on the interpreting mode(s) and/or the interpreting aspect they wish to
work on. As a result, the supervisor will have chosen the speech to use in the session in
advance and will have therefore briefed the interpreter on the mode of delivery (consecutive
or simultaneous) and on the terminological aspects they should prepare.
In terms of the operational steps of a session, the supervisor logs on first, choosing any
username that is not currently in use by another user. They must then allow the site to
access their webcam and headset and communicate their username to the other participants
via a separate means of communication. There is no built-in mechanism for sending email
invitations through the server for security reasons, since it is a public access tool. The inter-
preter and the speaker will subsequently connect to the platform, choose their respective
roles, and enter their supervisor’s username to join the session. Once all three participants
have connected to the session, they will be able to see and communicate with each other, via
their own interface. This working mode is called briefing mode. All microphones and audio
channels are open. This enables the participants to agree on the activity they wish to carry
out, and for the speaker to brief the interpreter.
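The join logic described above can be illustrated with a minimal sketch. The type names, data structures, and functions below are assumptions made purely for this example; they do not reproduce InTrain’s actual code.

```typescript
// Minimal sketch of the session set-up described above: the supervisor opens a
// session under a unique username, which doubles as the session key that the
// speaker and interpreter enter (received through a separate channel) to join.

type Role = 'supervisor' | 'speaker' | 'interpreter';
type Mode = 'briefing' | 'training';

interface Session {
  supervisor: string;                 // the supervisor's username is the session key
  participants: Map<Role, string>;    // role -> username
  mode: Mode;
}

const sessions = new Map<string, Session>();

/** The supervisor logs on first, under a username not currently in use. */
function openSession(username: string): Session {
  if (sessions.has(username)) throw new Error('Username already in use');
  const session: Session = {
    supervisor: username,
    participants: new Map<Role, string>([['supervisor', username]]),
    mode: 'briefing',                 // all microphones open until training starts
  };
  sessions.set(username, session);
  return session;
}

/** Speaker and interpreter join by entering the supervisor's username;
 *  there are no built-in invitations, since the tool is publicly accessible. */
function joinSession(supervisorName: string, role: Role, username: string): Session {
  const session = sessions.get(supervisorName);
  if (!session) throw new Error('No such session');
  if (session.participants.has(role)) throw new Error(`Role ${role} already taken`);
  session.participants.set(role, username);
  return session;
}
```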
The three interfaces are arranged in a similar way for each user, with a larger user display
in the centre of the screen, two smaller displays stacked on the left, and a chat window on
the right. The displays were sized in relation to what was considered to be most relevant for
each participant to see. For instance, the interpreter sees the speaker on the larger display.
Below this are the respective toolbars. By default, the supervisor sees the speaker on the
main user display, and the interpreter in the box at the top on the left. On the other hand,
the speaker sees the supervisor at the centre of the screen and the interpreter at the top on
the left. Lastly, the interpreter sees the speaker at the centre and the supervisor on the top
left corner. The display at the bottom on the left shows their own video stream. Streams at
the centre and at the top right corner can be switched.
Although InTrain was not designed to simulate a remote interpreting platform, it does
include a feature to allow the exchange of text messages and files between participants, via
a chat panel (Figure 10.2). To start an exercise, the supervisor first enables training mode on
the interface. This automatically mutes the supervisor’s microphone and allows them to listen
to both the speaker’s and the interpreter’s delivery at the same time or to choose one of the two audio inputs. The supervisor can then record and save the training session by creat-
ing a double-track file that contains both the speaker’s speech and the interpreter’s rendition.
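In a browser-based WebRTC application, one plausible way to obtain such a double-track recording is to merge the two audio streams into a single stereo stream with the Web Audio API and record it with MediaRecorder. The sketch below is an assumption about how this could be done, not InTrain’s actual implementation; the stream names and the function are hypothetical.

```typescript
// One plausible way to produce a dual-track recording of the kind described
// above: the speaker's audio goes to the left channel and the interpreter's
// rendition to the right, so the two can be compared on playback.

function recordDualTrack(speakerStream: MediaStream, interpreterStream: MediaStream): MediaRecorder {
  const ctx = new AudioContext();
  const merger = ctx.createChannelMerger(2);

  // Route each remote WebRTC audio stream to its own channel.
  ctx.createMediaStreamSource(speakerStream).connect(merger, 0, 0);     // left
  ctx.createMediaStreamSource(interpreterStream).connect(merger, 0, 1); // right

  const destination = ctx.createMediaStreamDestination();
  merger.connect(destination);

  const chunks: Blob[] = [];
  const recorder = new MediaRecorder(destination.stream);
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    const file = new Blob(chunks, { type: recorder.mimeType });
    // e.g. offer `file` for download, or upload it for later assessment
    console.log('Recorded', file.size, 'bytes');
  };
  recorder.start();
  return recorder;
}
```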
Finally, the platform offers the possibility of conducting a session without a speaker. This
feature can be used by uploading a link to a video from YouTube, Vimeo, or Facebook that
can be played in the speaker’s box on the interface, or by sharing the supervisor’s own screen.
The interpreter’s role only allows the user to share documents, mute and unmute their
microphone, write messages in the chat, and take part in the briefing. Once the supervisor
has switched the session to training mode, the interpreter can still hear the speaker but can
no longer hear the supervisor, whose microphone is muted. The speaker is unable to hear
the interpreter’s rendition by default but can select the interpreter’s audio channel on the
interface and listen to the rendition.
[U]niversities obviously have an important role to play in research into usability and
suitability of potential tools for interpreter support. They have a duty to help identify
technological gaps and advance the conceptual development of tools that are both
relevant and suited for interpreters.
Consequently, testing the usability and suitability of InTrain became the object of an MA
thesis conducted at the Department of Interpreting and Translation of the University of
Bologna (Santoro, 2020). The study tested the tool in both consecutive and simultane-
ous modes. The participants were four second-year MA students from the aforementioned
Department at the Forlì Campus, and four from the Interpreting Department of the Uni-
versity of Trieste. The aim was to test the online platform with students who had already
trained and self-trained together in person, and with students from another university who
had neither trained nor interacted in person before with the students from Forlì.
The study took place in different stages, carried out at all times by the same MA stu-
dent who had prepared the texts and speeches, taken field notes, and run the simultaneous
interpreting sessions under the supervision of an academic trainer. First, four students from
the Forlì Department who had trained together worked in two pairs on consecutive inter-
preting from Spanish into Italian using InTrain. Each student, in turn, took on the role of
speaker and interpreter. This enabled each participant to play both roles and ensured that
every student had interpreted consecutively once. The participants were also asked to take
part in a simultaneous interpreting session, run by the supervisor, to provide further evalu-
ation of the function of this tool. During the second stage of the study, the four students
from Forlì worked with four students from Trieste, in mixed pairs, and performed the same
activities as in the first stage. In short, the two stages aimed to allow students to test the
tool in a more ‘familiar’ situation with their course mates, and in a less familiar situation
with unknown peers.
All sessions followed the same structure: Briefing, listening to a speech, note-taking,
consecutive delivery, comments, and feedback. After the InTrain interpreting sessions had
been completed, the eight participants were asked to fill in a questionnaire. The question-
naire was composed of statements to be rated with a Likert scale and questions with yes/
no answers, followed by a space to provide explanations for the answer. The questionnaire
consisted of six sections: The first requested information relating to the participants’ demo-
graphics. The second asked about previous experience using platforms or other tools for
online interpreting and their names and also contained three questions about InTrain – if the
student had received an induction before using InTrain, if it had been useful, and if it was
judged necessary. The third section asked respondents to rate the functional efficiency and user-friendliness of the tool’s user interface, focusing on technical aspects, namely, whether exercises had run smoothly or there had been technical problems, whether the platform was intuitive, and whether it had been necessary to use an additional device in case of technical problems; these items were presented as statements to be rated on Likert scales ranging from 1 to 5, where 1 was ‘completely disagree’, 2 was ‘partially disagree’, 3 was ‘neither agree nor disagree’, 4 was ‘partially agree’, and 5 was ‘completely agree’. The fourth section investigated the
usability of the platform in the two interpreting modes. Students were asked to say whether
the tool was more fitting for consecutive or simultaneous interpreting; if, in the role of
interpreter, the sound quality had been good enough for the practice, and if not, why; and if
the sound quality when in the role of speaker and supervisor had been good enough for the
purpose of the session, and if not, why. The fifth section collected the users’ perceptions of
pragmatic and paralinguistic aspects of communication via the platform and of the training
experience. Respondents were asked to compare consecutive and simultaneous interpreting
practice with InTrain and in the lab, to state whether the online mode had influenced their
performance; whether non-verbal aspects had been influenced by the online mode (posture,
gaze), and if yes, how; how they rated the interaction compared to an on-site in-person
session (better or worse), and why; and if the experience with InTrain had been positive or
negative. Finally, the sixth section was devoted to users’ comments, suggestions, and opin-
ions. For space reasons, only the most relevant findings of this study will be reported briefly
here, namely, those pertaining to the platform’s user-friendliness, efficiency of functions,
and the need for another device to communicate with peers before and during the activity.
When asked to rate the statement regarding InTrain’s user-friendliness, seven respond-
ents out of eight answered ‘partially agree’, and one participant chose ‘neither agree nor
disagree’. As for the efficiency of functions, five students out of eight remarked that they
‘completely agreed’ that InTrain allowed users to perform interpreting activities smoothly,
one ‘partially agreed’, and two ‘partially disagreed’. In the final section of the questionnaire,
where participants could write comments, these two respondents complained about con-
nectivity problems during the two practice sessions. When asked whether the tool was suit-
able both for simultaneous and consecutive exercises, seven students out of eight answered
positively. When asked whether the tool’s online interpreting practice was comparable to
on-site practice, only one participant out of eight stated that both simultaneous and con-
secutive exercises online are ‘not comparable’ to interpreting in the classroom. When asked
how they found their experiences with the platform, all participants answered positively
but expressed different opinions regarding its use: Some stated that InTrain was a valuable alternative to on-site practice when face-to-face sessions were not possible, while others saw further potential for it and said they would consider using InTrain as a supplementary training tool alongside the interpreting lab.
While the study was limited in size, its findings suggest that this tool can help students
in peer-to-peer (as well as trainer–student) interpreting practice, even when they are not
co-located. The tool provides students with a free, open-access virtual space designed for
practising consecutive and simultaneous modes. It fosters students’ abilities to work col-
laboratively, despite geographical distance, and promotes familiarity with remote inter-
preting and RSI platforms. These additional skills, often defined as ‘interpreter’s soft skills’
(Albl-Mikasa, 2013; Galán-Mañas et al., 2020), are becoming crucial, if not core, compe-
tencies in the interpreting profession. This trend is likely to intensify over time (see
Section 10.5).
The next section will describe another technological platform specifically designed and
developed for teaching interpreting online: ReBooth.
could they interpret the same speech. These issues prevented consistency in terms of ensur-
ing equal levels of difficulty and test conditions for each student. ReBooth is specifically
designed to test groups of students interpreting the same speech at the same time, monitor-
ing their performance and collecting their renditions, which can then be evaluated later.
As for lab activities, the ReBooth platform allows trainers to see students in their virtual
booths, via the interface. Trainers are also able to communicate with their students, provid-
ing them with a briefing or feedback while all participants remain in their virtual booths.
10.4.1 Overview
Before the pandemic, other remote training systems (both synchronous and asynchro-
nous) had been developed and tested by different interpreter education institutions. At that
time, one of the most difficult aspects for distance education to reproduce was found to be
interaction between students and trainers. Ko (2006, 2008) experimented with teaching
liaison interpreting and SI via video and teleconferencing. He highlighted a main draw-
back of teleconferencing to be the fact that teachers and students were unable to see each
other, while remarking how interpreting in real-life situations involved both verbal and
non-verbal interaction between the interpreter and the primary participants. Later, Ko and
Chen (2011) would test online interactive interpreting teaching in virtual classrooms using
the internet and a collaborative cyber community (3C) learning platform. Their system
allowed students to connect from anywhere and to practise interpreting in four different
types of virtual spaces: Interacting with both their teacher and their classmates, in a syn-
chronous mode for in-class activities, and in an asynchronous mode for after-class group
practice. Within the InZone15 project, in humanitarian interpreting education, a blended
approach combined a learning platform – created by Geneva University’s Interpreting
Department Virtual Institute – and open-educational resources to prepare interpreters to
work in the field. In 2008, the Universidade Federal de Santa Catarina, together with other
Brazilian universities, developed an SLI e-learning programme for deaf and hearing stu-
dents, which was supplemented with a face-to-face option to support teaching based on
visual learning (Müller De Quadros and Rossi Stumpf, 2015). In addition, the Middlebury
Institute of International Studies at Monterey redesigned and adapted a module of tradi-
tional face-to-face instruction for an online context to bridge the gap between the need for
professional development in community interpreting and the constraints experienced by
working professionals (Mikkelson et al., 2019).
However, the potential of ICT was far from being fully exploited in the field of inter-
preter training at the time of these aforementioned online or blended experiences. The
COVID-19 pandemic significantly accelerated both the development and adoption of online
remote training in interpreting. In Italy, which was strongly hit by the pandemic, higher
education institutions had to move from classroom to distance teaching overnight due to
long, repeated lockdowns. Since interpreter training involves more than lecturing – which
can be moved online with videoconferencing relatively easily – the DIT of the University
of Bologna, like many others around the world (see, for example, Ho and Zou, 2023, for
experiments with Gather, a proximity-based platform), had to find solutions for moving
interpreting classes online. Instructors searched for platforms that would provide students
with the possibility to practice both consecutive and simultaneous interpreting remotely.
As a result, there was great need for a web application that would allow both simultane-
ous interpreting classes and exams to be conducted remotely in a similar way to in-person.
Students still needed the experience of entering booths and delivering simultaneous renditions of the same speech at the same time, as they would during classes in the lab, with renditions that could be recorded and collected by their teacher for training, final exams, or individual feedback and reflection on the task.
As a result, ReBooth sought to replicate this on-site situation as much as possible, pro-
viding a virtual classroom environment with remote virtual booths. The teacher would be
able to simultaneously deliver the same speech to all connected booths, monitor student
performance, and automatically collect recordings of the students’ renditions. The main
requirements of such a system were to be able to reliably reproduce the source speech in
all booths at the same time, without lags or interruptions due to streaming issues, and to
securely record and save students’ renditions.
A further requirement was the need to facilitate trainer-to-student communication as
much as possible. To this end, additional features were designed. These features included a
chat box, which could be used by, and be visible to, both the trainer and their students; a
‘class call’ mode to allow the trainer to talk to students for briefing or feedback purposes;
and visual signals to enable students to flag an issue to their trainer (for instance, a raised
hand to signal a student’s need/wish to speak). Although students are unable to see or talk to
each other in this tool, they are able to communicate via the chat and speak directly to their
trainer. In contrast to InTrain, which was specifically designed to foster student-to-student
interaction, the rationale behind this particular CAIT tool was to reproduce simultaneous
interpreting training sessions in the lab, with students in separate booths, listening to either
a speaker for whom they are interpreting or to their trainer.
At this stage, the session has not yet started, and students cannot yet enter their booths.
However, the trainer can prepare their class, uploading media files to the server, verifying
previously uploaded files, checking their headphones and webcam, and so on. Once all prepara-
tions are complete, the trainer ‘starts the class’, and the students can enter their booths by
opening the invitation link and clicking the ‘join’ button on their interface.
If a student has not received their invitation, the trainer can retrieve their connection link
by clicking the button in the upper left corner of their booth display and sending the link to
the student by other means. At this stage, the trainer can also create additional ‘booths’ as
needed and communicate links to the relevant students.
As shown in Figure 10.4, students are displayed on the trainer’s interface inside their
own ‘booths’ as soon as they log on, but their microphones are muted. In the student’s
interface, the trainer appears. The trainer is able to speak to particular students by clicking
the ‘talk to booth’ button on their display or communicate with the whole class using the
‘class call’ button. In ‘class call’ mode, the trainer can give the floor to one booth at a time.
As a result, the chosen student’s voice can be heard by all other booths, which allows only
one student to speak at a time. ReBooth connections are, in fact, ‘peer-to-peer’ rather than
‘peer-to-server’ connections. Consequently, there is a single audio/video WebRTC connec-
tion between the instructor and each student (the connection topology is star-shaped), but
there are no direct connections between booths. Students can only communicate with each
other via chat, using text messages that are transparently routed via the trainer’s application
using the WebRTC data channel. Students can also send visual signals (flags) to the trainer
like ‘raise hand’, ‘agree/ok/yes’, and ‘do not agree/no’.
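By way of illustration only – this is not ReBooth's actual source code – the star-shaped topology and chat relay described above could be sketched for a browser environment roughly as follows. The booth identifiers, the relay logic, and the list of ICE servers (which would point to STUN/TURN servers of the kind discussed in notes 9 and 10) are hypothetical placeholders.

    // Hypothetical trainer-side sketch: one WebRTC connection per booth (star topology),
    // with chat messages from one booth relayed to all others via the trainer's application.
    // Signalling (the exchange of offers/answers with each student) is omitted for brevity.
    const booths = new Map<string, { pc: RTCPeerConnection; chat: RTCDataChannel }>();

    function addBooth(boothId: string, iceServers: RTCIceServer[]): RTCPeerConnection {
      const pc = new RTCPeerConnection({ iceServers });   // audio/video link to this booth
      const chat = pc.createDataChannel('chat');          // text channel to this booth

      chat.onmessage = (event) => {
        // A student's message is forwarded to every other booth through the trainer,
        // since there are no direct connections between booths.
        for (const [id, booth] of booths) {
          if (id !== boothId && booth.chat.readyState === 'open') {
            booth.chat.send(`${boothId}: ${event.data}`);
          }
        }
      };

      booths.set(boothId, { pc, chat });
      return pc;
    }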
The teacher’s interface features a media player panel to manage and play media files, a
recorder panel to activate the recording in manual mode, two buttons for the automated
interpretation session mode (simultaneous/consecutive), and buttons to manage other fea-
tures of the practice session (including a session status monitor, the capacity to dismiss flags
and make a ‘class call’).
To reduce the risk of technical problems jeopardising students' activity, ReBooth contains the following
features:
1. ReBooth does not use streaming. Instead, it sends the entire media file to the student’s
browser before the trainer starts a class or an exam. This allows students to listen to
the media file in its original quality, without being affected by lags, drops, and even
disconnections.
2. ReBooth records the student's audio using two separate methods, alongside a potential
backup procedure (a minimal sketch of this dual recording follows the list):
• ReBooth saves the audio stream that the trainer receives on the trainer's computer.
• ReBooth also records the student's audio locally on the student's computer and sends it to the
server where ReBooth is hosted. This ensures that the audio retains the best possible
quality, since it is acquired directly from the student's device.
• The latter recording is also available to the student, who can save it and send it to
the trainer by other means (e.g. email).
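As a minimal sketch of the dual recording just described – again purely illustrative, not ReBooth's actual code – a browser-side recorder might capture the student's microphone locally and upload the finished rendition to the server hosting the application. The upload endpoint is a hypothetical placeholder.

    // Hypothetical student-side sketch: record the microphone locally with MediaRecorder
    // and upload the finished rendition to the server; the local copy remains available
    // to the student as a backup (e.g. to e-mail to the trainer if the upload fails).
    async function recordRendition(uploadUrl: string): Promise<void> {
      const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
      const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
      const chunks: Blob[] = [];

      recorder.ondataavailable = (e) => chunks.push(e.data);
      recorder.onstop = async () => {
        const rendition = new Blob(chunks, { type: 'audio/webm' });
        await fetch(uploadUrl, { method: 'POST', body: rendition }); // best-quality audio to the server
      };

      recorder.start();   // stopped elsewhere with recorder.stop(), e.g. when the exercise ends
    }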
In the ‘simultaneous’ mode, booth recording automatically starts when the speech begins,
and stops when the speech ends. The trainer can decide to allocate additional recording
time for students to complete their rendition (since there is often a time lag in simultaneous
interpreting).
In the ‘consecutive’ mode, playback starts immediately, and recording begins as soon as
the speech ends. By default, recording lasts as long as the speech, but again the trainer can
decide to allow more (or less) time for the rendition. In both cases, an audible signal alerts
the student when the recording is starting, and a timer indicates time remaining.
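The mode-dependent timing described in the two preceding paragraphs can be summarised in a small, purely illustrative sketch; the helper functions are stand-ins for the platform's real playback and recording controls, not ReBooth's actual code.

    // Hypothetical sketch of the timing logic: in simultaneous mode, recording runs
    // alongside playback; in consecutive mode, it starts only once the speech has ended.
    // Extra time can be granted by the trainer in both modes.
    type Mode = 'simultaneous' | 'consecutive';

    const playSpeech = () => console.log('playing source speech');   // stand-in
    const startRecording = () => console.log('recording started');   // stand-in
    const stopRecording = () => console.log('recording stopped');    // stand-in

    function runExercise(mode: Mode, speechMs: number, extraMs: number): void {
      playSpeech();                                          // playback starts immediately in both modes
      if (mode === 'simultaneous') {
        startRecording();                                    // record while the speech is playing
        setTimeout(stopRecording, speechMs + extraMs);
      } else {
        setTimeout(() => {                                   // consecutive: wait for the speech to end
          startRecording();
          setTimeout(stopRecording, speechMs + extraMs);     // by default, as long as the speech itself
        }, speechMs);
      }
    }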
During recording, the trainer can monitor the status of each booth via the appropriate
button. They can also listen to the students' output using the discreet listening function.
This is activated by simply clicking on the relevant booth’s display.
At the end of an interpreting activity, the automatic collection of the renditions begins,
as previously described. Recordings of the streamed audio can be downloaded together in a
zip file by clicking the appropriate button in the recording panel during the session. All
recordings collected on the server are kept and can be downloaded immediately or later (or
individually from each booth).
The student’s interface is simple, since all functions and activities are launched by the
trainer (Figure 10.5). The student can adjust the volume of the floor and stream. Students
can also send the trainer visual cues (hand up, thumbs up, thumbs down) and communicate
with the class via the chat.
The student interface has three display panels: One features the trainer, one shows the feed from the student's own webcam, and the final panel (usually hidden) is activated to display the media file that the trainer has chosen for the class to work from.
ReBooth has been used by the DIT at Forlì Campus in their conference interpreting MA
programme since April 2020 for remote classes and exams. It was also used for the first time
to conduct MA admission tests with 190 candidates in 2020. The candidates were divided
into groups of 10 to 12 students per trainer’s PC. They successfully sat the test remotely,
at the same time, without technical glitches from ReBooth. Since 2020, this process has
been used every year for admission tests to the interpreting MA programme, with over 150
candidates sitting the test in person, in the university’s language labs, using ReBooth. The
platform has also been used since 2022 by the master’s course in conference interpreting of
V. N. Karazin Kharkiv National University in Ukraine, as no on-site education has been possible since the Russian invasion, which was still ongoing at the time of writing this chapter.
The tool’s next release, ReBooth 2.0, is currently in its design stage. It will contain addi-
tional features, namely, a shared virtual space for trainers and students to communicate
before and after a practice session. This will allow for briefing and debriefing/feedback
or other purposes. ReBooth 2.0 will also include virtual booths that are able to host two
interpreters, who will be able to cooperate (prompt) and manage handover as they would
in person. It will also be possible to interpret using relay. These additional features will
make ReBooth 2.0 an invaluable tool for online teaching and learning, as it will replicate
the functions and conditions of RSI platforms and provide training which is as close as pos-
sible to in-person, on-site training.
10.4.3 Usability
ReBooth was conceived as a multiple-booth, virtual classroom to train students remotely.
As a result, the trainers’ experiences and opinions regarding its usability and suitability are
of paramount importance. This was tested by two of this chapter’s authors (Spinolo and
Carioli) using a convergent, mixed-method research design (Creswell, 1999). Eleven inter-
preter trainers from Italian and Spanish universities who had never used ReBooth before
were recruited. Participants received the user manual a week before the test, along with a
set of 15 tasks to complete. These tasks represented a trainer’s typical role during an inter-
preter training session. In addition, the participants were sent a short audio file to play dur-
ing the test. The researchers collected the data from three different sources: (1) participants’
screen recordings during the session; (2) responses to the User Experience Questionnaire
(UEQ, Laugwitz et al., 2008), which was completed straight after the session; and (3) two
focus groups (with five participants each),19 held one week after the test.
The participants’ screen recordings20 were analysed as a ‘performance measure’ (Rubin
and Chisnell, 2008, 166). The number of attempts made by participants to complete each
task was rated on a 6-point scale ranging from 0 (‘not problematic’, one attempt) to 5 (‘very
problematic’, more than 5 attempts, or not completed). Results show that most tasks (11 out
of 15) proved ‘not’ or ‘slightly’ problematic, with a mean score ≤1. Of the remaining four, the
two ‘most problematic’ tasks had a mean score of 2 (SD = 2.3) (‘listen to students providing
their consecutive renditions using the discreet listening function’) and 2.3 (SD = 1.4) (‘make
sure that all students have received the file’), while the remaining two tasks scored 1.2 (SD =
1.6) (‘start a consecutive interpreting session with the media file sent’) and 1.6 (SD = 1.4)
(‘make sure you can hear all students and they can hear you through the intercom function’).
The full UEQ used in this study includes the following six scales, as described by its devel-
opers (Schrepp, 2023, 2): Attractiveness (whether or not users like the product), perspicuity
(whether it is easy to learn how to use the product), efficiency (whether tasks can be solved
without unnecessary effort), dependability (whether users felt in control of their interaction
with the tool), stimulation (whether it is exciting to use the tool), and novelty (whether the
tool is innovative and creative). The UEQ has been previously used in translation and inter-
preting studies to assess tool usability (see, for example, Braun et al., 2020). In this case, the
UEQ results were positive on all six scales. Results were obtained using the UEQ data analysis
tools (Schrepp, 2023), which also compare the results to benchmarks derived from more than
450 usability studies. The results of the UEQ are reported in Table 10.1.
The thematic analysis conducted on the recordings of the two focus groups allowed
for both positive and problematic aspects of the tool to be identified. This activity proved
useful in obtaining suggestions for the current version of ReBooth, as well as contributing
to identifying desired features for its future 2.0 release. In general, participants found the
interface pleasant and easy to use; they appreciated the ease with which they could switch
between one booth and another, the possibility to communicate with one booth or with
the entire class, and the ability to download all recordings and send the audio or video file
before starting the session. Participants also highlighted the ability to streamline consecutive
and simultaneous sessions and obtain dual-track recordings as a further positive feature.
The most reported issues were connected to the class set-up interface, which participants
found less intuitive. They also reported issues with the format of recordings (webm), and as
observed in the screen recordings, participants experienced problems with using the inter-
com and found the file-sending functions to be less intuitive. Participants provided noteworthy
suggestions on how icon colours and sizes could be made more intuitive and created their
wish list of features for the 2.0 release. The most requested features included allowing more
students to connect at a time, allowing students to work cooperatively in virtual booths,
having access to more participant profiles (trainer, student, speaker, audience), having a
relay function, and the possibility to share screen and audio.
This usability study was conducted with a small sample of participants, and its lim-
ited scale does not allow for generalisable conclusions. However, based on the findings,
ReBooth appears to enable trainers to perform all the necessary tasks for conducting effi-
cient online interpreting classes and exam sessions.
With today and tomorrow’s digital tools our next generation students will have
unprecedented power to amplify their ability to think, learn, communicate, collabo-
rate and create. Along with all that power comes the need to learn the appropriate
skills to handle massive amounts of information, media and technology.
(Trilling and Fadel, 2009, 65)
The current abundance of media resources, including videos, podcasts, websites, and speech and terminology banks, requires the ability to navigate them. Therefore, interpreting stu-
dents need to understand how to use media resources and CAI tools for their learning pur-
poses and for their future professional development. In addition, since technology evolves
very quickly, two additional skills deemed necessary for students are ‘flexibility’ and ‘readi-
ness to explore and test innovations without being overwhelmed’.
Before the COVID-19 pandemic, which boosted the use of technology in interpret-
ing, Fantinuoli and Prandi (2018) had already identified three crucial areas of learning
for future interpreters: Remote interpreting, computer-assisted interpreting (CAI), and
automatic speech translation (AST). CAI tools are meant to relieve the interpreter’s cog-
nitive load during simultaneous interpreting and improve the interpreter’s performance.
Initial studies on the subject show that interpreters’ performances improved with these
tools (Prandi, 2017, 2018, this volume). CAI tools can have an impact on all three stages of the interpreting process identified in the field-specific literature (Kalina, 2007; Gile, 1995/2009): before, during, and after the interpreting event. These tools can help interpret-
ers retrieve domain-specific terminology and knowledge that are relevant to their specific
conference. CAI tools assist the interpreter in organising and memorising terminology and
knowledge before the event, and in looking up terminology during the time-constrained
interpreting process (Fantinuoli and Prandi, 2018, 167), without having to allocate addi-
tional cognitive resources to manually perform the search themselves (Fantinuoli, 2023).
CAI tools also assist the interpreter after completion of an assignment by integrating and
updating glossaries or other relevant documents. However, as anticipated, CAI tools have
since moved to a new paradigm and are starting to integrate AI. This evolution requires
further skills to be included in interpreter training programmes if trainers want their stu-
dents to be ‘tech-savvy’.
Simultaneous Interpreting) developed SmarTerp, an ASR- and AI-based CAI tool. Its initial
release was tested in an interpreter education setting – the DIT of the University of Bolo-
gna – to explore the impact of its use on simultaneous interpreting performance and collect
trainees’ perceptions of its usefulness. SmarTerp combines CAI features with speech recog-
nition and translation by displaying well-known problem-trigger items (terms, named enti-
ties, numbers, acronyms), alongside their translations, on the interpreter’s PC screen. Russo
et al. (under review) conducted the study with 24 second-year students of the Master in Interpreting, who participated on a voluntary basis to test the tool. All participants (21
females, 3 males) were Italian L1 speakers and were divided according to the following
language combinations, each comprising six students: Italian > Spanish, Spanish > Italian,
Italian > English, English > Italian. These participants had never experienced this particu-
lar feature of a CAI tool before. However, they had all received training in the tool before
undertaking the tests. Twelve students participated on-site, in the Department’s interpret-
ing labs, while the remaining 12 took part online, via Zoom. A total of 15 source video
texts were administered to participants across three experimental sessions: Five English
speeches and corresponding five Spanish and five Italian speeches. The speeches contained
the problem triggers that Frittella (2021) had identified at an earlier stage of testing with
professional interpreters. These problem triggers included named entities, acronyms,
numbers, and technical terms. All speeches were controlled and harmonised in relation to
duration and text features. The six analytical categories suggested by Frittella (2022) plus
one suggested by the research team were subsequently applied to the transcribed renditions
in order to analyse them: Correct rendition, partial rendition, minor error/missing detail,
generalisation, omission, and semantic error, plus self-corrections. Once all renditions had been transcribed and analysed, the sessions that had been conducted using SmarTerp were com-
pared to those without the CAI tool. A similar pattern was observed in all four directions
(i.e. IT<>EN and IT<>ES): More successful handling of triggers (correct renditions, par-
tial renditions, minor errors) and fewer instances of unsuccessful management (omissions,
semantic errors) were recorded when using the CAI tool. Directionality therefore does not seem
to impact the students’ renditions. Furthermore, all 24 students’ performances improved
from Session 1 to Session 3, thus indicating the positive effect of their familiarisation with
the CAI tool.
In terms of usability, at the end of the sessions that used SmarTerp, the participants were
asked to complete a short online questionnaire where they answered questions and rated
statements using a 7-point Likert scale, with 1 = strongly dissatisfied/disagree/very unlikely
and 7 = strongly satisfied/agree/very likely. The main results are as follows: In response to
the question ‘Overall, how satisfied are you with the support of the CAI tool SmarTerp dur-
ing testing?’, the participants’ responses were overwhelmingly positive, with most partici-
pants indicating that they were either ‘satisfied’ or ‘very satisfied’. The majority of trainees
also found the CAI tool to be ‘user-friendly’, strongly agreeing with the statement ‘The CAI
tool was easy to use’. However, participants also highlighted the need for specific training
to be able to use SmarTerp effectively.
10.6 Conclusions
This chapter has explored the impact of technologies on interpreter education and profes-
sional life. In particular, CAI and CAIT tools are met with mixed opinions by all involved
parties: Higher interpreter education institutions, trainers, trainees, and professional
interpreters. The COVID-19 pandemic proved to be a real catalyst in developing CAI and
CAIT solutions that ensured efficient interpreter-mediated communication and interpreter
training remotely. Two such CAIT solutions developed at the DIT of the University of Bolo-
gna at Forlì, InTrain and ReBooth, were presented, and their evidence-based usability was
discussed. Furthermore, a section of the chapter focused on the results that emerged from
the initial tests of SmarTerp, a CAI tool developed by a European partnership and tested in
an academic setting by 24 interpreting trainees at DIT.
As for InTrain, students overall expressed positive opinions about its user-friendliness,
efficiency of functions, and suitability. However, in terms of its use, there were divided
opinions: Some students considered InTrain to be a ‘good alternative’ to on-site training
when the latter is not possible, while others stated they would only use the tool as an addi-
tion to in-person classes in the lab. These results reflect two concepts that are inherent in
interpreting: Situatedness and embodiment (Risku, 2002; Davitti and Pasquandrea, 2017;
Pöchhacker, 2024). Being in a virtual environment means lacking shared physical space
and context, as well as elements of non-verbal communication that interpreters use during
comprehension and production tasks.
The usability of ReBooth was rated positively by participants during the study, and it is
currently used successfully at the DIT of Bologna University, as well as at Karazin Kharkiv
National University. The next release of the tool is currently under development and will
include additional interactive features that aim to reproduce, as far as possible, the working environment of a well-equipped in-person classroom with real booths. New features
will therefore allow students to share a virtual booth and be able to listen to and commu-
nicate with their booth partner as they would in a physical booth.
There has also been clear appreciation for SmarTerp, although participants also
expressed the need for specific training in order to use it efficiently. Since the current trends towards working online and introducing AI into interpreting are likely to persist
and even accelerate, there is a call to include CAI and CAIT tools in interpreter education
in order to best equip students for the profession and allow them to develop the necessary
soft skills that are becoming increasingly crucial today.
Notes
1 This chapter was jointly conceived by the four authors. In the final version, Mariachiara Russo
authored Sections 10.1, 10.2, 10.5.2; Amalia Amato authored Sections 10.3, 10.3.1, 10.3.3, 10.4,
10.4.1, 10.5, 10.5.1, 10.6; Gabriele Carioli authored Sections 10.3.2, 10.4.2; and Nicoletta Spi-
nolo authored Section 10.4.3.
2 https://interpretertrainingresources.eu/ (accessed 31.3.2025).
3 https://knowledge-centre-interpretation.education.ec.europa.eu
4 https://nationalnetworkforinterpreting.ac.uk/interactive-resources/speechpool/
5 https://orcit.eu/
6 www.interpretbank.com/site/ (accessed 31.3.2025).
7 www.eitdigital.eu/fileadmin/2021/innovation-factory/new/digital-tech/EIT-Digital-Factsheet-
Smarterp.pdf (accessed 31.3.2025).
8 The use of InTrain is also discussed in the context of humanitarian interpreting training (Russo
and Spinolo, 2022).
9 The STUN (Session Traversal Utilities for NAT) protocol is used to assist devices behind a NAT
(network address translation) or firewall in establishing communication with external devices.
It works by allowing devices to discover their public IP address and the type of NAT they are
behind, enabling them to communicate with other devices even when located behind NAT or
firewalls.
10 The TURN (Traversal Using Relays around NAT) protocol is a relay protocol used when direct
peer-to-peer communication is not possible due to network address translation (NAT) or firewall
restrictions. TURN servers act as intermediaries, relaying data between communicating peers. This
enables devices to establish communication even when direct connections are not feasible, ensur-
ing connectivity in diverse network environments.
11 https://github.com/mattdiamond/Recorderjs (accessed 31.3.2025).
12 https://intrain.ditlab.it/ (accessed 31.3.2025).
13 www.gnu.org/licenses/agpl-3.0.html (accessed 31.3.2025).
14 https://github.com/bilo1967/intrain (accessed 31.3.2025).
15 www.unige.ch/inzone/ (accessed 31.3.2025).
16 https://rebooth.ditlab.it/ (accessed 31.3.2025).
17 Source code is available under the GNU Affero General Public v3 License from https://github.com/
bilo1967/rebooth.
18 A temporary evaluation account can be obtained free of charge (see https://rebooth.ditlab.it/ for
the request address).
19 One of the recruited participants was not able to participate in the focus groups.
20 Screen recordings were analysed for ten participants, as one screen recording had to be discarded
due to technical issues.
21 UEQ items are ‘scaled from −3 to +3. Thus, −3 represents the most negative answer, 0 a neutral
answer, and +3 the most positive answer’ (Schrepp, 2023, 2). In the same way, the range of scales
goes from −3 (worst performance) to +3 (best performance) (Schrepp, 2023, 5).
References
Albl-Mikasa, M., 2013. Developing and Cultivating Interpreter Expert Competence. The Interpreters’
Newsletter 18, 17–34.
Baigorri-Jalón, J., 2000. La interpretación de conferencias: el nacimiento de una profesión. De París
a Nuremberg. Editorial Comares, Granada.
Bertozzi, M., 2024. Continuous self-learning for conference interpreting trainees: the case of the Uni-
versity of Bologna. The Interpreters’ Newsletter 29, 19–39.
Boéri, J., De Manuel Jerez, J., 2011. From Training Skilled Conference Interpreters to Educating
Reflective Citizens: A Case Study of the Marius Action Research Project. The Interpreter and
Translator Trainer 5(1). Special Issue: Ethics and the Curriculum: Critical Perspectives, 41–64.
Braun, S., Davitti, E., Dicerto, S., Slater, C., Tymczyńska, M., Kajzer-Wietrzny, M., Floros, G., Kritsis, K., Hoffs-
taedter, P., Kohn, K., Roberts, J.C., Ritsos, P.D., Gittins, R., 2014. EVIVA Evaluation Studies Report. ResearchGate. URL www.researchgate.net/publication/309736446_EVIVA_Project_Evaluating_the_Education_
of_Interpreters_and_their_Clients_through_Virtual_Learning_Activities_-_Evaluation_Studies_Report
(accessed 9.6.2024).
Braun, S., Davitti, E., Slater, C., 2020. ‘It’s Like Being in Bubbles’: Affordances and Challenges of Vir-
tual Learning Environments for Collaborative Learning in Interpreter Education. The Interpreter
and Translator Trainer 14(3), 259–278. URL https://doi.org/10.1080/1750399X.2020.1800362
Braun, S., Slater, C., 2014. Populating a 3D Virtual Learning Environment for Interpreting Students
with Bilingual Dialogues to Support Situated Learning in an Institutional Context. The Interpreter
and Translator Trainer 8(3), 469–485. URL https://doi.org/10.1080/1750399X.2014.971484
Braun, S., Slater, C., Botfield, N., 2015. Evaluating the Pedagogical Affordances of a Bespoke 3D Vir-
tual Learning Environment for Training Interpreters and Their Clients. In Ehrlich, S., Napier, J., eds.
Interpreter Education in the Digital Age: Innovation, Access, and Change. Gallaudet University
Press, Washington, DC, 39–67.
Carioli, G., Spinolo, N., 2019. InTrain [software]. URL https://intrain.ditlab.it/credits (accessed 9.6.2024).
Carioli, G., Spinolo, N., 2020. ReBooth [software]. URL https://rebooth.ditlab.it/credits (accessed 9.6.2024).
Creswell, J.D., 1999. Mixed-Method Research: Introduction and Application. In Cizek, G.J., ed.
Handbook of Educational Policy. Academic Press, Cambridge, 455–472.
Davitti, E., Pasquandrea, S., 2017. Embodied Participation: What Multimodal Analysis Can Tell
Us About Interpreter-Mediated Encounters in Pedagogical Settings. Journal of Pragmatics 107,
105–128.
Defrancq, B., 2023. Technology in Interpreter Education and Training. In Corpas Pastor, G., Defrancq, B.,
eds. Interpreting Technologies – Current and Future Trends. John Benjamins, Amsterdam, 302–319.
Defrancq, B., Fantinuoli, C., 2020. Automatic Speech Recognition in the Booth: Assessment of
System Performance, Interpreters’ Performances and Interactions in the Context of Numbers.
Target – International Journal of Translation Studies 33(1), 73–102.
Fantinuoli, C., 2023. Towards AI-Enhanced Computer-Assisted Interpreting. In Corpas Pastor,
G., Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John Benjamins,
Amsterdam, 46–71.
Fantinuoli, C., Prandi, B., 2018. Teaching Information and Communication Technologies – a Pro-
posal for the Interpreting Classroom. trans-kom 11(2), 162–182.
Frittella, F.M., 2021. Early Testing Report. Usability Test of the ASR- and AI-Powered CAI tool
‘SmarTerp’. Unpublished report of the SmarTerp project.
Frittella, F. M., 2022. The ASR-CAI Tool Supported SI of Numbers: Sit Back, Relax and Enjoy
Interpreting? Paper presented at the conference Translating and the Computer (TC) 43. ResearchGate. URL www.researchgate.net/publication/363835256_The_ASR-CAI_tool_supported_SI_of_
numbers_Sit_back_relax_and_enjoy_interpreting/link/63304eab86b22d3db4e07c1d/download
(accessed 9.6.2024).
Frittella, F.M., 2023. Usability Research for Interpreter-Centred Technology: The Case Study of
SmarTerp. Language Science Press, Berlin. URL https://doi.org/10.5281/zenodo.7376351
Galán-Mañas, A., Kuznik, A., Olalla-Soler, C., 2020. Entrepreneurship in Translator and Interpreter
Training. HERMES – Journal of Language and Communication in Business 60, 7–11.
Gile, D., 1995/2009. Basic Concepts and Models for Interpreter and Translator Training: Revised
Edition, 2nd ed. John Benjamins, Amsterdam.
Gran, L., Carabelli, A., Merlini, R., 2002. Computer-Assisted Interpreter Training. In Garzone, G.,
Viezzi, M., eds. Interpreting in the 21st Century: Challenges and Opportunities. John Benjamins,
Amsterdam, 277–294.
Ho, C.-E., Zou, Y., 2023. Teaching Interpreting in the Time of COVID: Exploring the Feasibil-
ity of Using Gather. In Liu, K., Cheung, A.K.F., eds. Translation and Interpreting in the Age of
COVID-19. Corpora and Intercultural Studies, Vol. 9, Springer, Singapore, 311–330. URL https://doi.org/10.1007/978-981-19-6680-4_16
Kajzer-Wietrzny, M., Tymczynska, M., 2014. Integrating Technology into Interpreter Training
Courses: A Blended Learning Approach. inTRAlinea Special Issue: Challenges in Translation Peda-
gogy. URL www.intralinea.org/specials/article/2101
Kalina, S., 2007. ‘Microphone Off’ – Application of the Process Model of Interpreting to the Class-
room. Kalbotyra 57(3), 111–121.
Ko, L., 2006. Teaching Interpreting by Distance Mode: Possibilities and Constraints. Interpreting
8(1), 67–96.
Ko, L., 2008. Teaching Interpreting by Distance Mode: An Empirical Study. Meta 53(4), 814–840.
Ko, L., Chen, N.S., 2011. Online-Interpreting in Synchronous Cyber Classrooms. Babel 57(2),
123–143.
Laugwitz, B., Held, T., Schrepp, M., 2008. Construction and Evaluation of a User Experience
Questionnaire. In Holzinger, A., ed. HCI and Usability for Education and Work: 4th Symposium of
the Workgroup Human-Computer Interaction and Usability Engineering of the Austrian Computer
Society (USAB 2008). Springer, Berlin, 63–76. URL https://doi.org/10.1007/978-3-540-89350-9_6
Merlini, R., 1996. InterpretIT. Consecutive Interpretation Module. The Interpreters’ Newsletter 7,
31–41.
Mikkelson, H., Slay, A., Szasz, P., Cole, B., 2019. Innovations in Online Interpreter Education:
A Graduate Certificate Program in Community Interpreting. In Sawyer, D., Austermühl, F.,
Enríquez Raído, V., eds. The Evolving Curriculum in Interpreter and Translator Education. John
Benjamins, Amsterdam, 167–184.
Motta, M., 2013. Evaluating a Blended Tutoring Program for the Acquisition of Interpreting Skills:
Implementing the Theory of Deliberate Practice (PhD thesis). University of Geneva, Switzerland.
Motta, M., 2016. A Blended Learning Environment Based on the Principles of Deliberate Practice
for the Acquisition of Interpreting Skills. The Interpreter and Translator Trainer 10(1), 133–149.
Müller de Quadros, R., Rossi Stumpf, M., 2015. Sign Language Interpretation and Translation in Bra-
zil: Innovative Formal Education. In Ehrlich, S., Napier, J., eds. Interpreter Education in the Digi-
tal Age: Innovation, Access, and Change. Gallaudet University Press, Washington, DC, 243–265.
Olalla-Soler, C., Spinolo, N., Muñoz Martín, R., 2023. Under Pressure? A Study of Heart Rate and
Heart-Rate Variability Using Smarterp. HERMES 63, 119–142.
Pöchhacker, F., 2024. Is Machine Interpreting Interpreting? Translation Spaces, online first. URL
www.jbe-platform.com/content/journals/10.1075/ts.23028.poc
Pollice, A., 2016. Portare la tecnologia in cabina: le nuove tecnologie a servizio dell’interprete e il
caso della simultanea con testo (Unpublished MA dissertation). Department of Interpreting and
Translation, University of Bologna.
Prandi, B., 2017. Designing a Multimethod Study on the Use of CAI Tools During Simultaneous Inter-
preting. In Esteves-Ferreira, J., Macan, J., Mitkov, R., Stefanov, O.M., eds. Proceedings of the 39th
Conference Translating and the Computer. Tradulex, Geneva, 76–88. https://www.asling.org/tc39/
wp-content/uploads/TC39-proceedings-final-1Nov-4.20pm.pdf (accessed 9.6.2024).
Prandi, B., 2018. An Exploratory Study on CAI Tools in Simultaneous Interpreting: Theoretical
Framework and Stimulus Validation. In Fantinuoli, C., ed. Interpreting and Technology. Language
Science Press, Berlin, 29–59.
Prandi, B., 2020. The Use of CAI Tools in Interpreter Training: Where Are We Now and Where Do
We Go from Here? inTRAlinea Special Issue: Technology in Interpreter Education and Practice.
URL www.intralinea.org/specials/article/2512
Risku, H., 2002. Situatedness in Translation Studies. Cognitive Systems Research 3(3), 523–533.
Ritsos, P.D., Gittins, R., Braun, S., Slater, C., Roberts, J.C., 2013. Training Interpreters Using Virtual
Worlds. In Gavrilova, M.L., Tan, K., Kuijper, A., eds. Transactions on Computational Science
XVIII. Lecture Notes in Computer Science, Vol. 7848. Springer, Berlin, Heidelberg, 21–40. URL
https://doi.org/10.1007/978-3-642-38803-3_2
Rodríguez, S., Frittella, F.M., Okoniewska, A., 2022. A Paper on the Conference Panel “In-Booth CAI
Tool Support in Conference Interpreter Training and Education”. In Esteves-Ferreira, J., Mitkov,
R., Recort Ruiz, M., Stefanov, O.M., eds. Translating & the Computer (TC) 43, AsLing (Language Technology), 16–18 November 2021 on the Web: Conference Proceedings. Tradulex, Geneva,
78–87. URL www.tradulex.com/varia/TC43-OnTheWeb2021.pdf
Rodríguez Melchor, M.D., Horvath, I., Ferguson, K., 2020. The Role of Technology in Conference
Interpreting Training. Peter Lang, Lausanne.
Rubin, J., Chisnell, D., 2008. Handbook of Usability Testing: How to Plan, Design, and Conduct
Effective Tests. Wiley, Hoboken, NJ.
Russo, M., Amato, A., Torresi, I., under review. The Digital Boothmate in an Educational Setting:
Evaluation of SmarTerp Performance. The Interpreters’ Newsletter.
Russo, M., Spinolo, N., 2022. Technology Affordances in Training Interpreters for Asylum Seek-
ers and Refugees. In Ruiz Rosendo, L., Todorova, M., eds. Interpreter Training in Conflict and
Post-Conflict Scenario. Routledge, London, 165–179.
Sandrelli, A., 2005. Designing CAIT Tools: Blackbox. MuTra 2005 – Challenges of Multidimensional
Translation: Conference Proceedings, 191–209.
Santoro, B., 2020. Tecnologia e didattica dell’interpretazione. Esperienza di interpretazione consecu-
tiva e simultanea con InTrain (MA dissertation). University of Bologna, Department of Interpret-
ing and Translation.
Schrepp, M., 2023. User Experience Questionnaire Handbook. All You Need to Know to Apply the
UEQ Successfully in Your Projects. User Experience Questionnaire. URL www.ueq-online.org/
Material/Handbook.pdf (accessed 9.6.2024).
Trilling, B., Fadel, C., 2009. 21st Century Skills: Learning for Life in Our Times. Wiley, Hoboken, NJ.
PART III
11.1 Introduction
Live communication occurs in a myriad of settings. These include live events, television
broadcasts, conferences, museums, theatres, public services, and schools. The rapid growth
of multimedia and multilingual content, coupled with the pandemic-induced surge in online
and streaming content (Nimdzi, 2022a), has called for the development of effective and
varied solutions to make this content accessible to the broadest possible audience (Nimdzi,
2023). Ensuring access to information, culture, education, and entertainment both within
and across linguistic, sensory, and cognitive boundaries, and in real time, is essential for
fostering an inclusive society, as outlined in the UN Sustainable Development Goal 17.8.1.1
A growing legal framework supports the necessity for inclusive services to guarantee
access to information and education as fundamental human rights, as established by the
EU Charter of Fundamental Human Rights.2 The UN Convention on the Rights of Persons
with Disabilities (2006)3 also emphasises the need for equitable access to information and
entertainment for individuals with disabilities. This concept, according to WHO,4 refers
not only to ‘body or mind impairment’ but also to ‘activity limitations and participation
restrictions that can affect any of us’. This expanded view of disability challenges tradi-
tional notions by emphasising the importance of accessibility, particularly communicative
access, as a critical criterion beyond mere physical needs. A holistic approach ensures that
everyone, regardless of physical or cognitive limitations, can fully participate in society.
Although not traditionally framed as an access service, interpreting has already catered
for such communicative needs by providing access both intramodally (spoken-to-spoken)
and interlingually (from one language to another). However, defining interpreting strictly
as an oral form of language use is overly restrictive. This perspective fails to account for
institutionalised forms of interpreting that cross intramodal boundaries, such as signed
language interpreting (from oral to signed) and sight translation (from written to oral).
Boundary-crossing is thus not new in interpreting. Following the ideas of Kade (1968) and
Pöchhacker (2022), this chapter views immediacy as a common denominator across these
practices, characterised by meeting communicative needs in real time with minimal oppor-
tunities for editing the output. Recent technological advances to support real-time content
capabilities and services. Understanding the dynamics of these complex workflows is essen-
tial to appreciate the evolving landscape of translation and interpreting practices in the
digital age.
Following this introduction, Section 11.2 discusses the concept of hybridity in this land-
scape. Section 11.3 explores live STT practices and their integration within translational
activities. Section 11.4 details five workflows, presented along a continuum of human
involvement. Some advantages and disadvantages of each workflow in delivering live inter-
lingual content are discussed, with reference to examples of real-life use cases and appli-
cations, where available. Section 11.5 reviews some key research themes on these hybrid
practices. Finally, Section 11.6 reflects on the implications of such practices at conceptual,
professional, and pedagogical levels.
Braun, 2019; Jiménez-Crespo, 2020) turns identified in TS, new modalities have emerged
which broaden the concept of hybridity across different dimensions.
Transfer. Although not always considered as translational activities in their own right,
given their traditionally intralingual nature, inter- or transmodal practices that cross tra-
ditional boundaries between spoken and written media are increasingly emerging. These
‘enrich the continuum of diamesic variations traditionally polarised between spoken and
written language’ (Eugeni and Gambier, 2023, 70, my translation). In this context, real-time
STT practices are a case in point, as they can cross both diamesic and linguistic boundaries.
Method. Hybridity requires historical disciplinary silos (e.g. translation vs subtitling vs
interpreting) to be overcome, and practices that defy rigid categories to be recognised as full-blown translational practices. For example, the use of digital pens or tablets
for SimConsec (Pöchhacker, 2007) and SightConsec (see Saina and Ünlü, this volume) has
transformed traditional workflows and, consequently, relaxed the immediacy criterion
associated with traditional interpreting (Pöchhacker, 2023, 281). Similarly, speech technol-
ogies have opened up different methods for real-time STT output, which will be explored
further in this chapter.
Competences. Emerging practices also reshape the skills that humans require to oper-
ate within technologised environments. Doing so involves significant human–AI interac-
tion and dealing with highly multimodal texts. This requires traditional competences to be
updated or adjusted to new practices and workflows, and an increased awareness of different
user needs. Examples range from learning to interact effectively with technologies that sup-
port or enhance traditional interpreting performance (e.g. CAI tools, see Prandi, this vol-
ume; tablets, see Saina, this volume; digital pens, see Orlando, this volume) to technologies
that share tasks with human professionals to ensure real-time multilingual communication,
as is the case for live STT practices.
Set-ups. In live scenarios, hybridity is also used to reflect the mixed nature of live multi-
lingual communication set-ups, which often combine online/distance and on-site elements.
The COVID-19 pandemic has accelerated the emergence of new hybrid forms of real-time
communication. This evolution compels us to rethink our interactions within traditional
spatial constraints and address the complexities of real-time interlingual communication
in these contexts. Unlike traditional face-to-face interpreting, live STT practices have been
integrated from the start within remote and hybrid scenarios, such as TV broadcasts, and,
more recently, within various types of live events, which can be delivered in different for-
mats (Eichmeyer, 2018).
This chapter embraces the notion of hybridity between humans and machines, in line
with the emerging field of human–AI hybrids (Fabri et al., 2023). It challenges the conven-
tional view of AI as a substitute for human tasks, which leads to ‘two unfortunate conse-
quences: (1) a disproportionately large focus on automation and (2) a tendency to neglect
the powerful interworking that occurs when humans and AI augment each other’. Instead,
this chapter highlights research that promotes a decidedly more positive conceptualisation
of such human–AI collaboration as ‘dynamic combinations of individual competencies of
human and AI-enabled systems’ (Fabri et al., 2023 [no pagination]).
This chapter will next discuss the role of these technology-enabled hybrid modalities in
live interlingual communication, within the broader context of translation and interpreting.
Furthermore, it will highlight the evolving conceptual nexus shaped by the integration of
new technologies.
in these processes and has outpaced our understanding of their effectiveness and impact of
different practices. With these considerations in mind, we will now examine the workflows
in more detail. Based on research (e.g. Davitti and Sandrelli, 2020; Korybski et al., 2022;
Wallinheimo et al., 2023), different workflows for real-time interlingual STT have been
placed on a continuum from more human-centric to semi- and fully automated (Figure 11.1
- see also SMART 2023 video clip5).
other major live events, such as cultural events, festivals, and award ceremonies. It is also
being explored in conferences, business meetings, trade shows, and polit-
ical, educational, and legal settings, among other social environments. With the rise of
online streaming content and the need to make this accessible, there is potential for new
markets to open up. This includes settings where simultaneous interpreting or subtitling
is not normally available. Examples include live subtitling of online radio broadcasts,
remote subtitling of museum tours, and MOOC classes.
This workflow relies on efficient interaction between language professionals and
AI-driven SR. It requires complex, specialised skills, and an ability to adjust to technology
and see beyond traditional practices to understand the need for skill adaptation and acqui-
sition. Required skills include concurrent listening and translating while adjusting one’s
way of speaking to the software for enhanced recognition. The respeaker must also perform
audiovisual monitoring: checking their own spoken output (as is normal during interpret-
ing) in addition to monitoring the appearance of the subtitles on-screen and applying edits
or corrections in real time. This multitasking activity, which involves coordinating and
controlling all these steps simultaneously, makes interlingual respeaking akin to ‘simultane-
ous interpreting 2.0’.
This process is neither flawless nor effortless. At its core lies the ability to strategically
reformulate content in the target language. Humans possess the unique real-time flexibility
to adjust to different accents, speeds, and other source text characteristics, including multiple
speakers, lexical density, changing registers, or even languages. They can also extract mean-
ing from chaotic, impromptu speeches full of hesitations and self-repairs, make inferences,
and accurately interpret colloquialisms, sensitive language, idioms, cultural references, and
implicit meanings based on a thorough understanding of the context. This sets humans apart
from automated transcription and machine-generated subtitles, particularly in high-risk,
high-stakes contexts, and allows them to shape and apply their live editing skills according
to the needs at hand. However, these skills require adaptation to the specific practice.
As highlighted in Section 11.4, unlike interpreters, respeakers must remember that they
produce more words than the original, as they also verbalise punctuation and voice labels
to identify sounds or speakers. This results in a delay in following the speaker and adds
to the time lag required to process the incoming information. While broadcasters often
address time lag by delaying the signal to reduce or eliminate the perceived gap, condensa-
tion strategies become crucial in live scenarios, where managing the delay like broadcasters
is not possible. However, the question remains of how best to implement these
strategies without losing meaning. This is especially the case with intralingual respeaking,
where a verbatim approach is often expected by viewers. Given the lack of one-to-one cor-
respondence between a source and a target language, the argument for a sensatim approach
becomes more pronounced in interlingual respeaking. When language transfer is involved,
the limitations of a word-for-word approach have been well-documented by translation
studies research, where the importance of transferring ideas rather than words to ensure
meaning retention is well-established.
Spontaneous speech is often grammatically unpolished and redundant, making it unsuit-
able for direct use in written form. Respeakers can streamline and polish the content to
ensure high quality and accuracy while also considering the specific needs of the target audi-
ence. This makes live subtitles more engaging, readable, and accessible. Editing skills tra-
ditionally refer to those applied post hoc in respeaking, with different correction strategies
(e.g. based on voice commands, keyboards, mouse), performed by the respeaker themselves
due to the corporate model of many companies that rely on a largely freelance workforce.
Additionally, increased awareness and education are needed, as this workflow puts signifi-
cant pressure on interpreters, who are responsible for conveying meaning across languages
under visible scrutiny.
should this intervention occur, and what skills are required for effective on-the-go edit-
ing? Human-driven services are considered premium and increasingly involve a hybrid
approach. From an industry perspective, despite the hype around cutting human profes-
sionals from workflows, there is increasing awareness that different methods may be needed
in different circumstances. For example, in a recent Slator interview,6 Tony Abraham from
Ai-Media noted:
[Y]ou would put a respeaker on something that, A, is really important content, but,
B, where you do not have that situation where the AI can deliver those results. So, for
example, where you have multiple speakers, mixed quality audio, background noise,
singing, multiple languages . . . [which] tends to be the most important content for
our customers.
Similar to when MT came to the fore in translation, many providers are now offer-
ing a tiered approach to product accuracy, with premium captioning services providing
the highest accuracy and more automated solutions being offered with disclaimers about
their limitations. Consistent with academic discourse, and cutting through the marketing
hype, there is a growing consensus that a ‘one-size-fits-all’ solution cannot exist. However,
there is also an increasing need to validate these workflows empirically and identify the
conditions under which they perform best, understand their affordances and constraints in
different contexts, and explore how humans operate within these environments, as will be
discussed in the next section.
11.5.1 Efficiency
Although still relatively small, the body of research on different workflows is growing and
has yielded interesting findings. These studies have carried out either small-scale compara-
tive analyses of different methods or in-depth analyses of single workflows. They share a
common focus on assessing performance under various conditions. The term ‘quality’ is often used; however, since quality is a complex, multifaceted construct, it is not fully captured (see Davitti et al., this volume). ‘Accuracy’ is another commonly employed term, covering one important aspect of quality but not its entirety. ‘Efficiency’ has emerged as a more encompassing measure, incorporating accuracy along with other critical factors, including
speed, latency – important for user experience – and cost, which are crucial for driving the
market demand for this service. This section will not delve into detailed reporting, which
can be accessed through the provided references. However, it will showcase the research
approaches in this domain, highlighting their strengths and weaknesses, with a focus on
interlingual real-time STT workflows.
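To make this notion of efficiency more tangible, the sketch below illustrates, in Python, one hypothetical way of combining accuracy, delay, and cost into a single comparative score. The workflow figures, weights, and normalisation are invented for demonstration purposes and are not drawn from any of the studies discussed in this section.

```python
# Hypothetical illustration of an 'efficiency' construct combining accuracy,
# delay and cost for different workflows. All figures and weights are invented
# for demonstration and do not reproduce any reported results.

workflows = {
    "interlingual respeaking":      {"accuracy": 98.2, "delay_s": 8.0,  "relative_cost": 0.8},
    "SI + intralingual respeaking": {"accuracy": 98.5, "delay_s": 12.0, "relative_cost": 1.0},
    "ASR + MT (fully automated)":   {"accuracy": 95.0, "delay_s": 2.0,  "relative_cost": 0.1},
}

def efficiency(m, w_acc=0.6, w_delay=0.2, w_cost=0.2, max_delay=20.0):
    """Toy composite score: higher accuracy and lower delay/cost score better."""
    return (w_acc * m["accuracy"] / 100
            - w_delay * m["delay_s"] / max_delay
            - w_cost * m["relative_cost"])

for name, metrics in sorted(workflows.items(), key=lambda kv: efficiency(kv[1]), reverse=True):
    print(f"{name}: efficiency score {efficiency(metrics):.3f}")
```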
Starting with comparative studies, Romero-Fresco and Alonso-Bacigalupe (2022) exam-
ined the five workflows discussed in this chapter, namely:
1. Interlingual respeaking
2. Simultaneous interpreting + intralingual respeaking
3. Simultaneous interpreting + ASR
4. Intralingual respeaking + MT
5. ASR + MT
The study focused on one language combination (EN > SP) and involved two participants
per workflow where needed, except for (2), requiring two pairs, and (5), requiring no
human intervention. Participants in these workflows were described as ‘professionals with
experience ranging from five to twenty years in the private market’. However, no detailed information is provided on what these years of experience represent in terms of actual assignments performed, or on participants’ prior training, even though both could be key variables affecting performance. The materials used were two TEDx
monologic speeches, 11 to 15 min long, which, as stated by the authors, only differed in
length and speed of delivery. However, it is unclear how other variables, including topic,
technicality, lexical density, and syntactic complexity, were handled. Additionally, since the
study was conducted online, greater explanation of how specific variables were controlled
for would have been beneficial.
Efficiency was analysed in terms of accuracy, delay, and cost. Accuracy was calculated
via the NTR model (Romero-Fresco and Pöchhacker, 2017): three workflows (1, 2, 4)
exceeded the acceptability threshold of 98%, a benchmark validated only for intralingual
respeaking (using the NER model – Romero-Fresco and Martínez, 2015). This threshold
still requires validation in various real-life settings to determine its suitability as an accu-
racy benchmark in interlingual contexts. This distinction is important, as the literature often
extends this threshold to interlingual modalities, without questioning the extent to which it
actually applies when language transfer is involved.
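To illustrate how such an accuracy figure is derived, the sketch below computes an NTR-style score, assuming the commonly reported severity weightings (0.25 for minor, 0.5 for standard, and 1 for serious errors) and the 98% threshold. The error counts are invented, and a full NTR assessment also involves a qualitative analysis of each shift, which is not modelled here.

```python
# Sketch of an NTR-style accuracy calculation, assuming the commonly reported
# severity weights; error counts below are invented for illustration only.

SEVERITY_WEIGHTS = {"minor": 0.25, "standard": 0.5, "serious": 1.0}

def ntr_accuracy(n_words, translation_errors, recognition_errors):
    """Return accuracy (%) as (N - T - R) / N * 100.

    n_words: number of words in the live target output (N)
    translation_errors / recognition_errors: dicts of counts per severity (T, R)
    """
    t = sum(SEVERITY_WEIGHTS[s] * c for s, c in translation_errors.items())
    r = sum(SEVERITY_WEIGHTS[s] * c for s, c in recognition_errors.items())
    return (n_words - t - r) / n_words * 100

score = ntr_accuracy(
    n_words=1500,
    translation_errors={"minor": 12, "standard": 6, "serious": 2},
    recognition_errors={"minor": 8, "standard": 3},
)
print(f"Accuracy: {score:.2f}% ({'above' if score >= 98 else 'below'} the 98% threshold)")
```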
As the authors state, ‘the analysis of delay and cost yielded a much more nuanced sce-
nario that may limit the potential usefulness and acceptability of some of the workflows’
(Alonso-Bacigalupe and Romero-Fresco, 2024, 535). The fully automated workflow
(5) ranked first in terms of delay, despite being the worst in terms of accuracy, while
workflow (2) ranked last in this variable, despite being one of the best in terms of accu-
racy. The authors also highlight an inversely proportional relationship between workflow
automation and cost, where the latter is calculated ‘speculatively’ in terms of resources
needed.
Despite its relevant contribution, this study has several design limitations.
The authors themselves call for ‘further studies [that] could test larger samples, including
different genres (for instance spontaneous, unscripted interactions involving several speak-
ers, such as TV talk shows, political or social debates, online meetings etc.) and other
language combinations’ (Romero-Fresco and Alonso-Bacigalupe, 2022, 13), which can
yield different results. In addition, more details regarding participants’ profiles, training
undertaken, and technology used (ASR, MT) would be needed for methodological rigour.
While these preliminary findings suggest certain potential patterns, it would be premature
to consider them conclusive, despite how they are reported at times.
One attempt to systematise research findings in relation to this small body of compara-
tive research can be found in Alonso-Bacigalupe and Romero-Fresco (2024). The authors
directly compare studies, including Eugeni (2020), Eichmeyer-Hell (2021), Dawson (2020),
and Pagano (2022). Despite acknowledging that ‘this is not a systematised battery of exper-
iments designed and carried out in parallel . . . but a compilation of the results obtained by
a number of researchers’ (Alonso-Bacigalupe and Romero-Fresco, 2024, 536), they carry
out direct comparisons and discuss ‘emerging trends’ and ‘consistency’ in some key find-
ings. For example, simultaneous interpreting + intralingual respeaking (2) is identified as
the most efficient workflow, and fully automated ASR + MT (5) as the least efficient. Inter-
lingual respeaking (1) is reported as performing well in some studies but not in others,
with comparison drawn to ‘interlingual velotyping’ (used in Eugeni, 2020), despite the latter being a completely different technique. Intralingual respeaking combined with MT (4) is reported
as yielding good results, while simultaneous interpreting combined with ASR (3) ‘performs
poorly’. They also extend to the other studies their finding that ‘the more automated the workflow, the lower the delay, the cost and accuracy; the more human the method, the higher the delay, cost and accuracy’, and conclude that semi-automated workflows repre-
sent a ‘happy medium in terms of overall quality and efficiency’ (Alonso-Bacigalupe and
Romero-Fresco, 2024, 538).
Despite the intriguing, pioneering nature of the findings, a straight comparative analy-
sis must be handled with caution as it can lead to misleading conclusions if taken out
of context by the industry. For example, the comparison included studies that focused
on very different workflow techniques. Eugeni (2020) combined keyboard-based and
SR-based methods (including stenotyping and velotyping) in several language pairs (Ital-
ian into German, German into English, English into French), and Eichmeyer-Hell (2021)
compared stenotyping and respeaking, only intralingually (in German). While both
studies provide very insightful observations due to their naturalistic settings, that is,
real-life conferences, they do not offer the controlled conditions typical of experimental
research, which makes them unsuitable for the kind of direct comparison performed by
Alonso-Bacigalupe and Romero-Fresco (2024). Dawson (2021) replicated Romero-Fresco
and Alonso-Bacigalupe’s experiment, using the same workflows and language pair but
opposite directionality. However, findings only seem to be reported in the latter study,
making it difficult to scrutinise the details and the comparability of the approach adopted.
Pagano (2022) carried out experiments comparing three workflows (1, 2, 3) from EN >
ITA and all five workflows from SP > ITA using a sample of 20 participants in total, all
postgraduate students. The potential implications of recruiting such diverse samples will
be discussed in Section 11.5.2.
Accuracy as a key element of efficiency is also evaluated by other studies, which focus
on a more in-depth analysis of specific workflows. SMART,7 for instance, focused on an
extensive experimental analysis of interlingual respeaking, performed by 51 language pro-
fessionals across six language pairs (IT, FR, SP into and out of EN; 17 participants per lan-
guage pair) from different, relevant professional backgrounds (including interpreting and/
or pre-recorded/live subtitling, translation). MATRIC8 carried out a small-scale study of
intralingual respeaking + MT compared against simultaneous interpreting. This study also
relied on the participation of professional respeakers working full-time for media outlets
and EU-accredited interpreters working for the European Parliament and four language
pairs (from EN into IT, SP, FR, PO).
In line with Romero-Fresco and Alonso-Bacigalupe (2022) and Dawson (2020), SMART
and MATRIC adopted the NTR model to evaluate accuracy based on a quantitative and
qualitative assessment of positive and negative shifts between source input and target out-
put. These were categorised as ‘effective editions’ and ‘errors’, respectively. Despite the var-
ying levels of detail provided across studies regarding the implementation of such models
(e.g. number of assessors, procedures for first and second marking, and whether the whole
performance or a sample was evaluated), there is an attempt at establishing coherence
across the studies, which bodes well for comparison. However, the issue of benchmark-
ing for interlingual practices remains, and the only attempt at addressing it can be found
in SMART. Here, NTR results, which capture informativeness, were triangulated with results from the application of an intelligibility scale (based on Tiselius, 2009) to determine whether any performances below 98%, and if so which, could be included in the high-performer sample.
As is the case for a growing number of hybrid practices being evaluated via NTR-like
approaches (e.g. Rodríguez González et al., 2023; Radić et al., 2023 – see Davitti et al., this volume), MATRIC adjusted the NTR model to suit the workflows analysed. Here, recognition errors apply only to the intralingual respeaking process in the semi-automated workflow, that is, to an interim stage in the process; they do not apply to the simultaneous interpreting (benchmark) workflow, where this category of errors was not considered in the final output.
Among non-experimental studies, other types of accuracy assessment models were used,
namely, the IRA model (idea-unit rendition assessment; Eugeni, 2017) in Eugeni (2020),
and the WIRA model (weighted idea rendition assessment; Eichmeyer-Hell, forthcoming) in
Eichmeyer-Hell (2021). Another naturalistic small-scale study, Sandrelli (2020), focused on
interlingual respeaking (compared against simultaneous interpreting) from EN > ITA pro-
vided at a real-life symposium during the same conference. Sandrelli compared the semantic
content conveyed by the two modalities using a more qualitative, purpose-made taxonomy,
based on three macro-categories in terms of information made available to the audience
(transmission, reduction, distortion) and related subcategories.
This quick overview of the main methods used to evaluate accuracy in existing studies on real-time interlingual STT practices shows rather clearly that the variety of approaches used makes comparison across different workflows rather arbitrary,
particularly if the experiment parameters used in each study are not fully specified. Due
to the lack of a standardised approach, similar considerations arise in relation to meas-
urements of delay and cost as key elements of efficiency. The next section will address
other important methodological parameters to facilitate meaningful comparisons and offer
insights for future research.
likely to have a mixed background, combining more than one clear-cut profile. Therefore,
in the recruitment stage, the project designed an eligibility questionnaire to profile par-
ticipants. A definition of ‘professional’ for the purpose of the study was provided, that
is, ‘someone with a minimum of 2,000 hr of professional (paid or pro bono) experience
in at least one language profession’. This baseline requirement – equivalent to working
an average of 4 hr every weekday for two years – was used to streamline the recruitment
process and establish common grounds across participants. Practice hours were catego-
rised into brackets (e.g. 2,000–3,999 hr, 4,000–9,999 hr, and more than 10,000 hr) since
different professions track work in various ways (e.g. by days, number of words, or sub-
titles translated). The findings revealed that the majority of participants had a composite
background, with most (26 out of 51) combining three different professions. Furthermore,
all participants indicated written translation as part of their skill set, forming a common
denominator, despite their varying levels of expertise. This reflects the reality of the lan-
guage industry, where professionals often offer multiple services as part of their portfolio.
It also suggests that a more granular approach – focusing on specific skills rather than entire
professions – might prove more useful for identifying what can be transferred, what can be
adjusted, or what needs to be acquired from scratch when performing hybrid practices in
general (see Davitti and Wallinheimo, 2025).
Given the scarcity of professionals trained in the array of hybrid practices analysed,
planning a study requires careful consideration of the type of training participants receive
before testing each specific workflow. This includes the structure of the training and its
duration – both crucial when comparing and commenting on performance and accuracy.
Once again, to date, most available information comes from studies that focus on inter-
lingual respeaking, where the pre-testing training provided varies significantly in format
and duration. For example, the SMART project employed a 25 hr ‘training for testing’
prototype upskilling course over five weeks. Dawson (2020) used a ‘train, practice, and
test’ approach with a four-week course, consisting of three weekly sessions and 2 hr weekly
exercises. Pagano (2022) implemented a 70 hr course over three months, including 50 hr
of synchronous lessons and additional time for practical exercises and individual training.
In contrast, studies comparing hybrid practices did not report specific pre-testing training; instead, they asked participants to perform techniques for which they had previously
received training (see Romero-Fresco and Alonso-Bacigalupe, 2022; Korybski et al., 2022).
The type and duration of materials used as a basis for performance analysis also vary.
Dawson (2020) used 5 min videos, at speeds between 107 wpm and 159 wpm, on topics
including gardening and feminism. Pagano (2022) used 11 min clips, at 125 wpm, on cli-
mate change and speeches of varying lengths (1 to 4.2 min) and speeds (98, 139, and 126
wpm) on diverse topics, ranging from Pope Francis on climate change to the European
Economic and Social Committee. Romero-Fresco and Alonso-Bacigalupe (2022) employed
the same climate change clip used by Dawson and a 15 min TEDx talk at 165 wpm. These
materials all differ in genre, technicality, lexical density, and syntactical complexity. How-
ever, the actual characteristics of each speech are described only vaguely, for example as including ‘some or no specialised terminology’, being ‘not particularly dense’, or having a ‘low level of technicality’. This represents a weakness of
current research, which labels itself as experimental, although ‘using unedited speeches and
conducting the analysis at the text level would have likely introduced an excessive amount
of potentially confounding variables’ (Prandi, 2023, 135).
To address these issues, the SMART project designed its materials differently, focus-
ing on specific scenarios rather than genres. The experiments included two monologic
speeches – one at a controlled fast pace of around 140 wpm (identified as a stress factor
in real-time modes), and another alternating pre-prepared and impromptu passages, with the former normally associated with higher processing difficulty. The third test was a dialogic speech to test the multiple-speaker condition,
which is challenging in live STT due to voice alternation, quick exchanges, and partial
overlaps. The topic was controlled across all speeches to ensure no participant had an
advantage in terms of prior knowledge. Participants received advance briefs and termi-
nology for personal and software preparation. Focusing on conditions rather than genres
enabled control over specific variables of interest. It also allowed the average accuracy
across all conditions to be calculated based on the assumption that, in real-life speeches, all
variables can be present at once. Overall, the speeches were longer than those used in other
studies (approximately 15 min each), as one of the study’s goals was to explore the extent
to which participants could sustain such a complex practice. Comparability was ensured by
rendering and adapting the same English script for flow, idiomaticity, and terminological
consistency across six languages and directionality. Delivery during testing was randomised
to prevent practice effects. This approach aimed to promote the creation of comparable
speeches, ensuring the same structure across languages, which is crucial for comparative
analysis.
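As a minimal illustration of how such randomisation might be implemented, the sketch below shuffles the order of test conditions per participant; the condition labels and participant identifiers are placeholders and do not reproduce the project’s materials.

```python
# Minimal sketch of randomising the order of test conditions per participant
# to counteract practice effects; labels and IDs are placeholders.
import random

CONDITIONS = ["fast monologue", "pre-prepared/impromptu monologue", "dialogic speech"]

def randomised_schedule(participant_ids, seed=42):
    rng = random.Random(seed)          # fixed seed for a reproducible schedule
    schedule = {}
    for pid in participant_ids:
        order = CONDITIONS[:]
        rng.shuffle(order)
        schedule[pid] = order
    return schedule

print(randomised_schedule(["P01", "P02", "P03"]))
```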
In light of this, while interesting accounts and findings on different workflows exist,
cross-study comparisons are challenging due to the numerous variables characterising each
study. Also, accuracy scores compared without specifying the conditions in which they were
obtained are not particularly informative. For instance, considered in isolation, SMART’s
95.37% average interlingual respeaking accuracy score across all participants, language
pairs, and testing scenarios might seem unimpressive. However, this figure alone does not
convey the full picture of the workflow’s capabilities. When contextualised, this result reflects
the performance of 51 participants, each undertaking three interlingual respeaking tests
under different conditions (totalling 153 performances) on 15 min long speeches, after only 25 hr of training. This context provides a more nuanced understanding of the findings, which appear very impressive in this light and have been validated by industry stakeholders
involved in the project. Furthermore, the two-pronged analysis focusing on informativeness
and intelligibility (as explained in Section 11.5.1) revealed that 27 ‘high performers’ met
both criteria, achieving an average accuracy score of 97% and above (up to 98.87%) across
conditions and languages after a short training. Once again, this outcome is very promis-
ing and much more in line with findings from other studies. Consequently, there is clear
need for more rigorous analyses for reliable comparisons and a more nuanced approach to
reporting research findings, which is a responsibility of the academic community.
Pöchhacker and Remael (2019) developed a theoretical model mapped against a tripar-
tite structure of the interlingual respeaking process: pre-, peri-, and post-process stages.
This model is based on a comprehensive understanding of the practice mechanics but lacks
empirical grounding. They identified integrated technical-methodological competences (i.e.
procedural, related to the interlingual respeaking task), linguistic and cultural competences,
world and subject matter knowledge, and interpersonal and professional skills. In her PhD
thesis, Dawson (2020) set out to explore empirically task-specific skills needed in this prac-
tice. These were identified as multitasking, live translation, dictation, command of source
and target languages, and comprehension. Based on the SMART project, Davitti and Wall-
inheimo (2025) framed interlingual respeaking as a form of HAII and broadened the scope
to empirically explore not only procedural but also cognitive and interpersonal skills that
underlie the process and the challenges that arise during performance. Despite the explora-
tory nature of the project, which does not aim to list a comprehensive set of skills, SMART
provides a multifactorial competence framework that can be further expanded in future
studies.
Another question, addressed by several studies, relates to defining the most suitable pro-
file for the job. This has often been approached using traditional, clear-cut labels, mostly
subtitling or interpreting. Szarkowska et al. (2018) conducted the first empirical work to
address this question by asking ‘Are interpreters better respeakers?’. They experimented
with 57 participants, grouped into 22 interpreters, 23 translators, and 12 controls, with dif-
ferent levels of experience in their fields and tested different parameters (speech rate and the
number of speakers) in intra- and interlingual respeaking. While they found that interpret-
ers consistently achieved the highest scores, the differences in interlingual respeaking were
not as pronounced as expected. This indicates a certain transfer of skills from interpreting
to new hybrid practices, but with other domains to explore. Dawson’s (2020) study showed
that, generally, the interpreters in her sample obtained better results than subtitlers in both
intra- and interlingual respeaking. Despite this, Dawson’s quantitative results suggest that
there may not be a particular professional profile best suited to interlingual respeaking.
In other words, being an interpreter does not appear to guarantee successful interlingual
respeaking performance.
SMART took a different approach, starting from the premise that no profile is clear-cut,
as described in Section 11.5.2. Different professional backgrounds among those chosen for
profiling were first examined independently to identify whether one could predict output
accuracy. Based on a sample of N = 51 language professionals, the data revealed that a
background in live (monolingual) subtitling emerged as a predictor of ‘good’ performance
(β = 0.32; p = 0.02). This appears logical, as live intralingual subtitling via respeaking is
the closest practice to interlingual respeaking, sharing all procedural skills except language
transfer. This also suggests that adding language transfer (and the strategic behaviour asso-
ciated with it) is more efficient once all other skills involved in respeaking (e.g. interacting
with SR, adding punctuation, using SAD, etc.) are mastered. Of the 11 participants with
live intralingual respeaking in their background, 8 were high performers.
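Purely as an illustration of this type of analysis, the snippet below regresses an accuracy score on a binary background indicator using statsmodels; the data are randomly generated and do not reproduce the SMART dataset or its exact modelling choices.

```python
# Hypothetical sketch of testing whether a given professional background
# predicts respeaking accuracy; data are simulated, not the SMART dataset.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 51
live_subtitling_bg = rng.integers(0, 2, n)                        # 1 = live intralingual subtitling background
accuracy = 95 + 1.5 * live_subtitling_bg + rng.normal(0, 2, n)    # toy accuracy scores in %

X = sm.add_constant(live_subtitling_bg)
model = sm.OLS(accuracy, X).fit()
print(model.summary())   # inspect the coefficient and p-value of the background indicator
```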
Given the composite background of most participants, SMART also took a different
approach to grouping their backgrounds. This approach considered the fundamentals of
interlingual respeaking, which involves a diamesic shift from spoken to written, transcend-
ing traditional boundaries of literacy or orality. One might argue that traditional subtitling
already does this, but the real-time nature and time constraints add a layer of complex-
ity. Conversely, the ‘real-time’ factor forms part of spoken interpreting, but traditionally,
the diamesic transfer does not (as is argued at the beginning of this chapter). Therefore,
participants were grouped into clusters of professional backgrounds according to whether
they are more accustomed to practices that lead to spoken or written output, or both. This
approach helped move beyond the usual groupings according to profession. Based on this,
the project formed three balanced groups: spoken-to-spoken (n = 17), spoken-to-written
(n = 16), and mixed (n = 16). The first included individuals with consecutive/dialogue
and/or simultaneous/whispered interpreting in their background. The second included indi-
viduals with pre-recorded and/or live subtitling as part of their skills cluster. The last group
included individuals combining both interpreting and subtitling skills in their background
(e.g. consecutive/dialogue and pre-recorded subtitling; simultaneous and pre-recorded sub-
titling; consecutive/dialogue, simultaneous/whispered, and pre-recorded subtitling; con-
secutive/dialogue, simultaneous/whispered, pre-recorded, and live subtitling). Two outliers
reported having only written translation as their professional background. Interestingly, no
statistical difference was found between the three groups (p > 0.05), suggesting that no spe-
cific set of skills necessarily provides an advantage in this hybrid practice. It is thus a matter
of identifying the relevant procedural skills one already has, what is directly transferable,
and what may need adaptation or acquisition. This is particularly relevant for approaches
to training and upskilling, the final area of research, reviewed in the next section.
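For illustration only, a comparison of this kind could be run as a one-way ANOVA across the three background clusters, as sketched below; group means and scores are invented, and the original study’s exact test is not reproduced.

```python
# Hypothetical sketch of comparing accuracy scores across three background
# clusters; figures are simulated and do not reproduce the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
spoken_to_spoken  = rng.normal(96.0, 1.5, 17)   # interpreting-only backgrounds
spoken_to_written = rng.normal(96.2, 1.5, 16)   # subtitling-only backgrounds
mixed             = rng.normal(96.1, 1.5, 16)   # combined backgrounds

f_stat, p_value = stats.f_oneway(spoken_to_spoken, spoken_to_written, mixed)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")   # p > 0.05 would suggest no group difference
```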
Before proceeding, two points are worth noting. First, other hybrid workflows have not been scrutinised from the perspective of ‘required skills and competences’; attention thus far has focused mostly on establishing which workflow ‘performs best’, albeit not without limitations, as discussed. Secondly, there seems to be a general assumption that, in workflows
combining different professional expertise (e.g. simultaneous interpreting + intralingual
respeaking), each actor would perform their tasks as normal, without much adaptation.
However, this would require further exploration and experimentation, since awareness of
working as part of these hybrid workflows, or in combination with machines (e.g. simulta-
neous interpreting + ASR), may need adjustment at procedural, cognitive, and interpersonal
levels.
the ILSA,10 and the SMART projects (with the latter including the ongoing SMART-UP
follow-up).11 ILSA developed an online, self-paced course focusing on three real-life scenar-
ios: TV, live, and educational settings. ILSA’s course structure includes foundational compo-
nents, such as media accessibility, pre-recorded subtitling, and simultaneous interpreting, as
prerequisites to the core components of intra- and interlingual respeaking. Dawson focused
on developing a framework aiming to ‘offer trainers a structured approach to planning
training and to prepare trainers to organise and present course materials for interlingual
respeaking’ (p. 223). The framework comprises three blocks of activities: block 1 introduces trainees to media access alongside dictation and software management, taught in parallel; block 2 pairs simultaneous interpreting with intralingual respeaking, again taught in parallel; and block 3 consists of a module on interlingual respeaking. It also includes quality assessment points and discussions of professional practices and technological advancements relevant to respeaking. Suggestions for different delivery modes – online, on-site, or blended – are also provided.
SMART complemented these efforts by developing an upskilling course aimed at lan-
guage professionals from diverse, yet relevant, backgrounds, each bringing unique skills to
this emerging field. The course was piloted as a prototype to empirically explore different
types of competences (procedural, cognitive, and interpersonal) required for this new form
of HAII. Delivered remotely (due to the pandemic) over five weeks in 2021, the bespoke 25
hr course aimed to place participants on a level playing field by teaching all core procedural
skills needed for intra- and interlingual respeaking. The course also collected data on the
key competences (not only procedural but also cognitive and interpersonal) underlying the process, as well as on the level of accuracy that could be achieved in the product after completing the training. As previously mentioned, the course attracted 51 participants from over
250 applicants, demonstrating significant interest in this CPD opportunity. A face-to-face
version of the course was later implemented at a summer school in 2022 with 12 language
professionals, including staff from project partners Sky and Sub-Ti Ltd. Findings high-
lighted both shared and unique challenges in skills acquisition, depending on participants’
professional backgrounds, and established an initial baseline for skill attainment after 25
hr of training (see Davitti et al., under review). This offers a promising starting point for
individuals wishing to progress from functional to expert levels.
Building on these insights, the follow-up SMART-UP project is currently refining the
SMART prototype into a flexible CPD model, characterised by a modular structure that
ensures adaptability and customisability. Some elements are shared with the approaches reviewed earlier, such as the focus on the use of SR and on skills from
subtitling (e.g. familiarity with conventions, awareness of viewers’ needs, etc.) and inter-
preting (e.g. ability to listen and speak at the same time, strategic behaviour, multitask-
ing, etc.). Traditionally, however, interlingual respeaking training follows a sequence: simultaneous interpreting and pre-recorded subtitling are taught in full and separately, followed first by intralingual respeaking practice and then by interlingual respeaking practice.
In contrast, SMART’s approach breaks down the respeaking process into single pro-
cedural skills that can be taught progressively and in an incremental manner. Similar to
Dawson, within each teaching block devoted to procedural skills, there is a tech strand.
Here, participants focus on learning skills to interact with SR software. Unlike in Dawson’s model, though,
respeaking is broken down into core components, which are practised first intralingually
and then interlingually throughout the training, in constant alternation, with intralingual
practice serving as a scaffold for interlingual skill development. For instance, instead of
11.6 Conclusion
In this chapter, live interlingual STT has been conceptualised as a unique form of interpret-
ing accessible to hearing individuals, speakers of other languages, and those with hearing
loss, making it a service that is essential for some and beneficial for all. This (re)conceptu-
alisation frames interpreting as a diamesic activity that transcends spoken, sign, and written
language boundaries. It is also where the fields of translation, interpreting, subtitling, and
media accessibility converge into an increasingly fluid, hybrid space. Traditional boundaries
between these disciplines are thus dissolving, leading to a growing recognition of their inter-
related nature and the need for a more integrated approach.
Given rapid technological advances, the same service can be delivered in different
ways. This chapter has offered, first, a descriptive overview of some salient workflows,
with reference to how they are currently used in the industry. Viewing these practices as
technology-enabled forms of interpreting is crucial for their broader adoption beyond audi-
ovisual translation and media accessibility. Understanding the dynamics of these complex
workflows is fundamental for appreciating the evolving landscape of language-related prac-
tices in the digital age. Raising awareness of their application and potential becomes essential
for effective participation in various contexts, as well as for effective information dissemina-
tion. The integration of human and AI efforts is central to this evolution, underscoring the
need for new skills and regulatory frameworks to ensure high-quality, reliable services.
While technology is a critical component of these emerging practices, it is not the sole
factor determining their success or failure. Several aspects must be considered when explor-
ing such dynamics. These include procedural aspects (how to efficiently interface with AI
and optimise human–AI synergy), cognitive aspects (how to ensure appropriate effort levels
to prevent cognitive overload), interpersonal traits (what traits and attitudes can best sup-
port humans working in these environments), interactional dynamics (how to work with
others in new workflows), ergonomic considerations (how to integrate technology in a
manner conducive to well-being), and declarative knowledge (understanding how these
practices work and their best applications). Addressing these challenges requires rigorous
research methods and approaches to study design that enable replicability and comparabil-
ity. Research in this domain is still in its infancy, and it is hoped that the brief review of
existing literature, carried out in the second half of this chapter, has critically highlighted key findings as well as areas for further improvement.
Although these workflows vary greatly in their degree of human input and involvement, and despite the hype around fully automated services, one view shared across both academia and the industry is that there is no one-size-fits-all solution. The ‘right’ workflow must match
the specific requirements and needs of each situation. These hybrid practices are thus not
intended to replace traditional ones, like simultaneous interpreting or subtitling; instead,
they are meant to complement existing services, offering opportunities for diversifying
service portfolios and expanding professional capabilities while adapting to the evolving
demands of the digital age. The rapid pace of technological advancement often outstrips
our understanding of its optimal applications and implications. However, despite fears of
replacement by technology, live STT across languages is far from a ‘solved problem’, due to
the many untested variables that impact quality and accessibility.
Now is the time to think critically about how operating in these HAII environments is
reshaping the language professionals’ identity at its core. While there cannot be an expec-
tation for them to suddenly become AI experts, their understanding of the processes and
goals of these practices positions them best to advise on appropriate workflows and optimal
points of human integration for specific situations. As highlighted in a recent industry-led
report (Slator, 2023), the level and type of human intervention are also evolving, creating
roles such as live subtitles post-editors and consultants for developing, training, and evalu-
ating STT technologies and workflows, among others.
The fear that automation may threaten established job profiles is tangible. However, turning
away from human-centric practices that could benefit from expertise already available in the
language industry is only likely to ‘encourage the industry to adopt fully automatic methods that
are still not ready to provide high quality translations’ (Alonso-Bacigalupe and Romero-Fresco,
2024, 541). In conclusion, technology-enabled hybrid modalities are not a threat but an oppor-
tunity for the language industry and require a shift in the way research is conducted. By embrac-
ing these new practices and continuing to develop expertise, industry and academia can ensure
high-quality, accessible real-time interlingual communication for all.
Notes
1 www.un.org/sustainabledevelopment/globalpartnerships/ (accessed 02.04.2025).
2 eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:12012P/TXT (accessed 02.04.2025).
3 www.ohchr.org/en/instruments-mechanisms/instruments/convention-rights-persons-disabilities
(accessed 02.04.2025).
4 World Health Organization, 2001. International Classification of Functioning, Disability, and
Health (ICF). Geneva: WHO.
5 See SMART (2023) video at www.youtube.com/watch?v=rxIRKLR2_7o (accessed 02.04.2025).
6 ‘The future of live multilingual captioning’ – Slator interview with Ai-Media CEO Tony Abrahams, 28.4.2023. URL https://slator.com/future-live-multilingual-captioning-ai-media-ceo-tony-abrahams/
(accessed 02.04.2025).
7 SMART project – Shaping Multilingual Access through Respeaking Technology, ES/T002530/1,
Economic and Social Research Council UK, 2020–2023. URL https://smartproject.surrey.ac.uk/
(accessed 02.04.2025).
8 MATRIC project – Machine Translation and Respeaking in Interlingual Communication, Expand-
ing Excellence in England, Research England, 2020–2024.
9 Unfortunately, no participants who signed up to take part in the study reported having a background in sign language interpreting, although including such profiles in future studies would be encouraged, given the core diamesic transfer embedded in this practice.
10 ILSA project – Interlingual Live Subtitling for Access, Erasmus+ 2017-1-ES01-KA203-037948,
2017–2020.
11 SMART-UP project – Shaping Multilingual Access through Respeaking Technology – Upskilling,
Economic and Social Research Council Impact Acceleration Account, 2023–2025. URL https://
www.surrey.ac.uk/research-projects/smart-shaping-multilingual-access-through-respeaking-
technology-upskilling (accessed 02.04.2025).
References
Alonso-Bacigalupe, L., Romero-Fresco, P., 2024. Interlingual Live Subtitling: The Crossroads
Between Translation, Interpreting and Accessibility. Universal Access in the Information Society
23, 533–543. URL https://doi.org/10.1007/s10209-023-01032-8
Bhabha, H.K., 1994. The Location of Culture. Routledge, London.
Braun, S., 2019. Technology and Interpreting. In O’Hagan, M., ed. Routledge Handbook of Transla-
tion and Technology. Routledge, London, 271–288.
Davitti, E., Sandrelli, A., Zou, Y., Korybski, T., Wallinheimo, A.-S., under review. Designing a
Research-Informed Upskilling Course in Interlingual Respeaking for Language Professionals. The
Interpreter and Translator Trainer.
Davitti, E., 2018. Methodological Explorations of Interpreter-Mediated Interaction: Novel Insights
from Multimodal Analysis. Qualitative Research 19(1), 7–29. Sage Publications. https://doi.
org/10.1177/1468794118761492
Davitti, E., Sandrelli, A., 2020. Embracing the Complexity: A Pilot Study on Interlingual Respeaking.
Journal of Audiovisual Translation 3(2), 103–139. https://doi.org/10.47476/jat.v3i2.2020.135
Davitti, E., Wallinheimo, A.-S., 2025. Investigating Cognitive and Interpersonal Factors in Hybrid
Human-AI Practices: An Empirical Exploration of Interlingual Respeaking. Target 37, S pecial
Issue: Mapping Synergies within Cognitive Research on Multilectal Mediated Communication.
Dawson, H., 2020. Interlingual Live Subtitling: A Research-Informed Training Model for Interlingual
Respeakers to Improve Access for a Wide Audience (PhD thesis). University of Roehampton.
Dawson, H., 2021. Exploring the Quality of Different Live Subtitling Methods: A Spanish to English
Follow-Up Case Study. Paper presented at the 7th IATIS Conference, 17.9.2021, Pompeu Fabra
University, Barcelona, Spain.
Eichmeyer-Hell, D., 2018. Speech-to-Text Interpreting: Barrier-free Access to Universities for the
Hearing Impaired. Barrier-Free Communication: Methods and Products: Proceedings of the 1st
Swiss Conference on Barrier-Free Communication, ZHAW Digitalcollection, 6–15. URL https://
doi.org/10.21256/zhaw-3000
Eichmeyer-Hell, D., 2021. Speech Recognition (Respeaking) vs. the Conventional Method (Keyboard):
A Quality-Oriented Comparison of Speech-to-Text Interpreting Techniques and Addressee Prefer-
ences. In Jekat, S.J., Puhl, S., Carrer, L., Lintner, A., eds. Proceedings of the 3rd Swiss Conference on
Barrier-Free Communication (BfC 2020). Winterthur (online), 29.6.2020–4.7.2020. ZHAW Zurich
University of Applied Sciences, Winterthur. URL https://doi.org/10.21256/zhaw-3001
Eichmeyer-Hell, D., forthcoming. Schriftdolmetschen – Realisierungsformen qualitätsorientierten
Vergleich (PhD thesis).
Eszenyi, R., Bednárová-Gibová, K., Robin, E., 2023. Artificial Intelligence, Machine Translation &
Cyborg Translators: A Clash of Utopian and Dystopian Visions. Ezikov Svyat 21, 102–113. URL
https://doi.org/10.37708/ezs.swu.bg.v21i2.13
Eugeni, C., 2017. La sottotitolazione intralinguistica automatica – Valutare la qualità con IRA. CoMe
2(1), 102–113.
Eugeni, C., 2020. Human-Computer Interaction in Diamesic Translation: Multilingual Live Subtitling. In
Dejica, D., Eugeni, C., Dejica-Cartise, A., eds. Translation Studies and Information Technology – New
Pathways for Researchers, Teachers and Professionals. Editura Politehnica, Timişoara, 19–31.
Eugeni, C., Caro, R.B., 2019. The LTA Project: Bridging the Gap Between Training and the Profession
in Real-Time Intralingual Subtitling. Linguistica Antverpiensia, New Series – Themes in Transla-
tion Studies 18. URL https://doi.org/10.52034/lanstts.v18i0.512
Eugeni, C., Caro, R.B., 2021. Written Interpretation: When Simultaneous Interpreting Meets
Real-Time Subtitling. In Seeber, K., ed. 100 Years of Conference Interpreting: A Legacy. Cam-
bridge Scholars Publishing, London, 93–109.
Eugeni, C., Gambier, Y., 2023. La traduction intralinguistique: les défis de la diamésie. Editura
Politehnica, Timişoara.
TheExpressWire, 2023. Live Captioning Market 2023: Growth, Trend, Share, and Forecast Till 2030.
URL www.digitaljournal.com/pr/news/theexpresswire/live-captioning-market-2023-growth-trend-
share-and-forecast-till-2030-126-pages-report (accessed 12.08.2024).
Fabri, L., Häckel, B., Oberländer, A.M., Rieg, M., Stohr, A., 2023. Disentangling Human-AI Hybrids. Busi-
ness & Information Systems Engineering 65, 623–641. URL https://doi.org/10.1007/s12599-023-00810-1
Fantinuoli, C., 2018. Interpreting and Technology: The Upcoming Technological Turn. In Fantinuoli,
C., ed. Interpreting and Technology. Language Science Press, Berlin, 1–12.
Gottlieb, H., 2005. Multidimensional Translation: Semantics Turned Semiotics. In MuTra: Challenges
of Multidimensional Translation. URL www.euroconferences.info/proceedings/2005_Proceedings/
2005_proceedings.html
Greco, G.M., 2018. The Nature of Accessibility Studies. Journal of Audiovisual Translation 1(1),
205–232. URL https://doi.org/10.47476/jat.v1i1.51
Jiménez-Crespo, M., 2020. The “Technological Turn” in Translation Studies. Are We There Yet?
A Transversal Cross-Disciplinary Approach. Translation Spaces 9(2), 314–341. URL https://doi.
org/10.1075/ts.19012.jim
Joscelyne, A., 2019. World-Readiness: Towards a NewTranslate Benchmark. TAUS, The Language Data
Network. URL www.taus.net/resources/blog/world-readiness-towards-a-new-translate-benchmark
(accessed 12.09.2024).
Kade, O., 1968. Zufall und Gesetzmäßigkeit in der Übersetzung. Verlag Enzyklopädie, Leipzig.
Korybski, T., Davitti, E., 2024. Human Agency in Live Subtitling Through Respeaking: Towards a
Taxonomy of Effective Editing. Journal of Audiovisual Translation. Special issue: Human Agency
in the Age of Technology, 1–22. URL https://doi.org/10.47476/jat.v7i2.2024.302
Korybski, T., Davitti, E., Orăsan, C., Braun, S., 2022. A Semi-Automated Live Interlingual Commu-
nication Workflow Featuring Intralingual Respeaking: Evaluation and Benchmarking. In Proceed-
ings of the 13th Conference on Language Resources and Evaluation (LREC 2022), Marseille,
France, 4405–4413. ELRA.
Kubánková, E., 2021. Captions Increase Viewership, Accessibility and Reach. URL www.newtontech.
net/en/blog/23083-captions-increase-viewership-accessibility-and-reach (accessed 15.07.2024).
Marais, K., 2019. A (Bio)Semiotic Theory of Translation: The Emergence of Social-Cultural Reality.
Routledge, Abingdon. URL https://doi.org/10.4324/9781315142319
Nimdzi, 2022a. Languages & the Media: The Latest Trends in Media Localization. URL www.nim
dzi.com/media-localization-trends-languages-the-media/ (accessed 15.07.2024).
Nimdzi, 2022b. Consolidation and Growth in Media Localization. URL https://www.nimdzi.com/
transperfect-buys-hiventy/ (accessed 15.07.2024).
Nimdzi, 2023. The Nimdzi Business Confidence Study: Q1 and Q2 2023 Edition. URL www.nimdzi.
com/the-nimdzi-business-confidence-study-q1-and-q2-2023/ (accessed 15.07.2024).
Pagano, A., 2022. Testing Quality in Interlingual Respeaking and Other Methods of Interlingual Live
Subtitling (PhD thesis). Università di Genova.
Pöchhacker, F., 2007. Simultaneous Consecutive Interpreting: A New Technique Put to the Test. Meta
Journal des Traducteurs 52(2), 276–289. URL https://doi.org/10.7202/016070ar
Pöchhacker, F., 2022. Interpreters and Interpreting: Shifting the Balance? The Translator 28(2),
148–161. URL https://doi.org/10.1080/13556509.2022.2133393
Pöchhacker, F., 2023. Re-Interpreting Interpreting. Translation Studies 16(2), 277–296. URL https://
doi.org/10.1080/14781700.2023.2207567
Pöchhacker, F., Remael, A., 2019. New Efforts? A Competence-Oriented Task Analysis of Interlingual
Live Subtitling. Linguistica Antverpiensia, New Series – Themes in Translation Studies. Antwerp,
Belgium, 18. URL https://doi.org/10.52034/lanstts.v18i0.515
Prandi, B., 2023. Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on
Terminology. Translation and Multilingual Natural Language Processing 22.
Radić, A., Braun, S., Davitti, E., 2023. Introducing Speech Recognition in Non-Live Subtitling to
Enhance the Subtitler Experience. Proceedings of the International Conference HiT-IT 2023
167–176. URL https://doi.org/10.26615/issn.2683-0078.2023_015
Robert, I.S., Schrijver, I., Diels, E., 2019. Live Subtitlers: Who Are They? A Survey Study. Linguistica
Antverpiensia, New Series – Themes in Translation Studies 18. URL https://doi.org/10.52034/
lanstts.v18i0.544
Robinson, D., 2022. Cyborg Translation. Originally published in Petrilli, S. (Ed.), La traduzione.
Special issue of Athanor: Semiotica, Filosofia, Arte, Letteratura 10(2), 219–233.
Rodríguez González, E., Saeed, M.A., Korybski, T., Davitti, E., Braun, S., 2023. Assessing the Impact of Auto-
matic Speech Recognition on Remote Simultaneous Interpreting Performance Using the NTR Model. In
Proceedings of the International Workshop on Interpreting Technologies SAY IT AGAIN 2023.
Romero-Fresco, P., 2011. Subtitling Through Speech Recognition: Respeaking. St Jerome, Manchester.
Romero-Fresco, P., 2018. Respeaking: Subtitling Through Speech Recognition. In Pérez-González, L.,
ed. The Routledge Handbook of Audiovisual Translation. Routledge, London, 96–113.
Romero-Fresco, P., Alonso-Bacigalupe, L., 2022. An Empirical Analysis on the Efficiency of Five Interlin-
gual Live Subtitling Workflows. XLinguae 15, 3–13. URL https://doi.org/10.18355/XL.2022.15.02.01
Romero-Fresco, P., Eugeni, C., 2020. Live Subtitling Through Respeaking. In Bogucki, Ł., Deckert,
M., eds. The Palgrave Handbook of Audiovisual Translation and Media Accessibility. Palgrave,
London, 269–295. URL https://doi.org/10.1007/978-3-030-42105-2_14
Romero-Fresco, P., Martínez, J., 2015. Accuracy Rate in Live Subtitling: The NER Model. In
Díaz-Cintas, J., Baños, R., eds. Audiovisual Translation in a Global Context: Mapping an
Ever-Changing Landscape. Palgrave Macmillan, London, 28–50.
Romero-Fresco, P., Pöchhacker, F., 2017. Quality Assessment in Interlingual Live Subtitling: The NTR
Model. Linguistica Antverpiensia, New Series: Themes in Translation Studies 16, 149–167.
Sandrelli, A., 2020. Interlingual Respeaking and Simultaneous Interpreting in a Conference Setting: A
Comparison. inTRAlinea Special Issue: Technology in Interpreter Education and Practice.
Schäffner, C., Adab, B., 1995. Translation as Intercultural Communication: Contact as Conflict. In Translation as Intercultural Communication: Selected Papers from the EST Congress, Prague. John Benjamins Publishing Co., Philadelphia.
Simon, S., 2011. Hybridity and Translation. In Gambier, Y., van Doorslaer, L., eds. Handbook of
Translation Studies, Vol. 2. John Benjamins, 49–53. URL https://doi.org/10.1075/hts.2.hyb1
Slator, 2023. Slator Pro Guide: Subtitling and Captioning. URL https://slator.com/slator-pro-
guide-subtitling-and-captioning/ (accessed 12.08.2024).
Szarkowska, A., Krejtz, K., Dutka, Ł., Pilipczuk, O., 2018. Are Interpreters Better Respeakers? The
Interpreter and Translator Trainer 12(2), 207–226.
Tiselius, E., 2009. Revisiting Carroll’s Scales. In Angelelli, C.V., Jacobson, H.E., eds. Testing and Assess-
ment in Translation and Interpreting Studies. A Call for Dialogue Between Research and Practice,
American Translators Association Scholarly Monograph Series 14. John Benjamins, 95–121.
van Merriënboer, J.J.G., Kirschner, P.A., 2018. Ten Steps to Complex Learning: A Systematic
Approach to Four-Component Instructional Design, 3rd ed. Routledge, New York.
Verified Market Reports, 2024. Real-Time Language Translation Device Market Insights. URL
www.verifiedmarketreports.com/product/real-time-language-translation-device-market/ (accessed
12.08.2024).
Wagner, S., 2005. Intralingual Speech-to-Text Conversion in Real-Time: Challenges and Opportuni-
ties. In MuTra: Challenges of Multidimensional Translation Conference Proceedings.
Wallinheimo, A.-S., Evans, S.L., Davitti, E., 2023. Training in New Forms of Human-AI Interaction
Improves Complex Working Memory and Switching Skills of Language Professionals. Frontiers in
Artificial Intelligence 6. URL https://doi.org/10.3389/frai.2023.1253940
Wordly, 2024. State of Live AI Translation. URL www.wordly.ai/resources/wordly-ai-translation-
research-2024 (accessed 15.07.2024).
12
MACHINE INTERPRETING
Claudio Fantinuoli
12.1 Introduction
Language barriers pose a significant obstacle for many people. They not only restrict access to content in everyday leisure and entertainment contexts and in the circulation of news but also create difficulties in critical situations, such as accessing public services during humanitarian operations or while managing crisis scenarios, to name just a
few. Previous research has extensively shown, for example, that limited proficiency in the
languages of host societies significantly contributes to anxiety and stress among migrants
(Ding and Hargraves, 2009); the absence of mutual understanding because of language
barriers creates dangerous situations, hazards, exclusion, and other cascading effects during
crises (O’Brien and Federici, 2022); and the lack of language accessibility in the informa-
tion selection process impacts the framing and agenda-setting factors in newsrooms (Van
Doorslaer, 2009). These challenges are not limited to the physical world; they are also
becoming increasingly significant in online communication, where new challenges arise for
achieving barrier-free interaction among users with diverse linguistic and accessibility needs
(Kožuh and Debevc, 2018; Abarca et al., 2020).
For centuries, overcoming language barriers involved relying on human interpreters,
both professional and amateur, adopting a common lingua franca (Cogo, 2015) or even
creating artificial languages, like Esperanto (Fettes, 1997). While each approach presents
its own advantages, it is important to acknowledge that these also come with recognised
limitations. For example, human interpreters are not universally accessible or cost-effective,
and their services are often reserved for specific situations deemed essential by stakehold-
ers or mandated by law (Jaeger et al., 2019). The use of a lingua franca, such as English, is
accessible only to a subset of individuals, and often, their proficiency in it is limited (Hart-
shorne et al., 2018). Most situations involving people from different language backgrounds
remain unaddressed.
Several solutions are conceivable for overcoming language barriers. These include training more interpreters, training bilingual staff in multilingual workforces to act as interpreters, and making interpreter provision more effective through the use of remote interpreting. Among these options, many have proposed speech translation technology as a
tool to enhance accessibility and bridge linguistic and cultural divides, offering a more
inclusive and connected world (Waibel et al., 1991; Salesky et al., 2023; Anastasopou-
los et al., 2022). By overcoming the exclusivity of professionals in delivering this service,
machines promise to make accessibility across languages ubiquitous and more affordable
(Susskind and Susskind, 2017), contributing, at least to some extent, to mitigating language
barriers.
The quest for automatic speech translation has a long history, but it is only recently that this technology has gained momentum in research. The International Workshop on Spoken Lan-
guage Translation (IWSLT),1 the most important annual scientific conference dedicated to
this topic, was founded in 2004. The first workshop described the vision of this new branch
of computational science as an ‘attempt to cross the language barriers between people with
different native languages who each want to engage in conversation by using their mother
tongue’.2
More recently, riding the wave of advancements in speech recognition and machine
translation driven by neural networks and, lately, the introduction of large language mod-
els, MI has transitioned from research labs and conference venues to real-life applications.
Despite the progress made, the pursuit of high-quality automatic speech translation is
fraught with a wide range of challenges that go beyond technical and linguistic hurdles.
Achieving a high standard of translation quality is just the beginning; it is also important
to navigate the ethical considerations of MI, assess its suitability in various settings, and
engage in deeper discussions about the implications of society having unlimited access to
spoken language information from different languages and cultures.
This chapter seeks to enhance the understanding of the application of this technology
beyond its mere technical aspects, targeting a broader audience rather than experts in the field. Section 12.2 introduces the concept of MI and the terminological conven-
tions used in different disciplines. Section 12.3 gives an overview of the development of
the field. Section 12.4 delves into the technological approaches that form the basis of MI.
Section 12.5 presents the major challenges for machine interpretation. Section 12.6 outlines
some technological perspectives, and Section 12.7 the challenges in evaluating MI applica-
tions. Section 12.8 reflects on possible ethical implications of MI in society at large. Finally,
Section 12.9 concludes the chapter.
characterised by its focus on immediacy, that is, the fact that the message in translation is
available only once and cannot be edited (Fantinuoli, 2023). This is similar to the distinctive
features of human interpreting (Kade, 1968; Pöchhacker, 2022). The focus of MI is there-
fore on real-time and live scenarios, setting it apart from other forms of speech-to-speech
translation. MI can be either consecutive, where the machine processes and translates an
entire oral segment after it is spoken (such as a sentence or a longer passage), or simul-
taneous, where the translation is generated incrementally as the original speech is being
delivered, that is, based on partial input. Notably, MI may involve nuanced intervention in
the translation process, which may include adaptations, omissions, voice speed changes, or
other modifications to tailor communication effectively for specific contexts.
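As a schematic illustration of this distinction, the sketch below contrasts a consecutive pipeline, which processes a whole segment once it has been spoken, with an incremental, simultaneous one; recognise(), translate(), and synthesise() are hypothetical placeholders for ASR, MT, and TTS components in a cascaded architecture and do not correspond to any specific system.

```python
# Schematic contrast between consecutive and simultaneous machine interpreting.
# recognise(), translate() and synthesise() are placeholders for ASR, MT and TTS.

def recognise(audio_chunk) -> str:
    return "transcribed text"            # placeholder ASR output

def translate(text: str, partial: bool = False) -> str:
    return "translated text"             # placeholder MT output (may be revised if partial)

def synthesise(text: str) -> bytes:
    return b"audio"                      # placeholder TTS output

def consecutive_mi(segment_audio) -> bytes:
    """Translate an entire spoken segment only after it has been delivered."""
    return synthesise(translate(recognise(segment_audio)))

def simultaneous_mi(audio_stream):
    """Emit translations incrementally from partial input while the speaker talks."""
    hypothesis = ""
    for chunk in audio_stream:
        hypothesis = (hypothesis + " " + recognise(chunk)).strip()
        yield synthesise(translate(hypothesis, partial=True))

# Example: one full segment vs. a stream of two chunks.
print(consecutive_mi(b"segment"))
for audio_out in simultaneous_mi([b"chunk-1", b"chunk-2"]):
    print(len(audio_out), "bytes of synthesised audio")
```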
While the term speech-to-speech translation may refer to both the underlying technol-
ogy and its application, MI is primarily used only to denote the application rather than the
technology itself.
12.3 History
Modern speech translation applications have only caught the world’s attention in the last
few years thanks to impressive improvements in the underlying technologies, but researchers have been working to develop and understand the technology's potential for much longer.
The history of speech-to-speech translation has always been tightly related to the progress
of a series of different technologies, especially speech recognition and machine translation
and, more recently, of large language and multimodal models.
While research and development in machine translation date back to the 1940s, it was
only during the 1983 World Telecommunication Exhibition that multilingual speech translation made its public debut (Kato, 1995). This technology, integrating speech recognition
and speech synthesis with machine translation, captured widespread attention at that event
and initiated a series of projects and conferences in the domain. Following that, the SpeechTrans system (Tomita et al., 1990), developed in 1988, marked another significant step in speech translation, consolidating the field within computer science. In 1992, the
German Federal Ministry of Education and Research launched a project called Verbmobil,
which developed a prototype portable translation system for the language pair German–
English. During the implementation of Verbmobil, a great number of scientific publications
and several spin-off products were generated. Start-up companies were established by the
participating universities and research centres (Wahlster, 2000). The project was character-
ised by its pioneering nature, and its legacy continues to resonate today.
Over the next two decades, especially after the Consortium for Speech Translation
Advanced Research was founded in 1991, various speech translation systems were cre-
ated, ranging from restricted domain and vocabulary to open-domain translation systems
(Waibel et al., 1991; Fügen et al., 2006). To further advance speech translation systems,
the International Workshop on Spoken Language Translation (IWSLT) was founded in
2004. In July 2010, the National Institute of Information and Communications Technol-
ogy (NICT) in Japan launched the world’s inaugural field experiment of a network-based
multilingual speech-to-speech translation system using a smartphone application.
In 2019, the European Parliament, a significant institution utilising a multilingual regime
with 24 official languages, initiated an innovation project in the space of speech translation.4
While this project was not related to MI but rather to speech-to-text translation (the goal
was to improve the accessibility of plenary meetings, enabling deaf and hard-of-hearing
individuals to follow debates in real time), the initiative represents a crucial milestone in the
large-scale implementation of speech translation technologies.
More recently, MI has begun to be used in real-life scenarios, such as conferences, work-
shops, town halls, etc.5 While these systems were initially designed only for consecutive
interpretation, recent developments have also introduced systems capable of near-simultaneous interpretation.
Speech-to-speech translation is a challenging research problem. From a technological per-
spective, the evolution of speech technology has followed a pattern similar to that of machine
translation, moving from simple, rule-based and statistical approaches to more complex,
deep learning paradigms.6 To simplify the task and make it somewhat more feasible, the first
research projects addressing this topic in the 1990s worked on restricted domains, such as
scheduling appointments or placing orders, and moved only later to free and spontaneous speech translation. Initially, speech translation systems were based on rule-based or statistical approaches and therefore carried all the limitations that came with them, including the inability to translate idiomatic expressions, to take context into account, or to produce fluent output, to name just a few. After Google announced its Google Neural Machine Translation system (Wu et al., 2016) in September 2016, a significant paradigm shift from statistical approaches to deep learning and neural machine translation took place, paving the way for new speech translation systems with increased accuracy. The growing availability of specialised datasets, such as Europarl-ST (Iranzo-Sànchez et al., 2020) or TEDx (Salesky et al., 2021), has further intensified the community's efforts to create dedicated models for speech translation tasks.
At the time of writing, commercial systems are based on the cascading approach, that is, they combine several components, such as speech recognition and
machine translation (see Section 12.4.1). However, significant strides have been made
in developing and embracing end-to-end systems, that is, systems capable of translating
from one spoken language to another directly, without relying on an intermediate text
representation (Jia et al., 2019b). In the future, the end-to-end paradigm is expected to simplify architectures and bring about systems that further improve the translation experience, allowing, for example, for expressiveness and the retention of features of the original speech (Barrault et al., 2023).
Recently, much interest has also been devoted to reducing the latency of cascading and
direct systems for a range of reasons, including supporting real-time and near-simultaneous translation (Sudoh et al., 2020), extending coverage of dialects and low-resource languages,
and adapting register (Agarwal et al., 2023).
systems, the overall goal of the application is divided into multiple tasks, with each task
assigned to a specialised module. The simplest cascading configuration involves automatic
speech recognition (ASR), to convert speech into written text; neural machine translation
(NMT), to translate the written text from one language to another; and voice synthesis or
text-to-speech (TTS), to convert the written translation into an oral form (Figure 12.1 – see
also Davitti, this volume).
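To make the division of labour concrete, the following minimal Python sketch chains the three components just mentioned; the ASR, MT, and TTS functions are placeholders standing in for real engines, not any particular vendor's API.

```python
# A minimal sketch of the simplest cascading configuration (ASR -> MT -> TTS).
# The three component functions are placeholders for real engines.
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadingPipeline:
    asr: Callable[[bytes], str]   # speech audio -> source-language text
    mt: Callable[[str], str]      # source-language text -> target-language text
    tts: Callable[[str], bytes]   # target-language text -> synthesised speech

    def translate(self, audio: bytes) -> bytes:
        source_text = self.asr(audio)       # automatic speech recognition
        target_text = self.mt(source_text)  # machine translation
        return self.tts(target_text)        # text-to-speech synthesis


# Dummy components, for illustration only; a real deployment would plug in
# actual ASR, MT, and TTS services here.
pipeline = CascadingPipeline(
    asr=lambda audio: "good morning everyone",
    mt=lambda text: "buenos días a todos",
    tts=lambda text: text.encode("utf-8"),
)
print(pipeline.translate(b"...raw audio bytes..."))
```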
Configurations of cascading systems can vary considerably, depending on technologi-
cal advancements, design requirements, and complexity of the system. Recently, more
sophisticated applications have emerged. These may include, for example, dedicated components for achieving low latency and continuous translation, or components that adapt translation strategies to the translation goals, such as speed adaptation, speech correction or normalisation, and language simplification (for translation into plain or simplified language).
Generally speaking, cascading systems have the advantage of being able to leverage
robust technologies with years of development behind them. The robustness of the individual components is also supported by the availability of large training sets (Sperber and Paulik, 2020). However, cascading systems face two main challenges. The first is the considerable complexity of training and maintaining composite pipelines. The second is error propagation from one component to the next, that is, errors made in one stage of the process are carried forward and potentially amplified in subsequent stages. Notwithstanding the possibility of correcting errors, for example transcription errors (Martucci et al., 2021; Macháček et al., 2023b), each of these stages can introduce new errors. Since
each subsequent stage relies on the output of the previous one, errors can accumulate and
worsen as the process continues.
Simultaneous machine interpreting is the most intricate form of speech translation, as
it adds a new significant temporal constraint. Systems need to translate an ongoing stream
of speech incrementally, without interruptions and without complete knowledge of what
the speaker will say moments later. To accomplish this, speech must be segmented into
meaningful chunks in real time (Waibel et al., 2012). The goal of a segmentation module is
to achieve a system that balances translation accuracy and latency. Latency can be broadly
defined as the time (in seconds) between the speaker uttering a word and the moment
the translation engine delivers the same word. Achieving lower latency, which implies less
context for the engine, can pose challenges in producing accurate translations. Segmenta-
tion methods range from detecting pauses in the speaker’s flow and employing fixed word
lengths to utilising dynamic approaches based on real-time syntactic and semantic analysis
of incoming speech.
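As an illustration of the simplest of these strategies, the sketch below segments a stream of time-stamped words whenever the silence between two words exceeds a threshold; the input format and the threshold value are assumptions made for illustration rather than features of any specific system.

```python
# Sketch of a pause-based segmenter: words arrive with start/end timestamps
# (as an ASR engine might provide) and a chunk is emitted whenever the silence
# between two words exceeds a threshold. Format and threshold are illustrative.
PAUSE_THRESHOLD = 0.5  # seconds of silence that triggers a chunk boundary


def pause_based_chunks(timed_words):
    chunk, prev_end = [], None
    for word, start, end in timed_words:
        if chunk and start - prev_end > PAUSE_THRESHOLD:
            yield chunk
            chunk = []
        chunk.append(word)
        prev_end = end
    if chunk:
        yield chunk


stream = [("good", 0.0, 0.3), ("morning", 0.35, 0.8),   # long pause follows
          ("let's", 1.6, 1.9), ("begin", 1.95, 2.4)]
for c in pause_based_chunks(stream):
    print(" ".join(c))
# Latency, in the sense used above, would be the difference between a word's
# start time and the moment its translation is actually delivered.
```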
In Figure 12.2, a simultaneous model is added to the pipeline. This model allows the pipeline to achieve near-simultaneous speech translation. Generally speaking, this model aims at segmenting the incoming stream of text into chunks, for example, based on
units of meaning. These units of meaning can then be passed to the MT engine, and there-
after to the TTS model (Kumar et al., 2013; Wang et al., 1999; Ma et al., 2019), such that the translation, once spoken, is meaningful and coherent.
Pipelines in translation processes are not inherently restricted to a unidirectional flow.
Depending on the chosen architecture, feedback loops can be integrated to enhance con-
trol over the translation process. Contrary to the unidirectional approach illustrated in
Figure 12.3 (which necessitates segmenting the input stream into chunks that resemble
semantically coherent sentences and thus incurs a latency at least as long as the duration of these sentences), the incorporation of a feedback loop can reduce latency. By reintroduc-
ing the already-translated portion of a sentence back into the system, the MT engine is
prompted to complete the translation of the new segment while still considering the context
of the previously translated part. This strategy not only achieves shorter latency but also
improves coherence and cohesion in the translated output.
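A rough sketch of such a feedback loop is given below, assuming a hypothetical translate_with_prefix function; real systems would implement this step through prefix-constrained decoding or by re-prompting the MT engine with the target-language prefix.

```python
# Sketch of the feedback loop described above: each new source chunk is
# translated together with the target-language prefix produced so far, so the
# engine can continue the sentence instead of starting afresh.
# `translate_with_prefix` is a hypothetical interface.
def translate_with_prefix(source_chunk: str, target_prefix: str) -> str:
    # Placeholder: a real engine would condition its decoder on target_prefix
    # (prefix-constrained decoding) or receive it as part of a prompt.
    return f"<translation of '{source_chunk}' continuing '{target_prefix}'>"


def incremental_translation(source_chunks):
    target_so_far = ""
    for chunk in source_chunks:
        continuation = translate_with_prefix(chunk, target_so_far)
        target_so_far = (target_so_far + " " + continuation).strip()
        yield target_so_far  # emitted before the full sentence has been heard


for partial in incremental_translation(["Ladies and gentlemen,",
                                         "welcome to this meeting."]):
    print(partial)
```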
As previously mentioned, tasks and components are not rigidly defined. For example, the
tasks of ASR and MT can be unified in a single component performing direct speech-to-text
translation (Berard et al., 2016; Papi et al., 2023; Sethiya and Maurya, 2023), that is, a
model that is able to directly convert speech signals in one language into text in another language. The translation can then be passed directly to a TTS system for audio generation.
Several additional components can be integrated into a cascading pipeline, such as text
normalisation (Fügen, 2008), language identification (Singh et al., 2021), suppression of
disfluencies (Fitzgerald et al., 2009), prosody transfer (Kano et al., 2018), speaker diarisa-
tion (Yang et al., 2023), and so forth.
Recently, translation has been performed not only by neural machine translation engines
but also by large language models, such as GPT.7 This enables a more comprehensive
approach to the translation task, leveraging the capabilities of generative AI.
For instance, LLMs can incorporate contextual understanding for translation through
methods such as in-context learning (Moslem et al., 2023) or even integrate automatic
quality estimation to enhance real-time translation (cf. Kocmi and Federmann, 2023; Wang
and Fantinuoli, 2024).
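By way of illustration, the following sketch shows how an LLM might be prompted with preceding source and target pairs so that a new chunk is translated consistently with what has already been said; the prompt wording and the call_llm placeholder are assumptions for illustration, not a description of any published system.

```python
# Sketch of context-aware translation with an LLM: previously translated
# source/target pairs are placed in the prompt so that the new chunk is
# rendered consistently with what has already been said. `call_llm` is a
# placeholder for any chat-style LLM client; the prompt wording is illustrative.
def build_prompt(context_pairs, new_chunk, src="English", tgt="Spanish"):
    examples = "\n".join(f"{src}: {s}\n{tgt}: {t}" for s, t in context_pairs)
    return (f"You are interpreting a live speech from {src} into {tgt}. "
            f"Continue the translation consistently with the context below.\n\n"
            f"{examples}\n{src}: {new_chunk}\n{tgt}:")


def call_llm(prompt: str) -> str:
    return "<model output>"  # placeholder for a real API call


context = [("Good morning, colleagues.", "Buenos días, colegas.")]
print(call_llm(build_prompt(context, "Let us turn to the first agenda item.")))
```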
in another language without generating intermediate text representations (Lee et al., 2022).
Prominent projects are Translatotron8 by Google and SeamlessM4T9 by Meta.
While cascading systems are unable to preserve para-/non-linguistic speech characteris-
tics that are central to human communication, such as prosody, tone of voice, pauses, and
emphasis, the end-to-end approach promises to maintain such traits also in translation.
In fact, end-to-end models learn to map features of language in a holistic way. They map
spoken language, comprising all the aforementioned features, to translations. End-to-end
models are then able to reproduce these features, at least partially (Barrault et al., 2023).
Initially, end-to-end models primarily used supervised machine learning techniques
that rely on bilingual speech datasets (Jia et al., 2019a, 2019b, 2021). This approach has
two limitations. On the one hand, collecting bilingual datasets, especially but not only
for low-resource languages, is difficult. On the other hand, the lack of bilingual data with
corresponding non-linguistic characteristics in both source and target languages makes it
impossible to transfer such features to the translated speech. More recently, attempts have
been made to overcome this limitation, using unsupervised machine learning with monolin-
gual datasets (Nachmani et al., 2023).
While the end-to-end approach was initially applied only to consecutive translation,
the first systems have also been proposed for the simultaneous modality (Barrault et al., 2023).
12.5 Challenges
MI encounters various challenges, drawing from the complexities inherent in speech trans-
lation (Macháček et al., 2023b) and adding unique difficulties due to its need for imme-
diacy. These challenges broadly fall into the following categories: linguistic, cultural and
communicative, and technological.
• Disfluencies. Natural speech often contains filler words and sounds, such as ‘um’, ‘uh’,
‘you know’, and ‘like’.
• Poor enunciation. Mumbled or slurred words are difficult to interpret accurately.
• Features of spoken grammar. People often speak in sentences that would not be gram-
matically or syntactically correct in written language: changing direction mid-sentence,
not completing sentences, or using non-standard grammar.
• Proper nouns. Identifying and transcribing names (in cascading systems) or maintaining
the correct pronunciation of names (in direct systems) is a big challenge for machines.
• Simultaneity. The need to process incoming speech while it is unfolding is a further
challenge.
The difficulty in tackling these challenges is exacerbated in the case of low-resource lan-
guages, particularly Indigenous and minority languages. The same issue applies to many
accent variations that are not covered in the training data. Therefore, providing compre-
hensive language coverage continues to be a significant challenge. Commercial MI systems,
such as those offered by Interprefy10 or KUDO,11 offer support for 80 and 38 languages,
respectively, at the time of writing. Variations in quality might be significant, not only
among vendors, but also between languages. Addressing these challenges requires ongoing
effort, resources, sophisticated algorithms, and extensive linguistic knowledge. It should
be recognised that the number of languages supported in machine translation is constantly
increasing, and this trend is expected to be reflected in commercial systems for MI as well.12
12.6 Evaluation
Evaluating the quality of MI is an important activity. Evaluations yield insights that are
crucial for various stakeholders, including developers, users, buyers, certifying agencies,
and more (Han, 2022).
Quality in interpreting is a nuanced and variable concept heavily influenced by the spe-
cific needs and idiosyncrasies of end users or service purchasers (see Davitti, Korybski,
Orăsan and Braun, this volume). It becomes paramount in high-stakes scenarios, where the
accuracy and subtlety of translation can have significant implications. The notion of qual-
ity, therefore, is not uniformly defined and varies according to the context of use.
The task of assessing the quality of translated content is multifaceted and complex. Meas-
uring quality is inherently challenging due to the somewhat-intangible nature of spoken
language translation (Garcia Becerra, 2016a, 7). Quality perceptions can differ significantly
among users, adding a layer of subjectivity to what is considered a correct and high-quality
translation. Quality standards vary depending on the interpreting context. For example, in
human conference interpreting, the focus is on the interpreter’s output, including content,
language, and delivery. While these features remain important in public service settings, such as social and healthcare interpreting, other skills, such as interactional skills and discourse management, also become relevant (Kalina, 2012). There seems to be consensus, however, that agreement on what constitutes an acceptable translation is poor (Zhang, 2016).
Translation quality can be evaluated manually or automatically. Human evaluations pro-
vide a comprehensive view of quality by considering various aspects of communication, offer-
ing a deep understanding of the interpreting performance, as indicated by interpreting scholars
(Pöchhacker, 2002; Garcia Becerra, 2016b). However, manual evaluations are labour-intensive,
time-consuming, and expensive (Wu, 2011). In automatic speech translation, manual evalu-
ation has been used to assess both accuracy and fluency (cf. Fantinuoli and Dastyar, 2022).
Automated or semi-automated metrics have been proposed as an alternative to manual eval-
uation in order to simplify and speed up this process. Very few studies have applied automated
evaluation to human interpreting, as in the case of semantic similarity proposed by Zhang
(2016). A larger number of studies have applied automated evaluation to speech-to-text trans-
lation. These use statistical metrics, such as BLEU (Papineni et al., 2002), BERTScore (Zhang
et al., 2020), and chrF2 (Popović, 2017), both to monitor quality evolution from a developer
perspective and to compare several systems, for example, during international evaluation cam-
paigns (Agarwal et al., 2023). Systems are often evaluated in terms of their ability to produce
translations that are similar to target-language references. More recently, reference-free evaluations have been proposed, leveraging multilingual sentence embeddings or other
techniques that attempt to capture meaning (cf. Wang and Fantinuoli, 2024).
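For readers unfamiliar with these metrics, the snippet below shows how reference-based scores of this kind can be computed, assuming the third-party sacrebleu package is installed; the example sentences are invented.

```python
# Reference-based scoring with sacrebleu (assumed installed). The hypothesis is
# the system's translation output; the reference is a human translation of the
# same source segment. Sentences here are invented for illustration.
import sacrebleu

hypotheses = ["the meeting will start in five minutes"]
references = [["the meeting starts in five minutes"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}  chrF: {chrf.score:.1f}")
```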
When applying automatic metrics, a key consideration for judging their reliability is how well they correlate with human judgment. Some scholars have concluded that, given the current quality levels of the systems, simple automatic metrics, such as COMET, can be used as proxies for quality estimation (Macháček et al., 2023a). Semantic vectors and large language models, which can directly compare the original with the translation, have also shown promise in inferring quality that relates more closely to human judgments (Kocmi and Federmann, 2023; Han and Lu, 2021).
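Such correlation checks are straightforward to reproduce on a small scale; the toy example below compares invented segment-level metric scores with invented human ratings using Pearson and Spearman coefficients (scipy is assumed to be installed).

```python
# Toy correlation check between automatic metric scores and human judgments
# for the same outputs; the numbers are invented and scipy is assumed installed.
from scipy.stats import pearsonr, spearmanr

metric_scores = [0.62, 0.71, 0.55, 0.80, 0.68]   # e.g. segment-level metric scores
human_ratings = [3.5, 4.0, 2.5, 4.5, 3.0]        # e.g. adequacy ratings on a 1-5 scale

pearson_r, _ = pearsonr(metric_scores, human_ratings)
spearman_rho, _ = spearmanr(metric_scores, human_ratings)
print(f"Pearson r: {pearson_r:.2f}  Spearman rho: {spearman_rho:.2f}")
```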
• Overuse. This occurs when MI systems are deployed without a genuine need, leading to
unnecessary resource expenditure. The economic and environmental costs are significant,
considering the substantial energy and computational resources required to run these sys-
tems. Overuse not only leads to wasteful expenditure but also potentially contributes to
environmental degradation due to the high energy demands of current ML technologies.
• Misuse. This scenario arises when the technology is employed in situations where it may
cause harm. The accuracy and appropriateness of MI systems vary based on several fac-
tors, including language pairs, context of the communication, cultural nuances, techni-
cal limitations, and expectations. Using MI in sensitive or high-stakes environments, such
as legal proceedings or medical consultations, without adequate safeguards can lead to
misinterpretations with serious consequences. Such misuse is not only unethical but also
potentially harmful and should be regulated to prevent adverse outcomes.
• Underuse. This refers to situations where MI is not utilised despite its potential to sig-
nificantly enhance communication and accessibility. Underuse is considered unethi-
cal as it denies the benefits of reduced language barriers to those who could otherwise
stand to gain from them. It is also economically imprudent, as the technology offers a
cost-effective solution for increasing accessibility. The underuse of MI can stem from a
lack of awareness, technological limitations, or resistance to change. Addressing these
barriers is crucial for maximising the technology’s positive impact.
The preceding categorisation serves as a general framework to guide the responsible adop-
tion of this technology. However, many factors remain to be defined in practical terms.
Such factors include deciding who is in charge of defining 'critical use', or how to define metrics of acceptable translation performance. From a technical and legal standpoint, MI
systems present several critical aspects that necessitate responsible management and regula-
tion. Key areas of concern include:
• Confidentiality. MI applications are typically cloud-based and face data breach risks. Ensuring confidentiality requires robust encryption, secure storage, and strict access controls. Companies should adopt certifications such as ISO 27001 and SOC 2.
• Data ownership and privacy. MI systems process sensitive data, which raises ownership
and privacy concerns. Clear policies and compliance with regulations like GDPR are
crucial for safeguarding user data.
• Appropriate use. MI system effectiveness varies by language pair, situation complexity,
and cultural nuances. Users need guidelines to ensure they avoid misuse in critical con-
texts. Certifications can ensure reliability in regulated areas, like courts or healthcare.
• Liability. Accountability for translation errors requires clarity. Balanced regulations are
needed to ensure quality and innovation without eroding trust or stifling development.
• Ethical AI and bias mitigation. MI systems must address biases to prevent stereotypes or
discrimination. Regular audits and updates are key to ensuring ethical AI practices.
(Becker et al., 2019). In a similar vein, unrestricted access to information through machine
interpretation could yield both advantages and disadvantages.
On the positive side, MI offers the potential for enhanced and autonomous dissemination
of information and knowledge. While the exclusive provision of services by professionals offers numerous advantages, including the assurance of expertise and the high-quality standards that professionals can deliver, only machines have the potential to make such accessibility available to everyone (Susskind and Susskind, 2017).14
However, the ubiquity of spoken language translation also carries the risk of further rad-
icalisation and polarisation of ideological positions. AI creates the illusion that all content
can and should be made accessible in every language and culture. Yet not all content created by individuals can be meaningfully translated without proper contextualisation
of cultural, historical, and sociological aspects. Some content is deeply rooted in specific
cultures or subcultures and only holds significance within that specific context. This makes
translation – if not culturally mediated or contextualised – futile or even counterproductive.
In such contexts, MI is bound to fall short or be executed poorly, potentially exacerbating
misunderstandings and polarisation.
data, and innovative machine learning frameworks capable of yielding quality results with
reduced training data requirements (Sperber and Paulik, 2020). The future of speech trans-
lation seems to be rapidly shifting from cascading systems to end-to-end systems. Recent
developments show that end-to-end systems can produce translations while preserving
the various features of speech. End-to-end models focus on often-overlooked elements of
prosody, such as speech rate and pauses. They also retain the emotion and style of the
original speech (Barrault et al., 2023; Jia et al., 2021; Nachmani et al., 2023). Not only can
end-to-end models be a useful solution for dealing with languages that do not have a formal
writing system (Duong et al., 2016), but they should also enable more effective streaming
techniques and better source language segmentation approaches (see Section 12.4.1) that
mimic the behaviour of human interpreters. Therefore, use of these models could lead to
both a potential reduction in latency and a simplification of deployment and maintenance.
Visual analysis could further ground the translation process in the communicative context.
Live communication typically encompasses both verbal and non-verbal elements, tailored
to the situational demands and communicative goals of the participants (Lala et al., 2019;
Sulubacak et al., 2019). This aspect is equally vital in multilingual communication and MI.
Emerging vision systems, when combined with LLMs, demonstrate remarkable proficiency
in image analysis, with live video analysis on the horizon. These systems are now capable
of converting visual data into what might be termed ‘situational meta-information’ – essen-
tially, information about what is happening in the communicative context: who is involved,
what the features of the setting are, etc. Leveraging this data may significantly enrich the
translation process, leading to enhanced accuracy and nuance.
In applications outside the simultaneous modality, virtual avatars might influence the
perception of artificial interpretation further, for example, in dialogic contexts, where the
embodiment of an interpreter seems to play a central role (Li et al., 2023).
12.9 Conclusions
This chapter has provided a thorough examination of MI, highlighting its significant evolu-
tion from early experimental stages to its current applications. Despite remarkable techno-
logical advancements, this technology continues to face complex challenges, particularly in
accurately capturing the nuances of spoken language and contextual subtleties. There are
also methodological challenges to tackle, such as the need to define shared quality criteria
that should help diverse stakeholders evaluate the fit of the technology for their use cases.
The chapter also underscores the importance of addressing ethical considerations in the
deployment of live speech translation technologies, emphasising the need for responsible
use to maximise benefits while minimising potential risks.
The future of MI looks promising, with potential advancements in computational power
and machine learning techniques, which could enhance its efficiency and accuracy. Overall,
MI stands at a pivotal point. It has the capacity to significantly impact global communica-
tion, provided its development and application are managed with careful consideration of
its technological, evaluative, and ethical dimensions.
Notes
1 https://iwslt.org (accessed 10.9.2024).
2 https://www2.nict.go.jp/astrec-att/workshop/IWSLT2004/archives/000196.html (accessed 10.9.2024).
3 For an in-depth analysis of this terminology, please refer to Pöchhacker (2024).
References
Abarca, V.M.G., Palos-Sanchez, P.R., Rus-Arias, E., 2020. Working in Virtual Teams: A Systematic
Literature Review and a Bibliometric Analysis. IEEE Access 8, 168923–168940.
Agarwal, M., Agrawal, S., Anastasopoulos, A., Bentivogli, L., Bojar, O., Borg, C., Carpuat, M., Cattoni,
R., Cettolo, M., Chen, M., Chen, W., Choukri, K., Chronopoulou, A., Currey, A., Declerck, T., Dong,
Q., Duh, K., Estève, Y., Federico, M., Gahbiche, S., Haddow, B., Hsu, B., Mon Htut, P., Inaguma, H.,
Javorský, D., Judge, J., Kano, Y., Ko, T., Kumar, R., Li, P., Ma, X., Mathur, P., Matusov, E., McNamee,
P., McCrae, J.P., Murray, K., Nadejde, M., Nakamura, S., Negri, M., Nguyen, H., Niehues, J., Niu, X.,
Ojha, A.Kr., Ortega, J.E., Pal, P., Pino, J., van der Plas, L., Polák, P., Rippeth, E., Salesky, E., Shi, J.,
Sperber, M., Stüker, S., Sudoh, K., Tang, Y., Thompson, B., Tran, K., Turchi, M., Waibel, A., Wang, M.,
Watanabe, S., Zevallos, R., 2023. Findings of the IWSLT 2023 Evaluation Campaign. In Salesky, E.,
Federico, M., Carpuat, M., eds. Proceedings of the 20th International Conference on Spoken Language
Translation (IWSLT 2023), Association for Computational Linguistics, pp. 1–61.
Anastasopoulos, A., Barrault, L., Bentivogli, L., Zanon Boito, M., Bojar, O., Cattoni, R., Currey, A., Dinu,
G., Duh, K., Elbayad, M., Emmanuel, C., Estève, Y., Federico, M., Federmann, C., Gahbiche, S., Gong,
H., Grundkiewicz, R., Haddow, B., Hsu, B., Javorský, D., Kloudová, V., Lakew, S., Ma, X., Mathur,
P., McNamee, P., Murray, K., Nǎdejde, M., Nakamura, S., Negri, M., Niehues, J., Niu, X., Ortega, J.,
Pino, J., Salesky, E., Shi, J., Sperber, M., Stüker, S., Sudoh, K., Turchi, M., Virkar, Y., Waibel, A., Wang,
C., Watanabe, S., 2022. Findings of the IWSLT 2022 Evaluation Campaign. In Proceedings of the 19th
International Conference on Spoken Language Translation (IWSLT 2022), 98–157.
Barrault, L., Chung, Y.-A., Meglioli, M.C., Dale, D., Dong, N., Duquenne, P.-A., Elsahar, H., Gong,
H., Heffernan, K., Hoffman, J., Klaiber, C., Li, P., Licht, D., Maillard, J., Rakotoarison, A., Ram
Sadagopan, K., Wenzek, G., Ye, E., Akula, B., Chen, P.-J., El Hachem, N., Ellis, B., Mejia Gon-
zalez, G., Haaheim, J., Hansanti, P., Howes, R., Huang, B., Hwang, M.-J., Inaguma, H., Jain,
S., Kalbassi, E., Kallet, A., Kulikov, I., Lam, J., Li, D., Ma, X., Mavlyutov, R., Peloquin, M.,
Ramadan, M., Ramakrishnan, A., Sun, A., Tran, K., Tran, T., Tufanov, I., Vogeti, V., Wood, C.,
Yang, Y., Yu, B., Andrews, P., Balioglu, C., Costa-jussà, M.R., Celebi, O., Elbayad, M., Gao, C.,
Guzmán, F., Kao, J., Lee, A., Mourachko, A., Pino, J., Popuri, S., Ropers, C., Saleem, S., Schwenk,
H., Tomasello, P., Wang, C., Wang, J., Wang, S., 2023. SeamlessM4T: Massively Multilingual &
Multimodal Machine Translation. URL https://arxiv.org/abs/2308.11596
Becker, J., Porter, E., Centola, D., 2019. The Wisdom of Partisan Crowds. Proceedings of the National
Academy of Sciences 116(22), 10717–10722.
Bender, E.M., Koller, A., 2020. Climbing Towards NLU: On Meaning, Form, and Understanding in
the Age of Data. In Proceedings of the 58th Annual Meeting of the Association for Computational
Linguistics. Association for Computational Linguistics, 5185–5198.
Berard, A., Pietquin, O., Servan, C., Besacier, L., 2016. Listen and Translate: A Proof of Concept for
End-to-End Speech-to-Text Translation. NIPS Workshop on End-to-End Learning for Speech and
Audio Processing, December, Barcelona, Spain.
Cath, C., 2018. Governing Artificial Intelligence: Ethical, Legal and Technical Opportunities and
Challenges. Philosophical Transactions of the Royal Society A: Mathematical, Physical and
Engineering Sciences 37(2133).
Chang, C.-C., Chuang, S.-P., Lee, H.-Y., 2022. Anticipation-Free Training for Simultaneous Machine
Translation. In Proceedings of the 19th International Conference on Spoken Language Translation
(IWSLT 2022). Association for Computational Linguistics, Dublin, Ireland, 43–61.
Chang, Y., Wang, X., Wang, J., Wu, Y., Yang, L., Zhu, K., Chen, H., Yi, X., Wang, C., Wang, Y.,
Ye, W., Zhang, Y., Chang, Y., Yu, P.S., Yang, Q., Xie, X., 2023. A Survey on Evaluation of Large
Language Models. ACM Transactions on Intelligent Systems and Technology 15(3), 1–4.
Cogo, A., 2015. English as a Lingua Franca: Descriptions, Domains and Applications. In Bowles,
H., Cogo, A., eds. International Perspectives on English as a Lingua Franca: Pedagogical
Insights, International Perspectives on English Language Teaching. Palgrave Macmillan UK,
London, 1–12.
Ding, H., Hargraves, L., 2009. Stress-Associated Poor Health Among Adult Immigrants with a Lan-
guage Barrier in the United States. Journal of Immigrant and Minority Health 11(6), 446–452.
Duong, L., Anastasopoulos, A., Chiang, D., Bird, S., Cohn, T., 2016. An Attentional Model for
Speech Translation Without Transcription. In Proceedings of the 2016 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language Technolo-
gies, Association for Computational Linguistics, San Diego, CA, 949–959.
Fantinuoli, C., 2023. The Emergence of Machine Interpreting. European Society for Translation Stud-
ies 62, 10.
Fantinuoli, C., Dastyar, V., 2022. Interpreting and the Emerging Augmented Paradigm. Interpreting
and Society 2(2), 185–194.
Fettes, M., 1997. Esperanto and Language Awareness. In Van Lier, L., Corson, D., eds. Encyclopedia
of Language and Education: Knowledge About Language, Encyclopedia of Language and Educa-
tion. Springer Netherlands, Dordrecht, The Netherlands, 151–159.
Fitzgerald, E., Hall, K., Jelinek, F., 2009. Reconstructing False Start Errors in Spontaneous Speech
Text. In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009).
Association for Computational Linguistics, 255–263.
Floridi, L., 2021. The End of an Era: From Self-Regulation to Hard Law for the Digital Industry.
Springer, Rochester, NY.
Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., Luetge, C., Madelin, R., Pagallo,
U., Rossi, F., Schafer, B., Valcke, P., Vayena, E., 2018. AI4People – an Ethical Framework for a Good AI
Society: Opportunities, Risks, Principles, and Recommendations. Minds and Machines 28(4), 689–707.
Fügen, C., 2008. A System for Simultaneous Translation of Lectures and Speeches (Unpublished PhD
thesis). University of Karlsruhe.
Fügen, C., Kolss, M., Bernreuther, D., Paulik, M., Stuker, S., Vogel, S., Waibel, A., 2006. Open
Domain Speech Recognition & Translation: Lectures and Speeches. In 2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, vol. 1.
Gaido, M., Papi, S., Fucci, D., Fiameni, G., Negri, M., Turchi, M., 2022. Efficient Yet Competitive
Speech Translation: FBK@IWSLT2022. In Proceedings of the 19th International Conference on
Spoken Language Translation (IWSLT 2022). Association for Computational Linguistics, Dublin,
Ireland, 177–189 (in-person and online), May.
Garcia Becerra, O., 2016a. Do First Impressions Matter? The Effect of First Impressions on the Assess-
ment of the Quality of Simultaneous Interpreting. Across Languages and Cultures 17(1), 77–98.
Garcia Becerra, O., 2016b. Survey Research on Quality Expectations in Interpreting: The Effect of
Method of Administration on Subjects’ Response Rate. Meta 60.
Han, C., 2022. Interpreting Testing and Assessment: A State-of-the-Art Review. Language Testing
39(1), 30–55.
Han, C., Lu, X., 2021. Interpreting Quality Assessment Re-Imagined: The Synergy Between Human
and Machine Scoring. Interpreting and Society 1(1), 70–90.
Hartshorne, J.K., Tenenbaum, J.B., Pinker, S., 2018. A Critical Period for Second Language Acquisi-
tion: Evidence from 2/3 Million English Speakers. Cognition 177, 263–277.
Hu, C., Tian, Q., Li, T., Wang, Y., Wang, Y., Zhao, H., 2021. Neural Dubber: Dubbing for Videos
According to Scripts. Proceedings of the 35th Conference on Neural Information Processing Sys-
tems (NeurIPS 2021).
Iranzo-Sànchez, J., Silvestre-Cerdà, J.A., Rosello, N., Sanchis, A., Civera, J., Juan, A. 2020.
Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates. Proceed-
ings of the ICASSP2020 Conference.
Jaeger, F.N., Pellaud, N., Laville, B., Klauser, P., 2019. Barriers to and Solutions for Addressing
Insufficient Professional Interpreter Use in Primary Healthcare. BMC Health Services Research
19(1), 753.
Jia, Y., Johnson, M., Macherey, W., Weiss, R.J., Cao, Y., Chiu, C.-C., Biadsy, F., Chen, Z., Wu, Y., 2019a. Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation. URL https://doi.org/10.48550/arXiv.1904.06037
Jia, Y., Ramanovich, M. T., Remez, T., Pomerantz, R., 2021. Translatotron 2: Robust Direct
Speech-to-Speech Translation. URL https://arxiv.org/abs/2107.08661
Jia, Y., Weiss, R.J., Biadsy, F., Macherey, W., Johnson, M., Chen, Z., Wu, Y., 2019b. Direct
Speech-to-Speech Translation with a Sequence-to-Sequence Model. URL https://doi.org/10.48550/
arXiv.1904.06037
Kade, O., 1968. Zufall und Gesetzmäßigkeit in der Übersetzung. Verlag Enzyklopädie Edition,
Leipzig.
Kalina, S., 2012. Quality in Interpreting. In Handbook of Translation Studies Online, vol. 3. John
Benjamins Publishing Company, 134–140.
Kano, T., Takamichi, S., Sakti, S., Neubig, G., Toda, T., Nakamura, S., 2018. An End-to-End Model for
Crosslingual Transformation of Paralinguistic Information. Machine Translation 32(4), 353–368.
Karpinska, M., Iyyer, M., 2023. Large Language Models Effectively Leverage Document-Level Context
for Literary Translation, but Critical Errors Persist. URL https://doi.org/10.48550/arXiv.2304.03245
Kato, Y., 1995. The Future of Voice-Processing Technology in the World of Computers and Commu-
nications. Proceedings of the National Academy of Sciences 92(22), 10060–10063.
Kidawara, Y., Sumita, E., Kawai, H., eds., 2020. Speech-to-Speech Translation. SpringerBriefs in
Computer Science. Springer, Singapore.
Ko, Y., Fukuda, R., Nishikawa, Y., Kano, Y., Sudoh, K., Nakamura, S., 2023. Tagged End-to-End
Simultaneous Speech Translation Training Using Simultaneous Interpretation Data. In Proceedings
of the 20th International Conference on Spoken Language Translation (IWSLT 2023). Association
for Computational Linguistics, Toronto, Canada, 363–375.
Kocmi, T., Federmann, C., 2023. Large Language Models Are State-of-the-Art Evaluators of Transla-
tion Quality. URL https://doi.org/10.48550/arXiv.2302.14520
Korybski, T., Davitti, E., Orăsan, C., Braun, S., 2022. A Semi-Automated Live Interlingual Com-
munication Workflow Featuring Intralingual Respeaking: Evaluation and Benchmarking. In Pro-
ceedings of the Thirteenth Language Resources and Evaluation Conference. European Language
Resources Association, Marseille, France, 4405–4413.
Kozuh, I., Debevc, M., 2018. Challenges in Social Media Use Among Deaf and Hard of Hearing People.
In Dey, N., Babo, R., Ashour, A.S., Bhatnagar, V., Salim Bouhlel, M., eds. Social Networks Science:
Design, Implementation, Security, and Challenges. Springer International Publishing, Cham, 151–171.
Kumar, V., Sridhar, R., Chen, J., Bangalore, S., Ljolje, A., Chengalvarayan, R., 2013. Segmentation
Strategies for Streaming Speech Translation. In Proceedings of the 2013 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language Technolo-
gies. Association for Computational Linguistics, 230–238.
Lala, C., Madhyastha, P., Specia, L., 2019. Grounded Word Sense Translation. In Bernardi, R., Fer-
nandez, R., Gella, S., Kafle, K., Kanan, C., Lee, S., Nabi, M., eds. Proceedings of the Second
Workshop on Shortcomings in Vision and Language. Association for Computational Linguistics,
Minneapolis, MN, 78–85.
Lee, A., Gong, H., Duquenne, P.-A., Schwenk, H., Chen, P.-J., Wang, C., Popuri, S., Adi, Y., Pino, J.,
Gu, J., Hsu, W.-N., 2022. Textless Speech-to-Speech Translation on Real Data. URL https://doi.
org/10.48550/arXiv.2112.08352
Li, R., Liu, K., Cheung, A.K.F., 2023. Interpreter Visibility in Press Conferences: A Multimodal Con-
versation Analysis of Speaker – Interpreter Interactions. Humanities and Social Sciences Commu-
nications 10(1), 1–12.
Liebling, D., Heller, K., Robertson, S., Deng, W., 2022. Opportunities for Human-Centered Evalua-
tion of Machine Translation Systems. In Carpuat, M., de Marneffe, M.-C., Meza Ruiz, I.V., eds.
Findings of the Association for Computational Linguistics: NAACL 2022. Association for Com-
putational Linguistics, 229–240.
Ma, M., Huang, L., Xiong, H., Zheng, R., Liu, K., Zheng, B., Popuri, S., Adi, Y., Pino, J., Gu, J., Wang,
H., 2019. STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency
Using Prefix-to-Prefix Framework. In Proceedings of the 57th Annual Meeting of the Association
for Computational Linguistics. Association for Computational Linguistics, 3025–3036.
Macháček, D., Bojar, O., Dabre, R., 2023a. MT Metrics Correlate with Human Ratings of Simultane-
ous Speech Translation. In Proceedings of the 24th Annual Conference of the European Associa-
tion for Machine Translation.
Macháček, D., Polák, P., Bojar, O., Dabre, R., 2023b. Robustness of Multi-Source MT to Tran-
scription Errors. In Proceedings of the 24th Annual Conference of the European Association for
Machine Translation.
Martucci, G., Cettolo, M., Negri, M., Turchi, M., 2021. Lexical Modeling of ASR Errors for Robust
Speech Translation. In Interspeech 2021. ISCA, 2282–2286.
Moslem, Y., Haque, R., Kelleher, J.D., Way, A., 2023. Adaptive Machine Translation with Large
Language Models. In Proceedings of the 24th Annual Conference of the European Association for
Machine Translation, 227–237.
Nachmani, E., Levkovitch, A., Ding, Y., Asawaroengchai, C., Zen, H., Ramanovich, M.T., 2023.
Translatotron 3: Speech to Speech Translation with Monolingual Data. URL https://doi.
org/10.48550/arXiv.2305.17547
O’Brien, S., Federici, F.M., eds., 2022. Translating Crises. Bloomsbury Academic.
Papi, S., Gaido, M., Negri, M., 2023. Direct Models for Simultaneous Translation and Automatic Sub-
titling: FBK@IWSLT2023. In Proceedings of the 20th International Conference on Spoken Language
Translation (IWSLT 2023). Association for Computational Linguistics, Toronto, Canada, 159–168.
Papineni, K., Roukos, S., Ward, T., Zhu, W.-J., 2002. BLEU: A Method for Automatic Evaluation of
Machine Translation. In Proceedings of the 40th Annual Meeting of the Association for Compu-
tational Linguistics (ACL), 311–318.
Pöchhacker, F., 2002. Quality Assessment in Conference and Community Interpreting. Meta 46(2), 410–425.
Pöchhacker, F., 2022. Interpreters and Interpreting: Shifting the Balance? The Translator 28(2),
148–161, Routledge.
Pöchhacker, F., 2024. Is Machine Interpreting Interpreting? In Translation Spaces. John Benjamins
Publishing Company.
Poibeau, T., 2017. Machine Translation. The MIT Press Essential Knowledge Series. The MIT Press.
Popović, M., 2017. chrF++: Words Helping Character N-Grams. In Bojar, O., Buck, C., Chatterjee,
R., Federmann, C., Graham, Y., Haddow, B., Huck, M., Jimeno Yepes, A., Koehn, P., Kreutzer,
J., eds. Proceedings of the Second Conference on Machine Translation. Association for Computa-
tional Linguistics, Copenhagen, Denmark, 612–618.
Saboo, A., Baumann, T., 2019. Integration of Dubbing Constraints into Machine Translation. In Pro-
ceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers). Associa-
tion for Computational Linguistics, Florence, Italy, 94–101.
Salesky, E., Darwish, K., Al-Badrashiny, M., Diab, M., Niehues, J., 2023. Evaluating Multilingual
Speech Translation Under Realistic Conditions with Resegmentation and Terminology. In Pro-
ceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
(In-Person and Online). Association for Computational Linguistics, Toronto, Canada, 62–78.
Salesky, E., Wiesner, M., Bremerman, J., Cattoni, R., Negri, M., Turchi, M., Oard, D.W., Post, M.,
2021. The Multilingual TEDx Corpus for Speech Recognition and Translation. Proceeding of the
Interspeech 2021 Conference.
Sarkar, A., 2016. The Challenge of Simultaneous Speech Translation. In Proceedings of the 30th
Pacific Asia Conference on Language, Information and Computation: Keynote Speeches and
Invited Talks. Seoul, South Korea, 7.
Seeber, K.G., 2012. Multimodal Input in Simultaneous Interpreting: An Eyetracking Experiment. In Zyba-
tov, L., Petrova, A., Ustaszewski, M., eds. Proceedings of the 1st International Conference TRANS-
LATA, Translation & Interpreting Research: Yesterday – Today – Tomorrow. Peter Lang, 341–347.
Seligman, M., 1997. Interactive Real-Time Translation via the Internet. In AAAI.
Sethiya, N., Maurya, C.K., 2023. End-to-End Speech-to-Text Translation: A Survey. URL https://doi.
org/10.48550/arXiv.2312.01053
Singh, G., Sharma, S., Kumar, V., Kaur, M., Baz, M., Masud, M., 2021. Spoken Language Iden-
tification Using Deep Learning. Computational Intelligence and Neuroscience. URL https://doi.
org/10.1155/2021/5123671
Sperber, M., Paulik, M., 2020. Speech Translation and the End-to-End Promise: Taking Stock of
Where We Are. In Proceedings of the 58th Annual Meeting of the Association for Computational
Linguistics.
Sudoh, K., Kano, T., Novitasari, S., Yanagita, T., Sakti, S., Nakamura, S., 2020. Simultaneous
Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS. URL https://
arxiv.org/abs/2011.04845
Sulubacak, U., Caglayan, O., Grönroos, S.-A., Rouhe, A., Elliott, D., Specia, L., Tiedemann, J., 2019.
Multimodal Machine Translation Through Visuals and Speech, November 2019. URL https://
arxiv.org/abs/1911.12798
Susskind, R., Susskind, D., 2017. The Future of the Professions: How Technology Will Transform the
Work of Human Experts. Oxford University Press edition.
ten Bosch, L., 2003. Emotions, Speech and the ASR Framework. Speech Communication 40(1), 213–225.
Tomita, M., Tomabechi, H., Saito, H., 1990. Speech Trans: An Experimental Real-Time
Speech-to-Speech. Language Research 24(4), 663–672.
Van Doorslaer, L., 2009. How Language and (Non-)Translation Impact on Media Newsrooms: The
Case of Newspapers in Belgium. Perspectives 17(2), 83–92.
Wahlster, W., ed., 2000. Verbmobil: Foundations of Speech-to-Speech Translation. Springer, Berlin.
Waibel, A., Cho, E., Niehues, J., 2012. Segmentation and Punctuation Prediction in Speech Language
Translation Using a Monolingual Translation System. In IWSLT.
Waibel, A., Jain, A.N., McNair, A.E., Saito, H., Hauptmann, A.G., Tebelskis, J., 1991. JANUS: A
Speech-to-Speech Translation System Using Connectionist and Symbolic Processing Strategies.
URL https://doi.org/10.1109/ICASSP.1991.150456
Wang, H., Gao, W., Li, S., 1999. Utterance Segmentation of Spoken Chinese. Chinese Journal of
Computers 22, 1009–1013.
Wang, X., Fantinuoli, C., 2024. Exploring the Correlation Between Human and Machine Evaluation
of Simultaneous Speech Translation. Proceedings of the 25th Annual Conference of the European
Association for Machine Translation 1, 325–334.
Wang, Y., Wang, J., Zhang, W., Zhan, Y., Guo, S., Zheng, Q., Wang, X., 2022. A Survey on Deploying
Mobile Deep Learning Applications: A Systemic and Technical Perspective. Digital Communica-
tions and Networks 8, 1–17. URL https://doi.org/10.1016/j.dcan.2021.06.001
Wu, S.-C., 2011. Assessing Simultaneous Interpreting: A Study on Test Reliability and Examiners’
Assessment Behavior (PhD thesis). Newcastle University.
Wu, Y., Schuster, M., Chen, Z., Le, Q.V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao,
Q., Macherey, K., Klingner, J., Shah, A., Johnson, M., Liu, X., Kaiser, L., Gouws, S., Kato, Y.,
Kudo, T., Kazawa, H., Stevens, K., Kurian, G., Patil, N., Wang, W., Young, C., Smith, J., Riesa,
J., Rudnick, A., Vinyals, O., Corrado, G., Hughes, M., Dean, J., 2016. Google’s Neural Machine
Translation System: Bridging the Gap Between Human and Machine Translation. URL https://doi.
org/10.48550/arXiv.1609.08144
Wu, Z., Li, Z., Wei, D., Shang, H., Guo, J., Chen, X., Rao, Z., Yu, X., Yang, J., Li, S., Xie, Y., Wei,
B., Zheng, J., Zhu, M., Lei, L., Yang, H., Jiang, Y., 2023. Improving Neural Machine Translation
Formality Control with Domain Adaptation and Reranking-Based Transductive Learning. In Pro-
ceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023).
Association for Computational Linguistics, Toronto, Canada, 180–186 (in-person and online).
URL https://doi.org/10.18653/v1/2023.iwslt-1.13
Xu, J., Buet, F., Crego, J., Bertin-Lemée, E., Yvon, F., 2022. Joint Generation of Captions and Subtitles
with Dual Decoding. In Proceedings of the 19th International Conference on Spoken Language
Translation (IWSLT 2022). Association for Computational Linguistics, Dublin, Ireland, 74–82
(in-person and online). URL https://doi.org/10.18653/v1/2022.iwslt-1.7
Yang, M., Kanda, N., Wang, X., Chen, J., Wang, P., Xue, J., Li, J., Yoshioka, T., 2023. DiariST: Stream-
ing Speech Translation with Speaker Diarization. URL https://doi.org/10.48550/arXiv.2309.08007
Yao, B., Jiang, M., Yang, D., Hu, J., 2023. Empowering LLM-Based Machine Translation with Cul-
tural Awareness. URL https://ar5iv.labs.arxiv.org/html/2305.14328
Zhang, T., Kishore, V., Wu, F., Weinberger, K.Q., Artzi, Y., 2020. BERTScore: Evaluating Text Gen-
eration with BERT. URL https://doi.org/10.48550/arXiv.1904.09675
Zhang, X., 2016. Semi-Automatic Simultaneous Interpreting Quality Evaluation. International Jour-
nal on Natural Language Computing 5, 1–12. URL https://doi.org/10.5121/ijnlc.2016.5501
PART IV
Technology in professional
interpreting settings
13
CONFERENCE SETTINGS
Kilian G. Seeber
13.1 Introduction
When we hear the word technology, many of us will spontaneously associate it with elec-
tronics, computers, algorithms, and probably some form of artificial intelligence. For many
scholars, however, technology has always been a much broader concept (see Mesthene,
1970; Galbraith, 1967; Ferré, 1988; Friedel, 2007), comprising ‘the application of organ-
ised knowledge to practical tasks by ordered systems of people and machines’ (Barbour,
1993, 3). According to this definition, technology can include practical experience, inven-
tions, and scientific theories aimed at producing goods, but also providing services. Cru-
cially, technology can relate to machines as much as it can relate to human beings. It is in
that sense that I will use the concept.
This chapter sets out to describe the way in which technology has been implemented in
conference interpreting settings. Crucially, whilst seemingly self-evident, what constitutes
conference interpreting is not unambiguously clear either and has, at times, been defined
rather loosely, both within and outside the professional community. Scholars have concep-
tualised and categorised interpreting by setting (the environment in which the interpreta-
tion takes place), by mode (the temporal dynamics underlying the interpreting technique),
and by modality (the channels involved in the production and reception of the interpreta-
tion) at different levels of granularity. This is how, over time, the relatively simple and
probably too simplistic dichotomous distinctions of interpreting settings (e.g. conference
vs community), modes (e.g. consecutive vs simultaneous), and modalities (e.g. on-site vs
distance) have given way to more fine-grained, partially overlapping taxonomies. So while
conference interpreting is undoubtedly the most highly professionalised (Baigorri-Jalón
et al., 2021) and perhaps the most readily recognised (Diriker, 2015) interpreting setting,
its definition according to the ‘socio-spatial contexts of interaction’ (Grbić, 2015, 371) in
which it takes place might be insufficient, particularly without further consideration of the
modes and modalities involved.
For instance, professionals regularly conflate simultaneous interpreting and conference
interpreting – either because of the prestige associated with that label (Kurz, 1991) or
because most simultaneous interpreter training happens within the confines of conference
interpreting training programmes (Sawyer and Roy, 2015). Yet even in scholarly circles,
the distinction between setting and mode is not straightforward. Baker and Diriker (2019),
for instance, appear to merge mode and setting when arguing that the terms conference
interpreting and simultaneous interpreting are closely connected because the history of the
former begins with the introduction of the latter. This is in stark contrast to Baigorri-Jalón
(2015), who points to the first Pan-American conference in 1889–1890, where interpreters
were used to ensure mutual understanding across different languages, as an early exam-
ple of conference interpreting, before providing evidence of first attempts at implementing
(technology-enabled) simultaneous interpreting at the International Labour Organization
in Geneva in the mid-1920s. Others, like Chernov (2004) and Svejcer (1999), argue that
(technology-enabled) simultaneous interpreting was first used at the Comintern Congress
in Moscow in 1928, while Gaiba (1998) and Lederer (1984) pinpoint the Nuremberg trials
of 1945 as the birth of this interpreting mode.
While a comprehensive discussion of different interpreting taxonomies and a foren-
sic analysis of historical evidence documenting the origin of different interpreting modes
exceed the scope of this chapter, in this contribution I will focus on the most common inter-
preting modes and modalities in multilingual multilateral conferences. In doing so, I will
deliberately limit the scope of the conference interpreting setting, which has been argued to
include many different types of conference-like events, including high-level bilateral diplo-
matic exchanges, press conferences, workshops, etc. (Diriker, 2015), concentrating instead
on what Pöchhacker calls ‘international conference interpreting’ (2022, 16).
In this contribution, I will focus on the technological developments that impact the confer-
ence interpreting task as it unfolds, rather than those used during the preparation stage for
an interpreting assignment. This is not to suggest that the preparation of a conference cannot
be considered an integral part of a professional conference interpreter’s workflow (see Jiang,
2013). However, the tools used for conference preparation, many of them summarised under
the heading computer-assisted interpreting (CAI), are not unique to the conference interpret-
ing setting and are already comprehensively covered by Prandi (this volume).
This chapter opens with a short introduction to the beginnings of international confer-
ence interpreting in Section 13.2, before addressing some of the most salient technological
developments by mode, that is, consecutive and simultaneous conference interpreting.
To that end, Section 13.3 briefly describes the place of consecutive interpreting in multi-
lingual multilateral conferences, whereas Section 13.4 introduces an operative task model
for this mode. The use of note-taking, digital voice recorders, digital pens, tablets, auto-
matic speech recognition (ASR), and machine translation (MT), as well as distance inter-
preting (DI), in consecutive conference interpreting is then discussed in six subsections. In
Section 13.5, the chapter pivots to simultaneous interpreting, once more outlining how and
when it was introduced in international conference settings. Section 13.6 introduces the
notion of technology-enhanced simultaneous interpreting, feeding into the respective dis-
cussion of booths, microphones and headsets, electronic glossaries, DI, as well as ASR and
MT in simultaneous conference interpreting in four subsections. The final section provides
a short conclusion and outlook.
of interpreters unnecessary. And yet select accounts corroborate the reliance on interpreters
(at that time referred to as oral translators) to overcome the language barrier among differ-
ent conference participants as early as 1889:
Two other gentlemen, Mr. Starr-Hunt and Mr. Romero, are also entitled to a mention
for the able and conscientious manner in which they fulfilled their difficult task as
oral translators. As the deliberations of the Congress were in the English and Spanish
languages, it was necessary for the benefit of the United States delegates, that all the
remarks in Spanish be immediately translated into English. . . . This applies as well
to the observations and speeches made at various times by M. Léger, delegate from
Haiti, whose remarks in French were translated into English.
(Noel, 1902, 68)
International conferences, however, were not regularly multilingual before the founding of
the International Labour Organization (ILO) in 1919 and the League of Nations (LoN) in
1920, which initially worked primarily in French and English (Baigorri-Jalón, 2005), or
the Communist International in 1919, which chiefly relied on German and French and, to
a lesser extent, English and Russian (Riddell, 2015; but see Chernov, 2016a).1 The statu-
tory, and thus systematic, use of multiple languages in international conferences, therefore,
eventually led to the professionalisation of what had already existed as a practice several
decades prior (Baigorri-Jalón et al., 2021).
conference interpreting is on the interpreter’s memory (Bajo and Padilla, 2015; Yenki-
maleki and Van Heuven, 2017), making the reliance on written notes to support their
memory one of its hallmarks.
blogs and websites, the first systematic small-scale user survey on the use of tablet comput-
ers in consecutive interpreting (although not exclusively in conference settings) was carried
out by Goldsmith and Holley (2015). It concludes with a contrastive analysis of technical,
visual, physical, as well as client relationship parameters. Of the long list of parameters que-
ried in the study, however, many relate to the physical features of the tool (stylus and tablet
vs pen and paper), with very few having a strong link to the component tasks identified
earlier. Among the latter are the ability to write in emulated styles (e.g. pen, pencil, marker) and in different colours, and the ability to erase notes. These perceived advantages of tablets have
the potential to impact the memorising task.
Similarly, the ability to increase the size of notes, to display multiple pages at once, and
to scroll vertically through notes without physically turning pages all potentially impact
the retrieval task. This means that, although at first glance the use of tablets in consecutive conference interpreting might look revolutionary, this impression may be deceptive. And yet Altieri's
(2020) contrastive empirical study on interpreting students’ note-taking on notepads and
tablets reveals some interesting tendencies. On the one hand, the numbers of short pauses and of glances at the audience are slightly lower when interpreting with a tablet. On the other
hand, when assessing their own performance, (student) participants rated their expressive-
ness, accuracy, faithfulness, coherence, and communicative effect substantially lower when
interpreting with a tablet. The extent to which lack of familiarity (despite five training ses-
sions) or, indeed, evaluator bias influenced these results remains unclear.
One advantage of tablet interpreting, other than the reduction in paper waste (although
the CO2 footprint generated by tablet computers has been estimated at 100 kg/year; see
Lövehagen et al., 2023), seems to be related to the note-reading more than the note-writing
task: scrolling rather than flipping through notepads and the possibility of showing all
pages at once are small but perhaps not-altogether-inconsequential improvements. The
underlying principles of the consecutive conference interpreting task, however, seem largely
unaffected by this technological development.
note-reading task with a reading task of either the automatically recognised original or its
machine translation (Chen and Kruger, 2024). This significantly alters the traditional con-
secutive interpreting task, which during the comprehension stage shares similarities with
phrase shadowing (Norman, 1976), while during the production stage it can range from a
simple reading task (of a machine-translated text) to a paraphrase (McCarthy et al., 2009)
or a sight translation (Chmiel and Mazur, 2013) of the original automatically recognised
text. While accounts of real-life applications of this technology do not seem to have been
documented, first experiments suggest that overall quality gains can be significant, espe-
cially for interpreters working into a foreign language, thus providing a so-called ‘retour’
(Loiseau and Delgado Luchner, 2021). More specifically, Chen and Kruger’s (2024) study
suggests improved fluency and delivery when providing a consecutive retour interpretation,
which in a UN context is chiefly limited to interpreters working into Chinese and Arabic,
while in an EU context it is mainly applicable to interpreters working in Eastern European
languages.7 It is interesting to note that subjective cognitive load was perceived to be lower
during CACI when working into, but not from, the foreign language.
original. This latter mode, which was eventually renamed and has since been known as
simultaneous interpreting, will be the object of the discussion that follows.
equipment (ISO, 2016c, 2019) define minimum specifications for headphones, micro-
phones, integrated headsets combining earphones and microphones, along with consoles
and booth furniture. Some of the most fundamental parameters for simultaneous interpreters, namely the quality of the sound transmitted to interpreters' headsets, including the frequency response, the total harmonic distortion, and the signal-to-noise ratio, are also enshrined in
these standards. These early technological advances were crucial for the successful imple-
mentation of simultaneous conference interpreting, as they provided the necessary environ-
ment to carry out the unnatural task of speaking in one language while listening to another.
As compared to whispered simultaneous interpreting, these advances principally impacted
the listening and producing component tasks, as the acoustic separation of input and out-
put facilitated their simultaneous execution.
13.7 Conclusion
Many of the technological developments discussed in this chapter, from booths, headsets,
and consoles to electronic glossaries and distance interpreting solutions, have been success-
fully integrated in international conference interpreting. While some of them were imme-
diately welcomed with excitement, others were initially met with resistance. Some, like
digital voice recorders or digital pens, were hailed as potentially revolutionary but never
really made much of an appearance in the conference room. And yet, even though it is fair
to say that they have not become part of the professional toolkit, they seem to have found
their way into the classroom as a training tool. This shows how, regardless of the hype
generated around a technological development, the factors leading to its acceptance and
adoption are complex. Most technological advances only address a small number of facets
of the interpreting task, meaning that, while they can be argued to provide added value,
this value comes at a cost – often a cognitive cost. All other things being equal, therefore, it
seems that conference interpreters might need to be convinced of a technology’s net benefit
to overcome what might be an initial reluctance to fix something that ain’t broke.
Notes
1 According to Chernov (2016a), the first Congress of the Comintern had adopted German and
Russian as working languages, with occasional interpretation provided from French, English, and
Chinese.
2 EMCI – European Masters in Conference Interpreting. URL www.emcinterpreting.org/core-curriculum/ (accessed 15.7.2024).
3 Interinstitutional accreditation tests. URL https://europa.eu/interpretation/freelance_en.html (accessed 31.3.2025).
4 MASIT – Master of Advanced Studies in Interpreter Training. URL www.unige.ch/formcont/en/courses/masit (accessed 31.3.2025).
5 Original blog posts no longer accessible; first referenced in Hamidi and Pöchhacker (2007).
6 Livescribe, the maker of the first paper-based smartpen, was acquired by Anoto in 2015.
7 It should be noted that, when on field missions, UN interpreters are regularly called upon to pro-
vide a ‘retour’, although often in simultaneous mode (Ruiz Rosendo et al., 2021). As for the EU,
interpreters with the necessary qualifications – regardless of their A language – may be called upon
to provide a retour when on mission or when providing interpretation ad personam (Tanzella and
Alvar Rozas, 2011).
8 For a conceptual and terminological discussion, see Guo et al. (2023).
References
Ahrens, B., Orlando, M., 2021. Note-Taking in Consecutive Interpreting. In Albl-Mikasa, M., Tise-
lius, E., eds. The Routledge Handbook of Conference Interpreting. Routledge, Abingdon, 34–48.
URL https://doi.org/10.4324/9780429297878
AIIC, 2019. Guidelines for Distance Interpreting. URL https://aiic.ch/wp-content/uploads/2020/04/
aiic-guidelines-for-distance-interpreting-version-10.pdf (accessed 1.7.2024).
AIIC, 2024. AIIC Statistical Report: 2023 Data. Unpublished internal report.
AIIC, n.d. Acoustic Shocks Research Project: Final Report. URL https://aiic.org/uploaded/web/
Acoustic%20Shocks%20Research%20Project.pdf (accessed 1.7.2024).
Albl-Mikasa, M., 2021. Conference Interpreting and English as a Lingua Franca. In Albl-Mikasa,
M., Tiselius, E., eds. The Routledge Handbook of Conference Interpreting. Routledge, Abingdon,
546–563. URL https://doi.org/10.4324/9780429297878-47
Altieri, M., 2020. Tablet Interpreting: Étude expérimentale de l’interprétation consécutive sur tab-
lette. The Interpreters’ Newsletter 25, 19–35.
Amos, R.M., Seeber, K.G., Pickering, M.J., 2022. Prediction During Simultaneous Interpreting: Evi-
dence from the Visual-World Paradigm. Cognition 220, 104987.
Anastasopoulos, A., Lui, A., Nguyen, T.Q., Chiang, D., 2019. Neural Machine Translation of Text
from Non-Native Speakers. In Proceedings of the 2019 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language Technologies Vol-
ume 1 (Long and Short Papers). Association for Computational Linguistics, 3070–3080. URL
https://doi.org/10.18653/v1/N19-1310
Baigorri-Jalón, J., 2004. Interpreters at the United Nations: A History. Ediciones Universidad de Salamanca, Salamanca.
Baigorri-Jalón, J., 2005. Conference Interpreting in the First International Labor Conference (Wash-
ington, D.C., 1919). Meta 50(3), 987–996.
Baigorri-Jalón, J., 2015. The History of the Profession. In Mikkelson, H., Jourdenais, R., eds. The Routledge Handbook of Interpreting. Routledge, Abingdon, 11–28.
Baigorri-Jalón, J., 2021. Once Upon a Time at the ILO: The Infancy of Simultaneous Interpreting. In
Seeber, K.G., ed. 100 Years of Conference Interpreting: A Legacy. Cambridge Scholars, Newcastle
upon Tyne, 1–24.
Baigorri-Jalón, J., Fernández-Sánchez, M.-M., Payàs, A., 2021. Distance Conference Interpreting. In
Albl-Mikasa, M., Tiselius, E., eds. Routledge Handbook of Conference Interpreting. Routledge,
Abingdon, 9–18. URL https://doi.org/10.4324/9780429297878-43
Bajo, M.T., Padilla, P., 2015. Memory. In Pöchhacker, F., ed. The Routledge Encyclopedia of Inter-
preting Studies. Routledge, Abingdon, 252–254.
Baker, M., Diriker, E., 2019. Conference and Simultaneous Interpreting. In Baker, M., Diriker, E.,
eds. Routledge Encyclopedia of Translation Studies, 3rd ed. Routledge, Abingdon, 95–101. URL
https://doi.org/10.4324/9781315683751
Barbour, I., 1993. Ethics in an Age of Technology. Harper, New York.
Brady, A., Pickles, M., 2022. Why Remote Interpreting Doesn’t Work for Interpreters. UNtoday,
June, 5, 12–14.
Camayd-Freixas, E., 2005. A Revolution in Consecutive Interpretation: Digital Voice-Recorder-Assisted
CI. The ATA Chronicle 34, 40–46.
Chen, S., Kruger, J.L., 2023. The Effectiveness of Computer-Assisted Interpreting: A Preliminary
Study Based on English-Chinese Consecutive Interpreting. Translation and Interpreting Studies
18(3), 399–420. URL https://doi.org/10.1075/tis.21036.che
Chen, S., Kruger, J.L., 2024. A Computer-Assisted Consecutive Interpreting Workflow: Training and
Evaluation. The Interpreter and Translator Trainer 1–20. URL https://doi.org/10.1080/17503
99X.2024.2373553
Chernov, G.V., 2004. Inference and Anticipation in Simultaneous Interpreting. Benjamins, Amsterdam.
Chernov, S., 2016a. At the Dawn of Simultaneous Interpreting in the USSR – Filling Some Gaps in
History. In Takeda, K., Baigorri-Jalón, J., eds. New Insights in the History of Interpreting. Benja-
mins, Amsterdam, 135–165. URL https://doi.org/10.1075/btl.122.06che
Chernov, S., 2016b. U istokov sinhronnogo perevoda v SSSR (The Origins of Simultaneous Interpret-
ing in the USSR). Mosty 2(50), 52–68.
Chernov, S., 2018. At the Dawn of Simultaneous Interpreting in the USSR: Filling Some Gaps in
History. In Takeda, K., Baigorri-Jalón, J., eds. New Insights into the History of Interpreting. Ben-
jamins, Amsterdam, 135–165.
Chmiel, A., Mazur, I., 2013. Eye Tracking Sight Translation Performed by Trainee Interpreters. In
Way, C., Vandepitte, S., Meylaerts, R., Bartlomiejczyk, M., eds. Tracks and Treks in Translation
Studies: Selected Papers from the EST Congress Leuven 2010. Benjamins, Amsterdam, 189–205.
Clark, R., Feldon, D., Van Merrienboer, J.J.G., Yates, K., Early, S., 2008. Cognitive Task Analysis. In
Spector, J.M., Merrill, M.D., van Merrienboer, J.J.G., Driscoll, M.P., eds. Handbook of Research
on Educational Communications and Technology. Macmillan/Gale, New York, 577–593.
Cooke, N.J., 1992. The Implications of Cognitive Task Analyses for the Revision of the Dictionary
of Occupational Titles. In Camara, W.J., ed. Implications of Cognitive Psychology and Cognitive
Task Analysis for the Revision of the Dictionary of Occupational Titles. American Psychological
Association, Washington, DC, 1–25.
Cooke, N.J., 1994. Varieties of Knowledge Elicitation Techniques. International Journal of
Human-Computer Studies 41, 801–849.
Dam, H.V., 2010. Consecutive Interpreting. In Gambier, Y., Van Doorslaer, L., eds. Handbook of
Translation Studies Online. Benjamins, Amsterdam, 75–79. URL https://doi.org/10.1075/hts.1
Defrancq, B., Fantinuoli, C., 2021. Automatic Speech Recognition in the Booth: Assessment of Sys-
tem Performance, Interpreters’ Performances and Interactions in the Context of Numbers. Target
33(1), 73–102.
Desmet, B., Vandierendonck, M., Defrancq, B., 2018. Simultaneous Interpretation of Numbers and
the Impact of Technological Support. In Fantinuoli, C., ed. Interpreting and Technology. Language
Science Press, Berlin, 13–27.
DiChristofano, A., Shuster, H., Chandra, S., Patwari, N., 2023. Performance Disparities Between
Accents in Automatic Speech Recognition (Student Abstract). Proceedings of the AAAI Conference
on Artificial Intelligence 37(13), 16200–16201. URL https://doi.org/10.1609/aaai.v37i13.26960
Diriker, E., 2015. De-/Re-Contextualizing Conference Interpreting: Interpreters in the Ivory Tower? Benjamins, Amsterdam.
EPO, n.d. Technical Guidelines. URL www.epo.org/en/applying/european/oral-proceedings/
proceedings/technical-guidelines (accessed 1.7.2024).
EPO, 2022a. Oral Proceedings in Opposition by Videoconference: Pilot Project Final Report. URL
https://link.epo.org/web/oral_proceedings_in_opposition_by_videoconference-pilot_project_
final_report_november_2022_en.pdf (accessed 1.7.2024).
EPO, 2022b. President Decides Future Format of Oral Proceedings in Opposition. News & Events.
URL www.epo.org/en/news-events/news/president-decides-future-format-oral-proceedings-oppo
sition (accessed 1.7.2024).
Fantinuoli, C., 2016. InterpretBank: Redefining Computer-Assisted Interpreting Tools. In
Esteves-Ferreira, J., Macan, J., Mitkov, R., Stefanov, O.-M., eds. Proceedings of the 38th Confer-
ence Translating and the Computer. AsLing, 42–52.
Fantinuoli, C., 2017. Speech Recognition in the Interpreter Workstation. In Esteves-Ferreira, J.,
Macan, J., Mitkov, R., Stefanov, O.-M., eds. Proceedings of the 39th Conference Translating and
the Computer. AsLing, 25–34.
Fantinuoli, C., 2018. Computer-Assisted Interpreting: Challenges and Future Perspectives. In Corpas
Pastor, G., Durán-Muñoz, I., eds. Trends in E-Tools and Resources for Translators and Interpret-
ers. Brill, Leiden, 153–174. URL https://doi.org/10.1163/9789004351790_009
Fantinuoli, C., 2023. Towards AI-Enhanced Computer-Assisted Interpreting. In Corpas Pastor, G.,
Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. Benjamins, Amsterdam,
46–71.
Fantinuoli, C., Montecchio, M., 2023. Defining Maximum Acceptable Latency of AI-Enhanced
CAI Tools. In Ferreiro-Vázquez, Ó., Varajão Moutinho Pereira, A., Gonçalves Araújo, S., eds.
Technological Innovation Put to the Service of Language Learning Translation and Interpreting:
Insights from Academic and Professional Contexts. Peter Lang, Berlin, 213–226. URL https://doi.
org/10.3726/b20168
Feng, S., Halpern, B.M., Kudina, O., Scharenborg, O., 2024. Towards Inclusive Automatic Speech
Recognition. Computer Speech & Language 84, 101567. URL https://doi.org/10.1016/j.
csl.2023.101567
Ferrari, M., 2001. Consecutive Simultaneous? SCIC News 26, 2–4.
Ferrari, M., 2002. Traditional vs. ‘Simultaneous’ Consecutive. SCIC News 29, 6–7.
Ferré, F., 1988. Philosophy of Technology. Prentice Hall, Englewood Cliffs, NJ.
Friedel, R., 2007. A Culture of Improvement: Technology and the Western Millennium. MIT Press,
Cambridge, MA. URL https://doi.org/10.7551/mitpress/9780262062626.001.0001
Gaiba, F., 1998. The Origins of Simultaneous Interpretation: The Nuremberg Trial. University of
Ottawa Press, Ottawa. URL www.jstor.org/stable/j.ctt1cn6rsh
Galbraith, J.K., 1967. The New Industrial State. Houghton Mifflin, Boston, MA.
Gile, D., 1995/2009. Basic Concepts and Models for Interpreter and Translator Training. Benjamins,
Amsterdam.
Gillies, A., 2017. Note-Taking for Consecutive Interpreting: A Short Course. Routledge, Abingdon.
Goldsmith, J., Holley, J.C., 2015. Consecutive Interpreting 2.0: The Tablet Interpreting Experience
(MA dissertation). University of Geneva.
Gordon-Finlay, A., 1927. Telephonic Interpretation Experiments; Technical Arrangements. Dossier
Filene Experiment. Results obtained 1926–1927 Sessions of the Conference (30H/4/8910). ILO
Archives, Geneva.
Grbić, N., 2015. Settings. In Pöchhacker, F., ed. The Routledge Encyclopedia of Interpreting Studies.
Routledge, Abingdon, 370–371.
Guo, M., Han, L., Teiceira Anacleto, A., 2023. Computer-Assisted Interpreting Tools: Status Quo
and Future Trend. Theory and Practice in Language Studies 13(1), 89–99. URL https://doi.
org/10.17507/tpls.1301.11
Hamidi, M., Pöchhacker, F., 2007. Simultaneous Consecutive Interpreting: A New Technique Put to
the Test. Meta 52(2), 276–289.
Hawel, K., 2010. Simultanes versus klassisches Konsekutivdolmetschen: Eine vergleichende textuelle
Analyse (MA dissertation). University of Vienna.
Herbert, J., 1952. The Interpreter's Handbook: How to Become a Conference Interpreter. Librairie de l'Université, Geneva.
Hiebl, B., 2011. Simultanes Konsekutivdolmetschen mit dem LivescribeTM EchoTM Smartpen. Ein
Experiment im Sprachenpaar Italienisch-Deutsch mit Fokus auf Zuhörerbewertung (MA disserta-
tion). University of Vienna.
ISO, 2016a. ISO 2603:2016 Simultaneous Interpreting – Permanent Booths-Requirements. URL
www.iso.org/standard/74006.html (accessed 1.7.2024).
ISO, 2016b. ISO 4043:2016 Simultaneous Interpreting – Mobile Booths – Requirements. URL www.
iso.org/standard/70804.html (accessed 1.7.2024).
ISO, 2016c. ISO 20109:2016 Simultaneous Interpreting – Equipment – Requirements. URL www.
iso.org/standard/63033.html (accessed 1.7.2024).
ISO, 2019. ISO 22259:2019 Conference Interpreting – Equipment – Requirements. URL www.iso.
org/standard/72001.html (accessed 1.7.2024).
ISO, 2020. ISO/PAS 24019:2020 Simultaneous Interpreting Delivery Platforms – Requirements and
Recommendations. URL www.iso.org/standard/80761.html (accessed 31.3.2025).
ISO, 2022. ISO 23155:2022 Interpreting Services – Conference Interpreting – Requirements and
Recommendations. URL www.iso.org/standard/74749.html (accessed 1.7.2024).
Jiang, H., 2013. The Interpreter’s Glossary in Simultaneous Interpreting: A Survey. Interpreting 15(1),
74–93. URL https://doi.org/10.1075/intp.15.1.04jia
Kirchhoff, H., 1979. Die Notationssprache als Hilfsmittel des Konferenzdolmetschers im Konsekutiv-
vorgang. In Mair, W., Sallager, E., eds. Sprachtheorie und Sprachenpraxis. Festschrift für Henri
Vernay zu seinem 60. Geburtstag. Gunter Narr, Tübingen, 121–133.
Kurz, I., 1991. Conference Interpreting: Job Satisfaction, Occupational Prestige and Desirability. In
Jovanović, M., ed. XIIth World Congress of FIT – Belgrade 1990. Proceedings. Prevodilac, Bel-
grade, 363–367.
Lederer, M., 1984. La traduction simultanée. In Seleskovitch, D., Lederer, M., eds. Interpréter pour
traduire. Didier, Paris, 136–162.
Loiseau, N., Delgado Luchner, C., 2021. A, B and C Decoded: Understanding Interpreters’ Language
Combinations in Terms of Language Proficiency. The Interpreter and Translator Trainer 15(4),
468–489. URL https://doi.org/10.1080/1750399X.2021.1911193
Lombardi, J., 2003. DRAC Interpreting: Coming Soon to a Courthouse Near You? Proteus 12(2), 7–9.
Lövehagen, N., Malmodin, J., Bergmark, P., Matinfar, S., 2023. Assessing Embodied Carbon Emis-
sions of Communication User Devices by Combining Approaches. Renewable and Sustainable
Energy Reviews 183, 113422. URL https://doi.org/10.1016/j.rser.2023.113422
Mankauskienė, D., 2016. Problem Trigger Classification and Its Applications for Empirical Research.
Procedia – Social and Behavioral Sciences 231, 143–148.
McCarthy, P.M., Guess, R.H., McNamara, D.S., 2009. The Components of Paraphrase Evaluations.
Behavior Research Methods 41(3), 682–690. URL https://doi.org/10.3758/BRM.41.3.682
Mesthene, E.G., 1970. Technological Change: Its Impact on Man and Society. Mentor, New York.
Mielcarek, M.M., 2017. Das simultane Konsekutivdolmetschen: Ein Experiment im Sprachenpaar
Spanisch-Deutsch (MA dissertation). University of Vienna.
Moser-Mercer, B., 2003. Remote Interpreting: Assessment of Human Factors and Performance
Parameters. Joint project, International Telecommunication Union (ITU) and Ecole de traduction
et interprétation, University of Geneva (ETI).
Moser-Mercer, B., Lambert, S., Darò, V., Williams, D., 1997. Skill Components in Simultaneous
Interpreting. In Gambier, Y., Gile, D., Taylor, C., eds. Conference Interpreting: Current Trends in
Research. Benjamins, Amsterdam, 133–148.
Mouzourakis, P., 2006. Remote Interpreting: A Technical Perspective on Recent Experiments. Inter-
preting 8(1), 45–66.
Noel, J.V., 1902. History of the Second Pan American Congress. Guggenheimer Weil and Co.,
New York.
Norman, D.A., 1976. Memory and Attention: An Introduction to Human Information Processing,
2nd ed. Wiley, New York.
Orlando, M., 2010. Digital Pen Technology and Consecutive Interpreting: Another Dimension in
Note-Taking Training and Assessment. The Interpreters’ Newsletter 15, 71–86.
Orlando, M., 2014. A Study on the Amenability of Digital Pen Technology in a Hybrid Mode of
Interpreting: Consec-Simul with Notes. The International Journal for Translation and Interpreting
Research 6(2), 39–54. URL https://doi.org/10.12807/ti.106202.2014.a03
Orlando, M., 2015. Implementing Digital Pen Technology in the Consecutive Interpreting Classroom.
In Andres, D., Behr, M., eds. To Know How to Suggest . . .: Approaches to Teaching Conference
Interpreting. Frank & Timme, Berlin, 171–200.
Orlando, M., Hlavac, J., 2020. Simultaneous-Consecutive in Interpreter Training and Interpreting
Practice: Use and Perceptions of a Hybrid Mode. The Interpreters’ Newsletter, 25, 1–17.
Piolat, A., Olive, T., Kellogg, R.T., 2005. Cognitive Effort During Note Taking. Applied Cognitive
Psychology 19(3), 291–312. URL https://doi.org/10.1002/acp.1086
Pisani, E., Fantinuoli, C., 2021. Measuring the Impact of Automatic Speech Recognition on Num-
ber Rendition in Simultaneous Interpreting. In Wang, C., Zheng, B., eds. Empirical Studies of
Translation and Interpreting. Routledge, Abingdon, 181–197. URL https://doi.org/10.4324/
9781003017400-14
Pöchhacker, F., 2011. Consecutive Interpreting. In Malmkjær, K., Windle, K., eds. The Oxford
Handbook of Translation Studies. Oxford University Press, Oxford, 294–306. URL https://doi.
org/10.1093/oxfordhb/9780199239306.013.0021
Pöchhacker, F., ed., 2015. The Routledge Encyclopedia of Interpreting Studies. Routledge, Abingdon.
URL https://doi.org/10.4324/9781315720728
Pöchhacker, F., 2022. Introducing Interpreting Studies. Routledge, Abingdon. URL https://doi.
org/10.4324/9781003109020
Prandi, B., 2023. Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on
Terminology. Language Science Press, Berlin. URL https://doi.org/10.5281/zenodo.7143055
Riddell, J., ed., 2015. To the Masses. Proceedings of the Third Congress of the Communist Interna-
tional 1921. Brill, Leiden. URL https://doi.org/10.1163/9789004266177
Rozan, J.F., 1956. Note-Taking in Consecutive Interpreting. Georg, Geneva.
Roziner, I., Shlesinger, M., 2010. Much Ado About Something Remote: Stress and Performance in
Remote Interpreting. Interpreting 12(2), 214–247.
Ruetten, A., 2003. Computer-Based Information Management for Conference Interpreters or How Will
I Make My Computer Act Like an Infallible Information Butler? In Esteves-Ferreira, J., Macan, J.,
Mitkov, R., Stefanov, O.-M., eds. Proceedings of Translating and the Computer 25. Aslib, 20–21.
Ruiz Rosendo, L., Barghout, A., Martin, C.H., 2021. Interpreting on UN Field Missions: A Training
Programme. The Interpreter and Translator Trainer 15(4), 450–467. URL https://doi.org/10.1080/
1750399X.2021.1903736
Ryder, G., 2021. Prologue. In Seeber, K.G., ed. 100 Years of Conference Interpreting: A Legacy. Cam-
bridge Scholars, Newcastle upon Tyne, xviii–xxii.
Sawyer, R.K., Roy, C.B., 2015. Encyclopedia of the Social and Cultural Foundations of Education.
Sage Publications, Thousand Oaks, CA. URL https://doi.org/10.4135/9781412963992
Seeber, K.G., 2007. Thinking Outside the Cube: Modeling Language Processing Tasks in a Multiple
Resource Paradigm. Proceedings of Interspeech 2007, 1382–1385. URL https://doi.org/10.21437/
Interspeech.2007-21
Seeber, K.G., 2015. Simultaneous Interpreting. In Mikkelson, H., Jourdenais R., eds. The Routledge
Handbook of Interpreting. Routledge, Abingdon, 79–95.
Seeber, K.G., 2017a. Multimodal Processing in Simultaneous Interpreting. In Schwieter, J.W., Ferreira,
A., eds. The Handbook of Translation and Cognition. Wiley Blackwell, Hoboken, NJ, 461–475.
Seeber, K.G., 2017b. Simultaneous Interpreting into a B Language: Considerations for Trainers and
Trainees. In Zybatow, L.N., Stauder, A., Ustanzewski, M., eds. Translation Studies and Translation
Practice: Proceedings of the 2nd International Translata Conference. Peter Lang, Berlin, 321–328.
URL https://doi.org/10.3726/b10842
Seeber, K.G., 2020. Distance Interpreting: Mapping the Landscape. In Ahrens, B., Beaton-Thome, M.,
Krein-Kühle, M., Krüger, R., Link, L., Wienen, U., eds. Interdependence and Innovation in Trans-
lation Interpreting and Specialised Communication. Frank & Timme, Berlin, 123–172.
Seeber, K.G., 2022. Project Report: Load and Fatigue in ARI and VRI. Department of Public Works
and Government Services Canada – Conference Interpretation, Ottawa.
Seeber, K.G., Amos, R.M., 2023. Capacity, Load and Effort in Translation, Interpreting and Bilingual-
ism. In Ferreira, A., Schwieter, J.W., eds. The Routledge Handbook of Translation, Interpreting and
Bilingualism, 1st ed. Routledge, Abingdon, 260–279. URL https://doi.org/10.4324/9781003109020
Seeber, K.G., Arbona, E., 2020. What’s Load Got to Do with It? A Cognitive-Ergonomic Training
Model of Simultaneous Interpreting. The Interpreter and Translator Trainer 14(4), 369–385. URL
https://doi.org/10.1080/1750399X.2020.1839996
Seeber, K.G., Fox, B., 2021. Distance Conference Interpreting. In Albl-Mikasa, M., Tiselius, E., eds.
Routledge Handbook of Conference Interpreting. Routledge, Abingdon, 491–507. URL https://
doi.org/10.4324/9780429297878-43
Seleskovitch, D., 1968. L’interprète dans les conférences internationales. Minard Lettres Modernes,
Paris.
Setton, R., Dawrant, A., 2016. Conference Interpreting: A Complete Course. Benjamins, Amsterdam.
Sienkiewicz, B., 2010. Das Konsekutivdolmetschen der Zukunft: Mit Notizblock oder Aufnah-
megerät? Ein Experiment zum Vergleich von klassischem und simultanem Konsekutivdolmetschen
(MA dissertation). University of Vienna.
Stoll, C., 2009. Jenseits simultanfähiger Terminologiesysteme. WVT Wissenschaftlicher Verlag Trier, Trier.
Svejcer, A.S., 1999. At the Dawn of Simultaneous Interpretation in Russia. Interpreting 4, 23–28.
Svoboda, S., 2020. SimConsec: The Technology of a Smartpen in Interpreting (MA dissertation).
Palacký University.
Tanzella, D., Alvar Rozas, P., 2011. Interpreting at the European Institutions: Interpretation ad Per-
sonam (MA dissertation). University of Geneva.
Timarová, S., Dragsted, B., Gorm Hansen, I., 2011. Time Lag in Translation and Interpreting:
A Methodological Exploration. In Alvstad, C., Hild, A., Tiselius, E., eds. Methods and Strate-
gies of Process Research: Integrative Approaches in Translation Studies. Benjamins, Amsterdam,
121–146.
UNESCO, 1976. A Teleconference Experiment: A Report on the Experimental Use of the Sympho-
nie Satellite to Link UNESCO Headquarters in Paris with the Conference Centre in Nairobi.
UNESCO, Paris.
Viezzi, M., 2013. Simultaneous and Consecutive Interpreting (Non-Conference Settings). In Mil-
lán, C., Bartrina, F., eds. The Routledge Handbook of Translation Studies. Routledge, Abingdon,
377–388. URL https://doi.org/10.4324/9780203102893
Wang, X., Wang, C., 2019. Can Computer-Assisted Interpreting Tools Assist Interpreting? Translet-
ters. International Journal of Translation and Interpreting, 3, 109–139.
Will, M., 2009. Dolmetschorientierte Terminologiearbeit: Modell und Methode. Gunter Narr Verlag,
Tübingen.
Will, M., 2020. Computer Aided Interpreting (CAI) for Conference Interpreters: Concepts, Content
and Prospects. Journal for Communication Studies 13(1), 37–71.
Wright, S., 2006. French as a Lingua Franca. Annual Review of Applied Linguistics 26, 35–60. URL
https://doi.org/10.1017/S0267190506000031
Yenkimaleki, M., van Heuven, V.J., 2017. The Effect of Memory Training on Consecutive Interpreting
Performance by Interpreter Trainees: An Experimental Study. Forum 15(1), 157–172. URL https://
doi.org/10.1075/forum.15.1.09yen
14
HEALTHCARE SETTINGS
Esther de Boe
14.1 Introduction
Healthcare interpreting (HI) refers to the interaction that occurs between three parties: ‘a
speaker of a non-societal language (for example, a patient seeking healthcare)’, ‘a speaker
of the societal language (generally the service provider)’, and an interpreter, ‘who medi-
ates between the two parties in a simultaneous or consecutive mode, either face-to-face
or remotely’ (Angelelli, 2014, 574). HI is one of the most prevalent fields of professional
interpreting practice and research within dialogue interpreting studies (Hale, 2007; Pöch-
hacker and Shlesinger, 2005). Similar to other contexts of public services, healthcare pro-
viders are increasingly challenged by linguistic and cultural barriers. As societies diversify
due to various migration patterns, healthcare providers encounter an increasing number of
patients with whom they share neither a common language nor culture (Boujon et al., 2018,
50). Furthermore, the transition towards a patient-centred approach (Schinkel et al., 2019)
presents additional challenges for healthcare providers who are dealing with a more diverse
patient population (Fiedler et al., 2022).
Finding solid ways to bridge linguistic and cultural boundaries is essential in healthcare,
since success or failure of the communication may become a matter of life or death (Ng
and Crezee, 2020). As Anazawa et al. (2012, 1) point out, accuracy of interpretation is ‘the
most critical component of safe and effective communication’ between healthcare providers
and patients. This is why researchers contend that the potential risk of using unqualified
interpreters to work in such high-stakes settings cannot be measured (Hedding and Kauf-
man, 2012) and call for high-quality interpreting standards. This necessity is further under-
scored by the existence of diverse specialised fields within healthcare, each with distinct
requirements and terminology, such as speech pathology, psychotherapy, gynaecology, and
mental health (Dean and Pollard, 2011; Ng and Crezee, 2020).
Hence, it comes as no surprise that academic studies have confirmed the effectiveness
of employing professional interpreters in order to enhance the health and satisfaction of
patients with no or low proficiency in the societal language (Ribera et al., 2008). Employing
professional interpreters has been linked to an increased reception of preventive advice and
prescriptions, coupled with a reduced reliance on emergency consultations (Jacobs et al.,
2004). Moreover, this practice may also prevent ethical and legal problems, such as issues surrounding informed consent (Bot, 2013). However, it must be noted that, as Schouten et al. (2020) argue, the
use of professional interpreters may also pose challenges, in terms of a simplistic vision of
their role or a lack of awareness about the complexity of healthcare encounters. Besides this, the understanding of who is considered to be a 'professional' healthcare interpreter
also varies from one country to another (Bischoff and Grossmann, 2006).
Although empirical evidence supports the use of professional interpreters, various societal
factors, such as a decrease in government subsidies and a lack of regulations (De Boe, 2015;
Phelan, 2012), pose obstacles to their widespread employment. Moreover, the supply of qual-
ified interpreters is insufficient to meet the increasingly varied linguistic demand (González
Rodríguez and Spinolo, 2017, 242). While Deaf individuals have identified healthcare settings
as the most crucial context requiring interpreter services (De Meulder et al., 2021; Haualand,
2010), these settings also pose the greatest challenge in terms of securing the use of quali-
fied interpreters (NIEC, 2008, in Swabey et al., 2017). In addition, healthcare providers are
not always ready to employ professional interpreters (whether remote or on-site). They may
instead choose to ‘get by’ without interpreting services, even though professional interpreting
services are readily available (Fiedler et al., 2022; Gutman et al., 2020).
Such circumstances often drive healthcare providers to seek alternative options for
surmounting language barriers. This trend is increasingly reinforced by technological
advancements. In light of the ongoing digitisation of our societies, healthcare services have
largely embraced technology in their communication with patients over the last decades.
This so-called ‘telemedicine’, which is also being referred to as ‘e-health’ or ‘telehealth’
(WHO-ITU, 2022),1 encompasses a broad spectrum of technological support. This ranges
from traditional technologies like telephone and email to more recent innovations, such
as online health portals, through which patients have electronic access to their healthcare
information, and video-mediated consultations. When such resources are accessed through
mobile communication devices, this has been referred to as ‘m-Health’, or ‘mHealth’
(Sweileh et al., 2017). Nowadays, (monolingual) video-mediated consultations are being
considered as ‘at least a partial solution to the complex challenges of delivering healthcare
to an aging and increasingly diverse population’ (Greenhalgh et al., 2016, 1). Moreover,
teleconsultations are often encouraged by authorities and decision-makers to improve
access to specialised services for isolated patients (Esterle and Mathieu-Fritz, 2013).
In the same way, technology-mediated HI in the form of telephone-based or video-based
interpreting has become widespread (Farag and Meyer, 2024, 2). Braun and Taylor (2012)
referred to these modalities of interpreting as a case of ‘dual mediation’, indicating that the
interaction between individuals is mediated by an interpreter, while the interaction itself
is, in turn, mediated by technology. Aside from using technology to support HI, relatively
recently, a growing use of technology-supported tools has been observed in healthcare com-
munication with non-native patients (Sweileh et al., 2017; Thonon et al., 2021). These
tools – aimed at replacing human interpreters – include generic machine translation (MT)
tools, such as Google Translate (Patil and Davies, 2014; Vieira et al., 2021). There are also
dedicated care applications that generate pre-established medical translations, some of which
can be activated by automatic speech recognition (ASR) (Van Straaten et al., 2023). Such
technology-driven tools exist next to more traditional solutions, such as multilingual guides
and brochures (Pokorn et al., 2024; Thonon et al., 2021; Van Straaten et al., 2023). Com-
pared with professional interpreters, whether on-site or remote, technology-supported tools
raise issues regarding quality and ethics (Braun et al., 2023). Yet whereas video-mediated
(e.g. Braun et al., 2023; De Cotret et al., 2020; De Boe et al., 2024) note, the outbreak of
this worldwide health crisis contributed extensively to shifting the balance from on-site interpreting to RI in general, as governments were obliged to comply with safety protocols.
Although the adoption of VMI greatly accelerated during the COVID-19 pandemic, it is
worth noting that some countries’ governments had actively promoted VMI even before
the health crisis. Particular examples are Denmark (De Cotret et al., 2020) and Norway
(Hansen, 2021). However, VMI in particular seems to have benefited from the worldwide
crisis. For example, in Australia, a record use of VRI was observed in 2020, at the expense
of TI (Bachelier and Orlando, 2024, 82). Other examples are Belgium and the Nether-
lands, where the pandemic also clearly boosted the market share of VMI3 (Stengers
et al., 2023; Van Straaten et al., 2023). Additionally, there were alterations in the methods
through which VMI was delivered. As a result of social distancing, an increasing number
of VMI services needed to be carried out via a three-point configuration, with each of the
participants located in separate places. Up to that time, the most common type of VMI had
been the configuration in which the primary participants were together in one location and
the interpreter in a separate one (Braun et al., 2023; Verhaegen, 2023).
Besides this, during the pandemic, telehealth platforms had to be adapted incredibly quickly
to integrate VRI (Bachelier and Orlando, 2024). This was the case across Australia, Europe,
and the Middle East (Almahasees et al., 2024). Taking the example of Australia, Bachelier
and Orlando (2024) show that the government-funded telehealth platforms Telehealth and
Healthdirect competed with the generic platforms Zoom, Microsoft Teams, and Cisco Webex
to offer VRI services. In the United States, the inclusion of VRI in telehealth platforms had
already become increasingly common over the last decades. This trend is illustrated by a
growing number of private companies partnering up to offer broader e-health services, such
as InTouchHealth, a telehealth platform, and InDemand Interpreting, a technology-abled
medical interpreting company. Together, they provide ‘virtual care networks’, offering ‘solu-
tions and services to support access and delivery of high-quality clinical care to any patient at
any time while reducing the overall cost of care’ (Teladoc Health, 2018). According to Corpas
Pastor and Sánchez-Rodas (2021, 10), these e-health systems are ‘novel, speedy, low-cost solu-
tions to communications needs in hospitals and healthcare centres’ and rapidly replace ‘tra-
ditional forms of interpreting’. These systems also increasingly integrate interpreter-replacing
communication applications (see Section 14.4) and usually specialise in either spoken lan-
guage or signed language distance interpreting services.
(1) It is widely accepted that spoken language interaction includes important nonver-
bal elements of communication (e.g., eye gaze, gestures, etc.), and (2) the evolution of
technology means it has become much easier to interact via video.
(Skinner et al., 2018, 13)
However, despite these plausible advantages, as well as the boost by the pandemic and
some countries’ government encouragement, for the moment, it remains unclear to what
251
The Routledge Handbook of Interpreting, Technology and AI
extent the claim that VMI has begun to take over from TI (e.g. Braun et al., 2023) can actually
be confirmed by current practices in healthcare. Since the organisation of PSI services is
not centralised and is, in many cases, provided by commercial suppliers, it is extremely hard
to obtain a comparative overview of numbers indicating the actual share of RI services
in PSI. The same applies to the distribution of VMI and TI within RI, or their use in the
healthcare sector as part of the broader PSI domain. Hence, our understanding is limited to
fragmented insights gleaned from individual countries or even regions, with numbers often
applying to the broader field of PSI. As Bachelier and Orlando (2024, 82) argue regarding
the situation in Australia, although VMI has great potential in our audiovisually oriented
digital era, ‘performing through VRI still remains quite a novel and difficult exercise for
interpreters and medical staff, and the use of this modality is not as widespread as one
could imagine’. This is also illustrated by surveys among healthcare providers from other
countries. For example, a report on technology-based solutions in healthcare settings in
the Netherlands indicated that TI is far better known and used than VRI (Van Straaten
et al., 2023). Another example comes from Norway, where TI clearly gained ground at the
expense of on-site interpreting over the last years. However, the share of VRI, remained
extremely limited in 2021 and 2022 (Imdi 2021, 2022) for all types of interpreting services.
Nevertheless, not all examples follow this trend. The numbers provided for the city of
Brussels4 show that, since its introduction in 2020, the number of VMI services in PSI has
been growing throughout 2021 and 2022, while the number of TI services remained more
or less stable. At the same time, the number of on-site interpreting services clearly dropped.
This indicates that VMI has gained ground, at the expense of on-site interpreting services,
rather than of TI (Sociaal Vertaalbureau, 2022). It must be mentioned, however, that these
numbers are not limited to healthcare settings but represent all PSI services.
The decision to opt for TI, VMI, or on-site interpreting is also closely linked to financial
matters, market-related developments, and users’ willingness to work with the different types
of interpreting. On the one hand, as Lion et al. (2015) observed in their clinical study com-
paring TI and VRI in a paediatric hospital, charges for VRI services were double those of TI.
On the other hand, as Yabe (2020) points out, VRI is, in turn, cheaper than on-site interpret-
ing. As Ozolins (2011, 34) explains, the rise of TI from the mid-1990s onwards was very
much linked to a steep drop in the cost of telephony. Similarly, the ongoing advancements
in videotelephony are set to drive cost-efficiencies, potentially influencing the frequency of its
adoption. Verrept et al. (2018, 59), who investigated the implementation of video-mediated
intercultural mediation in Belgian hospitals from a health-sociological perspective, also noted
a limited willingness of hospital managers to implement VRI. This was due to increased costs
and to logistical issues, which were considered barriers to efficient structural application. In the same
vein, recent medical research comparing TI and VRI demonstrated that VRI generated greater
user satisfaction. However, a low willingness by healthcare providers to work with it (Fiedler
et al., 2022) has also been observed. Besides this, due to the lack of standardised (VR)I plat-
forms in many countries, interpreters need to be trained on various systems (Bachelier and
Orlando, 2024, 93). This may also be an impediment to larger-scale adoption of VRI.
from various disciplines. These include medicine, sociology, and interpreting studies, based
on a wide array of research designs. Methodologies span from large-scale randomised tri-
als and clinical surveys to more focused case studies. These draw on authentic materials or
simulations, often analysed from a conversation analytical and/or interactional perspective.
Most of the research, although not all, takes a comparative approach, with one or more
distance interpreting methods being contrasted with on-site interpreting. In some medical
studies, these comparisons also investigate additional options for crossing language barri-
ers, such as bilingual healthcare providers (e.g. Crossman et al., 2010) or informal inter-
preters (e.g. Flores et al., 2012).
Despite the diversity in approaches, the consensus among most studies is that dis-
tance interpreting presents numerous challenges, situated across various themes associ-
ated with remote healthcare interpreting. Issues that have been addressed by research into
technology-mediated HI from the various disciplines coincide with the larger categories
identified by Li (2022) for VMI across contexts and disciplines. These include (1) cost of
time, financial cost, and benefits; (2) physical and psychological costs; (3) users’ acceptance
and satisfaction; and (4) communication quality. Of these themes, user acceptance and, in
particular, satisfaction with the interpreting method stand out across research from both
medical studies and interpreting studies as a common denominator. However, exceptions
aside (e.g. Greenhalgh et al., 2016; Saint-Louis et al., 2003), as Pöchhacker (2006) claims,
medical studies tend to focus on satisfaction with quality of care (e.g. Paras et al., 2002),
alongside cost-efficiency (e.g. Masland et al., 2010), whereas research from interpreting stud-
ies focuses more on user satisfaction with communication and interpreting processes. Sys-
tematic overviews of findings from medical studies on spoken language RI in healthcare
can be found in Azarmina and Wallace (2005) and Joseph et al. (2017), whereas a detailed
synthesis is provided in De Boe (2023, 24–32) and Braun et al. (2023, 91–94). An overview
of medical studies on signed language distance interpreting in healthcare was conducted by
Rivas Velarde et al. (2022). The study is highly critical of the state of the art concerning this topic and emphasises that the Global South is underrepresented in the research. It also stresses that, while VRI has the potential to overcome communication barriers, it is not a 'quick fix to overcome accessibility issues', and that the views and needs of the Deaf and Hard of Hearing community should be considered very seriously when developing this technology (Rivas Velarde et al., 2022).
Divergences in satisfaction results between medical studies and interpreting studies may
also be related to research design and focus. While the former tends to yield predomi-
nantly positive outcomes in support of distance interpreting, satisfaction levels in the latter
are highly varied and appear to be influenced significantly by individual preferences and
configurations. This makes it difficult to provide a concise summary. Interpreting studies
research typically adopts a narrower focus compared to medical studies, often scrutinising
communicative events at a micro-level. Consequently, the outcomes of interpreting studies research are less
suitable for broad generalisations.5
In interpreting studies research on remote healthcare interpreting, the link between
communicative challenges and technical conditions is logically a common thread. This is
because such conditions can be extremely unfavourable in PSI settings due to issues with connec-
tivity, hardware and software, inferior sound quality, etc. In surveys conducted among
public sector interpreters, poor sound quality and other technical issues continue to be a
source of inconvenience and frustration (Corpas Pastor and Gaber, 2020). In conference
settings, standards of practice for distance interpreting were developed as early as 1998.
describes difficulties for the interpreter to take control of turn duration in a three-point TI
constellation in service calls, a large number of which were carried out in a healthcare context.
By contrast, Amato also reports on explicit efforts by the interpreter to coordinate the
conversation (Amato, 2018, 86). However, as Farag and Meyer (2024, 2) observe, despite
these challenges and a growing use of RI, the linguistic and communicative requirements of
remote dialogue interpreting remain underexplored.
disadvantage of using MT. TI and the voice translation app ‘SayHi’ were rated higher than
Google Translate, although the participants indicated that experiences with various digital
tools varied greatly and the quality of the translation often depended on the language pairs
used (Van Straaten et al., 2023, 5).
Despite the exponential growth in use of digital tools as well as interest from the research
community in various aspects of MT (e.g. Kenny, 2022), so far, few studies have examined
the impact of the use of MT in healthcare communication. One of the few studies that
exist is an early small-scale experiment by Patil and Davies (2014), who translated ten
medical phrases into 26 languages. They found that Google Translate achieved only a 57.7%
accuracy rate for translating medical phrases and should therefore not be relied upon for
crucial medical communication. Nevertheless, as the study lacks clarity on methodological
procedures, and since MT has evolved considerably in the meantime, Patil and Davies's (2014)
results have little relevance today.
A more recent overview study of the use of MT in medicine and law by Vieira et al.
(2021, 1515) confirms that MT errors may indeed pose serious dangers in high-risk envi-
ronments. The authors also note that little is known about the nature of the risks involved,
or of the broader effects of ‘uninformed’ use of MT. In addition, Vieira et al. (2021) show
that research often fails to consider the complexities of language and translation. They also
draw attention to the risk that the use of MT reinforces social inequalities and jeopardises
certain communities. In relation to this, Boujon et al. (2018) point out that MT apps do not
always provide languages that are highly relevant in healthcare, such as Tigrinya or sign languages, and are not easily adapted to include additional languages. Several researchers also raise ethical concerns about the lack of data protection guarantees in such tools (Boujon et al., 2018; Braun et al., 2023).
One way to remedy these objections is to use so-called phrasebook apps (Braun et al.,
2023), ‘phraselators’ (Boujon et al., 2018), or ‘multilingual phrasebooks’ (Pokorn et al.,
2024). These tools are tailored by medical professionals for medical diagnostic scenarios
and comprise a collection of pre-translated, domain-specific standard sentences, including
questions and instructions. As Boujon et al. (2018, 51) argue, unlike MT, phrasebook apps offer the benefit of providing reliable translations and are more straightforward to adapt to new languages or domains. Nevertheless, due to the limited set of sentences and translations offered by these tools, users must navigate through menus or keywords to find a precise sentence, which proves impractical. To tackle this, Boujon's team from Geneva Uni-
versity Hospitals developed a more sophisticated tool known as ‘BabelDr’. This ASR-based
tool was compared against a traditional phraselator without ASR (MediBabble). The find-
ings indicate that the new tool enabled participants to gather information more swiftly and
effortlessly. Furthermore, it enhanced interaction by enabling users to engage in more natu-
ral conversations compared to using a traditional phrasebook (Boujon et al., 2018, 63).
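The principle behind such ASR-driven phrasebook tools can be illustrated with a minimal sketch. It is not based on the actual BabelDr or MediBabble implementations; the phrase list, the similarity-based matching, and the function names are illustrative assumptions only. The recognised utterance is compared against a small set of pre-translated diagnostic questions, and the closest match is returned together with its vetted translation:

from difflib import SequenceMatcher

# Hypothetical pre-translated, domain-specific sentences (source sentence -> vetted translation).
PHRASEBOOK = {
    "where does it hurt": "Wo haben Sie Schmerzen?",
    "do you have a fever": "Haben Sie Fieber?",
    "are you taking any medication": "Nehmen Sie Medikamente ein?",
}

def best_match(transcript: str, threshold: float = 0.6):
    """Return (source sentence, vetted translation, score) for the closest phrasebook entry,
    or None if no entry is similar enough to the recognised utterance."""
    transcript = transcript.lower().strip("?!. ")
    scored = [(SequenceMatcher(None, transcript, source).ratio(), source) for source in PHRASEBOOK]
    score, source = max(scored)
    if score < threshold:
        return None  # no reliable match: ask the speaker to rephrase or fall back to a human interpreter
    return source, PHRASEBOOK[source], score

if __name__ == "__main__":
    # e.g. the ASR transcript of the clinician's spoken question
    print(best_match("where exactly does it hurt"))

Unlike open-ended MT, such a tool can only ever output one of its vetted translations, which is the source of both the reliability and the inflexibility noted above.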
Similar tools are provided by many commercial companies, for example, Global Talk
Care.7 These are being introduced to the market at such an accelerated pace that it would
be impossible to provide an exhaustive overview of the industry. Nevertheless, all these
apps share common features. They may integrate several tasks, are often managed by large
American companies, and are predominantly the subject of usability studies (see review by
Thonon et al., 2021). While studies offer comprehensive analyses of usability and function-
alities, a gap remains in our understanding of how users interact with these diverse tools and
their impact on language mediation during healthcare interactions (Braun et al., 2023, 101).
14.5 Conclusion
This chapter set out to discuss current common practices of technology-based solutions
for bridging language gaps in healthcare communication. These include remote healthcare
interpreting by humans, which has seen a considerable increase in the last decades, and
technological tools replacing interpreters, which is a recent phenomenon. Meanwhile, TI
has become an established practice in healthcare. However, the use of VMI is only slowly
catching up in healthcare settings, despite its steep increase during the COVID-19 pan-
demic. Nevertheless, its strong potential is being embraced in our audiovisual era, and users
prefer VMI over TI. It must be acknowledged that obtaining precise figures
on the use of distance interpreting in healthcare, including respective shares of TI and VMI,
remains a complex task.
Although TI has frequently been designated as an ‘inferior type of interpreting’ (Ozolins,
2011) and considered ‘unsatisfactory’ (Lion et al., 2015), in many countries, TI services in
healthcare now outnumber on-site interpreting services. While both medical and interpret-
ing studies consistently indicate that TI is generally the least-preferred option compared to
on-site or VMI solutions, it seems to be valued more highly than technological
tools that replace human interpreters. This highlights the significance of human touch, trust,
and rapport in healthcare and other PSI interactions, in which ‘interaction, non-verbal com-
munication and language paralinguistic information . . . are of paramount importance, and
ethics and confidentiality issues are at stake’ (Corpas Pastor and Gaber, 2020, 58).
Similarly, the medical, legal, ethical, and economic arguments for using distance inter-
preting in healthcare (Kletecka-Pulker and Parrag, 2015) constitute excellent reasons for
pleading against technology-generated, non-human interpreting tools in such critical set-
tings. Given the current low levels of accuracy in MT for some language pairs that are rel-
evant for healthcare communication, the risk of jeopardising patients' health should not be underestimated. As far as dedicated care apps are concerned, although they seem to pose
a lower risk in terms of accuracy, the impact on interactional dynamics and therapeutic
relationships between healthcare providers and patients remains unclear for the moment.
Although such issues have been addressed to a larger extent in technology-mediated
HI, they are also still in need of further mapping. To illustrate, the more recently emerged
three-point VRI configurations, which became more common during the global healthcare
crisis, have not yet been thoroughly investigated. Although the different RI configurations
share some characteristics in terms of their technical constraints, each configuration also
poses its own specific challenges (Skinner et al., 2018, 19). The effect of these emerging
configurations on the coordination of the interaction and doctor–patient rapport has not
yet been fully explored (Braun et al., 2023; Verhaegen, 2023).
In addition, while publications describing interpreters’ working conditions since the pan-
demic are slowly emerging, we currently have little knowledge about the structural impact
of technology support on HI quality, working conditions, or cognitive processes involved in
HI. However, a lack of awareness of such impact is being reported, for example, in the form
of healthcare providers’ reduced willingness and readiness to adopt professional distance
interpreting (Fiedler et al., 2022; Gutman et al., 2020). This reflects a more generalised lack
of awareness of the importance of using professional language mediation, for example, for
Deaf persons in healthcare (Middleton et al., 2010; Iezzoni et al., 2004).
Further issues that need to be addressed urgently pertain to ethical aspects of technol-
ogy use in healthcare communication. So far, little attention has been paid to ethical or
ecological implications of the use of technology. Data security and an increasing ‘digital
divide’ (Valero-Garcés, 2018) are two such examples. Although these issues are discussed
in some more general research (e.g. Lázaro-Gutiérrez et al., 2021), ethical aspects do not
seem to be the main focus of any research into technology-based communication in health-
care settings for the moment. However, with the current advancement of AI-supported tools
(Braun et al., 2023), ethical matters are likely to be of growing importance, and clearer poli-
cies need to be developed. Currently, situations vary from one country to another, and even
between institutions within the same country. For example, in the UK, official guidance
does not endorse the use of MT in primary care, and medical advisers warn against its
use in everyday clinical practice (Braun et al., 2023). In contrast, in other countries, such as
the Netherlands, regulations concerning the use of MT are being investigated but currently
remain unclear (Van Straaten et al., 2023). In the United States, some hospitals prohibit
healthcare clinicians from using non-native-language speakers for medical communication
unless their proficiency has been validated (Lion et al., 2015).
Following Vieira et al. (2021) in their approach to the use of MT in critical settings, we
must conclude that more extensive interdisciplinary research is crucial for understanding
the complexities of technology-based cross-linguistic communication in healthcare. How-
ever, integration between medical and interpreting studies remains limited. To date, each of
these domains has investigated technology-based solutions for bridging the language gap in
healthcare settings largely on its own. Whereas interpreting studies tend to pay atten-
tion to results generated by medical studies, this knowledge flow seems to work only in
one direction. Except for a few studies (e.g. Greenhalgh et al., 2016), medical studies tend
to miss out on the opportunity of including a more linguistically and communicatively ori-
ented perspective, which can be provided by interpreting studies (Davitti, 2019b). Interpret-
ing studies, in turn, could greatly benefit from closer cooperation with medical studies by
gaining access to larger samples of users and solid research environments. A higher level of
interdisciplinarity is therefore urgently needed to further explore the dynamics of human–
machine interaction, user experience, and the impact of technology on cognitive processes
in healthcare settings. Examples include investigations into user experience, that is, how
the different groups involved in healthcare experience technology support (Braun et al.,
2023). Looking ahead, there is an anticipated increase in the use of technology to support
cross-linguistic communication in healthcare (Kerremans et al., 2018, 766), with advance-
ments in artificial intelligence driving the development of new health apps and tools. This
evolving landscape of technology in healthcare presents dynamic opportunities for further
exploration, ultimately shaping the future of interpreting and other types of language
mediation in healthcare settings.
Notes
1 The terms ‘e-health’, ‘telehealth’, and ‘telemedicine’ are often used interchangeably in the literature
and correspond to what the World Health Organization defines as telemedicine, that is, ‘delivery of
health care services, where patients and providers are separated by distance’ (WHO-ITU, 2022).
2 Personal communication of the author with hospital employees, March 2024.
3 When publishing figures, agencies often do not distinguish between different configurations of
VMI. Where it is unclear whether figures apply to VRI or other types of VMI, the overarching
term VMI is used.
4 www.sociaalvertaalbureau.be/wp-content/uploads/2023/05/JAARVERSLAG-NL-2022.docx-1.pdf
(accessed 4.4.2025).
5 Another important difference between the medical studies and interpreting studies, pointed out by
Bischoff and Grossmann (2006, 34), is that the former are generally carried out in the United States,
where non-native-speaking patients are predominantly Hispanophone, whereas IS research outside
the United States focuses on migrants, representing a multitude of languages and cultures.
6 AIIC Reference Guide for Remote Simultaneous Interpreting. https://aiic.ch/wp-content/uploads/
2020/05/aiic-ch-reference-guide-to-rsi.pdf (accessed 4.4.2025).
7 www.globaltalk.be/care-app/ (accessed 4.4.2025).
References
Alley, E., 2012. Exploring Remote Interpreting. International Journal of Interpreter Education 4(1),
111–119.
Almahasees, Z., Al-Natour, M., Mahmoud, S., Aminzadeh, S., 2024. Bridging Communication Gaps
in Crisis: A Case Study of Remote Interpreting in the Middle East During the COVID-19 Pan-
demic. World Journal of English Language 14(2), 462–470. URL https://doi.org/10.5430/wjel.
v14n2p462
Amato, A., 2018. Challenges and Solutions: Some Paradigmatic Examples. In Amato, A., Spinolo, N.,
González Rodríguez, M.J., eds. Handbook of Remote Interpreting: Research Report Shift in Oral-
ity Erasmus + Project: Shaping the Interpreters of the Future and of Today, 79–98. URL www.
shiftinorality.eu/es/resources/2018/05/11/shift-handbook-remote-interpreting
Amato, A., 2020. Interpreting on the Phone: Interpreter’s Participation in Healthcare and Medical
Emergency Service Calls. inTRAlinea Special Issue: Technology in Interpreter Education and Prac-
tice. URL www.intralinea.org/specials/article/2519
Anazawa, R., Ishikawa, H., Kiuchi, T., 2012. The Accuracy of Medical Interpretations: A Pilot Study
of Errors in Japanese-English Interpreters During a Simulated Medical Scenario. Translation &
Interpreting 4(1), 1–20.
Angelelli, C.V., 2014. Interpreting in the Healthcare Setting: Access in Cross-Linguistic Communica-
tion. In Hamilton, H., Chou, S., eds. The Routledge Handbook of Language and Health Com-
munication. Routledge, London, 573–585.
Azarmina, P., Wallace, P., 2005. Remote Interpretation in Medical Encounters: A Systematic Review. Jour-
nal of Telemedicine and Telecare 11, 140–145. URL https://doi.org/10.1258/1357633053688679
Bachelier, K., Orlando, M., 2024. Building Capacity of Interpreting Services in Australian Health-
care Settings: The Use of Video Remote During the COVID-19 Pandemic. Media and Intercul-
tural Communication: A Multidisciplinary Journal 2(1), 80–96. URL https://doi.org/10.22034/
MIC.2024.446261.1015
Bischoff, A., Grossmann, F., 2006. Telefondolmetschen im Spital. Universität Basel, Institut für
Pflegewissenschaft, Basel.
Bot, H., 2013. Taalbarrières in de zorg. Van Gorcum, Utrecht.
Boujon, V., Bouillon, P., Spechbach, H., Gerlach, J., Strasly, I., 2018. Can Speech-Enabled Phrasela-
tors Improve Healthcare Accessibility? A Case Study Comparing Babeldr with Medibabble for
Anamnesis in Emergency Settings. In Proceedings of the 1st Swiss Conference on Barrier-free Com-
munication, Winterthur, 50–65. URL https://doi.org/10.21256/zhaw-3000
Braun, S., 2019. Technology in Interpreting. In O’Hagan, M., ed. Routledge Encyclopedia of Transla-
tion Studies. Routledge, London, 271–288. URL https://doi.org/10.4324/9781315311258-19
Braun, S., Al Sharou, K., Temizöz, Ö., 2023. Technology Use in Language-Discordant Interpersonal
Healthcare Communication. In Wadensjö, C., Gavioli, L., eds. The Routledge Handbook of Public
Service Interpreting. Routledge, London, 89–105. URL https://doi.org/10.4324/9780429298202
Braun, S., Taylor, J.L., 2012. AVIDICUS Comparative Studies – Part I: Traditional Interpreting and
Remote Interpreting in Police Interviews. In Braun, S., Taylor, J.L., eds. Videoconference and
Remote Interpreting in Criminal Proceedings. Intersentia, Antwerp, 99–117.
Brunson, J.L., 2018. The Irrational Component in the Rational System: Interpreter Talk About Their
Motivation to Work in Relay Services. In Napier, J., Skinner, R., Braun, S., eds. Here or There:
Research on Interpreting via Video Link. Gallaudet University Press, Washington, DC, 39–60.
URL https://doi.org/10.2307/j.ctv2rh2bs3.5
Corpas Pastor, G., Gaber, M., 2020. Remote Interpreting in Public Service Settings: Technology, Per-
ceptions and Practice. SKASE Journal of Translation and Interpretation 13(2), 58–78. URL http://
hdl.handle.net/2436/624259
Corpas Pastor, G., Sánchez Rodas, F., 2021. Now What? A Fresh Look at Language Technolo-
gies and Resources for Translators and Interpreters. In Lavid-López, J., Maíz-Arévalo, C.,
Zamorano-Mansilla, J., eds. Corpora in Translation and Contrastive Research in the Digital Age.
John Benjamins, Amsterdam, 23–48. URL https://doi.org/10.1075/btl.158.01cor
Crossman, K.L., Wiener, E., Roosevelt, G., Bajaj, L., Hampers, L.C., 2010. Interpreters: Telephonic,
In-Person Interpretation and Bilingual Providers. Pediatrics 125(3), 631–638. URL https://doi.
org/10.1542/peds.2009-0769
Davitti, E., 2019a. Methodological Explorations of Interpreter-Mediated Interaction: Novel
Insights from Multimodal Analysis. Qualitative Research 19(1), 7–29. URL https://doi.
org/10.1177/1468794118761492
Davitti, E., 2019b. Healthcare Interpreting. In Baker, M., Saldanha, G., eds. Routledge Encyclopedia
of Translation Studies. Routledge, London.
Dean, R.K., Pollard, R.Q. Jr., 2011. Context-Based Ethical Reasoning in Interpreting: A Demand
Control Schema Perspective. Interpreter and Translator Trainer 5(1), 155–182. URL https://
doi.org/10.1080/13556509.2011.10798816
De Boe, E., 2015. The Influence of Governmental Policy on Public Service Interpreting in the Neth-
erlands. The International Journal for Translation & Interpreting Research 7(3), 166–184. URL
https://doi.org/10.12807/ti.107203.2015.a12
De Boe, E., 2021. Management of Overlapping Speech in Remote Healthcare Interpreting. The Inter-
preter’s Newsletter 26, 137–155. URL https://doi.org/10.13137/2421-714X/33268; www.open
starts.units.it/dspace/handle/10077/2119
De Boe, E., 2023. Remote Interpreting in Healthcare Settings. Peter Lang, London. URL https://
doi.org/10.3726/b18200
De Boe, E., 2024. Synchronization of Interaction in Healthcare Interpreting by Video Link and Telephone. In
De Boe, E., Vranjes, J., Salaets, H., eds. Interactional Dynamics in Remote Interpreting: Micro-Analytical
Approaches. Routledge, London, 22–41. URL https://doi.org/10.4324/9781003267867.
De Boe, E., Vranjes, J., Salaets, H., 2024. About the Need for Micro-Analytical Investigations in
Remote Dialogue Interpreting. In De Boe, E., Vranjes, J., Salaets, H., eds. Interactional Dynamics
in Remote Interpreting: Micro-Analytical Approaches. Routledge, London, 1–21. URL https://doi.
org/10.4324/9781003267867
De Cotret, F.R., Beaudoin-Julien, A.-A., Leanza, Y., 2020. Implementing and Managing Remote Pub-
lic Service Interpreting in Response to COVID-19 and Other Challenges of Globalization. Meta
65(3), 618–642. URL https://doi.org/10.7202/1077406ar
Defrancq, B., Corpas Pastor, G., 2023. Introduction. In Corpas Pastor, G., Defrancq, B., eds. Inter-
preting Technologies: Current and Future Trends. John Benjamins, Amsterdam, 1–6. URL https://
doi.org/10.1075/ivitra.37.intro
De Groot, E., Fransen, L., Van Dam, F., Pinckaers, E., Berkhout, B., 2022. Tolken in de zorg: Een
overzicht van huidige inzet, financiering en knelpunten [Research Rapport]. Berenschot, Utrecht.
De Meulder, M., Pouliot, O., Gebruers, K., 2021. Remote Sign Language in Times of COVID-19
[Research Report]. Kenniscentrum Gezond en Duurzaam Leven, Hogeschool Utrecht, Utrecht.
www.hu.nl/onderzoek/publicaties/remote-sign-language-interpreting-in-times-of-covid-19
(accessed 1.11.2024).
Esterle, L., Mathieu-Fritz, A., 2013. Teleconsultation in Geriatrics: Impact on Professional Practice.
International Journal of Medical Informatics 82(8), 684–695. URL http://dx.doi.org/10.1016/j.
ijmedinf.2013.04.006
Farag, F., Meyer, B., 2024. Coordination in Telephone-Based Remote Interpreting. Interpreting 26(1),
80–130. URL https://doi.org/10.1075/intp.00097.far
Fiedler, J., Pruskil, S., Wiessner, C., Zimmermann, T., Scherer, M., 2022. Remote Interpreting in Pri-
mary Care Settings: A Feasibility Trial in Germany. BMC Health Services Research 22(1). URL
https://doi.org/10.1186/s12913-021-07372-6
Flores, G., Abreu, M., Barone, C.P., Bachur, R., Lin, H., 2012. Errors of Medical Interpretation
and Their Potential Clinical Consequences: A Comparison of Professional Versus Ad Hoc Versus
No Interpreters. Annals of Emergency Medicine 60(5), 545–553. URL https://doi.org/10.1016/j.
annemergmed.2012.01.025
González Rodríguez, M.J., Spinolo, N., 2017. Telephonic Dialogue Interpreting. In Niemants, N.,
Cirillo, L., eds. Teaching Dialogue Interpreting. John Benjamins, Amsterdam, 242–257. URL
https://doi.org/10.1075/btl.138.12gon
Gracia-García, R.A., 2002. Telephone Interpreting: A Review of Pros and Cons. In Brennan, S., ed.
Proceedings of the 43rd Annual Conference. American Translators Association, Alexandria, VA,
195–216.
Greenhalgh, T., Vijayaraghavan, S., Wherton, J., Shaw, S., Byrne, E., Campbell-Richards, D., Bhat-
tacharya, S., Hanson, P., Ramoutar, S., Gutteridge, C., Hodkinson, I., Collard, A., Morris, J.,
2016. Virtual Online Consultations: Advantages and Limitations (VOCAL) Study. BMJ Open 6,
e009388. URL https://doi.org/10.1136/bmjopen-2015-009388
Gutman, C.K., Klein, E.J., Follmer, K., Brown, J.C., Ebel, B.E., Lion, K.C., 2020. Deficiencies in
Provider-Reported Interpreter Use in a Clinical Trial Comparing Telephonic and Video Interpre-
tation in a Pediatric Emergency Department. Joint Commission Journal on Quality and Patient
Safety 46(10), 573–580. URL https://doi.org/10.1016/j.jcjq.2020.08.001
Hale, S., 2007. Community Interpreting. Palgrave Macmillan, Hampshire/Basingstoke.
Hansen, J.P.B., 2020. Invisible Participants in a Visual Ecology: Visual Space as a Resource for Organ-
izing Video-Mediated Interpreting in Hospital Encounters. Social Interaction. Video-Based Studies
of Human Sociality 3(3), 1–25. URL https://doi.org/10.7146/si.v3i3.122609
Hansen, J.P.B., 2021. Video-Mediated Interpreting: The Interactional Accomplishment of Interpret-
ing in Video-Mediated Environments (Unpublished doctoral thesis). University of Oslo, Oslo.
Hansen, J.P.B., 2024. Interpreters’ Repair Initiators in Video-Mediated Environments. In De Boe,
E., Vranjes, J., Salaets, H., eds. Interactional Dynamics in Remote Interpreting: Micro-Analytical
Approaches. Routledge, London, 91–112. URL https://doi.org/10.4324/9781003267867
Hansen, J.P.B., Svennevig, J., 2021. Creating Space for Interpreting Within Extended Turns at Talk.
Journal of Pragmatics 182, 144–162. URL https://doi.org/10.1016/j.pragma.2021.06.009
Haualand, H., 2010. Provision of Videophones and Video Interpreting for the Deaf and Hard
of Hearing: A Comparative Study of Video Interpreting (IV) Systems in the US, Norway and
Sweden. The Swedish Institute of Assisted Technology. www.independentliving.org/files/
haualand20100924video-interpreting-systems.pdf (accessed 23.2.2024).
Havelka, I., 2018. Videodolmetschen im Gesundheitswesen: Dolmetschwissenschaftliche Untersu-
chung eines österreichischen Pilotprojektes. Frank & Timme, Berlin.
Hedding, T., Kaufman, G., 2012. Health Literacy and Deafness: Implications for Interpreter
Education. In Swabey, L., Malcolm, K., eds. In Our Hands: Educating Healthcare Interpret-
ers. Gallaudet University Press, Washington, DC, 164–189. URL https://doi.org/10.2307/j.
ctv2rcnmkt.12
Iezzoni, L.I., O’Day, B.L., Killeen, M., Harker, H., 2004. Communicating About Health Care: Obser-
vations from Persons Who Are Deaf or Hard of Hearing. Annals of Internal Medicine 140(5),
356–362. URL https://doi.org/10.7326/0003-4819-140-5-200403020-00011
Imdi, 2021. Offentlige organers behov for tolking 2021. Faktaark 2021. URL www.imdi.no/
contentassets/c669ebfc896d4fcc847b29b9ea14ae90/faktaark-2021.pdf (accessed 20.2.2024).
Imdi, 2022. Offentlige organers behov for tolking 2022. Faktaark 2022. URL www.imdi.no/
contentassets/261ab2f7b670401797f53d72fd574621/faktaark-2022.pdf (accessed 20.2.2024).
Jacobs, E.A., Shepard, D.S., Suaya, J.A., Stone, E., 2004. Overcoming Language Barriers in Health
Care: Costs and Benefits of Interpreter Services. American Journal of Public Health 94(5), 866–869.
URL https://doi.org/10.2105/ajph.94.5.866
Joseph, C., Garruba, M., Melder, A., 2017. Patient Satisfaction of Telephone or Video Interpreter
Services Compared with In-Person Services: A Systematic Review. Australian Health Review 42(2),
168–177. URL https://doi.org/10.1071/AH16195
Kelly, N., 2008. Telephone Interpreting: A Comprehensive Guide to the Profession. Trafford Publish-
ing, Bloomington, IN.
Kenny, D., ed., 2022. Machine Translation for Everyone: Empowering Users in the Age of Artificial
Intelligence. Language Science Press, Berlin. URL http://doi.org/10.5281/zenodo.6653406
Kerremans, K., De Ryck, L., De Tobel, V., Janssens, R., Rillof, P., Scheppers, M., 2018. Bridging the
Communication Gap in Multilingual Service Encounters: A Brussels Case Study. The European
Legacy 23(7–8), 757–772. URL https://doi.org/10.1080/10848770.2018.1492811
Kletecka-Pulker, M., Parrag, S., 2015. Pilotprojekt Qualitätssicherung in der Versorgung
nicht-deutschsprachiger PatientInnen: Videodolmetschen im Gesundheitswesen (research report).
semanticscholar.org/paper/PASS-International-Is-the-use-of-interpreters-in-Ribera-Hausmann-
Muela/46e06f27509729bb1711780e65c00c7a0c3e87e1 (accessed 3.3.2024).
Rivas Velarde, M., Jagoe, C., Cuculick, J., 2022. Video Relay Interpretation and Overcoming Barri-
ers in Health Care for Deaf Users: Scoping Review. Journal of Medical Internet Research 24(6),
e32439. URL https://doi.org/10.2196/32439
Saint-Louis, L., Friedman, E., Chiasson, E., Quessa, A., Novaes, F., 2003. Testing New Technologies
in Medical Interpreting. Cambridge Health Alliance, Somerville, MA. URL https://icommunity-
health.org/publications/testing-new-technologies-in-medical-interpreting/
Schinkel, S., Schouten, B.C., Kerpiclik, F., Van Den Putte, B., Van Weert, J.C.M., 2019. Perceptions of
Barriers to Patient Participation: Are They Due to Language, Culture, or Discrimination? Health
Communication 34(12), 1469–1481. URL https://doi.org/10.1080/10410236.2018.1500431
Schouten, B.C., Cox, A., Duran, G., Kerremans, K., Banning, L.K., Lahdidioui, A., van den Muijsen-
bergh, M., Schinkel, S., Sungur, H., Suurmond, J., Zendedel, R., 2020. Mitigating Language and
Cultural Barriers in Healthcare Communication: Toward a Holistic Approach. Patient Education
and Counseling 103(12), 2604–2608. URL https://doi.org/10.1016/j.pec.2020.05.001
Skinner, R., Napier, J., Braun, S., 2018. Interpreting via Video Link: Mapping the Field. In Napier,
J., Skinner, R., Braun, S., eds. Here or There: Research on Interpreting via Video Link. Gallaudet
University Press, Washington, DC, 11–39. URL https://doi.org/10.2307/j.ctv2rh2bs3.4
Sociaal Vertaalbureau, 2022. Jaarverslag (Annual Report). URL https://www.sociaalvertaalbureau.be/
wp-content/uploads/2023/06/Jaaroverzicht-2022-NL.png (accessed 1.3.2024).
Stengers, H., Lázaro-Gutiérrez, R., Kerremans, K., 2023. Public Service Interpreters’ Perceptions and
Acceptance of Remote Interpreting Technologies in Times of a Pandemic. In Corpas Pastor, G.,
Defrancq, B., eds. Interpreting Technologies: Current and Future Trends. John Benjamins, Amster-
dam. https://doi.org/10.1075/ivitra.37.05ste
Sultanic, I., 2022. Interpreting in Pediatric Therapy Settings During the COVID-19 Pandemic: Ben-
efits and Limitations of Remote Communication Technologies and Their Effect on Turn-Taking
and Role Boundary. FITISPos International Journal 9(1), 78–101. URL https://doi.org/10.37536/
FITISPos-IJ.2023.1.9.313
Swabey, L., Laurion, R., Patrie, C., Ramirez, R., 2017. Using a Career Lattice to Chart a Path to Compe-
tency in Healthcare Interpreting. Conference of Interpreter Trainers – Out of the Gate, Towards the
Triple Crown: Research, Learn & Collaborate. URL https://citsl.org/using-a-career-lattice-to-chart-
a-path-to-competency-in-healthcare-interpreting/ (accessed 14.2.2024).
Sweileh, W.M., Al-Jabi, S.W., AbuTaha, A.S., Zyoud, S.H., Anayah, F.M.A., Sawalha, A.F., 2017. Bib-
liometric Analysis of Worldwide Scientific Literature in Mobile-Health: 2006–2016. BMC Medical
Informatics and Decision Making 17(1), 72. URL https://doi.org/10.1186/s12911-017-0476-7
Teladoc Health, 2018. InTouch Health & InDemand Interpreting Partnership Enables Clinicians
to Connect with Interpreters. URL https://business.teladochealth.com/newsroom/intouch-health/
indemand-interpreting-partners-with-intouch-health/ (accessed 14.1.2025).
Thonon, F., Perrot, S., Yergolkar, A.V., Rousset-Torrente, O., Griffith, J.W., Chassany, O., Duracin-
sky, M., 2021. Electronic Tools to Bridge the Language Gap in Health Care for People Who Have
Migrated: Systematic Review. Journal of Medical Internet Research 23(5), e25131. URL https://
doi.org/10.2196/25131
Valero-Garcés, C., 2018. Introduction PSIT and Technology: Challenges in the Digital Age. FITISpos
International Journal 5(1), 1–6. URL https://doi.org/10.37536/FITISPos-IJ.2018.5.1.185
Van Straaten, W., Bloem, W., Gilhuis, N., Schipper. E., Boonen, L., 2023. Digitale hulpmiddelen
voor het overkomen van taalbarrières. Equalis Strategy & Modeling, Utrecht. URL https://open.
overheid.nl/documenten/5da7a55d-461b-4277-8c0f-3831baba1390/file (accessed 14.1.2025).
Verhaegen, M., 2023. Exploring Turn-Taking in Video-Mediated Interpreting: A Research Methodol-
ogy Using Eye Tracking. Interpreter’s Newsletter 28, 151–169. URL https://www.openstarts.units.
it/handle/10077/35555
Verrept, H., Coune, I., 2016. Guide for Intercultural Mediation in Healthcare. Federale over-
heidsdienst Volksgezondheid, veiligheid van de voedselketen en leefmilieu - Cel intercul-
turele bemiddeling & beleidsondersteuning, Brussels. URL https://www.health.belgium.be/nl/
gids-voor-de-interculturele-bemiddeling-de-gezondheidszorg (accessed 3.12.2024).
Verrept, H., Coune, I., Van de Velde, J., Baatout, S., 2018. Evaluatie projecten interculturele bemid-
deling via videoconferentie. Research report Federale overheidsdienst Volksgezondheid, veiligheid
15
LEGAL SETTINGS
Jérôme Devaux
15.1 Introduction
Within the field of public service interpreting, legal interpreting is an umbrella term that
refers to interpreting during criminal and civil proceedings, when national and cross-national
cases are investigated and heard. It encompasses various settings, including police stations,
criminal and civil courts, asylum and immigration tribunals,1 correctional facilities, and
probation services (Hertog, 2015; Monteolivia-Garcia, 2018).
The advent of technology has brought significant changes to the practice of legal inter-
preting. While interpreters have been a part of legal proceedings for centuries (Morris,
1999), the inclusion of technology during the Nuremberg trials marked a new era in legal
interpreting, as the use of headsets and microphones allowed interpreters to simultaneously
interpret the trial proceedings.
More recently, technology has advanced rapidly, reshaping the legal interpreter’s working
environment. Technological systems such as telephone and videoconferencing have enabled
court users to participate in the legal process remotely. These systems have introduced new
modalities for interpreting beyond face-to-face interactions. Under the overarching term
of distance interpreting, telephone/audio-mediated interpreting and video-mediated inter-
preting2 have enabled participants to take part in multilingual legal proceedings remotely.
For example, a court hearing can now take place with the defendant in prison, attending
their remand hearing through a video link, while the interpreter is physically present in the
courtroom. Similarly, an interview between a police officer and a minority-language suspect
can be conducted at a police station with the assistance of an interpreter located in a remote
interpreting hub, many miles away.
The use of technology in legal proceedings and the impact of conducting legal proceed-
ings through technology have been researched since the 1980s, primarily in monolingual
legal settings. Scholarly investigations have explored various areas, including the concepts
of fairness and the rule of law (Johnson and Wiggins, 2006; Radburn-Remfry, 1994; Thax-
ton, 1993), the influence of technology on participants (Fullwood et al., 2008; McKay,
2016; Radburn-Remfry, 1994; Roth, 2000), and the challenges associated with technologi-
cal implementation (Haas, 2006; Plotnikoff and Woolfson, 1999, 2000). It is through the
groundbreaking work of the AVIDICUS3 (2008–2016) and SHIFT4 projects that research
began to scrutinise the use of technology in interpreter-mediated legal interactions in a
more systematic fashion. These innovative projects initiated a new trajectory of research,
fostering a more comprehensive understanding of the interplay between technology and
legal interpreting.
Reviewing the current practice and existing body of research, this chapter examines
how technology has transformed the work of legal interpreters. The first section provides
a contextual background on the use of technology in legal interpreting. The second section
reviews the effects of technology on the legal interpreter’s working environment. The third
section analyses the impact of technology on the interpreting process itself. Finally, the last
section explores the influence of technology on legal interpreters’ coping mechanisms and
training.
anglophone countries, particularly within court settings (Braun and Taylor, 2012b; Dumou-
lin and Licoppe, 2010). Nowadays, this technology forms an integral part of day-to-day
court operations in countries such as England and Wales, as evidenced by the annual report
of Her Majesty’s Court and Tribunal Services (2020).
Examining its use in practice, Braun (2018) reviews how videoconferencing technol-
ogy is employed during criminal procedures to establish a connection between the court
and a defendant in prison. This technology also allows witnesses to provide evidence from
geographically remote locations and enables vulnerable witnesses to testify from separate
rooms within the court premises. Furthermore, it enables lawyers to communicate with
defendants in prison from their offices or courtrooms. This technology is particularly used
for short pre-trial hearings, lasting between 30 and 45 minutes (Braun et al., 2016a, 2018).
However, it is also noted that VC sessions may last much longer when a witness is giving
evidence via a video link, for instance (Braun et al., 2018). Although there is relatively less
research available, especially when an interpreter is present, videoconferencing technology
is also used in civil proceedings (Her Majesty’s Court and Tribunal Services, 2020; Zahrast-
nik and Baghrizabehi, 2022).
Considering the different locations a legal interpreter may be interpreting from, studies
(e.g. Braun and Taylor, 2012b; Devaux, 2017a, 2018; Singureanu et al., 2023a) distinguish
between videoconference interpreting A (VCI A), where the interpreter is co-located with
the participants in the courtroom while the minority-language speaker is in a remote loca-
tion, and videoconference interpreting B (VCI B), where the legal interpreter is co-located
with the minority-language speaker in the remote location.
Another case scenario that has emerged is remote interpreting (RI), which was first piloted
by international institutions in the 1990s and was later adopted by courts and police
forces with spoken language interpreters (Braun, 2019). It led to the creation of interpreting
hubs, such as for the Ninth Circuit Court in Florida in 2007 (Braun and Taylor, 2012b).
In this context, the legal interpreter is not co-located with any of the other participants.
Similarly, the Metropolitan Police Service in London created seven hubs across the city to
provide RI services to the police in 2011. The aim was to reduce interpreters’ travel costs,
which were estimated at 33% of the total interpreting budget at that time (Braun, 2018).
One specific modality of RI in the context of sign language interpreting is the video
relay service (VRS; see Warnicke, this volume). This hybrid modality has been used since
the 1990s to establish a video connection between a deaf person and an interpreter while
the interpreter and the other participant are connected via an audio feed. Conversely, in
video remote interpreting (VRI) in North America, the interpreter is located in a hub and
is linked via an audio and video feed to the deaf service user and the other participant,
who are located in the same room. This differs from European practice, where VRI refers
to the sign language interpreter working with audio-video technology (Lee, 2020; Skinner
et al., 2018). VRS and VRI are used in a variety of legal contexts, including non-emergency
calls, courtrooms, and police stations (Napier and Leneham, 2011; Skinner, 2020; Skinner
et al., 2021).
safety measures, protecting vulnerable witnesses, avoiding further traumatisation, and even
allowing witnesses to provide evidence from hospital beds (Adamska-Gallant, 2016; Ali
and Al-Junaid, 2019).
Additionally, the use of technology shows potential benefits for legal interpreters.
Indeed, it is argued that TI, VCI, RI, and VRS have expanded the pool of available legal
interpreters, especially for rare language combinations or for multilingual proceedings tak-
ing place in remote locations. Technology has also been advocated as a means to reduce the
interpreter’s travel time and costs, and to enhance their safety (Braun, 2013c; Braun and
Davitti, 2018; Devaux, 2016; Hale et al., 2022). Furthermore, legal interpreters have iden-
tified technology as a way to maintain neutrality, preserve anonymity, avoid interruptions
from the minority-language speaker, exert more control over breaks, and mitigate the risk
of malicious complaints (Hale et al., 2022).
15.2.4 Frameworks and codes regulating technology and the legal interpreter
Article 2(6) of the Directive 2010/64/EU on the right to interpretation in criminal proceed-
ings5 in Europe states that technologies, such as telephone and videoconferencing systems,
can be used during interpreter-mediated interactions. To ensure the suitability of the afore-
mentioned equipment for its intended purposes, the General Secretariat of the Council of
the European Union (2013) highlights the importance of adhering to various standards.
Within this context, the International Telecommunication Union (2024) puts forward an
extended list of standards, covering numerous (visual) telephoning and videoconferencing
aspects, such as video coding, picture-in-picture functionality, and real-time control pro-
tocols. Adherence to these standards is important to ensure that legal interpreters are
provided with optimal audio and video quality.
At a more local level, some government organisations, interpreting bodies, and inter-
preting agencies have amended their guidelines and codes of conduct and practice to reflect
the fact that legal interpreters may be called to work via telephone and videoconferencing
systems (see, for instance, American Translators Association, 2021; College of Policing,
2020; Home Office, 2021; New York State Unified Court System, 2020). These amend-
ments cover a range of provisions, starting from a basic recognition that technology can
be used in legal interpreting to a more comprehensive set of guidelines outlining acceptable
practices for legal interpreters. However, it could be argued that these may not be sufficient,
as guidelines and codes of ethics are not consistently followed. Xu et al. (2020) analyse
interviews between lawyers and their clients, mediated via TI, and find that interpreters
do not always adhere to the guidance offered in their code of ethics. Similarly, Devaux
(2017b) interviews legal interpreters working in VCI in courts and concludes that codes of
ethics may not always offer the most suitable framework for addressing ethical dilemmas
encountered by court interpreters in VCI. He argues that other ethical approaches, such as
consequentialism, moral sentiments, and virtue ethics, may provide more suitable guidance
in the context of technology-mediated interpreting.
has shown potential to reshape translation and interpreting practice (Braun, 2019, 2020;
Carl and Braun, 2018; Pöchhacker, 2022), a phenomenon described as the ‘technological
turn’ (Fantinuoli, 2018).
Technology, from widely accessible cloud-based communication platforms to specialised
tools designed to assist interpreters, has been integrated into the interpreting workflow.
These technologies serve multiple purposes, from supporting interpreters in their prepara-
tory work to potentially replacing human interpreters with AI-powered software. While
research has demonstrated the application of various technologies in other translation and
interpreting contexts, there is limited evidence of their widespread adoption in legal inter-
preting. Nevertheless, this section presents technologies that could potentially redefine legal
interpreting provisions. However, further research is a prerequisite for their implementation,
as their potential use, impact, and benefits within legal interpreting still need to be assessed.
As discussed by Fantinuoli (2017, 2021), computer-assisted interpreting (CAI) tools offer
numerous benefits in supporting the interpreter’s workflow (see Prandi, this volume). These
tools have been developed to streamline the interpreter’s preparatory work by assisting in
the creation of glossaries, for example. They can also aid the interpreter whilst interpreting
by generating transcriptions and displaying numbers and proper names, which are notori-
ously challenging for interpreters. Another useful feature is that the resources generated by
CAI tools can be edited post-assignment for quality assurance purposes, thereby supporting
interpreters in their subsequent assignments.
Although the body of research is limited, Goldsmith (2018) discusses how tablet inter-
preting could enhance interpreters’ workflow both before and during assignments (see
also Saina, this volume). An important advantage worth highlighting is the possibility for
interpreters to transition to a paperless approach. By relying on tablets, interpreters can
eliminate the need for physical documents and work from digital resources instead, which can
streamline their work processes and enhance overall efficiency and accessibility.
Digital pens may also bring potential benefits for interpreters, especially when work-
ing in consecutive mode (see Ünlü, this volume). They allow the interpreter's notes and the
speech being delivered to be recorded simultaneously, thereby assisting interpreters when
delivering the speech in another language. They have also proved a useful tool in interpreter
training (Orlando, 2010). However, as they are equipped with audio and
video recording software, the use of digital pens within a legal context may raise significant
ethical concerns.
Finally, automated translation and interpreting have opened up new horizons. For
instance, Pöchhacker (2022) discusses speech-to-text technology and how it enhances
accessibility in educational settings and live events by giving deaf or hard-of-hearing indi-
viduals access to spoken messages (see also Davitti, this volume). Automated
translation is also widely used by a broader audience, as evidenced by Google Translate
reaching 1 billion installs in 2021 (Pitman, 2021). However, studies focusing on automated
translation within legal systems indicate that the use of such tools could be inappropriate
(Nunes Vieira et al., 2021; Trzaskawka, 2020).
Within the realm of public service interpreting, Monteolivia-Garcia’s (2020) study
reveals that police forces acknowledge the benefits of automated translation tools, includ-
ing Google Translate, in the course of police work. The study reports that these applications are used in
informal situations (such as giving directions, answering some questions, or establishing
initial circumstances while waiting for an interpreter). The police officers participating in
this study acknowledge that such applications have limited use. Therefore, it is unsurpris-
ing that at the time of writing, machine translation and interpreting find limited application
within the context of legal proceedings (see Fantinuoli, this volume).
Given these spatial limitations, the interpreter’s position within courts and police inter-
view rooms must be adapted. Scholarly work shows that in face-to-face hearings, the inter-
preter typically sits or stands next to the defendant in the dock or just outside the dock.
However, in the context of a VCI hearing, the interpreter must sit or stand in a differ-
ent place. Braun et al.’s (2018) study reveals that the judge decides where the interpreter
stands, on an ad hoc basis, taking into account technological constraints, such as camera
or microphone locations. As a result, the interpreter might be standing outside the camera’s
field of view, making them invisible to the minority-language speaker. Singureanu et al.
(2023b) further investigate, amongst other themes, the visual ecology created in VCI A.
Their study reveals that this working environment presents additional challenges both for
the interpreter and the defendant. They report that the interpreter is more visible and may
find the setting more intimidating. Additionally, the interpreter may fail to notice the
defendant's attempts to interact. It may also be more difficult for the defendant to identify
who is speaking, which results in an increased use of reported speech by the interpreter to
make the interaction more accessible to the defendant.
interpreter is positioned alongside the judge in VCI A. Proximity to the judge might suggest
collusion, thereby undermining the interpreter’s efforts to maintain impartiality.
In VCI B, Balogh and Hertog (2012) recommend that the interpreter sit behind the
minority-language speaker for them to be fully visible. Nonetheless, space is often restricted,
and Braun et al. (2018) note instances where defendants and interpreters are seated in a
line. This configuration has raised concerns among some participants in Devaux’s (2017a)
study, as court interpreters felt they were no longer perceived as impartial agents by partici-
pants located in the courtroom.
15.4.1 Quality
Empirical studies examining the influence of technology on legal interpreters’ performance
and quality of legal interpreting have yielded varied results. In a simulated police interview
context, Braun and Taylor (2012a) conduct a comparative study involving eight experi-
enced legal interpreters, gauging the difference in quality between face-to-face and remote
video interpreting. Their findings highlight a significant increase in the number of additions
(+290%) and distortions (+200%) in RI as compared to traditional face-to-face interac-
tions (see also Braun, 2013c).
A comparable study conducted by Miler-Cassino and Rybińska (2012) focuses on VCI
during prosecution questioning of witnesses in Polish courts. In this context, they designed
a simulation consisting of three case scenarios, employing three different interpreters. The
resulting assessment of the interpreters’ performances yielded more mixed results. For
instance, one interpreter demonstrated superior performance in VCI A compared to VCI B
or face-to-face, while another interpreter performed better in VCI B.
Working with a broader pool of participants, Hale et al. (2022) also
report disparities regarding the performance and quality of interpreters when comparing
face-to-face, video remote, and audio remote interpreting. Their study identified no signifi-
cant disparity between face-to-face and video interpreting. However, the performance was
notably affected in audio remote interpreting. Hale et al. (2022, 18) suggest that a video
feed may contribute to improving quality, as it helps interpreters ‘render the original man-
ner and style accurately, maintain the verbal rapport markers, use correct legal discourse
and terminology, use the recommended interpreting protocols, and demonstrate adequate
management and coordination skills’.
Various factors potentially impacting the quality and performance of legal interpreting
have been postulated. Miler-Cassino and Rybińska (2012) hypothesise that differing levels
of linguistic and interpreting skills, or knowledge of the subject matter, could influence
outcomes. Other factors more intrinsically linked to the use of technology have been put
forward, including audio and video quality, interaction management, and familiarity with
the equipment (Braun, 2013a). Further investigation may be warranted to ascertain the
underlying factors that impact interpreting quality in a legal setting.
interactions. Within the paradigm of legal interaction mediated via technologies, it emerges
that certain aspects of the interpreter’s role are redefined.
Skinner (2023) shows the multifaceted roles performed by legal interpreters when inter-
preting via video relay service in a police setting, including their auxiliary role as first-call
receivers. This notion of role multiplicity is echoed in Devaux’s (2017a, 2018) studies.
Situated within the context of court VCI, he scrutinises the perceptions of 18 interpreters
concerning their role(s). Using the concept of role space proposed by Llewellyn-Jones and
Lee (2014), the interpreters report on a variety of factors influencing their presentation of
self, participant alignment, and interaction management. Examples of these factors affect-
ing their role perception include the inability to introduce themselves as the interpreter,
seating arrangements, scarcity of feedback and backchanneling, a propensity to align more
with participants situated in the courtroom, and a sense of not being able to interrupt
the interaction for clarifications. Notably, the interpreters' perceptions of their role were
not homogeneous, leading to the creation of diverse role space models.
Although these models vary, it is interesting to note that some participants report that
VCI forces them to adopt different roles, within the same court assignment, depending on
whether they are interacting with the courtroom or the defendant.
15.5.1 Strategies
Miler-Cassino and Rybińska’s (2012) research suggests that the interpreters involved in
their case study developed, over the three-day duration of the experiment, various coping
mechanisms to manage stress, as they seemed more relaxed. Subsequent research delves
further into interpreting strategies in distance interpreting, corroborating that legal inter-
preters have crafted techniques to offset the hurdles inherent to the use of technologies.
Braun (2013a), for instance, lists strategies such as request for repetition, alert to problem,
comprehension check, direct request for clarification, repetition, plus interrogative, approx-
imation, and physical resolution, which can be combined. Adopting actor–network theory
(ANT) as a framework, Devaux (2017a) pinpoints additional strategies (or ‘interessement
devices’ in ANT terminology) that interpreters employ, either overtly or covertly, in VCI
A and VCI B. In his study, interpreters report using many strategies similar to those used
in face-to-face interactions, including referring participants to their professional code of
conduct or defining boundaries to refuse to do certain tasks. However, the use of technol-
ogy gives way to new strategies. For instance, when interpreting in VCI B with the defend-
ant, the lawyer, and the interpreter being co-located in prison, some interpreters would use
their proximity with the lawyer to seek clarification or ask for repetitions. In VCI A, some
interpreters would use distance as a tool to preserve their impartiality, as they cannot com-
municate directly with the defendant in this configuration.
Braun (2013a) argues that some strategies are used more successfully than others; for
instance, a repetition request was less efficient than a comprehension check, which was used
more frequently. These findings underline the need for additional research to assess the
range and, more specifically, the effectiveness of strategies called upon by legal interpreters
in the context of distance interpreting.
15.6 Conclusion
This chapter discussed the extent to which technology is redefining interpreting provisions
within the legal field. The introduction of distance interpreting has altered the legal land-
scape by allowing participants to attend legal proceedings remotely. Although this tech-
nology has brought numerous benefits, it also presents challenges for the legal interpreter.
Moreover, this chapter has explored the potential of other emerging technologies, which
may further transform the legal interpreting profession.
As the legal field continues to embrace technological advancements, it is imperative that
legal interpreters are involved in decision-making processes and technological development
and implementation. This chapter also underscored the need to continue investigating the
effect of technology on the interpreter’s working environment and interpreting process.
Taking interdisciplinary and mixed methodology approaches can potentially yield com-
plementary findings, thereby enhancing our understanding of how technology influences
interpreter-mediated legal interaction.
Notes
1 The use of technology in Asylum and Immigration Tribunals will not be discussed in this chapter,
as it is covered in Singureanu and Braun, this volume.
2 For more information on the different modalities, see Braun (2019, 2020).
3 AVIDICUS stands for Assessment of Videoconference Interpreting in the Criminal Justice Services.
The projects are available from: http://wp.videoconference-interpreting.net/ (accessed 4.4.2025).
4 More information on the Shaping the Interpreters of the Future and of Today (SHIFT) project is available
from: https://site.unibo.it/shiftinorality/en (accessed 4.4.2025).
5 See https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32010L0064 (accessed
4.4.2025)
References
Adamska-Gallant, A., 2016. Video Conferencing in the Practice of the International Criminal Tri-
bunals. In 'Elektroniczny protokół – szansą na transparentny i szybki proces' [Electronic Proto-
col – a Chance for Transparent and Fast Trial]. Polish Ministry of Justice. URL www.academia.
edu/19730690/Video_Conferencing_in_Practice_of_Criminal_Courts (accessed 2.8.2024).
Ali, F., Al-Junaid, H., 2019. Literature Review for Videoconferencing in Court “E-Justice-Kingdom
of Bahrain”. 2nd Smart Cities Symposium, University of Bahrain, 24–26.3.2019. URL https://
ieeexplore.ieee.org/document/9124937 (accessed 2.8.2024).
Amato, A., 2018. Challenges and Solutions: Some Paradigmatic Examples. In Amato, A., Spinolo, N.,
González Rodríguez, M.J., eds. Handbook of Remote Interpreting – SHIFT in Orality. University
of Bologna, 79–102.
American Translators Association, 2021. ATA Position Paper on Remote Interpreting. URL www.
atanet.org/advocacy-outreach/ata-position-paper-on-remote-interpreting/ (accessed 2.8.2024).
Balogh, K., Hertog, E., 2012. AVIDICUS Comparative Studies – Part II: Traditional, Videoconference
and Remote Interpreting in Police Interviews. In Braun, S., Taylor, J., eds. Videoconference and
Remote Interpreting in Criminal Proceedings. Intersentia Publishing, Mortsel, 101–116.
Braun, S., 2013a. Assessment of Video-Mediated Interpreting in the Criminal Justice System: AVIDI-
CUS 2 – Action 2 Research Report. URL http://wp.videoconference-interpreting.net/wp-content/
uploads/2014/01/AVIDICUS2-Research-report.pdf (accessed 2.8.2024).
Braun, S., 2013b. Assessment of Video-Mediated Interpreting in the Criminal Justice System: AVIDI-
CUS 2 – Action 3: Guide to Video-Mediated Interpreting in Bilingual Proceedings. URL http://
wp.videoconference-interpreting.net/wp-content/uploads/2014/01/AVIDICUS2-Recommendations-
and-Guidelines.pdf (accessed 2.8.2024).
Braun, S., 2013c. Keep Your Distance? Remote Interpreting in Legal Proceedings: A Critical Assess-
ment of a Growing Practice. Interpreting 15(2), 200–228. https://doi.org/10.1075/intp.15.2.03bra
Braun, S., 2018. Video-Mediated Interpreting in Legal Settings in England: Interpreters’ Perceptions
in Their Sociopolitical Context. Translation and Interpreting Studies 13(3), 393–420. URL https://
doi.org/10.1075/tis.00022.bra
Braun, S., 2019. Technology, Interpreting. In Baker, M., Saldanha, G., eds. Routledge Ency-
clopedia of Translation Studies, 3rd ed. Routledge, Oxon and New York. URL https://doi.
org/10.4324/9781315678627
Braun, S., 2020. Technology and Interpreting. In O’Hagan, M., ed. Routledge Handbook of Transla-
tion and Technology. Routledge, Oxon and New York, 271–288.
Braun, S., Davitti, E., 2018. Face‐to‐Face vs. Video‐Mediated Communication – Monolingual. In
Amato, A., Spinolo, N., González Rodríguez, M.J., eds. Handbook of Remote Interpreting – SHIFT
in Orality Erasmus+ Project: Shaping the Interpreters of the Future and of Today. University of
Bologna, Bologna. URL http://amsacta.unibo.it/id/eprint/5955/
Braun, S., Davitti, E., Dicerto, S., 2016a. The Use of Videoconferencing in Proceedings Conducted
with the Assistance of an Interpreter. URL www.videoconference-interpreting.net/wp-content/
uploads/2016/11/AVIDICUS3_Research_Report.pdf (accessed 2.8.2024).
Braun, S., Davitti, E., Dicerto, S., 2016b. Handbook of Bilingual Videoconferencing: The Use of
Videoconferencing in Proceedings Conducted with the Assistance of an Interpreter. AVIDICUS
3 Project. URL www.videoconference-interpreting.net/wp-content/uploads/2016/08/AVIDICUS3_
Handbook_Bilingual_Videoconferencing.pdf
Braun, S., Davitti, E., Dicerto, S., 2018. Video-Mediated Interpreting in Legal Settings: Assessing the
Implementation. In Napier, J., Skinner, R., Braun, S., eds. Here or There: Research in Interpreting
via Video Link. Gallaudet University Press, Washington, DC, 144–179.
Braun, S., Taylor, J., 2012a. AVIDICUS Comparative Studies – Part 1: Traditional Interpreting and
Remote Interpreting in Police Interviews. In Braun, S., Taylor, J., eds. Videoconference and Remote
Interpreting in Criminal Proceedings. Intersentia Publishing, Mortsel, 85–100.
Braun, S., Taylor, J., 2012b. Video-Mediated Interpreting: An Overview of Current Practice and
Research. In Braun, S., Taylor, J., eds. Videoconference and Remote Interpreting in Criminal Pro-
ceedings. Intersentia Publishing, Mortsel, 27–58.
Braun, S., Taylor, J., Miler-Cassino, J., Rybińska, Z., Balogh, K., Hertog, E., Vanden Bosch, Y., Rombouts,
D., 2012. Training in Video-Mediated Interpreting in Legal Proceedings: Modules for Interpreting
Students, Legal Interpreters and Legal Practitioners. In Braun, S., Taylor, J., eds. Videoconference and
Remote Interpreting in Criminal Proceedings. Intersentia Publishing, Mortsel, 233–288.
Carl, M., Braun, S., 2018. Translation, Interpreting and New Technologies. In Malmkjaer, K., ed. The
Routledge Handbook of Translation Studies and Linguistics. Routledge, Oxon and New York,
374–389.
Choo, Y.K., 2023. How Is Technology Impacting the Legal Profession? URL www.allaboutlaw.co.uk/
law-careers/legal-technology/how-is-technology-impacting-the-legal-profession (accessed 29.6.2023).
College of Policing, 2020. Briefing Note: Using Language Services. URL https://library.college.police.
uk/docs/college-of-policing/Language-Services-v1.0.pdf (accessed 2.8.2024).
Davitti, E., Braun, S., 2018a. Challenges and Solutions. In Amato, A., Spinolo, N., González Rod-
ríguez, M.J., eds. Handbook of Remote Interpreting – SHIFT in Orality Erasmus+ Project: Shap-
ing the Interpreters of the Future and of Today. University of Bologna, Bologna. URL http://
amsacta.unibo.it/id/eprint/5955/
Davitti, E., Braun, S., 2018b. Role-Play Simulations. In Amato, A., Spinolo, N., González Rodríguez,
M.J., eds. Handbook of Remote Interpreting – SHIFT in Orality Erasmus+ Project: Shaping the
Interpreters of the Future and of Today. University of Bologna, Bologna. URL http://amsacta.
unibo.it/id/eprint/5955/
Devaux, J., 2016. When the Role of the Court Interpreter Intersects and Interacts with New Technolo-
gies. In Intersect, Innovate, Interact. CTIS Occasional Papers, 7. URL https://hummedia.
manchester.ac.uk/schools/salc/centres/ctis/publications/occasional-papers/Devaux.pdf
Devaux, J., 2017a. Technologies in Interpreter-Mediated Criminal Court Hearings: An Actor-Network
Theory Account of the Interpreter’s Perception of Her Role-Space (PhD thesis). The University of
Salford. URL https://oro.open.ac.uk/54390/.
Devaux, J., 2017b. Virtual Presence, Ethics and Videoconferencing Interpreting: Insights from Court
Settings. In Valero-Garcés, C., Tipton, R., eds. Ideology, Ethics and Policy Development in Public
Service Interpreting and Translation. Multilingual Matters, Bristol, 131–150.
Devaux, J., 2018. Technologies and Role-Space: How Videoconference Interpreting Affects the Court
Interpreter’s Perception of Her Role. In Fantinuoli, C., ed. Interpreting and Technology. Lan-
guage Science Press, Berlin, 91–119.
Dumoulin, L., Licoppe, C., 2010. Policy Transfer ou Innovation? L’activité juridictionnelle à distance
en France. Critique Internationale 48, 117–133.
Fantinuoli, C., 2017. Computer-Assisted Interpreting: Challenges and Future Perspectives. In Corpas
Pastor, G., Durán-Muñoz, I., eds. Trends in E-Tools and Resources for Translators and Interpret-
ers. Brill, Leiden, 153–174.
Fantinuoli, C., 2018. Interpreting and Technology: The Upcoming Technological Turn. In Fantinuoli,
C., ed. Interpreting and Technology. Language Science Press, Berlin, 1–12.
Fantinuoli, C., 2021. Conference Interpreting: New Technologies. In Albl-Mikasa, M., Tiselius, E.,
eds. The Routledge Handbook of Conference Interpreting. Routledge, Oxon and New York,
508–522.
Fowler, Y., 2007. Interpreting into the Ether: Interpreting for Prison/Court Video Link Hearings. Pro-
ceedings of the Critical Link 5 Conference, Sydney, 11–15.4.2007.
Fowler, Y., 2013. Non-English Speaking Defendants in the Magistrates Court: A Comparative Study
of Face to Face and Prison Video Link Interpreter Mediated Hearings in England (PhD thesis).
Aston University.
Fowler, Y., 2018. Interpreted Prison Video Link: The Prisoner’s Eye View. In Napier, J., Skinner, R.,
Braun, S., eds. Here or There: Research on Interpreting via Video Link. Gallaudet University Press,
Washington, DC, 183–209.
Fullwood, C., Judd, A.M., Finn, M., 2008. The Effect of Initial Meeting Context and Video-Mediation
on Jury Perceptions of an Eyewitness. Internet Journal of Criminology, Online. URL https://www.
academia.edu/1796147/THE_EFFECT_OF_INITIAL_MEETING_CONTEXT_AND_VIDEO_
MEDIATION_ON_JURY_PERCEPTIONS_OF_AN_EYEWITNESS
General Secretariat of the Council of the European Union, 2013. Guide on Videoconferencing in
Cross-Border Proceedings. URL http://bookshop.europa.eu/en/guide-on-videoconferencing-in-
cross-border-proceedings-pbQC3012963/ (accessed 11.6.2015).
Goldsmith, J., 2018. Tablet Interpreting: Consecutive Interpreting 2.0. Translation and Interpreting
Studies. The Journal of the American Translation and Interpreting Association 13(3), 342–365.
URL https://doi.org/10.1075/tis.00020.gol
González Rodríguez, M.J., 2018. Preparatory Exercises. In Amato, A., Spinolo, N., González Rod-
ríguez, M.J., eds. Handbook of Remote Interpreting – SHIFT in Orality. University of Bologna,
Bologna, 144–149.
Haas, A., 2006. Videoconferencing in Immigration Proceedings. Pierce Law Review 5(1), 59–90.
Hale, S.B., Goodman-Delahunty, J., Martschuk, N., Lim, J., 2022. Does Interpreter Location Make a
Difference? A Study of Remote vs Face-to-Face Interpreting in Simulated Police Interviews. Inter-
preting 24(2), 221–253. URL https://doi.org/10.1075/intp.00077.hal
Her Majesty’s Court and Tribunal Services, 2020. Annual Report and Accounts 2019–20. URL www.
gov.uk/official-documents (accessed 2.8.2024).
Hertog, E., 2015. Legal Interpreting. In Pöchhacker, F., ed. Routledge Encyclopedia of Interpreting
Studies. Routledge, Oxon and New York, 230–236.
Home Office, 2021. Interpreters Code of Conduct. URL https://assets.publishing.service.gov.uk/gov-
ernment/uploads/system/uploads/attachment_data/file/1085040/Code_of_conduct_for_UK_visas_
and_immigration_registered_interpreters_v4.pdf (accessed 2.8.2024).
International Telecommunication Union, 2024. ITU-T Recommendations by Series. URL www.itu.
int/ITU-T/recommendations/index.aspx?ser=H (accessed 2.8.2024).
Johnson, M., Wiggins, E., 2006. Videoconferencing in Criminal Proceedings: Legal and Empirical
Issues and Directions for Research. Law & Policy 28(2), 211–227.
Kelly, N., 2008. Telephone Interpreting: A Comprehensive Guide to the Profession. Trafford Publish-
ing, Victoria, BC.
Lee, R., 2020. Role-Space in VRS and VRI. In Salaets, H., Brône, G., eds. Linking Up with Video:
Perspectives on Interpreting Practice and Research. John Benjamins Publishing Company, Oxon
and New York, 107–125. URL https://doi.org/10.1075/btl.149.05lee
Licoppe, C., 2021. The Politics of Visuality and Talk in French Courtroom Proceedings with Video
Links and Remote Participants. Journal of Pragmatics 178, 363–377.
Licoppe, C., Verdier, M., 2013. Interpreting, Video Communication and the Sequential Reshaping of
Institutional Talk in the Bilingual and Distributed Courtroom. International Journal of Speech,
Language and the Law 20(2), 247–275. URL https://doi.org/10.1558/ijsll.v20i2.247
Licoppe, C., Verdier, M., 2021. L’interprète au centre du prétoire ? Voix, pouvoir et tours de parole
dans les débats multilingues avec interprétation consécutive et liaisons vidéo. Droit et société 107,
31–50.
Llewellyn-Jones, P., Lee, R.G., 2014. Redefining the Role of the Community Interpreter: The Concept
of Role-Space. SLI Press, Carlton-le-Moorland.
McKay, C., 2016. Video Links from Prison: Permeability and the Carceral World. International
Journal for Crime, Justice and Social Democracy 5(1), 21–37. URL https://doi.org/10.5204/ijcjsd.
v5i1.283
Miler-Cassino, J., Rybińska, Z., 2012. AVIDICUS Comparative Studies – Part III: Traditional Inter-
preting and Videoconference Interpreting in Prosecution Interviews. In Braun, S., Taylor, J., eds.
Videoconference and Remote Interpreting in Criminal Proceedings. Intersentia Publishing, Mort-
sel, 117–136.
Monteolivia-Garcia, E., 2018. The Last Ten Years of Legal Interpreting Research (2008–2017):
A Review of Research in the Field of Legal Interpreting. Language and Law/Linguagem e Direito
5(1), 36–80.
Monteolivia-Garcia, E., 2020. Interpreting or Other Forms of Language Support? Experiences and
Decision-Making Among Response and Community Police Officers in Scotland. The International
Journal for Translation and Interpreting Research 12(1), 37–54.
Morris, R., 1999. The Face of Justice: Historical Aspects of Court Interpreting. Interpreting 4(1),
97–123.
Moser-Mercer, B., 2003. Remote Interpreting: Assessment of Human Factors and Performance
Parameters. The AIIC Webzine 23. URL http://aiic.net/page/1125/remote-interpreting-assessment-
of-human-factors-and-performance-parameters/lang/1 (accessed 27.11.2016).
Moser-Mercer, B., 2005. Remote Interpreting: The Crucial Role of Presence. Bulletin VALS-ASLA
81, 73–97.
Mouzourakis, T., 2003. That Feeling of Being There: Vision and Presence in Remote Interpreting. The
AIIC Webzine 23. URL http://aiic.net/issues/207/2003/summer-2003-23 (accessed 24.11.2016).
Napier, J., 2012. Here or There? An Assessment of Video Remote Signed Language Interpreter-Mediated
Interaction in Court. In Braun, S., Taylor, J., eds. Videoconference and Remote Interpreting in
Criminal Proceedings. Intersentia Publishing, Mortsel, 145–185.
Napier, J., Leneham, M., 2011. “It Was Difficult to Manage the Communication”: Testing the Fea-
sibility of Video Remote Signed Language Interpreting in Court. Journal of Interpretation 21(1).
URL https://digitalcommons.unf.edu/joi/vol21/iss1/5/
New York State Unified Court System, 2020. Court Interpreter: Manual and Code of Ethics. URL
https://ww2.nycourts.gov/sites/default/files/document/files/2020-10/20_Code_of_Ethics_0.pdf
(accessed 2.8.2024).
Nunes Vieira, L., O’Hagan, M., O’Sullivan, C., 2021. Understanding the Societal Impacts of Machine
Translation: A Critical Review of the Literature on Medical and Legal Use Cases. Information, Communication & Society 24(11), 1515–1532.
Orlando, M., 2010. Digital Pen Technology and Consecutive Interpreting: Another Dimension in
Note-Taking Training and Assessment. The Interpreters’ Newsletter 15, 71–86.
Ozolins, U., 2011. Telephone Interpreting: Understanding Practice and Identifying Research Needs.
Translation and Interpreting 3(2), 33–47.
Pitman, J., 2021. Google Translate: One Billion Installs, One Billion Stories. URL https://blog.google/
products/translate/one-billion-installs/ (accessed 6.7.2023).
Plotnikoff, J., Woolfson, R., 1999. Preliminary Hearings: Video Links Evaluation of Pilot Projects.
URL http://lexiconlimited.co.uk/wp-content/uploads/2013/01/Videolink-magistrates.pdf (accessed
3.3.2015).
Plotnikoff, J., Woolfson, R., 2000. Evaluation of Video Link Pilot Project at Manchester Crown Court:
Final Report. URL http://lexiconlimited.co.uk/wp-content/uploads/2013/01/Videolink-Crown.pdf
(accessed 3.3.2015).
Pöchhacker, F., 2022. Interpreters and Interpreting: Shifting the Balance? The Translator 28(2),
148–161. URL https://doi.org/10.1080/13556509.2022.2133393
Radburn-Remfry, P., 1994. Due Process Concerns in Video Production of Defendants. Stetson Law
Review 23, 805–838.
Rombouts, D., 2012. The Police Interview Using Videoconferencing with a Legal Interpreter: A Criti-
cal View from the Perspective of Interview Techniques. In Braun, S., Taylor, J., eds. Videoconfer-
ence and Remote Interpreting in Criminal Proceedings. Intersentia Publishing, Mortsel, 137–144.
Roth, M.D., 2000. Laissez Faire Videoconferencing: Remote Witness Testimony and Adversarial
Truth. UCLA Law Review 48(1), 185–220.
Singureanu, D., Braun, S., Davitti, E., González Figueroa, L.A., Poellabauer, S., Mazzanti, E., De
Wilde, J., Maryns, K., Guaus, A., Buysse, L., 2023a. Research Report. EU-WEBPSI: Baseline
Study and Needs Analysis for PSI, VMI and LLDI. URL https://ucrisportal.univie.ac.at/de/pub-
lications/research-report-eu-webpsi-baseline-study-and-needs-analysis-for-p (accessed 2.8.2024).
Singureanu, D., Hieke, G., Gough, J., Braun, S., 2023b. “I am His Extension in the Courtroom”.
How Court Interpreters Cope with the Demands of Video-Mediated Interpreting in Hearings with
Remote Defendants. In Corpas Pastor, G., Defrancq, B., eds. Current and Future Trends. John Ben-
jamins Publishing Company, Amsterdam and Philadelphia, 72–108. URL https://doi.org/10.1075/
ivitra.37
Skinner, R., 2020. Approximately There – Positioning Video-Mediated Interpreting in Frontline
Police Services (PhD thesis). Heriot-Watt University.
Skinner, R., 2023. Would You Like Some Background? Establishing Shared Rights and Duties in
Video Relay Service Calls to the Police. Interpreting and Society: An Interdisciplinary Journal 3(1).
URL https://doi.org/10.1177/27523810221151107
Skinner, R., Napier, J., Braun, S., 2018. Interpreting via Video Link: Mapping of the Field. In Napier,
J., Skinner, R., Braun, S., eds. Here or There: Research on Interpreting via Video Link. Gallaudet
University Press, Washington, DC, 11–35.
Skinner, R., Napier, J., Fyfe, N.R., 2021. The Social Construction of 101 Non-Emergency Video Relay
Services for Deaf Signers. International Journal of Police Science & Management 23(2), 145–156.
Thaxton, R., 1993. Injustice Telecast: The Illegal Use of Closed-Circuit Television Arraignments and
Bail Bond Hearings in Federal Court. Iowa Law Review 79(1), 175–202.
Thompson, C., 2017. Why Many Deaf Prisoners Can’t Call Home. The Marshall Project: Nonprofit
Journalism about Criminal Justice. URL www.themarshallproject.org/2017/09/19/why-many-
deaf-prisoners-can-t-call-home#:~:text=The%20technology%20provided%20to%20deaf%
20people%20in%20most,which%20allows%20users%20to%20speak%20in%20sign%
20language (accessed 19.9.2017).
Trzaskawka, P., 2020. Selected Clauses of a Copyright Contract in Polish and English in Translation
by Google Translate: A Tentative Assessment of Quality. International Journal for the Semiotics of
Law – Revue internationale de Sémiotique juridique 33, 689–705.
Wadensjö, C., 1999. Telephone Interpreting and the Synchronisation of Talk. The Translator 5(2),
247–264.
Wang, J., 2018. “Telephone Interpreting Should Be Used Only as A Last Resort.” Interpreters’ Percep-
tions of the Suitability, Remuneration and Quality of Telephone Interpreting. Perspectives 26(1),
100–116.
Xu, H., Hale, S.B., Stern, L., 2020. Telephone Interpreting in Lawyer-Client Interviews: An Observa-
tional Study. Translation and Interpreting 12(1), 18–36.
Zahrastnik, K., Baghrizabehi, D., 2022. Videoconferencing in Times of the Pandemic and Beyond:
Addressing Open Issues of Videoconferencing in Cross-Border Civil Proceedings in the EU. Balkan
Social Science Review 19. URL https://doi.org/10.46763/BSSR2219047z
16
IMMIGRATION, ASYLUM, AND
REFUGEE SETTINGS
Diana Singureanu and Sabine Braun
16.1 Introduction
Asylum and immigration procedures include formal contexts, such as asylum interviews (i.e. formal procedures to assess an asylum claim), court proceedings (e.g. asylum appeals and judicial reviews of return decisions), and health assessments, as well as a wide range of informal encounters, for example in reception centres. The 1951 United
Nations Geneva Convention (Refugee Convention), ratified by over 150 countries, guar-
antees the right to an interpreter for refugee-status applicants, including asylum seekers.
However, research has revealed malpractices, such as poor interpreting provision and the
soliciting of interpreters’ opinions. This potentially puts refugees and asylum seekers at risk
(Jiménez-Ivars and León-Pinilla, 2018, 31). Research on interpreting in asylum contexts has largely focused on interactions with the government authorities responsible for granting rights and providing resources such as counselling and medical support (Maryns, 2015), while other settings in which interpreters operate (schools, NGOs, social services, banks, shelters, landlord negotiations, and job interviews; Jiménez-Ivars and León-Pinilla, 2018, 31) remain underexplored. Interpreters working in this variety of settings often face
challenges related to conveying cultural nuances and the traumatic experiences of refugees
while also navigating confusion about their role, as they are caught between authorities’
and refugees’ conflicting expectations (Jiménez-Ivars and León-Pinilla, 2018; Inghilleri and
Maryns, 2019). Pöllabauer (2023) highlights how interpreters are not always able to fully
or correctly render specific narratives, cultural nuances, and details, nor are they able to
fully address communication breakdowns. This can significantly affect asylum outcomes.
Unsurprisingly, due to these communication gaps, refugees and asylum seekers often feel
misunderstood and overlooked, as shown in a comprehensive literature review of primary
healthcare settings (Patel et al., 2021). Thus, the need for enhanced training and ethical
guidelines for interpreters working in these diverse and complex contexts is a prominent
recommendation in the literature (see also Giustini, this volume).
The use of technology in these settings is not new. For instance, video links have been
used in immigration contexts since the 1990s (Federman, 2006). Communication technolo-
gies such as video and telephone remain the primary tools for providing language support,
and they form the main focus of this chapter. This contribution builds on primary research
and practice-based evidence, primarily focused on formal settings, such as immigration
proceedings, tribunals, asylum interviews, and studies of asylum seekers in detention, as
well as research on healthcare interactions with refugees. It also highlights the European
EU-WEBPSI project, a recent initiative aimed at improving access to basic services for
migrants and refugees through video-mediated interpreting (VMI).
Section 16.2 provides an overview of the main uses of interpreting by telephone and
video link (distance interpreting) in immigration, asylum, and refugee settings. It begins
by outlining common practices, key research areas, and main findings (Section 16.2.1). To
promote best practices, it then reviews key guidelines for VMI in public service interpret-
ing (PSI) relevant to refugee contexts (Section 16.2.2). While there are no specific VMI
guidelines for asylum and refugee settings, the section introduces a promising develop-
ment: minimum standards for VMI in these environments. The final part (Section 16.2.3)
focuses on training initiatives available for community interpreters in immigration, asy-
lum, and refugee contexts, highlighting the shortage of interprofessional training and recent
efforts to address this gap through programmes for both VMI and telephone interpret-
ing. Section 16.3 explores additional technologies, such as crowdsourcing platforms and
AI-driven translation tools, which have helped maintain language support services, espe-
cially in crisis situations. Section 16.4 concludes the chapter, emphasising the importance of
improving technology-enabled interpreting services, such as VMI, for refugees by investing
in reliable technology, interpreter training, and structured feedback systems.
disrupt the flow of hearings, making them longer, more exhausting, and less effective in
delivering justice (Ellis, 2004).
A Canadian feasibility study (Ellis, 2004) examined immigration hearings through 10 hr
of observations, interviews with counsels, and an online survey involving 17 interpreters,
14 counsels, 25 adjudicators, and 16 refugee protection officers. During these hearings, the
immigration judge, refugee protection officer, and interpreter were located in one office,
while the refugee and lawyer were in a different city. The study raises concerns about
whether videoconferencing strikes the right balance between fairness and efficiency, given
its observed effect on how credible claimants appear. This is especially important in refugee
hearings, where direct evidence is often limited. Body language and emotions are poorly
conveyed through VMI, and the impersonal nature of videoconferencing makes it difficult
for applicants to express their emotions effectively. Counsel generally held the most nega-
tive views of VMI, possibly due to their awareness of the impact on decision-making. The
separation of the interpreter from the claimant introduced additional difficulties, including
challenges with rapport, signalling turns, and the impracticality of whispered interpreta-
tion. Common technical and administrative issues, such as the absence of an authority
figure to manage claimants and maintain order, further complicate the process.
A study of immigration bail hearings via video link by two British charities, Bail for
Immigration Detainees (BID) and the British Refugee Council (BID, 2008), reached simi-
lar conclusions. Several applicants, who were separated from the interpreter and other
participants in court, reported difficulty in following the courtroom proceedings. In some
cases, only direct questions to the applicant were translated, leaving them as mere bystand-
ers for the rest of the proceedings, unable to fully participate. Building on the 2008 study,
BID re-examined the fairness of bail hearings and noted significant barriers related to the
integration of interpreters when proceedings were conducted via video link (BID, 2010). In
some instances, judges did not verify whether the applicant and interpreter understood each
other, and significant portions of dialogue were not translated. In some cases, interpreters
were asked to offer opinions on the applicant’s nationality. This violates the FTTIAC’s
Guidance Note1 on pre-hearing introductions, which states that interpreters should not be
used as experts or asked to give advice. Access to interpreters during pre-hearing consulta-
tions was also limited when hearings were conducted via video link, with technical issues
compounding communication barriers. One of the study’s key recommendations is that it
should be the judge’s responsibility to ensure complete interpretation, as well as to confirm
that the interpreter and claimant understand each other.
In a similar vein, a 2005 study by the Legal Assistance Foundation of Metropolitan Chi-
cago and the Chicago Appleseed Fund for Justice, which examined immigration removal
cases at Chicago’s immigration court and included interviews with immigration practition-
ers, found that non-English speakers faced challenges relating both to the quality of the interpretation and to how it was integrated into the proceedings. Judges remained in the courtroom while detainees appeared remotely
via video, using a speakerphone at the court to communicate with the interpreter. Lengthier
exchanges between the judge and the attorneys were not translated. Nearly 30% of immi-
grants with interpreters misunderstood parts of the proceedings due to inadequate inter-
pretation, and 70% of non-English speakers faced issues related to videoconferencing. The
removal rate was significantly higher for non-English-speaking Latinos (76%) compared to
their English-speaking counterparts (46%).
A recent, more comprehensive study of removal hearings in the United States (Eagly,
2015) examining 153,835 hearings involving litigants held in detention centres, with
approximately a quarter conducted via video link, also identified technological, interac-
tional, and interpretation difficulties. Overall, detainees in video link hearings faced difficul-
ties following the proceedings due to poor video quality and technical issues. Furthermore,
lawyers reported transmission delays and screen blackouts. Difficulties also arose in under-
standing interpreters who were connected through speakerphone while the detainee par-
ticipated via video link. Echoing findings from earlier studies (Ellis, 2004), video link cases
showed reduced engagement compared to in-person hearings, with fewer detainees apply-
ing for removal relief or seeking to delay/stop the process.
A landmark report (Shaw, 2016) on the situation of immigrants in UK removal centres,
along with its follow-up report in 2018, raised significant concerns about the availability
and quality of interpreting services. These reports cited poor practices, including instances
where detainees were relied upon to assist with interpretation. While the follow-up review
noted some improvements, particularly thanks to an increased use of telephone interpret-
ing, quality issues persisted. These issues included interpreters’ reluctance to address sensi-
tive topics, such as sexual orientation, due to cultural or religious biases, and instances of
poor literacy among interpreters. Healthcare staff also reported instances of poor conduct,
such as interpreters abruptly disconnecting during mental health sessions and background
noise indicating a lack of privacy. Although on-site interpretation is preferable for critical
or sensitive situations, this may not always be practical. The report stressed the urgent need
for an independent review to improve the quality of interpreting services available in the
removal centres.
A study in France analysing over 300 immigration appeal court hearings via video link
explored the communicative dynamics between participants in the main court and claim-
ants. Interpreters were either co-located with the applicant or present in court. A significant
recurring issue was the poor positioning of cameras and screens. This often prevented par-
ticipants from having a clear view of each other, resulting in an incongruent visual set-up.
This made it difficult for claimants to identify whose speech the interpreter was conveying,
leading to comprehension problems (Licoppe and Veyrier, 2017). The physical separation
between asylum seeker and interpreter also affected the interpreters’ handling of communi-
cation flow. As a result, interpreters resorted to overt or explicit turn-taking techniques dur-
ing long exchanges. A prominent finding was that these interpreting strategies negatively
impacted perceptions of the asylum seeker’s credibility and willingness to cooperate during
the questioning process (Licoppe and Veyrier, 2020).
To address interpreter shortages and logistical challenges in asylum interviews, the use of
distance interpreting in asylum interviews has evolved across Europe with varying degrees
of success and adaptation based on individual countries’ needs. The General Directors’
Immigration Services Conference (GDISC) took an early step in integrating videoconfer-
encing technology to mitigate interpreter shortages across Europe. Launched in 2007, the
‘Interpreters’ Pool Project’ used relay interpreting, with one interpreter co-located with
the caseworker and applicant while a second, fluent in the applicant’s language, partici-
pated from another country. This project demonstrated how distance communication tech-
nologies could bridge gaps in interpreter availability within Europe, particularly for rare
languages.
More recently, the COVID-19 pandemic accelerated the use of distance interpreting
methods, as highlighted in EASO/EUAA’s 2021 Asylum Report. Countries adopted tel-
ephone (see also Lázaro Gutiérrez, this volume) and video-mediated interpreting (see also
Braun, this volume) to ensure safe and continued asylum interviews while adhering to health
protocols. Some countries, like Norway and Ireland, already had remote interviewing prac-
tices in place, while others, such as Sweden and France, introduced this modality for the
first time. Norway has used Skype for asylum interviews since 2017, particularly in remote
centres, and Ireland adopted similar practices in 2019. Sweden uses remote interviews
to reduce travel for applicants, while France conducts remote interviews for vulnerable
individuals and overseas departments. Other countries, including Armenia, Belgium, and
Germany, employ remote interviews in specific cases, such as for detained asylum seekers
or when interpreters are working from a remote location. In the UK, remote interviews
are conducted with the asylum seeker, interpreter, legal representative, and interviewing
officer in separate locations, with a designated point of contact for safeguarding concerns
(UNHCR, 2020).
Interpreting for refugees has been explored in other contexts, with healthcare settings
being particularly well-documented. In these contexts, distance interpreting is commonly
used and has been the focus of studies which compare its effectiveness with in-person
interpreting. For example, Dubus (2016) explored interpreters’ experiences with both
in-person and telephone interpreting for refugees in the United States in healthcare set-
tings and revealed that telephone interpreting often led to emotional detachment and
communication breakdowns. Similarly, Kunin et al. (2022) assessed telephone interpret-
ing in refugee health clinics in Australia. They found that while practical, telephone
interpreting lacked the emotional connection and nuance of face-to-face consultations,
with many refugees feeling more comfortable and trusting during in-person interac-
tions. Another comparative study of the two modalities (Phillips, 2013) examined
the ‘meta-communication’ between doctors and refugees – conversations about care,
survival, and identity that occur beyond literal translation. Phillips found that while
telephone interpretation can provide a functional solution in certain situations, it often
fails to capture the full depth of the refugee experience. Consequently, Phillips argues
that in-person interpreting is superior in building trust and understanding between refu-
gees and healthcare professionals.
Distance interpreting has also become a useful tool in health screenings for asylum seek-
ers, which play an important role in the legal process of claiming asylum. For instance, a
pilot study conducted prior to the COVID-19 pandemic (Mishori et al., 2021) examined the
implementation of audio- and videoconferencing in remote asylum evaluations in migrant
camps in Mexico to assess the effectiveness and challenges of conducting these evalua-
tions via telehealth. The study found that while remote evaluations were considered better
than no evaluations at all, they posed significant challenges, including technological issues,
concerns about confidentiality, and difficulties in making visual observations and building
rapport – factors that are particularly important in mental health assessments. Despite
these challenges, the study suggests that remote evaluations could be a viable option for
hard-to-reach communities if improvements are made in technology and protocols. A later
study by Pogue et al. (2021), carried out in the United States during the COVID-19 pan-
demic, also explored the experiences of clinicians conducting remote medical evaluations
for asylum seekers and reached similar conclusions. While remote evaluations were neces-
sary due to pandemic restrictions, they were often fraught with challenges, such as lim-
ited technology access and difficulties in assessing the physical and mental state of clients.
Nonetheless, the clinicians viewed this to be a vital alternative to the absence of evaluations
and recommended developing better technology and communication protocols to improve
future assessments.
employing resident interpreters who can be booked directly by staff. In other places, tel-
ephone interpreting remains the only modality of distance interpreting available, because
the transition to VMI has not yet been completed; speakers and webcams are not always
available in every office, and staff members may require training to use the equipment. In
addition, access to VMI is also dependent on the agencies with whom the centres work and
whether these agencies offer VMI.
To meet the needs of the centres, the most frequently used configuration is video
remote interpreting (VRI), where all primary participants are co-located and the inter-
preter is remote/off-site and connected via video (Braun, 2019). Given the time pressure on staff to find and book interpreters at short notice and the frequent demand for languages of limited diffusion (LLDs) in remote locations, this configuration appears to be a suitable arrangement that meets the centres' daily needs, offering professional interpreters who are easy to book and to cancel. This configuration typically features three parties: the beneficiary or centre resi-
dent and the authority staff member (e.g. the asylum service staff member or a social
worker) are co-located, and the interpreter is remote. This set-up occurs when a different
centre or refugee camp from the one where the resident interpreters are based requires
interpretation. Consequently, the interpreter is connected to the requesting centre via an
online platform. Of note, centres dedicated to minors are more likely to have an increased
need for interpreting services. A second use of this VMI configuration arises when centre
residents leave the centre (e.g. for medical appointments) and the interpreter remains in
the residential facility. Interestingly, VMI has been used to communicate with minors in
order to maintain the impartiality of resident interpreters, who might otherwise become
too familiar with the minors and risk compromising their neutrality. At times, a fourth
party, such as a psychologist, lawyer, or doctor, may join remotely, creating multi-point
video links. Although not meant to participate actively, such a fourth party may add
information at the interview’s conclusion.
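To make the configurations described above concrete, the following sketch models them in Python. It is purely illustrative: the class names, fields, and labels are our own shorthand, not part of any VMI platform or of the EU-WEBPSI project.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Participant:
    role: str       # e.g. "resident", "asylum service staff", "interpreter", "psychologist"
    location: str   # e.g. "reception centre", "off-site"
    remote: bool    # True if this party joins via video link

@dataclass
class VMISession:
    participants: List[Participant] = field(default_factory=list)

    def configuration(self) -> str:
        """Classify the session by which parties are connected remotely."""
        remote_roles = [p.role for p in self.participants if p.remote]
        if not remote_roles:
            return "fully co-located (on-site interpreting)"
        if remote_roles == ["interpreter"]:
            return "VRI: primary participants co-located, interpreter off-site"
        return "multi-point video link (more than one remote party)"

# The arrangement most frequently described above: resident and staff member
# co-located at the centre, interpreter connected via the online platform.
session = VMISession([
    Participant("resident", "reception centre", remote=False),
    Participant("asylum service staff", "reception centre", remote=False),
    Participant("interpreter", "off-site", remote=True),
])
print(session.configuration())
```

Modelling the session in this way makes explicit that the defining feature of VRI, as used in the centres, is that only the interpreter is off-site, whereas a multi-point link arises as soon as a further party joins remotely.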
A combination of staff interpreters, qualified freelancers, and agency interpreters is hired
for these assignments, while volunteers are also used to meet demand and typically assist
with informal or straightforward communications. When interpreting services are unavail-
able, staff often use tools like Google Translate or rely on multilingual colleagues who have
been strategically recruited. On-site interpreting is considered a better option for extended
briefings, sensitive topics, risk-to-life medical scenarios, and mental health matters, with
exceptions made for LLDs, especially when immediate assistance is required. The diversity
of settings in which interpreters work is covered, to varying degrees, by their training.
Training includes brief overviews and presentations on the asylum service, the medical field,
interpreting for individuals who have experienced smuggling or trafficking, and interpret-
ing for the LGBTQI community (Singureanu et al., 2023a). Furthermore, whether VMI
is offered directly to end users or outsourced to private agencies affects interpreter work-
ing arrangements, video platform choice, quality control, and technical support. In some
countries, interpreter bookings are managed centrally, and interpreters are deployed either
to VMI or on-site, depending on demand. This central management also allows for more
structured interpreter management, scheduled breaks, and debriefing sessions for challeng-
ing assignments. In other cases, in-house interpreters manage their own schedules and are
directly booked by staff using an internal online platform. When VMI or telephone inter-
preting is booked through contracted agencies, this requires advance requests and subse-
quent confirmation. The perception among service users appears to be that working via
agencies offers a quality guarantee; however, there is little control or transparency regarding
Eagly, 2015; Clark, 2021) and negatively impact interpreter performance (Alley, 2012;
Braun, 2018; Corpas Pastor and Gaber, 2020; Hale et al., 2022). These issues can lead
to disruptions, distract users, or even necessitate adjournments and re-scheduling (Davis
et al., 2015; Clark, 2021). To improve the quality and uptake of VMI, there is a need for
investment in reliable, user-friendly equipment that is easily accessible to all participants
(Hale et al., 2022). PSI literature offers clear guidelines for VMI-ready workstations. These
should include a laptop, external camera (ideally positioned to enhance eye contact), pro-
fessional headsets, a microphone, a quiet space, and a high-speed, wired internet connec-
tion (Koller and Pöchhacker, 2018; Klammer and Pöchhacker, 2021). As with other VMI
settings, institutional stakeholders and professionals may be unaware of these technological
requirements (Braun et al., 2018). Furthermore, the input of interpreters working in refugee
settings is often missing from the grey literature. Thus, regular technical checks and sup-
port, along with increased technological literacy for professionals and interpreters, are also
essential (Ellis, 2004; Braun, 2013; Eagly, 2015; Fowler, 2018; Sultanić, 2020; Tam et al.,
2020; Ji et al., 2021).
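Purely as an illustration, the workstation guidance cited above can be operationalised as a simple pre-session readiness check. The item keys below are our own shorthand for the recommendations in the literature, not a formal specification.

```python
# Minimal sketch of a pre-session readiness check for a VMI workstation,
# based on the items recommended in the PSI literature cited above.
# The keys and descriptions are illustrative, not an established standard.

REQUIRED_ITEMS = {
    "laptop": "dedicated laptop or computer",
    "external_camera": "external camera positioned to support eye contact",
    "headset": "professional headset",
    "microphone": "dedicated microphone",
    "quiet_space": "quiet, private space",
    "wired_connection": "high-speed, wired internet connection",
}

def check_workstation(available: dict[str, bool]) -> list[str]:
    """Return the recommended items that are still missing."""
    return [desc for key, desc in REQUIRED_ITEMS.items() if not available.get(key, False)]

# Example: a freelance interpreter's set-up before accepting a VMI assignment.
missing = check_workstation({
    "laptop": True,
    "external_camera": False,
    "headset": True,
    "microphone": True,
    "quiet_space": True,
    "wired_connection": False,
})
if missing:
    print("Not VMI-ready; missing:", "; ".join(missing))
```

A check of this kind covers only the equipment side; as noted above, regular technical checks, support, and technological literacy remain equally important.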
Additionally, difficulties in interacting with/via the technology contribute to inter-
actional challenges between participants and the interpreter, an aspect of VMI
well-documented within PSI literature. Research shows that participants often struggle
to pick up on the remote interpreter’s non-verbal cues for speech segmentation due to
challenges with eye contact and screen monitoring (Klammer and Pöchhacker, 2021; Sin-
gureanu et al., 2023b). Overlapping speech is also more common in VMI. This causes
problems for the interpreter, who may mishear or miss information (Korak, 2012; Braun,
2018). These additional complexities increase the interpreter's cognitive load in VMI, lead-
ing to more interpreting issues, such as omissions and additions, compared to on-site
interpreting (Braun, 2018). Communication and interactional problems also stem from
the incongruent visual environment due to the positioning of the equipment, that is, the
camera and screens, at both sites (Licoppe and Veyrier, 2017; Braun, 2018). Interactional
problems were found to be more common in VMI encounters with multiple participants
at one end, both in medical (Price et al., 2012) and legal settings (Licoppe and Veyrier,
2017; Fowler, 2018; Braun, 2020). This is particularly relevant in the refugee context,
where medical and age assessments conducted remotely can include multiple participants,
such as minors, or vulnerable patients who require the presence of an independent guard-
ian or family member.
VMI interactional issues for other-language speakers include difficulties in building rap-
port with the remote interpreter (Ellis, 2004), psychological distancing (Rowe et al., 2019),
and trust issues (Lindström and Pozo, 2020; Martínez et al., 2021). Without the personal
dynamics of in-person interpreting, remote interactions can hinder patient engagement with
interpreters and healthcare providers (Martínez et al., 2021). This is especially critical in
refugee contexts, where the power imbalance between cultures and languages makes rap-
port particularly important (Kleinert et al., 2019; Blasco Mayor, 2020). The human barriers
associated with VMI raise concerns about its suitability for certain encounters, since remote
interpreters struggle to navigate emotionally charged accounts and culture-specific nuances
(Ellis, 2004). Additionally, interpreters often struggle to read facial cues from elderly or
impaired individuals (Gilbert et al., 2022), which could impact the accuracy of remote
health assessments for migrants, discussed earlier in this section. While on-site interpreting
is prioritised for adversarial asylum interviews, VMI sometimes remains the only option, for example for certain LLDs or in emergencies.
16.2.2 Guidelines
This section showcases examples of good practices in VMI for asylum and immigration
settings. It draws on both the limited guidance specific to these areas and valuable insights
from VMI guidelines that have been developed for healthcare and legal contexts. Integrat-
ing these broader best practices can help close the gap in standardised guidelines for VMI
in asylum and refugee contexts. This ensures that interpreters working in this modality are
better prepared to deliver high-quality services, ultimately enhancing communication and
support for refugees.
More specific recommendations for VMI in refugee settings, particularly in fully virtual
encounters, where all parties are spatially separated, emerged during the COVID-19 pan-
demic. UNHCR (2020) recommendations for remote interviewing underscore interpret-
ers’ pivotal role in ensuring the fairness and effectiveness of remote asylum interviews.
They emphasise the importance of ensuring equal participation for all parties, including
interpreters. Thus, interpreters should ideally participate via the same method as other
participants, preferably via videoconferencing rather than just telephone, to ensure clear
communication and rapport. Additionally, training interpreters in VMI and remote inter-
viewing techniques helps minimise the challenges posed by physical distance and ensures
the effective participation of all parties throughout the interview process. These serve as
crucial procedural safeguards to uphold fairness during the remote interviewing process.
There is still a lack of harmonisation and standardisation of VMI guidelines, similar to
the situation in PSI. While some countries have developed detailed VMI guidelines driven
by institutional needs, there remains a significant gap in guidance for interpreters working
in asylum or immigration settings, especially for LLDs. To capture current VMI practices
across several European countries, guidelines from German, Finnish, Norwegian, Czech,
Dutch, Italian, and Polish sources were reviewed, with translations into English facilitated
by DeepL. It should be noted that this review of VMI guidelines is not exhaustive. Addi-
tionally, guidelines from healthcare and legal settings were examined, as they share com-
mon elements relevant to VMI in refugee contexts. These offer valuable insights that can
inform the development of guidelines for asylum and immigration settings.
VMI guidelines in legal settings clarify the prerequisites for VMI implementation and
address users’ lack of familiarity with integrating interpreters into distance communica-
tion (Council of the European Union, 2013). The guidelines also highlight factors that can
limit the use of VMI. This is of particular relevance to refugee settings, where one may
encounter inadequate technical equipment, a lack of technical knowledge, physical room
layout limitations, or long and complex interactions (Braun et al., 2013; OROLSI-JCS and
UNITAR, 2020; Federal Ministry of Justice, 2022). It is generally considered good prac-
tice to avoid an imbalance of participants, for example, set-ups where most participants
are co-located and only one primary participant or the interpreter is off-site. End users
can also be affected by problems with technological access and literacy (Federal Ministry
of Justice, 2022; OROLSI-JCS and UNITAR, 2020). These are crucial considerations in
refugee settings, as noted in Section 16.2.1. Continuous evaluation through a structured
feedback process is also essential for successful VMI implementation in local public services
(GDISC, 2008; HSE Social Inclusion Unit and the Health Promoting Hospitals Network,
2009; Braun et al., 2013).
Few guidelines currently address the various spatial arrangements and visual require-
ments regarding VMI configurations where one or more participants are remote (Council
of the European Union, 2013; CIOL and ATC, 2021; IMDi, 2023a). For effective com-
munication, participants must be able to clearly see who is speaking at all times (van Rot-
terdam and van den Hoogen, 2012). Recreating the traditional triangular positioning of
participants, as seen in on-site interpreting, helps facilitate this (Office of the Commissioner
General for Refugees and Stateless Persons, 2022). Replicating this traditional arrangement
allows the interpreter to view both the speaker and the audience without participants need-
ing to turn away from the screen (Council of the European Union, 2013). Additionally,
participants should describe gestures or objects that are not visible to the interpreter (IMDi,
2023a). Suitable lighting is also essential to ensure that participants’ facial expressions are
clearly visible (Braun et al., 2013; CIOL, 2021).
A collaborative approach to communication management is recommended due to the
increased coordination required (Council of the European Union, 2013; CIOL, 2021; CIOL
and ATC, 2021). This extends to the preparation stage for interpreters, which requires
additional effort and coordination in VMI, particularly when booked via an agency, with
briefings provided either well in advance or just before the assignment (CIOL and ATC,
2021; IMDi, 2023b). For sensitive encounters, such as medical or adversarial settings, the
service provider or initiating party must determine which participants should be part of the
interpreter’s briefing (IMDi, 2024). Similarly, introductions at the beginning of the VMI
session should be facilitated by the coordinator, with the interpreter introducing themselves
if necessary (Braun et al., 2013; CIOL, 2021). Verbal and non-verbal signals for pacing and
turn-taking should also be established early (Council of the European Union, 2013; TEPIS,
2019; IMDi, 2024). As part of a collaborative approach, and in adherence to the princi-
ples of integrity outlined in the interpreters’ code of conduct, interpreters are encouraged
to intervene when necessary to address communication problems and technological issues
(Braun et al., 2013; TEPIS, 2019). Effective communication management involves pausing
the participants’ speech regularly to allow the interpreter to render manageable speech
segments, ensuring high-quality consecutive interpreting (Braun et al., 2013; CIOL, 2021;
Federal Ministry of Justice, 2022).
Technological recommendations highlight the importance of sound quality at the inter-
preter’s location (Braun et al., 2013). Ideally, all participants should use individual profes-
sional microphones with echo and background-noise cancellation (van Rotterdam and van
den Hoogen, 2012; Braun et al., 2013), though this may not be practical in refugee settings.
To address this, guidelines suggest conducting technical checks and establishing clear pro-
cedures for resolving acoustic issues at the beginning of the encounter, while continuously
monitoring sound quality throughout the session (Braun et al., 2013; OROLSI-JCS and
UNITAR, 2020; CIOL and ATC, 2021; Federal Ministry of Justice, 2022; IMDi, 2023b).
Essential technical specifications for web-based VMI platforms include supporting various
views (e.g. speaker, gallery, self-image). The initiating party must manage the platform as
well as features including data security, webcasting, and recording (Federal Ministry of
Justice, 2022). All parties, including the interpreter, must have adequate technical equip-
ment and are responsible for speed and connectivity at their location (OROLSI-JCS and
UNITAR, 2020; Federal Ministry of Justice, 2022; IMDi, 2023b). Freelance interpreters,
in particular, must handle technical issues at their location. This requires IT skills and VMI
experience or training (Braun et al., 2013; IMDi, 2023b).
Current VMI recommendations also highlight the importance of training for inter-
preters, professionals, and end users. Given the complexities of VMI, a well-coordinated
approach is essential to ensure effective use (Vastaanottava Pohjois-Savo [The Welcome
to Northern Savo Project], 2011; Braun et al., 2013; Council of the European Union,
2013; Finnish Translators and Interpreters’ Association, 2013; IMDi, 2023a). Ethical
aspects, such as interpreters only accepting remote assignments for which they have the
necessary skills and equipment, should also be included in VMI training (Finnish Trans-
lators and Interpreters’ Association, 2013). Interprofessional training should cover the
limitations of VMI and equip professionals with the ability to make informed decisions
about VMI’s suitability for specific assignments, as well as the ability to make adjust-
ments when required. Raising awareness among end users and stakeholders about the
importance of VMI-trained interpreters is key to maintaining a strong accreditation
system (Braun et al., 2013). Prioritising trained interpreters as standard practice could
further enhance the quality of VMI (HSE Social Inclusion Unit and Health Promoting
Hospitals Network, 2009).
A core set of minimum VMI requirements for interpreters and professionals in refugee
and asylum settings is essential to establishing a mandatory standard rather than mere
recommendations. At the same time, these requirements should provide NGOs and refugee
organisations with the flexibility to adapt them to their specific needs. One such step in this
direction is the EU-WEBPSI guidelines model, which aims to establish minimum standards
and essential requirements for VMI practice in asylum and refugee settings (Guaus
et al., 2023). These guidelines highlight the importance of distinguishing between VMI
requirements for all involved parties and specific needs for interpreters and end users, such
as clients and service providers. Certain aspects, such as visual and acoustic requirements,
communication management, breaks, VMI equipment, and the use of consecutive mode,
are common points relevant to all parties involved. Thus, both end users and interpreters
must understand that effective communication in video-mediated encounters depends on
mutual visibility, optimal spatial arrangement, and suitable lighting. An increased level
of communication management is necessary for efficient VMI encounters, which require
structured collaborative approaches to introductions, pacing, and turn-taking during the
encounter. The guidelines stress the importance of interpreters possessing strong linguistic,
technical, and interactional skills, as well as familiarity with troubleshooting and the VMI
platform. The guidelines also address ethical considerations. Interpreters are expected to
act with integrity by accepting only VMI assignments that they can competently manage
and withdrawing if conditions, such as inadequate equipment or unfamiliarity with the
platform, are insufficient. Additionally, interpreters must address impartiality concerns by
ensuring that any overlooked input from the other-language speaker is brought to the atten-
tion of all participants. An important ethical consideration for meeting organisers is that
their duty of care must also extend to off-site interpreters, particularly during emotionally
sensitive assignments. This includes providing interpreters with free access to mental health
support resources.
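The structure of such a guidelines model, a shared core of minimum standards extended by party-specific requirements and local additions, can be sketched as follows. The requirement labels paraphrase points mentioned above; the code itself is a hypothetical illustration, not an artefact of the EU-WEBPSI project.

```python
# Hypothetical representation of a minimum-standards model for VMI:
# a shared core applying to all parties, plus party-specific requirements
# that reception authorities can extend into a local code of practice.

SHARED_CORE = {
    "mutual_visibility", "optimal_spatial_arrangement", "suitable_lighting",
    "communication_management", "scheduled_breaks", "consecutive_mode",
    "adequate_vmi_equipment",
}

PARTY_SPECIFIC = {
    "interpreter": {"platform_familiarity", "troubleshooting_skills",
                    "withdraw_if_conditions_insufficient"},
    "end_user": {"brief_interpreter_in_advance", "describe_off_screen_gestures",
                 "duty_of_care_for_offsite_interpreter"},
}

def local_code_of_practice(party: str, local_additions: set[str]) -> set[str]:
    """Combine the shared core, the party-specific minimum, and local additions."""
    return SHARED_CORE | PARTY_SPECIFIC.get(party, set()) | local_additions

# Example: a centre for unaccompanied minors adds its own safeguarding requirement.
print(sorted(local_code_of_practice("interpreter", {"safeguarding_briefing_for_minors"})))
```

The design point is simply that the shared core stays mandatory and uniform, while each authority layers its local needs on top rather than rewriting the standard.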
16.2.3 Training
The literature on asylum and refugee settings often highlights the lack of adequate training
for interpreters and the need for ongoing professional development (Nuč and Pöllabauer,
2021; Cassim et al., 2022). Training courses are typically short, with minimal feedback
from a bilingual instructor (Gany et al., 2017; Blasco Mayor, 2020; Pöllabauer, 2022).
This makes it unlikely that these interpreting trainees will reach advanced levels of inter-
preting skills (Hale and Ozolins, 2014). Of note, the training and qualifications offered to
293
The Routledge Handbook of Interpreting, Technology and AI
interpreters working in this context vary considerably, and there is a notable lack of stand-
ardisation and quality control mechanisms (Tipton and Furmanek, 2016).
Prominent examples of comprehensive training initiatives in this field, namely QUADA, EASO, and REMILAS, stand out for their involvement of multidisciplinary teams from interpreting, linguistics, and cultural studies (Bergunde and Pöllabauer, 2019;
Ticca et al., 2023). Bergunde and Pöllabauer (2019) provide an in-depth exploration of
both the development and implementation of a training course for interpreters in asylum
contexts, focusing on linguistic and cultural competency. Their findings show that compre-
hensive training, including ethics and trauma-informed approaches, significantly improves
interpreter performance in complex asylum interviews, and the authors advocate for con-
tinuous professional development to better prepare interpreters for these demanding roles.
The VMI training initiative implemented in Germany, Austria, and Switzerland to address
the urgent need for interpreters of languages of limited diffusion (Albl-Mikasa and Eingrie-
ber, 2018) is also worth noting. This study highlights the integration of video interpreting
into healthcare services in response to the 2015 refugee crisis, especially for languages with
a limited pool of interpreters. Conducted over several days, the training focused on both
lay and qualified community interpreters working in social services, healthcare, and asylum
contexts. The course encompassed interpreting techniques, role-plays, professional ethics,
and VRI skills, including managing distance communication and technical issues. It also
featured VRI practice sessions, followed by feedback from clients and peers. Simulations, discussions, and reflections on the differences between VMI and in-person interpreting have also been used in other studies (Skaaden, 2018) as curriculum components designed to help trainee interpreters adapt to the complexities of working in VMI. Skaaden
(2018) examines interpreting trainees’ experiences using Skype to interpret remotely dur-
ing simulated VRI interactions in social services and asylum settings. The 45 participating
interpreters reported challenges, such as turn-taking difficulties, an overload of visual infor-
mation, and the complexity of managing screen monitoring, alongside their interpreting
tasks. The study also emphasises the importance of online etiquette and the need for all
participants to be briefed on best practices for distance communication.
Other comprehensive training initiatives for distance interpreting in PSI settings also
offer relevant insights and training methodologies. The EU project SHIFT3 (Amato et al.,
2018) focused on training for telephone and video interpreting across various community
and social interpreting scenarios. This is particularly relevant for informal communication
in asylum and refugee settings. The project’s primary aim was to identify effective prac-
tices and develop educational materials based on the resulting insights. A key outcome
of the SHIFT project was the creation of a taxonomy to analyse VRI interactions in col-
laborative settings, including business, community, and emergency encounters. This tax-
onomy addresses crucial interactional aspects, such as managing the opening and closing of
encounters, spatial organisation, and turn-taking. The categories highlight both successful
and problematic interpreting techniques while outlining coping and adaptation strategies
for VRI (Davitti and Braun, 2020). Some of the main challenges identified include handling
long speaking turns, referencing documents or objects, and achieving clear communication.
The use of adaptive strategies, such as micro-pauses and slower speech rates, can improve
communication management, while verbal cues should replace non-verbal signals for better
coordination. This analytical framework serves as an effective tool for interpreter training,
particularly in refugee settings.
Joint training programmes, such as the European-funded AVIDICUS 1–3 projects,4 have
been instrumental in legal PSI settings, helping professionals using VMI services understand
and implement best practices. This underscores the value of collaboration and shared train-
ing in achieving successful outcomes for VMI implementation. However, such in-person
collaborative training initiatives can be restrictive due to the time and cost investments
required from all parties. Training VMI end users through online self-directed courses or
short training sessions provides the flexibility professionals need (Ramos et al., 2014; Pavill,
2019). As current efforts suggest (EU-WEBPSI, 2022–2025; IMDi, 2023a, 2023b), this
approach may be more efficient and practical for those working in refugee settings, given
typical budget and time constraints.
The most recent training initiative is the EU-WEBPSI project, which aims to address the
growing demand for interpreting services in refugee and reception centre contexts. Here,
interpreters are often required for a wide variety of informal conversations in diverse ref-
ugee contexts, from medical to quasi-legal, educational, and social services. The project
primarily aims at developing a framework to train refugees as interpreters, particularly
for VMI, by creating specialised training materials for both interpreters and interpreter
trainers. Additionally, to ensure that interpreters can effectively work via video link in these
settings, the project is developing a dedicated platform for delivering video distance inter-
preting services. This comprehensive approach is designed to meet the unique communica-
tion challenges present in refugee contexts, while also empowering refugees to contribute
as trained interpreters.
lack familiarity with the professional ethics that guide certified interpreters. To address
these issues, providing training to volunteers is crucial. Another way to improve crowd-
sourced translation is to optimise the matching of bilingual volunteers with refugees or aid
workers needing support. This involves selecting the most suitable volunteer based on their
skills and the specific request. The Tarjimly platform uses artificial intelligence and machine
learning to do this on a large scale (Agarwal et al., 2020).
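The matching idea can be illustrated with a deliberately simplified scoring function. This is not Tarjimly's actual algorithm; it only shows how a request might be ranked against available volunteers by language pair, relevant domain experience, and availability, with availability added here as an assumed criterion.

```python
# Simplified, hypothetical volunteer-request matching: scores candidates by
# language pair, relevant domain experience, and availability. Not the actual
# Tarjimly system, merely an illustration of the matching principle.

from dataclasses import dataclass

@dataclass
class Volunteer:
    languages: set   # language pairs offered, e.g. {("ar", "en")}
    domains: set     # e.g. {"medical", "legal", "social services"}
    available: bool

def score(volunteer: Volunteer, language_pair: tuple, domain: str) -> float:
    """Higher score means a better match; 0 means the volunteer cannot take the request."""
    if not volunteer.available or language_pair not in volunteer.languages:
        return 0.0
    return 1.0 + (0.5 if domain in volunteer.domains else 0.0)

def best_match(volunteers: list, language_pair: tuple, domain: str):
    """Return the highest-scoring available volunteer, or None if nobody fits."""
    ranked = sorted(volunteers, key=lambda v: score(v, language_pair, domain), reverse=True)
    return ranked[0] if ranked and score(ranked[0], language_pair, domain) > 0 else None

volunteers = [
    Volunteer({("ar", "en")}, {"medical"}, available=True),
    Volunteer({("ar", "en"), ("fa", "en")}, {"legal"}, available=True),
]
# The first volunteer wins: right language pair plus relevant medical experience.
print(best_match(volunteers, ("ar", "en"), "medical"))
```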
16.4 Conclusion
The provision of VMI services in refugee settings is fragmented due to reception authorities
and national governments having to address various local needs. Typical challenges in asylum and refugee settings include a complex target audience made up of families, elderly people, and unaccompanied minors, together with a lack of training in the VMI modality. VMI implementation varies by country and the particular
stage of the asylum process. Some centres use VMI daily, whilst others have not yet transi-
tioned to VMI and still use telephone interpreting to meet the demand. On-site interpreting
is predominantly used for initial intake procedures at arrival centres due to high volumes,
while VMI is more likely to be used for advanced intake procedures at dispersed facilities,
to manage unpredictable demand and the need for interpreters of LLDs. Technological
problems significantly impact the effective use of VMI. In remote locations, unreliable inter-
net connections are a major issue, while more generally, staff and interpreters often lack
essential VMI equipment and private spaces.
The integration of VMI in refugee settings presents both logistical and interactional
challenges, as highlighted by research over the past two decades. Studies consistently point
to the difficulties posed by the physical separation of interpreters and participants, sub-
standard equipment, and inadequate preparation time, all of which impair the quality of
interpretation. Non-verbal cues, essential for effective communication, are often lost in
VMI, exacerbating issues of rapport, trust, and accuracy. These challenges are particularly
significant in refugee contexts, where emotionally charged accounts and complex cultural
nuances demand accurate and culturally sensitive interpreting.
There is currently a lack of harmonisation and standardisation in VMI guidelines.
However, certain best practices, such as visual and acoustic requirements, technological
considerations, and communication management strategies, are key to establishing VMI
minimum standards in asylum and refugee settings. It is both essential and feasible to create
a shared core of minimum standards to ensure high-quality VMI provision and technologi-
cal infrastructure. Reception authorities can then adapt these standards into local codes
of practice based on their specific needs. Such a model can also serve as a benchmark for
training and interpreting needs, with continuous evaluation and feedback from stakehold-
ers being crucial.
Both specialised programmes for interpreters and interprofessional training have proved
successful in the adoption of best practices and in raising awareness about VMI limita-
tions in legal and medical settings. Whilst such training may be time- and cost-prohibitive,
there are practical alternatives, such as online, self-directed courses, which would be effective in the asylum context. VMI interprofessional training can also help end users,
stakeholders, and interpreters understand the knock-on effect that logistical and technical
problems can have on interpreters’ performances in VMI.
Despite its limitations, VMI remains an important tool in managing the logistical and
practical demands of interpreting in refugee settings. However, improvements are neces-
sary to ensure its effectiveness and fairness. To guarantee a certain level of quality within
VMI services in the refugee context, future efforts must prioritise investments in reliable
technology, provide training for interpreters and users, and establish structured feedback
mechanisms.
Notes
1 FTTIAC, Adjudicator Guidance Note on Unrepresented Appellants, April 2003, para on the
interpreter.
2 The EU-WEBPSI project is funded by the Asylum and Migration Integration Fund (AMIF), grant
number 101038590; project coordinator: University of Ghent. Website: https://www.webpsi.eu.
3 SHIFT in Orality – Shaping the Interpreters of the Future and of Today. European Commission,
Erasmus+, Key Action 2: Strategic Partnership in Higher Education. 2015-1-IT02-KA203-014786.
Website: www.shiftinorality.eu.
4 AVIDICUS 1, 2, and 3 (Assessment of Video-Mediated Interpreting in the Criminal Justice Systems)
were projects funded by the European Commission Directorate-General for Justice. For further
details, see www.videoconference-interpreting.net and Braun (2016).
References
Agarwal, D., Baba, Y., Sachdeva, P., Tandon, T., Vetterli, T., Alghunaim, A., 2020. Accurate and
Scalable Matching of Translators to Displaced Persons for Overcoming Language Barriers. arXiv
preprint arXiv:2012.02595.
Albl-Mikasa, M., Eingrieber, M., 2018. Training Video Interpreters for Refugee Languages in the
German-Speaking DACH Countries. FITISPos International Journal 5, 33–44. URL https://doi.
org/10.37536/fitispos-ij.2018.5.1.163
Alley, E., 2012. Exploring Remote Interpreting. International Journal of Interpreter Education 4(1),
111–119.
Amato, A., Bertozzi, M., Braun, S., Capiozzo, E., Danese, L., Davitti, E., Fernandes, E.I., Lopez,
J.M., Méndez, G.C., Rodríguez, M.J.G., Russo, M., Spinolo, N., 2018. Handbook of Remote
Interpreting – Shift in Orality – Shaping the Interpreters of the Future and of Today. University of
Bologna, 1–320. URL https://doi.org/10.6092/unibo/amsacta/5955
Anastasiou, D., Gupta, R., 2011. Crowdsourcing as Human-Machine Translation (HMT). Journal of
Information Science 20(10), 1–15.
Balogh, K., Salaets, H., 2019. The Role of Non-Verbal Elements in Legal Interpreting: A Study of a
Cross-Border Interpreter-Mediated Videoconference Witness Hearing. In Tipton, R., Desilla, L.,
eds. The Routledge Handbook of Translation and Pragmatics. Routledge, London, 394–429.
Barea Muñoz, M., 2021. Psychological Aspects of Interpreting Violence: A Narrative from the
Israeli-Palestinian Conflict. In Todorova, M., Ruiz Rosendo, L., eds. Interpreting Conflict. Pal-
grave Macmillan, Cham, 159–176. URL https://doi.org/10.1007/978-3-030-66909-6_10
Bergunde, A., Pöllabauer, S., 2019. Curricular Design and Implementation of a Training Course for
Interpreters in an Asylum Context. Translation and Interpreting 11(1), 1–21. URL https://doi.
org/10.12807/ti.111201.2019.a01
BID (Bail for Immigration Detainees), 2008. Immigration Bail Hearings by Video Link: A Monitoring
Exercise by Bail for Immigration Detainees and the Refugee Council. URL http://refugeecouncil.
org.uk/policy/position/2008/bail_hearings (accessed 21.5.2018).
BID (Bail for Immigration Detainees), 2010. A Nice Judge on a Good Day: Immigration Bail and the
Right to Liberty. URL https://bailobs.org/wp-content/uploads/2023/03/a_nice_judge_on_a_good_
day.pdf (accessed 16.1.2025).
Blasco Mayor, M.J., 2020. Legal Translator and Interpreter Training in Languages of Lesser Diffusion
in Spain. In Ng, E.N., Crezee, I.H., eds. Interpreting in Legal and Healthcare Settings: Perspectives
on Research and Training. John Benjamins, Amsterdam, 133–163. URL https://doi.org/10.1075/
btl.151.06bla
Braun, S., 2013. Keep Your Distance? Remote Interpreting in Legal Proceedings: A Critical Assess-
ment of a Growing Practice. Interpreting 15(2), 200–228. https://doi.org/10.1075/intp.15.2.03bra
Braun, S., 2016. The European AVIDICUS Projects: Collaborating to Assess the Viability of
Video-Mediated Interpreting in Legal Proceedings. European Journal of Applied Linguistics 4(1),
173–180. URL https://doi.org/10.1515/eujal-2016-0002
Braun, S., 2018. Video-Mediated Interpreting in Legal Settings in England: Interpreters’ Perceptions
in Their Sociopolitical Context. Translation and Interpreting Studies 13, 393–420. URL https://
doi.org/10.1075/tis.00022.bra
Braun, S., 2019. Technology and Interpreting. In The Routledge Handbook of Translation and Tech-
nology. Centre for Translation Studies, University of Surrey, Taylor and Francis, 271–288. URL
https://doi.org/10.4324/9781315311258-19
Braun, S., 2020. “You Are Just a Disembodied Voice Really”: Perceptions of Video Remote Inter-
preting by Legal Interpreters and Police Officers. In Salaets, H., Brône, G., eds. Linking Up with
Video: Perspectives on Interpreting Practice and Research. John Benjamins, 47–78. URL https://
doi.org/10.1075/btl.149.03bra
Braun, S., Balogh, K., Hertog, E., Licoppe, A., Miler-Cassino, J., Rombouts, D., Rybińska, Z., Taylor,
J., van den Bosch, Y., Verdier, M., 2013. Assessment of Video-Mediated Interpreting in the Crimi-
nal Justice System: AVIDICUS 2 – Guide to Video-Mediated Interpreting in Bilingual Proceedings.
AVIDICUS 2, EU Criminal Justice Programme, Project JUST/2010/JPEN/AG/1558, 2011–2013. URL
https://wp.videoconference-interpreting.net/wp-content/uploads/2014/01/AVIDICUS2-Research-
report.pdf (accessed 16.1.2025).
Braun, S., Davitti, E., Dicerto, S., 2018. Video-Mediated Interpreting in Legal Settings: Assessing the
Implementation. In Napier, J., Skinner, R., Braun, S., eds. Here or There: Research on Interpreting
via Video-Link. Gallaudet University Press, Washington, DC, 144–179.
Butow, P.N., Aldridge, L., Eisenbruch, M., Girgis, A., Goldstein, D., Jefford, M., King, M., Lobb, E.,
Schofield, P., Sze, M., 2012. A Bridge Between Cultures: Interpreters’ Perspectives of Consulta-
tions with Migrant Oncology Patients. Supportive Care in Cancer 20, 235–244. URL https://doi.
org/10.1007/s00520-010-1046-z
Cassim, S., Kidd, J., Ali, M., Abdul, H.N., Jamil, D., Keenan, R., Begum, F., Lawrenson, R., 2022.
“Look, Wait, I’ll Translate”: Refugee Women’s Experiences with Interpreters in Healthcare in
Aotearoa New Zealand. Australian Journal of Primary Health 28, 296–302. URL https://doi.
org/10.1071/PY21256
CIOL, 2021. CIOL Guide to Working with Interpreters Remotely for Judges. Based on AVIDICUS
Recommendations. www.ciol.org.uk/ciol-guide-working-interpreters-remotely-judges (accessed
1.7.2023).
CIOL and ATC, 2021. Remote Interpreting Best Practice Checklists. URL www.ciol.org.uk/sites/
default/files/Interpreting%20Checklist-FNL.pdf (accessed 20.7.2023).
Clark, J., 2021. Evaluation of Remote Hearings During the COVID-19 Pandemic. Research Report.
H.M. Courts & Tribunals Service.
Corpas Pastor, G., Gaber, M., 2020. Remote Interpreting in Public Service Settings: Technology, Per-
ceptions and Practice. SKASE Journal of Translation and Interpretation [Preprint]. The Slovak
Association for the Study of English 13(2), 58–78.
Council of the European Union: General Secretariat of the Council, 2013. Guide on Videoconfer-
encing in Cross-Border Proceedings. European e-Justice, Publications Office. URL https://data.
europa.eu/doi/10.2860/76243
Davis, R., Barton, A., Debus-Sherill, S., Matelevich-Hoang, J.B., Niedzwiecki, E., 2015. Research on
Videoconferencing at Post-Arraignment Release Hearings: Phase I Final Report. U.S. Department
of Justice, Fairfax, VA.
Davitti, E., Braun, S., 2020. Analysing Interactional Phenomena in Video Remote Interpreting in Col-
laborative Settings: Implications for Interpreter Education. The Interpreter and Translator Trainer
14(3), 279–302. URL https://doi.org/10.1080/1750399X.2020.1800364
De Wilde, J., Guaus, A., 2022. 1st International Conference on the Right to Languages. In Linguistic
Policies and Translation and Interpreting in Public Services and Institutions. Facultat de Dret – Uni-
versitat de València, 63–64. URL https://blogs.uji.es/cidl/files/2022/07/BoA.pdf (accessed 15.7.2024).
Dubus, N., 2016. Interpreters’ Subjective Experiences of Interpreting for Refugees in Person and via
Telephone in Health and Behavioural Health Settings in the United States. Health & Social Care
in the Community 24(5), 649–656.
Eagly, I.V., 2015. Remote Adjudication in Immigration. Northwestern University Law Review 109,
933–1020.
EASO, 2020. Practical Recommendations on Conducting Remote/Online Registration. EASO Prac-
tical Guide Series. Publications Office of the European Union, Luxembourg. URL https://doi.
org/10.2847/964488
EASO, 2021. EASO Asylum Report 2021: Annual Report on the Situation of Asylum in the European
Union. European Asylum Support Office. URL https://bit.ly/3NjoEKi (accessed 16.1.2025).
Ellis, S.R., 2004. Videoconferencing in Refugee Hearings. Ellis Report to the Immigration and Refu-
gee Board Audit and Evaluation Committee. Immigration and Refugee Board of Canada.
European Migration Network, 2022. Ad-Hoc Query on 2022.63 Interpreting in Reception Facili-
ties. EMN. URL https://emnbelgium.be/sites/default/files/publications/Compilation%20AHQ%20
2022.63_interpreting_in_reception_facilities%20FINAL%20VERSION.pdf (accessed 16.1.2025).
Federal Ministry of Justice, 2022. Draft Law to Promote the Use of Video Conferencing Technology
in Civil Jurisdiction and Specialist Jurisdictions. www.bundestag.de/dokumente/textarchiv/2023/
kw38-de-videokonferenztechnik-965066 (accessed 22.7.2024).
Federman, M., 2006. On the Media Effects of Immigration and Refugee Board Hearings via Video-
conference. Journal of Refugee Studies 19(4), 433–452. URL https://doi.org/10.1093/refuge/fel018
Finnish Translators and Interpreters' Association, 2013. Asioimistulkin eettiset ohjeet [Ethical Guidelines for Community Interpreters]. URL www.youpret.com/fi/tulkkien-eettiset-ohjeet/ (accessed
16.1.2025).
Fowler, Y., 2013. Non-English-Speaking Defendants in the Magistrates Court: A Comparative Study
of Face-to-Face and Prison Video Link Interpreter-Mediated Hearings in England (PhD thesis).
Aston University.
Fowler, Y., 2018. Interpreted Prison Video Link: The Prisoner’s Eye View. In Napier, J., Skinner, R.,
Braun, S., eds. Here or There: Research on Interpreting via Video Link. Gallaudet University Press,
Washington, DC, 183–209.
Gany, F., González, C.J., Pelto, D.J., Schutzman, E.Z., 2017. Engaging the Community to Develop
Solutions for Languages of Lesser Diffusion. In Jacobs, E.A., Diamond, L.C., eds. Providing Health
Care in the Context of Language Barriers: International Perspectives. Multilingual Matters, Clev-
edon, 149–169. URL https://doi.org/10.21832/9781783097777-011
GDISC, 2008. GDISC Interpreters Pool – Final Evaluation Meeting, Sofia, 20–21 November 2008:
Summary Conclusions. GDISC Secretariat, The Hague.
Gilbert, A.S., Croy, S., Haralambous, B., Hwang, K., LoGiudice, D., 2022. Video Remote Inter-
preting for Home-Based Cognitive Assessments: Stakeholders’ Perspectives. Interpreting 24(1),
84–110. URL https://doi.org/10.1075/intp.00065.gil
Guaus, A., Braun, S., Buysse, L., Davitti, E., De Wilde, J., González Figueroa, L.A., Mazzanti, E., Mar-
yns, K., Pöllabauer, S., Singureanu, D., 2023. EU-WEBPSI Model: Harmonised Minimal Standards
for Webcam Public Service Interpreting. www.webpsi.eu/deliverables/wp3-minimum-standards/
(accessed 17.1.2025).
Hale, S., Goodman-Delahunty, J., Lim, J., Martschuk, N., 2022. Does Interpreter Location Make a
Difference? A Study of Remote vs Face-to-Face Interpreting in Simulated Police Interviews. Inter-
preting 24(2), 221–253. URL https://doi.org/10.1075/intp.00077.hal
Hale, S., Ozolins, U., 2014. Monolingual Short Courses for Language-Specific Accreditation: Can
They Work? A Sydney Experience. Interpreter and Translator Trainer 8(2), 217–239. URL https://
doi.org/10.1080/1750399X.2014.929371
HSE Social Inclusion Unit and the Health Promoting Hospitals Network, 2009. On Speaking Terms:
Good Practice Guidelines for HSE Staff in the Provision of Interpreting Services. URL www.hse.
ie/eng/services/publications/socialinclusion/emaspeaking.pdf (accessed 16.1.2025).
IMDi [The Directorate for Integration and Diversity], 2023a. E-læringskurs i skjermtolking [E-Learning
Course in Screen Interpretation]. IMDi. www.imdi.no/tolk/skjerm/ (accessed 16.7.2023).
IMDi [The Directorate for Integration and Diversity], 2023b. Roller og ansvar ved tolking over
skjerm [Roles and Responsibilities When Interpreting Over a Screen]. www.imdi.no/tolk/
roller-og-ansvar-ved-tolking-over-skjerm/ (accessed 16.7.2023).
IMDi [The Directorate for Integration and Diversity], 2024. Gjennomføring av tolkede samtaler
og møter [Implementation of Interpreted Conversations and Meetings]. www.imdi.no/tolk/
hvordan-lage-retningslinjer-for-bruk-av-tolk-i-dinvirksomhet/#title_11 (accessed 16.7.2023).
Inghilleri, M., Maryns, K., 2019. Asylum. In Baker, M., ed. Routledge Encyclopaedia of Translation
Studies. Routledge, 22–27.
Ji, X., Abdelhamid, K., Bergeron, A., Chow, E., Lebouché, B., Mate, K.K.V., Naumova, D., 2021.
Utility of Mobile Technology in Medical Interpretation: A Literature Review of Current Prac-
tices. Patient Education and Counseling 104(9), 2137–2145. URL https://doi.org/10.1016/j.
pec.2021.02.019
Jiménez-Crespo, M.A., 2017. Translation Crowdsourcing: Research Trends and Perspectives.
In Cordingley, A., Frigau Manning, C., eds. Collaborative Translation: From the Renais-
sance to the Digital Age. Bloomsbury Academic, London, 192–211. URL http://dx.doi.
org/10.5040/9781350006034.0016
Jiménez-Ivars, A., León-Pinilla, R., 2018. Interpreting in Refugee Contexts: A Descriptive and Quali-
tative Study. Language & Communication 60, 28–43.
Klammer, M., Pöchhacker, F., 2021. Video Remote Interpreting in Clinical Communication: A
Multimodal Analysis. Patient Education and Counseling 104, 2867–2876. URL https://doi.
org/10.1016/j.pec.2021.08.024
Kleinert, C.V., Núñez-Borja, C., Stallaert, C., 2019. Buscando espacios para la formación de intér-
pretes para la justicia en lenguas indígenas en América Latina [Seeking Spaces for the Training
of Interpreters for Justice in Indigenous Languages in Latin America]. Mutatis Mutandis 12(1),
78–99. URL https://doi.org/10.17533/udea.mut.v12n1a03
Koller, M., Pöchhacker, F., 2018. The Work and Skills: A Profile of First-Generation Video Remote
Interpreters. In Napier, J., Skinner, R., Braun, S., eds. Here or There: Research on Interpreting via
Video Link. Gallaudet University Press, 89–110.
Korak, C.A., 2012. Remote Interpreting via Skype – a Viable Alternative to In Situ Interpreting?
Interpreters Newsletter 17, 83–102. URL http://hdl.handle.net/10077/8614 (accessed 15.1.2025).
Kunin, M., Ali, R., Yugusuk, C., Davis, A., McBride, J., 2022. Providing Care by Telephone to Refu-
gees and Asylum Seekers: An Evaluation of Telephone Mode-of-Care in Monash Health Refugee
Health and Wellbeing Clinic in Victoria, Australia. Health Services Insights 15, 1–10. URL https://
doi.org/10.1177/11786329221134349
The Legal Assistance Foundation of Metropolitan Chicago and The Chicago Appleseed Fund for Jus-
tice, 2005. Videoconferencing in Removal Proceedings: A Case Study of the Chicago Immigration
Ramos, R., Antolino, P., Davis, J.L., Grant, C.G., Green, B.L., Sanz, M., 2014. Language and Communication Services: A Cancer Centre Perspective. Diversity and Equality in Health and Care 11(1),
71–80. URL www.primescholars.com/articles/language-and-communication-services-a-cancer-
centre-perspective-94515.html (accessed 16.7.2024).
Rowe, A., Twose, A., Makower, C., Mitchell, E., Benton, G., Munro Kerr, M., Singh, N., Meyer, N.,
Rath, N., Krishna, N., Thompson, T., 2019. Systematic Failure: Immigration Bail Hearings. Bail
Observation Project. URL https://bailobs.org/wp-content/uploads/2019/10/systematic-failure-1.
pdf (accessed 16.7.2024).
Shaw, S., 2016. Review into the Welfare in Detention of Vulnerable Persons: A Report to the Home
Office by Stephen Shaw. URL https://assets.publishing.service.gov.uk/government/uploads/system/
uploads/attachment_data/file/490782/52532_Shaw_Review_Accessible.pdf
Shaw, S., 2018. Assessment of Government Progress in Implementing the Report on the Welfare in
Detention of Vulnerable Persons: A Follow-Up Report to the Home Office by Stephen Shaw. URL
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/
file/728376/Shaw_report_2018_Final_web_accessible.pdf
Singureanu, D., Braun, S., Buysse, L., Davitti, E., De Wilde, J., González Figueroa, L.A., Guaus, A.,
Maryns, K., Mazzanti, E., Pöllabauer, S., 2023a. EU-WEBPSI: Baseline Study and Needs Analysis
for Public Service Interpreting, Video-Mediated Interpreting, and Languages of Lesser Diffusion
Interpreting. www.webpsi.eu/wp-content/uploads/2023/03/Comprehensive-research-report.pdf
(accessed 16.7.2024).
Singureanu, D., Gough, J., Hieke, G., Braun, S., 2023b. “I am His Extension in the Courtroom”:
How Court Interpreters Cope with the Demands of Video-Mediated Interpreting in Hearings with
Remote Defendants. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Cur-
rent and Future Trends. John Benjamins, Amsterdam, 72–108. URL https://doi.org/10.1075/
ivitra.37.04sin
Skaaden, H., 2018. Remote Interpreting: Potential Solutions to Communication Needs in the Refugee
Crisis and Beyond. European Legacy 23(7–8), 837–856. URL https://doi.org/10.1080/10848770.
2018.1499474
Sultanić, I., 2020. Medical Interpreter Education and Training. In Ji, M., Laviosa, S., eds. The
Oxford Handbook of Translation and Social Practices. Oxford University Press, Oxford, 356–377.
URL https://doi.org/10.1093/oxfordhb/9780190067205.013.23
Tam, I., Fisher, E., Huang, M.Z., Patel, A., Rhee, K.E., 2020. Spanish Interpreter Services for the
Hospitalised Pediatric Patient: Provider and Interpreter Perceptions. Academic Pediatrics 20(2),
216–224. URL https://doi.org/10.1016/j.acap.2019.08.012
TEPIS [The Polish Society of Sworn and Specialised Translators], 2019. Kodeks zawodowy tłumacza
przysięgłego [The Professional Code of Sworn Translators]. URL https://tepis.org.pl/wp-content/
uploads/2020/02/Kodeks_zawodowy_t%C5%82umacza_przysi%C4%99g%C5%82ego_2019.
pdf (accessed 16.1.2025).
Ticca, A.C., Jouin, E., Traverso, V., 2023. Training Interpreters in Asylum Settings: The REMILAS
Project. In Gavioli, L., Wadensjö, C., eds. The Routledge Handbook of Public Service Interpreting.
Routledge, 362–382.
Tipton, R., Furmanek, O., 2016. Dialogue Interpreting: A Guide to Interpreting in Public Services
and the Community. Routledge, London. URL https://doi.org/10.4324/9781315644578
Translators without Borders, 2020. TWB and Kobo Inc Develop Speech Recognition Technology to
Capture Voices of Speakers of Marginalized Languages. URL https://translatorswithoutborders.org/
twb-and-kobo-inc-develop-speech-recognition-technology-to-capture-voices-of-speakers-of-marginalized-
languages/ (accessed 15.10.2024).
UNHCR, 2020. Remote Interviewing: Practical Considerations for States in Europe During
COVID-19. UNHCR Operational Data Portal (ODP). URL https://data2.unhcr.org/en/docu-
ments/details/77134 (accessed 4.10.2024).
van Rotterdam, P., van den Hoogen, R., 2012. True-to-Life Requirements for Using Videoconferenc-
ing in Legal Proceedings. In Braun, S., Taylor, J., eds. Videoconference and Remote Interpreting in
Criminal Proceedings. Intersentia, Antwerp, 187–198.
Vastaanottava Pohjois-Savo [The Welcome to Northern Savo], 2011. VÄLTÄ VÄÄRINKÄSI-
TYKSIÄ – KÄYTÄ TULKKIA [Avoid Misunderstandings – Use an Interpreter]. URL https://doc-
player.fi/1815584-Valta-vaarinkasityksia-kaytatulkkia.html (accessed 16.1.2025).
PART V
17.1 Introduction
Quality in interpreting is a fundamental yet elusive and relative construct which has tra-
ditionally been shaped by various perspectives and has been difficult to define or assess
through a unified approach. As a multidimensional concept, quality encompasses elements
such as textual accuracy, source–target correspondence, communicative effect, and the
interpreter’s role performance (Pöchhacker, 2001) and thus requires diverse evaluation
methods. While quality has always been a core concern of the interpreting profession, it
only became a focus of research in the 1980s. The question of quality evaluation has long
been debated, with different approaches offered. Traditionally, conference interpreting has
centred on product-oriented analyses, whereas community interpreting has emphasised
competence-oriented evaluations in its quest for establishing professional standards. Still,
more than 20 years into the 21st century, the elusiveness of the concept of quality and its
dependence on a plethora of contextual factors remain major research challenges.
The significant and rapid technological advancements made in recent decades have
complicated quality definition and assessment further by introducing both technological
variables and a new set of unknowns which relate to the efficiency of human–machine
collaboration in the delivery of interpreting assignments. The interaction between tech-
nology and interpreting has spawned various modalities of distance interpreting. These
include video-mediated interpreting (VMI; see Braun, this volume) and telephone interpret-
ing (TI; see Lázaro Gutiérrez, this volume), which in themselves comprise a wide variety of
sub-forms, depending on the social and technological contexts of application. For example,
both VMI and TI can be used for simultaneous or consecutive interpreting, or a mixture of the two, with
varied data throughputs and varied service delivery platforms (see, for example, Chmiel
and Spinolo, and Warnicke and Davitti, this volume). Each of these elements can affect
interpreter performance and, consequently, (aspects of) service quality.
Furthermore, the integration of assistive, AI-driven tools, such as automatic speech
recognition (ASR), adds another layer of complexity. These technologies have not only
expanded the possibilities of interpreting in different contexts but also enabled new assess-
ment methods, which range from semi-automated to fully automated metrics and raise
concerns about the reliability of such approaches. This chapter explores the multiple angles
from which quality evaluation in interpreting has been approached. It begins by revisiting
foundational concepts of quality in interpreting (Section 17.2) and examines key challenges
posed by the intersection of technology and interpreting (Section 17.3). Section 17.4 pro-
vides a comprehensive review of methods and perspectives used to assess quality in these
technologised workflows. Future developments are discussed in Section 17.5, followed by
conclusions in Section 17.6.
A number of survey-based studies have sought to explore the expectations of a range of users of interpreting services with regards to inter-
preting quality in the context of simultaneous interpreting. They built on and expanded
Bühler’s criteria, incorporating additional factors, such as interpreter’s voice, accent, and
intonation; setting; and subject type. These studies have highlighted that users generally
prioritise content over form when evaluating interpreting quality. However, these studies also show that user expectations can vary significantly with the users' background. While many users emphasised the importance of accuracy, logical cohesion, and completeness of information, users from diplomatic backgrounds, for example, placed greater emphasis on fluency and consistency, whereas those from technical fields prioritised accuracy and terminology. Furthermore, Moser (1996) observed that more experienced conference participants tended to have higher expectations of interpreting quality than less-frequent service users. Interpreters themselves were found to have higher expectations than users, placing greater importance on consistency, completeness, and logical cohesion as key quality factors (Bühler, 1986; Chiaro and Nocella, 2004). User expectations have also been
explored in relation to different settings, including healthcare (Mesa, 1997) and court inter-
preting (Kadric, 2001). Here, findings revealed that different professional backgrounds,
gender, age, and experience shaped quality priorities.
Parallel to user-centred studies, research into interpreter output has also evolved further.
Research in this area has focused on comparing interpreted versions (interpreters’ out-
put) with source speeches to assess and measure semantic equivalence and communicative
effectiveness, identify errors, or more broadly, ‘interpreting problems’. At the core of this
research lie evaluations by human raters, with approaches that can be broadly divided into
two categories. On the one hand, ‘bottom-up approaches’ assess quality through error/
item-based classifications and weighted scores according to the severity of an error or prob-
lem. These approaches are increasingly used to assess the quality of interpreting-related
workflows (see Section 17.4.2). On the other hand, ‘top-down’ approaches assess the inter-
preter’s output against an explicit rubric/set of criteria covering a range of dimensions of
interpreting quality. These approaches use (weighted) scales to rate the interpreter’s perfor-
mance level for each criterion and have been used widely for both consecutive and simul-
taneous interpreting, often in the context of interpreter training (see Section 17.4.1). There
is ongoing debate regarding the effectiveness of different assessment types in interpreting,
particularly for pedagogical and research purposes (e.g. Dawrant and Han, 2021; Han and
Lu, 2021b; Liu, 2013; Tiselius, 2009). Error/item-based assessments provide a fine-grained,
nuanced analysis of issues in an interpreter’s output but can be more labour-intensive.
Criterion-referenced approaches tend to be less time-consuming and offer holistic evalu-
ations; however, subjectivity remains a concern. To address this, detailed descriptors tend
to be developed to guide assessors, and multiple evaluators are often used. Assessor train-
ing, calibration, and the use of multiple assessors are essential in both error/item-based
and criterion-referenced evaluation methods. These processes help mitigate subjectivity and
ensure consistency in quality assessments. These methods also form the foundation for the
development of machine-based evaluation approaches, reviewed in Section 17.4.3. This
is because reliable human assessments make it possible to explore correlations with auto-
mated metrics, thereby enhancing the robustness of interpreting quality evaluations.
Despite variations, models of output quality have focused primarily on linguistic aspects
of interpreting performance. In an attempt to move beyond output quality and broaden the
scope of the concept of quality, Shlesinger (1997) contended that the quality of the inter-
preter’s output should be assessed at three levels: intertextual level (comparison between
source speech and the interpreter’s output), intratextual level (interpreter’s output in its
own right), and instrumental level (usefulness and comprehensibility of the interpreter’s
output for a given audience and purpose). Together with findings from cognitive research,
which identify speech rate, interpreter preparation, and cognitive (over)load as factors that
influence interpreting quality (Gile, 1988, 1995/2009), Shlesinger’s model underpins the
case for developing more comprehensive models of quality that reach beyond the focus on
the interpreter’s output.
Examples of comprehensive models of quality assessment and assurance include those
developed by Pöchhacker (2001) and Kalina (2002, 2005). Pöchhacker (2001) argues for
quality to be broken down into different levels of assessment but states that the overarching
criterion should be successful communicative interaction.
Finally, the focus of quality assessment may be neither on the source text nor on lis-
teners’ comprehension or speakers’ intentions but on the process of communicative
interaction as such. From this perspective, which foregrounds the ‘(inter)activity’ of
interpreting rather than its nature as a ‘text-processing task’, quality essentially means
‘successful communication’ . . . among the interacting parties in a particular context
of interaction, as judged from the various (subjective) perspectives in and on the com-
municative event . . . and/or as analyzed more intersubjectively from the position of
an observer.
(Pöchhacker, 2001, 413)
Kalina (2002, 2005) emphasised the need for a model that considers the communication
situation, actors’ intentions and knowledge, and conditions impacting the event (before,
during, after). She also recognises that ‘total objectivity’ is not possible; ‘[w]hat is needed
is a model encompassing the communication situation, the intentions and knowledge
bases of different actors (including the interpreter), and any conditions liable to affect
the interpreted event’ (Kalina, 2002, 133). Ultimately, the goal is to achieve a complete,
accurate interpretation while being mindful of extralinguistic factors and situational con-
straints. By emphasising the importance of the communicative situation in which inter-
preting is embedded, Kalina shifts the focus towards the relationship between the output
quality and the interpreting process. She also highlights the role clients play in achiev-
ing quality, for instance, by providing preparation material, speaking clearly, and maintaining an appropriate pace.
While both the concept of interpreting quality and the criteria used to assess it remain
the subject of ongoing debate, there is consensus that interpreting quality is a complex, mul-
tifaceted concept, shaped by various parameters. It is also widely recognised that (interpret-
ing) quality is a social construct (Grbić, 2008), influenced by individual perspectives and
the situational context in which interpreting occurs. Different actors – such as interpreters,
clients and users of interpreting services, trainers, and examiners – play a role in defining
quality, as do the contextual factors of an interpreting assignment. Ozolins and Hale (2009)
highlight this by describing interpreting quality as a shared responsibility. Importantly, the
situational context for interpreting has been influenced or altered by the introduction of
technology. It therefore appears logical that quality assessment frameworks and models
should account for the unique challenges and opportunities that the application of technol-
ogy in interpreting presents. Nevertheless, open questions remain about how the use of
technology in interpreting should influence/shape evaluation methods and criteria.
For example, distance interpreting (further discussed later) has introduced challenges,
which include limited access to non-verbal cues, potential disruptions due to network prob-
lems, increased cognitive load, and fatigue. These challenges have been shown to affect the
interpreter’s performance. Nevertheless, there is currently limited understanding of how
these factors influence the quality perceptions of users of interpreting services. Similarly,
the use of language technologies, such as automatic speech recognition, during the inter-
preting task may enhance an interpreter’s output quality. However, this also increases the
interpreter’s need to multitask. The combined impact of these factors on both interpreting
quality and how it should be evaluated raises new questions for both research and practice.
The next section will review current challenges at the intersection of technology and inter-
preting, with a view to shaping the questions for quality assessment in technology-based
interpreting workflows. Section 17.4 will then explore the various approaches to quality
assessment that have already been applied to interpreting and technology research.
channels. This creates a high processing load for the professional. A current research challenge is to understand how interpreters strike a balance between the benefits and drawbacks of ASR, and to investigate the extent to which the reliability of the product is affected by factors such as the duration of training with the technology or the interpreter's associated skills – for example, sight translation.
The consideration of speech-to-text technologies in interpreting also inevitably leads to
questions concerning the hybrid practices in live communication (these include respeaking
and related workflows; see Davitti, this volume). Rather than receiving live audio interpre-
tation, (some) audiences in certain settings may opt to receive live text instead (also referred
to as ‘live captions’ or ‘live subtitles’). There are several ways to produce such text. Davitti
(this volume) places these on a continuum, from fully automated to largely human. Fur-
thermore, recent research has looked at a pertinent related question: how do we compare
(aspects of) quality across audio and text, as diverse users tend to take advantage of both?
For example, the MATRIC project1 (Korybski et al., 2022) investigated live subtitles pro-
duced in a hybrid (semi-automated) workflow with human respeakers and compared their
accuracy to that offered by top-level human interpreters. Despite some caveats concerning
the scale of the experiment and the genres and languages investigated, the study found that
the semi-automated workflow, combining intralingual respeaking and machine translation,
was capable of generating outputs that were similar in terms of accuracy and completeness
to those produced by human interpreters. In the emerging practice of interlingual respeak-
ing investigated by the SMART project2 (e.g. Davitti and Wallinheimo, 2025; Korybski
and Davitti, 2024, Davitti, this volume), the human is responsible for rendering live audio
directly into the target language through speech recognition, thus producing live subtitles
in a process that adds challenges to ‘regular’ interpreting. Here, emerging technologies like
speech recognition and machine translation have not only established new practices but
also raised the need to reshape the criteria for evaluating quality in such complex scenarios.
The emerging question, therefore, is how to measure aspects of quality in ways that transcend the audio–text input divide.
Aside from the research initiatives mentioned, there is a research gap in this respect,
which calls for investigation, especially with regards to the reception of content via live
audio and live text. This further complicates the challenge of carrying out comprehensive
quality assessment in a highly technologised interpreting (or hybrid) set-up and adds an
array of human end user variables. Therefore, one may claim that studying quality meas-
urement in interpreting requires prior acceptance of its interdisciplinary nature. Research
may cross boundaries with (neuro)cognitive studies, HCI studies, psychology, anthropol-
ogy, and culture studies, for example. However, earlier research in interpreting studies also
provides a solid foundation on which to build, as the next section will present.
shift towards bottom-up approaches. These incorporate elements such as category weight-
ing for errors to allow for a more nuanced understanding of performance. These approaches
have marked a move towards flexible and context-sensitive assessment techniques. Further-
more, this progression sets the stage for the development of (semi-)automated assessment
tools. These could offer efficiency and precision when evaluating interpreting performance
across various contexts and thus enable comparative analyses, which are currently less
common, due to the time-consuming nature of quality assessment. This section will provide
an overview of existing assessment methods, grouping them into ‘top-down’, ‘bottom-up’,
and automated approaches. Discussion will focus on the evolution, strengths, and limita-
tions of these methods, in addition to the role played by emerging technologies in shaping
and refining these methods within the field of interpreting.
yet statistically significant difference of 0.09 units on the 5-point rating scale. This result
slightly favoured on-site interpreting over RSI.
In legal settings, Hale et al. (2022) used a similar approach to compare interpreting
quality in on-site, telephone, and video-mediated interpreting during a police interview.
The criteria developed by Hale et al. (2018) to evaluate interpreting performance in police
interviews included ‘accuracy of propositional content’, ‘accuracy of manner and style’,
‘maintenance of verbal rapport markers’, ‘use of correct interpreting protocols’, ‘accuracy
of legal discourse and terminology’, ‘management and coordination skills’, and ‘bilin-
gual competence’. Each criterion was assigned a detailed descriptor, and the assessors
were asked to provide a score between 1 and 10 to each criterion. The scores were then
weighted. For example, the criterion ‘accuracy of propositional content’ had a weight of
30%, while other criteria were weighted between 10% and 15%. Two independent asses-
sors rated each interpreting performance based on a transcript, without knowing the test
conditions, and the mean score from both assessors was used in the quantitative analysis.
Hale and her colleagues found that the interpreters in the study performed better in on-site
and video-mediated interpreting than in interpreting via audio link.
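To illustrate how such weighted, criterion-referenced scoring works in practice, the sketch below combines two assessors' 1–10 ratings into a single weighted mean. Only the 30% weight for 'accuracy of propositional content' is taken from the study as reported above; the remaining weights, variable names, and data are illustrative assumptions.

```python
# Minimal sketch of a weighted, criterion-referenced score, in the spirit of
# Hale et al. (2022). Only the 30% weight for propositional accuracy is taken
# from the text; the remaining weights are illustrative and must sum to 1.0.
WEIGHTS = {
    "accuracy_of_propositional_content": 0.30,   # reported weight
    "accuracy_of_manner_and_style": 0.15,        # assumed
    "verbal_rapport_markers": 0.10,              # assumed
    "interpreting_protocols": 0.10,              # assumed
    "legal_discourse_and_terminology": 0.15,     # assumed
    "management_and_coordination": 0.10,         # assumed
    "bilingual_competence": 0.10,                # assumed
}

def weighted_score(ratings: dict) -> float:
    """Combine 1-10 criterion ratings into a single weighted score."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[c] * ratings[c] for c in WEIGHTS)

def final_score(assessor_1: dict, assessor_2: dict) -> float:
    """Mean of two independent assessors, as in the study design."""
    return (weighted_score(assessor_1) + weighted_score(assessor_2)) / 2

# Example: identical ratings of 8 on every criterion yield 8.0 overall.
ratings = {c: 8 for c in WEIGHTS}
print(final_score(ratings, ratings))  # 8.0
```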
The two studies selected here demonstrate both the potential and limitations of using
top-down assessment methods to compare interpreting quality across different conditions,
which is a common aim in the study of technology-enabled interpreting. While these methods provide a structured framework and allow the assessment criteria to be tailored to specific research questions and interpreting settings, they also impose limitations on the size of the data corpus and the number of language pairs that can be assessed without compromising consistency. The variability in human judgment, even with clear guidelines,
and the lack of data annotation (which makes it difficult to trace how judges arrived at their
scores) suggest that this approach is better suited to smaller datasets, where fewer judges
are needed. Therefore, one of the overall challenges of this method is scalability.
particularly in relation to error classification. Later research has continued to adopt bottom-up approaches, which have provided a basis for quantifying interpreter performance and laid the foundations for the error classification systems later developed for technology-mediated settings.
One early study by Hornberger et al. (1996) applied Barik's system to evaluate accuracy in language-discordant physician–mother consultations. The study compared two interpret-
ing modalities: remote simultaneous interpreting (experimental) and on-site consecutive
interpreting (control). Findings showed that remote simultaneous interpreting led to more
utterances per visit and a 13% lower rate of inaccurately interpreted mother utterances
(primarily omissions) compared to consecutive interpreting. Similar trends were observed in
physician utterances. This further demonstrates the utility of using this error classification
system to provide quantifiable evidence and identify trends, particularly when comparing
different modalities. More recently, in a study on the cognitive challenges of real-time auto-
mated captioning in interpreting during online meetings, Yuan and Wang (2023) adopted
Barik’s error classification to assess interpreting performance in two conditions, namely,
with and without live captioning. According to the authors, ‘the purpose of utilizing an
error-based analysis is to compromise the subjective caused by different makers recruited
for interpreting performance assessment’ (Yuan and Wang, 2023, 4).
Moser-Mercer (2003) conducted a comparative analysis between on-site interpreting and
booth-based RSI to assess the impact of booth-based RSI on interpreting quality and human
factors, such as interpreter stress and fatigue. In this case, bottom-up analysis included
transcribing and assessing interpreters’ performance outputs using an error rating scale
developed by Moser-Mercer et al. (1998) and adapted from similar earlier research (Barik,
1971; Gerver, 1974). This scale included four categories for meaning-related errors, namely,
‘“contre-sens” – saying exactly the opposite of what the speaker said; “faux-sens” – say-
ing something different from what the speaker said; “nonsense” – not making any sense
at all; “imprecision” – not capturing all of the original meaning (leaving out nuances)’
(Moser-Mercer et al., 1998, 54). This section of their assessment framework (which also
included methods to explore other human factors, such as baseline stress measurements)
captured a range of interpreter errors, such as omissions, additions, hesitations, corrections,
grammatical mistakes, and lexical errors. These were then graded based on their severity.
Errors that lead to a contre-sens were considered ‘the most serious’, and lexical mistakes
were deemed ‘the least serious’.
The AVIDICUS projects (Braun and Taylor, 2012a; Braun et al., 2013) extended
bottom-up approaches to the analysis of interpreting quality in a different mode and con-
text, namely, (simulated) police and prosecution interviews. Each scenario presented an
instance of two-way consecutive interpreting between a police officer or prosecutor and
a suspect or witness, in three different language combinations. Whilst in AVIDICUS 1, one
session was conducted using on-site interpreting and other sessions used video-mediated
interpreting, the sessions in AVIDICUS 2 all involved video-mediated interpreting, using
a variety of equipment and set-ups, and providing training to the interpreters prior to their
session. These comparative studies adapted Kalina’s (2002) comprehensive criteria, which
had been (primarily) used to assess conference interpreting performances, and combined
them with the quality analysis requirements for interpreting in legal settings. In addition,
AVIDICUS 2 also incorporated more nuanced language-based categories, including omis-
sions, additions, inaccuracies, lexical/terminological issues, and interactional problems (e.g.
turn-taking), and non-verbal and visual elements (such as gaze direction and being out of
shot). Analysis involved coding by multiple raters and the quantification of findings, complemented by qualitative analysis to assess the scale of emerging problem areas and identify critical instances. In addition, the interviews were divided into relevant genre
moves (e.g. introduction, caution, suspect’s version, etc., in the police interviews) to analyse
the occurrence of interpreting both within genre moves and over time (Braun, 2013).
While the researchers recognised that these categories only represent ‘one step of the
way to a comprehensive assessment of the viability of video-mediated criminal proceed-
ings that involve an interpreter' (Braun and Taylor, 2012b, 115), the findings indicated that
video-mediated interpreting amplified known challenges in legal interpreting, particularly in
relation to omissions, additions, and inaccuracies, which were found to be more prevalent
in the technology-mediated settings. Inaccuracies were broken down further into subcatego-
ries, such as ‘distortions’, and given differential weighting, based on severity. Moreover, the
adopted approach revealed that certain problem types tended to co-occur (e.g. turn-taking
problems with omissions, with this being stronger in video-mediated interpreting). Conse-
quently, these categories were analysed together. The research also supported earlier findings
(e.g. from Moser-Mercer, 2003), which suggested that fatigue sets in more quickly dur-
ing technology-enabled sessions than in face-to-face interpreting, as evidenced by a greater
increase in interpreting problems. However, the study’s findings, based on simulations and a
small sample size, call for further replication to validate their significance.
Bottom-up error-based assessment methods that have been developed within interpret-
ing studies have seen a recent revival in relation to hybrid practices (e.g. live speech-to-text)
involving interaction between human professionals and speech recognition technology (aka
respeaking; see Davitti, this volume, for contextualisation of this practice as a unique form
of interpreting that makes content accessible to hearing, non-hearing, and other-language-speaker individuals). The NTR model was developed by Romero-Fresco and Pöchhacker (2017, 159; Figure 17.1). Based on the NER model for intralingual live subtitling (Romero-Fresco and Martínez, 2015), it focuses on quantifying aspects such as accuracy and achieves this
via error classification and weighting. The model distinguishes between ‘recognition errors’
and ‘translation errors’, the latter including both ‘content-related’ errors, that is, omissions,
additions, and substitutions, and ‘form-related’ errors, that is, grammatical correctness and
style errors. In this model, errors are attributed a score depending on their severity: ‘Minor’
errors (penalised with a −0.25-point deduction) do not hamper comprehension, ‘major’
errors (−0.50) cause confusion or loss of information, and ‘critical’ errors (−1) introduce
false or misleading information to the output.
However, the NTR model differs from earlier approaches based on error classifications.
One major difference is the introduction of the category of 'effective editions' (EEs)
to account for editions that do not cause loss of information and may even improve the
text (Korybski and Davitti, 2024). While EEs are not assigned numerical scores, they do
play a role in the analysis, as they highlight the strengths of human editing. Furthermore,
space for qualitative evaluations is also accounted for. The accuracy rate is calculated using
the formula shown in Figure 17.1. In live intralingual subtitling, for subtitle accuracy to be
considered ‘acceptable’, the minimum accuracy threshold is set at 98% (Romero-Fresco,
2011). Several studies have validated the NER model in professional settings (e.g. Ofcom,
2015a, 2015b). In training, the model is also considered a useful diagnostic tool to
identify recurrent errors in the performance of respeakers. However, no established quality
benchmark has yet been validated for interlingual live subtitling.
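As a concrete illustration of the scoring logic just described, the following sketch computes an NTR-style accuracy rate, assuming the (N − T − R)/N × 100 formula behind Figure 17.1 (with N the number of words in the subtitled output) and the severity deductions reported above; it is a simplified approximation, not a replacement for the full model.

```python
# Minimal sketch of an NTR-style accuracy calculation, assuming the
# (N - T - R) / N * 100 formula behind Figure 17.1, where N is the number of
# words in the subtitled output, T the summed translation-error penalties and
# R the summed recognition-error penalties. Severity deductions follow the text
# (minor -0.25, major -0.5, critical -1); effective editions (EEs) are logged
# separately and not penalised.
PENALTY = {"minor": 0.25, "major": 0.5, "critical": 1.0}

def ntr_accuracy(n_words: int, translation_errors, recognition_errors) -> float:
    """Return the accuracy rate (%) for one live-subtitled segment.

    translation_errors / recognition_errors are lists of severity labels,
    e.g. ["minor", "major"].
    """
    t = sum(PENALTY[s] for s in translation_errors)
    r = sum(PENALTY[s] for s in recognition_errors)
    return (n_words - t - r) / n_words * 100

# Example: 200 words, two minor and one major translation error, one minor
# recognition error -> total penalty of 1.25 points.
print(round(ntr_accuracy(200, ["minor", "minor", "major"], ["minor"]), 2))  # 99.38
```

On this calculation, the example segment would clear the 98% threshold used for intralingual live subtitling, although, as noted above, no equivalent benchmark has been validated for interlingual output.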
In effect, both the NER and NTR models assess interpreting accuracy from the perspec-
tive of evaluators rather than the end users. That being said, in principle, error scoring is
guided by the impact that a certain error may have on viewers’ comprehension. Conse-
quently, this leaves room for borderline cases where distinguishing between an error and
a positive strategy is more challenging. Moreover, the subjectivity involved in error rat-
ing can affect the consistency of assessments. This highlights the importance of having a
second-marking process when applying the model, to ensure reliability and reduce indi-
vidual bias in evaluation.
Adaptations of the NTR model have started to be used to evaluate interpreting perfor-
mance, with a view to producing findings that are comparable across studies. For instance,
Korybski et al.’s (2022) study compared a semi-automated workflow for live interlingual
subtitling, which involved a human (intralingual) respeaker paired with machine transla-
tion, against the output of simultaneous interpreting across several language pairs (Spanish,
Italian, French, Polish). The study used source speeches from the European Parliament for
both workflows. Importantly, recognition errors were not captured since they represented
an interim stage, specific to the semi-automated workflow’s intralingual respeaking process,
and did not apply to simultaneous interpreting (the benchmark workflow).
Rodríguez González (2024; see also Rodríguez González et al., 2023) also adapted the
NTR model to evaluate the impact of ASR on interpreters’ performance in platform-based
RSI. To this end, recognition errors were excluded, as they were deemed irrelevant to the
interpreting workflow. A new category of disfluency was introduced to capture important
aspects of the interpreting delivery, namely, interjections, false starts, unfinished sentences,
truncated words, self-repairs, repetitions, silent and filled pauses. Three or more disfluen-
cies within an idea unit were penalised as a ‘minor style-related’ error. These adjustments
‘provided the research team with a pragmatic and systematic tool that enabled a rigorous
comparative analysis, based on a quantitative assessment, that captured the differences that
are present in the interpretations, both from intra- and extralinguistic perspectives’ (Rod-
ríguez González, 2024, 75).
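The disfluency rule described above lends itself to a simple illustration. In the sketch below, idea units and their disfluency counts are assumed to have been annotated manually, as in the study; the code merely applies the 'three or more per idea unit' penalty and is illustrative rather than part of the published method.

```python
# Illustrative sketch of the disfluency rule described above: three or more
# disfluencies within an idea unit are penalised as one 'minor' style-related
# error (-0.25). In the study itself, idea units and disfluencies were
# annotated manually; here they are simply given as counts.
MINOR_PENALTY = 0.25

def disfluency_penalties(disfluencies_per_idea_unit):
    """Return the total style penalty for a list of per-unit disfluency counts."""
    return sum(MINOR_PENALTY for count in disfluencies_per_idea_unit if count >= 3)

# Four idea units with 1, 3, 0 and 5 disfluencies -> two minor errors.
print(disfluency_penalties([1, 3, 0, 5]))  # 0.5
```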
Two further studies are currently adapting the NTR model to explore how different
methods of integrating ASR into the interpreting workflow affect interpreting quality in
consecutive/dialogue interpreting in healthcare settings (Tan et al., 2024) and legal settings (Tang et al., 2024).
Overall, approaches that aim to quantify aspects of interpreting performance offer
the potential to reduce the subjectivity that has traditionally been a concern in quality assessment and to achieve greater consistency and replicability across studies. However,
these methods remain somewhat experimental, given their labour-intensive nature, which
limits their application to large datasets. Semi-automation of quality assessment has been
tested in the context of intralingual speech-to-text practices (e.g. the NER Buddy; Alonso
Bacigalupe and Romero-Fresco, 2023). While this approach currently seems to perform
better with verbatim renditions, its use for the evaluation of interlingual practices is
worth exploring.
In relation to interlingual respeaking, the SMART project (see endnote 2) adopted an
NTR-driven assessment method involving two independent evaluators per performance
over a total of 153 performances and six language directionalities (see Davitti, this vol-
ume, and Davitti and Wallinheimo, 2025, for further information about the study design).
Through this bottom-up approach based on errors, different error types (as discussed ear-
lier in this section) as well as effective editions (EEs, that is, moves having a positive impact
on the final output) were manually identified. The results were used to calculate final accu-
racy scores, but also in multiple regressions, to identify predictors of accuracy (or lack
thereof) across all participants and scenarios tested. Findings showed that omissions were
the strongest negative predictor of accuracy (β = −1.1, p < .001), followed by substitutions
(β = −.19, p < .001), and recognition errors (β = −.31, p < .001). EEs emerged as positive
predictors of accuracy across all scenarios (β = .31, p = .03).
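For readers unfamiliar with this type of analysis, the sketch below shows how such a multiple regression might be set up in Python; the data are randomly generated placeholders and the coefficients are purely illustrative, not the SMART results.

```python
# Sketch of the kind of multiple regression described above: predicting an
# NTR accuracy score from counts of error types and effective editions (EEs).
# The data below are randomly generated placeholders, not the SMART results.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 153  # same order of magnitude as the performances analysed
df = pd.DataFrame({
    "omissions": rng.poisson(4, n),
    "substitutions": rng.poisson(3, n),
    "recognition_errors": rng.poisson(2, n),
    "effective_editions": rng.poisson(5, n),
})
# Fabricated relationship, purely so that the example runs end to end.
df["accuracy"] = (98 - 1.0 * df.omissions - 0.3 * df.substitutions
                  - 0.4 * df.recognition_errors + 0.3 * df.effective_editions
                  + rng.normal(0, 0.5, n))

X = sm.add_constant(df[["omissions", "substitutions",
                        "recognition_errors", "effective_editions"]])
model = sm.OLS(df["accuracy"], X).fit()
print(model.summary())  # the fitted coefficients play the role of the reported beta values
```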
These findings highlight the potential for semi-automation of error detection, especially
as omissions are stronger predictors of inaccuracy than other error types. A system that
automatically identifies errors that have a strong impact on accuracy would provide a
quick assessment of the level of accuracy achieved by a specific output while reducing the
time required for analysis. More detailed, manual analysis could then be directed to those
errors that require a more nuanced source–target comparison or could be applied on a sam-
pling basis. However, the small-scale attempt at semi-automating this process presented in
Alonso Bacigalupe and Romero-Fresco (2023) shows that current systems struggle to differ-
entiate between omissions as strategic edits (e.g. for condensation purposes) and omissions
as actual errors in intralingual respeaking, which is likely to be exacerbated when language
transfer is involved.
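One possible building block for such semi-automation is sketched below, under the assumption that a multilingual sentence-embedding model is available: source segments whose closest match in the target falls below a similarity threshold are flagged as possible omissions. Whether a flagged segment is a strategic condensation or a genuine omission error would still require human judgement, which is precisely the limitation discussed above.

from sentence_transformers import SentenceTransformer, util

# A multilingual model allows source and target segments to be compared
# across languages; the model name here is an example, not a recommendation.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def flag_possible_omissions(source_segments, target_segments, threshold=0.55):
    """Return source segments with no sufficiently similar target segment.

    Low similarity suggests a possible omission, but it may equally reflect
    legitimate condensation, so flagged segments are only candidates for
    manual review.
    """
    src_emb = model.encode(source_segments, convert_to_tensor=True)
    tgt_emb = model.encode(target_segments, convert_to_tensor=True)
    similarities = util.cos_sim(src_emb, tgt_emb)  # matrix of shape (n_src, n_tgt)
    return [
        segment
        for i, segment in enumerate(source_segments)
        if similarities[i].max().item() < threshold
    ]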
While improvements in prompts and ASR technology will continue to enhance per-
formance, at the time of writing, human judgment remains irreplaceable for a compre-
hensive evaluation of quality, particularly in interlingual live content, where a verbatim
approach is rarely effective. This also highlights both the potential and the limitations of
automated systems, which, though efficient, lack the flexibility and adaptability of human
decision-making. Having to reassess post hoc which instances identified by the system as
‘errors’ actually have a positive or negative impact on the output would undermine any
efficiency gains. These considerations also point to the need to tailor automated assessment
systems to specific contexts and scenarios. Further discussion on full automation for quality
evaluation purposes, which would improve consistency and allow for processing of larger
datasets, will be offered in the following section.
Stewart et al. (2018) attempted to use MTQE to evaluate interpretations. They extended
QuEst++ (Specia et al., 2015) to account for interpreting-specific features, such as the ratio of pauses/hesitations/incomplete words, the ratio of non-specific words, and the ratio of ‘quasi-cognates’. Their evaluation showed that the proposed method improved the correlation of the predicted METEOR scores over a baseline, without requiring a reference interpretation. Despite Stewart et al.’s (2018) promising results, it is currently difficult to use MTQE methods in interpreting, due to the lack of training data.
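The kind of surface features involved can be illustrated with the short sketch below, which computes the ratio of filled pauses/hesitations and of incomplete words in a transcript. The marker list and the convention for incomplete words are assumptions made for the example and would need adapting to the transcription conventions and languages at hand; this is not the QuEst++ implementation.

import re

# Illustrative hesitation markers; real transcripts follow project-specific conventions.
HESITATIONS = {"uh", "um", "er", "erm", "hmm"}

def interpreting_features(transcript: str) -> dict:
    """Compute simple per-token ratios in the spirit of the features described above."""
    tokens = transcript.lower().split()
    n = len(tokens) or 1
    hesitation_ratio = sum(t.strip(".,?!") in HESITATIONS for t in tokens) / n
    # Incomplete words are assumed to be transcribed with a trailing hyphen, e.g. "compli-".
    incomplete_ratio = sum(bool(re.fullmatch(r"\w+-", t)) for t in tokens) / n
    return {"hesitation_ratio": hesitation_ratio, "incomplete_word_ratio": incomplete_ratio}

print(interpreting_features("the um delegation will uh consider the compli- the complete proposal"))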
Although metrics like BLEU and METEOR have been shown to correlate with human scores, the scores they provide can only be used to rank interpretations. Taken alone, these scores have limited utility, as they do not correspond to any predefined levels of quality. The same criticism applies to methods derived from machine translation quality estimation, such as the one proposed by Stewart et al. (2018). This limitation is addressed by the method presented in the next section.
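Before turning to that method, the ranking-only use of such metrics can be illustrated with the minimal sketch below, which scores two candidate interpretations (as transcripts) against a single reference using sacrebleu. The sentences are invented; the resulting numbers indicate only which candidate is closer to this particular reference, not whether either one reaches a given level of quality.

import sacrebleu

reference = ["the committee approved the budget for the next fiscal year"]

candidates = {
    "interpretation_A": ["the committee approved next year's budget"],
    "interpretation_B": ["the budget was approved by the committee for the next fiscal year"],
}

# corpus_bleu expects a list of hypotheses and a list of reference streams.
scores = {
    name: sacrebleu.corpus_bleu(hypothesis, [reference]).score
    for name, hypothesis in candidates.items()
}

# Higher BLEU means 'closer to this reference'; it maps to no predefined quality level.
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: BLEU = {score:.1f}")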
Although these metrics are meant to measure the quality of machine interpreting, they can also be applied to human interpreting. The speech output of these systems is evaluated using BLASER (BLASER et al., 2023), which compares the translated speech with the reference speech; the same comparison can be applied to a human interpreter’s performance. However, as previously mentioned, such scores are only useful when comparing two or more interpretations.
17.6 Conclusion
This chapter has undertaken the challenging task of exploring the broad, multifaceted con-
cept of quality within the context of interpreting and technology. It began by reviewing
key concepts related to interpreting quality and identifying specific challenges that arise
when technology is involved. It then provided an overview of the various evaluation approaches adopted to explore this intersection, categorising them into top-down, bottom-up, and automated methods.
A significant issue that was highlighted in relation to the latter group is the use of written
translation assessment methods to assess interpreting, which may not always be appropri-
ate. This is the case, for instance, in consecutive interpreting, where aspects of delivery (e.g.
intonation) and condensation play crucial roles. To adequately address these complexities,
it is essential to refine assessment methods to reflect the nuances of spoken language and
account for changes introduced by technology.
Looking ahead, a hybrid approach to quality assessment and evaluation appears promis-
ing. This approach would incorporate elements of automation for scalability while preserv-
ing the qualitative insights critical for assessing interpreting performance – especially as it
relates to technology and AI.
Further research endeavours based on larger datasets can also inform quality assessment
practices employed by different stakeholders. With the growing scale of live multilingual
(and multimodal) communication, one can predict increased interest in research-informed
semi-automated methods for interpreting quality evaluation. Additionally, the impact of
the ongoing academic debate on quality in interpreting is likely to extend beyond interpreting studies, provided that sufficient, transparent, and iterative communication and collaboration between industry and academia are maintained.
Furthermore, using some AI-driven technologies in evaluation can provide deeper insight into how the use of other AI-driven technologies during interpreting delivery could impact both the process and the product of interpretation. In addition, AI-driven
technology will be instrumental not only in assessing the quality of output delivered by
professional interpreters working in different workflows (at different levels of automation)
but also in shaping assessment in interpreter training contexts in the near future.
Dynamic, highly responsive research in the area of interpreting quality is indispensable to
ensure recurrent verification of the affordances of new technologies. With research-informed
insights, it will be possible to mitigate the current dominance of ‘tech hype’ over reason,
especially regarding the role of humans in providing high-quality, real-time communication
across languages. Ongoing research into interpreting quality and the best methods to eval-
uate technology-driven interpreting workflows may help counter some of the premature
claims about the universal feasibility of automated interpreting, often made by advocates
of major AI companies. If these claims remain unchallenged, they could discourage future
interpreters, potentially leading to a shortage, or even the disappearance, of a socially vital
profession.
Notes
1 MATRIC project – Machine Translation and Respeaking in Interlingual Communication, Expand-
ing Excellence in England, Research England, 2020–2024.
2 SMART project – Shaping Multilingual Access through Respeaking Technology, ES/T002530/1,
Economic and Social Research Council UK, 2020–2023. URL https://smartproject.surrey.ac.uk/
3 An n-gram is a group of n consecutive words in a text. The most common way of applying BLEU uses n = 1 to 4, that is, it compares groups of one word (unigrams), two words (bigrams), three words (trigrams), and four words (4-grams).
4 https://iwslt.org/ (accessed 4.4.2025).
References
Alonso Bacigalupe, L., Romero-Fresco, P., 2023. The Application of Artificial Intelligence-Based
Tools in Intralingual Respeaking: The NER Buddy. In Corpas Pastor, G., Hidalgo-Ternero, C., eds.
Proceedings of the International Workshop on Interpreting Technologies SAY IT AGAIN 2023,
9–15. URL https://lexytrad.es/SAYITAGAIN2023/
Banerjee, S., Lavie, A., 2005. METEOR: An Automatic Metric for MT Evaluation with High Levels
of Correlation with Human Judgments. In Proceedings of the ACL Workshop on Intrinsic and
Extrinsic Evaluation Measures for Machine Translation and/or Summarization, Ann Arbor, MI,
65–72.
Barik, H.C., 1969. A Study of Simultaneous Interpretation (PhD thesis).
Barik, H.C., 1971. A Description of Various Types of Omissions, Additions and Errors of Trans-
lation Encountered in Simultaneous Interpretation. Meta 16(4), 199–210. URL https://doi.
org/10.7202/001972ar
Barik, H.C., 1975. Simultaneous Interpretation: Qualitative and Linguistic Data. Language and
Speech 18(3), 272–297.
BLASER, Seamless Communication, Barrault, L., Chung, Y.-A., Meglioli, M.C., Dale, D., Dong, N.,
Duquenne, P.-A., Elsahar, H., Gong, H., Heffernan, K., Hoffman, J., Klaiber, C., Li, P., Licht,
D., Maillard, J., Rakotoarison, A., Sadagopan, K.R., Wenzek, G., Ye, E., Akula, B., Chen, P.-J.,
Hachem, N.E., Ellis, B., Gonzalez, G.M., Haaheim, J., Hansanti, P., Howes, R., Huang, B., Hwang,
M.-J., Inaguma, H., Jain, S., Kalbassi, E., Kallet, A., Kulikov, I., Lam, J., Li, D., Ma, X., Mavlyutov,
R., Peloquin, B., Ramadan, M., Ramakrishnan, A., Sun, A., Tran, K., Tran, T., Tufanov, I., Vogeti,
V., Wood, C., Yang, Y., Yu, B., Andrews, P., Balioglu, C., Costa-jussà, M.R., Celebi, O., Elbayad,
M., Gao, C., Guzmán, F., Kao, J., Lee, A., Mourachko, A., Pino, J., Popuri, S., Ropers, C., Saleem,
S., Schwenk, H., Tomasello, P., Wang, C., Wang, J., Wang, S., 2023. SeamlessM4T: Massively Mul-
tilingual & Multimodal Machine Translation. URL https://arxiv.org/abs/2308.11596
Braun, S., 2013. Keep Your Distance? Remote Interpreting in Legal Proceedings. Interpreting 15(2),
200–228. URL https://doi.org/10.1075/intp.15.2.03bra
Braun, S., Taylor, J., eds., 2012a. Videoconference and Remote Interpreting in Criminal Proceedings.
Intersentia, Antwerp.
Braun, S., Taylor, J., 2012b. AVIDICUS Comparative Studies – Part I: Traditional Interpreting and
Remote Interpreting in Police Interviews. In Braun, S., Taylor, J., eds. Videoconference and Remote
Interpreting in Criminal Proceedings. Intersentia, Antwerp, 99–117.
Braun, S., Taylor, J., Miler-Cassino, J., Rybinska, Z., Balogh, K., Hertog, E., Vanden Bosch, Y., Rombouts,
D., Licoppe, C., Verdier, M., 2013. Assessment of Video-Mediated Interpreting in the Criminal Jus-
tice System: AVIDICUS 2 – Action 2 Research Report. URL http://wp.videoconference-interpreting.
net/wp-content/uploads/2014/01/AVIDICUS2-Research-report.pdf
Bühler, H., 1986. Linguistic (Semantic) and Extra-Linguistic (Pragmatic) Criteria for the Evaluation
of Conference Interpretation and Interpreters. Multilingua 5(4), 231–235.
Chiaro, D., Nocella, G., 2004. Interpreters’ Perception of Linguistic and Non-Linguistic Factors
Affecting Quality: A Survey Through the World Wide Web. Meta 49(2), 278–293. URL https://
doi.org/10.7202/009351ar
Choi, J.Y., 2013. Assessing the Impact of Text Length on Consecutive Interpreting. In Tsagari, D.,
van Deemter, R., eds. Assessment Issues in Language Translation and Interpreting. Peter Lang,
Frankfurt am Main, 85–96.
Collados Aís, Á., 1998. La evaluación de la calidad en interpretación simultánea. La importancia de
la comunicación no verbal. Editorial Comares, Granada.
Davitti, E., Wallinheimo, A.-S., 2025. Investigating Cognitive and Interpersonal Factors in Hybrid
Human-AI Practices: An Empirical Exploration of Interlingual Respeaking. Target 37, Special
Issue: Mapping Synergies within Cognitive Research on Multilectal Mediated Communication.
Dawrant, A., Han, C., 2021. Testing for Professional Qualification in Conference Interpreting. In
Albl-Mikasa, M., Tiselius, E., eds. Routledge Handbook of Conference Interpreting. Routledge,
London, 258–274.
Fantinuoli, C., 2017. Speech Recognition in the Interpreter Workstation. In Esteves-Ferreira, J.,
Macan, J., Mitkov, R., Stefanov, O.-M., eds. Translating and the Computer 39: Proceedings. Edi-
tions Tradulex, Geneva, 25–34.
Frittella, F.M., 2019. 70.6 Billion World Citizens: Investigating the Difficulty of Interpreting Numbers.
Translation and Interpreting, 11(1), 79–99. URL https://doi.org/10.12807/ti.111201.2019.a05
García Becerra, O., Collados Aís, Á., 2019. Quality, Interpreting. In Baker, M., ed. Encyclopedia of
Translation Studies. Routledge, London, 454–458.
Gerver, D., 1969/2002. The Effects of Source Language Presentation Rate on the Performance of
Simultaneous Conference Interpreters. In Pöchhacker, F., Shlesinger, M., eds. The Interpreting
Studies Reader. Routledge, London, 53–66.
Gerver, D., 1974. The Effects of Noise on the Performance of Simultaneous Interpreters: Accuracy of
Performance. Acta Psychologica 38, 159–167.
Gile, D., 1988. Le partage de l’attention et le ‘modèle d’effort’ en interprétation simultanée. The
Interpreters’ Newsletter 1, 4–22.
Gile, D., 1995/2009. Basic Concepts and Models for Interpreter and Translator Training. John Ben-
jamins, Amsterdam.
Grbić, N., 2008. Constructing Interpreting Quality. Interpreting 10(2), 232–257.
Hale, S., Goodman-Delahunty, J., Martschuk, N., 2018. Interpreter Performance in Police Interviews:
Differences Between Trained Interpreters and Untrained Bilinguals. The Interpreter and Translator
Trainer 13(2), 107–131. URL https://doi.org/10.1080/1750399X.2018.1541649
Hale, S., Goodman-Delahunty, J., Martschuk, N., Lim, J., 2022. Does Interpreter Location Make a
Difference? A Study of Remote vs Face-to-Face Interpreting in Simulated Police Interviews. Inter-
preting 24(2), 221–253. URL https://doi.org/10.1075/INTP.00077.HAL
Han, C., Chen, S., Fu, R., Fan, Q., 2020. Modeling the Relationship Between Utterance Flu-
ency and Raters’ Perceived Fluency of Consecutive Interpreting. Interpreting: International
Journal of Research and Practice in Interpreting 22, 211–237. URL https://doi.org/10.1075/
intp.00040.han
Han, C., Lu, X., 2021a. Can Automated Machine Translation Evaluation Metrics Be Used to Assess
Students’ Interpretation in the Language Learning Classroom? Computer Assisted Language
Learning 36, 1064–1087. URL https://doi.org/10.1080/09588221.2021.1968915
Han, C., Lu, X., 2021b. Interpreting Quality Assessment Re-Imagined: The Synergy Between
Human and Machine Scoring. Interpreting and Society, 1(1), 70–90. URL https://doi.org/10.
1177/27523810211033670
Hartley, A., Mason, I., Peng, G., Perez, I., 2003. Peer- and Self-Assessment in Conference Interpreter
Training. Centre for Languages, Linguistics and Area Studies, Heriot-Watt University.
Herbert, J., 1952. The Interpreter’s Handbook: How to Become a Conference Interpreter. Georg,
Geneva.
Hornberger, J., Gibson, C., Wood, W., Dequeldre, C., Corso, I., Palla, B., Bloch, D., 1996. Eliminating
Language Barriers for Non-English-Speaking Patients. Medical Care 34(8), 845–856.
Kadric, M., 2001. Dolmetschen bei Gericht. Erwartungen, Anforderungen, Kompetenzen. WUV Uni-
versitätsverlag, Wien.
Kalina, S., 2002. Quality in Interpreting and Its Prerequisites – a Framework for a Comprehensive
View. In Garzone, G., Viezzi, M., eds. Interpreting in the 21st Century. John Benjamins, Amster-
dam, 121–130.
Kalina, S., 2005. Quality Assurance for Interpreting Processes. Meta 50(2), 769–784. URL https://doi.
org/10.7202/011017ar
Kopczyński, A., 1994. Quality in Conference Interpreting: Some Pragmatic Problems. In Lambert,
S., Moser-Mercer, B., eds. Bridging the Gap: Empirical Research on Simultaneous Interpretation.
John Benjamins, Amsterdam, 87–99.
Korybski, T., Davitti, E., 2024. Human Agency in Live Subtitling Through Respeaking: Towards a
Taxonomy of Effective Editing. Journal of Audiovisual Translation. Special issue: Human Agency
in the Age of Technology 7(2), 1–22. URL https://doi.org/10.47476/jat.v7i2.2024.302
Korybski, T., Davitti, E., Orăsan, C., Braun, S., 2022. A Semi-Automated Live Interlingual Commu-
nication Workflow Featuring Intralingual Respeaking: Evaluation and Benchmarking. In Proceed-
ings of the 13th Conference on Language Resources and Evaluation (LREC 2022). Marseille,
France, 4405–4413. ELRA. URL https://aclanthology.org/2022.lrec-1.468/
Kurz, I., 1993. Conference Interpretation: Expectations of Different User Groups. The Interpreters’
Newsletter 5, 13–21.
Kurz, I., 2001. Conference Interpreting: Quality in the Ears of the User. Meta 46(2), 394–409.
Lee, J., 2008. Rating Scales for Interpreting Performance Assessment. The Interpreter and Translator
Trainer 2(2), 165–184. URL https://doi.org/10.1080/1750399X.2008.10798772
Li, X., Li, X., Chen, S., Ma, S., 2022. Neural-Based Automatic Scoring Model for Chinese-English
Interpretation with a Multi-Indicator Assessment. Connection Science 34(1), 1638–1653.
Liu, M., 2013. Design and Analysis of Taiwan’s Interpretation Certification Examination. In Tsagari,
D., van Deemter, R., eds. Assessment Issues in Language Translation and Interpreting. Peter Lang,
Frankfurt am Main, 163–178.
Lu, X., Han, C., 2023. Automatic Assessment of Spoken-Language Interpreting Based on
Machine-Translation Evaluation Metrics: A Multi-Scenario Exploratory Study. Interpreting:
International Journal of Research and Practice in Interpreting 25, 109–143. URL https://doi.
org/10.1075/intp.00076.lu
Mesa, A.-M., 1997. L’interprète culturel: Un professionnel apprécié. Étude sur les services
d’interprétation: le point de vue des clients, des intervenants et des interprètes. Régie régionale de
la santé et des services sociaux de Montréal-Centre, Montréal.
Moser, P., 1996. Expectations of Users of Conference Interpretation. Interpreting 1(2), 145–178.
Moser-Mercer, B., 2003. Remote Interpreting: Assessment of Human Factors and Performance
Parameters. Communicate! Summer, 3.
Moser-Mercer, B., Künzli, A., Korac, M., 1998. Prolonged Turns in Interpreting: Effects on Quality,
Physiological and Psychological Stress (Pilot Study). Interpreting 3(1), 47–64. URL https://doi.
org/10.1075/intp.3.1.03mos
Napier, J., Skinner, R., Braun, S., 2018. Here or There: Research on Interpreting via Video Link. Gal-
laudet University Press, Washington, DC, 11–35.
Ofcom, 2015a. Ofcom’s Code on Television Access Services. URL https://www.ofcom.org.uk/__data/
assets/pdf_file/0016/40273/tv-access-services-2015.pdf (accessed 06.09.2024).
Ofcom, 2015b. Measuring Live Subtitling Quality. Results from the Fourth Sampling Exercise.
URL https://www.scribd.com/document/552210915/REPORT-2015-Measuring-Live-Subtitling-
Quality-OFCOM (accessed 06.09.2024).
Ozolins, U., Hale, S., 2009. Introduction. Quality in Interpreting: A Shared Responsibility. In Hale,
S., Ozolins, U., Stern, L., eds. The Critical Link 5. Quality in Interpreting – a Shared Responsibil-
ity. John Benjamins, Amsterdam, 1–10.
Papineni, K., Roukos, S., Ward, T., Zhu, W., 2002. BLEU: A Method for Automatic Evaluation of Machine
Translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Annual Meeting (ACL), Philadelphia, PA, 311–318. URL https://doi.org/10.3115/1073083.1073135
Pöchhacker, F., 2001. Quality Assessment in Conference and Community Interpreting. Meta 46(2),
410–425.
Pöchhacker, F., Zwischenberger, C., 2010. Survey on Quality and Role: Conference Interpreters’
Expectations and Self-Perceptions. Communicate! – A Webzine for Conference Interpreters and
the Conference Industry 53.
Rennert, S., 2008. Visual Input in Simultaneous Interpreting. Meta 52(1), 204–217.
Riccardi, A., 1998. Evaluation in Interpretation: Macrocriteria and Microcriteria. In Huang, E., ed.
Teaching Translation and Interpreting 4: Building Bridges. John Benjamins, Amsterdam, 115–127.
URL https://doi.org/10.1075/btl.42.14ric
Rodríguez González, E., 2024. The Use of Automatic Speech Recognition in Cloud-Based Remote
Simultaneous Interpreting (PhD thesis).
Rodríguez González, E., Saeed, A., Davitti, E., Korybski, T., Braun, S., 2023. Assessing the Impact of
Automatic Speech Recognition on Remote Simultaneous Interpreting Performance Using the NTR
Model. In Corpas Pastor, G., Hidalgo-Ternero, C., eds. Proceedings of the International Workshop
on Interpreting Technologies SAY IT AGAIN 2023, 1–8. URL https://acl-bg.org, https://lexytrad.
es/SAYITAGAIN2023/
Romero-Fresco, P., 2011. Subtitling Through Speech Recognition: Respeaking. St Jerome, Manchester.
Romero-Fresco, P., Martínez, J., 2015. Accuracy Rate in Live Subtitling – The NER Model. In
Díaz-Cintas, J., Baños Piñero, R., eds. Audiovisual Translation in a Global Context. Mapping an
Ever-Changing Landscape. Palgrave, London, 28–50.
Romero-Fresco, P., Pöchhacker, F., 2017. Quality Assessment in Interlingual Live Subtitling: The NTR
Model. Linguistica Antverpiensia, New Series: Themes in Translation Studies 16, 149–167.
Roziner, I., Shlesinger, M., 2010. Much Ado About Something Remote: Stress and Performance in
Remote Interpreting. Interpreting 12(2), 214–247.
Seeber, K., 2017. Multimodal Processing in Simultaneous Interpreting. In Schwieter, J.W., Ferreira, A.,
eds. The Handbook of Translation and Cognition. John Wiley & Sons, Inc.
Sellam, T., Das, D., Parikh, A.P., 2020. BLEURT: Learning Robust Metrics for Text Generation. In
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. URL
https://aclanthology.org/2020.acl-main.704.pdf
Shlesinger, M., 1997. Quality in Simultaneous Interpreting. In Gambier, Y., Gile, D., Taylor, C., eds.
Conference Interpreting: Current Trends in Research. John Benjamins, Amsterdam, 123–131.
Specia, L., Paetzold, G., Scarton, C., 2015. Multi-Level Translation Quality Prediction with QuEst++.
In Proceedings of ACL-IJCNLP 2015 System Demonstrations. Presented at the Proceedings of
ACL-IJCNLP 2015 System Demonstrations, Association for Computational Linguistics and The
Asian Federation of Natural Language Processing, Beijing, China, 115–120. URL https://doi.
org/10.3115/v1/P15-4020
Specia, L., Raj, D., Turchi, M., 2010. Machine Translation Evaluation Versus Quality Estimation.
Machine Translation 24, 39–50. URL https://doi.org/10.1007/s10590-010-9077-2
Stewart, C., Vogler, N., Hu, J., Boyd-Graber, J., Neubig, G., 2018. Automatic Estimation of Simultaneous
Interpreter Performance. In Proceedings of the 56th Annual Meeting of the Association for Computa-
tional Linguistics (Volume 2: Short Papers). Presented at the Proceedings of the 56th Annual Meeting
of the Association for Computational Linguistics (Volume 2: Short Papers), Association for Computa-
tional Linguistics, Melbourne, Australia, 662–666. URL https://doi.org/10.18653/v1/P18-2105
Tan, S., Orăsan, C., Braun, S., 2024. Integrating Automatic Speech Recognition into Remote
Healthcare Interpreting: A Pilot Study on Its Impact on Interpreting Quality. Proceedings of
Translating and the Computer 2024 (TC46), 175–191. URL https://asling.org/tc46/wp-content/
uploads/2025/03/TC46-proceedings.pdf
Tang, W., Singureanu, D., Wang, F., Orăsan, C., Braun, S., 2024. Integrating Automatic Speech Rec-
ognition in Remote Interpreting Platforms: An Initial Assessment. In CIOL Interpreters Day, Lon-
don, 16.3.2024.
Tiselius, E., 2009. Revisiting Carroll’s Scales. In Angelelli, C., Jacobson, H., eds. Testing and Assess-
ment in Translation and Interpreting Studies. John Benjamins, Amsterdam, 95–121. URL https://
doi.org/10.1075/ata.xiv.07tis
Treisman, A.M., 1965. The Effects of Redundancy and Familiarity on Translating and Repeating Back
a Foreign and a Native Language. British Journal of Psychology 56, 369–379.
Ünlü, C., 2023. InterpreTutor: Using Large Language Models for Interpreter Assessment. In Proceed-
ings of the International Conference on Human-Informed Translation and Interpreting Technology
2023. Presented at the International Conference on Human-informed Translation and Interpreting
Technology 2023, Naples, Italy, 78–96. URL https://doi.org/10.26615/issn.2683-0078.2023_007
Wang, X., Fantinuoli, C., 2024. Exploring the Correlation Between Human and Machine Evalua-
tion of Simultaneous Speech Translation. In Proceedings of the 25th Annual Conference of the
European Association for Machine Translation (Volume 1), Sheffield, UK, 327–336. URL https://
aclanthology.org/2024.eamt-1.28/
Yu, W., Van Heuven, V.J., 2017. Predicting Judged Fluency of Consecutive Interpreting from Acoustic
Measures: Potential for Automatic Assessment and Pedagogic Implications. Interpreting: Interna-
tional Journal of Research and Practice in Interpreting 19, 47–68. URL https://doi.org/10.1075/
intp.19.1.03yu
Yuan, L., Wang, B., 2023. Cognitive Processing of the Extra Visual Layer of Live Captioning in
Simultaneous Interpreting. Triangulation of Eye-Tracking and Performance Data. Ampersand 11,
100131. URL https://doi.org/10.1016/j.amper.2023.100131
Zhang, T., Kishore, V., Wu, F., Weinberger, K.Q., Artzi, Y., 2020. BERTscore: Evaluating Text Gen-
eration with BERT. In Proceedings of the 8th International Conference on Learning Representa-
tions (ICLR 2020). URL https://iclr.cc/virtual_2020/poster_SkeHuCVFDr.html
18
ETHICAL ASPECTS
Deborah Giustini
18.1 Introduction
In recent decades, technological tools have become increasingly integrated into the interpreting profession and industry, thanks to growing computing power and the market availability of advanced audiovisual communication tools (Braun, 2019). Technological solutions now
include infrastructures that facilitate distance communication options for interpreting, as
well as online labour environments to manage the supply and demand of interpreting ser-
vices. As momentum continues to grow, stakeholders have increasing access to AI software
that is able to integrate a range of functionalities into the interpreting process, from auto-
mated terminology search to computer-assisted interpreting tools. In addition, technology
developers are driving innovation in machine interpreting through neural networks, leading
to its use in an increasing range of communication settings.
As debates on interpreting technologies come to the fore in the sector, research and
industry initiatives are aiming to assess their impact on both practices and settings. There
has been a relatively recent increase in scholarly literature that directly addresses the inter-
section between technology and interpreting, as well as the interrelation between these two
factors and a wide spectrum of cultural, social, economic, and professional issues (Drugan,
2019). In addition, interpreting researchers have themselves started developing a substan-
tial body of work relating to the ethics of interpreting technologies. This chapter reviews
the current body of literature that addresses these concerns and explores the ethical dimen-
sions of technological systems, or ‘technoethics’ (Bunge, 1975), of available interpreting
technologies. The chapter concludes by looking critically at the emerging technoethical issues that have yet to be covered in the field and, hence, require further investigation in the future.
of customs and values belonging to specific groups of individuals which involve moral
principles, that is, notions of what is right and wrong. More specifically, the discussion on
ethics in relation to interpreting ranges from universalist statements about general moral
principles, for example, Chesterman’s ‘Hieronymic oath’ (2001), to the recognition that
attitudes may vary according to the interpreting setting, situation, or participants’ needs.
In terms of approaches, scholarship and discourse on ethics in interpreting can be catego-
rised as follows: prescriptive (‘what interpreters ought to do’, or deontological), descriptive
(‘documenting what practitioners actually do’), and metaethical (moral reasoning in the
profession). When applied to practice, however, each category plays an interrelated role
(Dean and Pollard, 2022).
Regardless of the chosen approach, Horváth and Tryuk (2021) indicate that the main
ethical issues recognised in scholarly and professional discourse on interpreting focus on
accuracy, commitment, competence, confidentiality, impartiality, integrity, invisibility, and
transparency. Indeed, scholars highlight how ethical aspects relate to effective professional
performance in terms of an interpreter’s role, behaviour, and norms. This is illustrated in
the following quote by Drugan, taken from their work with ethicist Chris Megone:
[T]ranslation often involves impacts, direct or indirect, on oneself and others. Thus,
the question arises whether, in these impacts, one is manifesting virtues or vices (or
respecting obligations, or producing good or bad consequences), and this . . . requires
ethical reflection. In sum . . . the point of studying ethics for translators is not that
they become philosophers but that they develop good judgement.
(Drugan and Megone, 2011, 189)
This view also relates to demands for social recognition and fair working conditions and
draws attention to the fact that interpreters should operate within a framework that sup-
ports their professional contributions. To illustrate, in 2007, Hale insisted on the need for both normative obligations and ethically acceptable working conditions. Their
survey of 16 professional codes of ethics found that the most prominent ethical principles
(confidentiality, accuracy, and impartiality) were accompanied by considerations regarding
definitions of an interpreter’s role, professional solidarity, and organisation of work (Hale,
2007).
However, technological developments are profoundly affecting the contexts and ways in
which the interpreting sector operates. More widely, technological developments have been
seen to impact the roles, norms, and practices involved at all stages of interpreting service
provision. Rapid advances in digital and AI tools have triggered changes in market demand.
In turn, these have impacted the tasks and skills present in the industry (examples include
distance interpreting, machine learning, speech technology, and most recently, generative
AI). It is unclear at this stage whether these changes represent an evolution, a revolution, an
innovation, or a disruption to the industry. Similarly, one cannot yet be sure of the scale of
the impact that these advances will have on the sector (Schmitt, 2019). Thus, as interpreting
technologies continue to be present on the mainstream market and to be used by society at
large, scholars are encouraged to systematically address the ethical dimensions underlying
their usage. Examples of ethical dimensions that require systematic consideration include
questions surrounding ownership, confidentiality, accessibility, workflow integration, con-
texts, and modalities of application (Massey et al., 2024).
resultant corpora can then be used for terminology extraction or to elicit suggested transla-
tions for specific terms. Regarding the assignment (‘during’) phase, complex workstations
exist which address interpreting workflow. The most commonplace functionality here is
a computer-assisted search for specialised terminology and other units of interest during
an interpreting job, carried out while interpreting or assisting a boothmate (Fantinuoli,
2023). More advanced versions of this functionality include AI-enhanced CAI tools, which
attempt to automate certain components of the interpreter’s workflow (Fantinuoli, 2023).
Examples of such products include automatic glossary creation. Creating a glossary automatically involves steps similar to corpus creation, term extraction, term translation, and glossary evaluation, and requires a building engine whose output the interpreter can then edit. Another relevant tool is the Artificial Boothmate (ABM), an application that automatically suggests problem triggers, such as numbers and
proper nouns, in real time (Fantinuoli, 2017). The architecture of such tools is based on
numerous elements, including automatic speech recognition (ASR; transcribing speech),
large language models (LLMs; retrieving units of interest from the transcription and match-
ing them with glossaries), natural language processing (NLP), machine translation (MT),
and a user interface (displaying information to the interpreter). Finally, CAI can also assist
in the ‘after’ assignment phase, aiding in creating glossaries and improving future interpret-
ing assignments by providing the interpreters with feedback and performance insights.
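As a rough illustration of the ‘units of interest’ step described above, the minimal sketch below (in Python) scans an already transcribed ASR segment for numbers and capitalised candidate names and matches it against a prepared glossary. The glossary entries and patterns are assumptions made for the example; an actual ABM would add streaming speech recognition, language-model-based matching, machine translation, and a display layer.

import re

# Illustrative glossary; in practice this would come from the interpreter's
# preparation materials or from an automatic glossary-building engine.
GLOSSARY = {
    "World Health Organization": "Organisation mondiale de la Santé",
    "gross domestic product": "produit intérieur brut",
}

def suggest_triggers(asr_segment: str) -> list:
    """Return suggestions for typical problem triggers found in an ASR segment."""
    suggestions = []
    # Numbers, including decimals and thousands separators.
    suggestions += [f"NUMBER: {m}" for m in re.findall(r"\d[\d,.]*", asr_segment)]
    # Sequences of capitalised words as rough proper-noun candidates.
    suggestions += [f"NAME: {m.strip()}" for m in re.findall(r"(?:[A-Z][\w-]+\s?){2,}", asr_segment)]
    # Glossary hits, returned together with the prepared translation.
    for term, translation in GLOSSARY.items():
        if term.lower() in asr_segment.lower():
            suggestions.append(f"TERM: {term} -> {translation}")
    return suggestions

print(suggest_triggers(
    "The World Health Organization expects gross domestic product to grow by 2.4 per cent"
))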
including hybrid arrangements, such as virtual interpreting (Nimdzi, 2023). DI also goes
hand in hand with the ‘platformisation’ of the industry (Giustini, 2024), where digital
labour platforms are a rapidly expanding business structure, used to direct the supply and
demand of interpreting services. Using online marketplaces such as these, clients can con-
nect to a network of registered professionals directly. These platforms are often affiliated with the software-as-a-service companies that run DI solutions through patented technologies. In
this way, interpreting can be outsourced digitally, 24/7, on demand.
Many of these questions about ethical aspects of new technologies are difficult to sep-
arate from broader sociocultural issues. Technological developments have occurred
alongside, and played a part in, major ongoing shifts in social structures, migration
patterns, trade, information and employment.
(2019, 250)
fear being seen as an interchangeable crowd of workers who can simply be ‘contracted on and off’ at short notice. This has ethical implications for interpreters’ visibility to users and risks intensifying the ‘depersonification’ of the service provider.
As a result, interpreters fear that users and clients may end up considering them as
‘a plug-and-play feature of RSI [remote simultaneous interpreting] platforms’ (Buján and
Collard, 2023, 146, author’s addition).
A cognate dimension relates to the outsourcing of DI through online marketplaces and
applications, especially in the private sector. This phenomenon falls into the gig economy,
where digital labour platforms mediate supply and demand for interpreting services (Gius-
tini, 2024). In this environment, work is rigidly monitored and commissioned through
algorithmic management – algorithms that direct the allocation and intermediation of
interpreting work through big data points. In turn, platforms unilaterally impose restric-
tions on interpreters’ working conditions in areas such as completion time, labour price,
and expectations of constant availability. For example, jobs are often offered to the first
or lowest bidder. Meanwhile, the platforms take the lion’s share of profits, with detrimental consequences for interpreters in terms of underpayment and unfair trade competition.
Baumgarten and Bourgadel (2024) comment on this phenomenon by linking the lack of
ethics present within the neoliberal system of digital capitalism in the language industry to
the pressures of competing in a globally interconnected sector that is destined to generate
profits and shareholder value. As the authors remark, the digital economy of language ser-
vices favours production values, such as efficiency, speed, quantification, and cost saving,
in its pursuit of commercial profit. However, the authors remain doubtful about whether
the industry’s relentless drive towards ever-enhanced optimisation through platforms is the
right path towards ethically sustainable work.
Scholarship has also reported on interpreters’ fears of labour substitution brought about by the arrival of neural networks and AI. Interpreters are said to have ‘automation anxiety’ (Vieira, 2020, 9). However, a distinction must be made between MI and CAI in this context. CAI is a group of supporting technological solutions. Corpas
Pastor (2018) and Prandi (2023) show that attitudes to CAI tools are not entirely negative.
For instance, real-time terminology support can lead to increased efficiency and improved
performance. Generally speaking, interpreters feel they still have control over their workflow while using CAI, despite the need to upskill when integrating such tools into their set of competencies. The application of these technologies mainly raises ethical questions with regard to the interpreter’s cognitive load. Interpreters lament the tools’ time-consuming nature, their potential for distraction, and the perceived lack of sophistication, intuitiveness, and user-friendliness of the software currently available. As a result, ethical issues remain largely limited to perceptions of the quality of the available software and the multitasking demands deriving from its use.
In contrast, the impact of MI on the market and employment prospects is potentially
more marked than that of CAI. Fantinuoli envisages a near future in which MI will enter
the low-end segment of the market, that is, ‘areas which are less prestigious, critical or
sensitive’ (2018, 12). This strategy would ostensibly offer ‘acceptable performance’ and
capture customers in exchange for economic savings and larger service availability. Profes-
sional human interpreting would survive at the higher end of the market, ‘at least until the
advent of real human-like MI’ (p. 345). This is a common argument revolving around the
tech-driven polarisation of skills (Acemoglu and Autor, 2011). When automation is intro-
duced in a sector, labour tends to split into jobs at the bottom, requiring a lower skill level, and jobs at the top, requiring a higher skill level. This polarisation is not free from ethical
implications, which include potential job displacement in certain market sectors and reten-
tion in others. This raises concerns about employment security and inequalities among
interpreters and interpreting settings, according to perceptions of rank, specialisation, and
importance.
Lastly, advances in MI have ignited the debate from another angle: whether automated
technologies will replace interpreters along the lines of conduit versus situational interpreting expertise. MI relies on algorithms and data processing to simulate human cognitive functions. This, in turn, allows it to generate automated real-time renditions incrementally, from only partial input. Some scholars claim that this process is incapable of
replacing human interpreters (Ortiz and Cavalli, 2018; Corpas Pastor, 2018, 2021) because,
despite such affordances, MI cannot yet capture linguistic variation, non-verbal communi-
cation, or emotions. In other words, at present, MI lacks problem-solving capabilities and
situational reactions. However, Fantinuoli and Dastyar (2021) suggest that relying on these
dimensions as selling points for the superiority of human interpreting results in an uncon-
vincing argument. They state that organisations and end users require clearer explanations
of the higher value of professional human interpreting to convince them to prefer this
service over MI. Giustini and Dastyar (2024) add that another pressing ethical question in
this domain relates to whether customers and businesses are likely to prioritise perceptions
of human expertise, emotional nuances, and service quality over availability and price. The
signalling of expertise, quality, and professionalism is, in fact, afforded not only by inter-
preters but also by consumers. To this extent, users may favour MI if they view automated
tools as being capable of providing at least an ‘acceptable’ level of service delivery. It is
critical to recognise that the acceptability of MI is also highly dependent on the interpreting
context. Different settings – such as medical, legal, diplomatic, or business – present varying
levels of complexity and ethical implications. In high-stakes environments, such as medical
or legal settings, accepting lower quality or the potential for error resulting from limita-
tions of MI could have serious consequences and involve significant ethical trade-offs. This
raises additional questions that service providers should transparently address: To what
extent are customers, especially those in critical settings, informed about the risks of using
automated technology? How is the potential for lower quality or miscommunication result-
ing from MI being communicated to them? Moreover, should service providers be held
responsible for clarifying when MI is suitable and when it is not? Taking responsibility in
this area includes ensuring that users understand that MI may suffice for low-risk, routine tasks (e.g. booking an appointment at a hospital’s reception desk), while disclosing its limitations in more nuanced or high-pressure situations (e.g. medical consultations). These
actions must be taken to avoid context-specific risks and misleading consumers. In turn, in lower-risk settings, such as routine corporate webinars or product demonstrations, MI
may be deemed more acceptable, despite inaccuracies or lack of nuance. However, even
in these cases, it remains crucial for service providers to communicate the potential for
reduced quality clearly, so that clients can give informed consent. Otherwise, provid-
ers and users across settings may inadvertently normalise risks and unclear standards as
‘acceptable’ for certain types of communication. This would raise broader ethical questions
about the long-term impact of MI on the groups involved.
Finally, it is worth noting that speech translation enhancements are developing rapidly, introducing more features to appeal to a wider and more varied range of potential clients. For instance, in May 2024, Microsoft Azure AI Speech launched a language detection tool which switches between supported languages in the same audio stream. This eliminates the need for developers to specify particular input languages and integrates custom translations of a client’s domain-specific vocabulary. With these capabilities, companies can attract broader audiences and improve user experience. The process may still sidestep considerations of quality, but the extent to which these ethical concerns will affect the future of interpreting remains uncertain at this stage, because the full potential of MI is still unfolding.
round-the-clock availability. The next most cited benefits (58%) were ‘no need to schedule
an interpreter’ and ‘lower costs’. Nonetheless, respondents also noted that these benefits
were not necessarily assured in real deployments, due to AI’s limited language availability
and the potentially high costs of errors offsetting savings.
In turn, Bowker and Buitrago Ciro (2019, 9) describe how the availability of data in
dominant languages, coupled with the ease of access to online tools, means that an increas-
ing amount of material is being translated, both into and out of these languages. However,
the dominance of English and European languages in AI training data excludes almost
1.7 billion people, minimally representing over 2.3 billion. Just ten languages account for over
85% of training data (De Palma and Lommel, 2023). Since AI systems tend to be developed
for commercial purposes, lesser-known languages that cannot generate adequate profit are
excluded. AI systems also rely on extensive written text availability. This means that Indig-
enous languages, which have predominantly oral traditions and lack extensive written corpora,
are left behind. Thus, developers tend to prioritise the application of LLMs for dominant
languages, as they exhibit more consistent corpora and algorithmic performance. However,
this results in the undertraining of languages of lesser diffusion, which, in turn, contributes
to low algorithmic performance in AI systems and further exacerbates the underrepresenta-
tion of social groups affiliated with these languages.
Along with underrepresentation, AI systems perpetuate bias in their training corpora.
Bias refers to systematic and disproportionate prejudice against a group and often results in
unfair judgment and treatment. To illustrate, research has identified gender bias (defaulting
to masculine forms and stereotypical associations) (Vanmassenhove et al., 2021), racial bias
(Wolfe and Caliskan, 2021), and political bias (Rozado, 2023). Since AI systems require
large amounts of data for quality outputs, they are programmed to identify and prioritise
patterns in texts. Societal biases embedded in texts, combined with the underrepresentation
of more advanced discourses in training corpora, lead to their perpetuation in interpret-
ing technologies (Monzó-Nebot and Tasa-Fuster, 2024, 9). These studies have been complemented by further empirical evidence. The SAFE AI survey (Pielmeier et al., 2024, Ch. 3,
9) indicates that 81% of respondents (industry stakeholders, service providers, and service
recipients) worry about the detrimental effects of AI on the quality of language access.
In particular, respondents fear that AI could introduce biases and lead to discrimination,
which could degrade interpreting quality in critical areas, such as healthcare. Meanwhile,
Ren and Yin (2020) raise an adjacent ethical consideration: accountability. Accountability
presents a significant challenge because, unlike providers, users, or organisations, AI cannot
assume legal liability, demonstrate professional responsibility, or face consequences for its
interpretations. However, interpretation errors caused by AI can lead to harmful outcomes
or dangerous decisions and create human, legal, or professional risks. The question as to
how ethically desirable relationships and responsibilities can be developed in order to pre-
vent potential harm within the limits of compliance, reporting, and enforcement is yet to
be clarified.
Another less widely discussed matter relates to the uneasy balance between accessibility and the lack of inclusion of minority and Indigenous communities in technology training. This problem extends to the methods that are used to harvest data for AI sys-
tems, particularly in NLP, which rely on datasets of previously translated sentences to
train models between different languages. For low-resource languages to be adequately
trained, automated web crawling is employed to gather larger datasets. However, dominant languages are used as pivots in the engines built for low-resource languages. Not only does this limit acceptable output quality, but it also raises concerns over the exclusion of native speakers from development and validation processes. In this context,
speaker involvement refers to participation in contributing data, review of output, and
provision of culturally and linguistically relevant feedback. This perspective on inclusion
is championed by a variety of global and academic actors. For example, according to the
IBM Artificial Intelligence Pillars, ‘inclusion’ means working with diverse development
teams and seeking out the perspectives of minority-serving organisations and impacted
communities in AI systems design (IBM, 2023). UNESCO’s recommendation on the ethics of AI (2021) indicates
that states should work to ensure the ‘respect, protection and promotion of diversity and
inclusiveness . . . throughout the life cycle of AI systems . . . by promoting active partici-
pation of all individuals or groups’ (p. 6), including in AI development (p. 8). UNESCO
also states that in an AI life cycle, ‘measures should be adopted to allow for meaningful
participation by marginalized groups, communities and individuals and, where relevant,
in the case of Indigenous Peoples, respect for the self-governance of their data’ (p. 10).
Scholars such as Mager et al. (2023), García González (2024), and Ghosh and Chat-
terjee (2024) recommend adopting a human-centred approach toward MT and NLP to
avoid data extractivism and digital colonialism that may otherwise affect communities.
This involves the upliftment of low-resource languages, which can be achieved by encouraging the active participation of community members with relevant lived experience in the various contexts in which the language is used. As a result, native speakers can inform both
data collection and AI training. Therefore, speakers’ inclusion can ensure that both the
input (training data) and output (translations) align with their ways of communicating,
values, and beliefs and avoid misrepresentation. While speakers’ direct involvement is not
always feasible, it remains essential that relevant communities are consulted, represented,
and empowered to ensure that AI systems uphold their linguistic and cultural integrity.
A potential way forward includes collaborations with community members, representa-
tives, linguists, and experts who are able to advocate for their needs, and who can avoid
perpetuating the very power imbalances that technology should be minimising.
In this regard, Mager et al. (2023) discuss ethical considerations for MT for Indig-
enous languages. They argue that NLP risks being weaponised as a political and ideo-
logical instrument of power, influencing the culture of minorities as a means of control
and expropriation. In the same vein, Tymoczko highlights the issue of foreignisation, the
flooding of vulnerable communities and ‘subaltern’ cultures with ‘foreign materials and
foreign language impositions’ (2006, 454). Borrowing Tymoczko’s argument, Mager et al.
(2023) suggest that the ethical implication in this context relates to the encoding of colonial
domination in MT. To avoid such risks, in the same study, the authors involved members
of the Aymara, Chatino, Maya, Mazatec, Mixe, Nahua, Otomí, Quechua, Tenek, Tep-
ehuano, Kichwa of Otavalo, and Zapotec communities while researching ways to inform
technological advances. Their study revealed that while participants expressed interest in
having MT systems for their own languages, they were also concerned about the commer-
cial usage and knowledge ownership of the output. These concerns centred on the misuse
of cultural, religious, health, and mercantile matters and on the fear that distortions, appropriations, and attempts at standardisation would result, which could be exploited by corporations wishing
to profit from technological sovereignty and data ownership. Respondents called instead
for quality checks to be carried out by community members, for the right to control the
knowledge shared, and for licensing of the final datasets to be central instruments for ethi-
cal decision-making.
Overall, this corpus of studies suggests that the ethics of interpreting technologies are rife
with epistemological implications. In this regard, Monzó-Nebot and Tasa-Fuster (2024)
draw attention to the bigger picture of the linguistic hegemony that is reproduced by auto-
mated interpreting technologies, such as MI, and their impact on systems of knowledge.
As the authors point out, certain concepts and representations may not be readily avail-
able in English and other dominant languages. The original text/speech may also be struc-
tured according to encoded beliefs and normative expectations that are unfamiliar to those
operating in dominant languages. In other words, the source text may be embedded in an
epistemological paradigm that cannot be decoded effectively in an English equivalent. As a
result, this process may tame knowledge structures into those that are more aligned with a
Western-centric, hegemonic worldview, in a process that Bennett (2013, 171) dubbed ‘epis-
temicide’. Similarly, Měchura (2015) observed that, while technology is often seen as a tool
to ‘overcome’ the challenges posed by linguistic diversity, in minority-language settings the
goal is often to do the opposite: to preserve and reinforce diversity. Měchura also warns
that lack of attention to cultural-linguistic nuances in training AI systems results in the
original content, authored in the minority language, being allowed to ‘escape’ as it is assimi-
lated into the knowledge frameworks of dominant languages. To safeguard against this, it
is crucial to protect and diversify the epistemological frameworks embedded in AI systems.
forced interpreters to navigate new ethical challenges to maintain patient interaction and
confidentiality while working from home. These ethical challenges drove interpreters to
devise alternative uses of technologies in order to uphold interpreting quality and empathy.
These include comforting patients via audio link, rearranging household spaces to guaran-
tee privacy, and assisting medical staff in the case of contingent technical issues.
In a similar vein, studies on technologies in legal settings also provide important
insights into ethical matters. The most comprehensive research to date on VMI (video-mediated interpreting, used as a cover term for all modalities of interpreting involving video links) in criminal proceedings was conducted by the European AVIDICUS 1 (2008–2011), 2 (2011–2013), and 3 (2014–2016) projects (Braun and Taylor, 2012; Braun, 2013; Braun et al., 2018). Through surveys of 200 legal interpreters and 30
institutions (AVIDICUS 1, 2) and ethnographic research (AVIDICUS 3), the projects iden-
tified numerous quality and ethical issues in court and police settings. The findings high-
light a common issue: While VMI is seen as a cost-effective way to improve access to
interpreter-supported justice services, its use is also controversial. To illustrate, there is
a discrepancy between objective measures (like interpreters’ performance) and percep-
tions of technology (see also Braun and Singureanu, this volume). This often leads to
reduced quality in participant interactions and a greater fragmentation of discourse. In
turn, some authors (Barak, 2021; Mellinger, 2022; Russo and Spinolo, 2022) highlight
that uninformed uses of language technology can severely impact decision-making in
asylum settings. Relevant arguments generally indicate that technical issues, translation
accuracy, and interaction management directly affect credibility assessments and immigrants’ testimony. This potentially undermines the process and could impact the out-
come of deportation hearings or the granting of refugee status. Evidence from the ground and from language associations points to an increasing number of cases in the UK, the United States, and EU member states, whose immigration systems increasingly rely on AI-powered translations
in lieu of human interpreter mediation during asylum proceedings. AI tools, when used
unsupervised, have already been seen to result in asylum rejections and the weaponisation
of small linguistic technicalities to justify deportations (The Guardian, 2023a, 2023b; The
New Humanitarian, 2020). This is particularly the case when marginalised languages are
concerned. Indeed, lack of scrutiny when embedding technologies into institutional power
dynamics and interpreting processes can compromise vulnerable populations’ human
rights in judicial settings.
Finally, Federici et al. (2023) caution against deploying automated technologies for inter-
preting in crisis contexts, such as natural disasters, armed conflicts, displacement, and health
emergencies. These situations require effective communication for humanitarian operators
to coordinate efforts and provide information in local languages. The authors emphasise
that automated technologies can expedite outreach but still require ethical scrutiny. Insuf-
ficient crisis planning leads to improvised responses that depend heavily on automated
tools during emergencies. These tools are often used reactively instead of being integrated
into comprehensive crisis management strategies. Such oversight can hinder the overall
effectiveness of crisis communication efforts. In addition, the authors highlight technological
constraints in handling languages with limited resources and the potential for insensitivity
to cultural nuances, which further diminish the tools’ efficacy in crisis situations. Therefore,
while automation can indeed enhance crisis response, it still requires human oversight and
optimised resources to ensure optimum utility and mitigate risks.
landscape. As Drugan and Tipton observe, the question arises as to how technologies can
‘bring into relief the competing tensions . . . of what constitute socially responsible working
practices . . . as an ethical goal’ (2017, 121). ‘Responsibility’, in this context, encompasses a
dynamic commitment to sustaining decision-making and value judgements. This could lead to
the enhancement of ethics, rooted in social consensus on technological development and use.
Such commitment is reflected in policymakers’ attempts at promoting AI governance mecha-
nisms, which aim to manage the ethical and social challenges automated systems pose, while
maintaining incentives for technological innovation. For instance, the European Parliament
(2023) agreed on drafting the Artificial Intelligence Act (AI Act), which will enter into application from 2026 and is the first regulation of its kind by a major regulator. The European Commission's Directorate-General for Translation has also highlighted the need to exercise caution and judgement, calling for ongoing ethical reflection on the use of sustainable AI for language services within EU initiatives such as the AI Act (Ellinides, 2023).
Alongside such attempts, sector associations are also devising ethical infrastructures. For
example, the SAFE-AI Task Force (2024), in collaboration with bodies such as the Ameri-
can Translators Association, proposes industry-wide guidelines to facilitate dialogue about
best practices for the responsible adoption of AI. These focus on questions relating to user
autonomy, safety and well-being, quality, transparency, and accountability for errors. In
order to foster a reflective conversation on AI’s implications in interpreting, the Task Force
opened its guidance document for public comment, so as to receive feedback on the pro-
posed foundational ethical principles.1 By emphasising social responsibility as dynamic and
widely distributed, sector associations, groups, and organisations can create a discursive
space on the ethical implications of interpreting technologies for industry progress, linguistic production, and the socio-economic order of the profession.
Note
1 The finalised document ‘Interpreting SAFE AI Task Force Guidance on AI and Interpreting Services’
is available for consultation on the Task Force website at https://safeaitf.org/guidance/ (accessed
22.6.2024).
References
Abdalla, M., Wahle, J.P., Ruas, T., Névéol, A., Ducel, F., Mohammad, S.M., Fort, K., 2023. The Ele-
phant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research.
ArXiv preprint arXiv:2305.02797.
Acemoglu, D., Autor, D., 2011. Skills, Tasks and Technologies: Implications for Employment and
Earnings. In Ashenfelter, A.C., Card, D., eds. Handbook of Labor Economics. Elsevier, Amster-
dam, 1043–1171.
AIIC, 1999. Memorandum Concerning the Use of Recordings of Interpretation at Conferences
[online]. aiic.net. URL https://aiic.net/p/58 (accessed 22.6.2024).
Anastasopoulos, A., Barrault, L., Bentivogli, L., Zanon Boito, M., Bojar, O., Cattoni, R., Currey, A.,
Dinu, G., Duh, K., Elbayad, M., Emmanuel, C., Estève, Y., Federico, M., Federmann, C., Gah-
biche, S., Gong, H., Grundkiewicz, R., Haddow, B., Hsu, B., Javorský, D., Kloudová, V., Lakew,
S., Ma, X., Mathur, P., McNamee, P., Murray, K., Nădejde, M., Nakamura, S., Negri, M., Nie-
hues, J., Niu, X., Ortega, J., Pino, J., Salesky, E., Shi, J., Sperber, M., Stüker, S., Sudoh, K., Turchi,
M., Virkar, Y., Waibel, A., Wang, C., Watanabe, S., 2022. Findings of the IWSLT 2022 Evaluation
Campaign. In Proceedings of the 19th International Conference on Spoken Language Translation
(IWSLT 2022). Association for Computational Linguistics, Dublin, 98–157.
Barak, M.P., 2021. Can You Hear Me Now? Attorney Perceptions of Interpretation, Technology, and
Power in Immigration Court. Journal on Migration and Human Security 9(4), 207–223.
Baumgarten, S., Bourgadel, C., 2024. Digitalisation, Neo-Taylorism and Translation in the 2020s.
Perspectives 32(3), 508–523.
Bennett, K., 2013. English as a Lingua Franca in Academia. Combating Epistemicide through Transla-
tor Training. The Interpreter and Translator Trainer 7(2), 169–193.
Boéri, J., Giustini, D., 2022. Localizing the COVID-19 Pandemic in Qatar: Interpreters’ Narratives
of Cultural, Temporal and Spatial Reconfiguration of Practice. The Journal of Internationalization
and Localization 9(2), 139–161.
Boéri, J., Giustini, D., 2024. Qualitative Research in Crisis: A Narrative-Practice Methodology to
Delve into the Discourse and Action of the Unheard in the COVID-19 Pandemic. Qualitative
Research 24(2), 412–432.
Bowker, L., Buitrago Ciro, J., 2019. Machine Translation and Global Research: Towards Improved
Machine Translation Literacy in the Scholarly Community. Emerald.
Braun, S., 2013. Keep Your Distance? Remote Interpreting in Legal Proceedings: A Critical Assess-
ment of a Growing Practice. Interpreting 15(2), 200–228. https://doi.org/10.1075/intp.15.2.03bra
Braun, S., 2019. Technology and Interpreting. In O’Hagan, M., ed. Routledge Handbook of Transla-
tion and Technology. Routledge, London, 271–288.
Braun, S., 2024. Distance Interpreting as a Professional Profile. In Massey, G., Ehrensberger-Dow,
M., Angelone, E., eds. Handbook of the Language Industry: Contexts, Resources and Profiles.
De Gruyter, Berlin, 449–472.
Braun, S., Davitti, E., Dicerto, S., 2018. Video-Mediated Interpreting in Legal Settings: Assessing the
Implementation. In Napier, J., Skinner, R., Braun, S., eds. Here or There: Research on Interpreting
via Video Link. Gallaudet, Washington, DC, 144–179.
Braun, S., Taylor, J., 2012. Videoconference and Remote Interpreting in Legal Proceedings. Intersentia.
Buján, M., Collard, C., 2023. Remote Simultaneous Interpreting and COVID-19: Conference
Interpreters’ Perspective. In Liu, K., Cheung, A., eds. Translation and Interpreting in the Age of
COVID-19. Springer Nature, Singapore, 133–150.
Bunge, M., 1975. Towards a Technoethics. Philosophic Exchange 6(1), 69–79.
Castilho, S., Mallon, C., Meister, R., Yue, S., 2023. Do Online Machine Translation Systems
Care for Context? What About a GPT Model? In Nurminen, M., Brenner, J., Koponen, M.,
Latomaa, S., Mikhailov, M., Schierl, F., Ranasinghe, T., Vanmassenhove, E., Alvarez Vidal, S.,
Aranberri, N., Nunziatini, M., Parra Escartín, C., Forcada, M., Popovic, M., Scarton, C., Moniz,
H., eds. 24th Annual Conference of the European Association for Machine Translation (EAMT
2023). EAMT, Tampere, 393–417.
Chesterman, A., 2001. Proposal for a Hieronymic Oath. The Translator 7(2), 139–154.
Corpas Pastor, G., 2018. Tools for Interpreters: The Challenges That Lie Ahead. Trends in Translation
Teaching and Learning E 5, 157–182.
Corpas Pastor, G., 2021. Interpreting and Technology: Is the Sky Really the Limit? Proceedings of the
Translation and Interpreting Technology Online Conference, 15–24. URL https://doi.org/10.26615/978-954-452-071-7_003
Dean, R.K., Pollard, R.Q., 2022. Improving Interpreters’ Normative Ethics Discourse by Imparting
Principled-Reasoning Through Case Analysis. Interpreting and Society 2(1), 55–72.
Defrancq, B., 2024. Conference Interpreting in AI Settings: New Skills and Ethical Challenges. In
Massey, G., Ehrensberger-Dow, M., Angelone, E., eds. Handbook of the Language Industry: Con-
texts, Resources and Profiles. De Gruyter, Berlin, 473–488.
De Palma, D., Lommel, A., 2023. The Evolution of Language Services and Technology [online]. CSA
Research. URL https://insights.csa-research.com/reportaction/305013598/Marketing (accessed
28.6.2024).
Drugan, J., 2019. Police Communication Across Languages in Crisis Situations: Human Trafficking
Investigations in the UK. In Federici, F., O’Brien, S., eds. Translation in Cascading Crises. Rout-
ledge, London, 46–66.
Drugan, J., Babych, B., 2010. Shared Resources, Shared Values? Ethical Implications of Sharing
Translation Resources. In Proceedings of the Second Joint EM+/CNGL Workshop: Bringing MT
to the User: Research on Integrating MT in the Translation Industry, 3–10.
Drugan, J., Megone, C., 2011. Bringing Ethics into Translator Training: An Integrated, Inter-Disciplinary
Approach. The Interpreter and Translator Trainer 5(1), 183–211.
Drugan, J., Tipton, R., 2017. Translation, Ethics and Social Responsibility. The Translator 23(2),
119–125.
Ellinides, C., 2023. EUATC Keynote: Ethical, Sustainable Business from EU Perspective [online].
ATC-EUATC Ethical Business Summit. URL https://atc.org.uk/people-and-purpose-driven-progress-
not-perfection/ (accessed 28.6.2024).
Fantinuoli, C., 2017. Speech Recognition in the Interpreter Workstation. In Esteves-Ferreira, J.,
Macan, J., Mitkov, R., Stefanov, O., eds. Proceedings of the Translating and the Computer.
AsLing, London, 25–34.
Fantinuoli, C., 2018. Interpreting and Technology. Language Science Press, Berlin.
Fantinuoli, C., 2023. The Emergence of Machine Interpreting. European Society for Translation Stud-
ies 62, 10.
Fantinuoli, C., Dastyar, V., 2021. Interpreting and the Emerging Augmented Paradigm. Interpreting
and Society 2(2), 185–194.
Federici, F.M., Declercq, C., Díaz Cintas, J., Baños Piñero, R., 2023. Ethics, Automated Processes,
Machine Translation, and Crises. In Moniz, H., Parra Escartín, E., eds. Towards Responsible Machine
Translation: Ethical and Legal Considerations in Machine Translation. Springer, Cham, 135–156.
Forcada, M.L., 2023. Licensing and Usage Rights of Language Data in Machine Translation. In
Moniz, H., Escartín, C.P., eds. Towards Responsible Machine Translation: Ethical and Legal Con-
siderations in Machine Translation. Springer, Cham, 49–69.
García González, M., 2024. The Role of Human Translators in the Human-Machine Era: Assessing
Gender Neutrality in Galician Machine and Human Translation. In Monzó-Nebot, E., Tasa-Fuster,
V., eds. Gendered Technology in Translation and Interpreting: Centering Rights in the Develop-
ment of Language Technology. Routledge, London, 173–201.
Gentile, P., Albl-Mikasa, M., 2017. “Everybody Speaks English Nowadays”. Conference Interpreters’ Per-
ception of the Impact of English as a Lingua Franca on a Changing Profession. Cultus 10(1), 53–66.
Ghosh, S., Chatterjee, S., 2024. Misgendering and Assuming Gender in Machine Translation When
Working with Low-Resource Languages. In Monzó-Nebot, E., Tasa-Fuster, V., eds. Gendered
Technology in Translation and Interpreting: Centering Rights in the Development of Language
Technology. Routledge, London, 274–290.
Gilbert, A.S., Croy, S., Hwang, K., LoGiudice, D., Haralambous, B., 2021. Video Remote Interpreting
for Home-Based Cognitive Assessments: Stakeholders’ Perspectives. Interpreting 24(1), 84–110.
Giustini, D., 2024. “You Can Book an Interpreter the Same Way You Order Your Uber”: (Re) Inter-
preting Work and Digital Labour Platforms. Perspectives 32(3), 441–459.
Giustini, D., Dastyar, V., 2024. Critical AI Literacy for Interpreting in the Age of AI. Interpreting and
Society. URL https://doi.org/10.1177/27523810241247259
The Guardian, 2023a. Lost in AI Translation: Growing Reliance on Language Apps Jeopardizes
Some Asylum Applications [online]. URL www.theguardian.com/us-news/2023/sep/07/asylum-
seekers-ai-translation-apps (accessed 25.6.2024).
The Guardian, 2023b. Home Office to Tell Refugees to Complete Questionnaire in English or Risk
Refusal [online]. URL www.theguardian.com/uk-news/2023/feb/22/home-office-plans-to-use-
questionnaires-to-clear-asylum-backlog (accessed 25.6.2024).
Hale, S., 2007. Community Interpreting. Palgrave Macmillan, Basingstoke.
Horváth, I., 2022. AI in Interpreting: Ethical Considerations. Across Languages and Cultures
23(1), 1–13.
Horváth, I., Tryuk, M., 2021. Ethics and Codes of Ethics in Conference Interpreting. In Mikkelson,
H., Jourdenais, R., eds. The Routledge Handbook of Conference Interpreting. Routledge, London,
290–304.
IBM, 2023. IBM Artificial Intelligence Pillars. URL www.ibm.com/policy/ibm-artificial-intelligence-pillars/
(accessed 15.9.2024).
International Organization for Standardization (ISO). ISO/DIS 17651-3: Simultaneous Interpreting.
URL www.iso.org/obp/ui#iso:std:iso:17651:-3:dis:ed-1:v1:en:term:3.5 (accessed 15.9.2024).
Kak, A., Myers West, A., Whittaker, M., 2023. Make No Mistake – AI Is Owned by Big Tech
[online]. MIT Technology Review. URL www.technologyreview.com/2023/12/05/1084393/
make-no-mistake-ai-is-owned-by-big-tech/ (accessed 23.6.2024).
Kenny, D., 2019. Technology and Translator Training. In O’Hagan, M., ed. Routledge Handbook of
Translation and Technology. Routledge, London, 498–515.
Klammer, M., Pöchhacker, F., 2021. Video Remote Interpreting in Clinical Communication: A Multi-
modal Analysis. Patient Education and Counseling 104(12), 2867–2876.
Lewis, D., Moorkens, J., 2020. A Rights-Based Approach to Trustworthy AI in Social Media. Social
Media + Society 6(3), 2056305120954672.
Mager, M., Mager, E., Kann, K., Thang, V.N., 2023. Ethical Considerations for Machine Translation
of Indigenous Languages: Giving a Voice to the Speakers. In Proceedings of the 61st Annual Meet-
ing of the Association for Computational Linguistics (Volume 1: Long Papers). Association for
Computational Linguistics, Toronto, 4871–4897.
Mahyub Rayaa, B., Martin, A., 2022. Remote Simultaneous Interpreting: Perceptions, Practices and
Developments. The Interpreters’ Newsletter 27, 21–42.
Massey, G., Ehrensberger-Dow, M., Angelone, E., eds., 2024. Handbook of the Language Industry:
Contexts, Resources and Profiles. De Gruyter, Berlin.
Měchura, M., 2015. Do Minority Languages Need the Same Language Technology as Majority Lan-
guages? [online]. URL www.lexiconista.com/minority-languages-machine-translation/ (accessed
22.6.2024).
Mellinger, C.D., Hanson, T.A., 2018. Interpreter Traits and the Relationship with Technology and
Visibility. Translation and Interpreting Studies 13(3), 366–392.
Mellinger, H., 2022. Interpretation at the Asylum Office. Law & Policy 44(3), 230–254.
Monzó-Nebot, E., Tasa-Fuster, V., eds., 2024. Gendered Technology in Translation and Interpreting:
Centering Rights in the Development of Language Technology. Routledge, London.
Mouzourakis, P., 2006. Remote Interpreting: A Technical Perspective on Recent Experiments. Inter-
preting 8(1), 45–66.
The New Humanitarian, 2020. “Translation Machines”: Interpretation Gaps Plague French Asylum
Process [online]. URL www.thenewhumanitarian.org/news-feature/2020/10/27/france-migration-
asylum-translation (accessed 25.6.2024).
Nimdzi, 2023. Remote vs Onsite Interpreting: The Post-Pandemic Equilibrium [online]. URL www.
nimdzi.com/remote-vs-onsite-interpreting-t-post-pandemic-equilibrium/ (accessed 20.6.2024).
Nimdzi, 2024. The 2024 Nimdzi 100 [online]. URL www.nimdzi.com/nimdzi-100-top-lsp/ (accessed
20.6.2024).
Okoniewska, A.M., 2022. Interpreters’ Roles in a Changing Environment. The Translator 28(2),
139–147.
Ortiz, L.E., Cavalli, P., 2018. Computer-Assisted Interpreting Tools (CAI) and Options for Automa-
tion with Automatic Speech Recognition. TradTerm 32, 9–31.
Pielmeier, H., Lommel, A., Toon, A., 2024. Perceptions on Automated Interpreting. Results of a
Large-Scale Study of End-Users, Requestors, and Providers of Interpreting Services and Technol-
ogy [online]. CSA Research. URL https://insights.csa-research.com/reports/305013618/Chapter7
Perceptionsa#r::305013618:Chapter7Perceptionsa:TheEthicsofReplacing (accessed 10.6.2024).
Prandi, B., 2023. Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on
Terminology. Language Science Press, Berlin.
Ramírez-Polo, L., Vargas-Sierra, C., 2023. Translation Technology and Ethical Competence: An Anal-
ysis and Proposal for Translators’ Training. Languages 8(2), 1–22.
Ren, W., Yin, M., 2020. Conference Interpreter Ethics. In Koskinen, K., Pokorn, N.K., eds. The Rout-
ledge Handbook of Translation and Ethics. Routledge, London, 195–210.
Rozado, D., 2023. The Political Biases of ChatGPT. Social Sciences 12(3), art. 148.
Russo, M., Spinolo, N., 2022. Technology Affordances in Training Interpreters for Asylum Seek-
ers and Refugees. In Ruiz Rosendo, L., Todorova, M., eds. Interpreter Training in Conflict and
Post-Conflict Scenarios. Routledge, London, 165–180.
Schmitt, P.A., 2019. Translation 4.0–Evolution, Revolution, Innovation or Disruption? Lebende
Sprachen 64(2), 193–229.
Stakeholders Advocating for Fair and Ethical AI in Interpreting (SAFE-AI), 2024. Interpreting SAFE
AI Task Force Guidance (Ethical Principles) on AI and Interpreting Services [online]. URL https://
safeaitf.org/guidance/ (accessed 25.6.2024).
Stengers, H., Lázaro Gutiérrez, R., Kerremans, K., 2023. Public Service Interpreters’ Perceptions
and Acceptance of Remote Interpreting Technologies in Times of a Pandemic. In Corpas Pastor,
G., Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John Benjamins,
Amsterdam, 109–141.
Tejada Delgado, A., 2019. Is the Public Sector Interpreting Market Ready for Digital Transformation?
Revista Tradumática 17, 88–93.
Tymoczko, M., 2006. Translation: Ethics, Ideology, Action. The Massachusetts Review 47(3),
442–461.
UNESCO, 2021. Recommendation on the Ethics of Artificial Intelligence. URL https://unesdoc.une-
sco.org/ark:/48223/pf0000380455 (accessed 15.9.2024).
Van der Meer, J., 2021. Translation Economics of the 2020s [online]. Multilingual. URL https://
multilingual.com/issues/july-august-2021/translation-economics-of-the-2020s/ (accessed 25.6.2024).
Van Dis, E., Bollen, J., Zuidema, W., Van Rooij, R., Bockting, C.L., 2023. ChatGPT: Five Priorities
for Research. Nature 614(7947), 224–226.
Vanmassenhove, E., Emmery, C., Shterionov, D., 2021. Neutral Rewriter: A Rule-Based and
Neural Approach to Automatic Rewriting into Gender-Neutral Alternatives. arXiv preprint
arXiv:2109.06105.
Vieira, N.L., 2020. Automation Anxiety and Translators. Translation Studies 13(1), 1–21.
Wolfe, R., Caliskan, A., 2021. Low Frequency Names Exhibit Bias and Overfitting in Contextualizing
Language Models. arXiv preprint arXiv:2110.00672.
Zetzsche, J., 2020. Freelance Translator’s Perspectives. In O’Hagan, M., ed. Routledge Handbook of
Translation and Technology. Routledge, London, 166–182.
19
COGNITIVE ASPECTS
Christopher D. Mellinger
19.1 Introduction
Researchers have been interested in cognitive aspects of interpreting since the inception of
the discipline. In part, the emphasis on interpreter cognition stems from the work of psycholinguists,
who viewed interpreting as a form of extreme language processing that could provide insights
into how the bilingual brain functions with multiple languages at the same time. These stud-
ies were largely eschewed by practising interpreters at that time, particularly since the exper-
imental control required to understand cognition failed to account for the situated nature of
interpreting, which enables cross-language communication among multiple parties. Conse-
quently, practising interpreters-turned-researchers – that is, practisearchers – began to com-
plement these studies (Gile, 2000), seeking to balance laboratory-conducted experimental
studies with more professionally informed, context-dependent examinations of interpreting
(for an overview, see Mellinger, 2024; Pöchhacker and Shlesinger, 2002). Edited collections
on interpreting technologies now bring together research from both traditions to examine
interpreting as a product, a process, and a service (e.g. Fantinuoli, 2018; Jiménez Serrano,
2019; Mellinger and Pokorn, 2018; Napier et al., 2018).
Implicit within this disciplinary backdrop is the development and use of interpreting
technologies, which have been present since some of the earliest studies on interpreting. For
instance, simultaneous interpreting as a professional practice for spoken language interpret-
ing arose from technological advances that allowed audio equipment to facilitate source
language listening at the same time as target language production (Diriker, 2010; see also
Seeber, this volume). If we consider a broader definition of technologies to encompass any
tools or equipment that enable specific professional practices, then note-taking involving
paper and pen, tablet computers, or digital pen technologies is a commonplace feature of
interpreting practices in both community and conference settings (Ahrens and Orlando,
2022; Goldsmith, 2018). Still, more recent technological advances that facilitate distance
interpreting, be it via telephone (Braun, 2019, 2024; see also Lázaro Gutiérrez, this vol-
ume) or videoconference technologies (Braun, 2019, this volume), are part and parcel of
professional interpreting practices. The same holds true for technologies that disseminate
interpreting services via television (e.g. Wehrmeyer, 2015), video streaming (e.g. Picchio,
2023), or speech-to-text broadcasting (e.g. Romero-Fresco, 2023; see also Davitti, this vol-
ume), such that technology remains an ever-present aspect associated with interpreting as
a practice, as a process, and as a service. Moreover, AI-driven technologies are increasingly
integrated into multilingual communication workflows, blurring the interface between
interpreters and the technologies used to support their work (Fantinuoli, 2023, this vol-
ume; Horváth, 2022).
Given the ubiquitous nature of interpreting technologies, there is an increasing need
to understand how technologies influence interpreter cognition. Interpreting is a cogni-
tively demanding task without the use of technology; however, technology-mediated or
technology-supported communication adds another dimension for which we must account
in order to understand interpreter behaviour and cognitive processing during the interpret-
ing task. As such, the primary focus of this chapter is on the intersection of technology
and interpreter cognition, highlighting studies that have explored various cognitive aspects
of interpreting as they relate to technology use in the practice and process of
interpreting (Section 19.2). These studies are categorised based on how technologies either
enable interpreting (Section 19.2.1) or support the interpreting process (Section 19.2.2).
Mention is also made of the extent to which technology can provide cognitive benefits during interpreter training (Section 19.2.3). Then, several key topics associated with technology use are reviewed, including cognitive load and cognitive effort (Section 19.3.1), cognitive ergonomics and human–computer interaction (Section 19.3.2), as well as more
situated and contextualised approaches to interpreter cognition, including 4EA cognition
and augmented cognition (Section 19.3.3). The chapter concludes with a brief discussion of
open questions in the field related to interpreting technologies and cognition, namely, big
data and interpreting ethics (Section 19.4).
modulate cognitive processes of interpreting. This chapter makes a distinction in line with
Braun (2019), categorising cognitive studies on interpreting based on whether the tech-
nologies enable a new type of interpreting (e.g. remote or distance interpreting) or sup-
port the interpreter during the interpreting task (e.g. automatic speech recognition [ASR],
real-time support technologies, digital pens). Cognitive studies on technology-enabled and
technology-supported interpreting can focus on the use of technologies as the specific vari-
able of interest in comparison to non-mediated or non-supported interpreting. These stud-
ies can also approach technology and interpreting as the baseline configuration – that is,
technology is part of each condition – focusing instead on other moderating variables in these technologised environments that impact the parties involved. Illustrative
examples of both types of studies are included to account for the breadth of scholarship on
interpreting, technology, and cognition.
considering the technologies themselves, such as Goldsmith’s (2018) study that focuses on
the integration of tablet computers into the practice of interpreting, or Liu’s (2018) over-
view of interpreting technologies more generally, with its call for a more critical evaluation and integration of technologies in interpreting workflows. Such calls point to the potential construct of a technology literacy for interpreters (Drechsel, 2019), which, if sufficiently operationalised
and established as a measurable construct, could serve as yet another individual-level dif-
ference discernible in the literature. Reactions to technology form part of cognitive explo-
rations of interpreting, and these individual-level differences can provide a more nuanced
understanding of how interpreters interact with technologies.
Similar to questions associated with providing visual support to interpreters when work-
ing in the simultaneous mode, Chen and Kruger (2023) also examine ASR that allows
machine translation systems to support interpreters working in the consecutive mode. This
study posits a reduction in cognitive load as a result of the visual access to a potential tar-
get language rendition and also suggests that directionality plays a role when considering
technology-supported interpreting. Doherty et al. (2022) approach the question of visual
attention with respect to note-taking during video remote interpreting, finding that the
inclusion of note-taking as a concurrent activity to interpreting increases shifts in attention,
which in turn decreases visual attention to the speaker. These note-taking behaviours have also been examined at the level of individual differences, particularly in relation to experience, although this variable has not been found to mitigate the perception of task difficulty (Kuang and Zheng, 2023). Still others have sought to exam-
ine how note-taking may be indicative of metacognitive behaviour using digital pens, evi-
denced by omissions, hesitations, stray marks, and symbols that are produced during the
interpreting task (Mellinger, 2022b). These studies are still tentative in their conclusions,
and additional work needs to be conducted to better understand the mechanisms by which
technology alters cognitive resource allocation and interpreter behaviour.
interpreting platforms (Bertozzi and Cecchi, 2023; Davitti and Braun, 2020). These studies
can be augmented by research associated with metacognition and self-regulation (Aguirre
Fernández Bravo, 2019; Cañada and Arumí, 2012; Herring, 2019), particularly since these
cognitive variables have not been sufficiently addressed when working with tools. Addi-
tionally, studies that focus on challenges associated with different modalities of interpreting, such as telephone interpreting (e.g. Iglesias Fernández and Ouellet, 2018;
Ozolins, 2011), could be further tested beyond the affective dimensions to determine the
extent to which these tools can cognitively support interpreting students or how these tools
can be leveraged to enhance learning and skill-building.
be illustrated by way of example. In the case of distance interpreting, the cognitive demands regularly encountered by interpreters may be modulated by technology as an additional variable involved in the process: the technologised environment in which interpreters work may give rise to changes in the interpreting task and, as such, require interpreter cognition to vary to account for these differences (Chmiel and Spinolo, 2022; Mellinger, 2019).
Typically, cognitive models of interpreting seek to describe the interpreting task as a pro-
cess (e.g. Gile, 1991, 2009; Seeber, 2013; Seeber and Kerzel, 2012; for an overview of early
cognitive models, see Ahrens, 2025). These models provide information about various loci
of cognitive load, enabling researchers to focus on specific stages of the interpreting task to
investigate cognitive load or effort. To account for task-related characteristics that poten-
tially affect interpreter performance and cognitive load, Chen (2017) moves away from
these process-oriented models to posit a componential model of interpreter cognition. In
doing so, Chen’s (2017) work seeks to address technological aspects as well as physiologi-
cal and affective dimensions associated with interpreter cognition and, in particular, questions associated with note-taking practices that arise during consecutive interpreting.
Zhu and Aryadoust (2022) take a similar approach when examining distance interpreting,
placing particular emphasis on remote interpreting and the task-related dimensions, envi-
ronment, and individual differences that may influence cognitive load. Performance indica-
tors and cognitive load have also been examined in relation to computer-aided interpreting
tools (Defrancq et al., 2024).
human factors scholarship. This type of reflection is reminiscent of scholars who focus on
information technologies in other spaces, including healthcare, who seek to understand the
impact that these tools can have on performance and human factor designs (e.g. Lawler
et al., 2011). Researchers working with cognitive ergonomics can leverage this framework
as a means to examine the impact that technologies have not only on individuals but also
on the systems in which they are embedded and the broader set of actors involved (Barcel-
lini et al., 2016).
The potential of ergonomics to address translator and interpreter training has been taken
up more recently, bridging both physical and cognitive ergonomic perspectives (e.g. van
Egdom et al., 2020). Seeber and Arbona (2020) explore the utility of cognitive ergonomics
in interpreting studies more explicitly, focusing on how training for simultaneous interpret-
ing can be adapted using these principles. As the authors note, their emphasis is on making
training efficient and effective, leveraging technologies to implement their training model
based on cognitively informed research. By relying on a cognitive ergonomics framework,
the work of interpreters can be further situated while allowing cognitive dimensions of the
interpreting task to be explored.
enquiry into interpreting and technology, as this theoretical orientation addresses human
factors and the interface that interpreters have with technologies and their environment.
Scholarship that employs physiological measures to understand cognitive aspects of inter-
preting may also fit within this broader umbrella, particularly related to reactions to stress
and affective dimensions of the interpreting task (e.g. Gieshoff et al., 2021).
More recently, augmented cognition has garnered attention from the translation and inter-
preting studies research community. In translation studies, researchers have argued that
translation has long been an augmented activity as a result of technology use, rather than this being a recent development tied to the widespread use of large language
models (O’Brien, 2024). Yet as O’Brien (2024) notes, augmented cognition has varying
conceptualisations, depending on the disciplinary traditions and theoretical frameworks
employed. For instance, augmented cognition, in a more traditional sense, understands
interactions between humans and technological systems as extending beyond an enhance-
ment provided by technology-supported tasks; rather, it relies on a ‘tight coupling between
user and computer . . . achieved via physiological and neurophysiological sensing of a user’s
cognitive state’ (Stanney et al., 2009). Rather than replacing human ability, O’Brien (2024)
suggests that human-centred artificial intelligence that can amplify intelligence is a potential
means to move forward.
Moving this discussion into interpreting studies, Prandi (2023b) has discussed how
human cognition can move beyond 4EA cognitive frameworks of distributed and extended
cognition to augment human ability using technologies. Much in the same way that O’Brien
(2024) identifies multiple approaches to understanding augmentation, so too can interpreting
technologies be viewed in myriad ways. Gieshoff (2023) describes how augmented reality
is one means by which interpreters working remotely may be able to leverage technologies
to support interpreter cognition. Scholarship involving augmented cognition in interpret-
ing is nascent, yet these studies show potential means forward as a way to understand how
technology can support interpreting as a task and as a service.
ethical questions arise in relation to data protection and confidentiality, particularly since
leveraging data that would serve as the foundation of AI systems and potential augmented
cognitive systems may run counter to codes of ethics. Moreover, there is the potential for
the biases inherent in the data to be propagated in their subsequent use. These biases have
been recognised in large language models (Navigli et al., 2023), and as such, tool develop-
ers and researchers must develop what Giustini and Dastyar (2024) refer to as critical AI
literacy to inform interpreting practices involving these technologies. These questions seem
particularly pressing as researchers continue working at the interface of artificial intelli-
gence, augmented cognition, and large language models (O’Leary, 2022).
In addition, ethical questions loom large when considering the intersection of inter-
preting, technology, and cognition. While there are any number of noteworthy benefits
associated with the incorporation of technology into the interpreting process that can potentially enhance
interpreter performance, how these practices are implemented requires ethical reflection.
At present, many professional standards of conduct or codes of ethics do not mention
technology. In a similar vein, there is a relative dearth of scholarship on ethical aspects of
technological integration into interpreting practices, such that greater attention ought to
be paid to these practices. Scholars beyond interpreting studies have begun to reflect on
the ethics of cognitive enhancement (e.g. Hofmann, 2017; Jotterand and Dubljević, 2016),
the scope of which continues to expand as technological advances continue. Interpreting
studies scholars may find the ethical frameworks and questions raised in relation to other
technologies useful as a starting point to contemplate ethical dimensions.
19.5 Conclusion
Several tentative conclusions can be drawn based on this review of the extant scholarship
that lies at the intersection of interpreting, technology, and cognition. First, the practice and
research of interpreting require more careful consideration of how technology influences
interpreter behaviour, attitudes, and cognition. While previous reflections may have con-
ceptualised technology as an add-on to the practice of interpreting, 4EA cognitive frame-
works and HCI research paradigms highlight the integrated nature of technology within
the interpreting process. Therefore, technology cannot as easily be treated as a stand-alone variable whose use or absence researchers examine for effects on interpreter cognition.
Instead, technology may need to be treated as a moderating variable that alters how cogni-
tive resources are managed or allocated, not only when interpreters are actively leveraging
technological resources to support their practice, but also when interpreters are working
without tools that are typically at their disposal. Technological use in regular practice likely
needs to figure into demographic variables that researchers probe to understand whether
technology has a bearing on study findings.
Second, technology as an extension of interpreter cognition raises important questions
associated with ethics both in terms of research and practice. In addition to the considera-
tions posed in the section on open questions for research, the ethical dimensions of a tech-
nologised workplace challenge researchers to reflect on how technological advances alter
interpreting as a cognitive task. In many respects, augmented cognition remains largely
underexplored in interpreting studies, thereby necessitating more vigorous engagement
with these areas. This type of work will likely need to be collaborative in nature, bringing
together experts on interpreting technologies with those focused on cognitive interpreting
studies in an effort to address increasingly complex research questions.
To conclude, it should be recognised that the study of interpreting technology and its
influence provides an opportunity to revisit our understanding of interpreter cognition
more broadly. Many studies focus on specific cognitive variables that may change when
working with technologies, but these constructs are embedded in broader cognitive mod-
els of interpreting. These models – be they computational, componential, interactional,
neurobiological, or from any of the 4EA research paradigms – seek to describe interpreter
cognition more generally, yet the technologised nature of these practices in specific settings
may precipitate the need for revision. In some cases, these models may already account
for interpreting technologies; however, the increasingly technologised workspace of inter-
preters highlights the complex interplay at the human–computer interface. In sum, more
explicit synthesis of research in these areas is likely needed to understand the relationship
between technology and interpreter cognition.
References
Aguirre Fernández Bravo, E., 2019. Metacognitive Self-Perception in Interpreting. Translation, Cog-
nition & Behavior 2(2), 147–164. URL https://doi.org/10.1075/tcb.00025.fer
Ahrens, B., 2025. Cognitive Models of Interpreting. In Mellinger, C.D., ed. The Routledge
Handbook of Interpreting and Cognition. Routledge, New York, 52–69. URL https://doi.
org/10.4324/9780429297533-5
Ahrens, B., Orlando, M., 2022. Note-Taking for Consecutive Conference Interpreting. In Albl-Mikasa,
M., Tiselius, E., eds. The Routledge Handbook of Conference Interpreting. Routledge, New York,
34–48. URL https://doi.org/10.4324/9780429297878-5
Barcellini, F., De Greef, T., Détienne, F., 2016. Editorial for Special Issue on Cognitive Ergonomics for
Work, Education and Everyday Life. Cognition, Technology & Work 18, 233–235. URL https://
doi.org/10.1007/s10111-016-0371-5
Bertozzi, M., Cecchi, F., 2023. Simultaneous Interpretation (SI) Facing the Zoom Challenge:
Technology-Driven Changes in SI Training and Professional Practice. In Proceedings of the Inter-
national Workshop on Interpreting Technologies SAY-IT 2023, Incoma, 32–40.
Braun, S., 2013. Keep Your Distance? Remote Interpreting in Legal Proceedings: A Critical Assessment
of a Growing Practice. Interpreting 15(2), 200–228. URL https://doi.org/10.1075/intp.15.2.03bra
Braun, S., 2017. What a Micro-Analytical Investigation of Additions and Expansions in Remote
Interpreting Can Tell Us About Interpreters’ Participation in a Shared Virtual Space. Journal of
Pragmatics 107, 165–177. URL https://doi.org/10.1016/j.pragma.2016.09.011
Braun, S., 2019. Technology and Interpreting. In O’Hagan, M., ed. The Routledge Handbook of Translation
and Technology. Routledge, New York, 271–288. URL https://doi.org/10.4324/9781315311258-16
Braun, S., 2020. “You Are Just a Disembodied Voice Really”: Perceptions of Video Remote Interpret-
ing by Legal Interpreters and Police Officers. In Salaets, H., Brône, G., eds. Linking Up with Video:
Perspectives on Interpreting Practice and Research. John Benjamins, Amsterdam, 47–78. URL
https://doi.org/10.1075/btl.149.03bra
Braun, S., 2024. Distance Interpreting as a Professional Profile. In Massey, G., Ehrensberger-Dow, M., Angelone, E., eds. Handbook of the Language Industry: Contexts, Resources and Profiles. De Gruyter Mouton, Berlin, 449–472.
Cañada, M.D., Arumí, M., 2012. Self-Regulating Activity: Use of Metacognitive Guides in the Inter-
preting Classroom. Educational Research and Evaluation 18(3), 245–264. URL https://doi.org/
10.1080/13803611.2012.661934
Chen, S., 2017. The Construct of Cognitive Load in Interpreting and Its Measurement. Perspectives
25(4), 640–657. URL https://doi.org/10.1080/0907676X.2016.1278026
Chen, S., Doherty, S., 2025. Interpreting and Technologies. In Mellinger, C.D., ed. The Routledge
Handbook of Interpreting and Cognition. Routledge, New York, 403–416. URL https://doi.
org/10.4324/9780429297533-29
Chen, S., Kruger, J.L., 2023. The Effectiveness of Computer-Assisted Interpreting: A Preliminary
Study Based on English-Chinese Consecutive Interpreting. Translation and Interpreting Studies
18(3), 399–420. URL https://doi.org/10.1075/tis.21036.che
Chmiel, A., Spinolo, N., 2022. Testing the Impact of Remote Interpreting Settings on Interpreter
Experience and Performance: Methodological Challenges Inside the Virtual Booth. Translation,
Cognition & Behavior 5(2), 250–274. URL https://doi.org/10.1075/tcb.00068.chm
Darden, V., Maroney, E.M., 2018. “Craving to Hear from You . . .”: An Exploration of m-Learning in
Global Interpreter Education. Translation and Interpreting Studies 13(3), 442–464. URL https://
doi.org/10.1075/tis.00024.dar
Davitti, E., Braun, S., 2020. Analysing Interactional Phenomena in Video Remote Interpreting in Col-
laborative Settings: Implications for Interpreter Education. The Interpreter and Translator Trainer
14(3), 279–302. URL https://doi.org/10.1080/1750399X.2020.1800364
Defrancq, B., Snoeck, H., Fantinuoli, C., 2024. Interpreters’ Performances and Cognitive Load in the
Context of a CAI Tool. In Winters, M., Deane-Cox, S., Böser, U., eds. Translation, Interpreting
and Technological Change: Innovations in Research, Practice and Training. Bloomsbury, London,
37–58. URL https://doi.org/10.5040/9781350212978.0009
Díaz-Galaz, S., Winston, E.A., 2025. Interpreting, Training, and Education. In Mellinger, C.D., ed.
The Routledge Handbook of Interpreting and Cognition. Routledge, New York, 417–437. URL
https://doi.org/10.4324/9780429297533-30
Diriker, E., 2010. Simultaneous Conference Interpreting and Technology. In Gambier, Y., van
Doorslaer, L., eds. Handbook of Translation Studies. John Benjamins, Amsterdam, 329–332. URL
https://doi.org/10.1075/hts.1.sim1
Doherty, S., Martschuk, N., Goodman-Delahunty, J., Hale, S., 2022. An Eye-Movement Anal-
ysis of Overt Visual Attention During Consecutive and Simultaneous Interpreting Modes in a
Remotely Interpreted Investigative Interview. Frontiers in Psychology 13, 764460. URL https://
doi.org/10.3389/fpsyg.2022.764460
Donovan, C., 2023. The Consequences of Fully Remote Interpretation on Interpreter Interaction and
Cooperation: A Threat to Professional Cohesion? INContext: Studies in Translation and Intercul-
turalism 3(1), 24–48. URL https://doi.org/10.54754/incontext.v3i1.59
Drechsel, A., 2019. Technology Literacy for the Interpreter. In Sawyer, D.B., Austermühl, F., Enríquez
Raído, V., eds. The Evolving Curriculum in Interpreter and Translator Education: Stakeholder
Perspectives and Voices. John Benjamins, Amsterdam, 259–268. URL https://doi.org/10.1075/ata.
xix.12dre
Englund Dimitrova, B., Tiselius, E., 2016. Cognitive Aspects of Community Interpreting. Toward a
Process Model. In Muñoz Martín, R., ed. Reembedding Translation Process Research. John Ben-
jamins, Amsterdam, 195–214. URL https://doi.org/10.1075/btl.128.10eng
Fantinuoli, C., ed., 2018. Interpreting and Technology. Language Science Press, Berlin.
Fantinuoli, C., 2023. Towards AI-Enhanced Computer-Assisted Interpreting. In Corpas Pastor,
G., Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John Benjamins,
Amsterdam, 46–71. URL https://doi.org/10.1075/ivitra.37.03fan
Fantinuoli, C., Dastyar, V., 2022. Interpreting and the Emerging Augmented Paradigm. Interpreting and Soci-
ety: An Interdisciplinary Journal 2(2), 185–194. URL https://doi.org/10.1177/27523810221111631
Frittella, F.M., 2021. Computer-Assisted Conference Interpreter Training: Limitations and Future Direc-
tions. Journal of Translation Studies 1(2), 103–142. URL https://doi.org/10.3726/JTS022021.6
Frittella, F.M., 2023. Usability Research for Interpreter-Centred Technology: The Case Study of
SmarTerp. Language Science Press, Berlin.
Gärdenfors, P., Johansson, P., eds., 2005. Cognition, Education, and Communication Technology.
Routledge, New York.
Gieshoff, A.C., 2023. The Use of Augmented Reality in Interpreting: Methodological Challenges.
Paper presented at the Fourth International Conference on Translation, Interpreting, and Cogni-
tion, Santiago, Chile.
Gieshoff, A.C., Heeb, A.H., 2023. Cognitive Load and Cognitive Effort: Probing the Psychological
Reality of a Conceptual Difference. Translation, Cognition & Behavior 6(1), 3–28. URL https://
doi.org/10.1075/tcb.00073.gie
Gieshoff, A.C., Lehr, C., Heeb, A.H., 2021. Stress, Cognitive, Emotional and Ergonomic Demands in
Interpreting and Translation: A Review of Physiological Studies. Cognitive Linguistic Studies 8(2),
404–439. URL https://doi.org/10.1075/cogls.00084.gie
Gile, D., 1991. The Processing Capacity Issue in Conference Interpretation. Babel 37(1), 15–27. URL
https://doi.org/10.1075/babel.37.1.04gil
Gile, D., 2000. Issues in Interdisciplinary Research into Conference Interpreting. In Englund Dim-
itrova, B., Hyltenstam, K., eds. Language Processing and Simultaneous Interpreting: Interdiscipli-
nary Perspectives. John Benjamins, Amsterdam, 89–106.
Gile, D., 2009. Basic Concepts and Models for Interpreter and Translator Training, revised ed. John
Benjamins, Amsterdam. URL https://doi.org/10.1075/btl.8
Gile, D., Lei, V., 2020. Translation, Effort and Cognition. In Alves, F., Jakobson, A.L., eds. The Rout-
ledge Handbook of Translation and Cognition. Routledge, New York, 263–278. URL https://doi.
org/10.4324/9781315178127-18
Giustini, D., Dastyar, V., 2024. Critical AI Literacy for Interpreting in the Age of AI. Interpreting and
Society: An Interdisciplinary Journal. URL https://doi.org/10.1177/27523810241247259
Goldsmith, J., 2018. Tablet Interpreting: Consecutive Interpreting 2.0. Translation and Interpreting
Studies 13(3), 342–365. URL https://doi.org/10.1075/tis.00020.gol
Hale, S., Goodman-Delahunty, J., Martschuk, N., Lim, J., 2022. Does Interpreter Location Make a
Difference? A Study of Remote vs Face-to-Face Interpreting in Simulated Police Interviews. Inter-
preting 24(2), 221–253. URL https://doi.org/10.1075/intp.00077.hal
Halverson, S.L., 2021. Translation, Linguistic Commitment and Cognition. In Alves, F., Jakobsen,
A.L., eds. The Routledge Handbook of Translation and Cognition. Routledge, New York, 37–51.
Herring, R., 2019. “A Lot to Think About”: Online Monitoring in Dialogue Interpreting. Transla-
tion, Cognition & Behavior 2(2), 283–304. URL https://doi.org/10.1075/tcb.00030.her
Hofmann, B., 2017. Toward a Method for Exposing and Elucidating Ethical Issues with Human
Cognitive Enhancement Technologies. Science and Engineering Ethics 23, 413–429. URL https://
doi.org/10.1007/s11948-016-9791-0
Hollnagel, E., 1997. Cognitive Ergonomics: It’s All in the Mind. Ergonomics 40(10), 1170–1182.
URL https://doi.org/10.1080/001401397187685
Horváth, I., 2022. AI in Interpreting: Ethical Considerations. Across Languages and Cultures 23(1),
1–13. URL https://doi.org/10.1556/084.2022.00108
Iglesias Fernández, E., Ouellet, M., 2018. From the Phone to the Classroom: Categories of Problems
for Telephone Interpreting Training. The Interpreters’ Newsletter 23, 19–44.
Jiménez Serrano, O., 2019. Interpreting Technologies: Introduction. Revista tradumàtica: traduc-
ció i tecnologies de la informació i la comunicació 17, 20–32. URL https://doi.org/10.5565/rev/
tradumatica.240
Jotterand, F., Dubljević, V., eds., 2016. Cognitive Enhancement: Ethical and Policy Implications in
International Perspectives. Oxford University Press, Oxford.
Korpal, P., Mellinger, C.D., 2025. Interpreting and Individual Differences. In Mellinger, C.D., ed.
The Routledge Handbook of Interpreting and Cognition. Routledge, New York, 357–372. URL
https://doi.org/10.4324/9780429297533-26
Korpal, P., Rojo López, A.M., 2023. Physiological Measurement in Translation and Interpreting.
In Schwieter, J.W., Ferreira, A., eds. The Routledge Handbook of Translation, Interpreting and
Bilingualism. Routledge, New York, 97–110. URL https://doi.org/10.4324/9781003109020-10
Kuang, H., Zheng, B., 2023. Note-Taking Effort in Video Remote Interpreting: Effects of Source
Speech Difficulty and Interpreter Work Experience. Perspectives 31(4), 724–744. URL https://doi.
org/10.1080/0907676X.2022.2053730
Lawler, E.K., Hedge, A., Pavlovic-Veselinovic, S., 2011. Cognitive Ergonomics, Socio-Technical Sys-
tems, and the Impact of Healthcare Information Technologies. International Journal of Industrial
Ergonomics 41(4), 336–344. URL https://doi.org/10.1016/j.ergon.2011.02.006
Liu, H., 2018. Help or Hinder? The Impact of Technology on the Role of Interpreters. FITISPos Inter-
national Journal 5(1), 13–32. URL https://doi.org/10.37536/FITISPos-IJ.2018.5.1.162
Martín de León, C., Fernández Santana, A., 2021. Embodied Cognition in the Booth: Referential and
Pragmatic Gestures in Simultaneous Interpreting. Cognitive Linguistic Studies 14, 363–387. URL
https://doi.org/10.1075/cogls.00079.mar
Mellinger, C.D., 2019. Computer-Assisted Interpreting Technologies and Interpreter Cognition: A
Product and Process-Oriented Perspective. Revista tradumàtica: traducció i tecnologies de la infor-
mació i la comunicació 17, 33–44. URL https://doi.org/10.5565/rev/tradumatica.228
Mellinger, C.D., 2022a. Quantitative Questions on Big Data in Translation Studies. Meta 67(1),
217–231. URL https://doi.org/10.7202/1092197ar
Mellinger, C.D., 2022b. Cognitive Behavior During Consecutive Interpreting: Describing the Note-
taking Process. Translation & Interpreting 14(2), 103–119. URL https://doi.org/10.12807/
ti.114202.2022.a07
Mellinger, C.D., 2023. Embedding, Extending, and Distributing Interpreter Cognition with Tech-
nology. In Corpas-Pastor, G., Defrancq, B., eds. Interpreting Technologies – Current and Future
Trends. John Benjamins, Amsterdam, 195–216. URL https://doi.org/10.1075/ivitra.37.08mel
Mellinger, C.D., 2024. Translation and Interpreting Process Research. In Lange, A., Monticelli, D.,
Rundle, C., eds. The Routledge Handbook of the History of Translation Studies. Routledge,
New York, 450–465. URL https://doi.org/10.4324/9781032690056-31
Mellinger, C.D., ed., 2025. The Routledge Handbook of Interpreting and Cognition. Routledge,
New York. URL https://doi.org/10.4324/9780429297533
Mellinger, C.D., Hanson, T.A., 2018. Interpreter Traits and the Relationship with Technology and Visibil-
ity. Translation and Interpreting Studies 13(3), 366–392. URL https://doi.org/10.1075/tis.00021.mel
Mellinger, C.D., Hanson, T.A., 2019. Meta-Analyses of Simultaneous Interpreting and Working
Memory. Interpreting 21(2), 165–195. URL https://doi.org/10.1075/intp.00026.mel
Mellinger, C.D., Hanson, T.A., 2020. Methodological Considerations for Survey Research: Validity,
Reliability, and Quantitative Analysis. Linguistica Antverpiensia, New Series – Themes in Transla-
tion Studies 19, 172–190. URL https://doi.org/10.52034/lanstts.v19i0.549
Mellinger, C.D., Pokorn, N.K., 2018. Community Interpreting, Translation, and Technology. Trans-
lation and Interpreting Studies 13(3), 337–341. URL https://doi.org/10.1075/tis.00019.int
Milošević, J., Risku, H., 2025. Embodied Cognition. In Mellinger, C.D., ed. The Routledge
Handbook of Interpreting and Cognition. Routledge, New York, 324–340. URL https://doi.
org/10.4324/9780429297533-24
Moser-Mercer, B., 2005. Remote Interpreting: Issues of Multi-Sensory Integration in a Multilingual
Task. Meta 50(2), 727–738. URL https://doi.org/10.7202/011014ar
Muñoz Martín, R., 2016. Of Minds and Men – Computers and Translators. Poznań Studies in Con-
temporary Linguistics 52(2), 351–381. URL https://doi.org/10.1515/psicl-2016-0013
Napier, J., Skinner, R., Braun, S., eds., 2018. Here or There: Research on Interpreting via Video Link.
Gallaudet University Press, Washington, DC. URL https://doi.org/10.2307/j.ctv2rh2bs3
Navigli, R., Conia, S., Ross, B., 2023. Biases in Large Language Models: Origins, Inventory,
and Discussion. Journal of Data and Information Quality 15(2), art. 10. URL https://doi.
org/10.1145/3597307
Norman, K.K., Kirakowski, J., eds., 2018. The Wiley Handbook of Human Computer Interaction,
2nd ed. Wiley, Malden, MA. URL https://doi.org/10.1002/9781118976005
O’Brien, S., 2012. Translation as Human-Computer Interaction. Translation Spaces 1(1), 101–122.
URL https://doi.org/10.1075/ts.1.05obr
O’Brien, S., 2024. Human-Centered Augmented Translation: Against Antagonistic Dualisms. Per-
spectives 32(3), 391–406. URL https://doi.org/10.1080/0907676X.2023.2247423
Olalla-Soler, C., Spinolo, N., Muñoz Martín, R., 2023. Under Pressure? A Study of Heart Rate and
Heart-Rate Variability Using SmarTerp. Hermes 63, 119–142. URL https://doi.org/10.7146/hjlcb.
vi63.134292
O’Leary, D.E., 2022. Massive Data Language Models and Conversational Artificial Intelligence:
Emerging Issues. Intelligent Systems in Accounting, Finance and Management 29(3), 182–198.
URL https://doi.org/10.1002/isaf.1522
Orlando, M., 2023. Using Smartpens and Digital Pens in Interpreter Training and Interpreting
Research. In Corpas Pastor, G., Defrancq, B., eds. Interpreting Technologies – Current and Future
Trends. John Benjamins, Amsterdam, 6–26. URL https://doi.org/10.1075/ivitra.37.01orl
Ozolins, U., 2011. Telephone Interpreting: Understanding Practice and Identifying Research Needs.
Translation & Interpreting 3(2), 33–47.
Picchio, L., 2023. Distance vs. Onsite (Non-) Streamed Interpreting Performances: A Focus on the
Renditions of Film Scenes. The Interpreters’ Newsletter 28, 171–188.
Pisani, E., Fantinuoli, C., 2021. Measuring the Impact of Automatic Speech Recognition on Number
Rendition in Simultaneous Interpreting. In Wang, C., Zheng, B., eds. Empirical Studies of Trans-
lation and Interpreting: The Post-Structuralist Approach. Routledge, New York, 181–197. URL
https://doi.org/10.4324/9781003017400-14
Pöchhacker, F., Shlesinger, M., eds., 2002. The Interpreting Studies Reader. Routledge, New York.
Prandi, B., 2023a. Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on
Terminology. Language Science Press, Berlin.
Prandi, B., 2023b. Exploring Augmented Cognition for Real-Time Interpreter Support. Paper pre-
sented at the Second Bertinoro Translation Society Conference, Cabo de Palos, Spain.
Romero-Fresco, P., 2023. Interpreting for Access: The Long Road to Recognition. In Zwischenberger,
C., Reithofer, K., Rennert, S., eds. Introducing New Hypertexts on Interpreting Studies: A Trib-
ute to Franz Pöchhacker. John Benjamins, Amsterdam, 236–253. URL https://doi.org/10.1075/
btl.160.12rom
Roziner, I., Shlesinger, M., 2010. Much Ado About Something Remote: Stress and Performance in
Remote Interpreting. Interpreting 12(2), 214–247. URL https://doi.org/10.1075/intp.12.2.05roz
Saeed, M., Rodríguez González, E., Korybski, T., Davitti, E., Braun, S., 2023. Comparing Inter-
face Designs to Improve RSI Platforms: Insights from an Experimental Study. Proceed-
ings of the International Conference HiT-IT 2023, 147–156. URL https://doi.org/10.26615/
issn.2683-0078.2023_013
Salaets, H., Brône, G., 2023. “Working at a Distance from Everybody”: Challenges (and Some
Advantages) in Working with Video-Based Interpreting Platforms. The Interpreters’ Newsletter
28, 189–209.
Sannholm, R., Risku, H., 2024. Situated Minds and Distributed Systems in Translation: Exploring
the Conceptual and Empirical Implications. Target 36(2), 159–183. URL https://doi.org/10.1075/
target.22172.san
Seeber, K.G., 2013. Cognitive Load in Simultaneous Interpreting: Measures and Methods. Target
25(1), 18–32. URL https://doi.org/10.1075/target.25.1.03see
Seeber, K.G., 2017. Multimodal Processing in Simultaneous Interpreting. In Schwieter, J.W., Ferreira,
A., eds. The Handbook of Translation and Cognition. Wiley, Malden, MA, 461–475. URL https://
doi.org/10.1002/9781119241485.ch25
Seeber, K.G., Amos, R.M., 2023. Capacity, Load, and Effort in Translation, Interpreting, and Bilingual-
ism. In Schwieter, J.W., Ferreira, A., eds. The Routledge Handbook of Translation, Interpreting and
Bilingualism. Routledge, New York, 260–279. URL https://doi.org/10.4324/9781003109020-22
Seeber, K.G., Arbona, E., 2020. What’s Load Got to Do with It? A Cognitive-Ergonomic Training
Model of Simultaneous Interpreting. The Interpreter and Translator Trainer 14(4), 369–385. URL
https://doi.org/10.1080/1750399X.2020.1839996
Seeber, K.G., Kerzel, D., 2012. Cognitive Load in Simultaneous Interpreting: Model Meets Data. Inter-
national Journal of Bilingualism 16(2), 228–242. URL https://doi.org/10.1177/1367006911402982
Shlesinger, M., 2000. Interpreting as a Cognitive Process: How Can We Know What Really Happens?
In Tirkkonen-Condit, S., Jääskeläinen, R., eds. Tapping and Mapping the Processes of Transla-
tion and Interpreting: Outlooks on Empirical Research. John Benjamins, Amsterdam, 3–16. URL
https://doi.org/10.1075/btl.37.03shl
Stanney, K.M., Schmorrow, D.D., Johnston, M., Fuchs, S., Jones, D., Hale, K.S., Ahmad, A., Young,
P., 2009. Augmented Cognition: An Overview. Reviews of Human Factors and Ergonomics 5(1),
195–224. URL https://doi.org/10.1518/155723409X448062
Stengers, H., Lázaro Gutiérrez, R., Kerremans, K., 2023. Public Service Interpreters’ Perceptions
and Acceptance of Remote Interpreting Technologies in Times of a Pandemic. In Corpas-Pastor,
G., Defrancq, B., eds. Interpreting Technologies – Current and Future Trends. John Benjamins,
Amsterdam, 109–141. URL https://doi.org/10.1075/ivitra.37.05ste
Tiselius, E., Englund Dimitrova, B., 2023. Testing the Working Memory Capacity of Dialogue Interpret-
ers. Across Languages and Cultures 24(2), 163–180. URL https://doi.org/10.1556/084.2023.00439
Van Egdom, G.-W., Cadwell, P., Kockaert, H., Segers, W., 2020. A Turn to Ergonomics in Translator
and Interpreter Training. The Interpreter and Translator Trainer 14(4), 363–368. URL https://doi.
org/10.1080/1750399X.2020.1846930
Viljanmaa, A., 2018. Students’ Views on the Use of Film-Based LangPerform Computer Simulations
for Dialogue Interpreting. Translation and Interpreting Studies 13(3), 465–485. URL https://doi.
org/10.1075/tis.00025.vil
Wehrmeyer, E., 2015. Comprehension of Television News Signed Language Interpreters: A South
African Perspective. Interpreting 17(2), 195–225. URL https://doi.org/10.1075/intp.17.2.03weh
Wen, H., Dong, Y., 2019. How Does Interpreting Experience Enhance Working Memory and
Short-Term Memory: A Meta-Analysis. Journal of Cognitive Psychology 31(8), 769–784. URL
https://doi.org/10.1080/20445911.2019.1674857
Zheng, R.Z., ed., 2018. Cognitive Load Measurement and Application. Routledge, New York.
Zhu, X., Aryadoust, V., 2022. A Synthetic Review of Cognitive Load in Distance Interpret-
ing: Toward an Explanatory Model. Frontiers in Psychology 13. URL https://doi.org/10.3389/
fpsyg.2022.899718
Ziegler, K., Gigliobianco, S., 2018. Present? Remote? Remotely Present! New Technological
Approaches to Remote Simultaneous Conference Interpreting. In Fantinuoli, C., ed. Interpreting
and Technology. Language Science Press, Berlin, 119–139.
20 INTERNATIONAL AND PROFESSIONAL STANDARDS
Verónica Pérez Guarnieri and Haris N. Ghinos
DOI: 10.4324/9781003053248-26
document and its applicability within their country, consulting with stakeholders to deter-
mine its continued validity, the need for updates, or the possibility of withdrawal (ISO 2019).
The Introduction of an interpreting standard describes the needs of the language industry that led to its development, clarifies some terms, and explains why some concepts have been excluded. The most important section is the Scope, which should always be concise, explaining to potential users what the document covers and defining the subject of the standard precisely and succinctly. Like the Introduction, the Scope should be written as a series of statements of fact and should not contain requirements or recommendations. The Scope will include verbs such as ‘specifies’ and ‘establishes’.
Subsequently, the Normative References section provides a list of standards that are
referenced for understanding and implementing the content outlined in the document. The
Terms and Definitions section then covers how a typical terminological entry is drafted, the
importance of definitions seamlessly replacing terms in context, and specific rules for terms
and definitions. These rules include avoiding articles in definitions and refraining from
using equations, figures, and tables in definitions. Additionally, as definitions are explana-
tory, they should not contain requirements, recommendations, or permissions.
There are three possibilities for an introductory text in the ‘Terms and Definitions’ clause,
depending on whether new terms are defined, terms are referenced from another document,
or no terms and definitions are present. In the three definitions that follow, notice the ref-
erencing and derivation among them, the source in the case of definitions that are not new,
and the justification for the change of the cited definition.
3.2.6
interpreting
interpretation
rendering spoken or signed information from a source language (3.1.4) to a target
language (3.1.5) in oral or signed form, conveying both the meaning and language
register (3.1.10) of the source language content (3.1.15)
[SOURCE: ISO 20539:2019, 3.1.10, modified – The order of the wording ‘both the lan-
guage register and meaning’ has been changed to ‘both the meaning and language register’.]
3.2.10
conference interpreting
interpreting (3.2.6) used for multilingual communication at technical, political, scien-
tific and other formal meetings
3.2.13
consultant interpreter
conference interpreter (3.2.9) who provides consultancy services in addition to work-
ing as a conference interpreter
• Plain English should be used for enhanced document clarity. This is deemed crucial for international readers with English as a non-native language and aims at minimising translation errors. Latin words should be avoided; where they are used, English plurals should be employed if they exist.
• When legislation and regulations are referenced, the word ‘compliance’ is used; for stand-
ards requirements, ‘conformity’ or the phrase ‘in accordance with’ should be utilised.
• ISO documents follow Oxford British spelling, in which the suffix ‑ize (rather than ‑ise) is used for approximately 200 verbs, for example, organize and standardize.
• The present tense should be used by default, and an impersonal tone should be main-
tained. ‘Shall’ should be specified for document requirements, and ‘must’ for external
constraints or obligations. The use of ‘need(s) to’ should be avoided to prevent confu-
sion. ‘Should’ denotes a ‘strong recommendation’.
• ‘May’ should be used to express permission, and ‘can’ should be used for possibilities or
capabilities. The substitution with ‘might’ or ‘could’ should be avoided to prevent confu-
sion during translation.
• When referring to an individual, ‘they’, ‘them’, and ‘their’ can serve as gender-neutral
pronouns.
The publication of ISO 13611 marked a pivotal stride toward establishing a standardized framework in the dynamic landscape of interpreting. This standard, whose revision has just been published, is now called ‘ISO 13611:2024 Interpreting services – Community interpreting – Requirements
and recommendations’. The new version includes requirements and recommendations for
the provision of community interpreting services, establishing the relevant practices neces-
sary to ensure quality community interpreting services for all language communities (spo-
ken and/or signed) and for all stakeholders.
The only reference made to technology in the 2014 version of this standard concerned its use for working remotely with the help of video or teleconferencing technology, and for interpreting simultaneously with equipment or by chuchotage. In the newly published version of the standard, for tasks involving technology, community interpreters are
expected to proficiently operate interpreting equipment, including microphones and audio-/
videoconferencing technology. ‘Proficiency’ in using the necessary equipment and platforms
for remote interpreting services is now also required.
The next standard to be approved in the interpreting standardisation pathway was
‘ISO 18841:2018 Interpreting Services – General requirements and recommendations’.
This standard is the umbrella standard from which all specialist standards derive. It was
intended to be the first of the series, but there was a pressing need to regulate the commu-
nity interpreting field back in 2010. ISO 18841 – under review at the moment of writing
this chapter – covers the basic requirements and recommendations necessary for the provi-
sion of interpreting services. It also offers recommendations of good practice for users of
interpreting services. In this standard, ‘distance interpreting’ was mentioned for the first time, under Section 20.5.2.2, ‘Working Conditions’. Interpreting equipment is mentioned in general terms, without any specifications. In the revision under way, the term ‘human beings’ is introduced for the first time in the scope of an international interpreting standard as the subjects performing the interpreting task.
The next standard in the series was ‘ISO 20228:2019 Interpreting Services – Legal Interpreting – Requirements’. Its text covers the principles governing the provision of legal
interpreting services, outlining the required skills of legal interpreters. The recommenda-
tions it contains apply to all parties involved in the legal communicative event: interpret-
ers (oral, signed), legal practitioners/legal service providers, lay users, recipients of legal
services, and institutions. Informative Annex C of this standard, ‘Recommendations for
interpreting mode’, mentions that distance interpreting is used by courts and the police to
facilitate interpreting when the parties are at different locations and that interpreters should
be provided with the right equipment. However, no specifications are given.11
In chronological order, the next standard to be approved was ‘ISO 21998:2020 Inter-
preting services – Healthcare interpreting – Requirements and recommendations’. This
text outlines the criteria and suggestions for spoken and signed communication in health-
care interpreting services. It is relevant to any scenario necessitating healthcare interpreta-
tion, wherein individuals must communicate using spoken or signed language to address
health-related matters. The target audience includes providers of interpreting services and
healthcare interpreters, as well as healthcare providers and users of healthcare services (i.e.
laypersons – patients, carers, etc). For the first time in an interpreting standard, this standard
includes a subsection on the technical competences and skills for healthcare interpreters. It
states that they shall be able to use interpreting technology, and it underlines the responsibility of interpreting service providers to offer a suitable working environment for remote interpreting by mitigating noise and visual disruptions, ensuring optimal technology quality, and providing adequate ventilation.12
The latest interpreting standard to be published is ‘ISO 23155:2022 Interpreting ser-
vices – Conference interpreting – Requirements and recommendations’. This standard, dis-
cussed in more depth in the next section, regulates the provision of conference interpreting
services and offers good practice recommendations. It mentions interpreting technology,
equipment, and for the first time in an interpreting standard, cognitive load.
In simultaneous interpreting, compliance with ‘ISO 24019:2022 Simultaneous interpreting delivery platforms – Requirements and Recommendations’13 is mandatory, and adherence to ‘ISO 20109:2016 Simultaneous interpreting – Equipment – Requirements’14 governs the use of audio/video equipment, microphones, and headphones (see Section 20.4.8 for further details).
20.3.1 Overview
ISO 23155 started as a new work item proposal (NWIP) in August 2017 and was published
52 months later, on 2 January 2022. It specifies requirements and recommendations for the
provision of conference interpreting services. It is primarily addressed to conference inter-
preters and conference interpreting service providers (CISP), but it also serves as reference
for users of and parties involved in conference interpreting.
Conference interpreting is needed at conferences, that is, specialised, structured, for-
mal, multilingual communicative events (see definition 3.3.1). Conference interpreting is
a well-established profession. Every year, conference interpreters enable hundreds of thou-
sands of multilingual conferences and meetings to take place.
ISO 23155 can be qualified as innovative since it considers the provision of conference
interpreting as an integrated project. While ‘conference interpreting’ refers to the mental
processes taking place in the brain of a conference interpreter, the ‘conference interpreting
service’ includes the working conditions that enable conference interpreters to perform, as
well as the logistics (booths, conference equipment, cabling, documentation, travel arrange-
ments) required to deliver conference interpreting to an audience. Accordingly, the term
‘conference interpreting service provider’ (CISP) denotes the professionals, individuals or
organisations, that provide conference interpreting services.
Aside from the informative sections (Foreword, Introduction, Scope, Normative Ref-
erences, Terms and Definitions, Annexes, Bibliography, Index), the key structure of ISO
23155 contains the following clauses:
interpreters is the responsibility of the CISP (7.2). Accordingly, it also underlines the need
to ensure conformity with ISO technical standards (see later text).
The concept of ‘consultant interpreter’ is also defined (3.2.13 and 7.1). Here, effort is
made to shed light on the concept of confidentiality in a requirement for an ‘augmented
level of confidentiality’ (6.1, 7.2, 7.3.1, 7.3.2, Annex B, Annex D).
None of these concepts is totally new, of course, but ISO 23155 groups them together in a
consistent manner under an ISO standard. For decades, the International Association of
Conference Interpreters (AIIC, www.aiic.org), researchers, and academia have been con-
tributing to a vast knowledge base. ISO experts have recently transformed this into a pro-
gressive international standard.
This concludes the presentation of ISO standards on interpreting, those which describe
‘how interpreting should be done’. However, the interpreting industry also relies on another
series of standards. These lay down requirements and recommendations concerning the
technical means used during the majority of interpreted communicative events. Indeed,
there are very few multilingual events or meetings with interpretation that do not require
a minimum amount of technical equipment. This could range from a simple public address
system to simultaneous interpreting booths, interpreter interfaces (hard and soft consoles),
microphones and headsets, screens, cabling, and ancillary equipment.
Interestingly, although ISO Technical Committee 37 on Language and Terminol-
ogy became active as early as 1952, technical interpreting ISO standards predate the
non-technical interpreting standards. This could be due to the technical origins of ISO as
an organisation. To illustrate, ISO 4043 on mobile (simultaneous interpreting) booths dates
back to 1981, while the non-technical interpreting ISO standards previously mentioned
only started emerging well into the 21st century.
This chapter will now briefly visit the technical ISO standards related to interpreting,
with the aim of explaining their relevance to the interpreting service.
4 Location of booths
5 Building standards for booths
6 Booth interior
7 Facilities for interpreters
As one would expect, the Standard addresses minimum dimensions, windows, and vis-
ibility; soundproofing and acoustics; air quality; lighting; and working surface. It also
addresses exposure to electromagnetic radiation.
The Standard recognises that ‘as interpreting is an activity that requires high concentra-
tion, stress factors have to be avoided, and the working environment accordingly has to
meet the highest ergonomic standards and provide an environment that enables interpreters
to carry out their work properly’ (Introduction).
This document also emphasises the need for ‘good visual communication between the
interpreters and the participants in the event’ (Introduction). As discussed later, this consid-
eration becomes even more important, if not critical, in DI settings.
4 Location
5 Design
6 Booth interior
7 Facilities for interpreters
ISO 17651–1 also specifies that ‘booths are places used for work and are occupied through-
out the day’ (5.6.1). This affects the definition of requirements for air quality, temperature,
humidity, etc.
Its main structure (compared with ISO 17651–2:2024, shown later) includes the
following elements:
4 General requirements
5 Size, weight and handling
6 Doors
7 Cable passages
8 Windows
9 Acoustics
10 Ventilation
11 Working surface
12 Lighting
13 Electricity supply
14 Language panels
4 Location
5 Design
6 Booth interior
7 Facilities for interpreters
Part 1 and Part 2 of ISO 17651 will be complemented with ISO 17651–3 (Part 3: Require-
ments and recommendations for interpreting hubs). This is currently still in progress but
will apply to booths which do not have a direct view of the room in which the communica-
tive event is taking place.
Under ‘7.1 Quality’ lies an interesting reference to the ‘effects of both packet level and
signal-related impairments caused by coding processes’. These currently (as of 2024) form
the centre of the debate concerning interpreters’ auditory health.
ISO 20109 also introduces a requirement for hearing protection to be provided by the interpreting equipment (clause 4.5).
It must be noted that ISO 20109 is currently under review and will apply to a variety
of different settings. These include interpreters working in booths in the same space as all
other participants, in booths adjacent to the meeting room, or interpreters working from
interpreting hubs, etc.
ISO 24019 is a non-certifiable standard and reflects the rapidly changing landscape in the
industry. It cancels and replaces ISO/PAS 24019:2020 (PAS is explained in Section 20.1
of this chapter). Some of the changes of note include the additional requirements for sign
language interpreting, a reference to communication between interpreters with sound
and image, and requirements referring to the working environments of both speakers and
signers.
Also of note is the Standard’s approach to and description of distance interpreting as
‘settings where the interpreters are not at the same venue as participants, speakers and
signers or each other’ (Introduction). Here, the Standard indirectly acknowledges that
distance also from participants (not only speakers and signers) makes interpreting at
least ‘different’ for interpreters. The Standard also discusses sound and image quality,
synchronisation of sound and image, hearing protection, latency, existence of technical
support, etc.
Its main structure includes the following elements:
ISO 24019 also introduces a ‘handover procedure and control’ (7.7.11), a procedure to
allow an interpreter to hand over command of their outgoing channel to a channel partner
who is not located by their side. Here, the Standard implicitly refers to a situation where
an interpreter would be obliged to work alone, in uncontrolled (perhaps private) premises,
without technical support. The so-called ‘home alone’ model remains highly controversial,
especially for the purpose of conference interpreting. This will be discussed in greater
detail later.
Furthermore, the Standard recommends (note the use of ‘should’ vs ‘shall’) that com-
munication between interpreters, and between interpreters and technicians, moderators,
speakers, or signers and the conference organizer, ‘should necessitate minimal additional
intellectual effort on the part of the interpreters’ (7.8.2).
electric circuit serving as a path for information spoken, signed or otherwise pre-
sented in the course of the proceedings of a conference by participants other than
conference interpreters.
Experts felt that, to add clarity and bring the definition closer to the way it is used at confer-
ences, some reference should be made to ‘content other than that produced by interpreters’,
hence the addition of ‘excluding input originating from interpreters interpreting from a
spoken language’.
In the meantime, ISO 20539:2019, the 2019 version of the ‘vocabulary standard’,
was withdrawn, revised, and succeeded by ISO 20539:2023. It now carries the following
definition of ‘floor’, derived directly from the ‘platform standard’, ISO 24019:2022. Note
that the source reference between brackets in the preceding quote has also disappeared, as
the 2019 version of 20539 has been withdrawn. The current ‘clean’ definition of ‘floor’ in
ISO 20539:2023 is as follows:
However, due to the term ‘audio output’, this definition still restricts ‘floor’ to audio content
and excludes a signer presenting at a conference. This is an issue which is likely to be dis-
cussed during future ISO meetings. From this brief representation of the varying definitions
of the concept ‘floor’, it is hoped that the reader has been provided with insight into why
definitions in ISO standards must always be considered a ‘work in progress’.
At least one qualified conference technician shall be present throughout the event/
conference, in order to monitor the correct functioning of the equipment. The techni-
cian may either be physically present or located in a centralised control booth or room.
No conference interpreting can be provided, on-site or remotely, without visual cues. There-
fore, at least one EU institution instructs interpreters to stop interpreting when a speaker
does not use a camera or if it is switched off. This and other relevant provisions facilitate
negotiations between users of conference interpreting and CISPs.
Similarly, clause 5.1, ISO 17651–1, on permanent booths, reads:
Each booth shall accommodate interpreters comfortably seated side by side... Perma-
nent booths providing space for no more than one interpreter do not conform to this
document.
However, neither definition encompasses the range of situations where interpreters are
physically separated from either one or more speakers or signers, from part or all of the
audience, or a combination of both (see Braun, Warnicke, Chmiel, and Spinolo, this vol-
ume). Nevertheless, while not providing a fully satisfactory definition, ISO 24019 does
come closer to this approach, mentioning ‘settings where the interpreters are not at the
same venue as participants, speakers and signers or each other’ (Introduction).
One can therefore argue that an improved definition of DI should encompass all settings
and arrangements where physical separation between interpreters and participants results
in reduced sensory input for the interpreters.
This social information about participants’ feelings, emotions, and attitudes to the
other delegates and to what is being discussed is of vital importance to the interpreter
as it constitutes the general framework that defines a communication event.
(Diriker, 2004, in Moser-Mercer, 2005, 730)
Much effort has gone into producing complex topologies that describe the respective posi-
tioning of speakers, moderators, interpreters, technicians, and listeners. However, the only
practicable way of discussing the effects of DI on interpreters is to put interpreters at the
centre and discuss their separation from various categories of participants:
Technical issues aside, what is common to this range of settings is that progressively – with
each degree of separation – interpreters receive less and less information (sensory input)
from the communicative event. In addition, interpreters are also loaded with additional
tasks, such as those which are normally carried out by specialised technicians. This ‘lethal’
combination of reduced sensory input and increased cognitive load is discussed by AIIC
member Andrew Constable, who links it to the question of interpreting quality. Constable
makes a distinction between intrinsic and extraneous cognitive load in DI settings:
The intrinsic cognitive load will be related to the difficulty of the source speech (e.g.,
speed of delivery, density, vocabulary, level of specialism) related to the capacity of
the interpreter to perform the task (experience, subject knowledge, level of prepara-
tion, etc.).
(Constable, 2021, 4)
20.8 Conclusion
Reflecting on the core topic discussed in this chapter, it is essential to
understand the need for interpreting standards. ISO standards incorporate a vast amount
of knowledge, experience, and good practices from various domains. Standards are shaped,
to a large extent, by industry pioneers and show others the way forward. Since the effort
towards standardising interpretation commenced 14 years ago, significant progress has
been made to establish a valuable tool that raises awareness and makes an impactful entry
into the realm of international standards.
In the realm of standardisation, the incorporation of terms and definitions plays a pivotal
role in promoting awareness and establishing a shared language across diverse fields. Pro-
fessionals and practitioners can draw upon these standardised terms and foster heightened
consistency in documentation. This ensures more effective communication and ultimately
facilitates the successful application of standards in a wide range of contexts. Stakeholders are empowered to make informed decisions when seeking interpreting services. The interpreting profession is poised for more organised and
systematic growth, with the standard serving as a critical reference point. In addition, those
embarking on a career in interpreting can gain a clearer understanding of the expectations
for delivering a proficient performance, with reference to current standards.
Furthermore, interpreting standards play a crucial role in advancing the profession towards
‘Sustainable Development Goal 10: Reduced Inequalities’ through the promotion of inclusive
communication. These standards serve to guarantee fair access to information and services.
They aim to dismantle language barriers and foster understanding among diverse communities.
Offering a consistent and universally applicable framework for language services, interpreting
standards actively contribute to the creation of a more inclusive and accessible global landscape.
In relation to the complex intersection between interpreting and technology, standards also provide constant reflection on technological progress. The two current major changes in the industry, distance interpreting and artificial intelligence, are examples of such complexities faced by both providers and users of interpreting today. AI tools are already used extensively during preparation ahead of an assignment, mainly to extract terminology from conference-related material (see Prandi, this volume). However, a second category of applications is currently being built to assist interpreters during an assignment. The success of such tools will depend on whether their application causes additional cognitive load (distractions) during interpreting. Finally, a third emerging category of AI tools is machine interpreting, with the ambition of replacing humans in suitable interpreting assignments in the future.
AI could indeed bring about disruptive changes to interpreting in the future, likely in a
more evident way than machine translation has changed translation. However, until AI starts
seriously interfering with human interpreting, DI remains the most recent key development in
the interpreting industry, despite the conventional technology it uses (combined transmission
of sound and image over the internet). DI can be deemed ‘conventional’ because it does not
affect the core of interpreting; it does not impinge on or emulate the mental processes in the
interpreter’s brain, as AI may do in the future. However, although DI merely represents a new
means of delivery, it is still proving deeply disruptive in several ways.
To illustrate, the launch of distance interpreting, accelerated by the COVID-19 pan-
demic, helped ensure the continuation of business during a global health crisis. However, DI
contributed much more than that: SIDPs enabled novel business models. Large LSPs, who
traditionally focused on translation (where technology already played a role), realised that
interpreting could also become a profit centre, and thousands of linguists entered the inter-
preting market without credentials. DI blurred the dividing line between conference inter-
preting and other types of interpreting, and quality temporarily became less of a priority.
As this first cycle of advancement reaches its conclusion, the language services indus-
try must review its impact. A series of technical issues and operational challenges cast
their shadow over DI. Interpreters feel that the sound quality transmitted to their ears can
present a health hazard. The question of whether interpreters receive adequate sensory
input to perform efficiently remains unanswered. No conclusive research exists concerning
interpreters’ new working conditions. For example, research confirms increased difficulty in DI but has not reached a conclusion about the corresponding necessary
reduction of working hours; or about whether working from home is conducive to qual-
ity; the forms of interpreting that could be covered, even in a rudimentary way, in ‘home
alone’ mode; how to help interpreters cope with increased stress in DI settings; or whether
it is acceptable to work in a windowless booth. It is hoped that further research into ISO
standards will help improve the quality of interpreting in DI settings.
DI technology currently removes and calls into question fundamental premises that
interpreters have taken for granted for decades. These include physical proximity and unlimited access to speakers and listeners, full immersion in the proceedings of the conference, and genuine situational awareness of the meeting room and its surroundings. As a
result, DI affects the interpreter’s working environment and working conditions by reduc-
ing sensory input and adding workload (including from exotic tasks). This leads to exces-
sive cognitive load and accelerated onset of fatigue.
Accepting the hypothesis that the human brain has not evolved over the past five years
(given the split second that five years represent within the evolutionary timescale), one can
reasonably assume that in DI settings, a human interpreter’s brain is thus pushed beyond the
boundaries of its known ‘envelope’. To illustrate, in traditional (on-site) settings, the interpret-
er’s brain is already often operating at the edge of this ‘envelope’. In times of difficulty, when
mental resources are overwhelmed by the total effort required or, as Daniel Gile writes, ‘when
total available processing capacity is insufficient’, the interpreter’s brain is oversaturated. As
a result, interpreting output starts to fail (‘errors, omissions and infelicities can occur’) (Gile,
2021). In DI, on top of known interpreting challenges, this same human brain undergoes a
novel debilitating deficit in sensory input, combined with an increased cognitive load. This
mix predictably leads to more frequent or serious interpreting failures.
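Gile’s Effort Models, referred to in the quotation above, are often summarised schematically as follows, where L, P, M and C denote the listening and analysis, production, short-term memory and coordination efforts, the subscript R the processing capacity required and the subscript A the capacity available; the notation follows Gile’s own formulation and is reproduced here purely as an illustration:

\[ \text{SI} = L + P + M + C \]

\[ L_R + P_R + M_R + C_R \leq T_A \qquad \text{and} \qquad L_R \leq L_A,\ P_R \leq P_A,\ M_R \leq M_A,\ C_R \leq C_A \]

Whenever the total requirements exceed the total available capacity, or any single effort exceeds the capacity available for it, the failures described above become likely; in DI settings, the argument goes, the requirements rise while the available capacity remains unchanged.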
This phenomenon summarises the issues that interpreters face in DI settings. However,
this is not limited to interpreters but is, in fact, critical for the interpreting service as a
whole. If interpreting fails, the entire conference interpreting service collapses, and in this
sense interpreters can be considered as a ‘single point of failure’ in the process (see layer 1
in ISO 23155). To sum up, effective multilingual communication cannot be guaranteed in
DI when interpreters have to perform under serious handicaps.
The question remains as to whether these handicaps can be removed. What can be done to restore the status quo ante (i.e. make an interpreter feel like they are ‘really there’ despite physical separation from various conference stakeholders)? How can interpreters be relieved of exotic tasks? How can the excess cognitive load be mitigated to allow interpreters to operate within the ‘envelope’ again? Is this even possible?
Answers to these questions may lie in what has made interpreting, in particular, con-
ference interpreting, such a successful practice and the foundation of spoken interlingual
communication over decades: implementing best practices combined with a high level of
education and training at reputable interpreting schools worldwide.
While an inordinate amount of time has been spent lamenting issues related to DI set-
tings during its initial years (internet connection failures or unsuitable microphones), it is
time for the industry to acknowledge the extensive expertise and good practices that have enabled millions of hours of conferences to take place without issue since Nuremberg, and to invest in perpetuating this stellar record. This can be achieved by adopting and implement-
ing the practices now universally enshrined in acclaimed ISO standards.
Notes
1 The name does not match the acronym because ISO is derived from the Greek word ἴσος (ísos),
which means ‘equal’. This signifies that, irrespective of the country or language, we operate on an
equal footing, emphasising a universal and equitable approach in our work.
2 ISO 18841:2018 Interpreting Services – General Requirements and Recommendations. URL
www.iso.org/standard/63544.html (accessed 20.2.2024).
3 ISO 23155:2022 Interpreting – Conference Interpreting – Requirements and Recommendations.
URL www.iso.org/obp/ui/en/#iso:std:iso:23155:ed-1:v1:en (accessed 20.2.2024).
4 ISO 23155:2022 Interpreting – Conference Interpreting – Requirements and Recommendations.
URL www.iso.org/obp/ui/en/#iso:std:iso:23155:ed-1:v1:en (accessed 20.2.2024).
5 ISO 23155:2022 Interpreting – Conference Interpreting – Requirements and Recommendations.
URL www.iso.org/obp/ui/en/#iso:std:iso:23155:ed-1:v1:en (accessed 20.2.2024).
6 ISO, ISO House Style. URL www.iso.org/ISO-house-style.html#iso-hs-s-text-r-plain (accessed
24.2.2024).
7 ISO, Consumers and Standards: Partnership for a Better World. URL https://www.iso.org/sites/
ConsumersStandards/5_glossary.html (accessed 2.3.2024).
8 ISO, Certification. (7.2.2023). URL www.iso.org/certification.html (accessed 2.3.2024).
9 ISO, Certification. (7.2.2023). URL www.iso.org/certification.html (accessed 2.3.2024).
10 Replaced by ISO 13611:2024 Interpreting Services – Community Interpreting – Requirements
and Recommendations. URL www.iso.org/standard/82387.html (accessed 2.3.2024).
11 ISO 20228:2019 Interpreting Services – Legal Interpreting – Requirements. URL https://www.iso.
org/standard/67327.html (accessed 4.3.2024).
12 ISO 21998:2020 Interpreting Services – Healthcare Interpreting – Requirements and Recommen-
dations. URL www.iso.org/standard/72344.html (accessed 4.3.2024).
13 ISO 24019:2022 Simultaneous Interpreting Delivery Platforms – Requirements and Recommen-
dations. URL www.iso.org/standard/80761.html (accessed 4.3.2024).
14 ISO 20109:2016 Simultaneous Interpreting – Equipment – Requirements. URL https://www.iso.
org/standard/67063.html (accessed 4.3.2024).
15 ISO 23155:2022 Interpreting – Conference Interpreting – Requirements and Recommendations.
URL www.iso.org/standard/74749.html (accessed 20.2.2024).
16 The definition of ‘interpreting hub’ in the current form of ISO 17651–3 is ‘facility managed by special-
ized staff, with interpreting workspaces and fully equipped for the provision of distance interpreting’.
References
Constable, A., 2015. Distance Interpreting: A Nuremberg Moment for Our Time? AIIC 2015 Assem-
bly Day 3: Debate on Remote. 18 January. URL https://aiic.ch/wp-content/uploads/2020/05/
di-a-nuremberg-moment-for-our-time-andrew-constable-01182015.pdf (accessed 16.7.2024).
Constable, A., 2021. Extraneous Cognitive Load in Distance Interpreting, November. URL www.
researchgate.net/publication/380519089 (accessed 16.7.2024).
Gile, D., 2021. The Effort Models of Interpreting as a Didactic Construct. In Muñoz Martín, R., Sun, S., Li, D., eds. Advances in Cognitive Translation Studies. Springer Nature, Singapore, 139–160.
ISO, 2010. Ed. 1, Guidance for ISO Liaison Organizations. URL www.iso.org/publication/
PUB100270.html (accessed 10.2.2024).
ISO, 2011. Ed. 1, Guidance for ISO Liaison Organizations-Engaging stakeholders. URL http://www.
iso.org/iso/guidance_liaison-organizations.pdf (accessed 10.2.2024).
ISO, 2019. Ed. 2, Guidance on the Systematic Review Process in ISO. URL https://www.iso.org/files/
live/sites/isoorg/files/store/en/PUB100413.pdf (accessed 16.2.2024).
ISO, 2020. My ISO Job – What Delegates and Experts Need to Know. URL www.iso.org/publication/
PUB100037.html (accessed 16.2.2024).
ISO, 2023a. Ed.7, Getting Started Toolkit for ISO Committee Chairs. URL www.iso.org/publication/
PUB100417.html (accessed 12.2.2024).
ISO, 2023b. ISO/IEC Directives, Part 1, Consolidated ISO Supplement. URL https://www.iso.org/sites/
directives/current/consolidated/index.html (accessed 15.2.2024).
ISO, 2024. Ed. 7, My ISO Job-What Delegates and Experts Need to Know. URL www.iso.org/
publication/PUB100037.html (accessed 10.7.2024).
ISO/IEC, 2021. ISO/IEC Directives, Part 2 Principles and Rules for the Structure and Drafting of
ISO and IEC Documents, 9th ed. URL www.iso.org/sites/directives/current/part2/index.xhtml
(accessed 24.2.2024).
Moser-Mercer, B., 2005. Remote Interpreting: Issues of Multi-Sensory Integration in a Multilingual Task. Meta 50(2), 727–738. URL https://doi.org/10.7202/011014ar (accessed 10.7.2024).
21 WORKFLOWS AND WORKING MODELS
Anja Rütten
21.1 Introduction
This chapter will cover the different ways in which an interpreter’s workflow has been affected and changed under the influence of technology. While the focus is on conference interpreting, many aspects also apply to other forms of interpreting. The chapter will first look at the evolution over time in chronological order. It will then analyse the topic from the perspectives of phases and efforts, the semiotics of interpreting, and information and knowledge. Finally, it will look at the impact of technology from a business point of view.
DOI: 10.4324/9781003053248-27
passages can be interpreted. This, in turn, reduces the number of interruptions to a speech.
For interpreters, notes support their memory, on the one hand, but on the other, the act of
both note-taking and listening to the speaker simultaneously, over longer stretches of time,
increases their cognitive load (Chen, 2018, 91).
interpreter plays back the recording into earphones and renders it in simultaneous mode
(Hamidi & Pöchhacker, 2007; Ferrari, 2002). Digital pens with built-in cameras can now-
adays also film notes on special microdot paper. These microdots enable the camera to
record the position of each element of the notes and match it with the recording. In this
way, the interpreter, while interpreting, can also play back the sound of the recording that
corresponds to a certain note element (Orlando, 2015, 144, this volume). While the tech-
nique of SimConsec clearly represents a cognitive relief to interpreters and – as research
suggests – increases the accuracy of the interpreting performance, ‘classical’ consecutive
seems to be rated higher by the audience (Svoboda, 2020, 74).
Both whispered interpreting and SimConsec are suitable for similar settings – situations
in which a simultaneous booth cannot be installed due to lack of space or movement of
the participants (e.g. factory tours or a change of location). Although the author has no statistics on the application of SimConsec at her disposal, long-standing experience exists in
the admission committee of the German Association of Conference Interpreters, where
all senior members must submit a list of 200 working days as proof of proficiency. From
this perspective, it is fair to say that SimConsec has no relevance, at least in the German
conference interpreting market, whereas the use of personal guidance systems is quite wide-
spread. On a continuum of maximum and minimum cognitive load for the interpreters, the
two techniques can be seen as opposite extremes, with correspondingly opposite levels of
comfort for the communication participants (maximum vs minimum time delay, freedom
of movement for listeners thanks to headphones in the case of tour guide systems). Client
convenience seems to be a more dominant factor than interpreter convenience or interpreta-
tion accuracy.
in different languages has gradually become commonplace (with Linguee and Wikipedia
being the icing on the cake). Information overload has become a highly discussed topic, not
only in the world of interpreting. The proliferation of texts on all sorts of subjects, in many
languages, has been fuelled by the onset of search engine optimisation, not to mention
AI-generated texts in recent years. And still, new glossaries are being created from scratch
every day. This underlines the highly specialised and contextual nature of interpreters’ work and the need for highly targeted information.
21.2.7 Videoconferencing
The next and most recent form of technology that has had a huge impact on conference
interpreting – the term ‘technology’ is again used here to mean mediation and not support
(Braun 2019) – is videoconferencing. While tests using audiovisual connections for remote
interpreting were conducted long before the turn of the century (Ziegler and Gigliobianco,
2018, 123) and over-the-phone interpreting has long since been a reality, it was not until
the outbreak of COVID-19 that videoconference interpreting, in particular, remote simul-
taneous interpreting (RSI), became a ‘new normal’ setting for conference interpreters. The
ISO standard 20108:2017 defines distance interpreting or remote interpreting as ‘interpret-
ing of a speaker in a different location from that of the interpreter, enabled by information
and communications technology (ICT)’. While this can mean that all meeting participants are in the same location, with only the interpreters connected remotely, videoconferences are nowadays common practice in business and political settings, where both meeting participants and interpreters are often distributed across different locations around the world. Equally common are hybrid meetings, where some participants and interpreters are physically co-located while others are connected remotely, from their own (home) offices or commercial interpreting hubs (AIIC, 2019d; Braun, this volume).
Depending on the remote setting, additional tasks may need to be fulfilled by the inter-
preter, particularly in remote simultaneous interpreting (RSI) settings:
• If not working from a hub: creating the technical set-up before the meeting (computer,
ethernet connection, camera, microphone, videoconferencing software)
• Managing the videoconferencing software during the meeting
• Maintaining contact with the client/conference participants via chat/email, etc.
• If not co-located with their booth partner: communicating via chat/email/video or voice
backchannels, finding a way of supporting each other remotely, managing microphone
handover swiftly
Aside from these additional tasks and the resulting alienation from the setting, limited access to and contact with the communicative setting and participants, potentially poor sound quality, and split attention in hybrid meetings add another layer of cognitive load. In institutions like the European Commission, the European Parliament, and the European Patent Office, this has led to
an increase in team strength or restriction of working hours in settings involving videocon-
ference participants (Mahyub Rayaa and Marti, 2022).
impressive progress in machine translation and speech recognition, among others. Thus, AI
has made its way into interpreters’ workflow. This could have an even more sizeable impact
on workflow. Aside from tasks such as abstracting, terminology extraction, and glossary
creation, AI support is also possible during simultaneous interpreting. Examples include
SmarTerp, InterpretBank’s Automatic Booth Assistant, and Cymo Note, which already
offer live support when it comes to displaying terminology, numbers, and named entities.
In addition, the University of Ghent’s EABM (Ergonomics for the Artificial Booth Mate)
project also aims to provide a similar tool. In terms of cognitive load, it remains to be
seen whether real-time, automated computer support adds another layer of cognitive load,
causes split attention, and increases stress and/or impairs performance, or whether it relieves the complex process of simultaneous interpreting. Research to date shows a mixed picture (cf. Chapter 2.3). In terms of listening support, generic transcription tools like otter.ai or notta.ai, or live captions and translations like, for example, Speechmatics, may become an alternative to bespoke CAI tools in the future. These tools are, of course, far from intuitive to ‘read’ while interpreting, but their user interfaces might eventually be optimised in a way that also meets interpreters’ needs.
What turns ‘information and knowledge work’ into ‘information and knowledge man-
agement’ is the deliberate orientation of work towards a defined goal (successful com-
munication) and the corresponding measurement, or at least evaluation, of the level of
achievement, as well as the corresponding deliberate and selective handling of information
and knowledge beyond one single assignment and isolated tasks (Rütten, 2007, 153ff). This
connects the end of one working cycle (assignment) to the beginning of the next. This helps optimise the workflow and, at the same time, establishes a more overarching perspective.
‘learning vocab’ more efficient and convenient, as this can also be done ‘on the go’. Another
technology-enabled practice in meeting preparation is online collaboration. Shared online
glossaries, edited by several interpreters of a team, not only are efficient when it comes to
sharing workload but also make exchange among team members easier. This helps synchro-
nise and consolidate knowledge.
In summary, it can be stated that technology has made interpreters’ work less cumber-
some at the level of data retrieval. It has also provided greater amounts of information and
more sophisticated means of processing and managing it. Besides the risk or temptation of
overreliance on technology in the booth, technology brings a number of convenient options
for transforming processed information into knowledge.
Unlike large language models, which transcode the surface form of one language into
another, interpreters, just like translators, work on the basis of the underlying semantics.
Unlike translators or terminologists, interpreters work in the moment, for a unique setting,
with a more or less defined group of participants. Accordingly, the circumstances of the
situation play a greater role and also allow for more freedom. A ‘correct’ interpretation
is what participants accept or prefer, even if no one outside the group would understand.
Expressions that are documented to be correct, be it in a dictionary or other references, will
not necessarily be understood. This could be due to a lack of background knowledge or dif-
ferent origins. Expressions that would clearly not be equivalents in a strict conceptual sense
(e.g. suction cleaner nozzle and suction nozzle) can be perfect equivalents in the context of
a particular meeting, because everyone knows they are referring to the same object.
While it is of course important to know the correct terminology (i.e. form) for, say, the
German Aktiengesellschaft, Vorstand, Aufsichtsrat, and Verwaltungsrat in the other lan-
guage, in an emergency such isolated knowledge gaps can be filled by looking terms up or (often more efficiently) by asking a boothmate. If, however, the interpreter is unfamiliar
with the intricacies of company management structures in the different countries (semantic
dimension), the risk of getting a message wrong is higher and more complicated to sort out
ad hoc. Errors in interpreting resulting from such semantic, or pragmatic, knowledge gaps
often create more severe miscommunication than ‘just the wrong word’ (Rütten, 2007,
196f).
On the other hand, the close-up camera view of each speaker often far surpasses the view interpreters have of speakers from their booths in physical meetings. This can give a very
detailed impression of the speakers’ facial expression – an effect that could be even more
striking if image transmission in virtual reality format became common (Ziegler & Gigliobianco, 2018).
Overall, digital information technologies have given interpreters better access to relevant
information across all semiotic levels, making their knowledge work potentially easier and
more meaningful. At the pragmatic level, audiovisual transmission technologies may alienate the interpreter (as well as other participants) from the communicative situation but can
also offer additional situational insight.
For simultaneous interpreting, Seeber further differentiates the subtasks in his model for
measuring the cognitive load:
practice for many interpreters. If the interpreter wants only exact matches to be recognised
by the system, with no fuzzy matches, then any term interpreters want to see on their
screens in the booth needs to be taught to the system in advance, in exactly the form in which it should be recognised (SmarTerp, 2024).
Ideally, glossary creation would be carried out by CAI tools based on a specially trained
AI. This would considerably reduce the preparation work pre-process and leave the inter-
preter with more capacity for important content and context-related preparation. In any
case, this may reduce the effort of memorising terminology pre-process, as well as looking
up terminology in-process.
As for the primary interpreting activity as such, a subtask where technology can be of
great use is the perceptual auditory verbal processing of the source language. For languages
of high diffusion, transcription tools already provide reasonable (although not flawless)
support. This can help with following very fast or unclear speakers or, in the case of Eng-
lish, unknown accents.
In consecutive interpreting settings, interpreters are usually directly involved in the com-
municative setting (standing in front of an audience or sitting between the parties). Conse-
quently, relying on digital or paper glossaries or live prompting is far more difficult. It was
only with tablets for note-taking that consecutive interpreters had a real option of looking
up terms ad hoc or having any cognitive support that goes beyond a short ‘cheat sheet’. In
addition, the interpreter can also take notes on an electronic device. This can offer several
convenient functions, like copying and pasting or erasing notes, using different line thick-
nesses and colours, browsing the electronic notepad, zooming in and out, etc. (Goldsmith,
2017, 43ff). This facilitates the clearer organisation of information and the filling of knowl-
edge gaps ad hoc.
Similarly to pre-process, tasks that once had to be completed post-process (once an inter-
preter is back at their desk) can now be completed ‘on the job’. The use of portable comput-
ers in the booth enables interpreters to update their terminology or even carry out some
background research to fill semantic knowledge gaps peri-process. Other post-process tasks
like journal writing or self-assessment may also be supported by technology, for example,
by facilitating log files of CAI tools.
Overall, secondary information and knowledge work that formerly had to be completed pre- or post-process can now, in theory, be carried out in- or peri-process, on top of the primary task of interpreting. On the other hand, CAI tools can help reduce the burden of filling knowledge gaps ad hoc and possibly also spare at least some of the memorising of terminology pre-process. However, this would require a different approach to preparation.
Furthermore, mobile computing enables interpreters to prepare for one meeting while
still attending another. They can also attend to clients’ requests and take care of logistical
issues that used to be done before or after the meeting, before the era of mobile computers
and smartphones.
to organise; or when the organiser does not have to wait for a free slot in the interpreters’
agendas. However, human interpreters will most probably still be needed when assignments require technical, political, diplomatic, emotional, or contextual understanding; tact; human creativity to translate a joke, understand irony, or convey newly created terms and ideas in another language; or a plausibility control. Considering that low-intensity assignments are more likely to disappear for human interpreters, since they are more easily replaced by machines or by the use of a common (foreign) language like English, this is another factor contributing to the intensification of conference interpreting.
21.5 Conclusion
Technology, in very general terms, has contributed to the intensification and simultanifica-
tion of interpreters’ workflow over time, with simultaneous interpreting and videoconfer-
encing probably being the biggest contributors. These major developments have increased
the interpreters’ alienation from the communicative setting. From a client’s perspective,
acceptance of technology in interpreting seems to be influenced not only by accuracy but
also – or, to a certain extent, even more so – by convenience. Interpreters have long been
valued for what they are: human minds listening to what is being said, extracting and pro-
cessing underlying meaning, and communicating it to another party in an understandable
way (Seleskovitch, 1968, cited in Seleskovitch, 1992, 41).
As mobile devices such as laptops and tablets have entered booths, many more tasks can be performed during an assignment. This potentially further adds to the
technology-induced intensification of work in the booth. The lines between the differ-
ent phases have become blurred. Now, however, new developments have also brought
technology-enabled benefits for interpreters. The large increase in available data and infor-
mation has gone hand in hand with more efficient exchange and software solutions – as if
interpreters have been given a pile of earth and a shovel. Cloud-based online collaboration,
for the first time, has allowed for real teamwork and workload-sharing in preparation.
This, in turn, has the potential to further increase complexity and simultaneity, as well as
efficiency and quality.
Information literacy, including strategies such as selection/prioritisation, classification,
automatisation/memorisation, systematisation, and extraction, can help capitalise on the
wealth of information available. Just as financial literacy can help capitalise on new business opportunities created by the internet, information literacy strategies can help interpreters exploit more sophisticated means of processing and managing information. However, excessive use of information technologies may lead to overreliance on them in the booth, with interpreters no longer memorising terminology. On the other hand, flashcard functions and other tools can
help convert processed information into knowledge. Furthermore, existing live prompt-
ing CAI tools, or those that are currently being developed, can help reduce the burden of
filling knowledge gaps while interpreting. However, these require a different approach in
preparation.
It remains to be seen whether these tools make their way to the market. The uptake of specific CAI tools by interpreters has not been extensive in the past. Similarly, an increase
in accuracy of interpreting performance is not necessarily a convincing argument to clients
either. Nonetheless, at least bespoke CAI tools offering real-time prompting are in line
with the current trend concerning the use of AI, which is to combine the strengths of both
humans and machines. Wilson and Daugherty found that firms achieve the most significant performance improvements through collaborative intelligence, in which humans and AI complement each other’s strengths:
The leadership, teamwork, creativity, and social skills of the former, and the speed,
scalability, and quantitative capabilities of the latter. What comes naturally to people
(making a joke, for example) can be tricky for machines, and what’s straightforward
for machines (analysing gigabytes of data) remains virtually impossible for humans.
Business requires both kinds of capabilities.
(Wilson & Daugherty, 2018)
It will be interesting to see which form of technological support the future brings for
conference interpreters in order to take advantage of human–machine synergies. Could it
be intuitive facial gesture commands, document navigation via speech recognition, teaching
AI to predict problematic elements, correction of pronunciation, or something else entirely?
The rapid technological developments in AI and human–machine interaction at least allow
for speculation that intuitive support for the highly specific needs of conference interpreters
might one day become a reality.
References
AIIC, 2019a. History of the Profession. URL https://aiic.org/site/world/about/history/profession
(accessed 21.2.2024).
AIIC, 2019b. History of AIIC. URL https://aiic.org/site/world/about/history (accessed 21.2.2024).
AIIC, 2019c. Glossary. URL https://aiic.org/site/world/conference/glossary (accessed 22.2.2024).
AIIC, 2019d. Leitlinien der AIIC für das Ferndolmetschen (Distance Interpreting). Version 1.0.
URL https://aiic.de/wp-content/uploads/2019/08/aiic-leitlinien-ferndolmetschen-20190802-2.pdf
(accessed 15.3.2024).
Arntz, R., Picht, H., Mayer, F., 2002. Einführung in die Terminologiearbeit. Georg Olms Verlag,
Hildesheim, Zürich, New York.
Bartsch, L.M., Oberauer, K., 2021. The Effects of Elaboration on Working Memory and Long-Term
Memory Across Age. Journal of Memory and Language 118. URL https://doi.org/10.1016/j.
jml.2020.104215
Braun, S., 2019. Technology and Interpreting. In O’Hagan, M., ed. The Routledge Handbook of
Translation and Technology. Routledge, London, 271–288.
Cambridge Dictionary, 2024. Cambridge University Press & Assessment. URL https://dictionary.
cambridge.org/de/worterbuch/englisch/workflow (accessed 9.7.2024).
Chen, S., 2018. Exploring the Process of Note-Taking and Consecutive Interpreting: A Pen-Eye-Voice
Approach Towards Cognitive Load (PhD thesis). Department of Linguistics, Faculty of Human
Sciences, Macquarie University, Sydney, Australia.
Corpas Pastor, G., Fern, M.L., 2016. A Survey of Interpreters’ Needs and Their Practices Related to
Language Technology. Technical Report. Universidad de Málaga, Málaga.
Defrancq, B., Fantinuoli, C., 2020. Automatic Speech Recognition in the Booth: Assessment of Sys-
tem Performance, Interpreters’ Performances and Interactions in the Context of Numbers. Target
33(1), 73–102. URL https://doi.org/10.1075/target.19166.def
Desmet, B., Vandierendock, M., Defrancq, B., 2018. Simultaneous Interpretation of Numbers and the
Impact of Technological Support. In Fantinuoli, C., ed. Interpreting and Technology: Translation
and Multilingual Natural Language Processing 11. Language Science Press, Berlin.
Fantinuoli, C., 2018. Computer-Assisted Interpreting: Challenges and Future Perspectives. In Corpas
Pastor, G., Durán-Muñoz, I., eds. Trends in E-Tools and Resources for Translators and Interpret-
ers. Brill, Leiden, 153–174. URL https://doi.org/10.1163/9789004351790_009
Ferrari, M., 2002. Traditional vs. “Simultaneous Consecutive”. SCIC News 29, 6–7.
Frittella, F.M., 2023. Usability Research for Interpreter-Centred Technology: The Case Study of
SmarTerp. Translation and Multilingual Natural Language Processing 21, Language Science Press,
Berlin.
Gaiba, F., 1998. The Origins of Simultaneous Interpretation: The Nuremberg Trial. University of
Ottawa Press, Ottawa.
Gigliobianco, S., Ziegler, K., 2018. Present? Remote? Remotely Present! New Technological
Approaches to Remote Simultaneous Conference Interpreting. In Fantinuoli, C., ed. Interpreting
and Technology. Language Science Press, Berlin, 119–139.
Gile, D., 1997. Conference Interpreting as a Cognitive Management Problem. In Danks, J., Shreve,
G.M., Fountain, S.B., McBeath, M.K., eds. Cognitive Processes in Translation and Interpretation.
Sage Publications, Thousand Oaks, 196–214.
Goldsmith, J., 2017. A Comparative User Evaluation of Tablets and Tools for Consecutive Interpret-
ers. In Translating and the Computer 39. Proceedings. AsLing, London, 40–50.
Goldsmith, J., 2020. Terminology Extraction Tools for Interpreters. In Ahrens, B., ed. Interdepend-
ence and Innovation in Translation, Interpreting and Specialized Communication. Frank &
Timme, Berlin, 279–302.
Hamidi, M., Pöchhacker, F., 2007. Simultaneous Consecutive Interpreting: A New Technique Put to
the Test. Meta 52, 276–289. URL https://doi.org/10.7202/016070ar
Heery, E., Noon, M., 2008. A Dictionary of Human Resource Management, 2nd ed. Oxford Univer-
sity Press. URL https://doi.org/10.1093/acref/9780199298761.001.0001
ISO, 2017. ISO 20108:2017 Simultaneous Interpreting – Quality and Transmission of Sound and
Image Input – Requirements.
Jiang, H., 2015. A Survey of Glossary Practice of Conference Interpreters. aiic.net. URL https://api.
semanticscholar.org/CorpusID:60436853 (accessed 5.3.2024).
Kalina, S., 2005. Quality Assurance for Interpreting Processes. Meta 50(2), 769–784.
Kalina, S., Ziegler, K., 2015. Technology. In Pöchhacker, F., ed. Routledge Encyclopedia of Interpret-
ing Studies. Routledge, London, 410–411.
Kuhlen, R., Seeger, T., Strauch, D., 2004. Grundlagen der Praktischen Information und Dokumenta-
tion, begründet von Klaus Laisiepen, Ernst Lutterbeck und Karl-Heinrich Meyer-Uhlenried. 5.,
völlig neu gefasste Ausgabe. Band 1: Handbuch zur Einführung in die Informationswissenschaft
und -praxis. K. G. Saur, München.
Leffer, L., 2023. When It Comes to AI Models, Bigger Isn’t Always Better. Scientific American.
URL www.scientificamerican.com/article/when-it-comes-to-ai-models-bigger-isnt-always-better/
(accessed 15.6.2024).
Lewandowski, T., 1990. Linguistisches Wörterbuch. 3 Bände. 5. Auflage. UTB, Heidelberg.
MacLeod, C.M., 2011. I Said, You Said: The Production Effect Gets Personal. Psychonomic Bulle-
tin & Review 18, 1197–1202. URL https://doi.org/10.3758/s13423-011-0168-8
Mahyub Rayaa, B., Martin, A., 2022. Remote Simultaneous Interpreting: Perceptions, Practices and Developments. The Interpreters’ Newsletter 27, 21–42. URL https://doi.org/10.13137/2421-714X/34390
Matyssek, H., 1989. Handbuch der Notizentechnik für Dolmetscher: Ein Weg zur sprachunabhängi-
gen Notation. Groos Verlag, Heidelberg.
McHugh-Johnson, M., 2021. 15 Milestones, Moments and More for Google Docs’ 15th. In The Key-
word. Google, Mountain View, CA. URL https://blog.google/products/docs/happy-15-years-google-
docs/ (accessed 29.2.2024).
Orlando, M., 2015. Digital Pen Technology and Interpreter Training, Practice, and Research: Status
and Trends. In Ehrlich, S., Napier, J., eds. Interpreter Education in the Digital Age: Innovation,
Access, and Change. Gallaudet University Press, Washington, DC, 125–152.
Prandi, B., 2023. Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on
Terminology. Translation and Multilingual Natural Language Processing 22, Language Science
Press, Berlin.
Probst, G., Raub, S., Romhardt, K., 1999. Wissen managen. Wie Unternehmen ihre wertvollste Res-
source optimal nutzen. Gabler, Wiesbaden.
Rozan, J.F., 1956. La prise de notes en interprétation consécutive. Georg, Geneva.
Rütten, A., 2007. Informations- und Wissensmanagement im Konferenzdolmetschen. Sabest. Saar-
brücker Beiträge zur Sprach- und Translationswissenschaft. Peter Lang, Frankfurt a. M.
Rütten, A., 2016. Interpreters’ Workflows and Fees in the Digital Era. In Translating and the Com-
puter 38. Proceedings. AsLing, London, 133 ff.
Rütten, A., 2017. Terminology Management Tools for Conference Interpreters – Current Tools and
How They Address the Specific Needs of Interpreters. In Translating and the Computer 39. Pro-
ceedings. AsLing, London, 98 ff.
Seeber, K., 2011. Cognitive Load in Simultaneous Interpreting: Existing Theories – New Models.
Interpreting 13(2), 176–204. URL https://doi.org/10.1075/intp.13.2.02see
Seleskovitch, D., 1992. De la pratique à la théorie – Von der Praxis zur Theorie. In Salevsky, H.,
ed. Wissenschaftliche Grundlagen der Sprachmittlung. Berliner Beiträge zur Übersetzungswissen-
schaft. Peter Lang, Frankfurt a. M., 38–55.
Skolnikoff, E.B., 1993. The Elusive Transformation: Science, Technology, and the Evolution of Inter-
national Politics. Princeton University Press.
Stoll, C., 2009. Jenseits simultanfähiger Terminologiesysteme: Methoden der Vorverlagerung und Fix-
ierung von Kognition im Arbeitsablauf professioneller Konferenzdolmetscher. Wissenschaftlicher
Verlag Trier, Trier.
Svoboda, Š., 2020. SimConsec: The Technology of a Smartpen in Interpreting (MA dissertation).
Faculty of Philosophy, Faculty of Arts, University of Olomouc, Prague.
Van Herpen, D., 2017. Work Intensification: A Clarification and Exploration into Causes, Conse-
quences and Conditions. A Literary Review. School of Social and Behavioral Sciences, Tilburg
University.
Wagener, L., 2012. Vorbereitende Terminologiearbeit im Konferenzdolmetschen unter besonderer
Berücksichtigung der Zusammenarbeit im Dolmetschteam (MA dissertation). Faculty of Informa-
tion and Communication Sciences, Institute for Translation and Multilingual Communication,
University of Applied Sciences Cologne.
Wilson, H.J., Daugherty, P., 2018. Collaborative Intelligence: Humans and AI Are Joining Forces.
Harvard Business Review, July–August 2018 issue, 114–123.
Zimmermann, H.H., 2004. Information in der Sprachwissenschaft. In Kuhlen, R., Seeger, T., Strauch,
D., eds. Grundlagen der praktischen Information und Dokumentation, 5th ed., vol. 1. K. G. Saur,
Munich, 704–709.
Software
Airgram.io, 2024. URL https://www.notta.ai/en/welcome-airgram (accessed 20.3.2024).
Boothmate, 2024. URL https://boothmate.app (accessed 20.3.2024).
Cymo Note, 2024. URL www.cymo.io/en/note.html (accessed 20.3.2024).
Ergonomics for the Artificial Booth Mate (EABM), 2024. Ergonomics for the Artificial Booth Mate
(EABM). URL www.eabm.ugent.be/eabm/ (accessed 20.3.2024).
Interplex, 2024. URL www.fourwillows.com/interplex.html (accessed 20.3.2024).
InterpretBank ASR, 2024. URL https://www.interpretbank.com/site/docs/v4/asr.html (accessed
20.3.2024).
Otter.ai, 2024. URL https://otter.ai/ (accessed 20.3.2024).
SmarTerp, 2024. URL https://smarterp.me/ (accessed 20.3.2024).
Speechmatics, 2024. URL www.speechmatics.com/ (accessed 20.3.2024).
22
ERGONOMICS AND
ACCESSIBILITY
Wojciech Figiel
22.1 Introduction
This chapter discusses the notions of ergonomics and accessibility as applied to simultane-
ous interpreting, both in its on-site and remote modalities. While the author is aware that
interpreting exists in many modalities besides simultaneous (e.g. consecutive, dialogue, whis-
pered interpreting, etc.), this chapter will briefly mention these but will focus on the simul-
taneous mode, as it is here that the greatest technological developments can be observed.
As the complexity of interpreting technologies continues to grow, one can observe an
increased awareness of the links between ergonomics and interpreting studies and a result-
ant need for more extensive research into the interface between these two fields (see van
Egdom et al., 2020). Nevertheless, to date, little attention has been devoted to ergonomics
in the field of interpreting studies, and even less to the intersection between accessibility
and ergonomics in interpreting. This chapter will argue that there is a necessity to consider
accessibility, as well as ergonomics, in the discussion, in order to be inclusive to all inter-
preting professionals, including those with disabilities.
The author of this chapter is not neutral in this respect. As a visually impaired person (VIP) and an active conference interpreter for over 15 years, the author has first-hand experience in striving to find the best possible solutions for both themselves and
their blind students, in an effort to make the interpreting profession as accessible as pos-
sible. As a result, this chapter will include personal perspectives relating to accessibility,
with particular attention being paid to experiences from the region of Central and Eastern
Europe (CEE), as the author is based in Poland. However, this does not make the chapter
less relevant for other locations, as most challenges encountered in the field of ergonomics
are universal in their nature.
The structure of the chapter is as follows: Section 22.2 will provide a discussion of the
most relevant terms related to this chapter. These include ergonomics, usability, user experi-
ence, and accessibility. Section 22.3 will sketch out a historical outline of the development
of technologies in the field of conference interpreting and analyse their impact on both ergo-
nomics and accessibility. Section 22.4 will review some of the current workflows of confer-
ence interpreters. Section 22.5 discusses computer-assisted interpreting (CAI) tools. The
sixth section is devoted to distance interpreting (DI). This section presents the opportunities
and challenges involved in DI, with particular attention paid to accessibility for people with
visual impairments. The chapter concludes by providing information relating to ergonom-
ics in speech-to-text interpreting (Section 22.7) and in pedagogical contexts (Section 22.8).
22.2 Definitions
At first glance, the terms relevant to this chapter – ‘ergonomics’, ‘user experience’, and
‘accessibility’ – appear straightforward, intuitive, and easy to define. However, as with
nearly all terms, this is far from true. ‘Ergonomics’ comes from the Greek words for
‘work’ – ‘ergon’ – and ‘laws’ – ‘nomos’. The International Ergonomics Association defines
ergonomics as the ‘scientific discipline concerned with the understanding of interactions
among humans and other elements of a system, and the profession that applies theory,
principles, data, and methods to design in order to optimize human well-being and overall
system performance’ (International Ergonomics Association, n.d.). Other definitions of the
term (see Kiran, 2020: 221) place emphasis on the study of the relationships between human
beings and machines, as well as well-being and improvement in efficiency. In simple words,
the goal of ergonomics is ‘to improve the performance of systems by improving human
machine interaction’ (Bridger, 2003, 1). Interestingly, although the term was employed in
Ancient Greece, its first modern usage dates back to the 1857 treatise ‘The Outline of
Ergonomics, i.e. Science of Work, Based on the Truths Taken from the Natural Science’,
authored by a Polish scientist, Wojciech Bogumił Jastrzębowski (Kiran, 2020, 220).
In light of this, it can be postulated that ergonomics is highly relevant when it comes to
discussing both working conditions and interpreter training. In addition, one can also view
the human–machine interaction that takes place between interpreters and technology from
the perspective of user experience and usability. As Law et al. (2008) state, the notion of
user experience is both ‘elusive’ and ‘hard to define’. Norman and Nielsen (1998) note how
‘“user experience” encompasses all aspects of the end-user’s interaction with the company,
its services, and its products’. Meanwhile, Alben (1996, 12) stresses that experience can be
understood to be
all the aspects of how people use an interactive product: the way it feels in their
hands, how well they understand how it works, how they feel about it while they’re
using it, how well it serves their purposes, and how well it fits into the entire context
in which they are using it.
Usability, in turn, can be defined as ‘a quality attribute that assesses how easy user interfaces
are to use’ (Nielsen, 2012). Furthermore, the International Organization for Standardization (ISO, 2018) defines usability as ‘the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context
of use’. Moreover, Nielsen (2012) distinguishes five quality components of usability:
1. Learnability: How easy is it for users to accomplish basic tasks the first time they encoun-
ter the design?
2. Efficiency: Once users have learned the design, how quickly can they perform tasks?
3. Memorability: When users return to the design after a period of not using it, how easily
can they reestablish proficiency?
4. Errors: How many errors do users make, how severe are these errors, and how easily can
they recover from the errors?
5. Satisfaction: How pleasant is it to use the design?
These components overlap with the principles of ‘universal design’ (UD), a term which,
in itself, possesses multiple definitions (see Dolph, 2021). The classical definition of UD
postulates that it is ‘the design of products and environments to be usable by all people, to
the greatest extent possible, without the need for adaptation or specialized design’ (Mace,
1985). The aforementioned principles include ‘equitable use’, ‘flexibility in use’, ‘simple
and intuitive use’, ‘perceptible information’, ‘tolerance for error’, ‘low physical effort’, and
‘appropriate size and space for approach and use’ (Connell et al., 1997).
Lastly, the concept of ‘access’ refers to ‘efforts . . . to reform architecture and technology
to address diverse human abilities’ (Williamson, 2015, 14). Accessibility, in turn, can be
defined as a situation in which ‘[a]ll people, particularly disabled and older people, can use
websites in a range of contexts of use, including mainstream and assistive technologies; to
achieve this, websites need to be designed and developed to support usability across these
contexts’ (Petrie et al., 2015, 2). Although this definition refers to web design, it can easily
be extended to other domains (Choi and Seo, 2024).
[I]n all my 20 years’ experience as a freelance interpreter on the local market, I have
yet to encounter a booth which complies with most (if any) of these hypothetical
requirements, having had to sit on broken dining room chairs, in booths without
working lighting, in cramped conditions with no ventilation (or the option of a noisy
ventilator) at the height of summer, etc.
(2015, 66)
for consecutive interpreting has been identified, in the aforementioned papers, as also being
beneficial for blind interpreters.
However, these developments also present a host of challenges relating to ergonomics
and accessibility. For example, while it should be applauded that interpreters now have
increased opportunities to customise their own workspaces, event organisers and technicians should consistently guarantee baseline working conditions of adequate quality that do not require equipment modification on the part of interpreters. In other words, headsets,
booths, and other factors, including connectivity, should comply with ISO norms, and
interpreters should not be forced to take action to guarantee these equipment standards
themselves. That being said, if they choose to do so, interpreters should be made aware of
the range of options available for improving their work.
Nevertheless, there are modes and methods of interpreting where challenges have
remained largely unresolved for decades. One example is chuchotage, where interpretation
is whispered to a small audience. Issues pertaining to the ergonomics of such assignments
result both from the interpreter’s body position and from the act of whispering itself (see
Baxter, 2015). In addition, there is the option of using portable interpreting equipment (PIE) (Porlán Moreno, 2019; Korybski, this volume), also known as ‘infoport systems’ or
tour guide sets. These devices usually employ FM radio frequencies to transmit voice over
short distances and are employed for simultaneous interpreting without a console, or in
Baxter’s (2015) words, in ‘boothless simultaneous’ contexts. However, very often, inter-
preters’ working conditions in such settings are deemed ‘poor’ due to the lack of space and
the absence of the soundproof environment typically found in a booth. It is for this reason
that, in such circumstances, it is even more important to observe guidelines, such as those
offered by the European institutions (EC, 2021).
comprehensive presentation and digest of the most modern forms of CAI technology, see
also Prandi, this volume).
However, these tools have graphical interfaces, which suggest potential problems with
accessibility. While accessibility of CAT tools is already the subject of academic interest
(see Figiel, 2018; Rodríguez Vázquez and Mileto, 2016), accessibility of CAI solutions is
an issue that has not been addressed until now. What follows is inspired by the author’s
personal professional experience with regards to the accessibility of CAI tools.
In general, CAI tools are not accessible. For example, InterpretBank has been deemed
‘inaccessible on all levels’ for screen reader users (Hof, 2021). This view is confirmed by Pol-
ish accessibility expert and well-known blind technology podcaster Piotr Machacz. Machacz notes that the InterpretBank software is not accessible, regardless of platform, because its developers built it on the cross-platform Tk widget toolkit, which is notorious within the blind community for its inaccessibility. The web-based AI module for InterpretBank does not appear usable
for blind people either. However, although far from achieving perfect levels of accessibility,
screen reader users are able to access the main features of another CAI tool that continues
to be updated, Interpreters’ Help (see Hof, 2021). To the best of this author’s knowledge, no
tests for compliance with accessibility standards in relation to CAI tools have been reported in
academic publications thus far (besides SIDP tests that are referred to in subsequent sections
in this chapter). Therefore, as with CAT tools a decade ago, it is imperative that the question
of accessibility be studied in greater depth with regards to CAI tools.
With this in mind, as attested by the author’s correspondence on ‘The Round Table mail-
ing list for blind translators and interpreters’ (The Round Table, n.d.), there has been no
uptake of CAI tools among blind interpreters. As blind interpreters use several channels
for accessing information (e.g. tactile channels via Braille display, and auditory via speech
synthesis), the cognitive challenges related to CAI tools are only aggravated for blind inter-
preters. However, it can be posited that blind interpreters make ideal testers for how acces-
sible any potential solutions could be. This is because blind interpreters are more likely to
be able to identify any barriers to access that have been created by the dominance of visual
access to information. Even if these barriers could be mitigated by using accessible software
(which is not currently the case), they still present themselves in an acute form to this group.
Consequently, blind interpreters may be able to provide an important contribution to user
interface research, helping developers generate user interfaces that can be characterised
by high usability. Regardless of the existence (or lack thereof) of potential advantages of working alongside blind interpreters on improving user interfaces, the decision to provide accessible solutions should not be motivated primarily by profit (or lack thereof). One
should consider instead the fact that individuals with disabilities should be able to enjoy
equal access to modern technologies, in order for their work to be competitive. This human
rights–based approach is described in the UN Convention on the Rights of Persons with
Disabilities (United Nations, 2006) and forms the core of the EU’s Web Accessibility Directive
(EU, 2016), as well as the European Accessibility Act (EU, 2019). It has to be stressed that
this growing body of legislation is gradually starting to be taken on board not only by the
public sector but also by the private sector.
but work from a screen and earphones without a direct view of the meeting room or the
speaker’ (Mouzourakis, 2006, 46). Other scholars postulate that DI ‘is a specific method of
(conference) interpreting and covers a variety of scenarios of a speaker at a different loca-
tion from that of the interpreter, enabled by information and communication technology’
(Ziegler and Gigliobianco, 2018, 128).
Experiments with RSI have been carried out since the 1970s (see Moser-Mercer, 2005;
Mouzourakis, 2006; Ziegler and Gigliobianco, 2018). However, it was only at the turn of
the century that technology allowed for (relatively) seamless audio and video streaming.
This resulted in a series of studies regarding the feasibility of RSI. These early tests showed
numerous complaints from interpreters about the physiological (sore eyes, back and neck
pain, headaches, nausea) and psychological (loss of concentration and motivation, feeling
of alienation) aspects of RSI (Mouzourakis, 2006, 52–53). Researchers at that time sug-
gested that these adverse consequences of RSI were related to interpreters experiencing a
lack of sense of presence (Moser-Mercer, 2005; Mouzourakis, 2006). However, it is worth
noting that, at that time, eyesight was considered to be one of the most important metrics
in these studies. To illustrate, one such study, conducted by the Directorate General for
Interpretation and Conferences of the European Parliament, suggested a statistically sig-
nificant difference in terms of such parameters as fatigue, physical discomfort, motivation,
and feeling of participation in the meeting for people wearing eyeglasses. The authors con-
cluded that ‘it seems that people wearing glasses had to strain their eyesight more than their
counterparts to extract the information they needed to carry out their job. . . . [T]heir visual
defect put them on an unequal footing with their colleagues’ (EPID, 2001, 30).
As technology developed further and bandwidth increased, quality of sound and video
in RSI steadily improved (Ziegler and Gigliobianco, 2018). Seeber et al. (2019, 300) even
go as far as to argue that ‘with the progress achieved in technology, [RSI] is no longer
perceived as more stressful than in-situ interpreting or as being detrimental to the quality
interpreters are able to provide’. And yet, as the authors observe elsewhere, ‘conference interpreters’ acceptance of some of these new professional paradigms has been considerably slower than the development of the technology making them possible’ (2019, 271).
However, it seems that, just as with previous major technological breakthroughs, there is a
growing acceptance of RSI (see, for example, Saeed et al., 2022; Mahyub Rayaa and Mar-
tin, 2022; or Salaets and Brône, 2023).
It goes without saying that the COVID-19 pandemic played a crucial role in the grow-
ing acceptance of RSI (see Chmiel and Spinolo, this volume). In a very short period of
time, almost all interpreters were forced to switch to working online. While the pandemic
caused disruption, it also provided opportunities that were hitherto inaccessible for many
interpreters, who had previously worked in a face-to-face context. With the evolution of
RSI, interpreters now had the potential to work abroad without travelling, improve their
work–life balance, or work on more assignments per day (Mahyub Rayaa and Martin,
2022; Salaets and Brône, 2023).
However, one could argue that the growth of DI has made issues related to ergonomics
even more acute. We understand ergonomics to be connected to human–machine interac-
tion, and elements of interpreting, such as handover or collaboration between interpreters,
were previously considered as examples of human-to-human interaction in the majority of
cases. However, with the increasing move towards RSI, professional freelancers now find
themselves able to work from separate locations (Salaets and Brône, 2023). As a result,
these examples of interaction have become machine-mediated activities. What is more,
since the pandemic and the rise of RSI, many tasks that were previously handled by techni-
cians are now dealt with by the interpreters themselves.
Seresi and Láncos (2023) identified three major challenges for interpreters working
remotely during the initial phase of the COVID-19 pandemic. These include listening
to their boothmate, handover, and assisting a virtual booth partner. Mahyub Rayaa and
Martin (2022), in turn, include coordination with a booth partner and auditory and cog-
nitive fatigue in this context. Solving these challenges has required ingenious solutions
and a range of new competencies and skills on the part of interpreters. This is because, in most cases, interpreters had to use tools and software that had not been
designed with professional conference interpreters in mind. To illustrate, most simulta-
neous interpreting delivery platforms (SIDPs), as well as mainstream videoconferencing
platforms, did not offer effective or efficient ways of providing interpreter-to-interpreter
communication. Consequently, many interpreters resorted to using a second device to
communicate with each other (Przepiórkowska, 2021; Mahyub Rayaa and Martin, 2022;
Seresi and Láncos, 2023). Some even used three devices at a time (Salaets and Brône,
2023). Solving these challenges has had a direct impact on the ergonomics of an interpret-
er’s workstation. This further complicates workflows and requires interpreters to invest
in their own equipment.
In terms of accessibility, handover represents a major challenge for interpreters. For
blind interpreters in particular, handover has always required a more direct form of con-
tact, such as touching a colleague’s arm. However, in online settings, this is not feasible.
Blind interpreters are unable to communicate handover via the standard chat features, as
this would require additional cognitive effort, in the form of listening out for a spoken
reply from a colleague via speech synthesis, or by reading messages on a Braille display.
As a result, blind interpreters working remotely tend to set up an audio conversation on
a separate device and use spoken instructions to communicate handover. And yet it has to
be stressed that RSI, if provided in an accessible way, can represent a huge opportunity for
blind people, who may have previously experienced problems with travelling to a venue or
moving around independently (Figiel, 2017). Likewise, preparatory materials delivered on
paper could be challenging for them (Dold, 2016). Although by no means insurmountable
(see Dold, 2016), these challenges are eliminated by RSI. In addition, online tools can also
facilitate speaker identification (Zoom has a keyboard shortcut to facilitate reading aloud
the active speaker’s name) or grant increased control over microphone status (Interactio has
ISO-compliant sounds for signalling muting/unmuting a microphone).
Overall, one could argue that, through their own agency and the work of professional
organisations, working conditions and online working standards for conference interpret-
ers during the pandemic were maintained (Mahyub Rayaa and Martin, 2022). One excep-
tion to this pattern is the reduced sound quality that occurred as use of online platforms
increased (Hynes, 2021)1. Not only does poor sound quality present a health hazard for
interpreters, but it can also increase simultaneous interpreters’ cognitive load (Seeber and
Pan, 2022). This caused numerous interpreters to express concerns (Mahyub Rayaa and
Martin, 2022) and led to industrial action among the interpreters working for the European
Parliament (Sheftalovich, 2022).
The topic of ‘acoustic shock’ among interpreters had been receiving attention even before
the pandemic. Authors of a 2020 study commissioned by the International Association of
Conference Interpreters (AIIC) concluded that up to 67% of respondents were affected
by acoustic shock (AIIC, 2020). Other studies also confirm that this is a major issue. For
example, Hynes (2021) reports that 35% of respondents from her study complained about
experiencing hearing problems. As a result, the AIIC (2020) report presented a host of
recommendations for event organisers, conference interpreters, and other stakeholders on
how to prevent acoustic shock. These recommendations concentrate on raising awareness, on the need to provide adequate equipment (for both interpreters and conference participants), and on annual hearing tests for interpreters so that potential symptoms of acoustic shock can be addressed quickly. The commitment of the
AIIC in this field was reiterated in its 2022 resolution, which called for specific action to be
taken to protect interpreters’ health (AIIC, 2022).
These aforementioned challenges have led some researchers to try to develop SIDPs
themselves. These include SIDPs which offer specific features dedicated to teaching inter-
preting. One such example is the SmarTerp project (Rodríguez et al., 2021; Fossati, 2021;
Frittella, 2023). This project aims to build a comprehensive RSI system that includes an RSI
platform, a CAI tool based on ASR, a CAI tool for terminology preparation, and a peda-
gogical module (see SmarTerp, n.d.). It should also be noted that SmarTerp is fully commit-
ted to providing inclusive, accessible solutions for professionals and users.
Similarly, the VIP project (Corpas Pastor, 2021) aims to provide interpreters with tools
that can be used before, during, and after an assignment, as well as providing lifelong learn-
ing. Inspired by repeated, reported complaints about cumbersome SIDP interfaces, other
researchers (Rodríguez González et al., 2023; Saeed, 2024; Saeed et al., 2022) developed
a new, ergonomic interface. To this effect, following a series of focus groups and pilot tests, interpreter users were able to sample the resultant prototype of an ‘ergonomic SIDP’. This led the researchers to conclude that interpreters are indeed interested
in using a range of different interfaces, including more minimalist versions than those of
existing SIDPs.
However, studies conducted thus far rarely mention the non-visual parameters of SIDPs.
Examples include the ease with which one can use a keyboard or the availability of key-
board shortcuts or sounds relating to the status of certain parameters (such as microphone
on/off). A notable exception is Jawad’s (2022) MA dissertation, which was written within
the context of the SmarTerp project. Jawad carried out a series of usability tests with a
limited number of visually impaired interpreters and sign language interpreters. Results
following his intervention suggest that SmarTerp has achieved basic levels of accessibility,
both for visually impaired interpreters and for sign language interpreters.
However, there is a lack of information regarding the level of accessibility for visually
impaired interpreters of other SIDPs. The blind community widely acknowledges that Zoom is an ‘accessible platform’ (Roussey, 2023; Hof, 2021). The author’s per-
sonal experience as a professional interpreter confirms this but also suggests that Zoom is
unable to effectively inform a blind user about the current channel into which they are inter-
preting. There is also an individual testimony of a blind interpreter who managed to work
with KUDO and QuaQua (Hof, 2021). Despite successfully navigating the web interface,
the same interpreter observed that she was able to work faster and more efficiently after
connecting a hard console to KUDO.
In addition, the author’s personal experience suggests that there are still a lot of unmet
needs in terms of making many RSDPs (remote simultaneous delivery platforms) accessible.
However, the author would also like to note certain cases where accessibility was consid-
ered and provided reasonable results. For example, Interactio, which the author personally
tested in spring 2024, appears to provide an environment within which blind interpreters
can work. This environment includes keyboard shortcuts, and sound and voice feedback
regarding the status change of various parameters. Positive examples such as these only serve to emphasise how important and impactful it is to continue increasing accessibility within SIDPs, and to do so alongside a community of end users who can test improvements first-hand.
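To make this kind of non-visual feedback concrete, the following minimal sketch (in Python, using the pyttsx3 text-to-speech library) shows one way a platform client could announce microphone status changes audibly instead of, or in addition to, updating a visual indicator. The announce_mic_status() hook is hypothetical and is not taken from Interactio or any other platform.

```python
# Illustrative sketch only: spoken feedback on microphone status changes, so a
# blind interpreter does not have to locate a visual indicator on screen.
# pyttsx3 is a real offline text-to-speech library; the hook below is hypothetical.
import pyttsx3

engine = pyttsx3.init()


def announce_mic_status(muted: bool) -> None:
    """Speak the new microphone state as soon as it changes."""
    engine.say("Microphone muted" if muted else "Microphone live")
    engine.runAndWait()


# Hypothetical usage: called by the platform client whenever the state changes.
announce_mic_status(muted=False)
announce_mic_status(muted=True)
```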
It must be noted that interpreting researchers have only recently become interested in
usability as a research topic (e.g. Frittella and Rodriguez, 2022; Jawad, 2022; Frittella, 2023;
Rodríguez González et al., 2023; Saeed, 2024; Saeed et al., 2022). Taking into considera-
tion the growing complexity of the interpreters’ workstation and workflow, it is only natu-
ral that one should expect increased usability and user experience studies to be carried out
in this field in the future.
22.9 Conclusions
This chapter has summarised the challenges related to ergonomics in conference interpreting. It has alluded to the significant progress in ergonomics achieved over the past decades. However, conference interpreters continue to face ergonomic challenges to this day. Rather than reducing them, the move
to online and hybrid modalities of interpreting has, in fact, increased these challenges. As
a result, awareness must be raised among all stakeholders. The same applies to the ques-
tion of accessibility, which has received attention from scholars and practitioners only
recently. If the interpreting profession is to continue on a sustainable, inclusive pathway,
manufacturers and developers need to fully acknowledge the challenges related to both
ergonomics and accessibility and work alongside practitioners to resolve them. Trainers
of conference interpreters, on the other hand, must provide new generations with the
knowledge about working conditions, ergonomics, and accessibility that will allow them
to make conscious decisions to guarantee their service as conference interpreters for dec-
ades to come.
Note
1 Very few users are aware of the fact that Zoom offers a feature that allows speakers to eliminate
noise cancellation and other types of sound compression that contribute to acoustic shock. The
speaker needs to turn on the so-called “original sound for musicians”. To do that, go to audio set-
tings and select “Original Sound for Musicians” under the “Audio Profile” section. Make sure that
“High fidelity mode” and “Echo Cancellation” checkboxes are checked. Then, once the meeting
starts, turn on the “Original Sound for Musicians” button in the top-right corner of the screen.
References
AIIC, 2020. Acoustic Shocks Research Project: Final Report. URL https://aiic.org/uploaded/web/
Acoustic%20Shocks%20Research%20Project.pdf (accessed 27.7.2024).
AIIC, 2022. Global Threats to Interpreters’ Auditory Health: Symptoms of Damage to the Audi-
tory System of Conference Interpreters Reported by Interpreters Worldwide Since the Advent of
Widespread Recourse to Remote Simultaneous Interpretation Platforms and Videoconferencing
Systems. URL https://aiic.org/document/10559/CdP_Resolution_Auditory%20Health_FINAL_
Sept22.pdf (accessed 27.7.2024).
Alben, L., 1996. Quality of Experience: Defining the Criteria for Effective Interaction Design. Interac-
tions 3(3), 11–15.
Alonso-Bacigalupe, L., Romero-Fresco, P., 2024. Interlingual Live Subtitling: The Crossroads
Between Translation, Interpreting and Accessibility. Universal Access in the Information Society
23, 533–543. URL https://doi.org/10.1007/s10209-023-01032-8
Baigorri-Jalón, J., 2014. From Paris to Nuremberg: The Birth of Conference Interpreting. John Ben-
jamins, Amsterdam and Philadelphia.
Baxter, R.N., 2015. A Discussion of Chuchotage and Boothless Simultaneous as Marginal and Unor-
thodox Interpreting Modes. The Translator 22(1), 59–71. URL https://doi.org/10.1080/1355650
9.2015.1072614
Bridger, R.S., 2003. Introduction to Ergonomics. Routledge, London and New York.
Chernov, S., 2016. At the Dawn of Simultaneous Interpreting in the USSR: Filling Some Gaps in
History. In Takeda, K., Baigorri-Jalón, J., eds. New Insights in the History of Interpreting. John
Benjamins, Amsterdam and Philadelphia, 136–166.
Choi, G., Seo, J., 2024. Accessibility, Usability, and Universal Design for Learning: Discussion of Three
Key LX/UX Elements for Inclusive Learning Design. TechTrends. URL https://doi.org/10.1007/
s11528-024-00987-6
Connell, B.R., Jones, M.L., Mace, R.L., Mueller, J.L., Mullick, A., Ostroff, E., Sanford, J., Steinfeld,
E., Story, M., Vanderheiden, G., 1997. The Principles of Universal Design, Version 2.0. Center
for Universal Design, North Carolina State University, Raleigh, NC. URL https://design.ncsu.edu/
wp-content/uploads/2022/11/principles-of-universal-design.pdf (accessed 27.7.2024).
Corpas Pastor, G., 2017. VIP: Voice-Text Integrated System for Interpreters. In Esteves-Ferreira, J.,
Macan, J., Mitkov, R., Stefanov, O., eds. Proceedings of the 39th Conference Translating and the
Computer. Editions Tradulex, Geneva, 7–10.
Corpas Pastor, G., 2021. Language Technology for Interpreters: The Vip Project. In Chambers, D.,
Esteves-Ferreira, J., Macan, J.M., Mitkov, R., Stefanov, O., eds. Translating and the Computer 42.
Tradulex, Geneva, 36–49.
Corpas Pastor, G., Fern, F.M., 2016. A Survey of Interpreters’ Needs and Practices Related to Language
Technology. Technical Paper. URL www.researchgate.net/publication/303685153_A_survey_of_
interpreters%5C%27_needs_and_practices_related_to_language_technology (accessed 27.7.2024).
Defrancq, B., Fantinuoli, C., 2021. Automatic Speech Recognition in the Booth: Assessment of Sys-
tem Performance, Interpreters’ Performances and Interactions in the Context of Numbers. Target
33(1), 73–102.
Dold, D., 2016. Technology Tip for Blind Interpreters: The KNFB Reader App. URL https://theblindtranslator.wordpress.com/2016/03/29/technology-tip-for-blind-interpreters-the-knfb-reader-app/ (accessed 27.7.2024).
Dolph, E., 2021. The Developing Definition of Universal Design. Journal of Accessibility and Design
for All 11(2), 178–194. URL https://doi.org/10.17411/jacces.v11i2.263
Drechsel, A., Goldsmith, J., 2016. Tablet Interpreting: The Evolution and Uses of Mobile Devices in
Interpreting. In Lee-Jahnke, H., Forstner, M., eds. Proceedings of the 2016 CIUTI Forum. URL
www.academia.edu/36017504/Tablet_Interpreting_The_evolution_and_uses_of_mobile_devices_
in_interpreting (accessed 27.7.2024).
EC, 2021. PIE (Portable Interpreting Equipment) Technical Specifications. URL https://commission.
europa.eu/system/files/2022-01/technical-specifications-for-portable-interpreting-equipment_
2021_en.pdf (accessed 27.7.2024).
EPID, 2001. Report on Remote Interpretation Test: 22–25 January 2001 Brussels. URL www.europarl.europa.eu/interp/remote_interpreting/ep_report1.pdf (accessed 27.7.2024).
EU, 2016. Directive (EU) 2016/2102 of the European Parliament and of the Council of 26
October 2016 on the Accessibility of the Websites and Mobile Applications of Public Sector Bod-
ies. URL https://eur-lex.europa.eu/eli/dir/2016/2102/oj (accessed 10.9.2024).
EU, 2019. Directive (EU) 2019/882 of the European Parliament and of the Council of 17 April 2019
on the Accessibility Requirements for Products and Services. URL https://eur-lex.europa.eu/eli/
dir/2019/882/oj (accessed 10.9.2024).
Fantinuoli, C., 2017. Speech Recognition in the Interpreter Workstation. In Esteves-Ferreira, J.,
Macan, J., Mitkov, R., Stefanov, O., eds. Proceedings of the 39th Conference Translating and the
Computer. Editions Tradulex, Geneva, 25–40.
Fantinuoli, C., 2019. The Technological Turn in Interpreting: Challenges That Lie Ahead. BDÜ Con-
ference Translating and Interpreting 4.0, Bonn.
Figiel, W., 2017. Tożsamość I status tłumaczy z dysfunkcją wzroku (Unpublished PhD thesis).
University of Warsaw.
Figiel, W., 2018. Levelling the Playing Field with (In)Accessible Technologies? How Technological
Revolution Has Changed the Working Conditions of Blind Translators. Między Oryginałem a
Przekładem 24(3/41), 75–88. URL https://doi.org/10.12797/MOaP.24.2018.41.04
Figiel, W., 2024. Teaching Simultaneous Interpreting During the COVID-19 Pandemic: Technology,
Society, Access. In Biernacka, A., Figiel, W., eds. New Insights into Interpreting Studies: Technol-
ogy, Society and Access. Peter Lang, Berlin, 289–302.
Fossati, G., 2021. SmarTerp: Applying the User-Centred Design Process in a Computer-Assisted
Interpreting (CAI) Tool (Unpublished dissertation thesis). Universidad Politécnica de Madrid.
Frittella, F.M., 2023. Usability Research for Interpreter-Centred Technology: The Case Study of SmarTerp. Language Science Press, Berlin.
Frittella, F., Rodriguez, S., 2022. Putting SmarTerp to Test: A Tool for the Challenges of Remote Interpreting. INContext 2(2), 137–166.
García Oya, E., 2021. De las cabinas al entorno virtual: Didáctica de la interpretación simultánea en
línea sobrevenida. Estudios de Traducción 11, 147–155.
Goldsmith, J., 2017. A Comparative User Evaluation of Tablets and Tools for Consecutive Interpret-
ers. In Esteves-Ferreira, J., Macan, J., Mitkov, R., Stefanov, O., eds. Proceedings of the 39th Con-
ference Translating and the Computer. Editions Tradulex, Geneva, 41–50.
Guevara, N., 2021. Speech-to-Text Interpreting, Part 2: Some Practical Insights. Touch 29(3), 16–17.
Hamidi, M., Pöchhacker, F., 2007. Simultaneous Consecutive Interpreting: A New Technique Put to
the Test. Meta: Journal des traducteurs/Meta: Translators’ Journal 52(2), 276–289.
Hof, M., 2021. Learning and Working Online as a Visually Impaired Interpreter (Part 2). URL
https://aibarcelona.blogspot.com/2021/05/learning-and-working-online-as-visually_26.html?m=1
(accessed 27.7.2024).
Hynes, R., 2021. Assessing the Risk Factors for Hearing Problems Among Simultaneous Conference
Interpreters. Translation Ireland 21(1), 163–188.
International Ergonomics Association, n.d. What Is Ergonomics (HFE)? URL https://iea.cc/about/
what-is-ergonomics/ (accessed 27.7.2024).
ISO, 2016. ISO 20109:2016: Simultaneous Interpreting – Equipment – Requirements. International
Organization for Standardization, Geneva.
ISO, 2017. ISO/FDIS 20108: Simultaneous Interpreting – Quality and Transmission of Sound and
Image Input – Requirements (Under Development). International Organization for Standardiza-
tion, Geneva.
ISO, 2018. Ergonomics of Human-System Interaction – Part 11: Usability: Definitions and Concepts.
International Organization for Standardization, Geneva.
ISO, 2024a. ISO 17651–1:2024 – Simultaneous Interpreting – Interpreters’ Working Environment.
Part 1: Requirements and Recommendations for Permanent Booths. International Organization
for Standardization, Geneva.
ISO, 2024b. ISO 17651–2:2024 – Simultaneous Interpreting – Interpreters’ Working Environment.
Part 2: Requirements and Recommendations for Mobile Booths. International Organization for
Standardization, Geneva.
Jawad, R., 2022. SmarTerp: Applying the User Centred Design Process for Visually Impaired Inter-
preters and Sign Language Interpreters (Unpublished dissertation thesis). Universidad Politécnica de
Madrid.
Kiran, D.R., 2020. Work Organization and Methods Engineering for Productivity. Butterworth-
Heinemann, Oxford.
Korybski, T., Davitti, E., Orasan, C., Braun, S., 2022. A Semi-Automated Live Interlingual Com-
munication. Workflow Featuring Intralingual Respeaking: Evaluation and Benchmarking. LREC
2022: Thirteenth International Conference on Language Resources and Evaluation, 4405–4413.
URL https://aclanthology.org/2022.lrec-1.468/
Law, L.-Ch., Roto, V., Vermeeren, A., Kort, J., Hassenzahl, M., 2008. Towards a Shared Definition
of User Experience. Extended Abstracts Proceedings of the 2008 Conference on Human Factors
in Computing Systems, CHI 2008, Florence, Italy, 5–10.4.2008, 2395–2398. URL https://doi.
org/10.1145/1358628.1358693
Mace, R., 1985. Universal Design, Barrier Free Environments for Everyone. Designers West, November.
Mahyub Rayaa, B., Martin, A., 2022. Remote Simultaneous Interpreting: Perceptions, Practices
and Developments. The Interpreters’ Newsletter 27, 21–42. URL https://doi.org/10.13137/
2421-714X/34390
Mirek, J., 2021. Teaching Simultaneous Interpreting During the COVID-19 Pandemic: A Case Study.
New Voices in Translation Studies 23, 94–103.
Moser-Mercer, B., 1992. Banking on Terminology: Conference Interpreters in the Electronic Age.
Meta: Journal des Traducteurs/Meta: Translators’ Journal 37(3), 507–522. URL https://doi.
org/10.7202/003634ar
Moser-Mercer, B., 2005. Remote Interpreting: Issues of Multi-Sensory Integration in a Multilingual
Task. Meta 50(2), 727–738.
Mouzourakis, P., 2006. Remote Interpreting: A Technical Perspective on Recent Experiments. Inter-
preting 8(1), 45–66.
Nielsen, J., 2012. Usability 101: Introduction to Usability. URL www.nngroup.com/articles/
usability-101-introduction-to-usability/ (accessed 27.7.2024).
Norman, D., Nielsen, J., 1998. The Definition of User Experience (UX). URL www.nngroup.com/
articles/definition-user-experience/ (accessed 27.7.2024).
Orlando, M., 2010. Digital Pen Technology and Consecutive Interpreting: Another Dimension in
Note-Taking Training and Assessment. The Interpreters’ Newsletter 15, 71–86.
Petrie, H., Savva, A., Power, C., 2015. Towards a Unified Definition of Web Accessibility. Proceed-
ings of the 12th Web for All Conference on – W4A ’15, 1–13. URL https://doi.org/10.1145/
2745555.2746653
Porlán Moreno, R., 2019. The Use of Portable Interpreting Devices: An Overview. Revista Trad-
umàtica. Tecnologies de la Traducció 17, 45–58.
Przepiórkowska, D., 2021. Adapt or Perish: How Forced Transition to Remote Simultane-
ous Interpreting During the COVID-19 Pandemic Affected Interpreters’ Professional Prac-
tices. Między Oryginałem a Przekładem 27(4/54), 137–159. URL https://doi.org/10.12797/
MOaP.27.2021.54.08
Rodríguez, S., Gretter, S., Matassoni, M., Falavigna, D., Alonso, Á., Corcho, O., Rico, M., 2021.
SmarTerp: A CAI System to Support Simultaneous Interpreters in Real-Time. In Proceedings of
the Translation and Interpreting Technology Online Conference TRITON 2021. INCOMA Ltd.,
Shoumen, 102–109. URL https://doi.org/10.26615/978-954-452-071-7_012
Rodríguez González, E., Saeed, M., Korybski, T., Davitti, E., Braun, S., 2023. Reimagining the Remote
Simultaneous Interpreting Interface to Improve Support for Interpreters. In Ferreiro Vázquez, Ó.,
Correia, A., Araújo, S., eds. Technological Innovation Put to the Service of Language Learning,
Translation and Interpreting: Insights from Academic and Professional Contexts. Peter Lang, Ber-
lin, 227–246.
Rodríguez Vázquez, S., Mileto, F., 2016. On the Lookout for Accessible Translation Aids: Current
Scenario and New Horizons for Blind Translation Students and Professionals. Journal of Transla-
tor Education and Translation Studies (TETS) 1, 115–135.
The Round Table, n.d. URL http://lists.screenreview.org/listinfo.cgi/theroundtable-screenreview.org
(accessed 27.7.2024).
Roussey, B., 2023. Accessibility Features of Zoom and How to Make Zoom Meetings More Acces-
sible. URL www.accessibility.com/blog/accessibility-features-of-zoom-and-how-to-make-zoom-
meetings-more-accessible (accessed 27.7.2024).
Saeed, M.A., 2024. Exploring the Visual Interface in Remote Simultaneous Interpreting (PhD thesis).
University of Surrey [online]. URL https://openresearch.surrey.ac.uk/esploro/outputs/doctoral/
Exploring-the-visual-interface-in-Remote/99870866202346#file-0
Saeed, M.A., González, E.R., Korybski, T., Davitti, E., Braun, S., 2022. Connected Yet Distant: An
Experimental Study into the Visual Needs of the Interpreter in Remote Simultaneous Interpreting.
In Kurosu, M., ed. Human-Computer Interaction. User Experience and Behavior. HCII 2022. Lec-
ture Notes in Computer Science, vol. 13304. Springer, Cham. URL https://doi.org/10.1007/978-3-
031-05412-9_16
Salaets, H., Brône, G., 2023. “Working at a Distance from Everybody”: Challenges (and Some
Advantages) in Working with Video-Based Interpreting Platforms. The Interpreters’ Newsletter
28, 189–209. URL https://doi.org/10.13137/2421-714X/35556
Seeber, K., Pan, D., 2022. The Effect of Sound Quality on Attention and Load in Language Tasks.
ExLing 2022 Paris: Proceedings of 13th International Conference of Experimental Linguistics,
17–19.10.2022. Paris, France. URL https://doi.org/10.36505/ExLing-2022/13/0040/000582
Seeber, K.G., Keller, L., Amos, R., Hengl, S., 2019. Expectations vs. Experience: Attitudes Towards
Video Remote Conference Interpreting. Interpreting 21(2), 270–304.
Seresi, M., Láncos, P., 2023. Teamwork in the Virtual Booth – Conference Interpreters’ Experiences.
In Liu, K., Cheung, A., eds. Translation and Interpreting in the Age of COVID-19. Springer, Sin-
gapore, 181–196.
Sheftalovich, Z., 2022. European Parliament Interpreters Call Off Strike. URL www.politico.eu/article/european-parliament-interpreters-call-off-strike/ (accessed 27.7.2024).
SmarTerp, n.d. Tutorials. URL https://smarterp.me/tutorials/ (accessed 27.7.2024).
Szczygielska, M., Dutka, Ł., Szarkowska, A., Romero-Fresco, P., Pöchhacker, F., Tampir, M., Figiel,
W., Moores, Z., Robert, I., Schrijver, I., Haverhals, V., 2020. How to Implement Speech-to-Text
Interpreting (Live Subtitling) in Live Events: Guidelines on Making Live Events Accessible. ILSA
Project. URL https://repository.uantwerpen.be/docman/irua/1618b4/how_to_implement_speech_
to_text_interpreting_in_live_events_1.pdf (accessed 27.7.2024).
Tips for Vision-Impaired Users, n.d. URL www.nuance.com/products/help/dragon15/dragon-for-pc/
enx/dpg-cp/Content/Help/tips_for_vision_impaired_users.htm (accessed 27.7.2024).
United Nations, 2006. Convention on the Rights of Persons with Disabilities and Optional
Protocol. URL www.un.org/disabilities/documents/convention/convoptprot-e.pdf (accessed
10.9.2024).
van Egdom, G.W., Cadwell, P., Kockaert, H., Segers, W., 2020. A Turn to Ergonomics in Translator
and Interpreter Training. The Interpreter and Translator Trainer 14(4), 363–368. URL https://doi.
org/10.1080/1750399X.2020.1846930
Will, M., 2020. Computer Aided Interpreting (CAI) for Conference Interpreters. Concepts, Content
and Prospects. ESSACHESS-Journal for Communication Studies 13(25), 37–71.
Williamson, B., 2015. Access. In Adams, R., Reiss, B., Serlin, D., eds. Keywords for Disability Studies.
New York University Press, New York and London, 14–16.
Zhao, N., 2023. Use of Computer-Assisted Interpreting Tools in Conference Interpreting Training and
Practice During COVID-19. In Liu, K., Cheung, A., eds. Translation and Interpreting in the Age of
COVID-19. Springer, Singapore, 331–347.
Ziegler, K., Gigliobianco, S., 2018. Present? Remote? Remotely Present! New Technological
Approaches to Remote Simultaneous Conference Interpreting. In Fantinuoli, C., ed. Interpret-
ing and Technology. Language Science Press, Berlin, 119–139. URL https://doi.org/10.5281/
zenodo.1493299
INDEX
Note: Page numbers in italics indicate figures, bold indicate tables in the text, and references following
“n” refer to notes.
410, 416; advantages of 400; AI-based information retrieval model and 92; artificial intelligence and 158; -assisted consecutive interpreting 98–104, 235–236; -assisted modalities 97–98; CAI tools -based 99–101, 103, 130–136, 140, 172–173, 411, 415; computer-assisted consecutive interpreting and 94; machine translation and 99, 113, 123, 187, 193–194, 210–212, 214, 235–236, 239–240; and natural language processing 89, 115; remote simultaneous interpreting and 136; within second-generation CAI tools 411; in simultaneous conference interpreting 98, 192–193, 196, 202, 239–240
automatic speech translation (AST) 172, 210, 218
automation 196, 205, 321, 331, 333, 340; anxiety 333; automated metrics/methods 218, 317–320; CAI tools 126; forms of 22 (see also specific forms); full 18, 187, 193–194, 316; future 320; semi (see semi-automated workflows); telephone interpreting 22–23
AVIDICUS projects, Europe 30, 38, 41, 276, 277n3, 295, 297n4, 313, 340
Azarmina, P. 253

BabelDr 256
Bachelier, K. 251, 252
Baidu Translate 102
Baigorri-Jalón, J. 230
Bail for Immigration Detainees (BID) 284
Baker, M. 230
Balogh, K. 272, 273
Barbour, I. 232
Barik, H.C. 306, 312, 313
Baumgarten, S. 333
Baxter, R.N. 83, 86, 409–410, 411
Bell, Alexander Graham 70
Bennett 339
Berber-Irabien, D.-C. 138
Bergunde, A. 294
BERTScore 218, 317–318
Biagini, G. 131
Bidone, A. 83, 86
bidule interpreting 79–80, 82–89, 389–390
bilateral interpreting 12; automated 23; on-site 14, 23–24
billable time 74–75
Blackbox 157
BLEU 218, 317–319
BLEURT 317, 319
blind interpreters 411; Braille code for 410; CAI tools among 412; as conference interpreting teachers 417; of SIDPs 415; working remotely 414; work with KUDO and QuaQua 415–416
Boéri, J. 339
bottom-up quality assessment methods 311–316, 321
Boujon, V. 256
Bourgadel, C. 333
Bowker, L. 137, 337
Braille code 410, 412, 414
Braun, S. 11, 13, 16, 41, 44, 51–52, 159, 248, 250, 253, 267, 270–276, 329–330, 350, 390
British Refugee Council 284
Bu, X. 101
Bühler, H. 306
Buitrago Ciro, J. 337
Buján, M. 332

Cabrera Mendez, G. 17
Camayd-Freixas, E. 95
Carioli, G. 159, 166
cascading approach 212–214, 213
Cavents, D. 45
ChatGPT 22
Chatterjee, S. 338
Chaves, S.G. 54
Chen, N.S. 165
Chen, S. 102, 132, 133, 134, 137, 235, 236, 349, 352, 354
Chernov, G.V. 230
Chesterman, A. 328
Cheung, A.K.F. 54
Chitrakar, R. 95
Chmiel, A. 59, 61, 351
cloud-based interpreting 56
cognition 158, 348–349; 4EA 91, 353, 355–356; augmented 355–356, 357; cognitive effort 100, 116, 126, 128, 133, 349, 353–354; cognitive ergonomics 20–22, 354–355; on interpreting studies 349–353; tablet interpreting and 116–117; on technology 349–353; technology-enabled interpreting 350–351; technology-supported interpreting 351–352; telephone interpreting and 20–22; training and education 352–353
cognitive effort 100, 116, 126, 128, 133, 349, 353–354
cognitive load 94, 96, 102–103, 133, 233, 238–240, 309, 333, 349, 352, 371, 373, 389, 414; and cognitive effort 353–354; in computer-assisted consecutive interpreting 134; in distance interpreting 382–383; extraneous 383; intrinsic 383; remote simultaneous interpreting and 60–61; self-reported 58, 62; subjective 236; subtasks for measuring 400; in VMI 42, 46, 290
videoconference interpreting in 284–287, 291; video-mediated interpreting in 283, 285, 287–297
infoport systems see portable interpreting equipment
in-process knowledge work 399–401, 402
Integrated Services Digital Network (ISDN) 33–35, 37, 40, 56, 270
Interactio 415–417
interlingual respeaking 187, 188–190, 194, 196–197, 416; ILSA and 203; SMART project and 197–204, 206n7, 310, 316, 322n2; structure of 201, 203
International Association of Conference Interpreters (AIIC) 59, 85–86, 88, 236, 238, 306, 335, 365, 374, 414–415
international conference interpreting 230, 230–231, 239–240
International Ergonomics Association 408
International Labour Organization (ILO) 231, 236
International Organisation of Conference Interpreters 389
International Organization for Standardization (ISO) 408; accreditation, defined 369; certification, defined 369; development stages 364–366; distance interpreting 370, 373, 376, 378, 381–385; drafting standards 366–369; interpreting standards 369–371; ISO 13611:2014 369–370; ISO 17651–1:2024 374, 375–376, 381, 409; ISO 17651–2:2024 374, 376–377, 409; ISO 18841:2018 366, 368, 370; ISO 20108:2017 377–378, 381–382; ISO 20109:2016 36, 371, 377–378, 380–382, 409, 410; ISO 20228:2019 370; ISO 20539:2019 379, 380; ISO 21998:2020 370; ISO 22259:2019 378, 379–380, 417; ISO 23155:2022 (see ISO 23155:2022); ISO 24019:2022 52, 371, 378–380, 382; ISO 2603:2016 375, 377; ISO 4043:2016 374, 376, 377; standardisation, defined 369; stylistic recommendations 368–369; technical standards 374–381
International Telecommunication Union (ITU) 33, 35, 41, 239, 279
International Workshop on Spoken Language Translation (IWSLT) 210–211, 319
Interplex 125, 157, 329, 390, 396
InterpretBank 100, 126, 131–132, 135, 152, 157–158, 329, 393, 396, 412
interpreter-mediated phone calls 71–73
Interpreters’ Help 390, 396, 412
Interpreters’ Pool Project 285
InterprIT 157
Intragloss 329
InTrain 159, 161, 166, 174, 174n9; development 160–162; suitability 162–164; usability 162–164
intralingual respeaking 189–190, 195, 201, 316, 416; machine translation and 192, 196–197, 310; semi-automated 315; simultaneous interpreting and 191–192, 196, 202–203
InZone project 165
ISO 13611:2014 369–370
ISO 17651–1:2024 374, 375–376, 381, 409
ISO 17651–2:2024 374, 376–377, 409
ISO 18841:2018 366, 368, 370
ISO 20108:2017 377–378, 381–382
ISO 20109:2016 36, 371, 377–378, 380–382, 409, 410
ISO 20228:2019 370
ISO 20539:2019 379, 380
ISO 21998:2020 370
ISO 22259:2019 378, 379–380, 417
ISO 23155:2022 371, 373, 374, 379, 382; clauses of 371–372; old and new ideas 373–374; three-layer model underlying 372–373, 372
ISO 24019:2022 52, 371, 378–380, 382
ISO 2603:2016 375, 377
ISO 4043:2016 374, 376, 377
ISO/IEC Directives 364, 368
IVY project 157

Jastrzębowski, W. B. 408
Jawad, R. 415
Joseph, C. 253

Kade, O. 181
Kajzer-Wietrzny, M. 158
Kak, A. 336
Kalina, S. 82, 308, 313
Keating, E. 70
Keiser, W. 80
Kellet Bidoli, C.J. 148–149, 151
Kelly, N. 17, 19
Kenny, D. 341
Kiraly, D. 139
Klammer, M. 254, 339
knowledge work 394–396, 394, 398, 401; digitalisation of 390–391; in-process 399–401, 402; peri-process 399–401, 402; post-process 399–401, 402; pre-process 399–401, 402; primary 394, 394; secondary 394, 394
Ko, L. 165
KoBo Inc. 296
Kopczyński, A. 306
Korybski, T. 315
Kruger, J.-L. 102, 132, 133, 134, 137, 235, 236, 352
KUDO AI Speech Translator 126, 136, 216, 318, 415–416
Kunin, M. 286
Kurz, I. 306

Lancos, P. 414
large language models (LLMs) 18, 104, 210, 214, 218, 220–221, 235, 319, 330, 337, 343, 356–357, 392
latency 115, 135, 213–214, 217
Law, L.-Ch. 408
Lazaro Gutierrez, R. 17, 20–21, 23
League of Nations (LoN) 231, 236, 388
Lederer, M. 230
Lee, K.M. 53, 100, 101
Lee, R.G. 275
legal interpreting 33, 265, 290–291, 312–313, 340, 351; benefits of using technologies in 267–268; ethnographic research in 44; frameworks and codes regulating technology and 268; guidelines and training resources 276; interpreter’s role 274–275; interpreting mode, change in 272; new working space adaptation 271–272; physiological impact 274; quality in 273; rapport building 272, 274; re-distributed legal system 271; remote simultaneous interpreting and 268, 271, 273; sound quality 270–271; standard 370; strategies 275–276; technology and equipment quality in 270–271, 273; telephone 265–266, 268, 270, 273; turn-taking and interaction management in 273–274; videoconferencing technology in 37–39, 266–267, 268, 270–276; video-mediated 37–38, 40–42, 265, 274; video relay service 267–268, 275; video remote interpreting and 268
Lewis, D. 335
Li, H.Y. 101
Li, X. 319
Li, Y. 253
Li, Z. 138
liaison organisation 364–365
Licoppe, C. 272, 273, 274
Lion, K.C. 252
Liu, H. 352
Liu, J. 21, 22
Livescribe 96, 146–149, 151, 234, 241n6
Llewellyn-Jones, P. 275
Logitech 146
Lombardi, J. 95
Lookup 390
Lu, X. 318

Ma, Z. 95
Machacz, P. 412
machine interpreting (MI) 23, 79, 91, 140, 210–211, 329, 331, 339, 356, 384; cascading approach 212–214, 213; cultural and communicative challenges in 216; end-to-end approach 215; ethical issues in 218–220, 332–335, 343; future trajectory of 220–221; history 211–212; linguistic challenges in 215–216; measures in evaluation of 319–320; misuse 219; overuse 219; post-editing 63; quality 217–218; technical challenges in 217; underuse 219
machine translation (MT) 23, 98, 102, 109, 132, 182, 216, 230, 315, 330–331, 392–393, 416; automatic speech recognition and 99, 113, 123, 187, 193–194, 210–212, 214, 235–236, 239–240; in consecutive conference interpreting 235–236; ethical considerations 338; evaluation metrics in 317–319; generic tools 248; in healthcare 256, 258; human version of 295; intralingual respeaking and 192, 196–197, 310; neural 22, 212–214, 335; on-demand 99; in simultaneous conference interpreting 239–240
machine translation quality estimation (MTQE) 317, 319
Magalhães, E. 86
Mager, M. 338
Magnuski, E. H. 81
Mahyub Rayaa, B. 414
Martin, A. 414
MASIT 232, 241n4
MATRIC project 197, 206n8, 310, 322n1
Maxell 146
Měchura, M. 339
media accessibility (MA) 185–186, 203–204
MediBabble 256
Megone, C. 328, 341
Mellinger, C. 21, 127, 128, 147, 351, 355
Merlini, R. 157
METEOR 317–319
Meyer, B. 254–255
Microsoft Azure AI Speech 101, 334–335
Mielcarek, M. 95, 97
Miler-Cassino, J. 273, 274, 275
Milošević, J. 355
Mintz, D. 14
Mirus, G. 70
Moleskine 152
Monteolivia-Garcia, E. 269
Monzo-Nebot, E. 339
Moodle 157
Moorkens, J. 335
Moser-Mercer, B. 15, 20, 42, 45, 60, 61, 114, 274, 313, 350, 410
Moser, P. 306, 307
simultaneous-consecutive interpreting 94–97, 102, 112–113, 233–234
simultaneous interpreting (SI) 1, 34, 44, 87, 152, 160, 229, 272, 348, 350, 389–391, 403, 409–411; artificial intelligence and 188; ASR assisted 98, 192–193, 196, 202, 239–240; boothless 86; booths, microphones, and headsets in 237–238; CAI tools and 92; cognition and 350–351; distance interpreting in 239; electronic glossaries in 238; equipment for 377–378; IBM system enabled 156; at international conferences 236–237; interpreters’ working environment for 375–377; and intralingual respeaking 191–192, 196, 202–203; machine translation in 239–240; mobile booths for 376; on-site 87, 311; permanent booths for 375; phases and subtasks of 399–400; portable interpreting equipment in 83–84; real SI 86; simultaneous interpreting 2.0 190; sound and image input, quality and transmission of 377; speech recognition in 239–240; tablet in 109–112; teaching of 417; see also computer-assisted simultaneous interpreting; remote simultaneous interpreting; simultaneous interpreting delivery platforms
simultaneous interpreting delivery platforms (SIDPs) 52, 414, 416, 417; accessibility 414; blind interpreters and 415; ergonomic 415; ISO standards 371, 374, 378–379, 380, 382, 385; sign language interpreting and 415
Singureanu, D. 40, 272
Skaaden, H. 294
Skinner, R. 249, 251, 275
SLI e-learning programme 165
SmarTerp&Me 126
SmarTerp 136–137, 152, 158–159, 171–174, 351, 354, 393, 415
smartpens 94, 112, 145, 149, 153; benefits 97; features 96, 147, 151–152; in interpreter education 150–151; Livescribe 96, 146–149, 151, 234, 241n6; Moleskine 152; Neo models 151; note-taking 148; at present 151; simultaneous-consecutive with 97; SyncPen 152
SMART project 197–204, 206n7, 310, 316, 322n2
SMART-UP project 203–204, 206n11
Smith, L. 157
social presence 53, 54
social responsibility 329, 341–342
soft skills 156, 164, 174; development needs 172; teaching 171
software-adapted delivery (SAD) 191–192, 204
sound quality 163, 239, 378, 392, 409, 414; healthcare interpreting 253; immigration, asylum, and refugee settings 289, 292; legal settings 270–271; portable interpreting equipment 81–82, 85–86, 88–89; remote simultaneous interpreting 52–53, 58–60, 62; telephone interpreting 14; video-mediated interpreting 34, 40, 289
speaker involvement 338
Speechmatics 393
speech recognition (SR) 189, 191–193, 196, 201, 203; CAI tools 235; in consecutive conference interpreting 235–236; in simultaneous conference interpreting 239–240; see also automatic speech recognition
speech synthesis 110, 183, 211, 414
speech-to-speech translation 210–212, 319, 331
speech-to-text (STT) 109, 148, 152, 183, 204, 269, 309–310; broadcasting 349; intralingual 194, 316; live interlingual 187, 204; live STT 184–187, 189, 193, 200, 205; Microsoft Azure 101; real-time 182, 185, 191, 194; real-time interlingual 182, 184, 186–187, 188, 188, 194–195, 197; transcripts 89; translation 211, 214, 214, 218, 221
speech-to-text interpreting (STTI) 186, 189, 416
speech translation (ST) 91, 98–100, 102, 209–210, 221; automatic 172, 210, 218; challenges 215–217; free services 336; live 222; multilingual 211; real-time 239; simultaneous 213; speech-to-speech 210–212, 319, 331; see also machine interpreting
Spinolo, N. 11–13, 15–16, 21, 58, 59, 159, 166, 351
Stahl, J. 83
Standardisation see International Organization for Standardization
Stengers, H. 351
Stewart, C. 319
Stoll, C. 125
stress 95, 132, 274, 313, 350–351, 385; cognitive 124; management 275, 372; remote simultaneous interpreting 61; video relay service 73–74; working-memory 124
STUN protocol 160, 174n10
Sultanic, I. 15, 16, 17
Svejcer, A.S. 230
Svennevig, J. 254
Svoboda, S. 95, 97
SyncPen 152
Szarkowska, A. 201
tablet interpreting 84, 94, 99, 145, 147, 152–153, 230, 269, 352, 410; artificial intelligence and 109, 116–117; CAI tools and 115; and cognition 116–117; consecutive interpreting 109–111, 113, 115–116, 125, 234–235; defined 108; digital pens and 111, 114–115, 235, 348, 401; future directions 114–117; in hybrid interpreting modalities 112–113; in professional practice 109–110; research 110–113; simultaneous 109–112; and skills acquisition 115–116; training 113–114
Tasa-Fuster, V. 339
Taylor, J. 273, 274
Taylor, J.L. 248, 250
teamwork 58, 61, 332, 373, 391, 403–404
technologies 348–349; cognition on 349–353; in data, information, knowledge, and management 396–397; defined 388; -enabled consecutive interpreting (see specific entries); -enabled interpreting 350–351; in phases and subtasks 400–402; in semiotics 398–399; -supported interpreting 351–352; training and education 352–353; see also specific technologies
technology-mediated interpreting see distance interpreting
telecommunications device for the deaf (TDD) 13, 67
teleconference interpreting 11, 51, 330
telehealth 45, 248, 251, 258n1, 286, 339
telemedicine 248, 258n1
telephone interpreting (TI) 39, 41–42, 305, 312, 353, 382; advantages 13, 15; automation 22–23; CAI tools 21–22, 24; cognition and 20–22; coordination of discourse in 14, 15–17; defined 11, 12; disadvantages 14–15; ergonomics and 20–22; future avenues 18–24; in healthcare settings 34, 249–252, 254–257; in immigration, asylum, and refugee settings 283, 285, 286–289, 296; lack of visual context in 14–16; in legal settings 265–266, 268, 270, 273; multimodal input in 14–15; on-site interpreting and 13, 15–17, 23, 41–42, 252; quality and satisfaction 17; sound quality 14; technological workflows 18–19; training 19–20; VMI and 251–252
Telephone Interpreting Project (TIP) 37
teletypewriter (TTY) 13, 67
text telephones 13
text-to-speech (TTS) 148, 183, 213, 396
Thonon, F. 255
Tipton, R. 336, 342
Tiselius, E. 100
top-down quality assessment methods 310–312, 320–321
Translators without Borders 296
Translatotron 215
Trilling, B. 172
Tryuk, M. 328
TURN protocol 160, 175n11
turn-taking 14, 16, 43–44, 254, 273–274, 284, 292–294
Tymczyńska, M. 158
Tymoczko, M. 338

UN Convention on the Rights of Persons with Disabilities 181, 412
universal design (UD) 409
Unlu, C. 98, 99, 100, 132, 319

Van Cauwenberghe, G. 135
Van de Meer, J. 335
van Heuven, V.J. 319
Van Straaten, W. 255
Varde, S. 148–149, 151
Vargas-Sierra, C. 341
verbatim approach 98, 189–190, 204, 316
Verbmobil 211
Verdier, M. 273, 274
Verrept, H. 252
videoconference interpreting (VCI) 30, 44–45, 51, 236, 250, 370, 391–392, 398, 403; business and 402; COVID-19 pandemic 392, 402; generic 309; in immigration, asylum, and refugee settings 284–287, 291; legal 37–39, 266–267, 268, 270–276; video remote interpreting vs. 32–35, 37; see also Zoom
video-mediated dialogue interpreting 15, 30, 32, 34, 43
video-mediated interpreting (VMI) 11, 12–13, 15–16, 46, 258n3, 305, 312, 314, 340; adaptation strategies 44–45; AVIDICUS projects 30, 38, 41, 276, 277n3, 295, 297n4, 313, 340; challenges of 42–44; characteristics of 31–33; cognitive load in 42, 46, 290; in conference interpreting settings 35–37; COVID-19 pandemic 31–32, 34, 36, 38, 41, 45, 250–251; critical aspects of 41–42; defined 30, 31; in healthcare settings 38–41, 43, 45–46, 249–255, 257; historical evolution of 33–34; human factors in 41–42; in immigration, asylum, and refugee settings 283, 285, 287–297; interaction in 42–44; interpreter comprehension in 41–42; interpreting quality in 34–36, 41–42; in legal settings 37–38, 40–42, 265, 274; on-site interpreting and 15, 35, 38, 40–43, 252,