Corpora of different kinds can be used for different purposes in Translation
Studies. For example, parallel corpora are useful in exploring how an idea in
one language is conveyed in another language, thus providing indirect evidence
for the study of translation processes. Corpora of this kind are indispensable
for building statistical or example-based machine translation (EBMT) systems,
and for the development of bilingual lexicons and translation memories. Also,
parallel concordancing is a useful tool for translators.
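
As a rough illustration of what parallel concordancing involves, the sketch
below searches the source side of a sentence-aligned corpus and returns each
hit together with its aligned translation. The function name
parallel_concordance and the three sentence pairs are hypothetical and were
invented for this example; real concordancers work on much larger, carefully
aligned corpora.

```python
import re

def parallel_concordance(aligned_pairs, keyword):
    """Return (source, target) pairs whose source side contains `keyword` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [(src, tgt) for src, tgt in aligned_pairs if pattern.search(src)]

# Hypothetical sentence-aligned English-French pairs, invented for illustration only.
pairs = [
    ("The bank raised interest rates.", "La banque a relevé les taux d'intérêt."),
    ("We walked along the river bank.", "Nous avons marché le long de la rive."),
    ("She opened the window.", "Elle a ouvert la fenêtre."),
]

# Show every source sentence containing "bank" next to its translation.
for src, tgt in parallel_concordance(pairs, "bank"):
    print(f"{src}  |||  {tgt}")
```

Seeing the two hits side by side shows how one source word is rendered
differently depending on context, which is the kind of evidence a translator
looks for in a parallel corpus.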
Comparable corpora are useful in improving the translator's understanding of
the subject field and improving the quality of translation in terms of fluency,
correct choice of term and idiomatic expressions in the chosen field. They can
also be used to build terminology banks.

Translational corpora provide primary evidence in product-oriented Translation
Studies (see Section 14.3.2.1), and in studies of translation universals (see
Section 14.3.3). If corpora of this kind are encoded with sociolinguistic and
cultural parameters, they can also be used to study the socio-cultural
environment of translations (see Section 14.3.2.3).

Even monolingual corpora of source language and target language are of great
value in Translation Studies because they can raise the translator's linguistic
and cultural awareness in general and provide a useful and effective reference
tool for translators and trainees. They can also be used in combination with a
parallel corpus to form a so-called translation evaluation corpus that helps
translator trainers or critics to evaluate translations more effectively and
objectively.
This section explores the state of the art of corpus-based Translation Studies
on the Holmes-Toury map, that is, applied TS, descriptive TS and theoretical TS.
On the applied TS front, three major contributions of corpora include
corpus-assisted translating, corpus-aided translation teaching and training, and
development of translation tools. An increasing number of studies have
demonstrated the value of corpora, corpus linguistic techniques and tools in
assisting translation production, translator training and translation
evaluation. For example, Bernardini (1997) suggests that 'large corpora
concordancing' (LCC) can help students to develop 'awareness', 'reflectiveness'
and 'resourcefulness', which are said to be the skills that distinguish a
translator from unskilled amateurs. Bowker (1998: 631) observes that
'corpus-assisted translations are of a higher quality with respect to subject
field understanding, correct term choice and idiomatic expressions'. Zanettin
(1998) shows that corpora help trainee translators become aware of general
patterns and preferred ways of expressing things in the target language, get
better comprehension of source language texts and improve production skills;
Aston (1999) demonstrates how the use of corpora can enable translators to
produce more native-like interpretations and strategies in source and target
texts respectively; according to Bowker (2001), an evaluation corpus, which is
composed of a parallel corpus and comparable corpora of source and target
languages, can help translator trainers to evaluate student translations and
provide more objective feedback; Bernardini (2002b), Hansen and Teich (2002)
and Tagnin (2002) show that the use of a multilingual concordancer in
conjunction with parallel corpora can help students with 'a range of
translation-related tasks, such as identifying more appropriate target language
equivalents and collocations, identifying the norms, stylistic preferences and
discourse structures associated with different text types, and uncovering
important conceptual information' (Bowker and Barlow 2004: 74); Bernardini and
Zanettin (2004: 650) suggest that corpora be used in order to 'provide a
framework within which textual and linguistic features of translation can be
evaluated'. Finally, Vintar (2007) reports efforts to build Slovene corpora for
translator training and practice.
Corpora, and especially aligned parallel corpora, are essential for the
development of translation technology such as machine translation (MT) systems
and computer-aided translation (CAT) tools. An MT system is designed to
translate with minimal or no human intervention. MT systems have become more
reliable since the methodological shift in the 1990s from rule-based to
text-based algorithms which are enhanced by statistical models trained using
corpus data. Parallel corpora play an essential role in developing
example-based and statistical MT systems. Well-known MT systems include Systran,
Babelfish, World Lingo and Google Translation. MT systems like these are mainly
used in translation of domain-specific and controlled language, automated
'gisting' of online contents, translation of corporate communications, and
locating text or fragments requiring human translation. CAT tools are designed
to assist in human translation. There are three major types of CAT tools. The
most important type are translation memory and terminology management tools,
which can be used to create, manage and access translation memories (TMs) and
termbases. They can also suggest translation candidates intelligently in the
process of translation. A second type are localization tools, which are able to
distinguish program codes or tags from the texts to be translated (e.g. menus,
buttons, error messages etc.), or even better, turn program codes or tags into
what a program or webpage really looks like. Another type of tool is used in
audiovisual translation (e.g. subtitling, dubbing and voice-over). Major CAT
products include SDL Trados, Deja Vu, Transit, and Wordfast for TM and
terminology tools, Catalyst for software localization, Trados TagEditor for
webpage translation, and WinCap for subtitling. CAT tools have brought
translation into the industrial age, but they are useless unless translated
units and terminologies have been stored in translation memories and termbases.
This is where corpora come into the picture.
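
To make the idea of a translation memory suggesting candidates more concrete,
the following minimal sketch looks up the stored source segment most similar to
a new segment and returns its stored translation. It is an assumption-laden toy:
the function best_tm_match, the 0.7 threshold and the two stored units are
hypothetical, and the similarity score comes from Python's standard difflib
module rather than from the matching algorithm of any of the commercial tools
named above.

```python
from difflib import SequenceMatcher

def best_tm_match(segment, memory, threshold=0.7):
    """Return (score, source, target) for the closest stored unit, or None below threshold."""
    best = None
    for source, target in memory.items():
        # ratio() gives a similarity score between 0.0 and 1.0
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None

# Hypothetical translation units (source segment -> stored translation).
tm = {
    "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
    "The file could not be opened.": "Impossible d'ouvrir le fichier.",
}

match = best_tm_match("Click the Cancel button.", tm)
if match is not None:
    score, source, target = match
    print(f"fuzzy match on {source!r}: suggest {target!r}")
```

The sketch also shows why such tools depend on stored data: with an empty
memory, no segment can ever be matched.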

14.3.2 Descriptive Translation Studies

Descriptive Translation Studies (DTS) is characterized by its emphasis on the
study of translation per se. It answers the question of 'why a translator
translates in this way' instead of dealing with the problem of 'how to
translate' (Holmes 1972/1988). The target-oriented and empirical nature of the
corpus methodology is in perfect harmony with DTS. Baker (1993: 243) predicted
that the availability of large corpora of both source and translated texts,
together with the development of the corpus-based approach, would enable
translation scholars to uncover the nature of translation as a mediated
communicative event.

Corpus-based DTS has revealed its full vitality over the past decade, which will
be reviewed in this section in terms of its three foci: translation as a
product, translation as a process and the function of translation (Holmes
1972/1988).

14.3.2.1 Product-oriented DTS

Presently, corpus-based DTS has primarily been concerned with describing
translation as a product, by comparing corpora of translated and
non-translational native texts in the target language, especially translated
and native English. The majority of product-oriented translation studies
attempt to uncover evidence to support or reject the so-called translation
universal hypotheses (see Section 14.3.3).

As far as the English language is concerned, a large part of product-oriented
translation studies have been based on the Translational English Corpus (TEC),
which was built by Mona Baker and colleagues at the University of Manchester.
The TEC corpus, which was designed specifically for the purposes of studying
translated texts, consists of contemporary written texts translated into
English from a range of source languages. It is constantly expanded with fresh
materials, reaching a total of 20 million words by the year 2001. The corpus
comprises full texts from four genres (fiction, biography, newspaper articles
and in-flight magazines) translated by native speakers of English.
Metalinguistic data such as information about translators, source texts and
publishing dates is annotated and stored in the header section of each text. A
subcorpus of original English was specifically selected and is being modified
from the BNC to match the TEC in terms of both composition and dates of
publication.

Presently, the TEC is perhaps the only publicly available corpus of
translational English. Most of the pioneering and prominent studies of
translational English have been based on this corpus, and they have so far
focused on syntactic and lexical features of translated and original texts of
English. They have provided evidence to support the hypotheses of translational
universals, for example, simplification, explicitation, sanitization, and
normalization (see Section 14.3.3). For example, Laviosa (1998b) studies the
distinctive features of translational English in relation to native English (as
represented by the BNC), finding that translational language has four core
patterns of lexical use: a relatively lower proportion of lexical words over
function words, a relatively higher proportion of high-frequency words over
low-frequency words, a relatively greater repetition of the most frequent
words, and a smaller vocabulary frequently used (see Section 14.4 for further
discussion). This is regarded as the most significant work in support of the
simplification hypothesis of translation universals (see Section 14.3.3.2).
Olohan and Baker's (2000) comparison of concordances from the TEC and the BNC
shows that the that-connective with reporting verbs say and tell is far more
frequent in translational English. These results provide strong evidence for
syntactic explicitation in translated English (see Section 14.3.3.1), which,
unlike addition of explanatory information used to fill in knowledge gaps
between source text and target text readers, is hypothesized to be 'a
subliminal phenomenon inherent in the translation process' (Laviosa 2002: 68).
Olohan (2004) investigates intensifiers such as quite, rather, pretty and fairly
in translated versus native English fiction in an attempt to uncover the
relation between collocation and moderation, finding that pretty and rather,
and more marginally quite, are considerably less frequent in the TEC-fiction
subcorpus, but when they are used, there is usually more variation in usage,
and less repetition of common collocates, than in the BNC-fiction corpus.
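
Two of the lexical patterns attributed to Laviosa above can be approximated
with simple counts. The sketch below computes lexical density (the share of
tokens that are not function words) and the share of all tokens covered by the
most frequent word types. It is only an illustration under stated assumptions:
the function lexical_profile, the three-word function-word list and the toy
sentence are hypothetical, and the measures are rough proxies rather than
Laviosa's own operationalization.

```python
from collections import Counter

def lexical_profile(tokens, function_words, top_n=1):
    """Return two rough simplification indicators for a tokenized text."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    # Tokens that are not function words are treated as lexical (content) words.
    lexical = sum(1 for t in tokens if t not in function_words)
    # Tokens accounted for by the top_n most frequent word types.
    head = sum(freq for _, freq in counts.most_common(top_n))
    return {
        "lexical_density": lexical / len(tokens),
        "top_n_coverage": head / len(tokens),
    }

# Toy data; a real study would use a full function-word list or POS tagging.
FUNCTION_WORDS = {"the", "on", "and"}
tokens = "the cat sat on the mat and the dog sat on the rug".split()
profile = lexical_profile(tokens, FUNCTION_WORDS, top_n=1)
print(profile)
```

Under the simplification hypothesis, a translated corpus would be expected to
show lower lexical density and higher top-n coverage than a comparable native
corpus of similar size and composition.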

A number of corpus-based studies have explored lexical patterning in
translational language. For example, Kanter et al. (2006) identify new
universals characterizing the mutual overlaps between native English and
translated English on the basis of Zipf's Law (Zipf 1949). Øveras (1998)
explores the relationship between collocation and explicitation in English and
Norwegian novels, demonstrating how a collocational clash in the source text is
translated using a conventional combination in the target language. Kenny
(2001) studies the relationship between collocation and sanitization on the
basis of an English-German parallel corpus and monolingual corpora of source
languages. Baroni and Bernardini (2003) compare the bigrams (i.e. two-word
clusters) in a monolingual comparable corpus of native Italian texts and
translated articles from a geopolitics journal, concluding that:

Translated language is repetitive, possibly more repetitive than original
language. Yet the two differ in what they tend to repeat: translations show a
tendency to repeat structural patterns and strongly topic-dependent sequences,
whereas originals show a higher incidence of topic-independent sequences, i.e.
the more usual lexicalized collocations in the language. (Baroni and Bernardini
2003: 379)
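
The notion of repetition at the level of bigrams can be illustrated with a
simple count. The sketch below extracts two-word clusters from a token list and
reports the share of bigram occurrences that belong to a bigram type used more
than once. The function names and the toy input are hypothetical, and this
measure is an assumption made for illustration; it is not the procedure used by
Baroni and Bernardini.

```python
from collections import Counter

def bigrams(tokens):
    """Return the list of adjacent two-word clusters in a token list."""
    return list(zip(tokens, tokens[1:]))

def repeated_bigram_rate(tokens):
    """Share of bigram occurrences whose bigram type occurs more than once."""
    counts = Counter(bigrams(tokens))
    total = sum(counts.values())
    if total == 0:
        return 0.0  # fewer than two tokens, so there are no bigrams
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total

# Toy input: the bigram ("a", "b") occurs twice among four bigram occurrences.
print(repeated_bigram_rate("a b a b c".split()))
```

Comparing this rate across a translated and a native corpus of similar size
gives a first, crude indication of which is more repetitive; it says nothing
about what kind of sequence is being repeated, which is the point of the
quotation above.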

One interesting area of product-oriented translation research involves corpora
composed of multiple translations of the same source text for comparing
individual styles of translators. One such corpus is the Hong Lou Meng Parallel
Corpus, which is composed of the Chinese original and four English translations
of the classic Chinese novel Hong Lou Meng 'A Dream of Red Chamber'.

14.3.2.2 Process-oriented DTS

Process-oriented DTS aims at revealing the thought processes that take place in
the mind of the translator while she or he is translating. While it is
difficult to study those processes on-line, one possible way for corpus-based
DTS to proceed is to investigate the written transcripts of these recordings
off-line, which is known as Think-Aloud Protocols (or TAPs, see Bernardini
2002c). However, the process cannot be totally detached from the product.
Stubbs (2001a) draws parallels between corpus linguistics and geology, both
assuming a relation between process and product. A geologist is interested in
geological processes, which are not directly observable, but individual rocks
and geographical formations such as destr[...]. As Stubbs (2001a: 154) argues,
'By and large, the processes are invisible, and must be inferred from the
products.' The same can be said of translation: translation as a product can
provide indirect evidence of translation as a process. Hence, both types of
studies can be approached on the basis of corpus data. Process-oriented studies
are typically based on parallel corpora by comparing source and target texts
while product-oriented studies are usually based on monolingual comparable
corpora by comparing translated target language and native target language. For
example, Utka (2004) is a process-oriented study based on the
English-Lithuanian Phases of Translation Corpus. Quantitative and qualitative
comparisons of successive draft versions of translation have allowed him not
only to reject Toury's (1995) claim that it is impossible to use a corpus to
study the translation process, but also to report cases of normalization,
systematic replacement of terminology and influence by the original language.

Chen (2006) presents a corpus-based study of connectives, namely conjunctions
and sentential adverbials, in a 'composite corpus' composed of English source
texts and their two Chinese versions independently produced in Taiwan and
mainland China, plus a comparable component of native Chinese texts as the
reference corpus in the genre of popular science writing. This investigation
integrates product- and process-oriented approaches in an attempt to verify the
hypothesis of explicitation in translated Chinese. In the product-oriented part
of his study, Chen compares translational and native Chinese texts to find out
whether connectives are significantly more common in the first type of texts in
terms of parameters such as frequency and type-token ratio, as well as
statistically defined common connectives and the so-called translationally
distinctive connectives (TDCs). He also examines whether syntactic patterning
in the translated texts is different from native texts via a case study of the
five TDCs that are most statistically significant. In the process-oriented part
of the study, he compares translated Chinese texts with the English source
texts, through a study of the same five TDCs, in an attempt to determine the
extent to which connectives in translated Chinese texts are carried over from
the English source texts, or in other words, the extent to which connectives
are explicitated in translational Chinese. Both parts of his study support the
hypothesis of explicitation as a translation universal in the process and
product of English-Chinese translation of popular science writing.
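
Whether an item such as a connective is significantly more frequent in one
corpus than in another is commonly tested in corpus linguistics with a
log-likelihood statistic. The sketch below implements the two-corpus form of
that statistic as an illustration. The text above does not say which test Chen
used, so this choice is an assumption, and the counts passed to the function
are hypothetical.

```python
import math

def log_likelihood(freq1, size1, freq2, size2):
    """Log-likelihood (G2) for one item's frequency in two corpora."""
    total_freq = freq1 + freq2
    # Expected frequencies if the item were equally common in both corpora.
    expected1 = size1 * total_freq / (size1 + size2)
    expected2 = size2 * total_freq / (size1 + size2)
    g2 = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Hypothetical counts of one connective in two samples of 1,000 tokens each.
g2 = log_likelihood(100, 1000, 50, 1000)
print(round(g2, 2))
```

A value above 3.84 corresponds to p < 0.05 with one degree of freedom, so the
hypothetical difference above would count as significant. Identical relative
frequencies give a value of zero.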

14.3.2.3 Function-oriented DTS

Function-oriented DTS encompasses research which describes the function or
impact that a translation or a collection of translations may have in the
socio-cultural context of the target language, thus leading to the 'study of
contexts rather than texts' (Holmes 1972/1988: 72). There are relatively few
function-oriented studies that are corpus-based, possibly because the marriage
between corpora and this type of research, just like corpus-based discourse
analysis (e.g. Baker 2006), is still in the 'honeymoon' period.

One such study is Laviosa (2000), which is concerned with the
lexicogrammatical analysis of five semantically related words (i.e. Europe,
European, European Union, Union and EU) in the TEC corpus. These words are
frequently used in translated newspaper articles and can be considered as what
Stubbs (1996, 2001b) calls 'cultural keywords', or words that are important
from a socio-cultural point of view, because they embody social values and
transmit culture. In this case they reveal the image of Europe as portrayed in
data from translated articles in The Guardian and The European. Given that the
TEC is a growing multi-source-language corpus of translational English, Laviosa
(2000) suggests that it is possible to carry out comparative analyses between
Europe and other lemmas of cultural keywords such as Britain and British,
France and French, and Italy and Italian, and so on, which may point the way
towards corpus-based investigation into the ideological impact of translated
texts.

Similarly, Baker (2000) examines, on the basis of the fictional component of
the TEC corpus, three aspects of linguistic patterning in the works of two
British literary translators, that is, average sentence length, type/token
ratio, and indirect speech with the typical reporting verbs such as say. The
results indicate that the two translators differ in terms of their choices of
source texts and intended readership for the translated works. One translator
is found to prefer works targeting a highly educated readership with an
elaborate narrative which creates a world of intellectually sophisticated
characters. In contrast, the other chooses to translate texts for an ordinary
readership, which are less elaborate in narrative and concerned with emotions.
These findings allow Baker (2000) to draw the conclusion that it is 'also
possible to use the description emerging from a study of this type to elaborate
the kind of text world that each translator has chosen to recreate in the
target language' (cf. Kruger 2002: 96).
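
Two of the three measures mentioned above, average sentence length and
type/token ratio, are straightforward to compute. The sketch below does so with
a deliberately naive tokenizer and sentence splitter. The function
style_profile and the sample text are hypothetical, and the raw type/token
ratio used here is sensitive to text length, so the sketch assumes that the
texts being compared are of similar size.

```python
import re

def style_profile(text):
    """Return average sentence length (in tokens) and raw type/token ratio."""
    # Naive sentence split on ., ! and ?; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    if not sentences or not tokens:
        raise ValueError("text contains no sentences or tokens")
    return {
        "avg_sentence_length": len(tokens) / len(sentences),
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }

# Invented sample: 3 sentences, 8 tokens, 7 distinct word types.
sample = "He said no. She said yes. They left."
print(style_profile(sample))
```

Running the same function over each translator's output would give one
stylistic profile per translator, which can then be compared.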

Kruger (2000) examines whether the Afrikaans 'stage translation' of The
Merchant of Venice reveals more spoken language features signaling involvement
and interaction between the characters than a 'page translation'. She used an
analytical tool that would not only enable her to quantify linguistic [...]
involvement in four Shakespeare texts (the original and three translations)
[...] English. This type of investigation allows her to validate her
assumptions that different registers of translated drama have different
functions and therefore present information differently.

Masubelele (2004) examines the changes in orthography, phonology, morphology,
syntax, lexis and register of Zulu brought about by translation works. She
compares the 1959 and 1986 translations of the Book of Matthew into Zulu in a
translational corpus in order to research the role played by Bible translation
in the growth and development of written Zulu in the context of South Africa.
She finds that Toury's (1980) concept of the initial norm (i.e. the
socio-cultural constraints) 'seems to have guided the translators of these
translations in their selection of the options at their disposal' (Masubelele
2004: 201). The study shows 'an inclination towards the target norms and
culture': while translators of the 1959 version adopted source text norms and
culture, the translators of the 1986 version adopted the norms of the target
culture (Masubelele 2004: 201).

14.3.3 Theoretical Translation Studies

Theoretical Translation Studies aims 'to establish general principles by means
of which these phenomena can be explained and predicted' (Holmes 1988: 71). It
elaborates principles, theories and models to explain and predict what the
process of translation is, given certain conditions such as a particular pair
of languages or a particular pair of texts. Unsurprisingly it is closely
related to and is often reliant on the empirical findings produced by DTS.

One good battleground of using DTS findings to pursue a general theory of
translation is the hypothesis of so-called translation universals (TUs) and its
related sub-hypotheses, which are sometimes referred to as the inherent
features of translational language, or 'translationese'. It is a
well-recognized fact that translations cannot possibly avoid the effect of
translationese (cf. Hartmann 1985; Baker 1993: 243-5; Teubert 1996: 247;
Gellerstam 1996; Laviosa 1997: 315; McEnery and Wilson 2001: 71-2; McEnery and
Xiao 2002, 2007). The concept of TUs was first proposed by Baker (1993), who
suggests that all translations are likely to show certain linguistic
characteristics simply by virtue of being translations, which are caused in and
by the process of translation. The effect of the source language on the
translations is strong enough to make translated language perceptibly different
from the target native language. Consequently translational language is at best
an unrepresentative special variant of the target language (McEnery and Xiao
2007). The distinctive features of translational language can be identified by
comparing translations with comparable native texts, thus throwing new light on
the translation process and helping to uncover translation norms, or what
Frawley (1984) calls the 'third code' of translation.

Over the past decade, TUs have been an important area of research as well as a
target of debate in DTS. Some scholars (e.g. Tymoczko 1998) argue that the very
idea of making universal claims about translation is inconceivable, while
others (e.g. Toury 2004) advocate that the chief value of general laws of
translation lies in their explanatory power; still others (e.g. Chesterman
2004) accept universals as one possible route to high-level generalizations.
Chesterman (2004) further differentiates between two types of TUs: one relates
to the process from the source to the target text (what he calls
'S-universals'), while the other ('T-universals') compares translations to
other target-language texts. Mauranen (2007), in her comprehensive review of
TUs, suggests that the discussions of TUs follow the general discussion on
'universals' in language typology.

Recent corpus-based works have proposed a number of TUs, the best known of
which include explicitation, simplification, normalization, sanitization and
levelling out (or convergence). Other TUs that have been investigated include
under-representation, interference and untypical collocations (see Mauranen
2007). While individual studies have sometimes investigated more than one of
these features, they are discussed in the following sections separately for the
purpose of this presentation.

14.3.3.1 Explicitation

The explicitation hypothesis is formulated by Blum-Kulka (1986) on the basis of
evidence from individual sample texts showing that translators tend to make
explicit optional cohesive markers in the target text even though they are
absent in the source text. It relates to the tendency in translations to 'spell
things out rather than leave them implicit' (Baker 1996: 180). Explicitation
can be realized syntactically or lexically, for instance, via more frequent use
of conjunctions in translated texts than in non-translated texts (see Section
14.4.2.3 for further discussion), and additions providing extra information
essential for a target culture reader, thus resulting in longer text than the
non-translated text. Another result of explicitation is increased cohesion in
translated text (Øveras 1998). Pym (2005) provides an excellent account of
explicitation, locating its origin, discussing its different types, elaborating
a model of explicitation within a risk-management framework, and offering a
range of explanations of the phenomenon.

In the light of the distinction made above between S- and T-universals
(Chesterman 2004), explicitation would seem to fall most naturally into the
S-type. Recently, however, explicitation has also been studied as a
T-universal. In his corpus-based study of structures involving NP modification
(equivalent of the structure noun + prepositional phrase in English) in English
and Hungarian, Váradi (2007) suggests that genuine cases of explicitation must
be distinguished from constructions that require expansion in order to meet the
requirements of grammar. While explicitation is found at various linguistic
levels ranging from lexis to syntax and textual organization, 'there is
variation even in these results, which could be explained in terms of the level
of language studied, or the genre of the texts' (Mauranen 2007: 39). The
question of whether explicitation is a translation universal is yet to be
conclusively answered, according to existing evidence which has largely come
from translational English and related European languages (see Section 14.4 for
further discussion).

14.3.3.2 Simplification

Explicitation is related to simplification: 'the tendency to simplify the
language used in translation' (Baker 1996: 181-2), which means that
translational language is supposed to be simpler than native language,
lexically, syntactically and/or stylistically (cf. Blum-Kulka and Levenston
1983; Laviosa-Braithwaite 1997). As noted in Section 14.3.2.1, product-oriented
studies such as Laviosa (1998b) and Olohan and Baker (2000) have provided
evidence for lexical and syntactic simplification in translational English.
Translated texts have also been found to be simplified stylistically. For
example, Malmkjaer (1997) notes that in translations, punctuation usually
becomes stronger; for example commas are often replaced with semicolons or full
stops while semicolons are replaced with full stops [...] shorter and less
complex clauses in translations, thereby reducing structural complexity for
easier reading. Nevertheless, as we will see in Section 14.4.2.1, this
observation is likely to be language specific.


The simplification hypothesis, however, is controversial. It has been contested
by subsequent studies of collocations (Mauranen 2000), lexical use (Jantunen
2001), and syntax (Jantunen 2004). Just as Laviosa-Braithwaite (1996: 534)
cautions, evidence produced in early studies that support the simplification
hypothesis is patchy and not always coherent. Such studies are based on
different datasets and are carried out to address different research questions,
and thus cannot be compared.

14.3.3.3 Normalization

Normalization, which is also called 'conventionalization' in the literature
(e.g. Mauranen 2007), refers to the 'tendency to exaggerate features of the
target language and to conform to its typical patterns' (Baker 1996: 183). As a
result, translational language appears to be 'more normal' than the target
language. Typical manifestations of normalization include overuse of cliches or
typical grammatical structures of the target language, adapting punctuation to
the typical usage of the target language, and the treatment of the different
dialects used by certain characters in dialogues in the source texts.

Kenny (1998, 1999, 2000, 2001) presents a series of studies of how unusual and
marked compounds and collocations in German literary texts are translated into
English, in an attempt to assess whether they are normalized by means of more
conventional use. Her research suggests that certain translators may be more
inclined to normalize than others, and that normalization may apply in
particular to lexis in the source text. Nevalainen (2005; in Mauranen 2007: 41)
suggests that translated texts show greater proportions of recurrent lexical
bundles or word clusters.

According to Toury (1995: 208), it is a 'well-documented fact that in
translations, linguistic forms and structures often occur which are rarely, or
perhaps even never encountered in utterances originally composed in the target
language'. Tirkkonen-Condit's (2002: 216) experiment, which asked subjects to
distinguish translations from non-translated texts, also shows that
'translations are not readily distinguishable from original writing on account
of their linguistic features'.

14.3.3.4 Other translational universals


Kenny (1998) analyses semantic prosody in translated texts in an attempt to
find evidence of sanitization (i.e. reduced connotational meaning). She
concludes that translated texts are 'somewhat "sanitized" versions of the
original' (Kenny 1998: 515). Another translational universal that has been
proposed is the so-called feature of 'levelling out', that is, 'the tendency of
translated text to gravitate towards the centre of a continuum' (Baker 1996:
184). This is what Laviosa (2002: 72) calls 'convergence', that is, the
'relatively higher level of homogeneity of translated texts with regard to
their own scores on given measures of universal features' that are discussed
above.

'Under-representation', which is also known as the 'unique items hypothesis',
is concerned with the unique items in translation (Mauranen 2007: 41-2). For
example, Tirkkonen-Condit (2005) compared frequency and uses of the clitic
particle kin in translated and original Finnish in five genres (fiction,
children's fiction, popular fiction, academic prose and popular science),
finding that the average frequency of kin in original Finnish is 6.1 instances
per 1,000 words, whereas its normalized frequency in translated Finnish is 4.6
instances per 1,000 words. Tirkkonen-Condit interprets this phenomenon as a
case of under-representation in translated Finnish. Aijmer's (2007) study of
the use of the English discourse marker oh and its translation in Swedish shows
that there is no single lexical equivalent of oh in Swedish translation,
because direct translation with the standard Swedish equivalent ah would result
in an unnatural sounding structure in this language.
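
Normalized frequencies such as the per-1,000-word rates quoted above make
corpora of different sizes comparable. The sketch below shows the arithmetic.
The rates 6.1 and 4.6 come from the text; the raw counts and corpus sizes are
hypothetical and were chosen only to reproduce those rates, and the function
name per_thousand is likewise an assumption.

```python
def per_thousand(count, corpus_size):
    """Normalize a raw count to occurrences per 1,000 words."""
    return count * 1000 / corpus_size

# Hypothetical raw counts chosen to reproduce the rates quoted in the text.
original = per_thousand(610, 100_000)    # rate in original Finnish
translated = per_thousand(460, 100_000)  # rate in translated Finnish

# Relative frequency of the item in translations compared with originals.
ratio = translated / original
print(f"{original:.1f} vs {translated:.1f} per 1,000 words, ratio {ratio:.2f}")
```

On these figures the particle is roughly a quarter less frequent in translated
than in original Finnish, which is the size of the gap behind the
under-representation reading.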

14.4 Core Lexical Features of Translated Novels in Chinese

As can be seen in the discussion above, while we have followed the literature
in using the conventional term 'translation universal', the term is highly
debatable in Translation Studies. Since the translational universals that have
been proposed so far are identified on the basis of translational English,
mostly translated from closely related European languages, there is a
possibility that such linguistic features are not 'universal' but rather
specific to English and/or genetically related languages that have been
investigated. For example, Cheong's (2006) study of English-Korean translation
contradicts even the least controversial explicitation hypothesis.

As noted, research on the features of translated texts has so far been confined
largely to translational English translated from closely related European
languages (e.g. Mauranen and Kujamäki 2004). Clearly, if the features of
translational language that have been reported are to be generalized as
translation 'universals', the language pairs involved must not be restricted to
English and closely related languages. Therefore, evidence from genetically
distinct language pairs such as English and Chinese is undoubtedly more
convincing.

We noted in Section 14.3.2.2 that the explicitation hypothesis is supported by Chen's (2006) study of connectives in English-Chinese translations of popular science books. Nevertheless, as Biber (1995: 278) observes, language may vary across genres even more markedly than across languages. Xiao (2008) also demonstrates that the genre of scientific writing is the least diversified of all genres across various varieties of English. The implication is that the similarity reported in Chen (2006) might be a result of similar genre instead of language pair. Ideally, what is required to verify the English-based translation universals is a detailed account of the features of translational Chinese based on balanced comparable corpora of native Chinese and translated Chinese. This is what we are aiming at in our project A corpus-based quantitative study of translational Chinese in English-Chinese translation, which compares the Lancaster Corpus of Mandarin Chinese (LCMC, see McEnery and Xiao 2004) and its translational match in Chinese, the newly built ZJU Corpus of Translational Chinese (ZCTC, see Xiao, He and Yue 2008).

In this section, we will present a case study of Laviosa's (1998b) core features of lexical use in translational language (see Section 14.3.2.1) on the basis of a parallel analysis of the fiction categories in the LCMC corpus and a corpus of translated Chinese fiction.

14.4.1 The Corpora

The corpus data used in this case study are the five categories of fiction (i.e. general fiction, mystery and detective stories, science fiction, adventure stories and romantic fiction) in the LCMC corpus (LCMC-Fiction hereafter) for native Chinese, amounting to approximately 200,000 running words in 117 text samples taken from novels and stories published in China around 1991. The Contemporary Chinese Translated Fiction Corpus (CCTFC hereafter) is composed of over one million words in 56 novels published over the past three decades, with most of them translated and published in the 1980s and 1990s. These novels are mostly translated from English, while other source languages are also represented, including, for example, Russian, French, Spanish, Czech, German and Japanese.

14.4.2 Results and Discussions

This section presents and discusses the results of data analysis. We will first discuss the parameters used in Laviosa (1998b) in an attempt to find out whether the core patterns of lexical use that Laviosa observes in translational English also apply in translated Chinese fiction. We will also compare the frequency and use of connectives in translated and native Chinese.

14.4.2.1 Lexical density and mean sentence length

There are two common measures of lexical density. Stubbs (1986: 33; 1996: 172) defines lexical density as the ratio between the number of lexical words (i.e. content words) and the total number of words. This approach is taken in Laviosa (1998b). As our corpora are part-of-speech (POS) tagged, frequencies of different POS categories are readily available.
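Lexical density in this sense can be read straight off POS-tagged data. The following is a minimal sketch, assuming that the input is a sequence of (word, tag) pairs and that nouns, verbs, adjectives and adverbs count as lexical words. The tag labels in `LEXICAL_TAGS` are hypothetical placeholders, not the tagset actually used for the corpora discussed here, and the choice of which categories count as lexical is itself an assumption.

```python
from typing import Iterable, Tuple

# Assumption: these four coarse tags mark content words. A real tagset
# would need its own mapping from tags to lexical/grammatical classes.
LEXICAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_density(tagged_tokens: Iterable[Tuple[str, str]]) -> float:
    """Ratio of lexical (content) words to all words in a tagged text."""
    total = 0
    lexical = 0
    for _word, tag in tagged_tokens:
        total += 1
        if tag in LEXICAL_TAGS:
            lexical += 1
    if total == 0:
        raise ValueError("cannot compute lexical density of an empty text")
    return lexical / total


# Toy example: three content words out of six tokens.
sample = [
    ("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
    ("on", "ADP"), ("the", "DET"), ("mat", "NOUN"),
]
print(lexical_density(sample))  # 0.5
```

The result is a proportion between 0 and 1; multiplying by 100 gives the percentage form in which lexical density is often reported.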

The other approach commonly used in corpus linguistics is the type-token ratio (TTR), that is, the ratio between the number of types (i.e. unique words) and the number of tokens (i.e. running words). However, since the TTR is seriously affected by text length, it is reliable only when texts of equal or similar length are compared. To remedy this issue, Scott (2004) proposes a different strategy, namely, using a standardized type-token ratio (STTR), which is computed every n (the default setting is 1,000 in WordSmith) words as the Wordlist application of the WordSmith Tools goes through each text file in a corpus. The STTR is the average type-token ratio based on consecutive 1,000-word chunks of text (Scott 2004: 130).
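The difference between the plain TTR and the STTR can be sketched in a few lines. This is an illustrative approximation of the procedure described above, not WordSmith's own implementation: the function names are hypothetical, the input is assumed to be an already tokenized list of words, and dropping a trailing chunk shorter than n words is an assumption made here for simplicity.

```python
from typing import List, Sequence


def ttr(tokens: Sequence[str]) -> float:
    """Type-token ratio: number of unique words over number of running words."""
    if not tokens:
        raise ValueError("cannot compute TTR of an empty text")
    return len(set(tokens)) / len(tokens)


def sttr(tokens: Sequence[str], chunk_size: int = 1000) -> float:
    """Average TTR over consecutive, non-overlapping chunks of `chunk_size` tokens.

    Assumption: a final chunk shorter than `chunk_size` is ignored.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ratios: List[float] = [
        ttr(tokens[start:start + chunk_size])
        for start in range(0, len(tokens) - chunk_size + 1, chunk_size)
    ]
    if not ratios:
        raise ValueError("text is shorter than one chunk")
    return sum(ratios) / len(ratios)


# Toy example with a chunk size of 4 so the effect is visible by hand.
words = "a b a b c d c c e".split()
print(ttr(words))      # 5 types over 9 tokens
print(sttr(words, 4))  # mean of two chunk-level ratios; the last token is dropped
```

Because every chunk has the same length, the averaged ratio no longer depends on how long the whole text is, which is what makes it usable across texts of unequal size.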

It appears that lexical density defined by Stubbs (1986, 1996) measures informational load whereas the STTR is a measure of lexical variability, as reflected by the different ways they are computed.


14.5 Conclusions

The present chapter has explored how corpora have helped to advance Translation Studies as a scholarly discipline. On the basis of a clarification of the terminology in using corpora in translation and contrastive research, we reviewed the state of the art of corpus-based Translation Studies. A case study was also presented that was undertaken to bring fresh evidence from a genetically distinct language pair, namely English-Chinese translation, into the research of the so-called translation universals, which has so far been confined largely to English and closely related European languages. It is our hope that more empirical evidence for or against translational universals will be produced from our balanced monolingual comparable corpora of translated and native Chinese when our project is completed.
