CrossLingPhras
Jean-Pierre Colson
Catholic University of Louvain
All content following this page was uploaded by Jean-Pierre Colson on 12 October 2016.
DOI: 10.1075/z.139.19col
https://benjamins.com/#catalog/books/z.139.19col/details
Readers should be aware that this paper is under copyright and that the publisher should be
contacted for permission to re-use or reprint the material in any form.
In recent years, phraseology in the broad sense has become a unifying theme for an increasing
number of theoretical and practical linguistic studies. Among this broad palette of
investigations into the meaning, structure or use of set phrases, cross-linguistic research is one
An Englishman may sleep like a log, but a Frenchman will, among other possibilities,
sleep like a marmot (dormir comme une marmotte), a Dutchman like a rose (slapen als een
roos), a German like a stone (schlafen wie ein Stein) and a speaker of the Bété language
(Ivory Coast) like a python (Ô honhoun glibi yèrè, Zouogbo 2003). This list might be
extended to all languages of the world and would reveal the amazing richness and diversity of
language. The famous Danish linguist Hjelmslev (1961) already pointed out that there is a
difference between form and substance of language, and he argued that this dichotomy was
also applicable to the level of content, so that the whole semantic organisation of the lexicon
and its interaction with the real world will vary a lot from one language to another.
This is undoubtedly a possible starting point for carrying out research on phraseology
across languages. Is there no rhyme or reason to the unbridled imagination underlying set
phrases in all languages, or is it possible to discover some universal principles? Will set
phrases enable researchers to gain information about the cultural patterns and ways of life
prevailing in other parts of the world? Can we improve translation practice or theory by a
systematic comparison of set phrases across languages? These are just a few examples of the
very wide range of approaches involved in cross-linguistic and contrastive phraseology. The
language peculiarities illustrated by concrete examples are only the tip of the iceberg.
It would be quite interesting to shed light on the diversity of phraseology by
concentrating on specific cases across languages. This could, however, create the impression
that comparing languages from the point of view of their set phrases is only a practical matter,
and that no thorough theoretical grounding is necessary. Nothing could be further from the truth, as the very
starting point of the research, the sheer existence of a separate linguistic domain called
phraseology, remains controversial. In this article, we shall briefly mention a few theoretical
and practical issues that arise when set phrases are analysed in several languages.
Set phrases in the broad sense (see Burger et al. 1982) have now been identified in many
languages. It is well known that the phraseological tradition originated in Russia and
Germany (Vinogradov 1946). As a result, Russian and German were among the first
languages to be fully described from the point of view of phraseology, but the movement later
It soon became clear that a comparison between set phrases in two or more languages
was of crucial importance for discovering the theoretical principles underlying phraseology,
as well as its contextual use. As the European Society for Phraseology (Europhras,
wonder that German has taken the lion’s share as far as cross-linguistic phraseology is
concerned. German has been compared with Russian (Dobrovol’skij 1997), Slovakian (Durco
1994), Hungarian (Hessky 1987), Japanese (Rothkegel 2003, Ueda 2004), Spanish (Piñel
López 2003), Lithuanian (Budvytyte 2003), Rumanian (Zaharia 2003), French (Gréciano
1989, Dalmas 1999, Valentin 1999), Finnish (Korhonen 1989), Dutch (Piirainen 1995),
comparison with one or more languages: Arabic (Awwad 1990), German (Gläser 1984),
German & Polish (Paszenda 2003), French (Gläser 1999), Spanish (Marín-Arrese 1996, Mena
Martinez 2003), Hebrew (Newman 1988), Latvian (Veisbergs 1992), or Malay (Charteris-
Black 2003). Dobrovol’skij and Piirainen (2005), a major contribution to which we shall
refer again in this article, have analysed figurative language, an important component of
in all languages, but there is still a long way to go before we can claim that phraseology as we
above were based on non-Indo-European languages (Bété, Japanese, Arabic, Finnish, Malay),
and can already be considered as valuable clues. The common features between those
- In all those languages, there are many examples of a wide variety of constructions
that meet the general definition of phraseology (Burger et al. 1982; Burger 1998):
phraseology in the broad sense meets the criteria of polylexicality and fixedness,
idiomaticity. It is not yet clear, however, that the proportion of the various
categories of set phrases is universal. There are indeed many indications that some
verbal vs. nominal set phrases, or metaphorical vs. opaque set phrases, to mention
European syntax, we may have a slightly biased vision of what phraseology looks
phraseology of all languages, but some languages may prefer simple metaphors to
- There is a close link between culture and phraseology. This is best revealed by
proverbs and fully idiomatic set phrases, because they tend to rely a lot on images,
traditions or habits that are proper to a given culture. It is no easy matter, however,
to draw the line between images that seem to be related to more or less universal
aspects of the human mind, and other features having to do with a specific culture.
the description of phraseology in the world’s languages. English and Dutch, for
instance, have a larger proportion of set phrases deriving from the sea (Jeans,
2004).
From an ethno-linguistic point of view, it would be highly desirable to extend the study
of set phrases to the language families that are considered to be the most ancient ones on the
basis of both archaeology and biology. Recent studies have shown that the Khoisan language
family (spoken in southern Africa, among others by the Bushmen) may very well be the most
ancient language family, as archaeological evidence goes back some 60,000 years.
The Khoisan languages have only recently been studied extensively by linguists
(Westphal 1971, Treis 1998). As in the case of other languages from distant parts of the
world, a number of distinctive features have been noted, but they do not contradict the universal
principles of syntax, semantics, pragmatics and culture, with inevitably a great number of set
phrases.
If confirmed by further research, the findings available for a broad array of languages
show that phraseology, just like syntax, is one of the key components of human language. This
inevitably poses a more general question: why is that so? As a matter of fact, the theoretical
The weak theoretical background of research on phraseology has been criticised by Čermák
(2001). When studying set phrases across languages, one should first be aware that several
interpretations of the term contrastive are possible. They are best described by Dobrovol’skij
any kind of comparison between languages from the point of view of their set phrases will be
considered as contrastive phraseology. However, contrastive in the narrow sense implies that a
really systematic comparison is achieved between two or more languages, on the basis of all
their differences and similarities. Finally, a more restricted interpretation of contrastive is also
possible, in which only differences between languages are taken into account.
This is more than a terminological issue. Mentioning a few examples taken from a
number of languages may be interesting from a cross-linguistic point of view, but a truly
languages.
Apart from these methodological issues, cross-linguistic and contrastive phraseology are
based on examples, but these only make sense if they are interpreted in a theoretical framework.
As pointed out by Čermák (2001) and Dobrovol’skij & Piirainen (2005), one of the main flaws
interaction with context (Burger et al. 1982, Burger 1998; Cowie 1998, Gläser 1984, 1985).
There is, however, no global theory of phraseology available, in the sense that the origin of
set phrases, their relative importance in language, and their interaction with syntax, semantics
and pragmatics remain largely controversial. If set phrases turn out to be a major aspect of
language, both for their frequency and for their semantic connections, a subtheory of language
In the absence of such a theory, at least two main linguistic schools can already provide
Cognitive semantics (Lakoff 1988) and cognitive linguistics (Langacker 1999, 2000;
Taylor 2002) have stressed the role of metaphor as a cornerstone of language. From a cognitive
point of view, metaphors play a crucial role in most set phrases, especially idioms, and there are
abstract concepts underlying metaphors, such as ‘GOOD IS UP; BAD IS DOWN’ (Lakoff &
Johnson 1980; Chun 2002). As cognitive semantics has historical links with generative
linguistics, it comes as no surprise that those abstract structures receive a more or less universal
one of the sources of inspiration for contrastive phraseology (e.g. Kempcke 1989, Marín-Arrese
Although there are obvious similarities between metaphors and set phrases, using a
cognitive framework for the analysis of phraseology raises a number of problems. In the first
place, not all set phrases correspond to metaphors. Most pragmatic or communicative set
phrases such as routine formulas are not metaphorical. On the other hand, many metaphors
are closely related to set phrases and there are numerous borderline cases. An angel can be
considered as a one-word metaphor referring to a very kind person, but the imperative form
Be an angel and… is considered by most dictionaries as a set phrase. To use another metaphor,
here we are really getting to the heart of the matter. Is an angel really a metaphor or has this
meaning become so common (in many European languages) that this is a simple case of
polysemy? How can we distinguish between metaphors and idioms? What is the exact
both for metaphors and set phrases? And, for that matter, how do we define meaning? The
absence of a universally recognised semantic theory makes this whole approach very
complex. It may also be criticised from the point of view of the verification of the data and
the reproducibility of the experiments, two key features of any scientific method. Indeed,
defining the underlying cognitive structures of metaphors or set phrases relies a lot on the
intuition of the linguist, and different cognitive linguists will inevitably come to different
analyses of the same structures. This methodology is largely deductive, in much the same way
Piirainen (2005) is not a general theory of phraseology, but it can be seen as a major theoretical
breakthrough in understanding the cognitive foundations of both metaphors and idioms, as well
according to two basic criteria: image requirement (a conceptual structure mediating between
the lexical structure and the actual meaning) and additional naming (figurative language is not
the only way of expressing a specific idea). Contrary to most cross-linguistic and contrastive
studies on phraseology, Dobrovol’skij & Piirainen pay attention to the many theoretical
assumptions that can be derived from the observation of the diversity of languages.
Their theory lays stress on the image component as a specific conceptual structure
underlying figurative units, and as a relevant element of their meaning. They also claim that
some restrictions in the use of figurative units can be directly attributed to this image
component.
This is obviously one of the key issues. The image component is an interesting
theoretical construct providing a better account of the interaction between form and meaning in
figurative units such as metaphors and idioms, but it is no more than a cognitive hypothesis if
the linguistic data provide no corroborative evidence. Dobrovol’skij & Piirainen mention a few
interesting examples in that respect. The set phrase (to be) caught between a rock and a hard
place displays, according to the authors, a number of usage restrictions that can be traced back to
the image component. The general meaning of this set phrase is to be in a very difficult position,
but they point out that it cannot be used in all situations in which someone is in a difficult
position, because this set phrase involves “the mental picture of being between two obstacles,
i.e. the idea of a ‘lack of freedom of movement’” (Dobrovol’skij & Piirainen 2005:15). If one
accepts this view, it is indeed evidence for the cognitive approach to set phrases, and especially
for the image component. This example shows how interesting and at the same time how
complex a semantic approach to set phrases in the world’s languages can be, all the more so as
the cognitive approach is not the only possible way. This example might indeed be analysed
from a purely pragmatic point of view, with restrictions due to context or speaker. The
interaction between figurative meaning, cognitive principles and literal meaning is also
problematic. In the example mentioned above, it is not quite clear to what extent the literal
meaning of a rock and a hard place may also contribute to some usage restrictions.
unveils a number of interesting cognitive and semantic principles. At the same time, the image
component is influenced by the culture of a specific language, and it can therefore yield a lot of
information about differences in culture, especially when very remote languages are the object
of investigation.
Dobrovol’skij & Piirainen, does not cover all set phrases, because many of them are not
figurative: grammatical or pragmatic phrases, phrasal verbs, routine formulas, and many
collocations, etc. There is obviously a need for additional contrastive work before the exact
place of phraseology within general linguistic theory can be clearly determined. If we claim that
phraseology is just one aspect of figurative language, we then disregard the great bulk of set
phrases. If, on the other hand, co-occurrence is used as the only principle underlying set phrases,
Across the diversity of studies on set phrases in several languages, another major
theoretical issue is the following one: what is the central category of set phrases? A lot of
attention has traditionally been devoted to fully idiomatic set phrases, the well-known idioms.
In many respects, they can be considered as extreme cases of phraseology, especially when they
are opaque or non-compositional. Dobrovol’skij & Piirainen (2005:39) call them the “central
and most important class of phrasemes”. Comparing idioms in several languages is particularly
useful for analysing cultural phenomena, and idioms are also open to several types of
modification, variation, or literal reinterpretation (see Burger 1998). Besides, they can create
stylistic effects in various registers of language, including literature. Thus, when Dickens writes
in his Christmas Carol that Old Marley was as dead as a door-nail, he already prepares the reader for
the scene where Scrooge watches the knocker of his door and sees Marley’s face.
It is all very well to claim that idioms are the essence of phraseology, but this is only
taking the cognitive or semantic aspect of language into account. If, on the other hand, we pay
attention to the relative frequencies of the various categories of set phrases (Moon 1998, Colson
2003), we are struck by the very low figures for idioms. Pragmatic set phrases such as routine
formulas are much more frequent than idioms, both in written and in spoken language.
languages. A lot of studies have been devoted to idioms, but idioms are rather marginal from a
purely statistical point of view. Most of them have a frequency that is lower than 1 occurrence
If we claim that idioms are the central category of set phrases, we may then come to the
conclusion that phraseology is a marginal phenomenon, because idioms are rather rare in
corpora. Besides, this seems to be confirmed by the semantic and cognitive research on
Piirainen (2005:18), i.e. figurative language is not the primary way of expressing an idea.
Finally, this restricted view on phraseology is also consistent with the traditional interpretation
translation) reveals just the opposite: phraseology turns out to be a major aspect of all
languages. Taken in the broad sense, phraseology is indeed present at all levels of linguistic
production and comprehension, because native speakers will assemble lexical elements
according to a wide variety of existing patterns that may have little to do with grammar.
As in other sciences, linguistics may have to find a unifying principle behind apparently
more accurately, and this is precisely where corpus linguistics comes in.
The idiom principle formulated by John Sinclair (1991) implies that set phrases in the broad
sense are responsible for at least half of the constructions that are found in most texts. From the
very beginning of the research on linguistic corpora, it was clear that co-occurrence phenomena,
languages. The frequency issue should rather be analysed on very large corpora (Moon 1998,
Colson 2003), because the more idiomatic set phrases tend to be rather infrequent. This remains
to some extent problematic, as it has so far not been possible to determine the precise frequency
levels for phraseology. For all their interest and importance, semantic classifications of set
phrases are no more than hypotheses, and hard evidence is very difficult to find. This implies
that one semantic classification can always be replaced by another, and that this can go on for
some time. Further research might on the other hand focus on corpus evidence from various
languages that would point to the existence of set phrases, the criteria for recognising and
classifying them, as well as the frequency limits that would help differentiate the specific
A related topic is that of the frequency differences across languages. It is not at all
clear, for instance, that all languages will use set phrases in the same proportions. The relative
importance of the noun category may vary from language to language and will therefore
interfere with the importance of verbal expressions, one of the main categories of set phrases.
Describing some kind of phraseological profile for various languages on the basis of large
corpora can be very useful for both language learners and translators, because many errors are
Altenberg & Granger 2002) play a very important role in determining the actual use of lexis
in context, and its many interactions with phraseology. Across the diversity of languages, it
becomes more and more clear that a very detailed analysis, both manual and automated, of
lexical and co-occurrence phenomena in corpora is particularly useful for solving the
underlying theoretical issues, such as the role of semantics and syntax and their interplay with
phraseology. Prepositions are very interesting in that respect, because they can often be
The frequency issue again plays a significant part in this interaction. Indeed,
prepositions, as well as adverbs, connectives and articles, have often been regarded in
traditional linguistics as essentially grammatical parts of speech, but their behaviour in large
corpora seems to point to the opposite. As already mentioned by Sinclair (1991), most
grammatical constructions are largely dependent on the use of lexical elements. In other
words, the choice between prepositions or even determiners may often be influenced by
phraseology. This may even apply to the choice between definite and indefinite article in
As far as connectives are concerned, interesting research has recently been devoted to
cross-linguistic differences and their motivation (Degand 2005). The use of causal
connectives in different languages, for instance, reveals similarities but also striking
principles, but connectives are often part of larger units such as clichés, routine formulas or
grammatical phrases, all of which have to do with phraseology in the broad sense. Future
research on large corpora may therefore benefit from a combination of linguistic approaches,
including phraseology.
Within this field, pioneering work has been done by researchers in French-German
contrastive phraseology. Gréciano (1997) and Dalmas (1997), among others, have
investigated the use of phrasemes (in the sense of fully idiomatic set phrases) in combination
with discourse particles. Many examples taken from French and German corpora point to the
frequent association between phrasemes and German particles (e.g. doch, übrigens, überhaupt,
ja denn auch, ganz used in combination with a phraseme), whereas French seems to often
moderate or introduce the use of phrasemes by variants of the verb ‘to say’ or ‘to name’, as in
the following example (Gréciano 1997: 458): “cela ne ressemblait en rien à ce qu’on appelle
ressemble s’assemble”: like attracts like. In other words, discourse particles and set phrases
share many common features, and it should therefore be no surprise that they often co-occur.
It is not quite obvious, however, what percentage of set phrases (and which category) will
Another closely related issue is whether set phrases across languages are regularly
accompanied by introducers. Some researchers (see Čermák 2002) have pointed out that
many set phrases, especially verbal idioms, are often accompanied by syntactic constructions
or specific words that seem to introduce or moderate the set phrase. A typical example is the
English adjective proverbial, as in to spill the proverbial beans. The same holds true of Dutch
(with the adjective spreekwoordelijk) and German (sprichwörtlich), which is in itself an
interesting starting point for a more thorough contrastive analysis of this phenomenon. It is
still unclear to what extent the use of such types of introducers in combination with set
phrases relates to rhetorical or pragmatic principles. Obviously, it is always possible to
combine pragmatic modifiers with set phrases, but the case of proverbial associated with
verbal idioms rather suggests that languages such as English, German and Dutch have
recourse to conventionalised patterns.
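A corpus query for such introducers can be sketched in a few lines. The Python fragment below is only an illustration under stated assumptions: the regular expression, the function name find_introduced_phrases and the sample sentences are hypothetical and are not taken from any of the studies cited here. It searches for the three adjectives mentioned above and records the word that immediately follows each of them.

```python
import re

# Hypothetical pattern for the introducer adjectives in English, Dutch and German.
# Inflected endings are matched loosely; this is an assumption, not a full morphology.
INTRODUCER = re.compile(
    r"\b(proverbial|spreekwoordelijke?|sprichwörtlich(?:e[nmrs]?)?)\s+(\w+)",
    re.IGNORECASE,
)

def find_introduced_phrases(sentences):
    """Return (introducer, following word) pairs found in the sentences."""
    hits = []
    for sentence in sentences:
        for match in INTRODUCER.finditer(sentence):
            hits.append((match.group(1).lower(), match.group(2).lower()))
    return hits

# Invented sample sentences, for illustration only.
sample = [
    "He finally spilled the proverbial beans.",
    "Dat was de spreekwoordelijke druppel.",
    "Das war der sprichwörtliche Tropfen.",
    "Nothing idiomatic here.",
]
print(find_introduced_phrases(sample))
```

On a real corpus, the list of words following the introducer would still have to be checked manually, since the adjective may also modify a noun that does not belong to any set phrase.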
Studying phraseology in many languages inevitably leads to translation. In the first place,
translation is often a workable solution for detecting phraseology. Indeed, many set phrases
and especially verbal idioms cannot be translated literally, even in closely related languages.
Thus, a phrase like down the hatch is easily recognised as a set phrase by French-speaking
There are notable exceptions to this principle, because a great number of set phrases
are common to several languages. This is particularly the case with the many phrases that
European languages have borrowed from Greek, Latin or Hebrew. Apart from this practical
use of translation as one of the criteria for recognising set phrases, the interaction between
If, as many researchers within corpus linguistics and phraseology have pointed out, set
phrases constitute a major aspect of any language, it is clear that translating from one
language to another will mean being confronted twice with a very difficult task: establishing
the meaning of the source text while taking figurative language and phraseology into account,
and then trying to find an equivalent formulation in the target language. Phraseology will, in
Strangely enough, the combination of phraseology and translation is not a very common research field.
Apart from a conference in German (Sabban 1999) and a few articles on this subject (Roberts
1998, Poirier 2003, Rojo 2003), the very concept of phraseology is still notably absent from
studies on translation theory or translation practice. Delisle (2003), one of the best reference
books on translation theory and practice, does not mention the domain of phraseology. Set
phrases are treated as expressions and their importance is not underestimated, but they are
interdisciplinary research field. Sabban (1999) illustrates the rich cultural diversity underlying
any attempt to translate a set phrase from one language into another. As pointed out by
several researchers, a widely held misconception about set phrases is that you have to
translate one set phrase from L1 into a corresponding set phrase in L2. Foreign language
teachers and learners are often faced with the practical problem of having to make set phrases
correspond across languages, as in the case of lists of idioms, and they tend to reinforce this
misconception.
A more dynamic view on the translation of set phrases takes into consideration a
number of cultural and linguistic principles. Once again, phraseology is the meeting point of
conflicting theories about form, meaning and culture in language. Poirier (2003) analyses the
arbitrary and conventional nature of the translation of set phrases from the point of view of
sense that a semantic paraphrase is always possible without keeping the idiomatic aspect (for
instance, spill the beans may be translated into other languages by simple constructions
meaning reveal a secret). On the other hand, the translation of set phrases will be
conventional in the semiotic sense of the word, because of the conventional relations between
lexical units, and as a result of the conventional nature of the notion of equivalence.
This interesting theoretical approach to the translation of set phrases points to the
complex interplay between phraseology, semiotics and translation. Because of their special
status, somewhere in between lexicon and syntax, set phrases are particularly revealing of
both the strong and weak points of the current linguistic theories. Cognitive semantics insists
upon metaphors but many set phrases cannot be reduced to metaphors, and this principle is of
little help for the practical translation of set phrases. Corpus linguistics, on the other hand,
lays stress on the many contextual examples derived from a corpus, but the many intricate
facets of a given phrase are also governed by semantic principles, and cannot so easily be
As mentioned above, Delisle (2003) does not use the term phraseology, but he insists
on the thorny problems posed by the translation of the various categories of expressions. His
very informative and useful handbook is corroborated by the experience of many a translation
teacher: phraseology hampers the translation of most texts, be they general and informative,
or technical and scientific. In the latter case, phraseology often combines with terminology
because many disciplines or technical domains create their own set phrases or multiword
terms. In view of available evidence, future research might aim at testing a number of
What is the impact of phraseology on the overall pattern of the translation processes?
What kind of psychological or cognitive activities does phraseology require from translators
and interpreters? Are there universal translation techniques for set phrases, or is the solution
From a theoretical point of view as well, it remains for future research to determine
whether phraseology deserves its own place among the underlying principles of translation, as
There may also be an interesting link between phraseology and the research on
Condit 2002). A number of studies have already been devoted to a comparison between
translated and non-translated monolingual corpora (Hansen 2003, Laviosa 1998, Puurtinen
2003).
Baroni & Bernardini (2006) have used an automated method for the recognition of
translationese, and they claim that the computer’s work achieves better results than human
evaluation. If confirmed by other studies, this might open the door to a better identification of
method is based on SVMs (support vector machines) and highlights the importance of
translationese and to translation quality assessment. As the results are partly derived from n-
gram extraction, it comes as no surprise that they mention “collocational and colligational
necessity for large companies providing translation services (De Sutter 2005). Because of
time constraints and in view of the very large number of language combinations, the
evaluation of translators is already partly automated, but the existing methods need to be
improved. Phraseology may be one of the key factors for evaluating the quality of a
translation, and it may be a new challenge for NLP and machine learning algorithms to extract
set phrases from translated corpora and to compare them with original texts.
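The kind of comparison suggested here can be illustrated by a very simple sketch. It is not the SVM-based method of Baroni & Bernardini (2006), but a plain count of recurrent word n-grams, used as a rough stand-in for candidate set phrases. The function names, the thresholds and the two short texts are assumptions made for the sake of the example.

```python
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?\"'()") for word in text.lower().split())
    return [t for t in tokens if t]

def recurrent_ngrams(text, n=2, min_count=2):
    """Return the set of n-grams that occur at least min_count times."""
    tokens = tokenize(text)
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(grams)
    return {gram for gram, count in counts.items() if count >= min_count}

def jaccard(a, b):
    """Jaccard overlap between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Invented toy texts: an "original" and a paraphrasing "translation".
original = "He spilled the beans. In the end she spilled the beans too."
translated = "He revealed the secret. In the end she revealed the secret too."

orig_set = recurrent_ngrams(original)
trans_set = recurrent_ngrams(translated)
print(sorted(orig_set))
print(sorted(trans_set))
print(jaccard(orig_set, trans_set))
```

Recurrence alone does not make an n-gram a set phrase, so on real corpora an association measure and a much larger sample would be needed before any conclusion about translation quality could be drawn.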
Phraseology can be seen as the linguistic repository of a number of cultural traditions that are
languages, because this will elucidate the origins of many of those linguistic and cultural
habits.
A number of them may be more or less universal, thereby revealing a few fascinating
aspects of human cognition. However, some caution is needed in pursuing an analysis of this
sort. Until now, the focus of research has been primarily on European languages, and a
confrontation with other language families is necessary before we can draw any firm
multidisciplinary field. It has strong links with contrastive lexicology, syntax, pragmatics and
semantics, but also with semiotics and translation theory. The wide diversity of linguistic
theories underpinning phraseology across languages can be an advantage, but the downside is
that no single agreed methodology has been developed. Cognitive linguists rely a lot on their
intuition, while corpus linguists have recourse to large corpora. A widely accepted view is
that there is some truth in every theory, and future research may therefore benefit from
Phraseology across languages has important consequences for translation theory and
translation practice. The technological evolution in translation assessment should also benefit
from new insights into the structure and functioning of set phrases.
References