Corpus-Based Language Studies
An advanced resource book
Tony McEnery, Richard Xiao and Yukio Tono
Routledge
Taylor & Francis Group
LONDON AND NEW YORK
Contents
Series editors' preface
Preface
Acknowledgements
SECTION A: INTRODUCTION
Unit A1 Corpus linguistics: the basics
A1.1 Introduction
A1.2 Corpus linguistics: past and present
A1.3 What is a corpus?
A1.4 Why use computers to study language?
A1.5 The corpus-based approach vs. the intuition-based approach
A1.6 Corpus linguistics: a methodology or a theory?
A1.7 Corpus-based vs. corpus-driven approaches
Summary
Looking ahead
Unit A2 Representativeness, balance and sampling
A2.1 Introduction
A2.2 What does representativeness mean in corpus linguistics?
A2.3 The representativeness of general and specialized corpora
A2.4 Balance
A2.5 Sampling
Summary
Looking ahead
Unit A3 Corpus mark-up
A3.1 Introduction
A3.2 The rationale for corpus mark-up
A3.3 Corpus mark-up schemes
A3.4 Character encoding
Summary
Looking ahead
Unit A4 Corpus annotation
A4.1 Introduction
A4.2 Corpus annotation = added value
A4.3 How is corpus annotation achieved?
A4.4 Types of corpus annotation
A4.5 Embedded vs. standalone annotation
Summary
Looking ahead
Unit A5 Multilingual corpora
A5.1 Introduction
A5.2 Multilingual corpora: terminological issues
A5.3 Corpus alignment
Summary
Looking ahead
Unit A6 Making statistical claims
A6.1 Introduction
A6.2 Raw frequency and normalized frequency
A6.3 Descriptive and inferential statistics
A6.4 Tests of statistical significance
A6.5 Tests for significant collocations
Summary
Looking ahead
Unit A7 Using available corpora
A7.1 Introduction
A7.2 General corpora
A7.3 Specialized corpora
A7.4 Written corpora
A7.5 Spoken corpora
A7.6 Synchronic corpora
A7.7 Diachronic corpora
A7.8 Learner corpora
A7.9 Monitor corpora
Summary
Looking ahead
Unit A8 Going solo: DIY corpora
A8.1 Introduction
A8.2 Corpus size
A8.3 Balance and representativeness
A8.4 Data capture
A8.5 Corpus mark-up
A8.6 Corpus annotation
A8.7 Character encoding
Summary
Looking ahead
Unit A9 Copyright
A9.1 Introduction
A9.2 Coping with copyright: warning and advice
Summary
Looking ahead
Unit A10 Corpora and applied linguistics
A10.1 Introduction
A10.2 Lexicographic and lexical studies
A10.3 Grammatical studies
A10.4 Register variation and genre analysis
A10.5 Dialect distinction and language variety
A10.6 Contrastive and translation studies
A10.7 Diachronic study and language change
A10.8 Language learning and teaching
A10.9 Semantics
A10.10 Pragmatics
A10.11 Sociolinguistics
A10.12 Discourse analysis
A10.13 Stylistics and literary studies
A10.14 Forensic linguistics
A10.15 What corpora cannot tell us
Summary
Looking ahead
SECTION B: EXTENSION
Unit B1 Corpus representativeness and balance
B1.1 Introduction
B1.2 Biber (1993)
B1.3 Atkins, Clear and Ostler (1992)
Summary
Looking ahead
Unit B2 Objections to corpora: an ongoing debate
B2.1 Introduction
B2.2 Widdowson (2000)
B2.3 Stubbs (2001b)
B2.4 Widdowson (1991) vs. Sinclair (1991b): a summary
Summary
Unit B3 Lexical and grammatical studies
B3.1 Introduction
B3.2 Krishnamurthy (2000)
B3.3 Partington (2004)
B3.4 Carter and McCarthy (1999)
B3.5 Kreyer (2003)
Summary
Looking ahead
Unit B4 Language variation studies
B4.1 Introduction
B4.2 Biber (1995a)
B4.3 Hyland (1999)
B4.4 Lehmann (2002)
B4.5 Kachru (2003)
Summary
Looking ahead
Unit B5 Contrastive and diachronic studies
B5.1 Introduction
B5.2 Altenberg and Granger (2002)
B5.3 McEnery, Xiao and Mo (2003)
B5.4 Kilpiö (1997)
B5.5 Mair, Hundt, Leech and Smith (2002)
Summary
Looking ahead
Unit B6 Language teaching and learning
B6.1 Introduction
B6.2 Gavioli and Aston (2001)
B6.3 Thurston and Candlin (1998)
B6.4 Conrad (1999)
Summary
Looking ahead
SECTION C: EXPLORATION
Unit C1 Collocation and pedagogical lexicography Case study 1
C1.1 Introduction
C1.2 Collocation information
C1.3 Using corpus data for improving a dictionary entry
Summary
Further study
Unit C2 HELP or HELP to: what do corpora have to say? Case study 2
C2.1 Introduction
C2.2 Concordancing
C2.3 Language variety
C2.4 Language change
C2.5 An intervening NP
C2.6 The infinitive marker preceding HELP
C2.7 The passive construction
Summary
Further study
Unit C3 L2 acquisition of grammatical morphemes Case study 3
C3.1 Introduction
C3.2 Morpheme studies: a short review
C3.3 The Longman Learners' Corpus
C3.4 Problem-oriented corpus annotation
C3.5 Discussion
Summary
Further study
Unit C4 Swearing in modern British English Case study 4
C4.1 Introduction
C4.2 Spoken vs. written register
C4.3 Variations within spoken English
C4.4 Variations within written English
Summary
Further study
Unit C5 Conversation and speech in American English Case study 5
C5.1 Introduction
C5.2 Salient linguistic features
C5.3 Basic statistical data from the corpus
C5.4 The dimension scores of three genres
C5.5 The keyword approach to genre analysis
Summary
Further study
Unit C6 Domains, text types, aspect marking and English-Chinese translation Case study 6
C6.1 Introduction
C6.2 The corpus data
C6.3 Translation of aspect marking
C6.4 Translation and aspect marking
C6.5 Domain and aspect marking
C6.6 Text type and aspect marking
Summary
Further study
Glossary
Bibliography
Appendix of useful Internet links
Index