Preview-9781107266490 A23760370
The International Phonetic Association exists to promote the study of the science of
phonetics and the applications of that science. The Association can trace its history back to
1886, and since that time the most widely known aspect of its work has been the
International Phonetic Alphabet. The Handbook has been produced collaboratively by
leading phoneticians who have been on the Executive of the Association, and it
incorporates (for instance in the case of the Illustrations) material provided by numerous
members of the Association worldwide.
Handbook of the International Phonetic Association
CAMBRIDGE
UNIVERSITY PRESS
PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge CB2 1RP, United Kingdom
www.cambridge.org
© The International Phonetic Association 1999
Information on this title: www.cambridge.org/9780521652360
A catalogue record for this book is available from the British Library
PART 3: Appendices
breadth of readership has led perhaps to a more equivocal tone in the presentation of the
premises behind the IPA than in the Principles. For instance, the way in which the IPA
developed historically was closely bound up with a 'strictly segmented' phonemic view,
and in section 10 the fact that there are alternatives in phonological theory is
acknowledged. The vertical spread of readers poses the recurring question of how much or
how little to say. The lower bound is presumably what a novice needs to pick up in order
to have some idea of the principles governing the organization of the chart. The upper
bound is the practical goal of a compact booklet, readily affordable by students, and
concise enough to be easily digested by non-specialist readers.
The resulting text in part 1 is more discursive than that of the old Principles. It should
be borne in mind, however, that it does not attempt the job either of a phonetics textbook,
or of a critique of the IPA. Nowadays there are many good phonetics textbooks available,
and it would be expected that students of phonetics would read one or more of these in
conjunction with the Handbook. The purpose of the Handbook is not to provide a
comprehensive or balanced education in phonetics, but to provide a concise summary of
information needed for getting to grips with the IPA. Likewise, whilst a full-scale critique
of the assumptions on which the IPA is founded is perhaps due, the practically-oriented
Handbook is not the place for it. The IPA is a working tool for many, and whilst it may be
possible to improve that tool, the role of the Handbook is that of an instruction manual for
the tool which is currently available.
The creation of the Handbook has been in every sense a collaborative effort. The text
in part 1 is largely the responsibility of Francis Nolan, and the exemplification of the use
of sounds was provided by Peter Ladefoged and Ian Maddieson. Ian Maddieson, and
Martin Barry, as successive editors of the Journal of the International Phonetic
Association, have been responsible for overseeing and collating the rich and ever growing
stock of Illustrations. Martin Ball was instrumental in formulating the Extensions to the
IPA (appendix 3), and Mike MacMahon wrote appendix 4 on the history of the
Association. John Esling is responsible for appendix 2 on the computer coding of symbols,
and for most of the work involved in the final stages of preparing the Handbook including
the final editing of the Illustrations. And, of course, particular thanks are due to the authors
of the Illustrations, and to the large number of members of the International Phonetic
Association who responded with suggestions and corrections when a draft of parts of the
Handbook was published in the Journal of the International Phonetic Association.
THE INTERNATIONAL PHONETIC ALPHABET (revised to 1993, updated 1996)
[Chart: the symbols are not recoverable from this copy; the surviving panel titles, notes, and labels are listed below.]
CONSONANTS (PULMONIC)
Where symbols appear in pairs, the one to the right represents a voiced consonant. Shaded areas denote articulations judged impossible.
CONSONANTS (NON-PULMONIC)
Labels: Bilabial, Dental, Dental/alveolar.
DIACRITICS
Diacritics may be placed above a symbol with a descender.
Labels: Voiced, Creaky voiced, Apical, Linguolabial, Laminal, Aspirated, More rounded, Labialized, Nasalized, Advanced, Velarized, Lateral release, Retracted, Pharyngealized, No audible release, Raised, Mid-centralized, Syllabic, Lowered, Non-syllabic, Advanced Tongue Root.
Length and boundary labels: Long, Half-long, Minor (foot) group, Major (intonation) group, Syllable break, Linking (absence of a break).
TONES AND WORD ACCENTS
LEVEL: Extra high, High, Mid, Low, Extra low, Downstep.
CONTOUR: Rising, High rising, Low rising, Rising-falling, Global rise.
2.2 Segments
Observation of the movements of the speech organs reveals that they are in almost
continuous motion. Similarly the acoustic speech signal does not switch between
successive steady states, but at many points changes gradually and at others consists of
rapid transient events. Neither the movements of the speech organs nor the acoustic signal
offers a clear division of speech into successive phonetic units. This may be surprising to
those whose view of speech is influenced mainly by alphabetic writing, but it emerges
clearly from (for instance) x-ray films and acoustic displays.
For example, the movements and the acoustic signal corresponding to the English
word worry will show continuous change. Figure 1 presents a spectrogram of this word.
Spectrograms are a way of making visible the patterns of energy in the acoustic signal.
Time runs from left to right, and the dark bands reflect the changing resonances of the
vocal tract as the word is pronounced. In the case of the word worry, the pattern ebbs and
flows constantly, and there are no boundaries between successive sounds. Nonetheless the
word can be segmented as [wʌɹi] - that is, as [w] + [ʌ] + [ɹ] + [i]. This segmentation is
undoubtedly influenced by knowledge of where linguistically significant changes in sound
can be made. A speaker could progress through the word making changes: in a British
pronunciation, for instance, [wʌɹi] worry, [hʌɹi] hurry, [hæɹi] Harry, [hæti] Hatty, [hætə]
hatter. There are thus four points at which the phonetic event can be changed significantly,
and this is reflected in the analysis into four segments. Languages may vary in the points
at which they allow changes to be made, and so segmentation may have to be tentative in
a first transcription of an unknown language (see section 9). Nonetheless there is a great
deal in common between languages in the way they organize sound, and so many initial
guesses about the segmentation of an unfamiliar language are likely to be right.
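
The chain of words in the paragraph above can be checked mechanically. The following is a minimal sketch in Python, assuming the broad transcriptions shown in the list (a Southern British pronunciation); the names CHAIN, differing_positions, and change_points are hypothetical and serve only as illustration. It reports, for each adjacent pair of words, the position at which the two transcriptions differ.

```python
# Segment-by-segment transcriptions of the chain of words in the text.
# The symbols are assumed broad transcriptions of a British pronunciation.
CHAIN = [
    ("worry",  ["w", "ʌ", "ɹ", "i"]),
    ("hurry",  ["h", "ʌ", "ɹ", "i"]),
    ("Harry",  ["h", "æ", "ɹ", "i"]),
    ("Hatty",  ["h", "æ", "t", "i"]),
    ("hatter", ["h", "æ", "t", "ə"]),
]


def differing_positions(a, b):
    """Return the indices at which two equal-length segment lists differ."""
    if len(a) != len(b):
        raise ValueError("segment lists must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def change_points(chain):
    """For each adjacent pair of words, list the positions where they differ."""
    return [
        differing_positions(segs_a, segs_b)
        for (_, segs_a), (_, segs_b) in zip(chain, chain[1:])
    ]


if __name__ == "__main__":
    for (w1, _), (w2, _), pos in zip(CHAIN, CHAIN[1:], change_points(CHAIN)):
        print(f"{w1} -> {w2}: differs at position {pos}")
```

Each step in the chain changes exactly one position, and the four steps between them cover all four positions, which is the basis for the analysis into four segments.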
w ʌ ɹ i
Figure 1 Spectrogram of the word worry, spoken in a Southern British accent.
Phonetic analysis is based on the crucial premise that it is possible to describe speech
in terms of a sequence of segments, and on the further crucial assumption that each
segment can be characterized by an articulatory target. 'Articulation' is the technical term
for the activity of the vocal organs in making a speech sound. The description of the target
is static, but this does not imply that the articulation itself is necessarily held static. So, for
example, [ɹ] (as in the word worry above) is described as having a narrowing made by the
tongue-tip near the back of the alveolar ridge (the flattish area behind the upper front
teeth). The tongue-tip actually makes a continuous movement to and from that target, as
reflected in the dipping pattern of higher resonances on the spectrogram in figure 1
between 0.4 and 0.5 s. In other sounds, a target will be held for a fixed amount of time.
The important point is that the use of segments and associated 'target' descriptions allows
for a very economical analysis of the complex and continuously varying events of speech.
different techniques for describing them. The different techniques arise from the more
closed articulation of consonants and the more open articulation of vowels.
2.4 Consonants
Because consonants involve a narrowing or 'stricture' at an identifiable place in the vocal
tract, phoneticians have traditionally classified a consonant in terms of its 'place of
articulation'. The [t] of ten, for instance, requires an airtight seal between the upper rim of
the tongue and the upper gum or teeth. Phonetic description of place of articulation,
however, concentrates on a section or 'slice' through the mid-line of the vocal tract, the
mid-sagittal plane as it is known, and in this plane the seal is made between the tip or
blade of the tongue and the bony ridge behind the upper front teeth, the alveolar ridge. The
sound is therefore described as alveolar. Figure 2 shows a mid-sagittal section of the vocal
tract, with the different places of articulation labelled. As further examples, the [p] of pen
is bilabial (the closure is made by the upper and lower lips), and the [k] of Ken velar
(made by the back of the tongue against the soft palate or velum). Other places of
articulation are exemplified in section 3.
Figure 2 Mid-sagittal section of the vocal tract with labels for place of articulation
On the IPA Chart, symbols for the majority of consonants are to be found in the large
table at the top. Place of articulation is reflected in the organization of this consonant
table. Each column represents a place of articulation, reflected in the labels across the top
of the table from bilabial at the left to glottal (consonants made by the vocal cords or vocal
folds) at the right. The terms 'bilabial' and 'labiodental' indicate that the consonant is
made by the lower lip against the upper lip and the upper front teeth respectively;
otherwise it is normally assumed that the sound at a named place of articulation is made
by the articulator lying opposite the place of articulation (so alveolars are made with the
tip of the tongue or the blade (which lies just behind the tip)). The exception to this is the
term 'retroflex'. In retroflex sounds, the tip of the tongue is curled back from its normal
position to a point behind the alveolar ridge. Usually alveolar [ɹ] shares some degree of
this curling back of the tongue tip, which distinguishes it from other alveolars. Note that
except in the case of fricatives only one symbol is provided for dental / alveolar /
postalveolar; if necessary, these three places can be distinguished by the use of extra
marks or 'diacritics' to form composite symbols, as discussed in section 2.8. For example,
the dental / alveolar / postalveolar nasals can be represented as [n̪ n n̠] respectively.
The rows of the consonant table, labelled at the left side by terms such as plosive,
nasal, trill, and so on, reflect another major descriptive dimension for consonants, namely
'manner of articulation'. Manner of articulation covers a number of distinct factors to do
with the articulation of a sound. One is the degree of stricture (narrowing) of the vocal
tract involved. If the articulation of the plosive [t] is modified so that the tongue tip or
blade forms a narrow groove running from front to back along the alveolar ridge, instead
of an airtight closure, air can escape. The airflow is turbulent, and this creates sound of a
hissing kind known in phonetics as frication. Such a sound is called a fricative. In this case
the resultant sound would be [s] as in sin. Other fricatives include [f] (as in fin) and [ʃ] (as
in shin). If even less narrowing is made in the vocal tract, an approximant will result, in
which the airflow is not turbulent and no frication is audible. Approximants are
exemplified by the sound [j] at the start of yet, and the first sound in red in most varieties
of English ([ɹ], [ɻ], or [ʋ] according to the variety).
'Manner of articulation' also includes important factors such as whether the velum (the
soft part of the palate at the back of the mouth) is raised or lowered. If it is lowered, as for
the sounds [m] and [n] in man, the resonances of the nasal cavity will contribute to the
sounds. Consonants where this happens are called nasals. Laterals (lateral approximants
such as English [l] in let and lateral fricatives such as Welsh [ɬ] in llan 'church (place-
name element)') are sounds where air escapes not in the mid-line of the vocal tract but at
the side. Trills are sounds like [r] in Spanish perro 'dog' in which the air is repeatedly
interrupted by an articulator (in this case the tongue tip) vibrating in an airstream. A very
short contact, similar in duration to one cycle of the vibration of a trill, is called a tap, such
as the [ɾ] in Spanish pero 'but'.
A further important factor in the description of consonants is not shown in the column
or row labels. This is whether the consonant is voiced or voiceless. In voiced consonants
the vocal cords are producing acoustic energy by vibrating as air passes between them,
and in voiceless ones they are not. A symbol on the left of a cell in the table is for a
voiceless consonant, e.g. [p] and [ʔ], and one on the right is for a voiced consonant, e.g.
[b] (the voiced counterpart of [p]) and [m]. Voicing distinctions are actually more fine-
grained than implied by this two-way distinction, so it may be necessary to add to the
notation allowed by the two basic symbols. For instance, the symbolization [ba pa pʰa]
implies consonants in which the vocal cords are, respectively, vibrating during the plosive
closure, vibrating only from the release of the closure, and vibrating only from a time well
after the release (giving what is often known as an 'aspirated' plosive). Where a cell
contains only one symbol, it indicates (with one exception) a voiced consonant and is
placed on the right. The exception is the glottal plosive [ʔ] (as the vocal cords are closed,
they are unable simultaneously to vibrate).
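
The three-way timing distinction described above can be sketched as a small classifier. This is an illustrative sketch only: the function name voicing_category, the use of milliseconds measured from the release of the closure, and in particular the 30 ms boundary for "well after the release" are assumptions made for the example, not values given in the text.

```python
def voicing_category(onset_ms, aspiration_threshold_ms=30.0):
    """Classify a plosive by when vocal cord vibration starts.

    onset_ms is the start of voicing relative to the release of the
    closure: negative values mean voicing begins during the closure.
    The default threshold is a hypothetical value for illustration.
    """
    if onset_ms < 0:
        return "ba"   # vibrating during the plosive closure
    if onset_ms < aspiration_threshold_ms:
        return "pa"   # vibrating only from (about) the release
    return "pʰa"      # vibrating only from well after the release


if __name__ == "__main__":
    for onset in (-80.0, 5.0, 70.0):
        print(f"{onset:6.1f} ms -> [{voicing_category(onset)}]")
```

A real analysis would set the boundary per language rather than using a single fixed figure.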
It should be clear that the consonant table is more than a list of symbols; it embodies a
classificatory system for consonants. It allows the user to ask a question such as 'how
should I symbolize a voiced sound involving complete closure at the uvula?' (The answer
is [ɢ].) Or conversely, 'what sort of a sound is [ʝ]?' (The answer is one which is voiced,
and in which frication can be heard resulting from a narrowing between the tongue front
and the hard palate.)
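
The two questions above correspond to a lookup in each direction. The sketch below, in Python, holds a few cells of the consonant table keyed by voicing, place, and manner; it covers only consonants mentioned in this section, and the names CONSONANTS, symbol_for, and describe are hypothetical.

```python
# A few cells of the consonant table, keyed by (voicing, place, manner).
# Only consonants mentioned in the surrounding text are included.
CONSONANTS = {
    ("voiceless", "bilabial", "plosive"): "p",
    ("voiced", "bilabial", "plosive"): "b",
    ("voiceless", "alveolar", "plosive"): "t",
    ("voiceless", "velar", "plosive"): "k",
    ("voiced", "uvular", "plosive"): "ɢ",
    ("voiceless", "glottal", "plosive"): "ʔ",
    ("voiced", "bilabial", "nasal"): "m",
    ("voiced", "alveolar", "nasal"): "n",
    ("voiceless", "labiodental", "fricative"): "f",
    ("voiceless", "alveolar", "fricative"): "s",
    ("voiceless", "postalveolar", "fricative"): "ʃ",
    ("voiced", "palatal", "fricative"): "ʝ",
    ("voiced", "palatal", "approximant"): "j",
}


def symbol_for(voicing, place, manner):
    """Description -> symbol; None if the cell is not in this partial table."""
    return CONSONANTS.get((voicing, place, manner))


def describe(symbol):
    """Symbol -> (voicing, place, manner); None if the symbol is not listed."""
    for description, listed in CONSONANTS.items():
        if listed == symbol:
            return description
    return None


if __name__ == "__main__":
    print(symbol_for("voiced", "uvular", "plosive"))
    print(describe("ʝ"))
```

A missing key here means only that the cell was left out of the sketch, not that the sound is impossible.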
Not all cells or halves of cells in the consonant table contain symbols. The gaps are of
three kinds. Shaded cells occur where the intersection of a manner and a place of
articulation defines a sound which is thought not to be possible, either by definition (a nasal
requires an oral occlusion combined with lowering of the velum, and so a pharyngeal or
glottal nasal is ruled out), or because the sound is impossible or too difficult to produce,
such as a velar trill or a bilabial lateral fricative. Unless phoneticians are mistaken in their
view of the latter category of sound, no symbols will be needed for any of the shaded
cells. An unshaded gap, such as the velar lateral fricative, may indicate that the sound in
question can be produced, but has not been found in languages. It is always possible that a
language will be discovered which requires the gap to be filled in. A case of this kind is
the velar lateral approximant [ʟ], which only became generally known among phoneticians
in the 1970s when it was reported in Kanite, a language of Papua New Guinea. An
unshaded gap may also occur where a sound can be represented by using an existing
symbol but giving it a slightly different value, with or without an added mark separate
from the symbol. A symbol such as [β], shown on the chart in the position for a voiced
bilabial fricative, can also be used to represent a voiced bilabial approximant if needed. In
a similar way, no symbols are provided for voiceless nasals. A voiceless alveolar nasal can
be written by adding the voiceless mark [ ̥ ] below the symbol [n] to form an appropriate
composite symbol [n̥]. Many of the gaps on the chart could be filled in this way by the use
of diacritics (sections 2.8 and 3). The formation of this kind of composite symbol is
discussed further in the section on diacritics below.
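
In present-day text encoding, composite symbols of this kind are commonly built from a base letter followed by a combining character. The sketch below assumes Unicode (which is not the coding scheme described in this Handbook) and uses three combining marks whose Unicode names are given in the comments; the names DIACRITICS, compose, and code_points are hypothetical.

```python
import unicodedata

# Diacritic name -> Unicode combining character (an assumed encoding choice).
DIACRITICS = {
    "voiceless": "\u0325",  # COMBINING RING BELOW
    "dental": "\u032A",     # COMBINING BRIDGE BELOW
    "retracted": "\u0320",  # COMBINING MINUS SIGN BELOW
}


def compose(base, *marks):
    """Append the named diacritics to a base symbol, in the order given."""
    return base + "".join(DIACRITICS[mark] for mark in marks)


def code_points(text):
    """List each character of a composite symbol with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    for mark in ("voiceless", "dental", "retracted"):
        symbol = compose("n", mark)
        print(symbol, code_points(symbol))
```

The composite symbol remains two code points: a base letter plus a mark, mirroring the way the chart treats it as a symbol with a diacritic added.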
between the glottis and a consonant stricture further forward in the vocal tract. If the air is
squeezed, and therefore flows outwards - abruptly when a closure further forward is
released, or briefly but continuously through a fricative stricture - the sound is known as
an 'ejective'. Ejectives are symbolized by the appropriate voiceless consonant symbol
with the addition of an apostrophe, e.g. [p'], [s']. If instead the air between the glottis and
a closure further forward is expanded, reducing its pressure, air will flow into the mouth
abruptly at the release of the forward closure. Usually the closure phase of such sounds is
accompanied by vocal cord vibration, giving '(voiced) implosives' such as [ɓ]. If it is
necessary to symbolize a voiceless version of such a sound, this can be done by adding a
diacritic: [ɓ̥].
'Velaric' airstream sounds, usually known as 'clicks', again involve creating an
enclosed cavity in which the pressure of the air can be changed, but this time the back
closure is made not with the glottis but with the back of the tongue against the soft palate,
such that air is sucked into the mouth when the closure further forward is released. The
'tut-tut' or 'tsk-tsk' sound, used by many English speakers as an indication of disapproval,
is produced in this way, but only in isolation and not as part of ordinary words. Some
other languages use clicks as consonants. A separate set of symbols such as [ǂ] is
provided for clicks. Since any click involves a velar or uvular closure, it is possible to
symbolize factors such as voicelessness, voicing, or nasality of the click by combining the
click symbol with the appropriate velar or uvular symbol: [k͡ǂ ɡ͡ǂ ŋ͡ǂ], [q͡ǃ].
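
Both conventions in this section, the apostrophe for ejectives and the pairing of a click symbol with a velar or uvular symbol, amount to simple string composition. The sketch below assumes Unicode text, with the modifier letter apostrophe (U+02BC) for the ejective mark and the combining double inverted breve (U+0361) as the tie bar; these encoding choices and the function names ejective and click are assumptions made for illustration.

```python
MODIFIER_APOSTROPHE = "\u02BC"  # MODIFIER LETTER APOSTROPHE
TIE_BAR = "\u0361"              # COMBINING DOUBLE INVERTED BREVE


def ejective(voiceless_symbol):
    """Voiceless consonant symbol plus an apostrophe, e.g. p -> pʼ."""
    return voiceless_symbol + MODIFIER_APOSTROPHE


def click(accompaniment, click_symbol, tie=True):
    """Velar or uvular symbol combined with a click symbol.

    The tie bar follows the first symbol and spans both when rendered.
    """
    joiner = TIE_BAR if tie else ""
    return accompaniment + joiner + click_symbol


if __name__ == "__main__":
    print(ejective("p"), ejective("s"))
    for accompaniment in ("k", "ɡ", "ŋ"):
        print(click(accompaniment, "ǂ"))
    print(click("q", "ǃ"))
```

The accompaniment symbol carries the voicing or nasality, while the click symbol identifies the forward closure.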
2.6 Vowels
Vowels are sounds which occur at syllable centres, and which, because they involve a less
extreme narrowing of the vocal tract than consonants, cannot easily be described in terms
of a 'place of articulation' as consonants can. Instead, they are classified in terms of an
abstract 'vowel space', which is represented by the four-sided figure known as the 'Vowel
Quadrilateral' (see the Chart, middle right). This space bears a relation, though not an
exact one, to the position of the tongue in vowel production, as explained below.
Figure 3 shows a mid-sagittal section of the vocal tract with four superimposed
outlines of the tongue's shape. For the vowel labelled [i], which is rather like the vowel of
heed or French si 'if', the body of the tongue is displaced forwards and upwards in the
mouth, towards the hard palate. The diagram shows a more extreme version of this vowel
than normally found in English at least, made so that any further narrowing in the palatal
region would cause the airflow to become turbulent, resulting in a fricative. This extreme
vowel is taken as a fixed reference point for vowel description. Since the tongue is near
the roof of the mouth this vowel is described as 'close', and since the highest point of the
tongue is at the front of the area where vowel articulations are possible, it is described as
'front'.