Measurement of Formant Frequencies and B
One of the methods we have frequently used to gain insight into the resonatory function in singing is that of measuring the variation in time of that which we have generally called supraglottal pressure. The concept is simple enough: a wide-band miniature pressure transducer mounted on a catheter passed through the nares is positioned close to the glottis. Since the glottal end of the vocal tract forms a pressure antinode, the space just above the glottis is the most favorable location for measuring the amplitude of standing waves in the vocal tract, at least for the lower resonances. These amplitudes can be considerable, with peak-to-peak values of 100 cm H2O measured in a soprano singing a high note (1), and often result in a waveform that shows a clear dominance of one or another resonance in the vocal tract. In such moments the standing wave effect is evident in the fact that the wave continues uninterrupted through the open phase of the glottal cycle, where it otherwise would be largely damped in a poorly tuned, "unresonant" state.

Such strong standing waves in the vocal tract are also evident in the radiated sound, measured by a microphone in front of the singer. Because the mouth opening has a stronger reflecting effect on low-frequency sound emerging from the vocal tract than on high frequencies, however, the microphone signal (and thus the sound we hear) gives a picture of the pressure waves in the vocal tract that is somewhat obscured by the emphasized high-frequency component (see, e.g., Fig. 9). Simultaneous registration of the two signals during singing allows us to compare them, affording insight into vocal tract resonances that underlie the radiated sound.
The purpose of the present study is twofold: (a) to show how audio and supraglottal pressure signals can, under certain conditions, yield precise and reliable information about formants in singing, and, by extension, (b) to indicate how these signals can serve as informal feedback for locating and adjusting the formants for improved "resonance" in singing.

Accepted July 15, 1994.
Address correspondence and reprint requests to Dr. H. K. Schutte at Voice Research Lab., Dpt. Med. Physiol., University of Groningen, Bloemsingel 10, NL 9743KZ, Groningen, The Netherlands.
This article was presented at the Voice Foundation, 22nd Annual Symposium: Care of the Professional Voice, Philadelphia, PA, U.S.A., June 1993.

[...] not expect the F1 measurements to be as well defined as those for F2.
[...] a bandpass filter designed to isolate the fourth harmonic. The fact that SPL modulation occurs at twice the rate of F0 modulation is clearly a result of the sweep of H4 across F2 in both directions. The projection of the SPL peaks on the frequency curve (solid vertical lines) shows a difference in formant frequency of 8 Hz between the ascending and descending phases (points P and Q). The projection of the -3-dB level of filtered F2 (the other SPL curves differ only marginally) on the frequency gives an effective bandwidth of 16 Hz at fundamental frequency. Since the sweep of the resonance takes place at the fourth harmonic, however, F0 must be multiplied by 4 to calculate F2 (1,572 Hz ascending and 1,540 Hz descending) and its bandwidth (64 Hz).

In Fig. 2 the oscillograms of selected points in the vibrato cycle are adjusted in time by aligning the audio and supra signals at the moment of glottal closing ("cl"). Both the maximal amplitude of the wave (within a larger time segment) and its dominant four-part structure allow us, by visual inspection, to locate the point of SPLmax within a narrow time range. Supraglottal pressure is given here without amplitude calibration, essentially as a microphone signal.

Figure 3 shows 2.5 cycles (note the different time scale) of vibrato in the top panel and similar SPL signals to those of Fig. 1, presenting five sweeps past a dominant F2. As with the tenor, the center frequency of the formant is higher in the ascending phase. Since in this phonation it is the third harmonic that sweeps the formant (see the oscillograms in Fig. 4), the F0 values must be multiplied by 3: F2 is 1,035 Hz ascending and 1,005 Hz descending; bandwidth is 45 Hz.
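
The scaling from fundamental-frequency readings to formant values described for Figs. 1 and 3 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' procedure: the function name and dictionary keys are hypothetical, and the F0 readings used below (393 and 385 Hz for the tenor; 345 and 335 Hz and a 15 Hz bandwidth for the bass-baritone) are back-calculated here from the formant values quoted in the text rather than read from the figures.

```python
def formant_from_sweep(f0_ascending_hz, f0_descending_hz,
                       bandwidth_at_f0_hz, harmonic):
    """Scale F0-level readings by the number of the harmonic that sweeps the formant.

    Hypothetical helper: the inputs are the F0 values at the SPL peaks of the
    ascending and descending vibrato phases, and the -3 dB width read on the
    F0 curve.
    """
    ascending = harmonic * f0_ascending_hz
    descending = harmonic * f0_descending_hz
    mean_formant = (ascending + descending) / 2.0
    return {
        "formant_ascending_hz": ascending,
        "formant_descending_hz": descending,
        "bandwidth_hz": harmonic * bandwidth_at_f0_hz,
        # Shift between the two phases, relative to their mean.
        "relative_shift": (ascending - descending) / mean_formant,
    }


# Tenor (fourth harmonic sweeps F2); F0 readings are back-calculated assumptions.
tenor = formant_from_sweep(393.0, 385.0, 16.0, harmonic=4)

# Bass-baritone (third harmonic sweeps F2); all three readings are back-calculated.
bass_baritone = formant_from_sweep(345.0, 335.0, 15.0, harmonic=3)

for name, result in (("tenor", tenor), ("bass-baritone", bass_baritone)):
    print(name, result)
```

Because the harmonic number multiplies every reading, any error in locating the SPL peak on the F0 curve is scaled by the same factor in the formant estimate.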
[Figure 3 panels: frequency; audio, unfiltered; audio, filtered; x-axis: Time (sec)]
FIG. 3. Signals similar to those of Fig. 1. This time the segment is the vowel /O/ on the pitch F4 sung by a bass-baritone subject. Note the difference from Fig. 1 in time scale (filter characteristics: order 100, 0.9-1.2 kHz bandpass Blackman filter).

FIG. 4. Oscillograms, similar to those of Fig. 2, taken from points marked with letters (S, R, Q) in Fig. 3. cl, glottal closing.

The extent to which the vibrato modulation sweeps the bandwidth has important implications for the accuracy of measurement of formant qualities. The double sweep in the F2 signals presented here affords the determination, with a high degree of precision, of the effective formant frequency; a direct measurement of the less precise parameter, bandwidth, subject to reservations noted; and clear evidence of the correlation of the tuned moment in the vocal tract with the moment of high SPL in the radiated sound.

As we have argued elsewhere (2), a complete description of the modulation of sound in singers' vibrato must consider not only the varying proximities of harmonics to formants, but also vibrato-based modulations of the voice source spectrum. The difficulty of distinguishing between source factors and vocal tract (resonance) effects makes any analysis of vibrato complex. [For a theoretical overview of the problem, see Rothenberg et al. (3)]. A list of modulating factors should include for voice source: intensity (with varying subglottal pressure) and source spectrum (with varying closing speed and closed/open quotient); and for vocal tract: physical movement in vibrato; and varying bandwidth with closed quotient.

These have a relatively small effect in the case of our F2 examples. Rothenberg et al. (3) found that the greatest variations in the voice source are correlated with the maxima and minima of frequency modulation, while the resonant points in our F2 examples occur near the center of the cycle. The vi- [...] (estimated from esophageal pressure) does not exceed 3% of the mean pressure, and we disregard the [...] vibrato is a consistent feature of both F2 examples given here, and it attests, incidentally, to the precision of the measurements. We attribute this difference to vibrato-based movement of the vocal tract, for example, vertical oscillation of the larynx. Bandwidth measurements varied within a narrow range and showed no more than a marginal difference between ascending and descending phases.

F1 dominance

Figure 5 is composed similarly to Figs. 1 and 3. This time the signals come from two pitches (three vibrato cycles each) of an ascending scale sung by the tenor. The filter is chosen, however, to isolate the harmonic (H2) that engages the dominant F1 resonance. There is no complete sweep of the formant, and the SPL curves, at least at the beginning of the segment shown, closely follow the F0 curve. The indicated dominance of F1 is confirmed in the oscillograms (Fig. 6).

If we make the plausible assumption that F1 is reached at the peak of the first vibrato cycle, it is possible to measure a half-bandwidth. Extrapolating from the 12 Hz measured (24 Hz for the full width × 2 for H2) gives 48 Hz, also a plausible value. However, there are problems in taking such an approach.

In the case of the signals with F1 dominance, the [...]

[Figure 5 panels include: audio, filtered F1; x-axis: Time (sec)]
FIG. 5. Signals similar to Fig. 1, giving the segment Eb4-F4 of a scale on the vowel /a/ sung by the tenor subject. Each of the two pitches gets three cycles of vibrato. Here the filter is designed to select the dominant second harmonic, which is resonated by F1. See text for discussion of formant frequency and bandwidth (filter characteristics: order 100, 0.5-0.8 kHz bandpass Blackman filter).
[...] data leave little doubt that H2 passes F1 in the move from F to G (marked by the solid vertical line). Typical of tones characterized by F1-H2 dominance is the fact that the harmonic does not pass the formant in vibrato, but approaches it as an upper limit.

In Figs. 1 and 3, the magnitude of the shift of F2 frequency measured between points P and Q is on the order of 2.5%, even near the center of the vibrato sweep. The full extent of the (peak-to-peak) formant frequency shift in vibrato is likely to be considerably greater; the method presented here, however, does not allow us to estimate this accurately.

In Figs. 7 and 8, although F1 appears to be reached already at the top of the vibrato cycle on Eb4, the whole of the pitch F4 remains under F1 dominance. This scale segment is presented as a particularly vivid illustration of the move from F1 dominance to F2 dominance. Other comparable scale segments may show it less conspicuously or not at all, but it is not surprising that singers develop names for the difference in feel (and sound) between the different dominances. Our figure gives two terms, "chest" and "head," which have sometimes been applied to this transition.

[Figure 8 panel title: TENOR, Eb4-A4, /a/, SUPRA; filtered panel labeled supra 900-2200 Hz; x-axis: Time (sec)]
FIG. 8. The same scale segment as Fig. 7, this time with the SPL signals taken from the supraglottal pressure. The filtered signals follow similar curves to those characterizing audio. The portion of unfiltered Psupra to the right of the vertical line, however, differs from unfiltered audio in that it effectively reduces the prominent contribution of the singer's formant at the peaks of the vibrato cycle.

Comparing the various signals for the identical phonation gives rise to further general observations. The effective high-frequency emphasis in the audio signal is the most characteristic feature of the difference between audio and supra oscillograms. This is most evident in the F0 max of Fig. 9, where a strong singer's formant (at the eighth harmonic) dominates the audio, appearing much weaker in supra. The Psupra signal, on the other hand, is presumably free from the influence of the room acoustics, which may be a significant factor in audio SPL.

[Figure 9 labels: P - F0 max; Q - F2 SPL max; traces: audio, supra; cl, glottal closing]
FIG. 9. Oscillograms similar to those of Fig. 2, taken from points marked with letters in Figs. 7 and 8. The signals show the evident dominance of the third harmonic at F2 max (indicated with B), and the strong singer's formant on the eighth harmonic is conspicuous in the microphone signal at A. cl, glottal closing.

Comparison of various signals taken simultaneously brings us to the heart of the matter regarding the extraction of information from signals that elude precise quantification, such as those pertaining to F1 in this study. F1 dominance, especially where it resonates the second harmonic, is a conspicuous feature of a sung phonation and can be easily followed in real time by monitoring oscillograms, as well as spectrum analysis. Feedback is salient, and one can learn to associate the perceived complex "signal" (auditory and proprioceptive) with the characteristic oscillogram. Careful comparison of signals taken simultaneously leads to the ability to predict one signal on the basis of another and even to predict signals on the basis of perception. Gradually a coherent picture of the various parameters emerges.

CONCLUSIONS

This study, more narrowly defined, leads us to the following conclusions: