Next: Diphthongs and Reduced Vowels
Up: The Acoustics of Vowels
Previous: Resonance
(NOTE: We do not include the diphthongs in the definition of the vowel
category. This class of sounds is more accurately termed the monophthongs,
but for now we will call them the vowels. The American English vowels that
we consider are, in Worldbet notation, the front vowels /i:/, /I/, /E/, /@/; the mid vowels
/I_x/, /3r/, /&/, /&r/; and back vowels /u/, /U/, />/, //,
/A/.)
- A vowel can be defined as a relatively long-lasting, unchanging
sound in which the oral tract (with help from the nasal tract in the case
of nasalized vowels such as in French) is kept relatively open from the
glottis to the lips, allowing the vocal tract to act as a resonator.
Remember from our discussion of the whisper that a vowel need not
be voiced, but we consider only voiced vowels from this point on.
- Vowels are relatively stable segments during which the articulators barely move.
They almost always carry the greatest energy in the speech signal, because
during vowel phonation the vocal tract is most open. Because of these
characteristics, vowels are probably the easiest speech category to recognize
in a spectrogram.
- What gives vowels their individual character is the existence of
a different set of formants in the spectrogram for each vowel.
Formants are those frequency ranges which emerge from the mouth and nose
with the greatest relative amplitude. From the above discussion
of resonance, formants may be recognized as the resonant frequencies
of the vocal tract. For all voiced sounds including vowels, it is usually
sufficient to look at the three lowest frequency formants to recognize the
phoneme. Those formants are labelled F1, F2, and F3.
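As a rough illustration of formants as resonances, the neutral vowel can be
approximated by a uniform tube closed at the glottis and open at the lips,
whose resonances fall at odd multiples of c/4L. The sketch below is only that
idealization. The tube length of 17.5 cm and the speed of sound of 350 m/s are
assumed textbook values, not measurements from this tutorial, and the function
name is hypothetical.

```python
def tube_resonances(length_m=0.175, speed_of_sound=350.0, count=3):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Quarter-wavelength model: F_n = (2n - 1) * c / (4 * L).
    length_m and speed_of_sound are assumed, idealized values.
    Results are rounded to the nearest Hz.
    """
    return [
        round((2 * n - 1) * speed_of_sound / (4.0 * length_m))
        for n in range(1, count + 1)
    ]


if __name__ == "__main__":
    for label, freq in zip(("F1", "F2", "F3"), tube_resonances()):
        print(label, freq, "Hz")
```

With these assumed values the function returns 500, 1500, and 2500 Hz for F1,
F2, and F3. A shorter vocal tract gives proportionally higher resonances.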
- For today's purpose, which is to gain some idea of the acoustic
characteristics of vowels, it will be sufficient to take as examples the
three so-called quantal vowels of American English plus the neutral
vowel. See Figure 8 for an idea of where the quantal vowels fit in the
vowel triangle (which is really a quadrilateral). The four vowels are
the following:
- /i:/ A high front vowel having a high-frequency
concentration of energy above 1800 Hz.
- /u/ A high back vowel having almost all its energy in the low
frequencies below 1000 Hz.
- /A/ A low vowel having a tight concentration of energy
in the mid-range of 800 Hz to 1800 Hz.
- /&/ A central vowel having a spread of energy among all
frequencies.
Figure 9 shows about 10 pitch periods for each of these four vowels.
Figure 10 shows the traditional spectrogram of the same four vowels, along
with a notion of where they fit in the vowel space. Figure 11 shows
these four vowels in 3-D form. Observe the location of the formants F1, F2,
and F3, which look like mountain peaks.
- The rules which acousticians use to predict the formant structure
of vowels can be summarized as follows:
- The area of the major constriction determines the location of F1; as
that area decreases, F1 also decreases. Contrast the small opening of /i:/
and /u/ with the widely open /A/ and the intermediate /&/.
- The distance from the glottis to the major constriction determines
the location of F2; as the distance increases, F2 also increases. Contrast
the high F2 of /i:/ with the low F2 of /u/. See Figure 11.
- For a given area of the major constriction and a given distance from the
glottis to it, lip rounding causes F1 and F2, especially F2, to fall. Thus, the
back vowel /u/ has a lower F2 than the second rule alone would predict, because
we round our lips when pronouncing this vowel.
The above rules are subject to further refinement.
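The first two rules can be checked qualitatively against typical formant
values. The sketch below is only an illustration under stated assumptions:
the (F1, F2) numbers for /i:/, /u/, and /A/ are commonly quoted approximate
averages for adult male speakers, the value for /&/ is the idealized
uniform-tube estimate, the constriction ranks are a coarse reading of the
first rule, and all names are hypothetical.

```python
# Approximate (F1, F2) values in Hz. Assumed illustrative numbers:
# /i:/, /u/, /A/ are commonly quoted adult-male averages; /&/ is the
# idealized neutral-tube estimate, not a measurement.
FORMANTS = {
    "i:": (270, 2290),
    "u": (300, 870),
    "A": (730, 1090),
    "&": (500, 1500),
}

# Assumed coarse rank of the major constriction's area (1 = smallest).
OPENNESS = {"i:": 1, "u": 1, "&": 2, "A": 3}


def f1_follows_openness(formants, openness):
    """First rule: a smaller constriction area should go with a lower F1."""
    for a in formants:
        for b in formants:
            if openness[a] < openness[b] and formants[a][0] >= formants[b][0]:
                return False
    return True


def f2_front_above_back(formants, front="i:", back="u"):
    """Second rule: the front vowel should have the higher F2."""
    return formants[front][1] > formants[back][1]


if __name__ == "__main__":
    print("F1 consistent with first rule:", f1_follows_openness(FORMANTS, OPENNESS))
    print("F2 consistent with second rule:", f2_front_above_back(FORMANTS))
```

Both checks hold for these assumed values. The third rule is not modeled
here, since the table does not separate out the effect of lip rounding.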
- The importance of the Peterson and Barney article, "Control
Methods Used in a Study of the Vowels", which appeared in 1952, is that it
provided one of the first solid insights into how a listener's classification
of a given vowel depends upon the formant frequencies in the speech signal.
In the study, 70 subjects listened to 1520 recorded words from the set
heed, hid, head, had, hod, hawed, hood, who'd, hud, heard as pronounced
by 76 different adult male, adult female, and child speakers. Many interesting
statistics emerged, but one of the main conclusions is provided by Figure
12, which plots the F1 formant value versus the F2 formant value of a number
of adult male and child speakers for the 10 vowels.
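The idea that a vowel can be identified from its position in the F1/F2 plane
can be sketched as a nearest-centroid lookup. This is a minimal sketch, not
the procedure used in the study: the centroids are commonly quoted
approximate averages for the adult male speakers and should be verified
against the paper, plain Euclidean distance in Hz is a simplifying
assumption, and the function name is hypothetical.

```python
import math

# Approximate average (F1, F2) in Hz for adult male speakers, keyed by
# the test word. Assumed values; check them against the published table.
CENTROIDS = {
    "heed": (270, 2290),
    "hid": (390, 1990),
    "head": (530, 1840),
    "had": (660, 1720),
    "hod": (730, 1090),
    "hawed": (570, 840),
    "hood": (440, 1020),
    "who'd": (300, 870),
    "hud": (640, 1190),
    "heard": (490, 1350),
}


def classify_vowel(f1, f2, centroids=CENTROIDS):
    """Return the word whose (F1, F2) centroid is closest to the input."""
    return min(
        centroids,
        key=lambda word: math.hypot(f1 - centroids[word][0],
                                    f2 - centroids[word][1]),
    )


if __name__ == "__main__":
    for f1, f2 in [(280, 2250), (700, 1100), (310, 900)]:
        print(f1, f2, "->", classify_vowel(f1, f2))
```

Because the regions for different vowels overlap across speakers, a lookup
like this will misclassify tokens that fall near the boundary between two
neighboring vowels.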
Ed Kaiser
Sat Mar 15 00:01:27 PST 1997