The larynx and the glottal cycle

Look at Figure 1, which gives a picture of the larynx, or voice box. The larynx is located in the neck below the pharynx (throat), at the point where the passage divides into two descending tubes: the esophagus for food and liquid, and the trachea for air. The larynx sits at the top of the trachea.
It is thought that the mammalian larynx evolved as a sort of air valve. In humans the larynx has the following non-speech functions:
1. It prevents food and liquid from passing into the lower air passages.
2. If solid or liquid matter touches the larynx, the cough reflex attempts to expel it.
3. The larynx can be closed off to trap air in the lungs and stiffen the thorax for strenuous activities such as lifting and childbirth.
The vocal cords or folds are attached by cartilages to the laryngeal muscles, which exert very precise fine motor control over the placement and tension of the vocal cords.
The vocal cords can assume a variety of configurations (see Figure 2). In this course we will focus almost entirely on true voicing, as opposed to whispering, creaky voicing, or breathy voicing.
Voicing is best described as a kind of buzz. (Demonstration of a labial buzz.) It is cyclical, and according to the frequency of the cycle we perceive a person's voice as low, as at 100 Hz, or high, as at 400 Hz, two octaves higher. A trained soprano voice can go as high as 1000 Hz. The sound of the glottal buzz itself is not very interesting, but it is the source of energy for all of the voiced phonemes, as well as for all voiced non-speech sounds such as screams, humming, and the like.
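The relation between these frequencies can be checked with a little arithmetic. The following is a minimal sketch in Python; the function names are hypothetical and chosen only for illustration. It assumes the usual definition of an octave as a doubling of frequency, so the number of octaves between two frequencies is the base-2 logarithm of their ratio.

```python
import math

def octaves_between(f_low_hz, f_high_hz):
    """Number of octaves from f_low_hz up to f_high_hz (one octave = doubling)."""
    return math.log2(f_high_hz / f_low_hz)

def cycle_duration_ms(f_hz):
    """Duration of one cycle, in milliseconds, at a frequency of f_hz."""
    return 1000.0 / f_hz

# The three frequencies mentioned in the text: a low voice, a high voice,
# and the upper range of a trained soprano.
for f_hz in (100.0, 400.0, 1000.0):
    print(f"{f_hz:6.0f} Hz: one cycle lasts {cycle_duration_ms(f_hz):5.2f} ms, "
          f"{octaves_between(100.0, f_hz):4.2f} octaves above 100 Hz")
```

Because 400 Hz is 100 Hz doubled twice, the sketch reports exactly two octaves for that pair, in agreement with the text.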
A buzz is the result of a cyclical series of events; in the case of the labial buzz, the events are the forcing open of the lips by the compressed air in the mouth, the escape of a puff of air, and the reclosing of the lips in preparation for the next cycle. Since the glottis is the name for the space between the vocal cords, voicing can be defined as a glottal buzz. The glottal cycle is very similar to the labial buzz cycle; it proceeds through the following phases (Figure 3):
1. Closure: The vocal folds are brought into the closed position by the laryngeal muscles.
2. Pressure build-up: The expiratory flow of air from the lungs continues, compressing the air immediately below the vocal folds.
3. Blowout: When the pressure of the compressed air exceeds the retaining force of the laryngeal muscles, the vocal cords are blown apart in a very brief instant.
4. Return to closure: The vocal cords are reclosed by two forces, the elastic recoil of the folds themselves and the Bernoulli effect of the air rushing through the vocal aperture.
Note that the glottal cycle is not driven by a separate muscular action for each repetition of the cycle. Instead, the speaker maintains an appropriate muscular tension and lets the passage of air cause the rhythmic vibrations.
Viewed from the articulatory perspective, then, voicing is present in any speech sound produced with the repeated opening and closing of the vocal cords, with the release of a puff of air for each repetition of the cycle.
Look at Figure 4 for the waveform of pressure changes immediately above the vocal cords in the pharynx. The time measurements for the phases of the glottal cycle, which is assumed to last for 8.3 ms in this case, equivalent to a frequency of 120 Hz, are as follows:
1. Blowout - 2 ms
2. Reclosing - 3 ms
3. Closure - 3.3 ms
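The arithmetic behind these figures can be sketched in a few lines of Python. The phase durations are the ones listed above; the variable names are hypothetical, and the percentage breakdown is added here only as an illustration of how the cycle divides.

```python
# Phase durations of one glottal cycle, in milliseconds, as listed above.
PHASES_MS = [("blowout", 2.0), ("reclosing", 3.0), ("closure", 3.3)]

# The period is the sum of the phases; frequency is its reciprocal.
period_ms = sum(duration for _, duration in PHASES_MS)
f0_hz = 1000.0 / period_ms

print(f"period = {period_ms:.1f} ms, frequency = {f0_hz:.1f} Hz")
for name, duration in PHASES_MS:
    share = 100.0 * duration / period_ms
    print(f"{name:10s} {duration:4.1f} ms  {share:5.1f}% of the cycle")
```

The three phases sum to 8.3 ms, which gives a frequency of about 120.5 Hz, rounded to 120 Hz in the text.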
The variations in air pressure and velocity caused by the glottal cycle are the basis for voiced speech. For each cycle, glottal closure causes the excitation of the air mass in the vocal tract; this excitation is dying down as the next glottal closure occurs. The original sound waves which leave the glottis and move up the vocal tract have the waveform in Figure 4.
The mapping of the glottal waveform to the frequency domain is called the glottal spectrum. It is the source function, which defines how much of each frequency range is initially contained within the glottal waveform. The glottal spectrum contains practically all audible frequencies. Figure 5 shows the spectrum of the glottal waveform. Notice that there is a steady drop in the contribution of the higher frequencies. The glottal spectrum is similar to that of the sawtooth waveform or the square waveform; it has been synthesized, and the result is described as a harsh, monotonous buzzing noise. This combination of frequencies is the sound source for all voiced sounds.
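The steady drop in the higher frequencies can be illustrated with the sawtooth waveform mentioned above. The following is a minimal sketch in Python, not a model of the actual glottal waveform: it assumes an idealized sawtooth as a stand-in, borrows the 120 Hz fundamental from the timing example, and uses a direct discrete Fourier transform with hypothetical function names.

```python
import cmath
import math

def sawtooth_period(n_samples):
    """One period of an idealized sawtooth rising linearly from -1 toward +1."""
    return [2.0 * i / n_samples - 1.0 for i in range(n_samples)]

def harmonic_amplitudes(samples, n_harmonics):
    """Amplitudes of harmonics 1..n_harmonics, computed with a direct DFT."""
    n = len(samples)
    amplitudes = []
    for k in range(1, n_harmonics + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        amplitudes.append(2.0 * abs(coeff) / n)
    return amplitudes

F0_HZ = 120.0  # assumed fundamental, taken from the 8.3 ms cycle in the text

amps = harmonic_amplitudes(sawtooth_period(512), 8)
for k, amp in enumerate(amps, start=1):
    level_db = 20.0 * math.log10(amp / amps[0])
    print(f"harmonic {k} at {k * F0_HZ:5.0f} Hz: {level_db:6.1f} dB re fundamental")
```

For the sawtooth, the amplitude of each harmonic is inversely proportional to its number, so the second harmonic has half the amplitude of the fundamental (about 6 dB lower). The real glottal spectrum in Figure 5 need not fall at exactly this rate; the sketch only shows the general shape of such a spectrum.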
Before the glottal sound wave emerges from the lips, it is subject to a great deal of modification in the vocal tract. We can best understand what happens to speech in the mouth and nose by considering the speech signal in terms of its frequency components. One provisional definition of speech is that it is a durationally encoded, intonationally encoded, and frequency-encoded signal. Most of the denotational information in the speech signal is carried by the frequency-encoded part of speech, although tonal languages are an example of intonational encoding of denotation. How do human beings encode an abstract phonemic pattern upon sound? We have to understand something about resonance in order to answer that question.

Ed Kaiser
Sat Mar 15 00:01:27 PST 1997