Next: Resonance
Up: The Acoustics of Vowels
Previous: Major speech sound categories
- Figure 1 shows the larynx, or voice box. It is located in the neck below
the pharynx (throat) and above the point where the passage divides into two
descending tubes: the esophagus for food and liquid, and the trachea for air.
- It is thought that the mammalian larynx evolved as a sort of air
valve. In humans the larynx has the following non-speech functions:
- It prevents food and liquid from passing into the lower air passages.
- If solid or liquid matter touches the larynx, the cough reflex
attempts to expel it.
- The larynx can be closed off to stiffen the trachea for strenuous
activities such as lifting and childbirth.
- The vocal cords or folds are attached by cartilages to the laryngeal muscles,
which exert very precise fine motor control over the placement and tension
of the vocal cords.
- The vocal cords can assume a variety of configurations (see Figure 2).
In this course we will focus almost entirely on true voicing, as opposed
to whispering, creaky voicing, or breathy voicing.
- Voicing is best described as a kind of buzz. (Demonstration of a labial buzz.)
It is cyclical, and according to the frequency of the cycle we perceive a person's
voice as low, for example at 100 Hz, or high, for example at 400 Hz, two octaves
higher. A trained soprano voice can go as high as 1000 Hz. The sound of the glottal
buzz itself is not very interesting, but it is the source of energy for
all of the voiced phonemes, as well as for all voiced non-speech sounds such
as screams, humming, and the like.
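- The relation between the frequencies quoted above, their cycle durations, and
the musical interval between them can be checked with a few lines of Python. This
is only an illustrative sketch; the function names `period_ms` and
`octaves_between` are invented here and are not part of the course material.

```python
import math

def period_ms(frequency_hz):
    """Duration of one cycle in milliseconds."""
    return 1000.0 / frequency_hz

def octaves_between(low_hz, high_hz):
    """Octaves from low_hz up to high_hz (one octave is a doubling)."""
    return math.log2(high_hz / low_hz)

# The three frequencies mentioned in the text.
for f in (100, 400, 1000):
    print(f"{f} Hz: period {period_ms(f):.2f} ms, "
          f"{octaves_between(100, f):.2f} octaves above 100 Hz")
```

Doubling the frequency twice (100 Hz to 200 Hz to 400 Hz) gives the two octaves
mentioned above.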
- A buzz is the result of a cyclical series of events; in the case of
the labial buzz, the events are the forcing open of the lips by the compressed
air in the mouth, the escape of a puff of air, and the reclosing of the lips
in preparation for the next cycle. Since the glottis is the name for
the space between the vocal cords, voicing can be defined as a glottal buzz.
The glottal cycle is very similar to the labial buzz cycle; it proceeds
through the following phases (Figure 3):
- Closure: The vocal folds are brought into the closed position by the
muscular energy of the laryngeal muscles.
- Pressure build-up: The expiratory flow of air from the lungs
continues, forcing compression of the air immediately below the vocal folds.
- Blowout: When the pressure of the compressed air exceeds the
retaining force of the laryngeal muscles, the vocal cords are blown apart in
a very brief instant of time.
- Return to closure: Two forces work for the reclosing of the vocal
cords: the elastic recoil of the folds themselves, and the Bernoulli effect
of the rushing air through the vocal aperture.
- Note that the glottal cycle is not driven by a separate muscular
contraction for each repetition of the cycle. Instead, the speaker maintains
an appropriate muscular tension and lets the passage of air cause the
rhythmic vibrations.
- Viewed from the articulatory perspective, then, voicing is present in any
speech sound that involves the repeated opening and closing of the vocal
cords, with the release of a puff of air for each repetition of the cycle.
- Look at Figure 4 for the waveform of pressure changes immediately above the
vocal cords in the pharynx. The time measurements for the phases of the
glottal cycle, which is assumed to last for 8.3 ms in this case, equivalent
to a frequency of 120 Hz, are as follows:
- Blowout - 2 ms
- Reclosing - 3 ms
- Closure - 3.3 ms
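- The arithmetic behind these figures can be sketched as follows. The phase
durations are the ones listed above; the names `phases_ms`, `cycle_period_ms`,
and `frequency_hz` are hypothetical and chosen only for this illustration.

```python
# Phase durations in milliseconds, as listed for Figure 4.
phases_ms = {"blowout": 2.0, "reclosing": 3.0, "closure": 3.3}

def cycle_period_ms(phases):
    """Total duration of one glottal cycle: the sum of its phases."""
    return sum(phases.values())

def frequency_hz(period_ms):
    """Cycles per second for a cycle lasting period_ms milliseconds."""
    return 1000.0 / period_ms

period = cycle_period_ms(phases_ms)
f0 = frequency_hz(period)
print(f"period: {period:.1f} ms, frequency: {f0:.1f} Hz")

# Share of the cycle spent in each phase.
for name, duration in phases_ms.items():
    print(f"{name}: {duration} ms ({100.0 * duration / period:.0f}% of the cycle)")
```

The three phases add up to the 8.3 ms cycle, and 1000 / 8.3 is about 120, which
is where the 120 Hz figure comes from.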
- The variations in air pressure and velocity caused by the glottal cycle
are the basis for voiced speech. For each cycle, glottal closure causes
the excitation of the air mass in the vocal tract; this excitation is dying
down as the next glottal closure occurs. The original sound waves which
leave the glottis and move up the vocal tract have the waveform in Figure 4.
- The mapping of the glottal waveform to the frequency domain is called the
glottal spectrum. It is the source function, which specifies how much of each
frequency range is initially contained within the glottal waveform. The glottal
spectrum contains practically all frequencies which are audible. Figure 5
shows the spectrum of the glottal waveform. Notice that there is
a steady drop in the contribution of the higher frequencies. The glottal
spectrum is similar to that of the sawtooth waveform or the square waveform;
it has been synthesized and is described as a harsh, monotonous buzzing noise.
This combination of frequencies is the sound source for all voiced sounds.
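- The steady drop in the higher frequencies can be illustrated with the sawtooth
comparison made above. The sketch below is not a model of the real glottal
waveform in Figure 4: it uses an idealized sawtooth as a stand-in, assumes a
120 Hz fundamental, and measures the first eight harmonics with a direct
discrete Fourier transform. The function names are invented for this example,
and the actual glottal spectrum may fall off at a different rate.

```python
import cmath
import math

def sawtooth(n_samples):
    """One period of an idealized sawtooth ramping from -1 up towards 1."""
    return [2.0 * k / n_samples - 1.0 for k in range(n_samples)]

def harmonic_amplitude(samples, harmonic):
    """Amplitude of one harmonic, from a direct DFT over a single period."""
    n = len(samples)
    total = sum(x * cmath.exp(-2j * math.pi * harmonic * k / n)
                for k, x in enumerate(samples))
    return 2.0 * abs(total) / n

F0_HZ = 120.0  # assumed fundamental frequency, matching the 8.3 ms cycle

wave = sawtooth(1024)
amps = [harmonic_amplitude(wave, h) for h in range(1, 9)]

# Print each harmonic's level relative to the fundamental.
for h, amp in enumerate(amps, start=1):
    level_db = 20.0 * math.log10(amp / amps[0])
    print(f"harmonic {h} ({h * F0_HZ:.0f} Hz): {level_db:6.1f} dB")
```

For the sawtooth, the amplitude of harmonic n is proportional to 1/n, so each
doubling of the harmonic number lowers the level by about 6 dB. Every harmonic
is present, but the higher ones contribute progressively less.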
- Before the glottal sound wave emerges from the lips, it is subject
to a great deal of modification in the vocal tract.
We can best understand what happens to speech in the mouth and nose by
considering the speech signal in terms of its frequency components. One
provisional definition of speech is that it is a durationally encoded,
intonationally encoded, and frequency-encoded signal. Most of the
denotational information in the speech signal is carried by the
frequency-encoded part of speech, although the tonal languages are an
example of intonational encoding of denotation. How do human beings encode
an abstract phonemic pattern upon sound? We have to understand something
about resonance in order to answer that question.
Ed Kaiser
Sat Mar 15 00:01:27 PST 1997