
Overall Hints

Try to display all speech spectrograms at the same time scale whenever possible. The resolution setting which I prefer is 1 pixel = 1 millisecond. This allows 0.8 seconds of speech in the shapeutil waveform and spectrogram window, which is convenient for short utterances. Once the resolution has been cut by a factor of four, however, so that 3 or more seconds of speech are visible, it becomes difficult to discern important features of the signal. The shapeutil program will try to squeeze the entire spectrogram into the visible window. For longer segments of speech, turn this feature off by copying the file shapeutil.res from ~speech/class/bin/alpha to your current working directory and then changing the resource SPECTRUMINWINDOW to 'No'.
DEMO of changing resolution
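The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration, not part of shapeutil: the 800-pixel window width is an assumption inferred from the statement that 1 pixel = 1 millisecond shows 0.8 seconds of speech, and the function name is hypothetical.

```python
def visible_seconds(window_width_px, ms_per_pixel):
    """Return how many seconds of speech fit across the display window."""
    return window_width_px * ms_per_pixel / 1000.0


# Assumed window width, inferred from "1 pixel = 1 ms shows 0.8 seconds".
WINDOW_WIDTH_PX = 800

for ms_per_pixel in (1, 2, 4):
    seconds = visible_seconds(WINDOW_WIDTH_PX, ms_per_pixel)
    print(f"{ms_per_pixel} ms per pixel: {seconds:.1f} s visible")
```

At 4 milliseconds per pixel the same window holds a little over 3 seconds, which is the point at which the text warns that fine detail becomes hard to see.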
Make sure that the decibel setting is optimal for the current spectrogram. In general, 45 dB is a good setting for the spectrograms that we will be looking at, but some people talk more softly or loudly than others, and they may have been seated closer to or farther from the microphone while recording. In some cases it is better to let more noise show so that the low-amplitude speech sounds can also be seen.
DEMO of changing dB value
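One common way to think about a decibel setting is as a dynamic range below the loudest point in the spectrogram: anything weaker than that range is drawn as background. The sketch below assumes that interpretation. It is not taken from shapeutil, whose internal scaling is not described here, and the function name and sample values are hypothetical.

```python
import math


def to_display_db(power, range_db=45.0):
    """Convert power values to dB relative to the peak, clipped at -range_db.

    `power` is a list of rows (one per time frame) of non-negative power
    values. Anything more than `range_db` below the peak is set to the floor.
    """
    peak = max(max(row) for row in power)
    if peak <= 0.0:
        raise ValueError("spectrogram contains no energy")
    floor = -range_db
    result = []
    for row in power:
        out_row = []
        for p in row:
            if p <= 0.0:
                out_row.append(floor)  # log of zero is undefined; use the floor
            else:
                out_row.append(max(10.0 * math.log10(p / peak), floor))
        result.append(out_row)
    return result


# Made-up power values: a strong vowel peak down to very weak energy.
frames = [[1.0, 1e-2], [1e-4, 1e-6]]
print(to_display_db(frames, range_db=45.0))
print(to_display_db(frames, range_db=30.0))
```

With a narrower range, weak components such as the third value collapse into the background; widening the range brings them back along with more of the noise.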
Try first to get an overview of the utterance. How many syllables does it have? Where are the voiced sections, the fricatives, and the plosives? Look at both the waveform and the spectrogram while you work out the syllable structure of the word or phrase and while you look for plosives.
DEMO of different words via the RANDOM WORD menu choice


Ed Kaiser
Sat Mar 15 00:01:27 PST 1997