- Try to display speech spectrograms at the same time scale whenever
possible. The resolution setting that I prefer is 1 pixel = 1 millisecond.
This allows 0.8 seconds of speech in the shapeutil waveform and
spectrogram window, which is convenient for short utterances. But once you
have cut the resolution by a factor of four, so that 3 or more seconds of
speech are visible, it is difficult to discern important features of the
signal. The shapeutil program will try to squeeze the entire
spectrogram into the visible window. For longer segments of speech, this
feature should be turned off by copying the file shapeutil.res from
~speech/class/bin/alpha to your current working directory
and then changing the resource SPECTRUMINWINDOW to 'No'.
DEMO of changing resolution
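The arithmetic behind these numbers can be sketched in a few lines of
Python. This is only an illustration, not part of shapeutil: the window
width of 800 pixels is an assumption inferred from the 0.8 seconds quoted
above, and the function name visible_seconds is hypothetical.

```python
def visible_seconds(window_px: int, ms_per_pixel: float) -> float:
    """Return how many seconds of speech fit in a window of the given width."""
    return window_px * ms_per_pixel / 1000.0


# Assumed window width, inferred from 0.8 s of speech at 1 ms per pixel.
WINDOW_PX = 800

# Compare the preferred setting with coarser resolutions.
for ms_per_pixel in (1, 2, 4):
    seconds = visible_seconds(WINDOW_PX, ms_per_pixel)
    print(f"{ms_per_pixel} ms/pixel shows {seconds:.1f} s of speech")
```

At 4 milliseconds per pixel the same window holds about 3.2 seconds, which
is the "3 or more seconds" case where detail becomes hard to see.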
- Make sure that the decibel setting is optimum for the current
spectrogram. In general, 45 dB is a good setting for the spectrograms
that we will be looking at, but some people talk more softly or loudly
than others, and they may have been seated closer to or further from the
microphone while recording. In some cases it is better to see more noise,
as long as the low-amplitude speech sounds can also be seen.
DEMO of changing dB value
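To see why the dB value trades noise against quiet speech sounds, here is a
rough Python sketch. It assumes that the decibel setting acts as a dynamic
range measured down from the strongest point in the spectrogram, so that
anything weaker is drawn at the floor. That interpretation, and the
function name to_display_db, are assumptions made for illustration and may
not match what shapeutil actually does.

```python
import math


def to_display_db(magnitudes, range_db=45.0):
    """Convert linear magnitudes to dB relative to the peak, clipped at -range_db.

    Assumption: the dB setting is a display range below the peak value.
    """
    floor = -range_db
    peak = max(magnitudes)
    if peak <= 0:
        # Nothing but silence: everything sits at the floor.
        return [floor for _ in magnitudes]
    levels = []
    for m in magnitudes:
        if m <= 0:
            levels.append(floor)
        else:
            db = 20.0 * math.log10(m / peak)
            levels.append(max(db, floor))  # hide anything below the floor
    return levels


# A loud vowel, a weaker consonant, and a very quiet sound (made-up values).
frame = [1.0, 0.1, 0.001]
print(to_display_db(frame, range_db=45.0))
print(to_display_db(frame, range_db=70.0))
```

Under this assumption the quietest value is 60 dB below the peak, so it is
clipped to the floor at 45 dB but becomes visible at 70 dB, along with any
background noise at a similar level.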
- Try first to get an overview of the utterance. How many syllables does
it have? Where are the voiced sections, the fricatives, and the plosives?
Look at both the waveform and the spectrogram when working out the
syllable structure of the word or phrase and when trying to find plosives.
DEMO of different words via the RANDOM WORD menu choice