Week 9

Unlike the other assignments, you have until Monday 1 December (2 1/2 weeks) to complete this one.

Automatic evaluation of Mandarin tone pronunciation.

An obvious way to evaluate a student's pronunciation is to compare it to a stored or produced reference example. If the differences are too large, the utterance is rejected.

The evaluator will be built using the Mandarin voice of eSpeak (-v zh) to generate the reference examples.

Procedure:
(Note that the To DTW... command for Pitch objects is hidden by default. To make it visible, go to "Praat->Preferences->Buttons...->Actions N-Z->Pitch(2): To DTW..." and change it to shown.)

  1. Generate reference utterance:
    espeak -v zh "shuo1 hao3 zhong1 wen2" -w reference.wav
  2. Generate test utterance:
    Record it, or use eSpeak with a deliberate tone error (here hao2 instead of hao3):
    espeak -v zh -g 50 -p 100 -s 100 "shuo1 hao2 zhong1 wen2" -w test.wav
  3. Read the wav files into Praat
  4. Calculate the Pitch of both test and reference utterances
  5. Normalize the reference utterance to obtain the same mean and standard deviation as the test utterance. (reject if the standard deviation is too small)
    Either:
    - Use Modify->Formula... "(self - MeanRef)*(SDtest/SDref) + MeanTest"
    - Resynthesize the reference with the new mean pitch and standard deviation
    NOTE: both can be done in Hz and in Semitones
  6. Select both the test and the normalized reference pitches -> To DTW... (fix start and end, no restrictions)
  7. Query for the final distance.
Compare the distances obtained with the above procedure to the distances obtained for incorrect test utterances generated with eSpeak, e.g., "shuo1 hao4 zhong1 wen2" or "shuo3 hao4 zhong1 wen4". Generate these at different speeds and pitches, and compare each of them to the same reference utterance.
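
Steps 5 to 7 can also be sketched outside Praat, which may help in understanding what the normalization and the DTW distance do. The Python sketch below is an illustration under stated assumptions, not Praat's implementation: it assumes the pitch contours are plain lists of voiced F0 values (unvoiced frames already removed), it uses the absolute difference as frame cost, and it divides the path cost by the summed lengths. The function names, the 100 Hz semitone reference, and the min_sd cut-off are choices made for this example. Praat's To DTW... for Pitch objects uses its own frame cost, so the absolute numbers will likely differ.

```python
import math
import statistics


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert F0 values in Hz to semitones relative to ref_hz (assumed reference)."""
    return [12.0 * math.log2(f / ref_hz) for f in f0_hz]


def normalize_reference(ref, test, min_sd=1e-6):
    """Rescale ref to the mean and SD of test:
    (self - MeanRef) * (SDtest / SDref) + MeanTest
    """
    mean_ref, sd_ref = statistics.fmean(ref), statistics.pstdev(ref)
    mean_test, sd_test = statistics.fmean(test), statistics.pstdev(test)
    if sd_ref < min_sd or sd_test < min_sd:
        # Step 5: reject if the standard deviation is too small.
        raise ValueError("standard deviation too small; reject utterance")
    return [(x - mean_ref) * (sd_test / sd_ref) + mean_test for x in ref]


def dtw_distance(a, b):
    """Plain DTW with fixed start and end and no slope restrictions.

    Frame cost is |a[i] - b[j]|; the total path cost is divided by
    len(a) + len(b) so that longer utterances are not penalized.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m] / (n + m)
```

Calling normalize_reference on the reference and test contours and then dtw_distance on the result and the test contour gives one distance per utterance pair. To work in semitones instead of Hz, pass both contours through hz_to_semitones first.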

Try to find out what kinds of errors can be detected this way, using several four-syllable phrases. What are good boundaries for "bad" pronunciation? Why?
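
One possible way to look for a boundary, once distances for correct and incorrect utterances have been collected, is to try every cut-off between neighbouring distances and keep the one with the fewest misclassifications. The Python sketch below shows that idea only; the function name and the "reject if distance > threshold" rule are assumptions for this example, and other criteria (for instance weighting false rejections more heavily) may suit the task better.

```python
def pick_threshold(correct, incorrect):
    """Return (threshold, errors) for the cut-off with the fewest misclassifications.

    correct:   DTW distances of utterances with the right tones
    incorrect: DTW distances of utterances with tone errors
    An utterance is rejected when its distance is greater than the threshold.
    Returns None if there are fewer than two distinct distances.
    """
    values = sorted(set(correct) | set(incorrect))
    # Candidate cut-offs: midpoints between neighbouring observed distances.
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    best = None
    for t in candidates:
        false_rejects = sum(d > t for d in correct)
        false_accepts = sum(d <= t for d in incorrect)
        errors = false_rejects + false_accepts
        if best is None or errors < best[1]:
            best = (t, errors)
    return best
```

If the two sets of distances overlap heavily, the returned error count stays high, which is itself an answer to the question of which errors this method can and cannot detect.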

Example sentences and a translator can be found at the MDBG Chinese English dictionary http://us.mdbg.net/chindict/chindict.php
(note that this dictionary uses a 5 to indicate the neutral tone)

Chinese examples:
shuo1 hao3 zhong1 wen2 - Speak Good Chinese
bei3 jing1 da4 xue2 - Beijing University
xue2 sheng1 hen3 mang2 - Students are busy
chi1 he1 wan2 le4 - Eat drink and be merry
qin1 peng2 hao3 you3 - Friends and family
zi4 xing2 che1 sai4 - Bicycle race
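
Incorrect test utterances for these phrases can be generated systematically by changing one tone digit at a time. The Python sketch below is one way to enumerate such variants; it assumes every syllable ends in a single tone digit, as in the examples above, and the function name is made up for this illustration.

```python
def tone_variants(phrase, tones="1234"):
    """Yield every phrase that differs from `phrase` in exactly one tone digit.

    Assumes space-separated pinyin syllables that each end in a tone digit.
    """
    syllables = phrase.split()
    for i, syllable in enumerate(syllables):
        base, tone = syllable[:-1], syllable[-1]
        for t in tones:
            if t != tone:
                changed = syllables[:i] + [base + t] + syllables[i + 1:]
                yield " ".join(changed)
```

For a four-syllable phrase and four tones this gives 12 variants, each of which can be passed to eSpeak as a test utterance.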

Use Praat scripting to automate the above procedure. That is, from an input list of four-syllable Chinese (pinyin) phrases:
  1. Select a phrase
  2. Call eSpeak and generate the reference phrase
  3. Read it and play it to the subject
  4. Record, generate, or read the test phrase
  5. Normalize the reference phrase
  6. Use DTW to determine the distance
  7. Give feedback
  8. Clean up
  9. Pause and next phrase
(see: Praat help or http://www.fon.hum.uva.nl/david/ba_spc/2008/scripting.pdf)
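
The assignment itself asks for a Praat script. Purely as an illustration of step 2, the Python sketch below builds the eSpeak command line for a phrase, using only the options that appear in the procedure above (-v, -g, -p, -s, -w). The function name and its parameters are hypothetical, and the command is only constructed here, not executed.

```python
def espeak_command(phrase, wav_path, voice="zh", gap=None, pitch=None, speed=None):
    """Build the argument list for an eSpeak call that writes `phrase` to a wav file.

    gap, pitch and speed map to the -g, -p and -s options and are left
    out when None, so eSpeak uses its defaults.
    """
    args = ["espeak", "-v", voice]
    for flag, value in (("-g", gap), ("-p", pitch), ("-s", speed)):
        if value is not None:
            args += [flag, str(value)]
    args += [phrase, "-w", wav_path]
    return args


phrases = ["shuo1 hao3 zhong1 wen2", "bei3 jing1 da4 xue2"]
# One reference command per phrase; file names are placeholders.
commands = [espeak_command(p, f"reference_{i}.wav") for i, p in enumerate(phrases)]
```

With eSpeak installed, each argument list could be run with subprocess.run(args, check=True); passing a list rather than a single string avoids shell quoting problems with the spaces in the phrase.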

Scripts released under the GNU GPL will be published on the SpeakGoodChinese web site.