Using Praat to synthesize speech from Vocal Tract Area functions

New functionality was introduced in Praat 5.3.14 and is still under development. Future versions will allow the direct creation of Vocal Tract Tiers from LPC objects, and LPC filtering of (source) sounds whose sampling frequencies and durations differ from those of the LPC analysis. Be aware that this code has not been tested extensively, so expect bugs.

Links to Real Time MRI and articulatory synthesis

Examples of manipulations using Vocal Tract Area functions in Praat

In Praat it is possible to calculate a vocal tract area function that is equivalent to a certain (vowel) sound. The sound can then be resynthesized using the calculated vocal tract area function as a filter. The vocal tract area functions can be manipulated and modified before resynthesis. The list below gives example vocal tract area functions of sustained /a/, /i/, and /y/, together with the resynthesized sounds.

Female speaker

Blend two vocal tracts: paste the lips of an /y/ onto the vocal tract of an /i/. That is, append the last four segments of the /y/ to the /i/ vocal tract function, adapt the length, etc.:

Male speaker

Blend two vocal tracts: paste the lips of an /y/ onto the vocal tract of an /i/. That is, append the last two segments of the /y/ to the /i/ vocal tract function, adapt the length, etc.:

Attaching measured areas to Vocal Tract Area functions in Praat

Take measured areas from MRI slices of the lips and attach them to an existing Vocal Tract Area function. Start with the recordings of /i/ and /y/ of the female speaker above. The areas of her lips were determined from an MRI image, starting at the teeth (X=0) and going outward. Only every third slice was used. The slice thickness was 1.4064 mm, and each area value is positioned at the midpoint of its slice. All values are converted to meters (positions) and square meters (areas).
X (m)       /i/ (m2)     /y/ (m2)     /a/ (m2)
0.0007032   0.00024051   0.00017821   0.00062801
0.0049224   0.000366     0.00012811   0.00043362
0.0091416   0.00035899   0.00008623   0.00037098
0.0133608   -            0.00001303   0.00039874
0.01758     -            -            0.00037381
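
The X positions in the table above can be reproduced from the slice thickness. The following Python sketch assumes that slices are numbered from 0 at the teeth and that slices 0, 3, 6, ... were the ones used; the function name and the numbering are illustrative assumptions, not part of Praat.

```python
SLICE_THICKNESS_M = 1.4064e-3  # slice thickness from the text, in meters

def slice_midpoints(n_values, step=3, thickness=SLICE_THICKNESS_M):
    """Midpoint positions (m) of every `step`-th slice, counted outward from the teeth (X=0)."""
    return [(k * step + 0.5) * thickness for k in range(n_values)]

# Five positions, as needed for the /a/ column
positions = slice_midpoints(5)
print([round(x, 7) for x in positions])
```

The printed values should match the X column of the table.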
Start with the original recorded vowels /i/ and /y/ from the female voice. Convert them via LPC -> VocalTract, using order 30 and length 0.17 m for the /i/ VocalTract and order 32 and length 0.1756 m for the /y/ VocalTract. Replace the last three sections of the original /i/ VocalTract with the table values for /i/ and, separately, with the table values for /y/. For the /y/ table values, change the number of sections of the resulting VocalTract to 31 and its length to 0.1756 m. Do the same for the last four sections of the original /y/ VocalTract; here, for the /i/ table values, the number of sections is reduced to 31 and the length to 0.17 m. The two original and four new VocalTracts can then be resynthesized as was done above.
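
The replacement of the lip sections can be sketched in Python as a simple list operation. This is only an illustration of the bookkeeping, assuming the area values are ordered from glottis to lips; it does not call Praat, the function name is hypothetical, and the 30-section /i/ tract below is a placeholder rather than real LPC output.

```python
def replace_lips(areas, n_drop, lip_areas):
    """Return a new area list: drop the last `n_drop` sections and append `lip_areas`."""
    if not 0 <= n_drop <= len(areas):
        raise ValueError("n_drop must be between 0 and len(areas)")
    return list(areas[:len(areas) - n_drop]) + list(lip_areas)

# Measured lip areas (m2) taken from the table above
lips_i = [0.00024051, 0.000366, 0.00035899]
lips_y = [0.00017821, 0.00012811, 0.00008623, 0.00001303]

# Placeholder 30-section /i/ tract; real values would come from the LPC analysis
tract_i = [1e-4] * 30

i_with_i_lips = replace_lips(tract_i, 3, lips_i)  # stays at 30 sections
i_with_y_lips = replace_lips(tract_i, 3, lips_y)  # grows to 31 sections
print(len(i_with_i_lips), len(i_with_y_lips))     # 30 31
```

After such an edit, the number of sections and the total length stored with the VocalTract still have to be updated by hand, as described in the text.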

Starting with the original recorded /i/

Starting with the original recorded /y/

From Vocal Tract area functions to speech

The following table explains how to get from a Vocal Tract to a synthetic sound. Synthesis requires a "Source" sound that drives the Vocal Tract filter. In normal speech, the source sound is produced by the vocal folds in the larynx (voice box). You can generate a source as specified below. Note that the sampling frequency of the source sound, in kHz, has to equal the number of segments in the Vocal Tract. For instance, if you have 40 segments (tubes), you need a source sampled at 40 kHz. Use the Praat Resample... function to perform the resampling. The duration of the Vocal Tract Tier must be exactly the same as the duration of the Source sound. The examples below use a duration of 3 seconds. The audio example is 5 seconds long.
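
The two constraints on the source sound can be written down as a small check. The Python sketch below only encodes the rules stated in the paragraph above (one kHz of sampling frequency per segment, equal durations); the function names are hypothetical and nothing here calls Praat.

```python
def required_source_rate_hz(n_segments):
    """Sampling frequency (Hz) required by the rule in the text: one kHz per segment."""
    return n_segments * 1000

def check_source(n_segments, source_rate_hz, tier_duration_s, source_duration_s, tol=1e-9):
    """Return a list of problems; the list is empty when the source fits the Vocal Tract Tier."""
    problems = []
    needed = required_source_rate_hz(n_segments)
    if source_rate_hz != needed:
        problems.append(f"resample source from {source_rate_hz} Hz to {needed} Hz")
    if abs(tier_duration_s - source_duration_s) > tol:
        problems.append("tier and source durations differ")
    return problems

# A 44.1 kHz source for a 40-segment tract needs resampling to 40 kHz
print(check_source(40, 44100, 3.0, 3.0))
```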

Here is an example generated by determining the vocal tract area function at a point in a recorded /a/ and one at a corresponding point in a recorded /i/ from the same speaker. The voice source signal is entirely synthetic.

To test the synthesis, you can use the standard vocal tracts in Praat or create a Vocal Tract from recorded speech. The standard phone Vocal Tracts can be created in Praat from New->Articulatory synthesis->Create Vocal Tract from phone... . To create a Vocal Tract from recorded speech, simply read in the recording and convert it to LPC with the Formants & LPC -> LPC (autocorrelation)... options. Enter the number of segments you want in your Vocal Tract as the prediction order. Then use To VocalTract (slice)... to generate the Vocal Tract object. Save it with Save->Save as short text file... . Note that there is a rather convoluted relationship between the LPC prediction order, the sample frequency, the recorded sound and the quality of the resulting LPC model.

You can download Praat from

Step | Menu command | Script command

Synthesize sound
Read VocalTract file | Open->Read from file... | Read from file... a.VocalTract
Convert to Vocal Tract Tier | To VocalTractTier... | To VocalTractTier... 0 3 0.5
Convert Tier to LPC | To LPC... | To LPC... 0.005
Select both LPC and Source audio file | Option/Control select source audio Sound | plus Sound Source
Filter Source with LPC | Filter... | Filter... no
Resample to 10 kHz | Convert->Resample... | Resample... 10000 50

Generate Source sound
Create an empty PitchTier object | New->Tiers->Create PitchTier... | Create PitchTier... Source 0 3
Add a high starting point at 120 Hz | Modify->Add point... | Add point... 0 120
Add a low end point at 100 Hz | Modify->Add point... | Add point... duration 100
Convert it into a phonation sound | Synthesize->To Sound (phonation)... | To Sound (phonation)... 40000 1 0.05 0.7 0.03 3 4 no
Scale to a nice intensity | Modify->Scale intensity... | Scale intensity... 70

Create Vocal Tract
Read audio file | Open->Read from file... | Read from file... a.wav
Convert to LPC with prediction order 40 for 40 tube segments | Formants & LPC -> LPC (autocorrelation)... | To LPC (autocorrelation)... 40 0.025 0.005 50
Convert LPC to Vocal Tract, using the slice at 2 seconds and a total vocal tract length of 20 cm | To VocalTract (slice)... | To VocalTract (slice)... 2 0.20
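
To make the source pitch contour concrete: the PitchTier above has two points, 120 Hz at the start and 100 Hz at the end. The Python sketch below assumes that the pitch between points is interpolated linearly and held constant outside them, that the point times are distinct, and that "duration" equals the 3 seconds of the PitchTier; it is an illustration of the contour, not Praat's implementation.

```python
def pitch_at(t, points):
    """Linearly interpolate F0 (Hz) at time t from (time, Hz) points; constant outside the points."""
    points = sorted(points)
    if t <= points[0][0]:
        return points[0][1]
    if t >= points[-1][0]:
        return points[-1][1]
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return f0 + (f1 - f0) * (t - t0) / (t1 - t0)

# The two points from the "Generate Source sound" steps, with duration = 3 s
source_points = [(0.0, 120.0), (3.0, 100.0)]
print(pitch_at(1.5, source_points))  # 110.0
```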

Example files /i/ /a/

Example files /i/ /y/ (LPC order 44)

These files are all Copyright © 2012 NKI-AVL and R.J.J.H. van Son, and are licensed under the GNU GPL v3 or later (see below).

Vocal Tract tube models

The Vocal Tract area functions model the human vocal tract as a set of connected tubes of variable width. Determining the tube, or segment, areas with LPC is not very reliable. Shown below are tube models determined with LPC (prediction order 40), alongside the "theoretical" models given by Praat's New->Articulatory synthesis->Create Vocal Tract from phone....
Vocal Tract tube model of /a/ (example)
Vocal Tract tube model of /i/ (example)
Standard Vocal Tract tube model of /a/
Standard Vocal Tract tube model of /i/
Vocal Tract tube model of /y/ (example)
Standard Vocal Tract tube model of /y/

VocalTract file format

The example VocalTract file below is created with Save as short text file... . There is a more descriptive (longer) format that is obtained with Save as text file... .

File type = "ooTextFile"         The line by which Praat can recognize your file
Object class = "VocalTract 2"    The line that tells Praat about the contents
                                 Empty line
0                                xmin: First segment (Glottis, meter)
0.2                              xmax: Last segment (Lips, meter)
40                               nx: Number of segments
0.005                            dx: Segment length (m)
0.0025                           x1: Position of first segment
1                                ymin: NA
1                                ymax: NA
1                                ny: NA
1                                dy: NA
1                                y1: NA
0.00010813061971705616           Area in m2
0.00010390570341053334           Area in m2
8.903563828398031e-05            Area in m2
0.00010876151465927323           Area in m2
....                             Many more values
0.008175693406171154             Area in m2
0.0013459947563683344            Area in m2
0.04293933951717365              Area in m2
0.000489118171677886             Area in m2
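
As an illustration of the layout described above, the following Python sketch reads a VocalTract short text file into a dictionary. It assumes the file follows exactly the line order shown (two header lines, an empty line, ten numeric fields, then nx area values); the function name and the four-segment example content are made up for the demonstration.

```python
def parse_vocal_tract(text):
    """Parse a VocalTract short text file into a dict (layout as described above)."""
    lines = [ln.strip() for ln in text.splitlines()]
    if lines[0] != 'File type = "ooTextFile"':
        raise ValueError("not a Praat text file")
    if not lines[1].startswith('Object class = "VocalTract'):
        raise ValueError("not a VocalTract object")
    values = [float(ln) for ln in lines[2:] if ln]  # skip the empty line
    xmin, xmax, nx, dx, x1 = values[:5]
    nx = int(nx)
    areas = values[10:]  # values[5:10] are the unused ymin, ymax, ny, dy, y1 fields
    if len(areas) != nx:
        raise ValueError(f"expected {nx} areas, found {len(areas)}")
    return {"xmin": xmin, "xmax": xmax, "nx": nx, "dx": dx, "x1": x1, "areas": areas}

# Made-up four-segment file, for demonstration only
example = """File type = "ooTextFile"
Object class = "VocalTract 2"

0
0.2
4
0.05
0.025
1
1
1
1
1
0.0001
0.0002
0.0003
0.0004
"""
tract = parse_vocal_tract(example)
print(tract["nx"], tract["areas"][-1])  # 4 0.0004
```

The areas are returned in file order, that is, from the glottis end to the lips.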


    Copyright © 2012  NKI-AVL, Amsterdam and R.J.J.H. van Son

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this program.  
    If not, see