NKI TE-VOICE ANALYSIS tool (EN)

The NKI TE-VOICE ANALYSIS tool (TEVA) is intended to help the education and research of Speech Pathologists and others who want to study Tracheoesophageal speech. TEVA implements Acoustic Signal Typing from the work of Corina van As-Brooks (Van As 2001; 2006).

Contents

What is the NKI TEVA tool
Getting started
Overview of the Main page
Overview of the Configuration page
Tutorials
Analysis calculations and displays
Acoustic Signal Typing
TEVA Copyright and License
What's new?



© R.J.J.H. van Son, December 1, 2011

 

Introduction to TEVA

The NKI TE-VOICE ANALYSIS tool (TEVA) was developed as a tool for use in education and research. TEVA is intended to help Speech Pathologists and other researchers study the acoustic characteristics of Tracheoesophageal (TE) speech and gain experience with its acoustic analysis.

Introduction

TEVA is built on top of the Praat phonetics software package (www.praat.org). As such, TEVA presents a selection of the relevant analysis methods with an easy-to-use interface. The approach to the analysis of TE speech used in TEVA is based on the work of Corina van As-Brooks as described in her PhD thesis (Van As, 2001).



© R.J.J.H. van Son, December 6, 2011

 

Getting started with TEVA

The NKI TE-VOICE ANALYSIS tool (TEVA) is a multi-platform stand-alone application. It is available for MS Windows, Apple Macintosh OSX, and Linux. It is also available as a separate interactive Praat script (www.praat.org).

Getting a copy of TEVA

TEVA is licensed under the GNU GPLv2 and can be freely used and distributed. You can download a copy of TEVA from www.fon.hum.uva.nl/IFA-SpokenLanguageCorpora/NKIcorpora/NKI_TEVA/ . TEVA can be saved on your hard disk or USB thumb drive and started by clicking on the icon.

The TEVA pages

After you start TEVA, a window will appear. This will initially contain the Main page. In normal practice, this is the page you will spend most of your time in. There is a second page which allows you to change the settings of the application, the Configuration page. You can turn pages using a button on the top right of each page (with an arrow symbol →).

Each page contains a number of buttons. The Main page also contains a canvas which is used to draw the results of the analysis on. You interact with TEVA by clicking the buttons. For instance, at the top-right of the Main page, there is a Quit button labeled with a red X cross that will terminate the application when you click on it. While the TEVA application is busy processing whatever the click of a button asks it to do, the button will be grayed out (the label will be gray too instead of black or colored). While a button is grayed out, TEVA will not respond to other button clicks.

Every button has a keyboard shortcut. This shortcut is generally a single character, one of the letters of the label on the button. That character will be printed in italic in the label. For instance, in English, the Quit button is labeled Quit (with an italic Q). Hitting the Q key (upper or lowercase) will terminate the program.

You can change the size of the TEVA window just as you can with every other window on your desktop. However, you will notice that the positioning of the buttons and texts on the page will be off. Sometimes the windows will look completely scrambled. The page can be redrawn with the Refresh button or by hitting the space bar. Use the space bar when the page is so scrambled that you cannot click the Refresh button anymore.

Each page contains a Help button. This button is labeled with a ? question mark. Hitting the ? or / key will start the interactive Help service. While Help is active, a single line of help text will appear whenever you click a button. Clicking the Help button again will stop the Help service. The help text for each button will include the keyboard shortcuts in the current language.

General functions

Here are descriptions of a few often used buttons for general use. English labels will be used here. The keyboard shortcuts are given between the [brackets] (might depend on language).

On the Main page
Quit: [Q] Stop TEVA. Will save the current preferences
Settings: [S] Go to Configuration page
Refresh: [h] Redraw the current page, hitting the space-bar always refreshes the screen
Help: [?] Press on the button you want information on, press Help again to continue
  
On the Configuration page
Help: [?] Press the button you want information on, press Help again to continue
Return: [R] Go back to Main page
English: [E] Use English labels and help
Deutsch: [D] Use German labels and help
Nederlands: [N] Use Dutch labels and help
Manual: [M] Display this manual

More can be found in Overview of Main page and Overview of Configuration page.

Language support

TEVA supports a few languages, currently English, German, and Dutch. Extending this to other languages is rather easy (it only requires translating a few dozen sentences), but it can only be done with the help of a native speaker of that language. Please contact us if you would like to help port TEVA to your language.



© R.J.J.H. van Son, December 6, 2011

 

Overview of Main page

TEVA is used by clicking buttons on the current page. Here is a list of the buttons on the Main page with a description of their use. English labels will be used here. The keyboard shortcuts are given between the []-brackets (might depend on language). See also the Overview of Configuration page.

The display

In the center of the page is the display area where a graph is drawn with the analysis. Below the time axis (if present), the voiced parts are indicated with blue-gray line segments. A green marker will indicate the point where the Harmonicity is maximal (which indicates a potential "best" part). These two additional markers will in general appear only after, respectively, the Pitch and Harmonicity contours have been displayed.

General functions

Recorded speech

The level of the recorded sound is indicated by a colored circle in the top left of the page. The diameter scales with the maximal amplitude and the color indicates whether the maximal amplitude is too high (red), good (green), or too low (darker green to black).

Selecting an interval of speech

The current time window will be indicated with vertical blue lines if it is smaller than the current display. Changing the display (e.g., zooming in or out) will set the display window to the current time window.

A shortcut to the Select button is to click inside the display graph. The point where you clicked will be marked as the first boundary and everything will proceed as if the Select button had been pressed. After selecting the first boundary, pressing the - or + key will position a 1 or 2 second window around the (first) selection point, respectively. Pressing any other key or clicking the Select button again will cancel the selection.

Saving and printing a report

A report with graphs and the analysis results can be saved in a printable format. The report contains the waveform of the current window, a 0.1 second enlargement window, and the Spectrogram, Pitch, and Ltas graphs. The analysis windows are calculated and the pictures are combined into a single-page report. On Apple OSX the pictures are saved as PDF graphics, on Microsoft Windows as (extended) WMF graphics. On all systems, pictures can also be saved as PostScript graphics (EPS). There is experimental support for PNG format on Mac OSX and Linux.

Analysis calculations and displays

Selected statistics about the analysis will be written below the display. This includes the Acoustic Signal Typing analysis related to the display.



© R.J.J.H. van Son, December 6, 2011

 

Overview of Configuration page

TEVA is used by clicking buttons on the current page. Here is a list of the buttons on the Configuration page with a description of their use. English labels will be used here. The keyboard shortcuts are given between the []-brackets (might depend on language). See also the Overview of Main page.

Many of the buttons are radio-type push buttons. When clicked, they remain "pushed down" until another button is pushed. The state of these buttons is remembered between invocations of TEVA. Buttons on a (light) gray background are grouped together, like the language choice buttons or the Frequency buttons.

General functions

Frequency: Spectral display

The display of spectral features should be reduced to exclude irrelevant detail. Set the maximal frequency to a frequency that just includes all the relevant features. The sample frequency of recordings is adjusted to this display frequency. If the display frequency is set to 5 kHz or below, the sample frequency will be set to 11.025 kHz (down from 22.050 kHz).
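The rule in the paragraph above can be written down in a few lines of Python. This is only a sketch of the rule as stated in this manual, not TEVA's own code; the function name sample_frequency_for_display is hypothetical.

```python
def sample_frequency_for_display(display_frequency_hz: float) -> float:
    """Return the sample frequency (Hz) for a given maximal display frequency.

    Rule as stated in the manual: a display frequency of 5 kHz or below
    gives 11.025 kHz, anything higher keeps the 22.050 kHz default.
    """
    if display_frequency_hz <= 5000:
        return 11025.0
    return 22050.0
```

A sample frequency of 11.025 kHz can represent frequencies up to about 5.5 kHz, so a display limited to 5 kHz loses nothing that would have been shown.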

Pitch

Pitch tracker settings

Archive recording and collection of audio

Execute automatic analysis that might be time consuming

Voiceprint output format

The format in which the voiceprint image is written. Not all formats are available on all platforms.

Selection of speakers or recordings from the speaker data table

Speakers: Speaker data tables and recording duration

It is useful to have a list of speaker data available. TEVA can read and write Tab delimited tables (tsv) with speaker data. A backup of this file, with a tilde '~' appended to the name, is created if the data are changed.

Each row in the speaker table contains at least five fields separated by tabs:

[1] ID: Speaker or Recording ID (must be unique)
[2] Text: Speaker description (age, sex, etc.)
[3] Description: Comments
[4] Audio: File name of a recording, with path relative to the table
[5] AST: Acoustic Signal Type (1-4)
[6] StartTime: Start of the window used for AST (optional)
[7] EndTime: End of the window used for AST (optional)
[8] Prepared columns with Acoustic Signal Type (1-4) (optional)

If the Audio field is given, the recording will be loaded automatically when this speaker is selected on the Main page. Additional columns are used for specific functions, e.g., VAS rating scales.

Rating screens

Miscellaneous

Recording sets the duration of live recordings. It also allows you to set up complex recording tasks.

Additional information



© R.J.J.H. van Son, December 6, 2011

 

TEVA Tutorials

Tutorials to get acquainted with TEVA



© R.J.J.H. van Son, December 19, 2011

 

Recording your own voice

How to record and analyze a voice.

Record your voice

First set up your computer for voice recording. You should use a microphone of a reasonable quality. Furthermore, if your computer has some kind of Microphone boost feature, make sure it is turned off. Then use some application you know to check whether you can actually record your sound. For instance, if you have Praat (www.praat.org) or Audacity (audacity.sourceforge.net) installed, try to record your voice with them. If recording works, you can continue.

After you started TEVA, click on the Settings (→) button to go to the Configuration page. There you should check the relevant sound input, either Microphone if you use a built-in microphone or the microphone jack, or Line input if you have connected to the line input. You can use the Test recording button to open a window where you can check the setup and recording level. Close the window when you are satisfied. Note that your changes in the settings of this window will be ignored.

After you are satisfied that the recording setup is working, go back to the Main page by clicking the Return (→) button.

On the Main page, click on the red Record (•) button. A bright red spot will appear in the top left corner of the page during the time of the recording. The default duration of a recording is 4 seconds. You can change this duration on the Configuration page, with the Recording button. While the red spot is visible, speak a sustained /a/ sound in the microphone.

After the recording has stopped, the wave-form of the recorded sound will be shown in the central part of the Main page. The wave-form display is the default setting of TEVA. However, if TEVA was last closed while another display was selected, that display will be used again. The bright red spot in the top left corner will have been replaced by an open colored circle. The diameter and color of the circle indicate the maximum amplitude of the recorded sound. A big red circle means the recorded sound might have been too loud and clipped. A green circle indicates a safe recording level. When the circle becomes smaller and the color becomes darker towards black, the sound level of the recording might have been too soft.

Listen to the recorded sound. You can play the recorded sound by clicking the red Play button (right pointing solid triangle). You might notice that the recorded sound is not 4 seconds long (or whatever your recording setting is). TEVA will cut off silence at the start and end of the recording. Check whether there is enough of the /a/ recorded and that there is no background noise in the recording. Repeat the recording procedure until you are satisfied with the result. You do not have to reject the old recording, a new recording simply replaces the existing recording.

Other displays and analysis

When you click on any of the buttons on the right side below Sound, e.g., Pitch, Spectrogram, Ltas, Intensity, or Harmonicity, these will be displayed instead. Calculation of some of these displays might take some time, so be patient. Below all of the windows, except Sound and Spectrogram, text will appear with statistics of these analysis types. The button labeled Rating will display Visual Analogue Scales (VAS) for perceptual rating of the sounds.



© R.J.J.H. van Son, December 20, 2011

 

Opening an existing recording

How to open existing sound files in TEVA.

Audio formats

TEVA can handle all audio file formats that Praat can read. This includes, among others, WAV, AIFF/C, FLAC, and MP3 files.

Open a file

To open an existing recording, click the Open button on the Main page. A file select window will open which allows you to select the file in the customary way. Then click Choose. The file will open and the currently selected analysis display will be drawn.



© R.J.J.H. van Son, December 20, 2011

 

Recording tasks

TEVA can be used to record a list of tasks initiated by prompts. Such prompts can be simple, like [a as in hat], or consist of page sequences with formatted text. A recording task is defined by setting the Task field in the Recording window of the Configuration page. The recording session is started by clicking the Record button on the Main page.

Recording task window

When clicking Recording on the Configuration page a small window appears. This window asks for three pieces of information:

Recording

The number of seconds recorded. This is a default time that can be changed in a table with recording prompts. Defaults to 4 seconds.

Task

The prompt text in Praat Text styles format. Alternatively, a text or table file with prompts can be selected by pressing the Task button. Each line in a plain (ASCII|DOS) text file will be displayed as a single prompt screen. Each line will give rise to a numbered recording. If the Task file is a table (tsv, or tab-separated values), the size and formatting of the prompts can be controlled.

Store

The directory where the recordings are stored. Each recording is stored in a sub-directory with the ID or name of the Speaker. File names are constructed from the ID or name of the Speaker, the number or name of the prompt screen, and a time stamp.
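The way a plain text Task file maps onto numbered prompt screens, as described under Task above, can be illustrated with a short Python sketch. It is not TEVA's own code: the function name numbered_prompts is hypothetical, the second prompt line is made-up example data, and skipping empty lines is an assumption that the manual does not confirm.

```python
def numbered_prompts(task_text: str) -> list[tuple[int, str]]:
    """Turn the contents of a plain text Task file into (number, prompt) pairs.

    Each non-empty line becomes one prompt screen with its own number.
    """
    prompts = []
    for line in task_text.splitlines():
        line = line.strip()
        if line:  # assumption: empty lines do not create a prompt screen
            prompts.append((len(prompts) + 1, line))
    return prompts


# Two prompt lines give two numbered prompt screens.
print(numbered_prompts("a as in hat\ni as in heat\n"))
# prints [(1, 'a as in hat'), (2, 'i as in heat')]
```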



© R.J.J.H. van Son, May 12, 2014

 

Adding speaker information

How to add information about a speaker to TEVA.

Record or Open an /a/ sound, select a stable part of the recording:

Adding information about a recording

TEVA can keep a record of recordings. Click the Speaker button at the bottom left of the Main page. A window will open where you can enter an identifier (name or code) and other information like sex, age, treatment etc. There is also a field for more informal comments. The top text field is an identifier for the recording, the next lower one contains information about the speaker, ID, name, and so on. The bottom text field is for comments. Click the Continue button when you have completed the input. You can change the text later if you like.

Reading a table with speaker information

In general, it is better to collect speaker information in a table beforehand. You can open such a table with the Open button in the Main page, or with the Data button in the Configuration page. When opened, you will have to select a specific speaker or recording from the list. This is done using the Speaker button on the Main page. In the top text field, you can type in the ID code of the recording, or the line number in the table, and click Ready. Or you can step through all the records with the next (>) and previous (<) buttons. If you enter a partial ID, it will be matched to the start of the IDs in the table. The first record that matches will be selected. If you enter a non-existing recording ID in the top (ID) field, a new record is added to the table. You can remove the current record by completely emptying the top (record ID) field and replacing its content with a single dash, -.
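The selection rules in the paragraph above (line number, full ID, or partial ID matched against the start of the IDs) can be sketched in Python as follows. This is an illustration only, not TEVA's implementation. The function name find_record is hypothetical, and treating a number within the table range as a line number before trying it as an ID is an assumption.

```python
from typing import Optional


def find_record(ids: list[str], query: str) -> Optional[int]:
    """Return the 0-based index of the selected record, or None if nothing matches."""
    # Assumption: a number within the table range is read as a line number first.
    if query.isdigit() and 1 <= int(query) <= len(ids):
        return int(query) - 1
    # Otherwise select the first ID that starts with the query text.
    for index, record_id in enumerate(ids):
        if record_id.startswith(query):
            return index
    # No match: according to the manual, a new record would be added.
    return None


ids = ["Speaker1", "Speaker2", "Speaker3", "Speaker4"]
print(find_record(ids, "3"))         # prints 2 (third line of the table)
print(find_record(ids, "Speaker2"))  # prints 1
```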

You can save changes to the list with the Save button in the Configuration page. You can close and purge the current table with the Close button in the Configuration page. With Close, the current table is not saved. However, all changes made in an existing speaker table will be saved in a backup file with the same name as the original file, but with a tilde (~) added to the name (e.g., example.Table becomes example~.Table). This backup file will be kept until the table is saved to file using the Save button or all changes are purged with the Close button, after which the backup will be removed. The backup file will be overwritten the next time the table is opened from file and a change is made. When using Save to save a list, an attempt will be made to convert all paths to audio files to paths relative to the location of the saved table.
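The naming of the backup file follows directly from the example above (example.Table becomes example~.Table). A minimal Python sketch of that naming rule, with the hypothetical function name backup_path, could look like this; it only computes the name and does not reproduce how TEVA writes or removes the file.

```python
from pathlib import Path


def backup_path(table_path: str) -> Path:
    """Insert a tilde before the extension: example.Table -> example~.Table."""
    path = Path(table_path)
    return path.with_name(path.stem + "~" + path.suffix)


print(backup_path("example.Table").name)  # prints example~.Table
```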

Format of the speaker info table (.tsv or .Table)

Speaker Info tables are tab delimited (tsv) lists with five fields, starting with a line with the field names separated by tabs, i.e., ID, Text, Description, Audio, AST (the order is immaterial). The extension of the file should be .tsv or .Table.

1: ID of speaker or recording (must be unique)
2: Essential information, often starting with the ID code
3: Free form comments
4: Relative path to the audio file
5: Manually entered Acoustic Signal Type, i.e., 1, 2, 3, or 4
6: Start time of interval for which the AST was entered
7: End time of interval for which the AST was entered
8+: Any number of values for rating scales

If the table contains a path to a sound file, this file will be opened automatically when the record is chosen.

Missing columns are automatically generated when the table is read. So, if a table without the Description and AST columns is read, two empty columns with the corresponding labels are created.
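For readers who prepare or check speaker tables outside TEVA, the following Python sketch reads a tab-delimited table and adds the missing required columns as empty fields, mirroring the behavior described above. It uses only the standard csv module. The function name read_speaker_table is hypothetical, and this is not the code TEVA itself uses.

```python
import csv
import io

# The five required column names listed in this manual.
REQUIRED_COLUMNS = ["ID", "Text", "Description", "Audio", "AST"]


def read_speaker_table(tsv_text: str) -> list[dict]:
    """Parse a tab-delimited speaker table and fill in missing required columns."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        for column in REQUIRED_COLUMNS:
            if row.get(column) is None:
                row[column] = ""  # a missing column becomes an empty field
        rows.append(row)
    return rows


# A table without the Description and AST columns.
example = (
    "ID\tText\tAudio\n"
    "Speaker1\tSpeaker1, M 66yo, Type I\tsignaltype1voicesample.wav\n"
)
rows = read_speaker_table(example)
print(sorted(rows[0].keys()))
# prints ['AST', 'Audio', 'Description', 'ID', 'Text']
```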

An example file: SignaltypeVoiceSamples.Table

ID	Text	Description	Audio	AST	StartTime	EndTime
Speaker1	Speaker1, M 66yo, Type I	[comments]	signaltype1voicesample.wav	1	0.000	1.750
Speaker2	Speaker2, M 48yo, Type II	[comments]	signaltype2voicesample.wav	2	1.000	2.750
Speaker3	Speaker3, F 77yo, Type III	[comments]	signaltype3voicesample.wav	3	0.500	2.250
Speaker4	Speaker4, F 48yo, Type IV	[comments]	signaltype4voicesample.wav	4	1.250	3.000

A text-only file without the .tsv or .Table extension will be read as a list of records separated by tabs. The order of the fields is as above: ID Text Description Audio AST StartTime EndTime. For instance, if three fields are given, they will be entered as ID, Text, Description. If only a single field is given, it is treated as the Audio field. The ID field will be set to numbered Item[row] values if not present. Upon reading, such a file will be converted to a full table and saved (if backups are set, ~.tsv). Backup files too will be saved as ~.tsv files.
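The positional rules for such header-less files can be sketched in Python. This is an illustration under stated assumptions, not TEVA's code: the function name record_from_line is hypothetical, and the generated ID is written here as Item followed by the row number (e.g., Item3), which is only one possible reading of "Item[row]".

```python
FIELDS = ["ID", "Text", "Description", "Audio", "AST", "StartTime", "EndTime"]


def record_from_line(line: str, row_number: int) -> dict[str, str]:
    """Map one tab-separated line of a header-less file onto the table fields."""
    values = line.rstrip("\n").split("\t")
    record = dict.fromkeys(FIELDS, "")
    if len(values) == 1:
        # A single field is treated as the Audio field.
        record["Audio"] = values[0]
    else:
        # Several fields are assigned in the fixed column order.
        for name, value in zip(FIELDS, values):
            record[name] = value
    if not record["ID"]:
        # Assumption: the numbered ID is "Item" plus the row number.
        record["ID"] = f"Item{row_number}"
    return record


print(record_from_line("signaltype1voicesample.wav", 1)["ID"])  # prints Item1
```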



© R.J.J.H. van Son, December 20, 2011

 

Selecting stable sounds

How to select a stable part of the voice.

Record or Open an /a/ sound and add information about the speaker and recording:

Introduction and basics

Not all parts of your recording will be useful for analysis. Selecting a part of the recording is done with the buttons around the Select button. If the current interval is smaller than the current window, the boundaries of the current interval are indicated by vertical blue lines. With Zoom in (+) and Zoom out (-) you can decrease and increase the size of the window. With the Previous and Next buttons you can step through the recording. With the Select button you can indicate the start and end of the preferred interval.

Selecting a stable interval of speech

Go to the Spectrogram for selecting a stable part of your /a/ recording. Zoom out until you see the complete recording. You might notice that the recorded sound is not 4 seconds long (or whatever your recording setting is). TEVA will cut off silence at the start and end of the recording. A stable /a/ sound will show a smooth spectrogram with many harmonics as horizontal lines. The more harmonics are clearly visible, the better the voice is. Find the longest stretch of speech with many, flat harmonics. This will be the interval to analyse. For the Acoustic Signal Typing analysis, around 1-2 seconds of speech are needed.

Click on Select. A blue text will appear below the display: "Select new start time (or press Select or a key to continue)". If you press Select again or press any key on the keyboard, the selection procedure will stop. Use the mouse pointer to click on the start of the desired stable interval inside the display. This procedure can be done much more easily by simply clicking on the display at the point where you want the boundary to be. This will automatically bring you to step two of the Select procedure.

A vertical blue line will be drawn at the point where you clicked. The text below the display will have changed to "Select new end time, - or + for a 1 or 2 sec window (cancel with Select or another key)". Use the mouse pointer to click on the end of the desired stable interval inside the display. A second blue line will appear and the text disappears. If you press the - key, a new interval 1 second wide will appear, centered on the point you selected in step 1. If you press the + key, a new interval with a width of 2 seconds will appear.
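The effect of the - and + keys can be expressed as a small calculation. The Python sketch below assumes that both windows are centered on the first selection point, as the Overview of Main page describes ("around the selection point"). It does not handle what happens near the start or end of the recording, which this manual does not specify. The function name centered_window is hypothetical.

```python
def centered_window(center_s: float, key: str) -> tuple[float, float]:
    """Return (start, end) in seconds for the - (1 s) or + (2 s) shortcut key."""
    widths = {"-": 1.0, "+": 2.0}
    half_width = widths[key] / 2
    return (center_s - half_width, center_s + half_width)


print(centered_window(1.5, "-"))  # prints (1.0, 2.0)
print(centered_window(1.5, "+"))  # prints (0.5, 2.5)
```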

If you now click on the Play button, you will hear only the fragment you selected. The blue lines will be present in all other displays, except the Ltas display. The Ltas display will have changed and will only give the spectrum of the selected interval. All the statistics printed below the displays will refer to only the selected interval.

You can move around the selected interval with the Previous and Next buttons. You can make the window match the selected interval by clicking on To selection.



© R.J.J.H. van Son, December 20, 2011

 

Determine pathological type

How to determine the pathological type (acoustic signal typing)

Record or Open an /a/ sound, select a stable part of the recording and add information about the speaker and recording:

Pathological type

A short description of the criteria to determine the pathological type is displayed when you click on the Pathology button on the Configuration page. It can also be found on the Acoustic Signal Typing manual page.

The criteria for the pathological typing are mostly impressionistic. This is about the ability of the speaker to produce a stable /a/ sound with many harmonics. This can be evaluated by listening to the sound, and looking at the Spectrogram. Inspection of the Pitch and Harmonicity displays will give extra information.

When a voice has been evaluated and a type decided upon, the type can be entered by pressing one of the number keys, 1-4 for types I - IV. The types can always be changed. The currently selected type will be printed over the display and stored together with the boundaries of the current selection in the table with speaker data. Pressing 0 will erase the type and the boundaries. Pressing 9 will set the boundaries but will not change the type.

Acoustic signal typing and Voice Quality

There are also experimental automatic evaluations of pathological type and Voice Quality. These automatic evaluations are displayed in the VoicePrints.



© R.J.J.H. van Son, December 20, 2011

 

Perceptual rating of a vowel

How to determine the perceptual quality of a vowel using VAS rating.

Record or Open an /a/ sound, select a stable part of the recording and add information about the speaker and recording:

Perceptual rating using Visual Analogue Scales

On the Main page, select the Rating button (bottom right). The screen will show a number of horizontal bars with titles and qualifications to the left and right of each bar. To change the type of evaluation, go to the Configuration page and select one of the buttons labeled Vowels, Text, IINFVo, or GRBAS (see VAS rating).

Listen to the sound by pressing the Play button. Click on the position in the Visual Analogue Scale that corresponds to the relative quality of the speech. By default, there will be a grey mark at the center of each scale. When a selection is made, a red mark will be visible at the indicated position. Use the Print button to save and print the evaluation. It is best to leave the scale Markers off when evaluating speech. These scale Markers are useful when inspecting evaluation scores.

Acoustic signal typing and Voice Quality

There are also experimental automatic evaluations of pathological type and Voice Quality. These automatic evaluations are displayed in the VoicePrints.



© R.J.J.H. van Son, August 22, 2014

 

VoicePrints

Introduction

The development of voice characteristics and voice quality after laryngectomy is important for the quality of speech and, ultimately, for the quality of life (QoL) of the patient. Clinical practice and subsequent therapies will benefit when such developments are documented over the course of treatments and even beyond. This documentation should consist of professional and perceptual (subjective) evaluations of the voice and the results of acoustic analysis.

Based on the research of Van As, Clapham, and others, a set of acoustic measures has emerged that is useful to describe the voice of tracheoesophageal (TE) speakers. These have been incorporated in the Voice Print of TEVA, which gives a single-page view of the most important acoustic and perceptual measures of the TE voice. Voice prints are useful to document the acoustic characteristics of the TE voice.

Traditionally, voice evaluations are done on sustained vowels and articulated speech, e.g., words, sentences, and read stories. For TE speech, the basic evaluations can be done on sustained vowels, most importantly, sustained /a/.

Perceptual evaluation of voice

Several perceptual dimensions are traditionally distinguished to evaluate voice and speech quality. These voice and speech evaluations are implemented in TEVA as Visual Analogue Scales (VAS) on the Rating page. For TE speech, three scales can express the most important qualities of the /a/ (under the Vowels Rating):

The VoicePrint will print out the Voice Quality rating (VQ).

Visual evaluation of voice

Van As-Brooks has developed a four-grade Acoustic Signal Typing of voice that is based on visual inspection of the Spectrogram. The AST categories can be entered using the number keys 1-4; the key 0 removes the AST category. AST categories are displayed as Roman numerals on the Main page.

Acoustic measures of voice

Information displayed on a voice print

References



© R.J.J.H. van Son, February 11, 2014

 

Prompted speech recordings

Use on-screen prompts to record speech.

When collecting speech from informants or patients, it is very useful to be able to direct them through the recording session with prompt screens.

Set up a simple prompt screen: a as in hat

On the configuration page, click the Recording button under the Archive heading. A window will appear.

Record speech

Go back to the main page. When you press the Record button, a window will appear. Enter the recording ID or filename and press Continue.



© R.J.J.H. van Son, August 22, 2014

 

Example evaluating AST

A tutorial example of how to evaluate the Acoustic Signal Typing on a sample of recordings

Download and open the example corpus

Download the TEVA_AST_example.zip file from [To be announced] and extract it to a convenient location. Open TEVA and then click the Open button on the Main page. A file selection window will open. Navigate into the folder you just extracted and select SignaltypeVoiceSamples.Table. Then click Choose. The Main page now contains a display of an empty sound. SignaltypeVoiceSamples.Table contains data on a number of recordings. You must now select the recording you want to use.

Select speakers

Click on the Speaker button (lower left). A new window will open. Click on the = button. You now see a table with 5 columns and 4 lines (enlarge the window if you do not see all of it). The headings read ID, Text, Description, Audio, and AST. Except for the Description column, all columns contain metadata for the recordings mentioned in the Audio column. Close the Info window and click in the TEVA window to view the Speaker window.

The Speaker window has three text fields. The top one contains the ID of the recording, the next field a short description of the speaker (e.g., name or ID, age, sex, etc.). You can change this field, or even create it for a newly recorded voice. If you enter an ID, or just the start of the ID, or row number in the table, the corresponding speaker from the Info window will be read and displayed in TEVA. If you empty the ID field completely and replace it with a single dash -, the current record will be removed from the table.

Put a single 3 in the top field of the Speaker window to select the third entry in the Table. Then click the Continue button. You now see a display of the recording of the third speaker. You see the ID of this speaker and the Pathological Type as it was recorded in the Table.

Find a stable interval

Inspect the recording using the Sound and Spectrogram windows. You can change the frequency range of the Spectrogram by selecting the desired top frequency in the Configuration page. After selecting the most stable part of the speech, decide how the speech can be judged according to the following statements as Good, Mediocre, or Bad. See also Acoustic Signal Typing.

With these statements in mind, pick an interval where the vowel is most stable. You can use the Zoom in and Zoom out buttons as well as the Previous (move left) and Next (move right) buttons to navigate around.

When you have found a part that you think is most stable, select an interval of 200 ms. Click the Select button. You are now asked to click on the position in the display where you want the start of the interval. When you do that, a vertical blue line appears. Now you are asked to click on the position in the display where the end of the interval should be. Click there. Alternatively, you can simply click inside the display window to select the first boundary. After that it proceeds as with the Select button. Make sure that the interval is at least 100 ms long. You can find the times of the start and end points at the top of the graph. You can Zoom in or go To selection, but that is not necessary.

Acoustic analysis

Inspect the other displays, Pitch, Ltas, Intensity, and Harmonicity. Below each of these displays, analysis parameters will be printed about the selected interval (see also Analysis and AST categories). Do these acoustic parameters support your evaluation of the preceding statements? Use the Previous (move left) and Next (move right) buttons to move the selected interval around and see how this affects the acoustic parameters.

Entering the Pathological Type

When you have decided what type of pathological voice the recording has, press one of the number keys 1-4 and the type is recorded. The type will be displayed in the window and in the title bar. Press 0 to remove the type.

Speeding up evaluations in long lists

In case you need to examine a large number of recordings in sequence, a simplification exists. Select the Serial button on the Configuration page. When you go back to the Main page, the Speaker button now reads Nxt and has turned blue. Clicking on this Nxt button now automatically reads and displays the next recording in the list. Just click on the Serial button again to reset TEVA to the default behavior.

Saving your work

After evaluating the sounds, the pathological types and remarks added to the Speaker Info table should be saved to disk. Go to the Configuration page and click on Save. Fill in the name you want the file to have (or use the default) and click Save. While you work, every change or addition made to the Speaker Info table is saved into a backup file with the same name as the original file and an added ~ character. When you Save or Close the table, this backup file is removed. Otherwise, it will remain until you open the table again and change something in it. If you accidentally forget to save the table, you can simply open the backup file and save it under the old, or a new, name.

To save the complete analysis of a single recording, use the Print button on the Main page. This will ask for a place and name to save a directory with all the displays and all the analysis data for the current time interval.

Advanced topic: Pre-set local preferences

For some projects, e.g., evaluation of a large corpus by judges or teaching a course, each user should start TEVA using the same preferences settings. For that purpose, the preferences can be pre-set in a file called TEVApreferences.tsv or .tevarc stored in the same directory as the <list>.Table file (the former file is visible, the latter is hidden). When the <list>.Table file is opened, the settings file is read. See the global preferences file for the available settings.


© R.J.J.H. van Son, February 8, 2012

 

Example evaluating a Corpus

A tutorial example of how to evaluate the Acoustic Signal Typing in a corpus

Download and open the corpus

Download the corpus .zip file from [To be announced] and extract it to a convenient location. Open TEVA and then click the Open button on the Main page. A file selection window will open. Navigate into the folder you just extracted and select the assigned table file. Then click Choose. The Main page now contains a display of an empty sound.

Stepping through the Corpus

You will notice that after opening the Table file, some settings of TEVA have changed. The display type has changed to Spectrogram, with a range of 0-2 kHz. The Speaker button at the bottom left has changed to a blue Spk button. When you click on the Spk button, a file will be loaded. If you just started, this will be the first file ([1] in the window title). If you have already worked on this project, the next file without an AST label will be displayed. Every time the Spk button is clicked, the next file will be loaded.

If you need to go to a file other than the next one in the list, click on the Settings button to go to the Configuration page. Below the Selection header, click the < Serial button to step to the previous item in the list, or Individual to select any item in the list. After that change, either the direction symbol on the Main page is reversed (< Serial) or the original Speaker button reappears on the Main page (Individual). Use this button to step to or display the required file. To return to the earlier state with the blue Spk button, click on the Settings button again to go to the Configuration page and click the Serial > button (which should change from black to red).

TEVA will keep the name of the last used file with speaker and recording data in the preferences file. Next time TEVA is opened, you just have to click on the Spk button to continue where you left off and get to the next recording.

Evaluate a recording

Use the Spectrogram (or another analysis display) to determine the Acoustic Signal Type. You can click inside the display to indicate a window of interest and use the Zoom&Select panel to navigate through the recording. Note that every (time) display has a bar below it indicating the parts considered Voiced by the Praat (www.praat.org) voicing detection (Pitch).

Inspect the recording using the Spectrogram and Sound windows. You can change the frequency range of the Spectrogram by selecting the desired top frequency in the Configuration page. After selecting the most stable part of the speech, judge the speech as Good, Mediocre, or Bad according to the following statements. See also Acoustic Signal Typing.

With these statements in mind, pick an interval where the vowel is most stable. You can use the Zoom in and Zoom out buttons as well as the Previous (move left) and Next (move right) buttons in the Zoom&Select panel to navigate around. Help about signal typing can be found in the man pages (Acoustic Signal Typing) or in the Pathology window on the Configuration page. After you have decided which pathological type you want to assign, just press the number key (one of 1-4). Press 0 to remove your choice. After that, you can press Spk to go to the next recording.

You can also operate TEVA (almost) completely by pressing shortcut keys instead of using the buttons. The sequence Open -> Spk -> Play -> 3 -> Spk can be entered as o, x, p, 3, x. You will obviously have to choose the Table file after pressing o(pen).

Saving your work

To save your evaluations, click on the Settings button to go to the Configuration page and click the Save button (not the Save button on the Main page). A window will open which asks you where to save the Table. Select the file you opened before to replace the original Table. If you forget to save your work, TEVA will prompt you to save your work when you Quit or open another Table. Selecting Cancel will let you leave without saving your work. If you ever leave TEVA without saving the Table file, you will find a recovery file next to the original file. This file has the same name as the original, but with a ~ added. Open this file and save it to replace the original to recover the "lost" work. Note that this recovery file will be removed if you open the original file and either change something, Save it again, or Close it.

Advanced: Setting local preferences

To set up local preferences for a project, create a file called TEVApreferences.tsv or .tevarc and store it in the same directory as the Table with the list of recordings. For example, the preferences for this example read (the fields are separated by single tabs):

Key Value
config.frequency 2000
config.showFormants 0
config.speakerSerial 1
mainPage.draw Spectrogram

A complete preferences file as created by TEVA, e.g., ~/Library/Preferences/TEVA/TEVArc.tsv, could look like (again, use single tabs to separate the fields):

Key Value
config.language EN
config.frequency 2000
config.showFormants 0
config.speakerData /Users/guest/Examples/SignaltypeVoiceSamples.Table
config.speakerSerial 1
config.recordingTime 4
config.showBackground 1
config.input Microphone
config.muteOutput 0
config.openLog /Users/guest/Library/Preferences/TEVA/log
mainPage.draw Spectrogram

Note that the last Table used is stored in the preferences file (here it is SignaltypeVoiceSamples.Table).
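
The Key/Value layout shown above is simple enough to check with a few lines of code, for instance before handing out a project-wide TEVApreferences.tsv. The following Python sketch is not part of TEVA: the function name read_preferences is made up for illustration, and it assumes one header line (Key, Value) and single tabs between the two fields.

```python
import csv
import io


def read_preferences(text):
    """Parse tab-separated Key/Value lines into a dict (hypothetical helper)."""
    prefs = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip blank lines, malformed rows, and the "Key<TAB>Value" header.
        if len(row) < 2 or row[0] == "Key":
            continue
        prefs[row[0]] = row[1]
    return prefs


# The local preferences of this example, written with explicit tabs.
local_text = (
    "Key\tValue\n"
    "config.frequency\t2000\n"
    "config.showFormants\t0\n"
    "config.speakerSerial\t1\n"
    "mainPage.draw\tSpectrogram\n"
)
local_prefs = read_preferences(local_text)
print(local_prefs["mainPage.draw"])
```

All values are kept as text here; whether a value is a number, a switch (0/1), or a path depends on the key.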


© R.J.J.H. van Son, March 8, 2012

 

Analysis

Praat commands used to calculate the analysis results

TEVA is a Praat script. The commands used to perform the analysis and draw the displays are listed here:

Displays

Sound

-

Pitch

select Sound SND
To Pitch (cc)... 0 40 15 yes 0.03 0.40 0.045 0.35 0.14 300

There are three options: a low and a high pitch cutoff (300 and 600 Hz), and a compatible option that implements the settings of Van As (2001):

To Pitch (cc)... 0 40 15 no 0.03 0.40 0.01 0.35 0.14 250

Spectrogram

select Sound SND
To Spectrogram... 0.1 'Fn' 0.001 10 Gaussian

Fn is the Nyquist frequency

select Sound SND
To Formant (burg)... 0.02 4 4400 0.1 50

Ltas

select Sound SND
To Spectrum... yes
To Ltas (1-to-1)

Intensity

select Sound SND
To Intensity... 60 0 yes

Harmonicity

select Sound SND
To Harmonicity (cc)... 'dT' 40 0 1.0

dT is the time step. The position of the maximum Harmonicity is determined on a smoothed, low-pass filtered contour using 'dT' 40 0 4.5 (low-pass 5 Hz with a 5 Hz transition), not on the highest peak.

Measurements

Voiced fraction

select Sound SND
To Pitch... 0 60 600

Count the number of voiced frames in the window and divide by the total number of frames. Using these settings, the step size (frame duration) is 0.0125 s.
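
The counting step can be illustrated outside Praat. The Python sketch below is an illustration only, not TEVA's code: the function name voiced_fraction and the F0 values are made up, and unvoiced frames are assumed to be marked by None (or 0). The step size follows from Praat's default time step for To Pitch..., which is 0.75 divided by the pitch floor.

```python
def voiced_fraction(f0_frames):
    """Fraction of frames with a defined, positive F0 (None or 0 = unvoiced)."""
    if not f0_frames:
        return 0.0
    voiced = sum(1 for f0 in f0_frames if f0 is not None and f0 > 0)
    return voiced / len(f0_frames)


# With time step 0, Praat uses 0.75 / pitch floor as the frame step.
pitch_floor_hz = 60.0
frame_step_s = 0.75 / pitch_floor_hz

# Made-up F0 track (Hz) for eight frames; None marks an unvoiced frame.
frames = [None, 98.0, 101.5, 99.2, None, None, 102.3, 100.1]
fraction = voiced_fraction(frames)
print(f"step = {frame_step_s:.4f} s, voiced fraction = {fraction:.3f}")
```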

GNE (glottal to noise excitation ratio)

select Sound SND
Extract part... 'T1' 'T2' rectangular 1.0 false
To Harmonicity (gne)... 500 4500 1000 80
gne = Get maximum

T1 and T2 are the start and end time, respectively

Jitter

select Sound SND
To Pitch... 0 60 600
To PointProcess
jitter = Get jitter (local)... 'T1' 'T2' 0.0001 0.05 5

T1 and T2 are the start and end time, respectively

Shimmer

select Sound SND
To Pitch... 0 60 600
To PointProcess
select Sound SND
plus PointProcess SND
shimmer = Get shimmer (local)... 'T1' 'T2' 0.0001 0.05 5 5

T1 and T2 are the start and end time, respectively

BED (band energy difference)

select Sound SND
To Spectrum... yes
To Ltas (1-to-1)
Get number of bins

Average the power over the bins: lowPower is the average power over the bins between 0 and 500 Hz, and highPower is the average power over the bins between 4000 and 5000 Hz.

bed = 10 * log10(lowPower / highPower)
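
As a worked illustration of this formula, the Python sketch below computes the BED from a list of bin frequencies and bin powers. It is a sketch under stated assumptions, not TEVA's implementation: the function names and the spectrum values are made up, the powers are assumed to be linear (a level in dB would first be converted with 10^(dB/10)), and the band edges are treated as inclusive.

```python
import math


def band_mean(freqs_hz, powers, low_hz, high_hz):
    """Mean linear power over the bins with low_hz <= f <= high_hz."""
    selected = [p for f, p in zip(freqs_hz, powers) if low_hz <= f <= high_hz]
    if not selected:
        raise ValueError("no bins in the requested band")
    return sum(selected) / len(selected)


def band_energy_difference(freqs_hz, powers):
    """BED in dB: 10 * log10(lowPower / highPower)."""
    low_power = band_mean(freqs_hz, powers, 0.0, 500.0)
    high_power = band_mean(freqs_hz, powers, 4000.0, 5000.0)
    return 10.0 * math.log10(low_power / high_power)


# Made-up spectrum: strong low band, weak high band.
freqs = [100.0, 300.0, 2000.0, 4200.0, 4800.0]
powers = [100.0, 100.0, 10.0, 1.0, 1.0]
bed = band_energy_difference(freqs, powers)
print(f"BED = {bed:.1f} dB")
```

A low band that carries 100 times the power of the high band gives a BED of 20 dB; equal power in both bands gives 0 dB.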

CoG (spectral center of gravity)

select Sound SND
To Spectrum... yes
To Ltas (1-to-1)
Get number of bins

Sum the power, 10^(power/10), over all bins to get sumPower, and sum the product of frequency and power, f * 10^(power/10), over all bins to get productFreq.

cog = productFreq / sumPower
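
The same calculation can be written out in a few lines of Python. This is only a sketch of the formula, not TEVA's code: the function name center_of_gravity and the example spectrum are made up, and the bin values are assumed to be levels in dB that are converted to linear power first.

```python
def center_of_gravity(freqs_hz, levels_db):
    """Power-weighted mean frequency of a spectrum given as dB levels."""
    sum_power = 0.0
    product_freq = 0.0
    for f, level in zip(freqs_hz, levels_db):
        power = 10.0 ** (level / 10.0)  # dB -> linear power
        sum_power += power
        product_freq += f * power
    return product_freq / sum_power


# Made-up three-bin spectrum, falling by 10 dB per bin.
freqs = [500.0, 1500.0, 2500.0]
levels = [20.0, 10.0, 0.0]
cog = center_of_gravity(freqs, levels)
print(f"CoG = {cog:.1f} Hz")
```

Because the weights are linear powers, the strongest bins dominate: in this example the center of gravity stays close to the 500 Hz bin.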

Maximum Voicing Duration (MVD)

select Pitch SND
To PointProcess
To TextGrid (vuv)... 0.2 0.1
Get longest interval with label V
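
To make the idea behind these commands concrete, the Python sketch below finds the longest voiced stretch in a list of pulse times. It is a rough approximation and not Praat's exact algorithm: the function name is made up, a gap longer than the maximum period (0.2 s) is taken to end a voiced stretch, and each stretch is assumed to extend half a mean period (0.1 s / 2) beyond its first and last pulse.

```python
def longest_voiced_duration(pulse_times, max_period=0.2, mean_period=0.1):
    """Longest run of pulses whose gaps never exceed max_period (seconds)."""
    if not pulse_times:
        return 0.0
    times = sorted(pulse_times)
    longest = 0.0
    start = prev = times[0]
    for t in times[1:]:
        if t - prev > max_period:
            # Gap too long: close the current voiced stretch.
            longest = max(longest, prev - start + mean_period)
            start = t
        prev = t
    # Close the final stretch.
    return max(longest, prev - start + mean_period)


# Made-up pulse times (s): two voiced stretches separated by a 0.4 s gap.
pulses = [0.0, 0.1, 0.2, 0.6, 0.7]
mvd = longest_voiced_duration(pulses)
print(f"MVD = {mvd:.2f} s")
```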

Formant quality factors (QF_i)

select Formant SND
medianF = Get quantile... 'i' 'T1' 'T2' Hertz 0.50
medianB = Get quantile of bandwidth... 'i' 'T1' 'T2' Hertz 0.50
qf = medianF / medianB
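
The quality factor is simply the ratio of two medians, as the following Python sketch shows. The function name and the formant tracks are made up for illustration, and Python's statistics.median may treat ties and interpolation differently from Praat's Get quantile.

```python
from statistics import median


def quality_factor(formant_hz, bandwidth_hz):
    """Median formant frequency divided by median bandwidth."""
    return median(formant_hz) / median(bandwidth_hz)


# Made-up F1 track and bandwidths (Hz) for five analysis frames.
f1 = [690.0, 710.0, 700.0, 720.0, 705.0]
b1 = [95.0, 110.0, 100.0, 105.0, 90.0]
qf1 = quality_factor(f1, b1)
print(f"QF_1 = {qf1:.2f}")
```

A narrow bandwidth relative to the formant frequency (a sharp resonance) gives a high quality factor.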


© R.J.J.H. van Son, December 13, 2011

 

Acoustic Signal Typing

Introduction

The quality of the voice in Tracheoesophageal (TE) speech is determined by the characteristics of the neo-glottis. Individual differences in the functioning of the neo-glottis after treatment cause great variation in the intelligibility and quality of speech. The voice pathology of TE speech is graded into four levels.

Pathology types (Van As, 2001, Chapter 5)

Type I - Stable & Harmonic (press 1)

Type II - Stable & At least one harmonic (press 2)

Type III - Unstable or Partly harmonic (press 3)

Type IV - Barely harmonic (press 4)

(press 0 to reset)

Table of the relation between the four types of acoustic signal typing and the perceptual judgment of overall voice quality for 39 speakers (converted to percentages).

|             | Perceptual judgment of overall voice quality |
| Signal type | Good       | Reasonable     | Poor           |
| Type I      | 70%        | 40%            | 0%             |
| Type II     | 45%        | 45%            | 10%            |
| Type III    | 20%        | 35%            | 45%            |
| Type IV     | 0%         | 25%            | 75%            |

Acoustic measures of voice quality

In Acoustic Signal Typing, the voice characteristics are determined using acoustic analysis of speech. The typing is based on both visual inspection of plots of these analysis parameters and quantitative measures of a short (e.g., 0.1 second) stretch of "stable" speech.

Visual determination of pathology uses displays of:

A quantitative evaluation is based on the analysis of:

These measures are determined on a short segment (around 0.1 second) of speech from the most stable part of a sustained /a/ sound. Pathological categories are defined using (Van As, 2001). See Analysis calculations and displays for details on the commands used.

References:


© R.J.J.H. van Son, December 1, 2011

 

AST categories

Category boundaries

Categories are determined when the corresponding analysis is performed, e.g., when a display is drawn or data are saved. Automatically determined values are displayed as Arabic numerals (1, 2, 3, 4). Categories set by hand are displayed as Roman numerals (I, II, III, IV). Category boundaries are taken from (Van As, 2001, Table 5.4 p88) and are indicated here by Roman numerals for clarity.

The final automatic AST category is defined as the median of the individual measures.
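
As an illustration of taking the median, the Python sketch below combines a set of per-measure categories into one value. The measure names and category values are made up and do not claim to list the measures TEVA actually uses. Note that with an even number of measures the median can fall between two categories (e.g., 2.5); how TEVA resolves such a case is not stated here.

```python
from statistics import median

# Hypothetical per-measure categories (1-4) for one recording.
per_measure = {
    "measure A": 2,
    "measure B": 1,
    "measure C": 2,
    "measure D": 3,
    "measure E": 2,
}

overall = median(per_measure.values())
print(f"automatic AST category: {overall}")

# With an even number of measures the median may lie between categories.
between = median([1, 2, 3, 4])
```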


© R.J.J.H. van Son, December 1, 2011

 

VAS rating scales (EN)

Visual Analogue rating Scales for (I)INFVo and GRBAS.

(I)INFVo and GRBAS are standards for rating voices used in Speech Therapy. TEVA will primarily be used to study voicing in sustained vowels. In these circumstances, the standard Impression, Intelligibility, and Fluency scales of (I)INFVo would be of little use. Therefore, the TEVA Rating screen also includes derived sets of the scales.

Visual Analogue Scale (VAS) rating

In VAS rating, the judges have to indicate the severity of some condition as a mark on a line. Each parameter has to be scored on a Visual Analogue Scale (VAS) that takes the form of an undivided horizontal bar, where a position has to be marked. The extreme right corresponds to a very good score for this parameter. In (I)INFVo Impression rating this is a substitute voice Most like a normal voice. The extreme left corresponds to a very bad score for this parameter, that is, Least like a normal voice. The words Least and Most are printed at the ends of the bars. The other scales have equivalent words printed at the ends of the bar.

When a parameter has not yet been marked, a vertical gray line will appear in the center of the scale. When a parameter has been marked, a vertical red line will appear on that position.

(I)INFVo scales

Vowel Scales (derived from (I)INFVo and GRBAS scales)

GRBAS scales

Standard GRBAS rating scales

Consensus ratings

Sometimes it is necessary to combine ratings from two or more raters into a single consensus rating. This can be achieved by combining the tables written for all the raters. Concatenate the numbers for each rater with a semicolon (;) into a single tab-delimited column. The ratings will be displayed with blue markers. Clicking in the customary way on the VAScale will generate the single consensus rating in red.
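
The concatenation step can be scripted. The Python sketch below joins the values of several raters into one semicolon-separated cell per recording. It is an illustration only: the function name combine_ratings and the rating values are made up, and it assumes that every rater scored the same recordings in the same order. Reading and writing the actual rating tables is left out, because their exact layout is not described here.

```python
def combine_ratings(per_rater):
    """Join the ratings of all raters into one 'a;b;...' cell per recording."""
    if len({len(ratings) for ratings in per_rater}) > 1:
        raise ValueError("all raters must have rated the same recordings")
    return [";".join(str(v) for v in values) for values in zip(*per_rater)]


# Made-up ratings of three recordings by two raters.
rater_a = [12, 40, 77]
rater_b = [15, 38, 80]
combined = combine_ratings([rater_a, rater_b])
print(combined)
```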

References

M. B. J. Moerman, J. P. Martens, M. J. Van der Borgt, M. Peleman, M. Gillis, P. H. Dejonckere (2006). 'Perceptual evaluation of substitution voices: development and evaluation of the (I)INFVo rating scale', Eur Arch Otorhinolaryngol, 263: 183-87.


© R.J.J.H. van Son, August 10, 2012

 

TEVA license

NKI TE-VOICE ANALYSIS tool version 1.0

Netherlands Cancer Institute tool for Tracheoesophageal Voice Analysis (TEVA)

For more information, visit our websites: www.fon.hum.uva.nl/IFA-SpokenLanguageCorpora/NKIcorpora/NKI_TEVA/ and www.provoxweb.info/acoustic-analyses.html . TEVA is based on Praat (www.praat.org)

This application was made possible by an unrestricted research grant from: ATOS MEDICAL AB: P.O. BOX 183 SE-242 22 HÖRBY SWEDEN

This application is licensed under the GNU GPL version 2 or later (www.gnu.org/licenses/old-licenses/gpl-2.0.html)

The NKI TE-VOICE ANALYSIS tool
Copyright © 2011 Netherlands Cancer Institute and R.J.J.H. van Son
Praat code Copyright © 1992-2011 Paul Boersma and David Weenink

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.


© R.J.J.H. van Son, December 6, 2011