Sound: To Pitch (filtered ac)...


A command that creates a Pitch object from every selected Sound object.
Purpose
to perform a pitch analysis based on the autocorrelation of the lowpass filtered signal.
Algorithm
This command will first lowpass filter the signal, then apply Sound: To Pitch (raw ac)... on the filtered signal.
The lowpass filter is Gaussian in the frequency domain. If, for instance, you set the pitch ceiling to 800 Hz, and the attenuation at ceiling to 0.03, then the attenuation at 400 Hz is the fourth root of 0.03 (because (400/800)^2 = 1/4), i.e. about 42%. As a function of frequency f, the attenuation is 0.03^{(f/800)^2}. Here’s a table of attenuation factors, also in dB (in this logarithmic domain, the shape is parabolic):

frequency  attenuation  logarithmic 








Note: the attenuation curve will be practically identical to the one described here if you use a ceiling of 500 Hz and an attenuation at ceiling of 0.25; however, this is not advised, because the settings of the example (a ceiling of 800 Hz and an attenuation at ceiling of 0.03) provide a more gradual suppression of higher pitches, almost as if there were no ceiling at all.
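The attenuation formula above can be evaluated directly. The following is a minimal sketch, not Praat's own code: the function names `attenuation` and `attenuation_db` are hypothetical, the listed frequencies are chosen only for illustration, and the dB conversion assumes that the factor scales amplitude (20 · log10), which the text above does not state.

```python
import math


def attenuation(frequency_hz, ceiling_hz=800.0, attenuation_at_ceiling=0.03):
    """Gaussian lowpass factor: attenuation_at_ceiling ** ((f / ceiling) ** 2)."""
    return attenuation_at_ceiling ** ((frequency_hz / ceiling_hz) ** 2)


def attenuation_db(frequency_hz, ceiling_hz=800.0, attenuation_at_ceiling=0.03):
    """The same factor in dB, ASSUMING it is an amplitude factor (20 * log10)."""
    factor = attenuation(frequency_hz, ceiling_hz, attenuation_at_ceiling)
    return 20.0 * math.log10(factor)


if __name__ == "__main__":
    # Illustrative frequencies only; these are not taken from the manual.
    for f in range(0, 1001, 100):
        print(f"{f:5d} Hz  {attenuation(f):.3f}  {attenuation_db(f):8.2f} dB")
```

Because the exponent is quadratic in f, the dB value at 400 Hz is one quarter of the dB value at 800 Hz, which is the parabolic shape mentioned above.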
Settings
The settings that control the recruitment of the candidates are:

Time step (s) (standard value: 0.0)

the measurement interval (frame duration), in seconds. If you supply 0, Praat will use a time step of 0.75 / (pitch floor), e.g. 0.015 seconds if the pitch floor is 50 Hz; in this example, Praat computes 66.7 pitch values per second.

Pitch floor (Hz) (standard value: 50 Hz)

candidates below this frequency will not be recruited. This parameter determines the effective length of the analysis window: it will be three times the longest period, i.e., if the pitch floor is 50 Hz, the window will be effectively 3/50 = 0.06 seconds long.
Note that if you set the time step to zero, the analysis windows for consecutive measurements will overlap appreciably: Praat will always compute 4 pitch values within one window length (0.06 s / 0.015 s in the example), i.e., the degree of oversampling is 4.

Pitch ceiling (Hz) (standard value: 800 Hz)

candidates above this frequency will be ignored.

Max. number of candidates (standard value: 15)

each frame will contain at most this many pitch candidates. One of them is the “unvoiced candidate”; the others correspond to time lags over which the signal is more or less similar to itself.

Very accurate (standard value: off)

if off, the window is a Hanning window with a physical length of 3 / (pitch floor). If on, the window is a Gaussian window with a physical length of 6 / (pitch floor), i.e. twice the effective length.
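The arithmetic that links the pitch floor to the time step and the window lengths in the settings above can be sketched as follows. This is only an illustration of the stated relations, not Praat's implementation; the function name `analysis_timing` and the dictionary keys are hypothetical.

```python
def analysis_timing(pitch_floor_hz=50.0, time_step_s=0.0, very_accurate=False):
    """Derive frame timing from the settings, following the relations in the text."""
    if time_step_s == 0.0:
        # A time step of 0 means "automatic": 0.75 / (pitch floor).
        time_step_s = 0.75 / pitch_floor_hz
    # Effective window length: three periods of the pitch floor.
    effective_window_s = 3.0 / pitch_floor_hz
    # Physical length: 3 / floor (Hanning) or 6 / floor (Gaussian, "Very accurate").
    physical_window_s = (6.0 if very_accurate else 3.0) / pitch_floor_hz
    return {
        "time_step_s": time_step_s,
        "frames_per_second": 1.0 / time_step_s,
        "effective_window_s": effective_window_s,
        "physical_window_s": physical_window_s,
        "window_shape": "Gaussian" if very_accurate else "Hanning",
        "frames_per_window": effective_window_s / time_step_s,
    }


if __name__ == "__main__":
    for key, value in analysis_timing().items():
        print(key, value)
```

With the standard pitch floor of 50 Hz and an automatic time step, this reproduces the figures quoted above: a time step of 0.015 s, about 66.7 frames per second, an effective window of 0.06 s, and an oversampling factor of 4.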
A postprocessing algorithm seeks the cheapest path through the candidates. The settings that determine the cheapest path are:

Silence threshold (standard value: 0.09)

frames that do not contain amplitudes above this threshold (relative to the global maximum amplitude) are probably silent.

Voicing threshold (standard value: 0.50)

the strength of the unvoiced candidate, relative to the maximum possible autocorrelation. If the amount of periodic energy in a frame is more than this fraction of the total energy (the remainder being noise), then Praat will prefer to regard this frame as voiced; otherwise as unvoiced. To increase the number of unvoiced decisions, increase the voicing threshold.

Octave cost (standard value: 0.055 per octave)

degree of favouring of high-frequency candidates, relative to the maximum possible autocorrelation. This is necessary because even (or: especially) in the case of a perfectly periodic signal, all undertones of F_{0} are equally strong candidates as F_{0} itself. To more strongly favour recruitment of high-frequency candidates, increase this value.

Octave-jump cost (standard value: 0.35)

degree of disfavouring of pitch changes, relative to the maximum possible autocorrelation. To decrease the number of large frequency jumps, increase this value. In contrast with what is described in the article, this value will be corrected for the time step: multiply by 0.01 s / TimeStep to get the value in the way it is used in the formulas in the article.

Voiced / unvoiced cost (standard value: 0.14)

degree of disfavouring of voiced/unvoiced transitions, relative to the maximum possible autocorrelation. To decrease the number of voiced/unvoiced transitions, increase this value. In contrast with what is described in the article, this value will be corrected for the time step: multiply by 0.01 s / TimeStep to get the value in the way it is used in the formulas in the article.
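The time-step correction described for the octave-jump cost and the voiced/unvoiced cost amounts to a single multiplication. A minimal sketch, with the hypothetical helper name `article_cost`:

```python
def article_cost(setting_value, time_step_s):
    """Convert a cost setting to the value used in the article's formulas.

    Follows the rule in the text: multiply by 0.01 s / TimeStep.
    """
    return setting_value * 0.01 / time_step_s


if __name__ == "__main__":
    time_step_s = 0.75 / 50.0  # automatic time step for a 50 Hz pitch floor
    print("octave-jump cost:", article_cost(0.35, time_step_s))
    print("voiced/unvoiced cost:", article_cost(0.14, time_step_s))
```

At a time step of 0.01 s the setting and the article value coincide; at the automatic time step of 0.015 s, the standard octave-jump cost of 0.35 corresponds to about 0.233 in the article's formulas.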
© Paul Boersma 2023