Sound: To Pitch (filtered ac)...

A command that creates a Pitch object from every selected Sound object.

Purpose

to perform a pitch analysis based on the autocorrelation of the low-pass filtered signal.

Algorithm

This command will first low-pass filter the signal, then apply Sound: To Pitch (raw ac)... on the filtered signal.

The low-pass filter is Gaussian in the frequency domain. If, for instance, you set the pitch ceiling to 800 Hz, and the attenuation at ceiling to 0.03, then the attenuation at 400 Hz is the fourth root of 0.03, i.e. about 42%. As a function of frequency f, the attenuation is 0.03^((f/800)²). Here’s a table of attenuation factors, also in dB (in this logarithmic domain, the shape is parabolic):

frequency   attenuation   logarithmic
  100 Hz       0.95         -0.5 dB
  200 Hz       0.80         -1.9 dB
  300 Hz       0.61         -4.3 dB
  400 Hz       0.42         -7.6 dB
  500 Hz       0.25        -11.9 dB
  600 Hz       0.14        -17.1 dB
  700 Hz       0.07        -23.3 dB
  800 Hz       0.03        -30.5 dB

Note: the attenuation curve will be practically identical to the curve shown here if you use a ceiling of 500 Hz and an attenuation at ceiling of 0.25. This is not advised, however: with the 800 Hz ceiling of the example, candidates between 500 and 800 Hz are suppressed gradually rather than ignored outright, almost as if there were no ceiling at all.
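
The attenuation formula can be checked numerically. The following Python fragment is a minimal sketch, not Praat code: the function names attenuation and attenuation_db are hypothetical, and the defaults simply repeat the example values of 800 Hz and 0.03. It prints the table rows and then compares the curve with the one for a ceiling of 500 Hz and an attenuation at ceiling of 0.25.

```python
import math

def attenuation(frequency_hz, ceiling_hz=800.0, attenuation_at_ceiling=0.03):
    """Gaussian low-pass gain: attenuation_at_ceiling ** ((f / ceiling) ** 2)."""
    return attenuation_at_ceiling ** ((frequency_hz / ceiling_hz) ** 2)

def attenuation_db(frequency_hz, ceiling_hz=800.0, attenuation_at_ceiling=0.03):
    """The same gain in decibels; this is a parabola in frequency."""
    gain = attenuation(frequency_hz, ceiling_hz, attenuation_at_ceiling)
    return 20.0 * math.log10(gain)

# Reproduce the table rows for 100 Hz through 800 Hz.
for f in range(100, 900, 100):
    print(f"{f:4d} Hz  {attenuation(f):.2f}  {attenuation_db(f):6.1f} dB")

# Compare with a ceiling of 500 Hz and an attenuation at ceiling of 0.25:
# the two curves agree to within the rounding used in the table.
for f in (100, 300, 500):
    print(f, attenuation(f), attenuation(f, 500.0, 0.25))
```

The two curves differ only slightly because 0.03^((500/800)²) is about 0.254 rather than exactly 0.25.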

Settings

The settings that control the recruitment of the candidates are:

Time step (s) (standard value: 0.0)
the measurement interval (frame duration), in seconds. If you supply 0, Praat will use a time step of 0.75 / (pitch floor), e.g. 0.015 seconds if the pitch floor is 50 Hz; in this example, Praat computes 66.7 pitch values per second.
Pitch floor (Hz) (standard value: 50 Hz)
candidates below this frequency will not be recruited. This parameter determines the effective length of the analysis window: it will be three times the longest period, i.e., if the pitch floor is 50 Hz, the window will be effectively 3/50 = 0.06 seconds long.

Note that if you set the time step to zero, the analysis windows for consecutive measurements will overlap appreciably: Praat will always compute 4 pitch values within one window length, i.e., the degree of oversampling is 4.

Pitch ceiling (Hz) (standard value: 800 Hz)
candidates above this frequency will be ignored.
Max. number of candidates (standard value: 15)
each frame will contain at most this many pitch candidates. One of them is the “unvoiced candidate”; the others correspond to time lags over which the signal is more or less similar to itself.
Very accurate (standard value: off)
if off, the window is a Hanning window with a physical length of 3 / (pitch floor). If on, the window is a Gaussian window with a physical length of 6 / (pitch floor), i.e. twice the effective length.
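
The relations between pitch floor, time step and window length described above can be summarized in a few lines of Python. This is a sketch for checking the arithmetic only; the function analysis_timing and its dictionary keys are hypothetical names, not part of Praat.

```python
def analysis_timing(pitch_floor_hz=50.0, time_step_s=0.0, very_accurate=False):
    """Derive the timing quantities described in the settings above."""
    # A time step of 0 means "automatic": 0.75 / (pitch floor).
    if time_step_s == 0.0:
        time_step_s = 0.75 / pitch_floor_hz
    # The effective window spans three periods of the pitch floor.
    effective_window_s = 3.0 / pitch_floor_hz
    # "Very accurate" uses a Gaussian window of twice the effective length.
    physical_window_s = (6.0 if very_accurate else 3.0) / pitch_floor_hz
    return {
        "time_step_s": time_step_s,
        "frames_per_second": 1.0 / time_step_s,
        "effective_window_s": effective_window_s,
        "physical_window_s": physical_window_s,
        # Number of pitch values computed within one effective window length.
        "oversampling": effective_window_s / time_step_s,
    }

timing = analysis_timing(pitch_floor_hz=50.0)
for name, value in timing.items():
    print(f"{name}: {value:.4g}")
```

With the automatic time step, the oversampling is 4 for any pitch floor, because both the time step and the window length scale with 1 / (pitch floor).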

A post-processing algorithm seeks the cheapest path through the candidates. The settings that determine the cheapest path are:

Silence threshold (standard value: 0.09)
frames that do not contain amplitudes above this threshold (relative to the global maximum amplitude) are probably silent.
Voicing threshold (standard value: 0.50)
the strength of the unvoiced candidate, relative to the maximum possible autocorrelation. If the amount of periodic energy in a frame is more than this fraction of the total energy (the remainder being noise), then Praat will prefer to regard this frame as voiced; otherwise as unvoiced. To increase the number of unvoiced decisions, increase the voicing threshold.
Octave cost (standard value: 0.055 per octave)
degree of favouring of high-frequency candidates, relative to the maximum possible autocorrelation. This is necessary because even (or: especially) in the case of a perfectly periodic signal, all undertones of F0 are just as strong candidates as F0 itself. To more strongly favour recruitment of high-frequency candidates, increase this value.
Octave-jump cost (standard value: 0.35)
degree of disfavouring of pitch changes, relative to the maximum possible autocorrelation. To decrease the number of large frequency jumps, increase this value. In contrast with what is described in the article, this value will be corrected for the time step: multiply by 0.01 s / TimeStep to get the value in the way it is used in the formulas in the article.
Voiced / unvoiced cost (standard value: 0.14)
degree of disfavouring of voiced/unvoiced transitions, relative to the maximum possible autocorrelation. To decrease the number of voiced/unvoiced transitions, increase this value. In contrast with what is described in the article, this value will be corrected for the time step: multiply by 0.01 s / TimeStep to get the value in the way it is used in the formulas in the article.
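
The time-step correction mentioned for the octave-jump cost and the voiced / unvoiced cost amounts to a single multiplication. The Python sketch below only illustrates that conversion as worded above; the function name article_cost is hypothetical.

```python
def article_cost(setting_value, time_step_s):
    """Convert a cost setting to the value used in the article's formulas.

    Follows the rule stated above: multiply by 0.01 s / time step.
    """
    return setting_value * 0.01 / time_step_s

# With a time step of 0.01 s the setting is used unchanged.
print(article_cost(0.35, 0.01))

# With the automatic time step for a 50 Hz pitch floor (0.015 s),
# the standard costs are scaled by 0.01 / 0.015.
print(article_cost(0.35, 0.015))  # octave-jump cost
print(article_cost(0.14, 0.015))  # voiced / unvoiced cost
```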

© Paul Boersma 2023