Sound: To Spectrum...

A command that appears in the Spectrum menu if you select one or more Sound objects. It turns each selected Sound into a Spectrum by an overall spectral analysis, namely a Fourier transform.

Setting

Fast
determines whether zeroes are appended to the sound such that the number of samples is a power of two. This can appreciably speed up the Fourier transform.
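The effect of the Fast setting on the number of samples can be sketched as follows. This is only an illustration of padding to a power of two, not Praat's own code; the function names `padded_length` and `pad_with_zeros` are made up for this example.

```python
def padded_length(n_samples: int) -> int:
    """Smallest power of two that is >= n_samples (hypothetical helper)."""
    if n_samples < 1:
        raise ValueError("need at least one sample")
    return 1 << (n_samples - 1).bit_length()


def pad_with_zeros(samples, fast=True):
    """Return a copy of the samples, with zeroes appended if fast is True."""
    padded = list(samples)
    if fast:
        padded += [0.0] * (padded_length(len(padded)) - len(padded))
    return padded


print(padded_length(20457))  # 32768
print(padded_length(32768))  # 32768 (already a power of two)
```

Under this sketch, a sound of 20,457 samples would be extended to 32,768 samples, and a sound that already has 32,768 samples would be left unchanged.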

Mathematical procedure

For the Fourier transform, the Praat-defined time domain of the Sound is ignored. Instead, its time domain is considered to run from t=0 to t=T, where t=0 is supposed to be aligned with the first sample, and T is the total duration of the samples, i.e. NΔt, where N is the number of samples and Δt is the sampling period. Thus, the last sample lies at t=T–Δt.

For a sound x(t), defined for all times t in the domain (0, T), the complex spectrum X(f) for any frequency f is the forward Fourier transform of x(t), with a negative exponent:

$$X(f) = \int_0^T x(t)\, e^{-2\pi i f t}\, dt$$

If the Sound is expressed in Pascal (Pa), the Spectrum is expressed in Pa·s, or Pa/Hz. Since a Spectrum object can only contain a finite number of frequency samples, it is only computed for frequencies that are multiples of Δf = 1/T. The number of those frequencies is determined by the number of samples N of the sound.
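For a sampled sound, the integral can be approximated by a sum over the samples. The sketch below assumes a simple rectangle rule, with sample n at time nΔt, so that for f = kΔf the product ft equals kn/N. Whether Praat discretizes the integral in exactly this way is an assumption of this example, and the name `spectrum_at` is hypothetical.

```python
import cmath


def spectrum_at(samples, dt, k):
    """Approximate X(k * df), with df = 1 / (N * dt), by a rectangle-rule sum.

    If the samples are in Pa and dt is in seconds, the result is in Pa*s.
    """
    n_total = len(samples)
    acc = 0j
    for n, x in enumerate(samples):
        # f * t = (k / (N * dt)) * (n * dt) = k * n / N
        acc += x * cmath.exp(-2j * cmath.pi * k * n / n_total)
    return acc * dt


# A constant 1 Pa sound lasting T = 4 * 0.25 s = 1 s.
constant = [1.0, 1.0, 1.0, 1.0]
x0 = spectrum_at(constant, 0.25, 0)  # close to 1 Pa*s: the time integral of the sound
x1 = spectrum_at(constant, 0.25, 1)  # close to 0: no energy at f = df
```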

If N is odd, there will be N frequency samples. For instance, if the sound has 20,457 samples, the spectrum will be computed at the frequencies -10,228Δf, -10,227Δf, ..., –Δf, 0, +Δf, ..., +10,227Δf, +10,228Δf. If we suppose that a frequency sample represents a frequency bin with a width of Δf, we see that the frequency samples span adjacent frequency ranges, e.g. the first sample runs from -10,228.5Δf to -10,227.5Δf, the second from -10,227.5Δf to -10,226.5Δf. Together, the frequency samples span the frequency domain of the spectrum, which runs from -F to +F, where F = 10,228.5Δf. We can see that this frequency equals one half of the sampling frequency of the original sound: F = 10,228.5Δf = 10,228.5/T = 10,228.5/(20,457Δt) = 0.5/Δt. This is the so-called Nyquist frequency.

If N is even, there will be N+1 frequency samples. For instance, if the sound has 32,768 samples, the spectrum will be computed at the frequencies -16,384Δf, -16,383Δf, ..., -Δf, 0, +Δf, ..., +16,383Δf, +16,384Δf. Again, the frequency samples span adjacent frequency ranges, but the first and last samples are only half as wide as the rest, i.e. the first sample runs from -16,384Δf to -16,383.5Δf, the second from -16,383.5Δf to -16,382.5Δf, and the last from +16,383.5Δf to +16,384Δf. Together, the frequency samples again span the frequency domain of the spectrum, which runs from –F to +F, where F = 16,384Δf = 0.5/Δt, the Nyquist frequency.
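The bookkeeping for odd and even N can be checked with a few lines of code. The helper names below are hypothetical; frequencies are expressed as integer multiples of Δf.

```python
def frequency_indices(n_samples):
    """Multiples k of df at which the spectrum is computed."""
    half = n_samples // 2  # (N - 1) / 2 for odd N, N / 2 for even N
    return list(range(-half, half + 1))


def domain_edge_in_bins(n_samples):
    """Upper edge F of the frequency domain, in units of df."""
    # Odd N: the last full-width bin ends at (N - 1) / 2 + 0.5 = N / 2.
    # Even N: the last half-width bin ends at N / 2.
    return n_samples / 2


def nyquist_frequency(dt):
    """Half the sampling frequency, in Hz if dt is in seconds."""
    return 0.5 / dt


odd = frequency_indices(20457)
even = frequency_indices(32768)
print(len(odd), odd[0], odd[-1])     # 20457 -10228 10228
print(len(even), even[0], even[-1])  # 32769 -16384 16384
print(domain_edge_in_bins(20457))    # 10228.5
```

In both cases F equals (N/2)Δf = N/(2NΔt) = 0.5/Δt, which is what `nyquist_frequency` returns.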

Storage

In a Spectrum object, Praat stores the real and imaginary parts of the complex spectrum separately. The real part is equal to the cosine transform:

$$\operatorname{re} X(f) = \int_0^T x(t) \cos(2\pi f t)\, dt$$

The imaginary part is equal to the reverse of the sine transform:

$$\operatorname{im} X(f) = -\int_0^T x(t) \sin(2\pi f t)\, dt$$

The complex spectrum can be reconstructed from the real and imaginary part as follows:

X(f) = re X(f) + i im X(f)

Since the sound x(t) is real-valued, and since the cosine is a symmetric function of its argument and the sine an antisymmetric one (so that cos(2πft) is symmetric in f and sin(2πft) is antisymmetric in f), the complex spectrum for a negative frequency is the complex conjugate of the complex spectrum for the corresponding positive frequency:

X(-f) = re X(-f) + i im X(-f) = re X(f) - i im X(f) = X*(f)
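The conjugate symmetry can be verified numerically for a small real-valued signal. The sketch below replaces the integrals by rectangle-rule sums over the samples, which is an assumption of this example; the function names and the sample values are made up.

```python
import math


def real_part(samples, dt, k):
    """Cosine transform at f = k * df, approximated by a sum."""
    n_total = len(samples)
    return dt * sum(x * math.cos(2 * math.pi * k * n / n_total)
                    for n, x in enumerate(samples))


def imag_part(samples, dt, k):
    """Minus the sine transform at f = k * df, approximated by a sum."""
    n_total = len(samples)
    return -dt * sum(x * math.sin(2 * math.pi * k * n / n_total)
                     for n, x in enumerate(samples))


samples = [0.3, -1.2, 0.8, 2.0, -0.5]  # arbitrary real-valued test signal
dt = 0.001
for k in (1, 2):
    # re X(-f) = re X(f) and im X(-f) = -im X(f)
    assert math.isclose(real_part(samples, dt, -k), real_part(samples, dt, k), abs_tol=1e-15)
    assert math.isclose(imag_part(samples, dt, -k), -imag_part(samples, dt, k), abs_tol=1e-15)
```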

For purposes of storage, therefore, the negative frequencies are superfluous. For this reason, the Spectrum object stores re X(f) and im X(f) only for frequencies f = 0, Δf, 2Δf... In the case of a sound with 20,457 samples, the Spectrum object contains the real part of X(0) (its imaginary part is always zero), and the real and imaginary parts of X(f) for frequencies from Δf to 10,228Δf, which makes in total 1+2·10,228 = 20,457 real values. In the case of a sound with 32,768 samples, the Spectrum object contains the real parts of X(0) and X(16,384Δf) (their imaginary parts are always zero), and the real and imaginary parts of X(f) for frequencies from Δf to 16,383Δf, which makes in total 2+2·16,383 = 32,768 real values.
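The count of stored real values can be written out as a small function. This only reproduces the arithmetic of the paragraph above; it says nothing about how the values are laid out in memory, and the name `stored_real_values` is hypothetical.

```python
def stored_real_values(n_samples):
    """Number of real values kept for frequencies 0, df, 2*df, ..."""
    if n_samples % 2 == 1:
        # Odd N: re X(0), plus re and im for df .. ((N - 1) / 2) * df.
        highest_complex_bin = (n_samples - 1) // 2
        return 1 + 2 * highest_complex_bin
    # Even N: re X(0) and re X(Nyquist), plus re and im for df .. (N / 2 - 1) * df.
    highest_complex_bin = n_samples // 2 - 1
    return 2 + 2 * highest_complex_bin


print(stored_real_values(20457))  # 20457 = 1 + 2 * 10228
print(stored_real_values(32768))  # 32768 = 2 + 2 * 16383
```

In both cases the number of stored real values equals the number of samples N.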

Since the negative frequencies have been removed, the frequency domain now runs from 0 to F. This means that the first frequency bin is now only 0.5Δf wide (i.e. as wide as the last bin for even-N spectra), which has consequences for computations of energies.

Behaviour

If you perform Spectrum: To Sound on the resulting Spectrum object, a Sound is created that is equal to the original Sound (or to the original Sound with appended zeroes).
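The round trip can be illustrated with a direct (slow) forward and inverse transform. This is a sketch under assumptions, not Praat's implementation: the integral is replaced by a rectangle-rule sum, and the bins are indexed k = 0 .. N-1, where by the periodicity of the sampled transform a bin k > N/2 stands for the negative frequency (k - N)Δf. The names `to_spectrum` and `to_sound` are hypothetical.

```python
import cmath


def to_spectrum(samples, dt):
    """Sampled forward transform X(k * df) for k = 0 .. N-1."""
    n_total = len(samples)
    return [dt * sum(x * cmath.exp(-2j * cmath.pi * k * n / n_total)
                     for n, x in enumerate(samples))
            for k in range(n_total)]


def to_sound(spectrum, dt):
    """Inverse transform: x(n * dt) = df * sum over k of X(k * df) e^{+2 pi i k n / N}."""
    n_total = len(spectrum)
    df = 1.0 / (n_total * dt)
    return [(df * sum(value * cmath.exp(2j * cmath.pi * k * n / n_total)
                      for k, value in enumerate(spectrum))).real
            for n in range(n_total)]


original = [0.3, -1.2, 0.8, 2.0, -0.5]  # arbitrary test signal
restored = to_sound(to_spectrum(original, 0.001), 0.001)
assert all(abs(a - b) < 1e-12 for a, b in zip(original, restored))
```

The factors Δt in the forward sum and Δf in the inverse sum multiply to 1/N, which is why the samples come back unchanged.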

Properties

The frequency integral over the squared Spectrum equals the time integral over the squared Sound:

$$\int_{-F}^{+F} |X(f)|^2\, df = \int_0^T |x(t)|^2\, dt$$

This is called Parseval's theorem.
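The energy equality above can be checked numerically. The sketch below assumes a rectangle-rule discretization of both integrals and uses the bin widths described under "Mathematical procedure": every bin is Δf wide, except that for even N the first and last bins are only 0.5Δf wide. All function names and the sample values are made up for this example.

```python
import cmath


def spectrum_at(samples, dt, k):
    """Approximate X(k * df) by a rectangle-rule sum over the samples."""
    n_total = len(samples)
    return dt * sum(x * cmath.exp(-2j * cmath.pi * k * n / n_total)
                    for n, x in enumerate(samples))


def time_energy(samples, dt):
    """Approximate the time integral of |x(t)|^2."""
    return dt * sum(x * x for x in samples)


def spectral_energy(samples, dt):
    """Approximate the frequency integral of |X(f)|^2 from -F to +F."""
    n_total = len(samples)
    df = 1.0 / (n_total * dt)
    half = n_total // 2
    energy = 0.0
    for k in range(-half, half + 1):
        width = df
        if n_total % 2 == 0 and abs(k) == half:
            width = 0.5 * df  # half-width edge bins for even N
        energy += abs(spectrum_at(samples, dt, k)) ** 2 * width
    return energy


for samples in ([0.3, -1.2, 0.8, 2.0, -0.5], [0.3, -1.2, 0.8, 2.0]):
    assert abs(spectral_energy(samples, 0.001) - time_energy(samples, 0.001)) < 1e-12
```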

© ppgb 20041123