HTK parameter file format

HTK parameter format files consist of a contiguous sequence of frames preceded by a header. Each frame is a vector of either 2-byte integers or 4-byte floats. 2-byte integers are used for compressed forms and for vector quantised data. All multi-byte numbers are written in big-endian byte order.

The HTK file format header is 12 bytes long and contains the following data:

numberOfFrames (4-byte integer)
The number of analysis frames in a file
samplePeriod (4-byte integer)
The sample period in units of 100 ns. A sampling frequency of 10 kHz would correspond to a sample period of 0.0001 s and to a value of 1000 in this field.
frameSize (2-byte integer)
The number of bytes per frame.
parameterKind (2-byte integer)
A code indicating what kind of frames the file contains.
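
As an illustration, the header layout above can be unpacked with Python's standard struct module. This is a minimal sketch, not part of any HTK tooling: the names HTKHeader and parse_htk_header are hypothetical, and the fields are assumed to be signed integers, which the description above does not state explicitly.

```python
import struct
from typing import NamedTuple


class HTKHeader(NamedTuple):
    """The four header fields, in file order (hypothetical helper type)."""
    number_of_frames: int   # 4-byte integer
    sample_period: int      # 4-byte integer, in units of 100 ns
    frame_size: int         # 2-byte integer, bytes per frame
    parameter_kind: int     # 2-byte integer


def parse_htk_header(data: bytes) -> HTKHeader:
    """Interpret the first 12 bytes of `data` as a big-endian HTK header.

    Assumption: all fields are signed. ">" selects big-endian order
    without padding, so "iihh" is exactly 4 + 4 + 2 + 2 = 12 bytes.
    """
    if len(data) < 12:
        raise ValueError("an HTK header needs at least 12 bytes")
    return HTKHeader(*struct.unpack(">iihh", data[:12]))
```

A typical use would be to read the first 12 bytes of a file opened in binary mode and pass them to parse_htk_header; the sample period in seconds is then sample_period * 100e-9.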

Remarks

The HTK parameter files do not contain specific information that identifies them as HTK parameter files. However, the file type can be deduced as follows. If we take any file and interpret its first 12 bytes as the above format specifies, then we know that the first three numbers read have to be positive integers, that frameSize has to be an even number, and that numberOfFrames * frameSize + 12 must be equal to the number of bytes in the file. The chance that a random data file fulfils these conditions is very small.
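
The consistency check described above could be sketched in Python as follows. The function name looks_like_htk is hypothetical, the header fields are assumed to be signed, and the sketch takes the complete file contents as bytes; for large files one would compare against the file size on disk instead.

```python
import struct


def looks_like_htk(data: bytes) -> bool:
    """Heuristic test: could `data` be a complete HTK parameter file?"""
    if len(data) < 12:
        return False
    # Big-endian header: numberOfFrames, samplePeriod, frameSize, parameterKind.
    n_frames, sample_period, frame_size, _kind = struct.unpack(">iihh", data[:12])
    # The first three numbers have to be positive.
    if n_frames <= 0 or sample_period <= 0 or frame_size <= 0:
        return False
    # Frames hold 2-byte or 4-byte values, so frameSize has to be even.
    if frame_size % 2 != 0:
        return False
    # Header plus all frames must account for every byte in the file.
    return n_frames * frame_size + 12 == len(data)
```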

VTRFormants data in HTK parameter format

The VTRFormants data of Deng et al. (2006) can serve as reference formant frequency values for (part of) the TIMIT acoustic phonetic corpus. These data are stored in HTK parameter files with extension ".fb". They can be downloaded from http://www.seas.ucla.edu/spapl/VTRFormants.html.

HTK parameter files do not contain timing information and therefore we can only calculate the domain of the Formant from external information. From the Deng et al. (2006) paper we assume that the data were taken every 10 ms and therefore the best guess for the duration of the Formant equals numberOfFrames * 0.01 s. The value in these files for the samplePeriod is 10000, which is a factor 10 off from the correct value of 100000 in units of 100 ns (10 ms / 100 ns = 100000).
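
The arithmetic behind this comparison can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the 10 ms frame step is the assumption taken from the paper, not something read from the file.

```python
UNITS_PER_SECOND = 10_000_000  # one samplePeriod unit is 100 ns


def sample_period_to_seconds(sample_period: int) -> float:
    """Convert a samplePeriod header value to seconds."""
    return sample_period / UNITS_PER_SECOND


def estimated_duration(number_of_frames: int, frame_step: float = 0.01) -> float:
    """Best-guess duration in seconds, assuming one frame every 10 ms."""
    return number_of_frames * frame_step


# The stored value 10000 corresponds to about 1 ms, roughly ten times
# shorter than the 10 ms frame step assumed from the paper.
stored_period = sample_period_to_seconds(10000)
ratio = 0.01 / stored_period
```

Because the stored samplePeriod disagrees with the assumed frame step, the duration estimate deliberately ignores the header value and uses the external 10 ms figure.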


© djmw 20210311