IFA Dialog Video corpus

This corpus was made possible by grant 276-75-002 of
the Netherlands Organization for Scientific Research.

The IFA Dialog Video corpus is a collection of annotated video recordings of friendly Face-to-Face dialogs. It is modelled on the Face-to-Face dialogs in the Spoken Dutch Corpus (CGN). The procedures and design of the corpus were adapted to make this corpus useful for other researchers of Dutch speech. For this corpus, 20 dialogs of 15 minutes each were recorded and annotated, for a total of 5 hours of speech. To stay close to the very useful Face-to-Face dialogs in the CGN, pairs of well-acquainted participants were selected: good friends, relatives, or long-time colleagues. The participants were allowed to talk about any topic they wanted.

In total, 20 recordings were annotated to the same, or updated, standards as the original CGN. Only the initial orthographic transcription was done by hand; the other CGN-format annotations were generated automatically. Two further manual annotations were added: a functional annotation of dialog utterances and an annotation of gaze direction.

See also the LREC paper The IFADV corpus: A free dialog video corpus
(van Son, R., Wesseling, W., Sanders, E., and van den Heuvel, H. (2008). LREC'08, Marrakech)


Recordings were made with two gen-locked JVC TK-C1480B analog color video cameras.

Gen-lock ensures synchronization of all frames of the two cameras. Recordings were digitized using two Canopus ADVC110 digital video converters. Recordings were stored unprocessed on disk, i.e., in DV format with 48 kHz 16-bit PCM sound.
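
As a rough illustration of what the audio format implies, the sketch below computes the number of samples and the uncompressed size of the PCM audio for one recording. The 48 kHz sample rate and 16-bit depth come from the description above, and the 900-second duration from the release note. The channel count is not stated in this document, so it is left as a parameter; the function name pcm_bytes is hypothetical.

```python
SAMPLE_RATE_HZ = 48_000   # from the recording description
BITS_PER_SAMPLE = 16      # from the recording description
DURATION_S = 900          # recording length given in the release note


def pcm_bytes(duration_s: int, channels: int) -> int:
    """Size in bytes of uncompressed PCM audio for the given duration.

    `channels` is an assumption supplied by the caller; this document
    does not state how many audio channels each recording has.
    """
    bytes_per_sample = BITS_PER_SAMPLE // 8
    return SAMPLE_RATE_HZ * bytes_per_sample * channels * duration_s


# Samples per channel in one 900-second recording.
samples_per_channel = SAMPLE_RATE_HZ * DURATION_S

if __name__ == "__main__":
    print("samples per channel:", samples_per_channel)
    print("bytes for one channel:", pcm_bytes(DURATION_S, channels=1))
```

These figures cover the audio only; the DV video stream is not included in the estimate.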

Each camera was positioned to the left of one speaker and focussed on the face of the other. Subjects wore a Samson QV head-set microphone.

Subjects first spoke some scripted sentences. Then they were instructed to speak freely while preferably avoiding sensitive material or identifying people by name. All subjects signed an informed consent form and transferred all copyrights to the Dutch Language Union (Nederlandse Taalunie).

Point your IMDI browser to: http://www.fon.hum.uva.nl/IFA-SpokenLanguageCorpora/IFADVcorpus/Annotations/IMDI/IFADVcorpus.imdi


Release note:
The original recordings contained dropped frames, which caused the two recordings of each dialog to drift out of sync. This has been corrected by duplicating frames. The procedure is described in the SMILoverlay files. Only the corrected recordings are made available here. The original recordings, with the missing frames, are available on request. Recordings are limited to 900 seconds (15 min) and corrected for dropped frames. That is, the video frames and audio files of both recordings are synchronized.
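
The general idea of repairing dropped frames by duplication can be sketched as follows. This is a minimal illustration only, not the procedure documented in the SMILoverlay files: it assumes that each captured frame carries the number of the time slot it belongs to, and the function name fill_dropped_frames is hypothetical.

```python
from typing import List, Sequence


def fill_dropped_frames(received: Sequence[int], total_frames: int) -> List[int]:
    """Return, for each output slot, the source frame to show in that slot.

    `received` holds the sorted slot numbers of the frames that were
    actually captured (an assumed representation). A slot with no
    captured frame reuses the most recent captured frame, i.e. that
    frame is duplicated.
    """
    if not received:
        raise ValueError("at least one captured frame is required")
    captured = set(received)
    output = []
    # Slots before the first captured frame reuse that first frame.
    last = received[0]
    for slot in range(total_frames):
        if slot in captured:
            last = slot
        output.append(last)
    return output


if __name__ == "__main__":
    # Slots 2, 5 and 6 were dropped in this made-up example.
    print(fill_dropped_frames([0, 1, 3, 4, 7], total_frames=8))
```

Because every slot is filled, the repaired stream has exactly total_frames frames, so two recordings repaired to the same length line up frame by frame again.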

Created by R.J.J.H. van Son and Wieneke Wesseling of the ACLC. Annotations were performed by SPEX.

Please note that these materials are distributed under the GPLv2 license. This license only covers the copyright protection of the corpus. Publishing or broadcasting materials from this corpus might be covered by other laws, e.g., laws protecting the privacy and "good name" of the subjects. This is especially relevant if the materials are used outside of an educational or R&D context. Please read the forms in the Documents directory for more information (in Dutch).

