(see copyright notice)
The IFA Dialog Video corpus is a collection of annotated video recordings of friendly Face-to-Face dialogs. It is modelled on the Face-to-Face dialogs in the Spoken Dutch Corpus (CGN). The procedures and design of the corpus were adapted to make this corpus useful for other researchers of Dutch speech. For this corpus, 20 dialogs of 15 minutes each were recorded and annotated, 5 hours of speech in total. To stay close to the very useful Face-to-Face dialogs in the CGN, pairs of well-acquainted participants were selected: good friends, relatives, or long-time colleagues. The participants were allowed to talk about any topic they wanted.
In total, 20 recordings were annotated to the same, or updated, standards as the original CGN. Only the initial orthographic transcription was done by hand; the other CGN-format annotations were generated automatically. Two further manual annotations were added: a functional annotation of dialog utterances and an annotation of gaze direction.
See also the LREC paper The IFADV corpus: A free dialog video corpus
(van Son, R., Wesseling, W., Sanders, E., and van den Heuvel, H. (2008). LREC'08, Marrakech)
Recordings were made with two gen-locked JVC TK-C1480B analog color video cameras.
Each camera was positioned to the left of one speaker and focussed on the face of the other. Subjects wore a Samson QV head-set microphone.
Subjects first spoke some scripted sentences. Then they were instructed to speak freely while preferably avoiding sensitive material or identifying people by name. All subjects signed an informed consent form and transferred all copyrights to the Dutch Language Union (Nederlandse Taalunie).
Point your IMDI browser to: http://www.fon.hum.uva.nl/IFA-SpokenLanguageCorpora/IFADVcorpus/Annotations/IMDI/IFADVcorpus.imdi
The original recordings contained dropped frames, which caused the two recordings of each dialog to go out of sync. This has been corrected by duplicating frames; the procedure is described in the SMILoverlay files. Only the corrected recordings are made available here. The original recordings, with the missing frames, are available on request. Recordings are limited to 900 seconds (15 min) and corrected for dropped frames, so that the video frames and audio files of both recordings are synchronized.
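The idea of repairing dropped frames by duplication can be illustrated with a small sketch. This is not the procedure recorded in the SMILoverlay files, only a minimal illustration under stated assumptions: the function name `fill_dropped_frames` is hypothetical, the captured frames are assumed to be identified by their original frame numbers, and the 25 frames-per-second rate is an assumption that the text above does not state.

```python
from typing import List

FPS = 25          # assumption: frame rate is not stated in this document
DURATION_S = 900  # recordings are limited to 900 seconds (15 min)


def fill_dropped_frames(captured: List[int], total_frames: int) -> List[int]:
    """Map every output position to the captured frame that should be shown.

    `captured` lists the original frame numbers that survived capture,
    in ascending order. A position whose frame was dropped is filled by
    repeating the most recent captured frame. A gap at the very start is
    filled with the first captured frame.
    """
    if not captured:
        raise ValueError("no captured frames")
    available = set(captured)
    last = captured[0]
    output = []
    for n in range(total_frames):
        if n in available:
            last = n          # frame n exists, use it
        output.append(last)   # otherwise repeat the previous frame
    return output


# Frames 2, 5 and 6 were dropped; each gap repeats the preceding frame.
repaired = fill_dropped_frames([0, 1, 3, 4, 7], total_frames=8)
print(repaired)  # [0, 1, 1, 3, 4, 4, 4, 7]

# Under the assumed frame rate, a full 900-second recording has this many frames.
print(FPS * DURATION_S)  # 22500
```

Because the repaired sequence has exactly one entry per frame position, two recordings repaired this way have the same length and the same timeline, which is what keeps them aligned with each other and with the audio.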
mencoder -quiet -af volnorm=2:0.25 -vf pp=autolevels:fullyrange -of avi -ovc lavc -lavcopts vcodec=msmpeg4:vbitrate=2400000:vhq:keyint=50 -oac mp3lame -o outfile.avi infile.dv;
ffmpeg2theora --format dv --videoquality 4 --sharpness 1 --pp autolevels:fullyrange --license GPLv2 -o outfile.ogv infile.dv
IFA Dialog Video Corpus.
Copyright © 2007 Nederlandse Taal Unie
This corpus was made possible by grant 276-75-002 of the Netherlands Organization for Scientific Research
(Created by R.J.J.H. van Son and Wieneke Wesseling of the ACLC. Annotations were performed by SPEX)
Please note that these materials are distributed under the GPLv2 license. This license only covers the copyright protection of the corpus. Publishing or broadcasting materials from this corpus might be covered by other laws, e.g., laws protecting the privacy and "good name" of the subjects. This is especially relevant if the materials are used outside of an educational or R&D context. Please read the forms in the Documents directory for more information (in Dutch).
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.