The logical layout of the corpus is stored in the CorpusLayout.tsv file as a table of tab-separated values.
At the bottom of this page is an example of the CorpusLayout.tsv file. It consists of three columns labeled Key, Value, and Description separated by tabs.
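The three-column layout described above can be read with a few lines of Python. This is only a minimal sketch, not NCB's own reader: the function name `read_corpus_layout` is hypothetical, and it assumes the header row starts with the word Key and that the Description column can be ignored.

```python
import io

def read_corpus_layout(stream):
    """Return a {key: value} dict from a CorpusLayout.tsv-style table.

    Hypothetical helper: skips blank lines and the header row,
    and ignores the Description column.
    """
    layout = {}
    for line in stream:
        fields = line.rstrip("\r\n").split("\t")
        key = fields[0].strip()
        if not key or key == "Key":      # blank line or header row
            continue
        # A missing Value column is treated like an empty value.
        layout[key] = fields[1].strip() if len(fields) > 1 else ""
    return layout

# Two rows taken from the example table at the bottom of this page.
sample = (
    "Key\tValue\tDescription\n"
    "Corpus/Media\tCorpus/Media\tCorpus content: Media files\n"
    "Corpus/Overlays\tCorpusOverlays\tAlternative Corpus Annotations\n"
)
layout = read_corpus_layout(io.StringIO(sample))
print(layout["Corpus/Overlays"])   # CorpusOverlays
```

For a file on disk, the same function could be called on `open("CorpusLayout.tsv", encoding="utf-8")`; the encoding is an assumption.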
NCB divides a corpus into four logical components, each of which is subdivided into parts. The corresponding directories are created when they are absent. If the value of a part is empty (or -), that part does not exist for NCB and is not created when NCB opens the corpus. So, if a corpus has no Recordings component, all the values for the Recordings/ keys should be empty. The logical components are Documentation, Recordings, Corpus, and Evaluation.
Except for Documentation, each logical component of a corpus contains at least four logical subdivisions: Media, Annotations, Texts, and Info.
The Corpus part contains an Overlays directory (Corpus/Overlays), which stores stand-off annotations (wiki.tei-c.org/index.php/Stand-off_markup). Each separate set of stand-off markup is stored in a separate sub-tree under Corpus/Overlays. Note that logically this is a subdirectory of the Corpus part, but by default it is stored next to the Corpus directory.
Note: Adding Recordings/Overlays or Evaluation/Overlays to the CorpusLayout.tsv file will generate these overlay directories automatically.
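The rule that parts with an empty value (or -) do not exist, while directories for all other parts are created when absent, can be sketched as follows. This is an illustration only, not NCB's implementation: the function names are hypothetical, and it assumes that the values in CorpusLayout.tsv are paths relative to the corpus root.

```python
import os
import tempfile

def existing_parts(layout):
    """Keep only the parts whose value is neither empty nor '-'."""
    return {k: v for k, v in layout.items() if v.strip() not in ("", "-")}

def create_part_directories(corpus_root, layout):
    """Create missing part directories; return the keys that were created.

    Assumption: each value is a '/'-separated path relative to corpus_root.
    """
    created = []
    for key, value in existing_parts(layout).items():
        path = os.path.join(corpus_root, *value.strip().split("/"))
        if not os.path.isdir(path):
            os.makedirs(path, exist_ok=True)
            created.append(key)
    return created

layout = {
    "Corpus/Media": "Corpus/Media",
    "Corpus/Overlays": "CorpusOverlays",   # stored next to Corpus
    "Recordings/Media": "",                # no Recordings component
    "Recordings/Info": "-",
}

with tempfile.TemporaryDirectory() as root:
    print(create_part_directories(root, layout))  # ['Corpus/Media', 'Corpus/Overlays']
    print(sorted(os.listdir(root)))               # ['Corpus', 'CorpusOverlays']
```

Calling the function a second time on the same root would return an empty list, since all non-empty parts already exist.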
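The Evaluation part also contains some extra parts: Experiments (control files used in experiments with the stimuli) and Responses (responses to the stimuli).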
A central principle of NCB is that the files related to a given media file, i.e., meta-data, annotations, and text files, are stored in parallel directory paths. That is, if there is a recording in Corpus/Media/my/path/to/a/recording.wav, then its meta-data, annotation, and text files are found under the same relative path in Corpus/Info, Corpus/Annotations, and Corpus/Texts.
In these paths, the leading parts, Corpus/Media, Corpus/Info, Corpus/Annotations, and Corpus/Texts, are logical names. The real names of these directories are taken from the CorpusLayout.tsv table. The part reading /my/path/to/a/recording. is identical in the paths of all four files.
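The parallel-path principle can be illustrated with a small sketch that maps a media file onto the shared path stem in each of the four parts. The function name `parallel_stems` is hypothetical, and because the file extensions of the meta-data, annotation, and text files are not given here, the sketch returns paths without an extension.

```python
from pathlib import PurePosixPath

def parallel_stems(media_file, layout, component="Corpus"):
    """Map a media file to the extension-less path it shares with its
    meta-data, annotation, and text files (hypothetical helper).

    layout maps logical names (e.g. 'Corpus/Info') to real directory names.
    """
    media_dir = PurePosixPath(layout[component + "/Media"])
    # Raises ValueError if media_file is not inside the Media directory.
    relative = PurePosixPath(media_file).relative_to(media_dir)
    stems = {}
    for part in ("Media", "Info", "Annotations", "Texts"):
        real_dir = layout.get(component + "/" + part, "")
        if real_dir in ("", "-"):        # part does not exist in this corpus
            continue
        stems[part] = str(PurePosixPath(real_dir) / relative.with_suffix(""))
    return stems

layout = {
    "Corpus/Media": "Corpus/Media",
    "Corpus/Info": "Corpus/Info",
    "Corpus/Annotations": "Corpus/Annotations",
    "Corpus/Texts": "Corpus/Texts",
}
stems = parallel_stems("Corpus/Media/my/path/to/a/recording.wav", layout)
for part, stem in stems.items():
    print(part, stem)
```

Because the real directory names are looked up in the layout table, renaming a directory in CorpusLayout.tsv changes only the leading part of each result; the /my/path/to/a/recording portion stays the same.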
Key  Value  Description
Documentation  Documentation  The documentation of the corpus
Documentation/Speakers  Documentation/Speakers  The documentation about the speakers
Recordings/Media  Recordings/Media  Original audio recordings
Recordings/Annotations  Recordings/Annotations  Annotations of the original audio recordings
Recordings/Texts  Recordings/Texts  Original texts used for the recordings
Recordings/Info  Recordings/Info  Information and meta data on the recordings
Corpus/Media  Corpus/Media  Corpus content: Media files
Corpus/Annotations  Corpus/Annotations  Corpus Content: Annotations
Corpus/Texts  Corpus/Texts  Corpus Content: Texts
Corpus/Info  Corpus/Info  Corpus Content: Info and meta data
Corpus/Overlays  CorpusOverlays  Alternative Corpus Annotations
Evaluation/Media  Evaluation/Media  Stimuli used in evaluations: Media files
Evaluation/Annotations  Evaluation/Annotations  Stimuli used in evaluations: Annotations
Evaluation/Texts  Evaluation/Texts  Stimuli used in evaluations: Text files
Evaluation/Info  Evaluation/Info  Stimuli used in evaluations: Info and meta data
Evaluation/Experiments  Evaluation/Experiments  Control files used in experiments with the Stimuli
Evaluation/Responses  Evaluation/Responses  Responses to the Stimuli
© R.J.J.H. van Son, April 20, 2012