WFT-GTB Data
DiscAn
DiscAN: Towards a Discourse Annotation system for Dutch language corpora
LAISEANG
LAISEANG: Language Archive of Insular South East Asia and West New Guinea
Summary: The LAISEANG corpus contains an unrivaled collection of multimedia materials and written documents from over 50 languages in Insular South East Asia and West New Guinea.
EMIT-X
EMIT-X: Early-Modern Image and Text eXchange
VU-DNC
VU-DNC: VU Diachronic Newspaper Corpus
Summary: VU-DNC is a unique diachronic corpus of Dutch newspaper articles from five major Dutch newspapers from 1950/1951 and 2002 (2 MW). The VU-DNC has been annotated for quotations, which lets the researcher distinguish the words directly under the responsibility of the journalist from quoted material.
NEHOL
NEHOL: Negerhollands Database
C-DSD
C-DSD: Curating the Dutch Song Database
D-LUCEA
D-LUCEA: Database of the Longitudinal Utrecht Collection of English Accents
VALID
VALID - vulnerability in language acquisition: language impairments in Dutch
Summary: An open-access multimedia archive of language pathology data collected in the Netherlands, primarily on Dutch, consisting of audio files and transcripts. The corpus currently contains 5 different data sets. The VALID data archive can bring together old, current, and future data.
DBD/TCULT
DBD - Dutch Bilingualism Database, TCULT - Talen en Culturen in Utrechtse Lombok en Transvaal
Summary: The data sets supported by CLARIN NL are part of an existing collection, the Dutch Bilingualism Database, housed at the MPI for Psycholinguistics; both are also CLARIN compatible. The additional DBD/TCULT data were curated by the CLARIN DCS (http://dev.clarin.nl/node/1963) and delivered in February 2014.