DiscAN: Towards a Discourse Annotation system for Dutch language corpora
LAISEANG: Language Archive of Insular South East Asia and West New Guinea
The LAISEANG corpus contains an unrivaled collection of multimedia materials and written documents from over 50 languages in Insular South East Asia and West New Guinea.
EMIT-X: Early-Modern Image and Text eXchange
VU-DNC: VU Diachronic Newspaper Corpus
VU-DNC is a diachronic corpus of articles from five major Dutch newspapers, sampled from 1950/1951 and 2002 (2 million words). The corpus has been annotated for quotations, which lets the researcher distinguish the words that are directly the journalist's responsibility from quoted material.
NEHOL: Negerhollands Database
C-DSD: Curating the Dutch Song Database
D-LUCEA: Database of the Longitudinal Utrecht Collection of English Accents
VALID - vulnerability in language acquisition: language impairments in Dutch
An open-access multimedia archive of language pathology data collected in the Netherlands, primarily on Dutch, consisting of audio files and transcripts. The corpus currently contains five data sets. The VALID data archive makes it possible to bring together old, current and future data.
DBD - Dutch Bilingualism Database, TCULT - Talen en Culturen in Utrechtse Lombok en Transvaal
The data sets supported by CLARIN NL are part of an existing collection, the Dutch Bilingualism Database housed at the MPI for Psycholinguistics, and both are also CLARIN compatible. The additional DBD / TCULT data were curated by the CLARIN DCS (http://dev.clarin.nl/node/1963) and delivered in February 2014.