VALID - vulnerability in language acquisition: language impairments in Dutch


An open access multimedia archive of language pathology data collected in the Netherlands, primarily on Dutch, consisting of audio files and transcripts. The corpus currently contains five data sets. The VALID data archive can bring together old, current and future data.


VALID, an open access multimedia language data archive, facilitates and promotes innovative research in the area of language impairment. There are a number of compelling reasons for exchanging and sharing this kind of data. Obviously, in a small country like the Netherlands a considerably smaller amount of language data is being collected than in much larger regions, especially so with respect to language disorders, a highly specific research domain. The combination of a wide range of language impairments in one data archive not only enhances the study of similar impairments but also advances comparisons between different disorders. Moreover, the inclusion of different age groups allows for quasi-longitudinal research designs. Finally, analysis of task properties and effects that are specific to pathological language groups can make a significant contribution to evidence-based research.

The following data sets define the launching platform for the VALID data archive, together with the BISLI data set that is currently being curated (FESLI):

  • The SLI RU-Kentalis database: the expression of spatial relations by children with SLI and normally developing children in their spoken language production
  • The UU SLI-Dyslexia project database: early language development in children at familial risk of dyslexia
  • The bilingual deaf children RU-Kentalis database: the bilingual language and communication development of young deaf children in Sign Language of the Netherlands (SLN) and Dutch (D)
  • The ADHD and SLI corpus UvA database: the language and executive functioning profiles of children with ADHD, children with SLI, and typically developing (TD) children
  • The deaf adults RU database: the acquisition of morphosyntactic aspects of Dutch by deaf Dutch adults (late L1/early L2) and hearing Turkish and Moroccan-Arabic L2-learners of Dutch (late L2).

For all data sets concerned, written informed consent has been obtained from the participants or their caretakers. Informants or their caretakers have agreed to share their speech/language data and metadata on the condition of anonymity, which will be ensured by the data providers and infrastructure specialists.

Contacts
  • Project leader: Dr. J. Klatter (Radboud University) 

  • CLARIN center: Max Planck Institute for Psycholinguistics
  • Help contact: (coordinator)

  • Web-sites:
  • User scenarios (screencasts, screenshots): n.a.
  • Manual: n.a.
  • Tool/Service link: (CMDI top node) (metadata browser)
  • Publications:
    • Bergmann, L, van Hout, R and Klatter-Folmer, J. 2017. SLI Diagnostics in Narratives: Exploring the CLARIN-NL VALID Data Archive. In: Odijk, J and van Hessen, A. (eds.) CLARIN in the Low Countries, Pp. 167–180. London: Ubiquity Press. DOI: License: CC-BY 4.0
    • Klatter, J., Hout, R. van, Heuvel, H. van den, Fikkert, P., Baker, A., Jong, J. de, Wijnen, F., Sanders, E. & Trilsbeek, P. (2014). Vulnerability in Acquisition, Language Impairments in Dutch: Creating a VALID Data Archive. Language Resources and Evaluation Conference Proceedings (LREC) May 26-31, 2014, Reykjavik. pp. 357-364.
    • Heuvel, H. van den, Sanders, E., Klatter, J., Hout, R. van, Fikkert, P., Baker, A., Jong, J. de, Wijnen, F. & Trilsbeek, P. (2014). Data curation for a VALID Archive of Dutch Language Impairment Data. In: Dutch Journal of Applied Linguistics, 3(2), pp. 127-135. DOI:10.1075/dujal.3.2.02heu



