

VALID - vulnerability in language acquisition: language impairments in Dutch


An open-access multimedia archive of language pathology data collected in the Netherlands, primarily on Dutch, consisting of audio files and transcripts. The corpus currently contains 5 data sets. The VALID data archive makes it possible to bring together old, current and future data.


SHEBANQ: System for HEBrew Text: ANnotations for Queries and Markup


A web application that enables researchers to perform linguistic queries on the WIVU Hebrew Text Database and preserve significant results as annotations to this resource. The database contains the Hebrew text of the Old Testament, enriched with many linguistic features ranging from the morpheme level up to the discourse level.


DBD - Dutch Bilingualism Database, TCULT - Talen en Culturen in Utrechtse Lombok en Transvaal


The data sets supported by CLARIN NL are part of an existing collection, the Dutch Bilingualism Database housed at the MPI for Psycholinguistics, and both are also CLARIN compatible. The additional DBD / TCULT data were curated by the CLARIN DCS and delivered in February 2014.


GrETEL is a query engine in which linguists can use a natural language example as a starting point for searching a treebank with limited knowledge about tree representations and formal query languages. By allowing users to search for constructions which are similar to the example they provide, we hope to bridge the gap between traditional and computational linguistics.

LASSY Word Relations Search

The LASSY word relations web application makes it possible to search for sentences that contain pairs of words between which there is a grammatical relation. One can search in the Dutch LASSY-SMALL Treebank (1 million tokens), in which the syntactic parse of each sentence has been manually verified, and in (a part of) the LASSY-LARGE Treebank (700 million tokens), in which the syntactic parse of each sentence has been added by the automatic parser Alpino.
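
The idea of searching for word pairs linked by a grammatical relation can be illustrated with a minimal Python sketch. It does not use the LASSY application or its data format: the `Relation` and `Sentence` types, the `find_word_pairs` function, the toy sentences and the relation labels are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Optional


class Relation(NamedTuple):
    """One head-dependent link in a parsed sentence (hypothetical format)."""
    head: str
    label: str
    dependent: str


class Sentence(NamedTuple):
    text: str
    relations: List[Relation]


# Toy data invented for this sketch; not taken from any LASSY treebank.
TOY_TREEBANK = [
    Sentence("De hond eet het brood.",
             [Relation("eet", "su", "hond"), Relation("eet", "obj1", "brood")]),
    Sentence("Het kind eet soep.",
             [Relation("eet", "su", "kind"), Relation("eet", "obj1", "soep")]),
    Sentence("De hond slaapt.",
             [Relation("slaapt", "su", "hond")]),
]


def find_word_pairs(treebank: List[Sentence], head: str, dependent: str,
                    label: Optional[str] = None) -> List[str]:
    """Return the text of each sentence in which `dependent` is linked to `head`.

    If `label` is given, only relations carrying that label count as a match.
    """
    hits = []
    for sentence in treebank:
        for rel in sentence.relations:
            if (rel.head == head and rel.dependent == dependent
                    and (label is None or rel.label == label)):
                hits.append(sentence.text)
                break  # report each sentence at most once
    return hits


matches = find_word_pairs(TOY_TREEBANK, "eet", "hond")
```

Here `matches` holds only the first toy sentence, because it is the only one in which "hond" depends on "eet". Passing a `label` narrows the search to a single relation type.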
