Title
INPOLDER: Integrated Parser and Lemmatizer Dutch in Retrospect
Description
INPOLDER (Integrated Parser and Lemmatizer of Dutch in Retrospect) is a tool that performs morphological tagging, lemmatization, and syntactic parsing of historical Dutch texts. It is built on the Adelheid tool (tagging and lemmatization) and the Collins-Bikel statistical parser.
The Dutch historical record is an essential part of Dutch cultural heritage, and it is vitally important that it be made accessible for research into a wide range of historical and linguistic questions. In the transition from the Middle Ages to the modern era, the Netherlands developed from a region with a diverse group of dialects (Hollandic, Brabantic, Flemish, North-eastern, Limburgian) into a country with a standard language, and there is good reason to believe that this process was an extremely dynamic one. Systematic research into the processes affecting syntax, phonology, morphology and spelling cannot be done without access to lemmatized, tagged and parsed corpora of historical Dutch. In recent years, a tagger-lemmatizer has been developed by Hans van Halteren (Adelheid, also available in the CLARIN infrastructure). INPOLDER complements this enrichment tool with a parser for historical Dutch.
The INPOLDER parser is trained on a subset of the corpus of fourteenth-century texts (Corpus van Reenen/Mulder, CRM; van Reenen and Mulder, 1993; Rem, 2003) and a subset of the Drenthe corpus (DC). CRM consists of 2700 charters from 345 places of origin. The corpus was designed to be representative of local language use in Middle Dutch and to be suitable for all types of linguistic research.
Project
INPOLDER
CLARIN National Project
CLARIN centre
Meertens/HuC
Research domain
Linguistic Subject
Tool task
Country
Netherlands
Tool Type
Research Phase
Tool status
Input format
Input Language
Access Contact
Project Contact
Creator Contact
Documentation
Source code
not specified
Original source
Resource
CMDI File Link
License
unknown