CMD2RDF data









CMDI2RDF - CMDI to RDF conversion

Summary

CMD2RDF offers all the CMDI data from the CLARIN domain through a SPARQL endpoint.
There is a growing amount of online information available in RDF format as Linked Open Data (LOD), and a strong community very actively promotes its use. Publishing information as LOD is also considered an important signal that the publisher actively seeks to share information with a world full of potential new users. Added advantages of LOD, when well used, are its explicit semantics and high interoperability.

Background

The CMD2RDF service was created to allow connection with the growing LOD world and to facilitate experiments within CLARIN that merge CMDI with other, RDF-based, information sources.
One of the promises of LOD is the ease of linking data sets together and answering queries over this ‘cloud’ of LOD datasets. Thus, in the enrichment and use cases part of the project, we looked at other datasets to link to the CLARIN joint metadata domain. We used the WALS N3 RDF dump for one of the use cases. Although it is in the end relatively easy to go from a specific typological feature to the CMD records via a shared URI, the use case still showcased a weakness of the Linked Data approach: one has to carefully inspect the property paths involved. In this case the path was broken, as there was no clear way to go from the WALS feature data to the WALS language info except by extracting the WALS language code from the feature URI pattern and inserting it into the language URI pattern. This shows that, although the big LOD cloud has potential for knowledge discovery across dataset boundaries, design decisions in individual datasets can still hamper algorithms, so manual inspection is needed.
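The URI workaround described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the example.org URIs, the two pattern shapes (a feature value URI ending in a feature id, a hyphen, and a language code; a language URI ending in the same code), and the function name are all assumptions that would need to be checked against the real WALS dump.

```python
import re

# ASSUMPTION: hypothetical URI shapes, not taken from the WALS dump itself.
# A feature value URI is assumed to end in "<feature id>-<language code>".
FEATURE_VALUE_PATTERN = re.compile(
    r"^http://example\.org/wals/valuesets/(?P<feature>[0-9]+[A-Z])-(?P<code>[a-z]+)$"
)
# ASSUMPTION: a language URI is assumed to end in the same language code.
LANGUAGE_URI_TEMPLATE = "http://example.org/wals/languoid/wals_code_{code}"


def language_uri_from_feature_uri(feature_uri):
    """Derive the language URI from a feature value URI.

    Returns None when the feature URI does not follow the assumed pattern,
    so that such cases can be set aside for manual inspection.
    """
    match = FEATURE_VALUE_PATTERN.match(feature_uri)
    if match is None:
        return None
    # Insert the extracted language code into the language URI pattern.
    return LANGUAGE_URI_TEMPLATE.format(code=match.group("code"))
```

Returning None for a non-matching URI, rather than guessing, reflects the point made above: when the link between datasets exists only as a naming convention, every URI that falls outside the convention needs a human to look at it.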
The CMD2RDF service was developed at the TLA/MPI for Psycholinguistics and DANS and later moved to the Meertens Institute, where the expertise is available. The CMDI2RDF service will be professionalised and extended in the Dutch CLARIAH project.

Contacts Links

Country

Netherlands

CLARIN centre

Meertens/HuC

Language

Resource tags

Format

XML
RDF