CLARIAH integrations of Dataverse with SEMAF Semantic framework and SpaCy DANS Machine Learning library

Activity: Talk or presentation (Academic)


DANS-KNAW has developed the "Archive in a box" distribution, which was presented at the Dataverse Community Meeting 2023 in Braga, Portugal. This distribution provides fully automatic deployment of a FAIR Dataverse data repository that is integrated with third-party networked services and connected to the external controlled vocabularies required to produce Linked Data from dataset metadata descriptions. It also includes data previewers and support for custom metadata schemes such as CESSDA CMM, CLARIN CMDI, and ODISSEI. Institutions worldwide stand to benefit from running a Common Data Infrastructure with community-based maintenance and development, since costs are expected to drop substantially as more organizations join the consortium. With a distributed setup, this shared infrastructure is sustainable and better suited to future needs. To make it more flexible and suitable for different communities, in the CLARIAH project we have developed the SEMAF semantic transformation framework and created a SpaCy Machine Learning library, which together support a semi-automatic workflow for generating FAIR metadata descriptions of datasets.
Period: 06 Jun 2023
Event title: Dataverse Community Meeting 2023: Sharing data for future generations
Event type: Conference
Location: Braga, Portugal
Degree of Recognition: International


  • dataverse
  • machine learning
  • data repository
  • artificial intelligence
  • data