While the number of digital resources available to humanities scholars is constantly growing, it often remains difficult to assess the content and quality of the data. In our presentation we will share insights from the Huygens Institute for the History of the Netherlands (Huygens ING), an institute that holds a large and heterogeneous collection of digital historical and literary sources. These insights concern the practice of sharing data within the context of the semantic web in ways that make the data easy to use and meaningful for users.
One of the issues our institute faces is that its collections were formed over a very long period of time, with various modes of selecting and editing. For example, we have a large number of digitized historical source editions in which passages of literal transcription alternate with calendars (summaries) written by the editors. Some of our structured datasets consist of ‘raw’ representations of archival data, others provide more heavily edited views of the original source, and a growing number consist of automatically enriched data, containing edits and adaptations of data observations from one of our other sets, and connections between observations. Our aim is to present these data as far as possible in their original context, and to do this in a way that enables scholars to examine all adaptations and edits and determine whether they are appropriate for their own research projects. The use of ontologies for linked open data collections, and of other data models for textual collections, makes it possible to convey information about the selection and editing criteria that were applied during the collection process, but this can easily become too complicated.
Second, we want data users not only to be able to easily assess the contents of a particular dataset, but also to find their way through our broad range of almost 200 digital resources. We will discuss our experiences with ‘data stories’, a form of data presentation that combines pieces of text with interactive data visualizations. We will show a data story we made with collections of digitized objects from heritage institutions, historical source collections and research data, all centered on the theme of maritime history. This data story gives quick insight into the scope of the collections and at the same time suggests ways in which the data could be analyzed. We argue that such data stories are a good way of providing a well-defined lens on the information in massive data collections.