Reflections on Encoding Languages in Historical Data: Working With the Multilingual Dimension of the Dutch East India Company Archives

Research output: Contribution to academic journal/periodical, Article, Academic, peer-reviewed

1 Citation (Scopus)

Abstract

This article investigates the challenges of encoding languages in historical data through the example of a reference dataset: a thesaurus in SKOS format of commodities traded by the Dutch East India Company (VOC).

The VOC archives, from which this thesaurus draws much of its data, are far from purely Dutch. The company’s multilingual workforce and interactions across Asia produced records influenced by a multitude of languages, full of loanwords and citations. This is further complicated by the VOC’s role in colonising regions and suppressing local languages, so that some languages may survive only in these ‘Dutch’ archives.

Working with a large corpus like the VOC archives therefore raises challenges around historical language evolution, vocabulary borrowing, extinct languages, technical standards that are not geared towards historical context, and political sensitivities around identity-bound language.

The article demonstrates how the GLOBALISE project navigates these issues by prioritising transparency, flexibility, and iterative refinement. It argues that, as long as researchers are aware of the challenges, language complexities are not a roadblock but an opportunity for further research and critical engagement with the past, encouraging broader discussion and creative solutions for encoding historical multilingualism and the development of language.
Original language: English
Pages (from-to): 1-10
Number of pages: 10
Journal: Journal of Open Humanities Data
Volume: 10
DOIs:
Status: Published - 12 Apr 2024
