Digital begriffsgeschichte: Tracing semantic change using word embeddings

Melvin Wevers, Marijn Koolen

Research output: Contribution to scientific journal/periodical, Article, Academic, peer-reviewed

Abstract

Recently, the use of word embedding models (WEMs) has received ample attention in the natural language processing community. These models can capture semantic information in large corpora of text by learning distributional properties of words, that is, how often particular words appear in specific contexts. Scholars have pointed out the potential of WEMs for historical research. In particular, their ability to capture semantic change might assist historians studying conceptual change or specific discursive formations over time. Concurrently, others have voiced criticism, pointing out that WEMs require large amounts of training data, that they are challenging to evaluate, and that they lack the specificity historians look for. The ability to examine semantic change resonates with the goals of historians such as Reinhart Koselleck, whose research focused on the formation of concepts and the transformation of semantic fields. However, word embeddings can only be used to study particular types of semantic change, and a model’s usefulness depends on the size, quality, and bias of its training data. In this article, we examine what is required of historical data to produce reliable WEMs, and we describe the types of questions that can be answered using WEMs.
Original language: English
Journal: Historical Methods
Status: Published - 13 May 2020
