Digital begriffsgeschichte: Tracing semantic change using word embeddings

Melvin Wevers, Marijn Koolen

Research output: Contribution to journal/periodical › Article › Scientific › peer-review

Abstract

Recently, the use of word embedding models (WEMs) has received ample attention in the natural language processing community. These models can capture semantic information in large corpora of text by learning distributional properties of words, that is, how often particular words appear in specific contexts. Scholars have pointed out the potential of WEMs for historical research. In particular, their ability to capture semantic change might assist historians studying conceptual change or specific discursive formations over time. At the same time, others have voiced criticism, pointing out that WEMs require large amounts of training data, that they are challenging to evaluate, and that they lack the specificity historians look for. The ability to examine semantic change resonates with the goals of historians such as Reinhart Koselleck, whose research focused on the formation of concepts and the transformation of semantic fields. However, word embeddings can only be used to study particular types of semantic change, and the model’s use depends on the size, quality, and bias of the training data. In this article, we examine what is required of historical data to produce reliable WEMs, and we describe the types of questions that can be answered using WEMs.
Original language: English
Journal: Historical Methods
Publication status: Published - 13 May 2020

Keywords

  • conceptual history
  • word embeddings
  • digital history
  • semantic change
