Introducing Functional Diversity: A Novel Approach to Lexical Diversity in (Historical) Corpora

F.B. Karsdorp, Enrique Manjavacas, Lauren Fonteyn

Research output: Chapter in book/volume; Contribution to conference proceedings; Scientific; peer-review

Abstract

The question of how we can reliably estimate the lexical diversity of a particular text (collection) has often been asked by linguists and literary scholars alike. This short paper introduces a way of operationalizing functional diversity measurements by means of token-based embeddings, and argues that functional diversity is not only a practically advantageous but also a theoretically relevant addition to the Computational Humanities Research toolkit. By means of an experiment on the historical ARCHER corpus, we show that lexical diversity at the level of functional groups is less sensitive to orthographic variation, and that it provides insight into an important and often disregarded dimension of vocabulary diversity in textual data.
Original language: English
Title of host publication: Proceedings of the Computational Humanities Research Conference 2022
Editors: Folgert Karsdorp, Alie Lassche, Kristoffer Nielbo
Publisher: CEUR Workshop Proceedings
Pages: 114-126
Volume: 3290
Publication status: Published - Nov 2022
