Measuring syntactical variation in Germanic texts

Research output: Contribution to journal/periodical › Article › Scientific › peer-review

Abstract

We present two new measures of syntactic distance between languages. First, we present the movement measure, which captures the average number of words that have moved in sentences of one language compared to the corresponding sentences in another language. Second, we introduce the indel measure, which captures the average number of words inserted or deleted in sentences of one language compared to the corresponding sentences in another language. The two measures were compared to the trigram measure introduced by Nerbonne & Wiersma (2006). We correlated the results of the three measures and found a low correlation between the results of the movement and indel measures, indicating that the two measures represent different kinds of linguistic variation. We found a high correlation between the results of the movement measure and the trigram measure. The results of all three measures suggest that English is syntactically a Scandinavian language. Because of our unique database design, we were able to detect asymmetric relationships between the languages. All three measures suggest that asymmetric syntactical distances could partly explain why native speakers of Dutch understand German texts more easily than native speakers of German understand Dutch texts (Swarte 2016).
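The abstract does not spell out how the two measures are computed, so the following is only a minimal sketch of one possible reading. It assumes that each sentence pair comes with a word alignment given as `(source_index, target_index)` pairs. The function names, the alignment format, and the use of a longest increasing subsequence to count moved words are all illustrative assumptions, not the authors' procedure.

```python
from bisect import bisect_left

def indel_count(n_src, n_tgt, alignment):
    """Words without an aligned counterpart in the other sentence (assumed definition)."""
    aligned_src = {i for i, _ in alignment}
    aligned_tgt = {j for _, j in alignment}
    return (n_src - len(aligned_src)) + (n_tgt - len(aligned_tgt))

def movement_count(alignment):
    """Smallest number of aligned words that must move to restore the source order.

    Assumption: this equals the number of aligned words minus the length of the
    longest increasing subsequence of target positions, read in source order.
    """
    tgt_order = [j for _, j in sorted(alignment)]
    tails = []  # tails[k] = smallest last value of an increasing run of length k + 1
    for j in tgt_order:
        pos = bisect_left(tails, j)
        if pos == len(tails):
            tails.append(j)
        else:
            tails[pos] = j
    return len(tgt_order) - len(tails)

def average(values):
    """Mean over all sentence pairs of a language pair."""
    return sum(values) / len(values)

# Hypothetical aligned sentence pairs (English -> Dutch), for illustration only.
# "I have seen the book" / "Ik heb het boek gezien"
pair_a = {"n_src": 5, "n_tgt": 5,
          "alignment": [(0, 0), (1, 1), (2, 4), (3, 2), (4, 3)]}
# "He does not come" / "Hij komt niet" ("does" has no counterpart)
pair_b = {"n_src": 4, "n_tgt": 3,
          "alignment": [(0, 0), (2, 2), (3, 1)]}

pairs = [pair_a, pair_b]
movement = average([movement_count(p["alignment"]) for p in pairs])
indel = average([indel_count(p["n_src"], p["n_tgt"], p["alignment"]) for p in pairs])
print(movement, indel)
```

In this toy data, only "seen"/"gezien" moves in the first pair and only "does" lacks a counterpart in the second, which illustrates how the two counts can vary independently of each other. The sketch is symmetric by construction and does not reproduce the asymmetric distances that the abstract attributes to the database design.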
Original language: English
Pages (from-to): 279-296
Journal: Digital Scholarship in the Humanities
Volume: 33
Issue number: 1
Publication status: Published - 2018

Keywords

  • syntax
  • dialectometry, dialectology, computational linguistics, variationist linguistics
