TY - JOUR
T1 - Natiolectal variation in Dutch morphosyntax
T2 - A large-scale, data-driven perspective
AU - De Troij, Robbert
AU - Grondelaers, Stef
AU - Speelman, Dirk
PY - 2023
Y1 - 2023
N2 - In this article, we report a large-scale corpus study aimed at tackling the (controversial) question to what extent the European national varieties of Dutch, that is, Belgian and Netherlandic Dutch, exhibit morphosyntactic differences. Instead of relying on a manual selection of cases of morphosyntactic variation, we first marshal large bilingual parallel corpora and machine translation software to identify semiautomatically, in an extensively data-driven fashion, loci of variation from various “corners” of Dutch grammar. We then gauge the distribution of constructional alternatives in a nationally as well as stylistically stratified corpus for a representative selection of twenty alternation patterns. We find that natiolectal variation in the grammar of Dutch is far more prevalent than often assumed, especially in less edited text types, and that it shows up in inflection phenomena, lexically conditioned syntactic variation, and pure word order permutations. Another key finding is that many cases of synchronic probabilistic asymmetries reflect a diachronic difference between the two varieties: Netherlandic Dutch often tends to be ahead in cases of ongoing grammatical change, with Belgian Dutch holding on somewhat longer to obsolescent features of the grammar.
AB - In this article, we report a large-scale corpus study aimed at tackling the (controversial) question to what extent the European national varieties of Dutch, that is, Belgian and Netherlandic Dutch, exhibit morphosyntactic differences. Instead of relying on a manual selection of cases of morphosyntactic variation, we first marshal large bilingual parallel corpora and machine translation software to identify semiautomatically, in an extensively data-driven fashion, loci of variation from various “corners” of Dutch grammar. We then gauge the distribution of constructional alternatives in a nationally as well as stylistically stratified corpus for a representative selection of twenty alternation patterns. We find that natiolectal variation in the grammar of Dutch is far more prevalent than often assumed, especially in less edited text types, and that it shows up in inflection phenomena, lexically conditioned syntactic variation, and pure word order permutations. Another key finding is that many cases of synchronic probabilistic asymmetries reflect a diachronic difference between the two varieties: Netherlandic Dutch often tends to be ahead in cases of ongoing grammatical change, with Belgian Dutch holding on somewhat longer to obsolescent features of the grammar.
KW - computational linguistics
KW - corpus linguistics
KW - Dutch
KW - grammatical variation
KW - natiolectal variation
KW - parallel corpus
U2 - 10.1017/S1470542722000071
DO - 10.1017/S1470542722000071
M3 - Article
SN - 1470-5427
VL - 35
SP - 1
EP - 68
JO - Journal of Germanic Linguistics
JF - Journal of Germanic Linguistics
IS - 1
ER -