Automatic Detection of Intra-Word Code-Switching

D. Nguyen, L.M.E.A. Cornips

Research output: Chapter in book/book part › Contribution to conference proceedings › Academic › Peer-reviewed

Abstract

Many people are multilingual and they may draw from multiple language varieties when writing their messages. This paper is a first step towards analyzing and detecting code-switching within words. We first segment words into smaller units. Then, words are identified that are composed of sequences of subunits associated with different languages. We demonstrate our method on Twitter data in which both Dutch and dialect varieties labeled as Limburgish, a minority language, are used.
Original language: English
Title of host publication: Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
Place of publication: Stroudsburg
Publisher: Association for Computational Linguistics (ACL)
Pages: 82-28
ISBN (print): 978-1-945626-08-1
Status: Published - 01 Aug 2016
