Finding Rising and Falling Words

E. Tjong Kim Sang

We examine two methods for finding rising words (including neologisms) and falling words (including archaisms) in decades of magazine texts (millions of words) and in years of tweets (billions of words): one based on correlation coefficients between relative frequencies and time, and one based on comparing the initial and final word frequencies of time intervals. We find that smoothing frequency scores improves the precision of both methods, and that the correlation coefficients perform better on magazine text but worse on tweets. Since the two ranking methods find different words, they can be used side by side to study the behavior of words over time.
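The two ranking ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the function names, the centered moving-average smoothing, the use of a Pearson coefficient against the time index, and the final-to-initial ratio with a small `epsilon` to avoid division by zero. It is not the paper's implementation.

```python
from statistics import mean


def relative_frequencies(counts, totals):
    """Word count per time slice divided by the total tokens in that slice."""
    return [c / t for c, t in zip(counts, totals)]


def smooth(values, window=3):
    """Centered moving average (assumed smoothing); the window shrinks at the edges."""
    half = window // 2
    return [
        mean(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def correlation_score(freqs):
    """Pearson correlation between the time index 0..n-1 and the frequencies.

    Values near +1 suggest a rising word, values near -1 a falling word.
    Returns 0.0 when the series is constant or has fewer than two points.
    """
    times = list(range(len(freqs)))
    if len(times) < 2:
        return 0.0
    mt, mf = mean(times), mean(freqs)
    cov = sum((t - mt) * (f - mf) for t, f in zip(times, freqs))
    var_t = sum((t - mt) ** 2 for t in times)
    var_f = sum((f - mf) ** 2 for f in freqs)
    if var_t == 0 or var_f == 0:
        return 0.0
    return cov / (var_t * var_f) ** 0.5


def endpoint_score(freqs, epsilon=1e-9):
    """Ratio of final to initial frequency (assumed form of the comparison).

    Values above 1 suggest a rising word, values below 1 a falling word.
    """
    return (freqs[-1] + epsilon) / (freqs[0] + epsilon)


# Hypothetical toy data: counts of one word in five time slices.
counts = [2, 3, 5, 8, 13]
totals = [1000, 1000, 1000, 1000, 1000]
freqs = smooth(relative_frequencies(counts, totals))
scores = (correlation_score(freqs), endpoint_score(freqs))
```

Ranking a vocabulary by each score separately gives two candidate lists, which can then be compared side by side.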
Original language: English
Number of pages: 8
Status: Published - 11 Dec. 2016
  • LT4DH 2016

    Erik Tjong Kim Sang (Speaker)

    11 Dec. 2016

    Activity: Talk or presentation, Academic

