Finding Rising and Falling Words

E. Tjong Kim Sang

Research output: Contribution to conference, Paper, Scientific, peer-review

Abstract

We examine two methods for finding rising words (including neologisms) and falling words (including archaisms) in decades of magazine text (millions of words) and in years of tweets (billions of words): one based on the correlation coefficient between relative word frequency and time, and one based on comparing the initial and final word frequencies of a time interval. We find that smoothing the frequency scores improves the precision of both methods, and that the correlation-based method performs better on magazine text but worse on tweets. Since the two ranking methods find different words, they can be used side by side to study the behavior of words over time.
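The two ranking ideas named in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the centered moving average used for smoothing, the Pearson coefficient as the correlation measure, the final-to-initial ratio (with a small `epsilon`) as the frequency comparison, and all function names and toy counts.

```python
from math import sqrt

def moving_average(values, window=3):
    """Centered moving average; the window is truncated at the edges (assumed smoothing)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def correlation_score(rel_freqs):
    """Method 1 (sketch): correlate relative frequency with the time index."""
    return pearson(list(range(len(rel_freqs))), rel_freqs)

def endpoint_score(rel_freqs, epsilon=1e-9):
    """Method 2 (sketch): ratio of final to initial relative frequency."""
    return (rel_freqs[-1] + epsilon) / (rel_freqs[0] + epsilon)

def rank_words(counts, totals, score, window=3):
    """Rank words from most rising to most falling under the given score."""
    scores = {}
    for word, series in counts.items():
        rel = [c / t for c, t in zip(series, totals)]  # relative frequency per period
        scores[word] = score(moving_average(rel, window))
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical counts per time period (not data from the paper).
counts = {
    "selfie": [0, 1, 4, 9, 20],
    "thou": [30, 18, 9, 3, 1],
    "the": [500, 510, 495, 505, 500],
}
totals = [10000] * 5  # total tokens per period

print(rank_words(counts, totals, correlation_score))
print(rank_words(counts, totals, endpoint_score))
```

In this toy setting both scores place the rising word first and the falling word last. On real corpora the two rankings would be expected to diverge, which is the abstract's argument for using them side by side.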
Original language: English
Pages: 2-9
Number of pages: 8
Publication status: Published - 11 Dec 2016

Keywords

  • neologisms
  • archaisms
  • DBNL
  • Twitter
  • LT4DH 2016

    Erik Tjong Kim Sang (Speaker)

    11 Dec 2016

    Activity: Talk or presentation, Academic
