Word2Vec Models Dutch Newspapers

  • Melvin Wevers (Creator)

Dataset

Description

Word embedding models trained on six national Dutch newspapers.

We use the Gensim implementation of Word2Vec to train four embedding models per newspaper, each representing one decade between 1950 and 1990. The models were trained using CBOW (continuous bag-of-words) with hierarchical softmax, a dimensionality of 300, a minimum word count of 5, a context window of 5, and a downsampling threshold of 10^-5.
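The settings described above might map onto Gensim roughly as follows. This is a minimal sketch, not the authors' training script: it assumes the Gensim 4.x parameter names (older releases used `size` instead of `vector_size`), it assumes negative sampling was switched off because only hierarchical softmax is mentioned, and it assumes the four decades run 1950-1959 through 1980-1989. The names `W2V_PARAMS`, `DECADES`, and `train_decade_model` are hypothetical.

```python
# Hyperparameters taken from the description; the keyword names assume Gensim 4.x.
W2V_PARAMS = {
    "vector_size": 300,  # dimensionality of the embeddings
    "window": 5,         # context window
    "min_count": 5,      # minimum word count
    "sg": 0,             # 0 selects CBOW rather than skip-gram
    "hs": 1,             # use hierarchical softmax
    "negative": 0,       # assumption: negative sampling disabled
    "sample": 1e-5,      # downsampling threshold for frequent words
}

# Assumed decade boundaries: four ten-year spans between 1950 and 1990.
DECADES = [(start, start + 9) for start in range(1950, 1990, 10)]


def train_decade_model(sentences, **overrides):
    """Train one Word2Vec model on an iterable of tokenized sentences.

    `sentences` is expected to be a restartable iterable of token lists
    for a single newspaper and a single decade.
    """
    # Imported lazily so the settings can be inspected without Gensim installed.
    from gensim.models import Word2Vec

    params = {**W2V_PARAMS, **overrides}
    return Word2Vec(sentences=sentences, **params)
```

With six newspapers and four decades, calling `train_decade_model` once per newspaper and decade would yield 24 models in total.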

These models accompany the article "Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990".
Date made available: 3 Jun 2019
Publisher: Zenodo

Dataset type

  • Processed data
