Description
Word embedding models trained on six national Dutch newspapers.
We use the Gensim implementation of Word2Vec to train four embedding models per newspaper, each representing one decade between 1950 and 1990. The models were trained using CBOW with hierarchical softmax, with a dimensionality of 300, a minimum word count of 5, a context window of 5, and a downsampling threshold of 10^-5.
These models belong to the article: Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990
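The training settings described above can be sketched in Gensim roughly as follows. This is a minimal illustration, not the authors' actual training script: the keyword names follow Gensim 4.x (older 3.x releases used `size` instead of `vector_size`), `negative=0` is an assumption made because hierarchical softmax is stated, and the function name `train_decade_model` and its `workers` argument are hypothetical.

```python
# Hyperparameters as stated in the description, using Gensim 4.x keyword names.
W2V_PARAMS = {
    "sg": 0,             # 0 selects CBOW (1 would select skip-gram)
    "hs": 1,             # use hierarchical softmax
    "negative": 0,       # assumption: negative sampling disabled because hs is used
    "vector_size": 300,  # embedding dimensionality
    "min_count": 5,      # ignore words that occur fewer than 5 times
    "window": 5,         # context window size
    "sample": 1e-5,      # downsampling threshold for frequent words
}


def train_decade_model(sentences, workers=4):
    """Train one Word2Vec model on an iterable of tokenized sentences.

    `sentences` would hold the articles of one newspaper for one decade,
    each sentence given as a list of string tokens (hypothetical input format).
    """
    from gensim.models import Word2Vec  # imported lazily; requires gensim

    return Word2Vec(sentences=sentences, workers=workers, **W2V_PARAMS)
```

Calling `train_decade_model` once per newspaper and decade would yield the four models per newspaper mentioned in the description; how the corpus was tokenized and split is not specified here.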
| Date made available | 03 Jun 2019 |
|---|---|
| Publisher | Zenodo |
Keywords
- Word embeddings
- newspapers
Dataset type
- Processed data