TY - JOUR
T1 - Named entity recognition and resolution for literary studies
AU - van Dalen-Oskam, K.H.
AU - de Does, Jesse
AU - Marx, Maarten
AU - Sijaranamual, Isaac
AU - Depuydt, Katrien
AU - Verheij, Boukje
AU - Geirnaert, Valentijn
PY - 2014/12/20
Y1 - 2014/12/20
N2 - This paper reports on the project Namescape: Mapping the Landscape of Names in Modern Dutch Literature, funded by CLARIN-NL. The background of the project is research in literary onomastics, the study of the usage and functions of proper names in literary (i.e. fictional) texts. The two main tasks for the project were to adapt existing Named Entity Recognition software to modern Dutch fiction, and to perform Named Entity Resolution by linking the names to Wikipedia entries. For Named Entity Recognition, existing tools have been trained on literary texts and a new NE tagger has been developed. The standard list of name categories had to be extended, since the analysis of the usage of proper names in literature needs to distinguish e.g. between first names and family names. The Named Entity Resolution task was done to explore the possibility of labeling the names in fiction in another way, by categorizing a name as referring to a person or location that only exist in the story of a fictional work (plot-internal names), or one referring to a person or location in the real world (plot-external names). This distinction is linked to the hypothesis that plot-internal and plot-external names can have different (stylistic and narrative) functions. Automatically marking them up is the first step toward testing that hypothesis on a large corpus. In this paper we describe the results of these two main tasks.
AB - This paper reports on the project Namescape: Mapping the Landscape of Names in Modern Dutch Literature, funded by CLARIN-NL. The background of the project is research in literary onomastics, the study of the usage and functions of proper names in literary (i.e. fictional) texts. The two main tasks for the project were to adapt existing Named Entity Recognition software to modern Dutch fiction, and to perform Named Entity Resolution by linking the names to Wikipedia entries. For Named Entity Recognition, existing tools have been trained on literary texts and a new NE tagger has been developed. The standard list of name categories had to be extended, since the analysis of the usage of proper names in literature needs to distinguish e.g. between first names and family names. The Named Entity Resolution task was done to explore the possibility of labeling the names in fiction in another way, by categorizing a name as referring to a person or location that only exist in the story of a fictional work (plot-internal names), or one referring to a person or location in the real world (plot-external names). This distinction is linked to the hypothesis that plot-internal and plot-external names can have different (stylistic and narrative) functions. Automatically marking them up is the first step toward testing that hypothesis on a large corpus. In this paper we describe the results of these two main tasks.
KW - literary onomastics
KW - named entity recognition
KW - named entity resolution
M3 - Article
SN - 2211-4009
VL - 4
SP - 121
EP - 136
JO - Computational Linguistics in the Netherlands Journal
JF - Computational Linguistics in the Netherlands Journal
ER -