Computational Literary Studies Infrastructure (CLSINFRA): a H2020 Research Infrastructure Project that aids to connect researchers, data, and methods

Julie M. Birkholz, Ingo Börner, Sally Chambers, Silvie Cinkova, K.H. van Dalen-Oskam, Tess Dejaeghere, Julia Dudar, Maciej Eder, Jennifer Edmond, Vicky Garnett, Michal Kren, Michal Mrugalski, Ciara Lynn Murphy, Carolin Odebrecht, Eliza Papaki, Marco Raciti, Lisanne van Rossum, Christof Schöch, Artjoms Šeļa, Sharma Srishti, Justin Tonra, Erzsébet Tóth-Czifra, Peer Trilcke

Research output: Contribution to conference > Poster > Scientific


The aim of this poster is to provide an overview of the principal objectives of the newly started H2020
Computational Literary Studies (CLS) project. CLS is an infrastructure project that
works to develop and bring together high-quality data, tools and knowledge to aid new
approaches to studying literature in the digital age. Conducting computational literary studies presents a
number of challenges and opportunities, from multilingualism to bringing together distributed
information. At present, the landscape of literary data is diverse and fragmented. Even though many
resources are currently available in digital libraries, archives, repositories, websites or catalogues, a lack
of standardisation affects how they are constructed and accessed, and limits the extent to which they are reusable
(Ciotti 2014). The CLS project aims to federate these resources with the tools needed to interrogate them
and with a widened base of users, in the spirit of the FAIR and CARE principles (Wilkinson et al. 2016).
The resulting improvements will benefit researchers by bridging gaps between greater- and lesser-
resourced communities in computational literary studies and beyond, ultimately offering opportunities
to create new research and insight into our shared and varied European cultural heritage.
Rather than building entirely new resources for literary studies, the project is committed to exploiting
and connecting the already-existing efforts and initiatives, in order to acknowledge and utilize the
immense human labour that has already been undertaken. Therefore, the project builds on recently-
compiled high-quality literary corpora, such as DraCor and ELTeC (Fischer et al. 2019, Burnard et al. 2021,
Schöch et al. in press), integrates existing tools for text analysis, e.g. TXM, stylo, multilingual NLP
pipelines (Heiden 2010, Eder et al. 2016), and takes advantage of deep integration with two other
infrastructural projects, namely the CLARIN and DARIAH ERICs. Consequently, the project aims at
building a coherent ecosystem to foster the technical and intellectual findability and accessibility of
relevant data. The ecosystem consists of (1) resources, i.e. text collections for drama, poetry and prose in
several languages, (2) tools, (3) methodological and theoretical considerations, (4) a network of CLS
scholars based at different European institutions, (5) a system of short-term research stays for both early
career researchers and seasoned scholars, (6) a repository for training materials, as well as (7) an
efficient dissemination strategy. This is achieved through a collaboration between participating
institutions: Institute of Polish Language at the Polish Academy of Sciences, Poland; University of
Potsdam, Germany; Austrian Academy of Sciences, Austria; National University of Distance Education,
Spain; École Normale Supérieure de Lyon, France; Humboldt University of Berlin, Germany; Charles
University, Czech Republic; Digital Research Infrastructure for the Arts and Humanities, France; Ghent
Centre for Digital Humanities, Ghent University, Belgium; Belgrade Centre for Digital Humanities, Serbia;
Huygens Institute for the History of the Netherlands (Royal Netherlands Academy of Arts and Sciences),
Netherlands; Trier Center for Digital Humanities, Trier University, Germany; Moore Institute, National
University of Ireland Galway, Ireland.
Original language: English
Publication status: Published - 23 May 2022
Event: DH Benelux 2022: RE-MIX. Creation and alteration in DH (Hybrid)
- University of Luxembourg, Belval, Luxembourg
Duration: 01 Jun 2022 - 03 Jun 2022




  • EU funded
  • Digital Humanities
  • Computational Literary Studies

