Modelling the Messy Complexity of Texts

Activity: Talk or presentation (Academic)


This contribution presents ongoing research into the TAG data model and its modelling potential with regard to textual variation within a single document (i.e., intradocumentary variation). Unlike the mono-hierarchical structure of XML, which compels the encoder to choose either a documentary or a textual structure, TAG’s graph model allows multiple structures to coexist. It also facilitates the expression of equally complicated but less well-known features of historical documents, such as nonlinear or discontinuous text. Intradocumentary variation is a form of nonlinearity: deletions and additions “interrupt” the linear flow of the text. In the TEI/XML model, this phenomenon can only be expressed, queried and analysed with the use of intricate workarounds. The TAG data model, however, understands nonlinearity and is thus arguably closer to the nature of text. Consequently, it enables new forms of textual analysis. Using examples from a case study, we illustrate how different categories of variation can be expressed and queried idiomatically in TAG. The demonstration is embedded in a theoretical reflection on the different approaches to modelling variance, illustrating how the choices we make influence the ways in which variance is structured, processed and, especially, analysed. In short, we ask how we can capture the messy complexity of the writing process, expand our textual awareness and enlarge our editorial knowledge.
Period: 06 Jun 2019
Degree of Recognition: International