TEI Standoff Markup - A work in progress

E. Spadini, Magdalena Turska, Misha Broughton

Research output: Contribution to conference › Poster › Academic


“Markup is said to be standoff, or external, when the markup data is placed outside of the text it is meant to tag” (<tei-c.org>).

One of the most widely recognized limitations of inline XML markup is its inability to cope with element overlap; standoff has been considered as a possible solution to this problem.

However, as P. Bański (2010) points out, overlap and discontinuity are inherent in our theoretical constructs of texts; we therefore need to embrace them rather than outwit them.

On a theoretical level, inline markup embeds disparate interpretations of the text into an already interpretative transcription; standoff markup clearly separates these layers of interpretation.

On a very practical level, standoff reduces distractions while encoding. It also facilitates further enrichment of existing digital texts thanks to a modular, decentralized method of introducing and storing the markup.

While several attempts have already been made to overcome the problem of overlap, either by introducing standoff markup in XML (such as JITM) or by using other, non-XML markup schemes (such as LMNL), our intention is to concentrate on a TEI-based solution. Though there are, within the wider TEI ecosystem, isolated projects applying standoff markup for particular needs (e.g. the Shakespeare Folger Digital Texts), attempts to model standoff markup more generally (e.g. Pose-Lopez-Romary 2014) have lacked appropriate toolkits and broader user adoption.

We consider the preliminary notation we propose here a point of departure for further developments, and are designing it alongside packages for online publishing, querying and visualization.

Our light notation moves the bulk of the markup into a separate <standoff> element. It groups “layers” of related textual features, encoded via existing TEI elements (e.g. <name> or <corr>), into individual <stf> elements, and proposes a schema for referencing the transcription by means of xml:id values. Our transformation package aims to work directly on the standoff markup, without having to reduce it back to inline TEI for parsing or querying.
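
As a rough illustration of working on standoff layers directly, the sketch below resolves each annotation to the transcription it points at, without rewriting anything inline. Only the <standoff>, <stf>, <name> and <corr> elements and the use of xml:id come from the description above. The <w> tokens, the `target` pointer attribute, the `type` attribute on <stf> and the sample text are assumptions made for this example and may differ from the actual notation.

```python
import xml.etree.ElementTree as ET

# Clark-notation name under which ElementTree exposes the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: <w> tokens, @target and @type are assumptions,
# not the notation proposed by the authors.
DOC = """
<TEI>
  <text>
    <p>
      <w xml:id="w1">Kyng</w>
      <w xml:id="w2">Henry</w>
      <w xml:id="w3">the</w>
      <w xml:id="w4">fift</w>
    </p>
  </text>
  <standoff>
    <stf type="names">
      <name target="#w1 #w2 #w3 #w4"/>
    </stf>
    <stf type="corrections">
      <corr target="#w4">fifth</corr>
    </stf>
  </standoff>
</TEI>
"""


def resolve_layers(xml_text):
    """Map each <stf> layer to (element name, targeted text, annotation content)."""
    root = ET.fromstring(xml_text)
    # Index every element of the document that carries an xml:id.
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    layers = {}
    for stf in root.iter("stf"):
        annotations = []
        for ann in stf:
            # Pointers are written "#id"; strip the hash to get the bare id.
            ids = [ptr.lstrip("#") for ptr in ann.get("target", "").split()]
            targeted = " ".join(by_id[i].text for i in ids)
            annotations.append((ann.tag, targeted, ann.text))
        layers[stf.get("type")] = annotations
    return layers


layers = resolve_layers(DOC)
```

Because the transcription is never modified, further layers could be added or removed independently, and two layers may target overlapping token ranges without any well-formedness conflict.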

Original language: English
Status: Published - 2015

