Parsing a markup language that supports overlap and discontinuity

Ronald Haentjens Dekker, Bram Buitendijk, Elli Bleeker

Research output: Chapter in book/volume, Conference proceedings contribution, Academic, peer-reviewed

2 Citations (Scopus)

Abstract

Text As Graph Markup Language (TAGML) is a recently developed markup language that offers core support for overlapping and discontinuous markup. Designing and implementing a markup language technology stack that supports overlap poses numerous challenges; the most prominent being that the markup language cannot be expressed in a recursive context-free (CF) grammar. In this short paper we discuss our experiments with parsing TAGML based on a context-sensitive grammar. Our current approach implements an attribute grammar, which consists of a CF grammar with semantic actions. We discuss the advantages and disadvantages of our approach, and sketch several alternative methods.
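The idea of pairing a simple grammar with semantic actions can be illustrated with a minimal sketch. It is not the authors' parser, and the token syntax is a simplified, TAGML-like assumption: `[name>` opens markup and `<name]` closes it. The function name `parse` and the flat token pattern are hypothetical. The pattern recognises tokens without regard to context, while the semantic actions track which markup is open, so that closing tags may arrive in any order (overlap) but must still match an earlier opening tag.

```python
import re

# Assumed, simplified TAGML-like tokens (not the full TAGML syntax):
#   [name>  opens markup, <name]  closes markup, anything else is text.
TOKEN = re.compile(r"\[(?P<open>\w+)>|<(?P<close>\w+)\]|(?P<text>[^\[<]+)")


def parse(source):
    """Return a list of (text, open_markup_names) pairs, one per text run."""
    open_markup = []  # an ordered list, not a stack: closing order is free
    runs = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = match.end()
        if match.group("open"):
            # Semantic action: remember the newly opened markup.
            open_markup.append(match.group("open"))
        elif match.group("close"):
            name = match.group("close")
            # Semantic action: the context-sensitive check that a
            # closing tag refers to markup that is currently open.
            if name not in open_markup:
                raise ValueError(f"closing <{name}] without matching open tag")
            open_markup.remove(name)
        else:
            # Text run: annotate it with all markup open at this point.
            runs.append((match.group("text"), tuple(open_markup)))
    if open_markup:
        raise ValueError(f"unclosed markup: {open_markup}")
    return runs
```

Under these assumptions, `parse("[a>one [b>two<a] three<b]")` accepts the overlapping ranges `a` and `b` and annotates the middle text run `two` with both names, whereas `parse("[a>x<b]")` raises a `ValueError`. A stack-based check, as used for strictly nested markup, would reject the first input.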

Original language: English
Title: Proceedings of the ACM Symposium on Document Engineering, DocEng 2020
Publisher: Association for Computing Machinery, Inc
ISBN (electronic): 9781450380003
DOIs:
Status: Published - 29 Sep 2020

Publication series

Name: Proceedings of the ACM Symposium on Document Engineering, DocEng 2020
