The Memento Tracer Framework: Balancing Quality and Scalability for Web Archiving

Martin Klein, Harihar Shankar, Lyudmila Balakireva, Herbert Van de Sompel

Research output: Chapter in book/book part, Chapter, Academic, peer-reviewed

Abstract

Web archiving frameworks are commonly assessed by the quality of their archival records and by their ability to operate at scale. The ubiquity of dynamic web content poses a significant challenge for crawler-based solutions, such as the Internet Archive, that are optimized for scale. Human-driven services such as the Webrecorder tool provide high-quality archival captures but are not optimized to operate at scale. We introduce the Memento Tracer framework, which aims to balance archival quality and scalability. We outline its concept and architecture and evaluate its archival quality and operation at scale. Our findings indicate that its quality is on par with or better than that of established archiving frameworks and that operation at scale comes with a manageable overhead.
Original language: English
Title: Digital Libraries for Open Knowledge
Subtitle: 23rd International Conference on Theory and Practice of Digital Libraries
Place of publication: Oslo
Pages: 163-176
Number of pages: 14
Volume: 1909.04404
Edition: v1
DOI: https://doi.org/10.1007/978-3-030-30760-8_15
Status: Published - 10 Sep 2019

Publication series

Name: arXiv
Print ISSN: 2331-8422

  • Cite this

    Klein, M., Shankar, H., Balakireva, L., & Van de Sompel, H. (2019). The Memento Tracer Framework: Balancing Quality and Scalability for Web Archiving. In Digital Libraries for Open Knowledge: 23rd International Conference on Theory and Practice of Digital Libraries (v1 ed., Vol. 1909.04404, pp. 163-176). (arXiv). https://doi.org/10.1007/978-3-030-30760-8_15