Abstract
We present novel software that processes scans of historical documents to extract their layout information. We do this using a ResNet backbone with a feature pyramid head. We extract region information directly into PageXML. For baseline extraction, we use a two-stage processing approach. The software has been applied successfully to several projects. The results show the feasibility of automatically labeling text lines and regions in historical documents.
Original language | English |
---|---|
Pages | 67-72 |
Number of pages | 6 |
Status | Published - 25 Aug 2023 |
Event | HIP '23: Proceedings of the 7th International Workshop on Historical Document Imaging and Processing - San José, California, United States. Duration: 25 Aug 2023 → 26 Aug 2023. https://dl.acm.org/doi/proceedings/10.1145/3604951 |
Conference
Conference | HIP '23 |
---|---|
Short title | HIP '23 |
Country/Region | United States |
City | San José |
Period | 25/08/2023 → 26/08/2023 |
Internet address |