A multiple-label guided clustering algorithm for historical document dating and localization

J.W.J. Burgers, Sheng He, Petros Samara, L.R.B. Schomaker

Research output: Contribution to journal/periodical, Article, Academic, peer-reviewed

Abstract

Knowing the date and place of origin of the documents they study is essential for historians. It would be a major advance for historical scholarship if the geographical and temporal provenance of a handwritten document could be estimated automatically from its handwriting style. We propose a multiple-label guided clustering algorithm to discover the correlations between concrete low-level visual elements in historical documents and abstract labels such as date and location. First, a novel descriptor, the Histogram of Orientations of Handwritten Strokes (HOHS or H2OS), is proposed to extract and describe the visual elements; it is built on a scale-invariant polar-feature space. Second, the Multi-Label Self-Organizing Map (MLSOM) is proposed to discover the correlations between the low-level visual elements and their labels in a single framework. The MLSOM can be used to predict the labels directly. It can also serve as a pre-structured clustering method for building a codebook that contains more discriminative information on date and geography. Experimental results on the Medieval Paleographic Scale (MPS) data set demonstrate that our method achieves state-of-the-art results.
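
The idea of attaching several labels to the units of a self-organizing map can be illustrated with a small sketch. This is not the MLSOM described in the abstract, whose details are not given here. It is a generic one-dimensional Kohonen map followed by per-unit label voting, written only to show how a clustering structure can carry date-like and place-like labels side by side. The function names (`train_label_som`, `predict`), the parameters, and the synthetic two-dimensional feature vectors standing in for HOHS descriptors are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def nearest_units(units, x):
    """Return unit indices sorted by squared Euclidean distance to x."""
    def dist(j):
        return sum((a - b) ** 2 for a, b in zip(units[j], x))
    return sorted(range(len(units)), key=dist)


def train_label_som(samples, labels, n_units=4, epochs=30, seed=0):
    """Train a plain 1-D SOM, then count the labels that land on each unit.

    Illustrative only: labels do not steer the training here, they are
    attached to the units afterwards by voting.
    """
    rng = random.Random(seed)
    # Initialise the units from randomly chosen training samples.
    units = [list(rng.choice(samples)) for _ in range(n_units)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius decay linearly over epochs.
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac
        radius = max(0.5 * n_units * frac, 0.5)
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            x = samples[i]
            bmu = nearest_units(units, x)[0]  # best-matching unit
            for j, w in enumerate(units):
                # Gaussian neighbourhood on the 1-D grid of unit indices.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    # One vote counter per unit and per label kind (e.g. date, place).
    n_kinds = len(labels[0])
    votes = [[Counter() for _ in range(n_kinds)] for _ in range(n_units)]
    for x, y in zip(samples, labels):
        bmu = nearest_units(units, x)[0]
        for k, value in enumerate(y):
            votes[bmu][k][value] += 1
    return units, votes


def predict(units, votes, x):
    """Predict every label kind from the nearest unit that has votes."""
    for j in nearest_units(units, x):
        if votes[j][0]:  # skip units that attracted no training sample
            return tuple(c.most_common(1)[0][0] for c in votes[j])
    raise ValueError("no unit has any label votes")


# Synthetic toy data: two well-separated clusters with made-up labels.
data_rng = random.Random(1)
samples, labels = [], []
for centre, label in (((0.0, 0.0), ("early", "north")),
                      ((10.0, 10.0), ("late", "south"))):
    for _ in range(20):
        samples.append((centre[0] + data_rng.uniform(-0.5, 0.5),
                        centre[1] + data_rng.uniform(-0.5, 0.5)))
        labels.append(label)

units, votes = train_label_som(samples, labels)
print(predict(units, votes, (0.2, 0.1)))
print(predict(units, votes, (9.8, 10.1)))
```

In this sketch the per-unit vote counters play the role of a labelled codebook: each unit stores one histogram per label kind, so a single nearest-unit lookup yields both a date-like and a place-like prediction.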
Original language: English
Pages (from-to): 5252
Number of pages: 5256
Journal: IEEE Transactions on Image Processing
Volume: 25
Issue number: 11
Status: Published - 2016
