From Flat Lists to Taxonomies: Bottom-up Concept Scheme Generation in Linked Statistical Data Bottom-up Construction of Concept Schemes

Albert Meroño-Peñuela, Ashkan Ashkpour, Christophe Guéret

Research output: Contribution to conference (Paper, Scientific, peer-review)

Abstract

RDF Data Cube allows the modeling and publishing of Linked Statistical Data (LSD) in the Semantic Web. Often, the variable values of such statistical data come in a non-standardized form and are represented by literals that are too narrow, too concrete, or wrongly typed. Generally, adequate and standard concept schemes for such variables (especially in very specific domains like historical religious denominations, or building types in the pre-industrial era) do not exist and need to be created. This is a manual task that requires considerable expert knowledge and time. We present a workflow that combines hierarchical clustering and semantic tagging to automatically build concept schemes in a data-driven, bottom-up way, leveraging lexical and semantic properties of the non-standard dimension values. We apply our workflow in two different use cases and discuss its usefulness, limitations, and possible improvements.
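
The bottom-up idea in the abstract can be illustrated with a minimal sketch of the lexical half of such a workflow: agglomerative (hierarchical) clustering of raw dimension values into a tree whose internal nodes are candidate broader concepts. The abstract does not specify the distance measure or linkage criterion, so the choices here (a `difflib` string-similarity ratio and average linkage) are assumptions for illustration only, as are all function names. The semantic tagging step is not covered.

```python
from difflib import SequenceMatcher
from itertools import combinations


def distance(a: str, b: str) -> float:
    """Lexical distance in [0, 1]; 0 means identical (case-insensitive).

    Assumption: difflib's similarity ratio stands in for whatever
    lexical measure the actual workflow uses.
    """
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def leaves(node):
    """Return the flat list of literals under a tree node."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def linkage(x, y) -> float:
    """Average-linkage distance between two clusters (an assumed choice)."""
    pairs = [(a, b) for a in leaves(x) for b in leaves(y)]
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def cluster(values):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Returns a binary tree of nested 2-tuples whose leaves are the
    distinct input literals (a single literal is returned as-is).
    """
    nodes = sorted(set(values))
    if not nodes:
        raise ValueError("at least one value is required")
    while len(nodes) > 1:
        # Pick the pair of current clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda p: linkage(nodes[p[0]], nodes[p[1]]),
        )
        merged = (nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def candidate_groups(node):
    """List the sorted leaf sets of all internal nodes.

    Each group is a candidate broader concept that an expert (or a
    tagging step) would still need to label.
    """
    if isinstance(node, str):
        return []
    left, right = node
    return candidate_groups(left) + candidate_groups(right) + [sorted(leaves(node))]
```

Calling `candidate_groups(cluster(values))` on a flat list of raw literals, such as spelling variants of religious denominations, yields nested groups from the most specific to the most general. Turning these groups into an actual concept scheme (for example, with SKOS broader/narrower relations) is left out of this sketch.
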
Original language: English
Publication status: Published - 2014

Keywords

  • clustering
  • linked statistical data
  • standardization
  • taxonomies
