RDF Data Cube allows the modeling and publishing of Linked Statistical Data (LSD) in the Semantic Web. Often, variable values of such statistical data come in non-standardized form and are represented by literals that are too narrow, too concrete, or wrongly typed. Generally, adequate standard concept schemes for such variables (especially in very specific domains such as historical religious denominations or building types in the pre-industrial era) do not exist and need to be created. This is a manual task that requires considerable expert knowledge and time. We present a workflow that combines hierarchical clustering and semantic tagging to automatically build concept schemes in a data-driven, bottom-up way, leveraging lexical and semantic properties of the non-standard dimension values. We apply our workflow in two different use cases and discuss its usefulness, limitations, and possible improvements.
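As a rough illustration of the bottom-up idea described above, the following minimal sketch groups non-standard literals by lexical similarity and turns each group into a SKOS-like concept. It is not the workflow from the paper: the similarity measure (Python's `difflib.SequenceMatcher`), the average-linkage merge rule, the `threshold` value, and the function names `cluster_literals` and `to_concepts` are all assumptions made for illustration, and the semantic tagging step is left out entirely.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive lexical similarity in [0, 1] between two literals."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_literals(values, threshold=0.6):
    """Agglomerative (average-linkage) clustering of string literals.

    Repeatedly merges the two most similar clusters until no pair has an
    average pairwise similarity of at least `threshold` (an assumed cutoff).
    """
    clusters = [[v] for v in sorted(set(values))]

    def linkage(c1, c2):
        # Average similarity over all cross-cluster pairs.
        return sum(similarity(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def to_concepts(clusters):
    """Map each cluster to a SKOS-like concept: a preferred label plus alternates."""
    concepts = []
    for members in clusters:
        # Assumption: the member most similar to the whole cluster is preferred.
        pref = max(members, key=lambda m: sum(similarity(m, o) for o in members))
        concepts.append({
            "prefLabel": pref,
            "altLabels": sorted(m for m in members if m != pref),
        })
    return concepts


if __name__ == "__main__":
    # Hypothetical dimension values, not taken from the paper's datasets.
    raw = ["Roman Catholic", "roman catholic", "Lutheran"]
    for concept in to_concepts(cluster_literals(raw)):
        print(concept)
```

In practice the threshold would need tuning per dataset, and a purely lexical measure cannot group synonyms that share no characters, which is where a semantic step such as the tagging mentioned above would come in.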
Publication status: Published - 2014
- linked statistical data