Abstract
Statistical data is increasingly made available in the form of Linked Data on the Web. As more and more statistical datasets become available, a fundamental question about statistical data comparability arises: To what extent can arbitrary statistical datasets be faithfully compared? Besides purely statistical comparability, we are interested in the role that semantics plays in the data to be compared. Our hypothesis is that semantic relationships between different components of statistical datasets might be related to their statistical correlation. Our research focuses on studying whether these statistical and semantic relationships influence each other, by comparing the correlation of statistical data with their semantic similarity. The ongoing research problem is, hence, to investigate why machines have difficulty revealing meaningful correlations or establishing non-coincidental connections between variables in statistical datasets. We describe a fully reproducible pipeline to compare statistical correlation with semantic similarity in arbitrary Linked Statistical Data. We present a use case using World Bank data expressed as RDF Data Cube, and we highlight whether dataset titles can help predict strong correlations.
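As a rough illustration of the comparison the abstract describes, the following minimal sketch pairs up datasets and reports, for each pair, a statistical correlation and a title-based similarity score. It is not the authors' pipeline: the abstract does not name the correlation or similarity measures used, so Pearson correlation and a Jaccard overlap of title words are stand-ins chosen here, and the dataset titles, values, and function names (`pearson`, `title_similarity`, `compare_all`) are hypothetical.

```python
import math
import re


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def title_similarity(title_a, title_b):
    """Jaccard overlap of lowercase word tokens.

    Assumption: a simple stand-in for a semantic similarity measure,
    not the measure used in the paper.
    """
    tokens_a = set(re.findall(r"[a-z0-9]+", title_a.lower()))
    tokens_b = set(re.findall(r"[a-z0-9]+", title_b.lower()))
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def compare_all(datasets):
    """Return (title_a, title_b, correlation, similarity) for every dataset pair."""
    titles = sorted(datasets)
    rows = []
    for i, a in enumerate(titles):
        for b in titles[i + 1:]:
            rows.append((a, b, pearson(datasets[a], datasets[b]), title_similarity(a, b)))
    return rows


# Hypothetical toy data: title -> yearly observations (not World Bank values).
toy_datasets = {
    "Population, total": [1.0, 2.0, 3.0, 4.0],
    "Population, female": [2.0, 4.0, 6.0, 8.0],
    "Forest area": [4.0, 3.0, 2.0, 1.0],
}

if __name__ == "__main__":
    for a, b, corr, sim in compare_all(toy_datasets):
        print(f"{a!r} vs {b!r}: correlation={corr:+.2f}, title similarity={sim:.2f}")
```

With real Linked Statistical Data, the observations would first have to be extracted from the RDF Data Cube and aligned on shared dimensions (for example country and year) before any correlation is computed; that step is omitted here.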
Original language | English |
---|---|
Publication status | Published - 2014 |
Event | 2nd International Workshop on Semantic Statistics (SemStats 2014) - International Semantic Web Conference, ISWC 2014, Riva del Garda, Italy; Duration: 19 Oct 2014 → … |
Conference
Conference | 2nd International Workshop on Semantic Statistics (SemStats 2014) |
---|---|
Country/Territory | Italy |
City | Riva del Garda |
Period | 19/10/2014 → … |
Keywords
- correlation
- linked data
- semantic similarity
- statistical database
- statistics
Projects
- 1 Finished
- Census data open linked – CEDA_R From fragment to fabric – Dutch census data in a web of global cultural and historic information
Scharnhorst, A., Mandemakers, K., van Harmelen, F., Doorn, P., Guéret, C., Ashkpour, A., Meroño-Peñuela, A. & Schlobach, S.
01/10/2011 → 31/03/2016
Project: Research
Activities
- 1 Talk or presentation
- Semantic Similarity and Correlation of Linked Statistical Data Analysis
Albert Meroño-Peñuela (Speaker)
2014
Activity: Talk or presentation › Academic