Data incognita: How do data become hidden?

Research output: Chapter in book/volume › Chapter › Scientific › peer-review


Abstract

In the last chapter, we saw how general search engines such as Google return considerably more background noise than usable web pages for any specific search query, and how the long tail of the search algorithms hides research resources. This phenomenon is not limited to cultural material, but there the existence of a class of professionals trained and tasked with ensuring access to, and preservation of, these materials enables us to see it clearly. We therefore now turn to another such manifestation of high (but potentially misplaced) societal trust in big data, that is, the way in which analogue disconnected digital collections can sometimes become detached from their function as input into knowledge and identity formation processes. Cultural heritage institutions (CHIs) are moving to meet the new challenges of the digital world, but there are yet more ways in which cultural heritage resources can be concealed from the researcher. To make their holdings digitally accessible and machine-readable, CHIs must remove the aspects of hiddenness that reduce discoverability and accessibility for their users.

We use the term ‘hidden’ here not to imply active choices but to speak of the result: that data and cultural heritage resources are not visible to researchers who might otherwise use them. In asking why data are not used, we are concerned with all factors that may lead to data becoming ‘hidden’ from the historical record. Such hiddenness will necessarily take many forms on a spectrum, from inconsistent cataloguing practices or a loss of institutional expertise, to the obvious forms of concealment when data have a privacy dimension, to data being obfuscated or ‘buried’ in a way that diminishes researchers’ chances of discovery. Cultural heritage practitioners are fully aware of many of these issues. These forms of ‘hiddenness’ exacerbate the discoverability challenges faced by researchers, particularly when search engines are the predominant means of discovery.

In this chapter, we will discuss the challenges caused by hidden data, as well as how data and research infrastructures can aid data discovery and reuse for researchers beyond the de facto Google method. These lessons are of wider use in addressing challenges of biases and misinformation.
Original language: English
Title of host publication: The Trouble With Big Data
Subtitle of host publication: How Datafication Displaces Cultural Practices
Place of Publication: London, Great Britain
Publisher: Bloomsbury Academic
Chapter: 5
Pages: 89–104
ISBN (Electronic): 978-1-3502-3965-4, 978-1-3502-3963-0
ISBN (Print): 978-1-3502-3962-3
Publication status: Published - 27 Jan 2022

Publication series

Name: Bloomsbury Studies in Digital Cultures
Publisher: Bloomsbury Publishing

Keywords

  • Big Data
  • humanities
  • cultural heritage
  • cultural heritage institutions
  • history of science
  • GLAM
  • New Media and Technology
  • Sociology of Science and Technology
  • sensemaking
  • data and language
  • data and power
  • data and invisibility
  • datafication
  • digital obscurity
  • hidden by privacy
  • discoverability
  • archives
  • archival practice

Projects

  • Knowledge Complexity

    Priddy, M. & Horsley, N.

    02/01/2017 to 07/05/2018

    Project: Research
