Connected Research: The Potential of the PID Graph

Helena Cousijn, Ricarda Braukmann, Martin Fenner, Christine Ferguson, René van Horik, Rachael Lammey, Alice Meadows, Simon Lambert

Perspective

Helena Cousijn,1,* Ricarda Braukmann,2 Martin Fenner,1 Christine Ferguson,3 René van Horik,2 Rachael Lammey,4 Alice Meadows,5 and Simon Lambert6

1 DataCite, Welfengarten 1B, 30167 Hannover, Germany
2 Data Archiving and Networked Services, Anna van Saksenlaan 51, 2593 HW Den Haag, the Netherlands
3 The European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
4 Crossref, New Road, Oxford OX1 1BY, UK
5 National Information Standards Organization (NISO), 3600 Clipper Mill Road Suite 302, Baltimore, MD 21211-1948, USA
6 UKRI-STFC, Scientific Computing Department, Rutherford Appleton Laboratory, Harwell Campus, Didcot OX11 0QX, UK
* Correspondence: hcousijn@datacite.org

Persistent identifiers (PIDs) provide unique and long-lasting references to entities. They enable unique identification persistently over time and hence play a crucial role in supporting the FAIR (Findable, Accessible, Interoperable, Reusable) principles. In this paper, we describe how the benefits of PIDs can be amplified by connecting them via their metadata. We are introducing the next step in PID infrastructure: the PID Graph. The PID Graph establishes connections between different entities within the research landscape, thereby enabling both researchers and institutions to access new information. The paper closes with three recommendations, which will help to optimize the use and value of PIDs within the research ecosystem.

THE BIGGER PICTURE

PIDs provide unique and long-lasting references to entities and play a crucial role in research infrastructure. They enable unique identification persistently over time and contribute to making research entities more FAIR (Findable, Accessible, Interoperable, and Reusable).

The benefits of PIDs can be amplified by connecting them via their metadata. Therefore, we are introducing the next step in PID infrastructure: the PID Graph. The PID Graph establishes connections between different entities within the research landscape, thereby enabling researchers and institutions to access new information.

Through the PID Graph, the infrastructure is in place to answer new questions about connections within the research world. However, these will only have meaningful answers if sufficient information is present within the PID Graph. Therefore, the paper closes with three recommendations for different stakeholders, which will help to optimize the use and value of PIDs within the research ecosystem.

INTRODUCTION

The need for sharing research findings and integrating existing information to facilitate new discoveries is more evident than ever. With major steps being taken toward the implementation of transcontinental infrastructures such as the European Open Science Cloud (EOSC), AmeliCA, and several national infrastructures to share research outputs, it is timely to examine the use of identifiers and metadata in connecting research. Taking the EOSC as an example, persistent identifiers (PIDs) are a prominent component of the European scientific ecosystem,1 and a PID policy for the EOSC has been formulated.2

To enhance the value of research, resources should be made FAIR (Findable, Accessible, Interoperable, and Reusable).3 To enable a FAIR research landscape, a technical infrastructure is needed that allows digital information to be found and accessed in a reliable and sustainable manner.4 PIDs are a crucial aspect of this technical infrastructure, playing a role in each of the FAIR elements.5

A PID is a unique and long-lasting reference to an entity, such as a dataset, paper, or person. It is a machine-readable string of characters, which conforms to a defined lexical scheme and must be associated with one, and only one, entity within the world.6 Unlike a uniform resource locator (URL), a PID reliably points to that
