How can Blockchain technology support Open Science data infrastructures
Presentation and discussion of a use case showing how selected metadata associated with an open science data set can be stored and shared on a blockchain
To illustrate, we refer to Blockcerts, a system for self-management of educational credentials that is already in commercial use. Blockcerts stores a fingerprint of an academic credential (a hash of the underlying credential) on a public blockchain. The solution relieves the issuing body, e.g. a university, of having to verify the correctness of a credential each time it is accessed. Using this technology, individuals can take control of their own credentials by holding verified records, which they can use as needed.
Accordingly, we suggest a similar use case describing how selected metadata associated with an open science (OS) data set can be securely stored and shared on a blockchain, giving all relevant parties a mechanism to verify the correctness of the metadata. Relevant metadata can include data descriptors, author identification credentials, and possible licenses or other conditions for use. These metadata may be produced by the responsible researchers themselves, their institution or their publisher, and/or other relevant (authorised) bodies. In addition, a digital fingerprint of the data set itself may be stored on-chain.
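As a minimal sketch of what a digital fingerprint of a data set could look like, the following Python example computes a SHA-256 hash over the data in chunks and later checks a copy against the recorded value. The choice of SHA-256, the function names and the sample data are assumptions made for illustration; the use case itself does not prescribe a particular hash function.

```python
import hashlib
import io


def fingerprint_stream(stream, chunk_size=65536):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(stream, recorded_fingerprint):
    """Check a copy of the data set against the fingerprint recorded earlier."""
    return fingerprint_stream(stream) == recorded_fingerprint


# Hypothetical data set contents, held in memory to keep the example self-contained.
original = b"sample_id,value\n1,0.42\n2,0.57\n"
recorded = fingerprint_stream(io.BytesIO(original))  # value that would go on-chain

# A copy with a single altered digit no longer matches the recorded fingerprint.
tampered = original.replace(b"0.42", b"0.43")
print(verify_dataset(io.BytesIO(original), recorded))
print(verify_dataset(io.BytesIO(tampered), recorded))
```

Reading in chunks means the same function can be applied to large files opened in binary mode without loading them fully into memory.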
Metadata can, for example, be organised into categories and formatted using the Dublin Core metadata standard. A “fingerprint” (hash) of the metadata is calculated and stored on the blockchain together with the hash of a certificate identifying the responsible research organisation. The science data itself is not stored on the blockchain; only the address pointing to the data is part of the metadata. It is important to note, however, that the authenticity of the certificate needs a qualified assessment when the fingerprint is first uploaded to the blockchain, most likely by the certificate issuer.
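The record described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the deliverable's implementation: the metadata values, the example address, the placeholder certificate bytes, the record field names and the use of sorted-key JSON with SHA-256 are all hypothetical choices. Only the Dublin Core element names (title, creator, rights, identifier) come from the standard.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def canonical_bytes(metadata: dict) -> bytes:
    """Serialise metadata deterministically so equal content gives equal bytes."""
    text = json.dumps(metadata, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def build_onchain_record(metadata: dict, certificate_bytes: bytes) -> dict:
    """Build the small record that would be written to the blockchain."""
    return {
        "metadata_fingerprint": sha256_hex(canonical_bytes(metadata)),
        "certificate_hash": sha256_hex(certificate_bytes),
    }


# Hypothetical metadata using Dublin Core element names as keys.
metadata = {
    "dc:title": "Example open science data set",
    "dc:creator": "Example Researcher",
    "dc:rights": "CC BY 4.0",
    # Only the address of the data is included, not the data itself.
    "dc:identifier": "https://example.org/datasets/1234",
}

# Placeholder bytes standing in for the research organisation's certificate.
certificate = b"PLACEHOLDER CERTIFICATE OF THE RESEARCH ORGANISATION"

record = build_onchain_record(metadata, certificate)
print(record)
```

Serialising with sorted keys matters because two parties holding the same metadata must arrive at the same bytes, and therefore the same fingerprint, regardless of the order in which the fields were entered.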
For more robust and immutable storage of OS data, a distributed solution such as the InterPlanetary File System (IPFS) may be used. Identical copies of the data are then spread over a number of individual nodes (PCs) to prevent blocking of or tampering with the data, and also to offer better accessibility. Such nodes can be part of an EU OS cloud infrastructure. This is optional, however, and the process described here will work just as well with centrally stored data.
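The idea of replicated, tamper-evident storage can be illustrated with a toy content-addressed store in Python. This is a simplified sketch and not the IPFS API: real IPFS uses content identifiers (CIDs) and a peer-to-peer network, whereas here the address is simply a SHA-256 hex digest and the `Node` class and `fetch_verified` function are hypothetical names.

```python
import hashlib


class Node:
    """A toy storage node that keeps blocks under their content hash."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self.blocks[address] = data
        return address

    def get(self, address: str):
        return self.blocks.get(address)


def fetch_verified(nodes, address):
    """Return the first copy whose hash matches the requested address."""
    for node in nodes:
        data = node.get(address)
        if data is not None and hashlib.sha256(data).hexdigest() == address:
            return data
    return None


# Store identical data on three nodes; every node derives the same address.
nodes = [Node("a"), Node("b"), Node("c")]
data = b"open science data set contents"
address = [node.put(data) for node in nodes][0]

# Simulate tampering on one node and removal (blocking) on another.
nodes[0].blocks[address] = b"tampered contents"
del nodes[1].blocks[address]

# The remaining intact copy is still found and verified.
result = fetch_verified(nodes, address)
print(result == data)
```

Because the address is derived from the content, a reader can detect an altered copy without trusting the node that served it, and any single intact replica is enough to recover the data.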
To summarize, the use of this system would involve the following steps:
Important issues when designing an OS data infrastructure that uses blockchain technology (BCT) include:
Processes and procedures needed when parts of the research data have been updated
This passage is part of D6.3: Comparison of existing blockchain technologies to safeguard responsible OS, written by Arild Johan Jansen and Svein Ølnes.