Issue 1, 2023

Towards more reproducible and FAIRer research data: documenting provenance during data acquisition using the Infofile format

Abstract

Information, i.e. data, is regarded as the new oil of the 21st century. The impact of this statement from economics on science and the research community is reflected in the rapidly growing number of machine-learning and artificial-intelligence applications, which were one driving force behind the formulation of the FAIR principles. However, any form of data (re)use requires the provenance of the data to be recorded. Hence, recording metadata during data acquisition is both an essential aspect of science and as old as science itself. Here, we discuss the why, when, what, and how of research data documentation and present a simple textual file format, termed Infofile, developed for this purpose. This format allows researchers in the lab to record all relevant metadata during data acquisition in a user-friendly and obvious way while minimising external dependencies. The resulting machine-actionable metadata in turn allow processing and analysis software to access relevant information, while making the research data more reproducible and FAIRer. By demonstrating a simple yet powerful and proven solution to the problem of metadata recording during data acquisition, we anticipate that the Infofile format and its underlying principles will have great impact on the reproducibility and hence quality of science, particularly in the field of “little science”, which lacks established and well-developed software toolchains and standards.

Article information

Article type: Paper
Submitted: 23 Nov 2022
Accepted: 22 Dec 2022
First published: 23 Dec 2022
This article is Open Access
Creative Commons BY-NC license

Digital Discovery, 2023, 2, 234-244

B. Paulus and T. Biskup, Digital Discovery, 2023, 2, 234, DOI: 10.1039/D2DD00131D

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
