James M. Nyachwaya^a and Scott E. Lewis*^b
^a Department of Chemistry and Biochemistry and School of Education, North Dakota State University, USA
^b Department of Chemistry, University of South Florida, USA. E-mail: slewis@usf.edu
Open science practices involve sharing of research questions, methodology, resources, and publishing formats (Allen and Mehler, 2019). Common trends in open science include open code, open data, open access, open notebooks, open lab books, collaborative bibliographies, citizen science, open peer review, and pre-registration (Vicente-Saez and Martinez-Fuentes, 2018). A number of practices support open science, including wider sharing of data, research materials, code, replications and reanalyses, changing statistical approaches and how evidence is assessed, transparent ways of presenting data, use of double-blind review, use of preprints, as well as open access publishing. Open science is therefore transparent, shared, accessible and collaboratively developed (Deng, 2011; Bisol et al., 2014; Hampton et al., 2015).
The ultimate goal of open science is to make science more reliable (Munafò et al., 2017). Open science can help foster responsible and sustainable research. Benefits of open science practices include a potential for more visibility and citations by sharing data, career advancement through collaborations, media attention, as well as funding opportunities (fellowships and awards) from organizations that support open research (McKiernan et al., 2016). Open science practices are therefore valuable because they can help improve the quality and accumulation of knowledge.
As mentioned, there may be instances where data from human subjects are not publicly available. The RSC exemplars currently include only a limited number of example statements for cases where data may not be available. The following exemplar statements can be used in submissions to CERP when data are not available:
• The data are not publicly available as participants of this study did not consent for their data to be shared publicly.
• The data are not publicly available as approval for this study did not include permission for sharing data publicly.
• The data are not publicly available as publicly releasing the data could potentially compromise the privacy of the research participants.
These suggestions are not meant to be prescriptive. Authors are welcome to tailor these to describe their particular research study.
The RSC guidance also indicates that the statement “Data are available upon request from the authors” is generally not acceptable. Our interpretation of this guidance is that if the data can be shared publicly, they should be shared in a permanent location. Making data available only by request to the authors makes data availability contingent on the authors’ availability. If the author is no longer reachable at the corresponding address, or the author’s data storage malfunctions, the data would no longer be available. Publicly sharing the data in a data repository or as a supplemental file accompanying the paper reduces these risks. Thus, authors are encouraged to make the data publicly available and accessible when the data can be shared. When the data cannot be publicly shared, authors should indicate that the data are not publicly available. Note that this statement does not preclude interested CERP readers from contacting authors regarding data availability, as there may be instances where the data can be shared individually under some conditions even though they are not being made publicly available.
It is also helpful to clarify what is meant by data, as submissions have varied attributions for data. Our interpretation of the intent for the data availability statement is to describe the availability of the entirety of the data that was collected which ultimately led to the evidence-based claims made in the paper. Ideally, this data set would facilitate a replication of the analyses that were conducted. For example, a paper can describe a novel teaching approach and its impact on student learning as evidenced by a collection of students’ test scores. In this case, the data would be the students’ test scores and other information that may have been used in the analyses such as independent variables (e.g. teaching method experienced) or covariates (e.g. prior test score performance). Providing the teaching method materials as data would not be sufficient as it would not facilitate reuse of the data or replication of the analyses conducted. Another example is an analysis of student interview transcripts. In this case, the entirety of the transcripts would comprise the data set. The exemplar quotes provided within the paper do not comprise the entirety of the data and would not facilitate reuse of the data.
Ultimately, the principles of open science can increase opportunities for future research that builds on published studies and improve the transparency of the research process. However, when data originate from human subjects, honoring the contextual expectations for how the data would be shared takes precedence. The mandate for data availability statements allows both of these principles to be upheld. Where data sharing is not permitted, authors are asked to write a brief statement indicating this; where data sharing is permitted, authors are asked to store the data in a permanent and accessible location and indicate the location within the statement.
This journal is © The Royal Society of Chemistry 2024