Guidance on the data availability statement requirement in CERP

James M. Nyachwaya (a) and Scott E. Lewis* (b)
(a) Department of Chemistry and Biochemistry and School of Education, North Dakota State University, USA
(b) Department of Chemistry, University of South Florida, USA. E-mail: slewis@usf.edu

To promote adherence to the principles of open science, the Royal Society of Chemistry (RSC) initiated a mandate for the inclusion of a data availability statement across all of its journals. The RSC has published guidance on the requirement (Royal Society of Chemistry, 2024). Chemistry Education Research and Practice (CERP) holds a unique position among RSC journals, as the large majority of CERP papers rely on data collected from human subjects. Research conducted on human subjects is also situated in local contexts, which vary in their considerations for how much data may be shared. In many cases, research on human subjects requires approval from a review board that is independent of the research and that may impose limits on the extent to which data can be shared. In addition, participants have a right to privacy and to give informed consent to how their data may be shared. These considerations rightfully limit the extent to which data from human subjects can be made publicly available. As a result, we found it timely to provide guidance for authors on navigating the data availability statement mandate when submitting to CERP. This guidance is informed by conversations with the editorial board and by our work with authors who have submitted to CERP since the requirement was initiated.

Open science and the benefits of data availability

Open science can be defined as “transparent and accessible knowledge that is shared and developed through collaborative networks” (Vicente-Saez and Martinez-Fuentes, 2018, p. 434), or “the process of making the content and process of producing evidence and claims transparent and accessible to others” (Munafò et al., 2017, p. 5). Open science concerns the openness with which research is designed, carried out, documented, assessed and ultimately shared. Fundamental features of open science include openness, rigor, transparency, replicability, accumulation of knowledge and reproducibility (Crüwell et al., 2019). For these features to be realized, there is a need for open data tools, open peer review methods and open access methods, among others (Vicente-Saez and Martinez-Fuentes, 2018). It is worth noting that open science refers to a collection of research practices that are applicable in different research contexts and that help improve the quality and value of research while accelerating the acquisition of knowledge in science (Crüwell et al., 2019).

Open science practices involve sharing of research questions, methodology, resources, and publishing formats (Allen and Mehler, 2019). Common trends in open science include open code, open data, open access, open notebooks, open lab books, collaborative bibliographies, citizen science, open peer review, and pre-registration (Vicente-Saez and Martinez-Fuentes, 2018). A number of practices support open science, including wider sharing of data, research materials, code, replications and reanalyses, changing statistical approaches and how evidence is assessed, transparent ways of presenting data, use of double-blind review, use of preprints, as well as open access publishing. Open science is therefore transparent, shared, accessible and collaboratively developed (Deng, 2011; Bisol et al., 2014; Hampton et al., 2015).

The ultimate goal of open science is to make science more reliable (Munafò et al., 2017). Open science can help foster responsible and sustainable research. Benefits of open science practices include a potential for more visibility and citations by sharing data, career advancement through collaborations, media attention, as well as funding opportunities (fellowships and awards) from organizations that support open research (McKiernan et al., 2016). Open science practices are therefore valuable because they can help improve the quality and accumulation of knowledge.

Data availability statements in CERP

To promote adherence to the principles of open science, the Royal Society of Chemistry (RSC) requires the inclusion of a data availability statement. The most important point to make is that the mandate is for a statement on data availability, not that the data necessarily be made available. Part of the intent is that, by requiring a data availability statement, researchers may be more inclined to design studies with data sharing in mind and to share data when it is appropriate. The RSC website (Royal Society of Chemistry, 2024) includes suggestions for data availability statements. If the data are being made publicly available, the data availability statement should describe where the data can be located. If the data are housed in a repository, the availability statement can include a URL to that repository. Alternatively, if the data are uploaded as an electronic supplement to the paper, the availability statement can indicate so.

As mentioned, there may be instances where data from human subjects are not publicly available. The RSC exemplars currently include limited examples of statements for cases where data may not be available. The following are exemplar statements that can be used when submitting to CERP if the data are not publicly available:

• The data are not publicly available as participants of this study did not consent for their data to be shared publicly.

• The data are not publicly available as approval for this study did not include permission for sharing data publicly.

• The data are not publicly available as publicly releasing the data could potentially compromise the privacy of the research participants.

These suggestions are not meant to be prescriptive. Authors are welcome to tailor these to describe their particular research study.

The RSC guidance also indicates that the statement “Data are available upon request from the authors” is generally not acceptable. Our interpretation of this guidance is that if the data can be shared publicly, they should be shared in a permanent location. Having data available only by a request made to the authors makes the data availability contingent on the authors’ availability. If the author is no longer available at the corresponding address, or the author’s data storage malfunctions, the data would no longer be available. Publicly sharing the data in a data repository or as a supplemental file accompanying the paper reduces these risks. Thus, authors are encouraged to make the data publicly available and accessible when the data can be shared. When the data cannot be publicly shared, the statement should indicate that the data are not publicly available. Note that such a statement does not preclude interested CERP readers from contacting authors regarding data availability, as there may be instances where the data can be shared individually under some conditions even though they are not being made publicly available.

It is also helpful to clarify what is meant by data, as submissions have varied in what they treat as data. Our interpretation is that the data availability statement is intended to describe the availability of the entirety of the data that were collected and that ultimately led to the evidence-based claims made in the paper. Ideally, this data set would facilitate a replication of the analyses that were conducted. For example, a paper can describe a novel teaching approach and its impact on student learning as evidenced by a collection of students’ test scores. In this case, the data would be the students’ test scores and other information that may have been used in the analyses, such as independent variables (e.g. teaching method experienced) or covariates (e.g. prior test score performance). Providing the teaching method materials as data would not be sufficient, as it would not facilitate reuse of the data or replication of the analyses conducted. Another example is an analysis of student interview transcripts. In this case, the entirety of the transcripts would comprise the data set. The exemplar quotes provided within the paper do not comprise the entirety of the data and would not facilitate reuse of the data.

Ultimately, the principles of open science can increase opportunities for future research that builds on published studies and can improve the transparency of the research process. However, when data originate from human subjects, honoring the contextual expectations for how the data would be shared takes precedence. The mandate for data availability statements allows both of these principles to be upheld. Where data sharing is not permitted, authors are asked to write a brief statement indicating this; where data sharing is permitted, authors are asked to store the data in a permanent and accessible location and indicate the location within the statement.

References

  1. Allen C. and Mehler D. M. A., (2019), Open science challenges, benefits and tips in early career and beyond, PLoS Biol., 17(5), e3000246.
  2. Bisol G. D., Anagnostou P., Capocasa M., Bencivelli S., Cerroni A., Contreras J. L., Enke N., Fantini B., Greco P., Heeney C., Luzi D., Manghi P., Mascalzoni D., Molloy J., Parenti F. M., Wicherts J. and Boulton G., (2014), Perspectives on Open Science and scientific data sharing: An interdisciplinary workshop, J. Anthropol. Sci., 92, 179–200.
  3. Crüwell S., van Doorn J., Etz A., Makel M. C., Moshontz H., Niebaum J. C., Orben A., Parsons S. and Schulte-Mecklenbeck M., (2019), Seven easy steps to open science: An annotated reading list, Z. Psychol., 227(4), 237–248.
  4. Deng F., (2011), Open institutional structure, Q. J. Austrian Econ., 14(4), 416–441.
  5. Hampton S. E., Anderson S. S., Bagby S. C., Gries C., Han X., Hart E. M., Jones M. B., Lenhardt W. C., MacDonald A., Michener W. K., Mudge J., Pourmokhtarian A., Schildhauer M. P., Woo K. H. and Zimmerman N., (2015), The Tao of Open Science for ecology, Ecosphere, 6(7), 120.
  6. McKiernan E. C., Bourne P. E., Brown C. T., Buck S., Kenall A., Lin J., McDougall D., Nosek B. A., Ram K., Soderberg C. K., Spies J. R., Thaney K., Updegrove A., Woo K. H. and Yarkoni T., (2016), How Open Science helps researchers succeed, eLife, 5, e16800.
  7. Munafò M. R., Nosek B. A., Bishop D. V., Button K. S., Chambers C. D., Du Sert N. P., Ioannidis J. P. et al., (2017), A manifesto for reproducible science, Nat. Hum. Behav., 1, 0021.
  8. Royal Society of Chemistry, (2024), Data sharing guidance and policy for preparing your journal article, accessed at: https://www.rsc.org/journals-books-databases/author-and-reviewer-hub/authors-information/prepare-and-format/data-sharing/#dataavailabilitystatements on Aug 26, 2024.
  9. Vicente-Saez R. and Martinez-Fuentes C., (2018), Open Science now: A systematic literature review for an integrated definition, J. Bus. Res., 88(c), 428–436.

This journal is © The Royal Society of Chemistry 2024