Open Access Article
Jochen Ortmeyer (a), Vitali Sidorin (a), Daniela Adele Hausen (b), Ann-Christin Andres (c), John Jolliffe (c), Theo Bender (c), Giacomo Lanza (d), Steffen Neumann (e), Oliver Koepler (f), Nicole Jung (g), Christoph Steinbeck (h), Johannes Liermann (*c) and Sonja Herres-Pawlis (*a)
(a) Institute of Inorganic Chemistry, RWTH Aachen University, Landoltweg 1, 52074 Aachen, Germany. E-mail: sonja.herres-pawlis@ac.rwth-aachen.de
(b) Heinrich-Heine-University, Research Data Competence Centre, 40225 Düsseldorf, Germany
(c) JGU Mainz, Department of Chemistry, Duesbergweg 10-14, 55128 Mainz, Germany. E-mail: liermann@uni-mainz.de
(d) Physikalisch-Technische Bundesanstalt (PTB), Bundesallee 100, 38116 Braunschweig, Germany
(e) Leibniz Institute of Plant Biochemistry, Program Center MetaCom, Weinberg 3, 06108 Halle, Germany
(f) Technische Informationsbibliothek (TIB), Welfengarten 1 B, 30167 Hannover, Germany
(g) Institute of Biological and Chemical Systems, Karlsruhe Institute of Technology, Kaiserstraße 12, 76131 Karlsruhe, Germany
(h) Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University, Lessingstr. 8, 07743 Jena, Germany
First published on 7th May 2026
Increasing digitalisation is revolutionising all scientific disciplines, but it poses a particular challenge in chemistry. Research data are generated, for example, during the synthesis of substances, the recording of spectra or the application of theoretical methods, and must be appropriately documented, archived and made available for reuse. Traditionally, this has been done in paper-based laboratory notebooks. For the transition to the age of digital chemistry to succeed, a cultural change in the community is necessary. But how do chemists actually treat their research data? The recent NFDI4Chem survey provides answers on the progress of digitalisation. Moreover, we highlight how the consortium carried the community's needs into the second funding phase. This exemplifies a feedback-driven model for digital infrastructure design that could serve as a blueprint for other disciplines and national initiatives.
Several domain-specific efforts have attempted to measure and accelerate FAIR uptake in chemistry. National and international initiatives have surveyed needs and practices to guide infrastructure development: the German NFDI4Chem consortium has run community surveys and produced guidance for chemistry research data stewardship as part of the NFDI effort, explicitly framing its work around FAIR implementation for chemical workflows.8,9 The WorldFAIR project combined technical work with social surveys and policy briefs to identify cross-cutting obstacles (e.g., metadata heterogeneity, the difficulty of linking samples and digital data, and the need for discipline-specific recommendations) and to recommend practical policy measures for chemistry and other domains.10
Complementary to large infrastructure projects, community and publisher-focused studies have highlighted practical pain points that limit FAIRness in chemical publishing and repository practices. For example, recent discussions have argued for standardised templates and mandatory, machine-readable deposition of chemical structures and associated metadata to make published chemical data genuinely reusable. Such community commentaries emphasise the gap between aspirational FAIR policies and routine author behaviour.11–13 At the same time, cross-domain reviews14–17 of FAIR assessment tools and studies of data repositories in materials science and chemistry show a proliferation of technical solutions (assessment tools, metadata standards, domain repositories), while also noting variability in uptake and maturity across subfields.18,19 These analyses underline that technical capability alone is insufficient: incentives, training, and community standards are equally important to increase FAIRness.20,21 In essence, this is a socio-technical endeavour, and many experienced personnel (e.g., data stewards and research data managers) are needed to make the cultural change happen.
Among infrastructure-level responses to the FAIR challenge in chemistry and the physical sciences is the UK-based Physical Sciences Data Infrastructure (PSDI).22,23 PSDI aims to interconnect existing data systems, provide conversion and search services, and offer training and guidance to make data more findable, accessible, interoperable and reusable. By providing (i) a data conversion service tackling format interoperability; (ii) community data collections with enriched metadata schemas; and (iii) guidance on data sharing and FAIR practices, PSDI addresses several known barriers to FAIR adoption in chemistry (e.g., legacy formats, heterogeneous metadata, lack of reuse tooling).
Taken together, the existing literature and survey work paint a consistent picture: awareness of FAIR concepts is growing in chemistry, and many infrastructural components (repositories, identifier systems, metadata initiatives) are emerging, but persistent cultural, technical and organisational barriers continue to limit routine FAIR implementation.15,24 A recent study by Bloodworth et al.13 highlights that current data-sharing practices in organic chemistry journals largely fail to meet FAIR Guiding Principles standards, with most authors sharing only minimal, non-machine-readable data and rarely depositing primary data in open repositories.
Moreover, the authors state that meaningful improvement in data accessibility and machine-readiness will require mandated journal policies, standardized formats, repository infrastructure, and cultural change to enable AI-driven discovery in chemistry. Notably, the PNNL report (ref. 15) reveals that chemistry lags behind materials science and physics in institutionalising FAIR data standards and that impediments to FAIR data generation in chemistry fall into two main groups: technical roadblocks (ontologies, metadata tools, repositories) and psychosocial barriers (incentives, uncertainty, questions of who funds and enforces compliance).
The present user survey complements these prior initiatives by providing up-to-date empirical data from practicing chemists on awareness, concrete practices (data deposition, metadata capture and generation, use of persistent identifiers and ELNs), perceived obstacles (time, incentives, tooling), and priorities for training and infrastructure. By comparing our findings with the results and recommendations of NFDI4Chem, WorldFAIR, PSDI, FAIR assessment studies and community commentaries, we aim to provide actionable guidance for infrastructure providers, publishers and funders working to make chemical data more FAIR.
To set the framework, we give some context on the National Research Data Infrastructure (NFDI): the NFDI is an initiative of the German Federal Government and the federal states. Its aim is to create a network across scientific disciplines that enables the sustainable handling of research data in accordance with the FAIR principles. The chemistry consortium NFDI4Chem strives to seamlessly digitalise the entire workflow in chemical research and thus close the analogue gaps and remove the digital barriers in the data life cycle. To this end, the consortium offers the chemical community various services, resources and training opportunities25 to give it the best possible support during the current cultural change. The digital transformation includes, for example, the provision of open-source electronic lab notebooks (ELNs), which greatly simplify the preparation of data for publication in data repositories, as well as the development of tools and standards to foster data integration and harmonization, and a close exchange with the community. Chemotion ELN,26,27 together with its repository, is an ideal example of the FAIRification of data right from the beginning of the research process: researchers plan the experiment in Chemotion, document every step, note observations in the software, collect and analyse the characterisation data, and use Chemotion to generate the synthesis parts of the supporting information (SI; still needed for most chemical publications). Deposition in the Chemotion repository28 then takes a single click.
Taking the community into account is essential in order to drive forward the development of the infrastructure in line with the needs of users. As early as 2019, we conducted a user survey to shape the work programme of NFDI4Chem in its first phase.29 The survey showed that, although chemists broadly acknowledge the importance of research data management, its practical implementation remains limited. Data were often handled in an ad-hoc manner, with heterogeneous documentation practices and little use of standards or repositories. A major barrier identified in 2019 was the lack of user-friendly, well-integrated digital tools that fit smoothly into everyday laboratory workflows. Respondents expressed a strong need for standardized metadata, minimum information standards, and interoperable formats to enable data sharing and reuse in line with the FAIR principles. In addition to technical infrastructure, the survey highlighted the importance of training, guidance, and long-term institutional support to establish a sustainable culture of good data management. Overall, the results provided a baseline and clearly motivated the strategic focus of NFDI4Chem on infrastructure, standards, and education tailored to the chemistry community.
In total, 813 people participated in the survey. Most of them conducted their most recent research in Germany (84%), while the rest were spread around the world. Approximately 86% work in a research-related environment, namely universities (75%) and non-university research institutions (11%). Only a few participants stated that they work in industry (6%). At universities, participants mostly held positions requiring a master's degree or higher, for example PhD students (33%), postdoctoral researchers (24%) or professors (22%). Master's (7%) and bachelor's (2%) students participated less, likely because they are less involved in the research data management policies of their working group. The participants are fairly evenly distributed across the three main chemical subdisciplines: inorganic (27%), organic (34%) and physical (21%) chemistry. The rest represented a large variety of other subdisciplines, such as chemical engineering (6%), theoretical (9%) or pharmaceutical (6%) chemistry. Selected results are compared with an earlier survey conducted by NFDI4Chem in 2019, in which 623 people participated. Of those, 541 came from Germany; therefore, only the German datasets were analysed.
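As a quick arithmetic check on the sample sizes quoted above, the following minimal sketch computes the share of respondents from Germany in the 2019 survey (541 of 623) and the approximate headcount behind the 84% reported for the current survey. The variable and function names are our own, and the headcount is only an estimate because the reported percentage is rounded.

```python
# Sample sizes as quoted in the text; all names here are illustrative.
PARTICIPANTS_CURRENT = 813
SHARE_GERMANY_CURRENT = 0.84   # rounded share as reported in the text
PARTICIPANTS_2019 = 623
GERMANY_2019 = 541


def share(part: int, total: int) -> float:
    """Return part/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Share of the 2019 sample that came from Germany.
share_germany_2019 = share(GERMANY_2019, PARTICIPANTS_2019)

# Approximate number of respondents from Germany in the current survey
# (an estimate only, since 84% is itself a rounded figure).
approx_germany_current = round(PARTICIPANTS_CURRENT * SHARE_GERMANY_CURRENT)

print(f"2019: {share_germany_2019:.1f}% of respondents came from Germany")
print(f"Current survey: roughly {approx_germany_current} respondents from Germany")
```

Under these assumptions, the German share of both samples is of similar size, which supports comparing the German subsets of the two surveys.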
Fig. 4 Sharing data on different media in and outside the working group of the surveys in 2019 and 2023. (How do you share your data within and outside your working group?).
Additionally, participants were asked whether they knew online databases or data repositories where they can find data for their domain. Here, finding data refers not only to viewing the data but also to saving them and opening them with the corresponding software, leaving aside any limitations due to licensing. Of the respondents, 24% stated that they know such services. Most of the people who took the survey have reused data provided by other scientists (Fig. 5), mainly from colleagues in the same working group (50%) or from the SI of a publication (45%). A smaller share reused data from colleagues at other institutes (28%) or from repositories (18%). The reasons given for not reusing data were that data were not available (8%) or not sufficiently described (6%), distrust in others (3%), or no need to use other data (25%). The last point in particular may be explained by the lack of interdisciplinary overlap in some niche chemical fields.
Fig. 5 Re-using data acquired by different means and negated responses of using external data. (Have you re-used data provided by other scientists?).
Fig. 6 Publishing data as part on different platforms. (Have you already published your data in the form of raw data, processed data or analyzed data?).
The NFDI4Chem survey highlights growing but still limited adoption of ELNs: about 30% of respondents use an ELN, up from 18% in 2019. PSDI's cross-domain data reveal comparable adoption rates, with chemistry and materials science leading in digital documentation but substantial portions of the physical sciences still relying on analogue or semi-digital records. In both communities, ELN integration with analytical instruments and repositories remains a bottleneck. Whereas NFDI4Chem directly supports open-source ELN development (Chemotion) and fosters the linkage of ELNs to repositories, PSDI respondents emphasised a need for interoperable laboratory information systems and conversion services to harmonise data across tools and formats. The two surveys therefore converge in identifying the “last meter” of digitalisation, the laboratory interface, as a shared challenge. Bringing the ELN on a physical device into the laboratory would help in many cases but is not trivial.
A striking parallel emerges in the area of metadata. In the NFDI4Chem survey, 56% of respondents report annotating data with metadata, mostly through manual or hybrid approaches, and only one quarter have institutional metadata standards. PSDI results show a similar fragmentation: respondents recognise metadata as essential but report inconsistent practices, limited automation, and a lack of agreed vocabularies. Across both surveys, metadata creation is perceived as time-consuming and poorly rewarded. Both communities explicitly request clearer guidance, domain-specific templates, and automated metadata extraction from instruments. These findings validate each other and highlight an urgent need for coordinated standardisation across consortia. Such coordination between consortia and with IUPAC and CODATA is already under way, since international integration was an important task in the first funding period of NFDI4Chem.8
Both surveys depict a gradual shift from local, hardware-based storage toward institutional and repository services. In the NFDI4Chem survey, 64% of respondents use institutional servers and 20% deposit data in repositories, reflecting progress compared to 2019 but still limited uptake. The PSDI survey reports similar figures for repository use and highlights that most physical scientists rely on project or group servers rather than formal repositories. Common obstacles include uncertainty about repository suitability, lack of long-term preservation guarantees, and unclear licensing. Both communities express a clear wish for reliable, discipline-adapted repositories with intuitive submission workflows and machine-readable metadata – precisely the gaps NFDI4Chem's Chemotion repository and PSDI's community data collections aim to fill. In the long term, NFDI4Chem is working on the federation of its repositories as well as on fallback options and “exit strategies” for individual repositories.
A central outcome of both surveys is the recognition that infrastructure alone cannot achieve FAIRness without parallel cultural transformation. In the NFDI4Chem results, participants emphasise the need for institutional policies, incentives, and hands-on training; PSDI respondents echo these concerns, adding that data management responsibilities are often unclear or undervalued in research careers. Both initiatives respond by embedding training components into their next phases: NFDI4Chem through the “FAIR for Chemists” curriculum support and an extensive set of training activities, PSDI through coordinated training resources and best-practice documentation. The parallel emphasis on community engagement indicates that sustainable change depends as much on people and skills as on technology. This agrees well with other studies on digital skills in chemistry in general.34
While NFDI4Chem focuses on the chemistry data lifecycle, its community has increasingly recognised the need for interoperability with adjacent domains such as catalysis, materials science, and engineering. PSDI's survey results reinforce this perspective by documenting widespread frustration with data silos and incompatible formats across the physical sciences. Both surveys call for persistent identifiers, harmonised metadata schemas, and linked ontologies as enablers of reuse. The alignment between chemistry-specific (NFDI4Chem) and cross-domain (PSDI) perspectives demonstrates that the FAIR transformation of chemistry cannot occur in isolation but must evolve within a federated, interoperable infrastructure landscape.
Although differing in scale and disciplinary focus, the NFDI4Chem and PSDI surveys are mutually reinforcing. NFDI4Chem offers granular insight into laboratory practices, repository interactions, and the chemistry community's digital culture; PSDI provides a broader systems-level view, identifying the same structural issues from a national infrastructure perspective. Where NFDI4Chem reveals the needs of individual chemists, PSDI situates those needs within a multi-domain ecosystem. Together, the two surveys provide a consistent empirical foundation for guiding European FAIR infrastructure development—emphasising interoperability, community engagement, and sustained training as the pillars of a successful digital transformation.
The survey reveals persistent uncertainty around metadata practices: only 56% of chemists enrich data with metadata, and just 25% report institutional rules for metadata creation. The extension proposal explicitly commits to standardising FAIR metadata workflows through harmonised ontologies and machine-actionable templates. Many NFDI4Chem tools (e.g., Chemotion ELN) embed ontologies in researchers' everyday software, so that their use remains virtually hidden from the user. Building on outcomes of the first phase, the proposal foresees expanded use of minimum information about chemical investigations (MIChI)36,37 as metadata schemas, in alignment with IUPAC, and automated metadata harvesting from ELNs. Moreover, it introduces a “metadata readiness level” concept and quality assessment tools to encourage consistent, transparent metadata practices across institutions. In short, the proposal translates the community's desire for practical metadata guidance into structured, implementable standards supported by tooling and training.
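To make the idea of a machine-actionable minimum-information template more concrete, the sketch below checks a metadata record against a list of required fields and reports a simple completeness score. It is an illustration under our own assumptions: the field names are hypothetical and are not taken from the MIChI schemas, and the score is a toy measure, not the “metadata readiness level” mentioned in the proposal.

```python
from dataclasses import dataclass, field

# Hypothetical required fields, for illustration only; the actual MIChI
# schemas define their own terms and are not reproduced here.
REQUIRED_FIELDS = ("title", "creator_orcid", "inchi", "technique", "licence")


@dataclass
class CompletenessReport:
    """Lists which required fields are filled in and which are missing."""
    present: list = field(default_factory=list)
    missing: list = field(default_factory=list)

    @property
    def score(self) -> float:
        """Fraction of required fields that are filled in (0.0 to 1.0)."""
        total = len(self.present) + len(self.missing)
        return len(self.present) / total if total else 0.0


def check_record(record: dict, required=REQUIRED_FIELDS) -> CompletenessReport:
    """Treat a field as present only if it holds a non-empty string."""
    report = CompletenessReport()
    for name in required:
        value = record.get(name)
        if isinstance(value, str) and value.strip():
            report.present.append(name)
        else:
            report.missing.append(name)
    return report


# Example record (invented for this sketch): the ORCID is absent and the
# licence is left empty, so both count as missing.
example = {
    "title": "1H NMR spectrum of ethanol",
    "inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    "technique": "NMR",
    "licence": "",
}
report = check_record(example)
print("missing:", report.missing)
print(f"completeness: {report.score:.0%}")
```

A check of this kind could in principle run inside an ELN before deposition, so that gaps are flagged while the researcher still has the context to fill them.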
According to the survey, only one-fifth of chemists currently use repositories for long-term storage, although awareness and usage are increasing. The extension proposal therefore strengthens the repository ecosystem as a key infrastructure component. It expands the capacity and interoperability of the Chemotion and RADAR4Chem repositories, introduces discipline-specific sub-repositories, and enhances DOI and ORCID integration for persistent identification. A further measure responds to the high reliance on institutional servers reported in the survey: NFDI4Chem will pilot interfaces between local data storage systems and central repositories, allowing researchers to deposit data directly from institutional infrastructures without duplication. This design supports both ease of use and FAIR compliance.
Cultural and educational barriers remain among the strongest obstacles identified in the community survey. Many participants cited uncertainty about FAIR data practices and insufficient institutional support. The recently started second funding phase addresses this by expanding the training and outreach portfolio through:
• A modular “FAIR for Chemists” training curriculum integrated into university courses,
• Certification schemes for FAIR data stewards,
• Targeted continuing-education formats for PIs and lab managers, and
• Online learning resources in the NFDI4Chem knowledge base.
The work package on “community engagement and training” explicitly embeds user co-creation and feedback cycles, ensuring that educational activities evolve alongside user needs. This mapping demonstrates that NFDI4Chem regards teaching research data management (RDM) as just as vital as technical infrastructure.
Survey respondents indicated interest in broader discoverability and reuse of data beyond their own subdiscipline, yet only a quarter knew of relevant repositories (such as the Chemotion repository or nmrXiv). In response, the extension proposal emphasises cross-consortium collaboration within the NFDI and beyond. It plans integration with NFDI4Ing, NFDI4Cat, and FAIRmat, as well as with international partners such as the WorldFAIR Chemistry initiative. Through harmonised metadata profiles and shared vocabularies, NFDI4Chem aims to position chemistry within a connected FAIR ecosystem. For end users, this means enhanced data searchability, standard identifiers (e.g., InChI7) and shared access portals – directly addressing the community's call for better data discoverability and interoperability.
A recurring theme in the survey is the lack of incentives and clear mandates for data publication. The extension proposal therefore introduces policy and governance measures to reward FAIR data practices. These include citation tracking for datasets, integration of data publication into academic evaluation, and persistent tracking of dataset reuse. Furthermore, the proposal's governance model strengthens community representation by creating an expanded community board and user advisory panels—ensuring that the consortium continues to translate community feedback into action, as demonstrated in the current renewal.
This journal is © The Royal Society of Chemistry 2026