From the journal Digital Discovery Peer review history

Developments and applications of the OPTIMADE API for materials discovery, design, and data exchange

Round 1

Manuscript submitted on 02 Feb 2024
 

05-Mar-2024

Dear Dr Rignanese:

Manuscript ID: DD-ART-02-2024-000039
TITLE: Developments and applications of the OPTIMADE API for materials discovery, design, and data exchange

Thank you for your submission to Digital Discovery, published by the Royal Society of Chemistry. I sent your manuscript to reviewers and I have now received their reports which are copied below.

After careful evaluation of your manuscript and the reviewers’ reports, I will be pleased to accept your manuscript for publication after revisions.

Please revise your manuscript to fully address the reviewers’ comments. When you submit your revised manuscript please include a point by point response to the reviewers’ comments and highlight the changes you have made. Full details of the files you need to submit are listed at the end of this email.

Digital Discovery strongly encourages authors of research articles to include an ‘Author contributions’ section in their manuscript, for publication in the final article. This should appear immediately above the ‘Conflict of interest’ and ‘Acknowledgement’ sections. I strongly recommend you use CRediT (the Contributor Roles Taxonomy, https://credit.niso.org/) for standardised contribution descriptions. All authors should have agreed to their individual contributions ahead of submission and these should accurately reflect contributions to the work. Please refer to our general author guidelines https://www.rsc.org/journals-books-databases/author-and-reviewer-hub/authors-information/responsibilities/ for more information.

Please submit your revised manuscript as soon as possible using this link:

*** PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm. ***

https://mc.manuscriptcentral.com/dd?link_removed

(This link goes straight to your account, without the need to log in to the system. For your account security you should not share this link with others.)

Alternatively, you can log in to your account (https://mc.manuscriptcentral.com/dd) where you will need your case-sensitive USER ID and password.

You should submit your revised manuscript as soon as possible; please note you will receive a series of automatic reminders. If your revisions will take a significant length of time, please contact me. If I do not hear from you, I may withdraw your manuscript from consideration and you will have to resubmit. Any resubmission will receive a new submission date.

The Royal Society of Chemistry requires all submitting authors to provide their ORCID iD when they submit a revised manuscript. This is quick and easy to do as part of the revised manuscript submission process. We will publish this information with the article, and you may choose to have your ORCID record updated automatically with details of the publication.

Please also encourage your co-authors to sign up for their own ORCID account and associate it with their account on our manuscript submission system. For further information see: https://www.rsc.org/journals-books-databases/journal-authors-reviewers/processes-policies/#attribution-id

I look forward to receiving your revised manuscript.

Yours sincerely,
Dr Joshua Schrier
Associate Editor, Digital Discovery

************


 
Reviewer 1

In the manuscript “Developments and applications of the OPTIMADE API for materials discovery, design, and data exchange” the authors present the updated version 1.2 of the OPTIMADE API. The OPTIMADE API unifies the access to 18 databases of atomistic structures by providing a generic REST API so users can access all databases with similar queries. The manuscript covers new functionality added to the OPTIMADE API, introduces the 18 databases which support the OPTIMADE API, highlights its application and concludes with an outlook on the future perspective of the OPTIMADE API.

The manuscript is well written and provides the reader with a comprehensive overview of the impact of the OPTIMADE API on the materials science community. Consequently, it is recommended to accept the manuscript for publication.

Reviewer 2

I have thoroughly reviewed this manuscript titled "Enhancing Accessibility and Discoverability through Open Federation: A Case Study with OPTIMADE API". The authors have presented a convincing exploration of the importance of open federation among material data provider groups, emphasizing improved accessibility and discoverability for researchers in the related fields.

The manuscript effectively introduces updated features of the OPTIMADE API and various databases accessible through the API, demonstrating a commitment to FAIR data principles. The discussion in section 4 discussed the possibility of the challenging interoperability, enhancing the overall value of the manuscript. OPTIMADE API's contribution to acceleration of material development and its usability for community engagement is well-explained and substantiated.

In summary, the manuscript is commendable for its valuable contributions to the field, adhering to FAIR principles and emphasizing the significance of open federation. After incorporating the following minor revisions, I recommend accepting the manuscript.

(Minor revisions and clarifications are recommended to enhance the reader accessibility)

1. In the introduction, while agreeing with the emphasis on improving discoverability for lesser-known databases, the discussion in section 4 highlights examples from well-known databases like Materials Project and NOMAD. Authors should elaborate on their considerations regarding the use of data from both lesser-known and renowned databases.

2. Section 3 could benefit from a comprehensive table summarizing the key features of the introduced databases (material property, application, main functions, data size) to provide readers with an overall view. The placement of this table in the main text or supplementary materials should be at the authors' discretion.

3. The integration process of diverse database formats is a pivotal aspect of API development. Providing details on data template integration, format changes, and the original data source's type through well-organized figures would greatly enhance reader comprehension. The supplementary figure 1 in Nature Biotechnology, vol. 38, p 1087–1096 (2020, Duran-Frigola et al.) serves as an excellent example.

4. The description of functionalities related to ontologies and semantics in section 5.2 lacks clarity and specificity, making it difficult for readers to grasp the intended meaning. Clarifying and providing concrete examples would significantly enhance this section.

Reviewer 3

Excellent contribution to the community. This paper shows the continued adoption of the OPTIMADE API and new functionality within the specification.


 

This text has been copied from the PDF response to reviewers and does not include any figures, images or special characters

Dear Editor,
We are grateful to the journal for consideration of our manuscript ``Developments and
applications of the OPTIMADE API for materials discovery, design, and data exchange''
and the four Referees for their careful and conscientious review. We are delighted that two
Referees recommend immediate publication, and the other two Referees recommend
publication following minor revisions and clarifications. We copy our full response and
changes made in response below. We also provide a PDF with all the changes
highlighted.
Referee 1
Referee comment: The manuscript is well written and provides the reader with a
comprehensive overview of the impact of the OPTIMADE API on the materials science
community. Consequently, it is recommended to accept the manuscript for publication.
Our response: We thank the Referee for their reading of the manuscript, positive
comments, and recommendation for publication.
Referee 2
Comments to the Author: I have thoroughly reviewed this manuscript titled "Enhancing
Accessibility and Discoverability through Open Federation: A Case Study with OPTIMADE
API". The authors have presented a convincing exploration of the importance of open
federation among material data provider groups, emphasizing improved accessibility and
discoverability for researchers in the related fields.
The manuscript effectively introduces updated features of the OPTIMADE API and various
databases accessible through the API, demonstrating a commitment to FAIR data
principles. The discussion in section 4 discussed the possibility of the challenging
interoperability, enhancing the overall value of the manuscript. OPTIMADE API's
contribution to acceleration of material development and its usability for community
engagement is well explained and substantiated.
In summary, the manuscript is commendable for its valuable contributions to the field,
adhering to FAIR principles and emphasizing the significance of open federation. After
incorporating the following minor revisions, I recommend accepting the manuscript.
(Minor revisions and clarifications are recommended to enhance the reader accessibility)
Our response: We express our gratitude to the Referee for their careful reading of the
manuscript, positive comments, and for their constructive comments, which we respond
to below:
Referee comment #1: In the introduction, while agreeing with the emphasis on improving
discoverability for lesser-known databases, the discussion in section 4 highlights
examples from well-known databases like Materials Project and NOMAD. Authors should
elaborate on their considerations regarding the use of data from both lesser-known and
renowned databases.
Our response: We thank the Referee for this insightful comment. In response we have
added text to the introduction to highlight the benefits of having access to both large
databases and also the more specialist but small databases. Furthermore, in the opening
of section 4, this is further illustrated by mentioning an example that uses large
databases, and another with a specialist database.
Changes made: The following text has been added to the introduction:
This gives users access to data from both large and well-known sources, and many
specialist datasets focused on a family of materials of particular interest. The combination
of a general overview of all possible materials and detailed knowledge of particular
materials enables novel discovery and deep insights with, for example, machine learning.
The following text has been added to the start of Section 4:
We highlight examples that benefit from access to a large amount of data available in
large databases (e.g., the hard-coating alloys database discussed immediately below),
and examples that benefit from access to specialist data available only in the small and
focused databases (e.g., Sec. 4.1.1).
Referee comment #2: Section 3 could benefit from a comprehensive table summarizing
the key features of the introduced databases (material property, application, main
functions, data size) to provide readers with an overall view. The placement of this table in
the main text or supplementary materials should be at the authors' discretion.
Our response:
We have now clarified that the database size is listed in the final column of Table 1; we
apologise for this previously being insufficiently clear.
In Section 3 each database implementing OPTIMADE has a subsection to describe
precisely the suggested inclusions (material property, application, etc.), with appropriate
references. We believe this format allows databases to more appropriately represent the
features they want to highlight than would be possible by further extending the fairly
complex Table 1. The introduction to Section 3 now makes this intent clearer.
Changes made:
We have added the following text to the Table 1 caption:
The final column indicates the total number of structures served by each OPTIMADE API.
The introduction to Section 3 now includes the following text to clarify that database key
features are discussed in their individual subsections:
Therefore, below we discuss the key features of the major materials databases that make
data available through the OPTIMADE API.
The final subsection 3.18 also now includes text to link the individual database
subsections to Table 1:
While the key features of the databases are highlighted under the subsections dedicated
to the respective providers in Sec.3, a summarizing list of the
Referee comment #3: The integration process of diverse database formats is a pivotal
aspect of API development. Providing details on data template integration, format
changes, and the original data source's type through well-organized figures would greatly
enhance reader comprehension. The supplementary figure 1 in Nature Biotechnology, vol.
38, p 1087–1096 (2020, Duran-Frigola et al.) serves as an excellent example.
Our response: We are grateful for the Referee's comment, and think that further
clarification of the integration process would be helpful. We have done this in two ways:
The OPTIMADE consortium recommends that databases are integrated through the
Python library optimade-python-tools. This is a powerful, flexible, yet easy-to-use tool.
We have now clarified in the manuscript that the optimade-python-tools library is well
described in the paper [Journal of Open Source Software, 6(65), 3458 (2021)], and that
this contains details including template integration and format changes.
The first OPTIMADE paper [Nature Scientific Data 8, 217 (2021)] included a section
"Current Generation of Materials Database APIs" that discussed the original data formats
and their varied API calls. We have now referenced these queries from the submitted
manuscript, so, as the Referee suggested, the reader can understand the progress
made by integrating the original data source's type.
Changes made:
We have now referenced the details including template integration and format changes
covered in the paper [Journal of Open Source Software, 6(65), 3458 (2021)] in Section
2.3.1:
Existing databases wanting to make use of the library need to provide mappings to and
from their existing data format and query mechanisms.
We have clarified the discussion of the original data source's type and calls to it made in
the first OPTIMADE paper [Nature Scientific Data 8, 217 (2021)]:
The initial motivation for OPTIMADE, and a discussion of the previously existing materials
API formats and filter mechanisms can be found in Ref. 3 which described the first
release.
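The mapping requirement quoted above for Section 2.3.1 can be illustrated with a minimal sketch. It is written in plain Python and does not use the optimade-python-tools API; the native record layout (the `atoms`, `symbol`, and `xyz` keys) and the function name `to_optimade_attributes` are hypothetical, while the output keys are property names defined for OPTIMADE structures.

```python
def to_optimade_attributes(record: dict) -> dict:
    """Map a hypothetical native record onto a few OPTIMADE structure properties."""
    # Assumed native layout: {"atoms": [{"symbol": str, "xyz": [x, y, z]}, ...]}
    symbols = [atom["symbol"] for atom in record["atoms"]]
    unique = sorted(set(symbols))  # element symbols in alphabetical order
    return {
        "elements": unique,
        "nelements": len(unique),
        "nsites": len(symbols),
        "species_at_sites": symbols,
        "cartesian_site_positions": [list(atom["xyz"]) for atom in record["atoms"]],
    }


# Made-up record used only to exercise the mapping; the coordinates are arbitrary.
native_record = {
    "atoms": [
        {"symbol": "Si", "xyz": [0.0, 0.0, 0.0]},
        {"symbol": "O", "xyz": [1.6, 0.0, 0.0]},
        {"symbol": "O", "xyz": [-1.6, 0.0, 0.0]},
    ]
}
attributes = to_optimade_attributes(native_record)
```

A real integration would also need the reverse direction, translating OPTIMADE filters into the database's own query mechanism, which this sketch does not attempt.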
Referee comment #4: The description of functionalities related to ontologies and
semantics in section 5.2 lacks clarity and specificity, making it difficult for readers to
grasp the intended meaning. Clarifying and providing concrete examples would
significantly enhance this section.
Our response:
We agree that the description in Section 5.2 should be made more concrete, and we have
thus extended the text accordingly.
Changes made:
Section 5.2 "Ontologies and semantics" has been extended with further details on what
we mean by semantic mappings and with some examples, primarily in these two
segments of text:
For example, properties of an OPTIMADE structure (such as the specification of
periodicity or types of disordered occupation) can be mapped into concepts in the
crystallography domain ontology26 under development within the Elementary
Multiperspective Material Ontology (EMMO) ecosystem.33 This ontology is being created
as a collaborative effort that includes members of the OPTIMADE consortium. These
mappings of OPTIMADE properties into ontologies facilitate the alignment with other...
Examples of such use include the ability to reference properties standardized by
OPTIMADE in, e.g., future EMMO-aligned domain ontologies and giving access to
property data via ontology-based GraphQL server generation.
Section 2.2.1 has also been extended with an example of one of our definitions (``nsites``)
hosted with a persistent URI.
We again thank the Referee for their helpful feedback. We believe that the resultant
changes have helped significantly improve the clarity of the manuscript, which we hope is
now ready for publication.
Referee 3
Referee comment: Excellent contribution to the community. This paper shows the
continued adoption of the OPTIMADE API and new functionality within the specification.
Our response: We are grateful to the Referee for the time they spent reading the
manuscript, and their encouraging comments for the ongoing development of OPTIMADE
and their recommendation for publication of the manuscript.
Referee 4
Comments to the Author: This paper describes the philosophy, history, recent
developments and applications, and future directions of the OPTIMADE API. OPTIMADE
is an incredibly important initiative, in my opinion, for bringing together the increasing
number of databases of materials and molecular structure and properties. OPTIMADE
acts as a focal point for communication and discussion among different database
developers, it helps develop standards and interoperability between them, and enables
users from the wider community to leverage the potential power of having access to
multiple databases via unified queries. The manuscript is very well written and well
structured. Good illustrative examples are provided throughout, without unnecessary
detail (which interested users can follow up via the links to resources provided), and the
directions that the OPTIMADE team are intending to take the project in the near future are
outlined. I am, therefore, very happy to support this manuscript for publication in Digital
Discovery. A few comments that the authors may wish to consider before final
publication:
Our response: We thank the Referee for their review, supportive comments, and positive
recommendation for publication. We have responded in full to their comments below.
Referee comment #1: Sentence on p6 appears incomplete: “In addition to server-focused
functionalities, the package also includes reusable code that can OPTIMADE consumers
and clients,...”
Our response: We are grateful to the Referee for pointing out that this sentence is
ungrammatical. We have added the word "help", which now makes the intended meaning
clear.
Changes made: The sentence highlighted by the Referee has been amended to now read:
In addition to server-focused functionalities, the package includes reusable code that can
help OPTIMADE consumers and clients,...
Referee comment #2: On p13: “The ability to repeat the query attests to the repeatability
of the OPTIMADE API”. Is repeatability the most appropriate word to use here?
Our response: We agree with the Referee that this sentence needs to make it clearer that
we refer to "reproducibility". We have therefore amended the sentence as copied below.
Changes made: The sentence has been changed to now read
The ability to repeat the query attests to how the OPTIMADE API helps with
reproducibility in research.
Referee comment #3: Acknowledgements: are all supporting entities appropriately
acknowledged? Eg, some of the workshops listed in Sec 5.1 were, I believe, supported
by the Psi-k Network (whose support is not mentioned in the manuscript).
Our response:
We did not originally include funding information about our 2021 partner workshop on
ontologies, which was indeed sponsored by Psi-k, MARVEL, and the Swedish e-Science
research centre. We thank the referee for pointing this out and have amended the
Acknowledgement accordingly.
Changes made:
We have inserted the following text in the Acknowledgements: "...; and for the partner
workshop at Linköping University in 2021, support from Psi-k, NCCR MARVEL (a National
Centre of Competence in Research, funded by the Swiss National Science Foundation,
grant No. 205602), and the Swedish e-Science Research Centre (SeRC)."
Referee comment #4: Future directions (and this is a more open question): given the
advances in large language models, what is the authors’ opinion on leveraging/
developing LLMs for interpreting natural language queries to the API or databases in
general?
Our response: We thank the Referee for pointing out the opportunity presented by large
language models. We now mention two possibilities: to help a non-expert user make a
query, and secondly to extract information for a database.
Changes made: We have added the following text to the upcoming features section 5.2.
Large language models have emerged as an exciting frontier for data science and
machine learning. We are now considering two uses of large language models for
OPTIMADE: firstly to help the end-user, and secondly help with the compilation of the
database. A large language model could help a non-expert formulate a query for
OPTIMADE, for example the query in section 2 could be found by requesting ``tell me the
structure of an oxide on silicon''. A second use is to pass the large language model either
textual data or a scan of a page of historical data, from which it can then readily
extract the relevant numbers for an OPTIMADE database. The value provided by
OPTIMADE here is to give a machine-actionable scaffold that an LLM can be validated
and evaluated against, in such a way that the data produced is automatically compatible
with other initiatives.
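To make concrete the kind of query a language model would need to produce for a request such as the one quoted above, a minimal sketch follows. The Section 2 query itself is not reproduced in this response, so the filter used here (structures containing both Si and O) is an assumption, as are the base URL and the helper name `structures_url`; the `HAS ALL` operator and the `filter` and `page_limit` parameters follow the OPTIMADE specification.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL; each provider publishes its own OPTIMADE endpoint.
BASE_URL = "https://example.org/optimade/v1"


def structures_url(base_url: str, elements: list[str], page_limit: int = 5) -> str:
    """Build a /structures URL selecting entries that contain all given elements."""
    quoted = ", ".join(f'"{el}"' for el in elements)
    filter_expr = f"elements HAS ALL {quoted}"
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    query = urlencode({"filter": filter_expr, "page_limit": page_limit}, quote_via=quote)
    return f"{base_url}/structures?{query}"


# Assumed reading of "an oxide on silicon": structures containing Si and O.
url = structures_url(BASE_URL, ["Si", "O"])
```

Because the filter is a plain string with a published grammar, a query generated by a language model can be checked against that grammar before it is sent to any database.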
We underline our gratitude to the Referee for their constructive comments. This has led us
to improve the manuscript, which we believe is now ready for publication.
Summary
We again thank the journal and the four Referees for their helpful and insightful
comments. Having addressed the constructive comments of Referee 2 and Referee 4 in
full, which have improved the clarity of the text, we hope that the manuscript is now
suitable for publication in Digital Discovery.

Sincerely,
The authors




Round 2

Revised manuscript submitted on 05 Apr 2024
 

15-Apr-2024

Dear Dr Rignanese:

Manuscript ID: DD-ART-02-2024-000039.R1
TITLE: Developments and applications of the OPTIMADE API for materials discovery, design, and data exchange

Thank you for submitting your revised manuscript to Digital Discovery. I am pleased to accept your manuscript for publication in its current form. I have copied any final comments from the reviewer(s) below.

You will shortly receive a separate email from us requesting you to submit a licence to publish for your article, so that we can proceed with the preparation and publication of your manuscript.

You can highlight your article and the work of your group on the back cover of Digital Discovery. If you are interested in this opportunity please contact the editorial office for more information.

Promote your research, accelerate its impact – find out more about our article promotion services here: https://rsc.li/promoteyourresearch.

If you would like us to promote your article on our LinkedIn account [https://rsc.li/Digital_showcase] please fill out this form: https://form.jotform.com/213544038469056.

We are offering all corresponding authors on publications in gold open access RSC journals who are not already members of the Royal Society of Chemistry one year’s Affiliate membership. If you would like to find out more please email membership@rsc.org, including the promo code OA100 in your message. Learn all about our member benefits at https://www.rsc.org/membership-and-community/join/#benefit

By publishing your article in Digital Discovery, you are supporting the Royal Society of Chemistry to help the chemical science community make the world a better place.

With best wishes,

Dr Joshua Schrier
Associate Editor, Digital Discovery


 
Reviewer 1

No further comments from my side, I recommend accepting the manuscript.

Reviewer 3

The authors have addressed all reviewer concerns. The paper can be published in its current form.

Reviewer 2

I have carefully looked over the authors' replies and changes regarding all the suggested revisions. I made sure that my concerns were clearly understood and properly addressed. The changes have been carefully and thoroughly made. Importantly, the descriptions are now very reader-friendly. I believe the manuscript is ready for publication.




Transparent peer review

To support increased transparency, we offer authors the option to publish the peer review history alongside their article. Reviewers are anonymous unless they choose to sign their report.

We are currently unable to show comments or responses that were provided as attachments. If the peer review history indicates that attachments are available, or if you find there is review content missing, you can request the full review record from our Publishing customer services team at RSC1@rsc.org.

Find out more about our transparent peer review policy.

Content on this page is licensed under a Creative Commons Attribution 4.0 International license.