From the journal Digital Discovery Peer review history

Natural language processing models that automate programming will transform chemistry research and teaching

Round 1

Manuscript submitted on 02 Sep 2021
 

11-Dec-2021

Dear Dr White:

Manuscript ID: DD-PER-09-2021-000009
TITLE: Natural Language Processing Models That Automate Programming Will Transform Chemistry Research and Teaching

Thank you for your submission to Digital Discovery, published by the Royal Society of Chemistry. I sent your manuscript to reviewers and I have now received their reports which are copied below.

After careful evaluation of your manuscript and the reviewers’ reports, I will be pleased to accept your manuscript for publication after revisions.

Please revise your manuscript to fully address the reviewers’ comments. When you submit your revised manuscript please include a point by point response to the reviewers’ comments and highlight the changes you have made. Full details of the files you need to submit are listed at the end of this email.

Digital Discovery strongly encourages authors of research articles to include an ‘Author contributions’ section in their manuscript, for publication in the final article. This should appear immediately above the ‘Conflict of interest’ and ‘Acknowledgement’ sections. I strongly recommend you use CRediT (the Contributor Roles Taxonomy from CASRAI, https://casrai.org/credit/) for standardised contribution descriptions. All authors should have agreed to their individual contributions ahead of submission and these should accurately reflect contributions to the work. Please refer to our general author guidelines https://www.rsc.org/journals-books-databases/author-and-reviewer-hub/authors-information/responsibilities/ for more information.

Please submit your revised manuscript as soon as possible using this link :

*** PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm. ***

https://mc.manuscriptcentral.com/dd?link_removed

(This link goes straight to your account, without the need to log in to the system. For your account security you should not share this link with others.)

Alternatively, you can login to your account (https://mc.manuscriptcentral.com/dd) where you will need your case-sensitive USER ID and password.

You should submit your revised manuscript as soon as possible; please note you will receive a series of automatic reminders. If your revisions will take a significant length of time, please contact me. If I do not hear from you, I may withdraw your manuscript from consideration and you will have to resubmit. Any resubmission will receive a new submission date.

The Royal Society of Chemistry requires all submitting authors to provide their ORCID iD when they submit a revised manuscript. This is quick and easy to do as part of the revised manuscript submission process.   We will publish this information with the article, and you may choose to have your ORCID record updated automatically with details of the publication.

Please also encourage your co-authors to sign up for their own ORCID account and associate it with their account on our manuscript submission system. For further information see: https://www.rsc.org/journals-books-databases/journal-authors-reviewers/processes-policies/#attribution-id

I look forward to receiving your revised manuscript.

Yours sincerely,
Linda Hung
Associate Editor
Digital Discovery
Royal Society of Chemistry

************


 
Reviewer 1

The authors provide a perspective into the use of natural language processing (NLP) models for automating programming in the chemical sciences. The report is centred on the GPT-3 model and its deployment in the Codex software provided by Microsoft. Overall, the authors do a good job outlining the use of NLP models for automated programming in chemistry research and teaching. The topic is highly relevant to Digital Discovery and therefore I recommend the manuscript be published. I have a few minor comments outlined below.

1. My main point is on the line "We should embrace the fact that we no longer need to spend hours on teaching loops and instead focus on chemistry." This goes against the tone of the rest of the article, which implies that tools such as Codex will enhance our programming skills rather than replace them. A knowledge of programming is still essential to i) validate the code is doing what is expected; ii) debug any errors; and iii) write code that is beyond the NLP model, such as using new packages not in the training corpus or implementing a completely novel analysis. I don't think it is a good idea to teach students how to use Codex without first teaching them how to code.

2. I am interested in how often a model like Codex will/can be retrained considering the model size. New computational chemistry software is released regularly (hundreds of new packages a year), so regular retraining will be necessary to capture the use of these codes in the training corpus.

3. In the SI, it would be useful if the authors indicate which code blocks run successfully and whether any broken examples have been excluded from the results (i.e., give an indication of the authors' success rate using Codex).

Reviewer 2

This manuscript reports a perspective of recently developed natural language processing (NLP) models for automating programming relevant to computational chemistry research and teaching. The perspective was fascinating and a delight to read. Although I consider myself an expert in computational chemistry and at least slightly familiar with most of the tools discussed in the perspective, I was pleasantly surprised to see the NLP models were as advanced as what was demonstrated by the authors. I believe this manuscript is highly suitable for publication in Digital Discovery.

In preparing the referee report, I have been asked to comment on the originality, importance, impact, and reliability of the work. The perspective is certainly very original and important as it clearly shows how modern NLP tools can be productively used for routine applications in computational chemistry and teaching. With a few optional and minor revisions (see below) its impact might be enhanced. The reliability of this approach is somewhat harder to assess, though the authors themselves might be able to add a sentence or two to address this point.

In sum, I strongly recommend this manuscript be published in Digital Discovery, but I also recommend that the authors consider the following comments that I believe would strengthen the manuscript.

1. There seems to be only little mention of examples and challenges in coupling AI with chemistry (and more specifically NLP applied to chemistry). Ref 12 is a useful example, but there will likely be recent reviews that comment on this to help readers find more context for the task being discussed. There is one example that I admittedly have bias for (10.1021/acs.chemrev.1c00107, and references therein), but the authors should decide which references are best to use within the constraints of the journal.

2. Regarding the first paragraph, I enjoyed reading this, but I have a few questions. It is possible that the default settings of pyscf would call for the model chemistry using HF and STO-3G to be used for any electronic structure calculation by default. Did Codex really make this decision and override defaults from pyscf itself, or did Codex learn what the default model chemistry was from the pyscf code, and was what happened really that surprising or not? Another minor point: I believe that since H2 is a two electron system, CCSD(T) calculations are not technically possible, and instead CCSD calculations would have actually been run (though some quantum mechanics codes might be coded in a way to interpret that as what the user meant anyway). These two questions relate to each other, and perhaps this raises the chance to comment on how different codes often have a relatively dumb form of AI baked into every code already that should be recognized to improve reliability going forward.

3. If length restrictions allow it and if the authors agree, it might improve the impact of this work by including a short table with notable examples that summarize the interesting examples of applications embedded throughout the entire perspective. For instance, A. specific examples of what NLP can already do well for chemistry; B. specific examples of what NLP can do somewhat haphazardly for chemistry; C. specific examples of what NLP can't do yet but that's on the horizon for chemistry. This might seem redundant since this is roughly the structure of the perspective already, but as I read this manuscript, I found myself wanting to highlight these examples and make this table for myself. It might be convenient to have such a table pre-made by the authors so others could more easily cite this work and share it to different audiences.

4. To address the editor's requests for comments on how this work is reliable, perhaps the authors could add one or two sentences?

-- John A. Keith

Reviewer 3

The manuscript by Hocky and White provides a timely and valuable account on the use of natural language processing models, Codex in particular, in chemical science. Their tests in a series of routine programming tasks show that such models have the potential to drastically reduce our coding efforts and let us focus on high-level scientific problems, which will benefit both researchers and students. This paper clearly echoes the theme of the journal and is of high quality. I would recommend publication in Digital Discovery after my following points are addressed.

- Can the authors discuss reproducibility in the context of NLP? In the SI, the authors showed multiple responses for the same prompt. Is reproducibility no longer a concern here? Or is it better to have diverse responses for the same input? If so, will this impact data reproducibility in future research?
- Some generated responses are actually incorrect (as also pointed out by the authors). For example, Result 2 of "Dissociation energy" mistakenly used nuclear repulsion energy (`mol.energy_nuc()`) as reference energy. Although it does not affect relative energies in this case, mistakes like this are quite concerning when the result looks correct superficially but has hidden errors. Can the authors comment on this issue?
- Related to my previous point, I would suggest the authors point out clearly the incorrect code/lines in each example (even in the SI) so that audience won't be misled.
- Despite those concerns, I am impressed by the overall accuracy of Codex and the small training size one needs (e.g., in the chemical entity recognition example). Is it also true in other tests that only a small number of training examples were given? Maybe the authors could elaborate more on the training process. I am also curious how chemistry code developers can contribute to ease the training step for users of NLP tools.

Reviewer 4

In this paper, the authors give perspectives on the impact of NLP tools on chemistry research and teaching. They provide examples to demonstrate that the latest large language models can automate programming to complete common tasks in computational chemistry. The limitations of the NLP models are also mentioned. This brings useful information to the community, and I think will be interesting to the audience. As such, I suggest publication of the paper with some questions to be addressed.
1. One major question is whether the current NLP models can generate more complex code. Most of the examples provided in the paper are relatively simple and some of them are just about generating input files for certain tasks and software packages. I guess this can be achieved merely by learning the documentation of these packages. For more complex tasks such as implementing new algorithms or computational methods within the frameworks of existing software, reading and understanding the existing code usually takes quite a lot of time. I wonder if the current NLP models can learn complex code (e.g., the code for computing electron integrals) and help the users implement their own method even if they are not familiar with the underlying codebase. Some comments along these lines would be helpful.
2. A minor question about the SMILES example. Does the AI actually know how to write down the SMILES notation of a chemical or is it just searching the database for the matching chemical names?


 

This text has been copied from the PDF response to reviewers and does not include any figures, images or special characters.

We thank all referees for the supportive, detailed, and helpful feedback on our manuscript. Please see our responses below. Referee text is in plain typeface, author replies are in italic typeface and manuscript excerpts are in sans-serif font.
Referee 1
The authors provide a perspective into the use of natural language processing
(NLP) models for automating programming in the chemical sciences. The
report is centred on the GPT-3 model and its deployment in the Codex
software provided by Microsoft. Overall, the authors do a good job outlining
the use of NLP models for automated programming in chemistry research
and teaching. The topic is highly relevant to Digital Discovery and therefore
I recommend the manuscript be published. I have a few minor comments
outlined below.
Referee 1, Comment 1
My main point is on the line “We should embrace the fact that we
no longer need to spend hours on teaching loops and instead focus on
chemistry.” This goes against the tone of the rest of the article, which
implies that tools such as Codex will enhance our programming skills
rather than replace them. A knowledge of programming is still essential
to i) validate the code is doing what is expected; ii) debug any errors;
and iii) write code that is beyond the NLP model, such as using new
packages not in the training corpus or implementing a completely novel
analysis. I don’t think it is a good idea to teach students how to use
Codex without first teaching them how to code.
Author Response: We certainly agree with this point, and we have
rephrased our text to better get across our meaning:
We should embrace the fact that we no longer need to spend
hours emphasizing the details of syntax, and instead focus on
higher level programming concepts and on translating ideas from
chemistry into algorithms.
Referee 1, Comment 2
I am interested in how often a model like Codex will/can be retrained
considering the model size. New computational chemistry software is
released regularly (hundreds of new packages a year), so regular retraining
will be necessary to capture the use of these codes in the training
corpus.
Author Response: We are also interested to see how this evolves. An
important note is that the model does not need to be entirely retrained,
but rather can be updated in a process known as fine-tuning. Codex is
itself a fine-tuning of the GPT-3 language model.
More pertinent, though, is that one only needs to retrain the model if one wants it to suggest new packages and their usage from scratch. However, one can also start the prompt to Codex with function definitions or examples of use, and it will then produce subsequent code that uses those functions.
Revised page 3
Note though that Codex does not need to have a priori knowledge of how to use your software of interest; API usage can
be suggested as part of the prompt similar to how the task is
defined in Fig. 2.
Referee 1, Comment 3
In the SI, it would be useful if the authors indicate which code blocks
run successfully and whether any broken examples have been excluded
from the results (i.e., give an indication of the authors’ success rate
using Codex).
Author Response: We have added notes in the SI indicating which examples did not run successfully and why. There is a secondary question about scientific correctness, which we discuss further below in response to Referee 3, Comment 2.
In our experience, which has grown since we first submitted this article in August, the success rate depends very much on how the prompt is phrased, and to a lesser extent on model parameters such as temperature. Whether the code executes also depends on whether all relevant packages are imported.
We are currently preparing to submit a research article where we have
developed a database of examples and quantify the success rate, so that
we—as well as anyone in the community who wants to contribute—have
a tool to assess the quality of LLMs going forward.
Referee 2
This manuscript reports a perspective of recently developed natural language
processing (NLP) models for automating programming relevant to computational chemistry research and teaching. The perspective was fascinating and
a delight to read. Although I consider myself an expert in computational
chemistry and at least slightly familiar with most of the tools discussed in
the perspective, I was pleasantly surprised to see the NLP models were as advanced as what was demonstrated by the authors. I believe this manuscript
is highly suitable for publication in Digital Discovery.
In preparing the referee report, I have been asked to comment on the
originality, importance, impact, and reliability of the work. The perspective is certainly very original and important as it clearly shows how modern
NLP tools can be productively used for routine applications in computational chemistry and teaching. With a few optional and minor revisions (see
below) its impact might be enhanced. The reliability of this approach is
somewhat harder to assess, though the authors themselves might be able to
add a sentence or two to address this point.
In sum, I strongly recommend this manuscript be published in Digital
Discovery, but I also recommend that the authors consider the following
comments that I believe would strengthen the manuscript.
Referee 2, Comment 1
There seems to be only little mention of examples and challenges in coupling AI with chemistry (and more specifically NLP applied to chemistry). Ref 12 is a useful example, but there will likely be
recent reviews that comment on this to help readers find more context
for the task being discussed. There is one example that I admittedly
have bias for (10.1021/acs.chemrev.1c00107, and references therein),
but the authors should decide which references are best to use within
the constraints of the journal.
Author response: We have added several references to the outlook, including this one, on the uses, challenges, and promises of AI in chemistry, both within and beyond NLP.
Revised page 4
There are many exciting ways in which AI techniques are being
integrated into chemistry research [6, 1, 8]. Bench chemists
have expressed the fear that automation will reduce the need
for synthetic hands in the lab [4]...
Referee 2, Comment 2
Regarding the first paragraph, I enjoyed reading this, but I have a few
questions. It is possible that the default settings of pyscf would call
for the model chemistry using HF and STO-3G to be used for any
electronic structure calculation by default. Did Codex really make this
decision and override defaults from pyscf itself, or did Codex learn what
the default model chemistry was from the pyscf code, and was what
happened really that surprising or not?
Author response: To the best of our knowledge, Codex is really making a choice here, since as we see in SI example 1 (used to produce the figure), those options are chosen explicitly. Of course, these are likely the most frequent choices in the code on which the model was trained.
Referee 2, Comment 3
Another minor point: I believe that since H2 is a two electron system,
CCSD(T) calculations are not technically possible, and instead CCSD
calculations would have actually been run (though some quantum mechanics codes might be coded in a way to interpret that as what the user meant anyway). These two questions relate to each other, and
perhaps this raises the chance to comment on how different codes often
have a relatively dumb form of AI baked into every code already that
should be recognized to improve reliability going forward.
Author response: Thank you for indicating this subtle point that we hadn't noticed. You are certainly correct. Looking into this further, we found that Codex actually produced code which computes the answer using both CCSD and CCSD(T) and then compares the two. When we checked to make sure the code runs, it indeed does run and gives identical answers. For simplicity and to avoid this confusion, we have removed the (T) from this part of the text.
Your second point is an excellent one which we had not discussed in the article. The NLP model is obviously not truly aware of the underlying chemistry (e.g. that H2 is a two-electron system and hence cannot have triple excitations), and hence relies on good defaults and error handling in the chemistry codes to be successful, and ideally an intelligent user as well! We now point this out explicitly:
Revised page 2
Moreover, keep in mind that Codex may produce code that is
apparently correct and even executes, but which does not follow
best scientific practice for a particular type of computational
task.
Referee 2, Comment 4
If length restrictions allow it and if the authors agree, it might improve
the impact of this work by including a short table with notable examples that summarize the interesting examples of applications embedded
throughout the entire perspective. For instance, A. specific examples
of what NLP can already do well for chemistry; B. specific examples of
what NLP can do somewhat haphazardly for chemistry; C. specific
examples of what NLP can’t do yet but that’s on the horizon for chemistry. This might seem redundant since this is roughly the structure of
the perspective already, but as I read this manuscript, I found myself
wanting to highlight these examples and make this table for myself. It
might be convenient to have such a table pre-made by the authors so
others could more easily cite this work and share it to different audiences.
Author response: Although this suggestion is appealing, after discussing what would go into the table, we decided that others might disagree about what constitutes NLP doing something well or haphazardly, and so we did not add this table. This is certainly a good idea to revisit as the field becomes more mature, and also as we expand our aforementioned database of examples against which we will check for NLP success.
Referee 2, Comment 5
To address the editor’s requests for comments on how this work is
reliable, perhaps the authors could add one or two sentences?
Author response: As mentioned above, we are currently studying
how reliable this approach is in different contexts and plan to submit
a research article soon. We have added a discussion of this to the
reliability section.
Revised page 4
Because the accuracy of Codex depends strongly on how the
prompts are phrased, it remains unclear how accurate it can be
for chemistry problems. We are currently developing a database
of chemistry and chemical engineering examples that can be
used to systematically evaluate LLM performance in these and
related domains. A second question remains as to whether the
code produced is scientifically correct (and best practice when
multiple solutions exist) for a given task, which will still require
expert human knowledge to verify for now. We also note that in
practice some of the correctness is ensured by default settings
of chemistry packages employed in the Codex solution, just as
they might be with human generated code.
– John A. Keith
Referee 3
Referee 3, Comment 1
The manuscript by Hocky and White provides a timely and valuable
account on the use of natural language processing models, Codex in
particular, in chemical science. Their tests in a series of routine programming tasks show that such models have the potential to drastically
reduce our coding efforts and let us focus on high-level scientific problems, which will benefit both researchers and students. This paper
clearly echoes the theme of the journal and is of high quality. I would
recommend publication in Digital Discovery after my following points
are addressed.
- Can the authors discuss reproducibility in the context of NLP? In
the SI, the authors showed multiple responses for the same prompt. Is
reproducibility no longer a concern here? Or is it better to have diverse
responses for the same input? If so, will this impact data reproducibility
in future research?
Author Response: The output of a transformer is a probability distribution. It must be sampled from to generate meaningful text. There
are multiple ways to do this. Codex samples with a temperature (same
equation as Boltzmann) and a fixed random seed.
SI New section
Codex and other LLMs output a probability distribution, so that
multiple outcomes can be generated using top-k algorithms that
select the k most likely outcomes. This is how we have generated
multiple outcomes from a given prompt.
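For illustration, the temperature sampling described above can be sketched in a few lines of Python; the logit values below are invented for the example and are not actual Codex outputs.

```python
import math
import random

def sample_with_temperature(logits, temperature, seed=0):
    """Sample a token index with Boltzmann-style weighting:
    p_i is proportional to exp(logit_i / T). Low T concentrates
    probability on the top-scoring token; high T flattens the
    distribution. A fixed seed makes the draw reproducible."""
    weights = [math.exp(l / temperature) for l in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    rng = random.Random(seed)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Made-up scores for three candidate tokens; at low temperature the
# highest-scoring token (index 0) is selected almost every time.
logits = [2.0, 1.0, 0.1]
token = sample_with_temperature(logits, temperature=0.2)
```

Raising the temperature (or widening a top-k cutoff) is what produces the diverse responses shown in the SI.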
Referee 3, Comment 2
- Some generated responses are actually incorrect (as also pointed out
by the authors). For example, Result 2 of “Dissociation energy” mistakenly used nuclear repulsion energy (‘mol.energy nuc()’) as reference
energy. Although it does not affect relative energies in this case, mistakes like this are quite concerning when the result looks correct superficially but has hidden errors. Can the authors comment on this
issue?
Author Response: We have addressed correctness in previous responses and in new text added to the main document. Additionally, we have added notes to code examples where these kinds of mistakes have occurred, including this one. We agree it can be concerning.
Revised page 4
Because the accuracy of Codex depends strongly on how the
prompts are phrased, it remains unclear how accurate it can be
for chemistry problems. We are currently developing a database
of chemistry and chemical engineering examples that can be
used to systematically evaluate LLM performance in these and
related domains. A second question remains as to whether the
code produced is scientifically correct (and best practice when
multiple solutions exist) for a given task, which will still require
expert human knowledge to verify for now. We also note that in
practice some of the correctness is ensured by default settings
of chemistry packages employed in the Codex solution, just as
they might be with human generated code.
Referee 3, Comment 3
- Despite those concerns, I am impressed by the overall accuracy of
Codex and the small training size one needs (e.g., in the chemical entity
recognition example). Is it also true in other tests that only a small
number of training examples were given? Maybe the authors could
elaborate more on the training process. I am also curious how chemistry
code developers can contribute to ease the training step for users of NLP
tools.
Author Response: There are two different kinds of training here that we want to carefully delineate. The ability to generate well-formatted Python code, along with specifics of particular package APIs (such as pyscf, just discussed), is learned at the language-model level. As we discussed in response to Referee 1 regarding re-training, we do not know how often this would happen, but we can give Codex information within the 'prompt' which can cause it to generate better or correct code. This is more similar to the 'training' used in the chemical entity recognition example.
The chemical entity recognition really is as simple as shown in the SI. Part of a methods section was taken from a recent paper by one of us and annotated. GPT-3 was then able to use that annotation to pick out chemical entities from further sentences. We were very surprised at how easy this was, as it seems extremely useful. But we have not exhaustively tested how well this technique would extrapolate to other chemistry text processing tasks as yet.
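The few-shot setup can be sketched as a single annotated prompt; the example sentence and tag format below are invented placeholders, not the actual methods-section text we annotated.

```python
# One annotated example primes the model; it then mimics the format
# for new sentences. All text here is an illustrative placeholder.
EXAMPLE = (
    "Sentence: The sample was dissolved in dimethyl sulfoxide and stirred.\n"
    "Entities: dimethyl sulfoxide\n"
)

def make_ner_prompt(sentence: str) -> str:
    # The model is expected to continue after the trailing "Entities:",
    # listing the chemical entities it finds in the new sentence.
    return EXAMPLE + "Sentence: " + sentence + "\nEntities:"

prompt = make_ner_prompt("Caffeine was extracted with dichloromethane.")
```

The language model's completion after "Entities:" serves as the extraction result, with no task-specific weight updates.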
Referee 4
Referee 4, Comment 1
In this paper, the authors give perspectives on the impact of NLP tools
on chemistry research and teaching. They provide examples to demonstrate that the latest large language models can automate programming
to complete common tasks in computational chemistry. The limitations
of the NLP models are also mentioned. This brings useful information
to the community, and I think will be interesting to the audience. As
such, I suggest publication of the paper with some questions to be addressed.
1. One major question is whether the current NLP models can generate more complex code. Most of the examples provided in the paper are relatively simple and some of them are just about generating input files for certain tasks and software packages. I guess this can be achieved merely by learning the documentation of these packages. For more complex tasks such as implementing new algorithms or computational methods within the frameworks of existing software, reading and understanding the existing code usually takes quite a lot of time.
I wonder if the current NLP models can learn complex code (e.g., the
code for computing electron integrals) and help the users implement
their own method even if they are not familiar with the underlying
codebase. Some comments along these lines would be helpful.
Author Response: This is an excellent point, and we now point out in the Ongoing Challenges section that we have so far limited ourselves to deployment of existing algorithms and APIs. From our experience, it would be possible right now to feed in examples of certain kinds of calculations (e.g. electron integrals) and have Codex generate code that is very similar. Right now, we would not expect LLMs to be able to generate code for entirely new methods, although they might be able to generate a substantial amount of actual code from a pseudocode algorithm describing a new approach. We have added a note to the main text about a different LLM that is specifically suited to solving and generating university-level calculus problems, which is perhaps better suited to your example.
Revised text, page 2
Although automatic code generation in chemistry is not new
(e.g. [7, 2, 10]), we believe that the scope and natural language
aspects mean that code-generating LLMs like Codex will have
a broad impact on both the computational and experimental
chemistry community. Furthermore, Codex is just the first capable model and progress will continue. Already in late 2021
there are models that surpass GPT-3 in language[9], equal it
but with 1/20th the number of parameters[3], and models that
can generate and solve university-level math problems[5].
Referee 4, Comment 2
2. A minor question about the SMILES example. Does the AI actually
know how to write down the SMILES notation of a chemical or is it
just searching the database for the matching chemical names?
Author Response: There is no database. Codex has memorized the SMILES in its trainable weights, which are just floating-point numbers.
Revised Text
Page 2
While reading these examples, remember that the model does
not have a database or access to a list of chemical concepts. All
chemistry knowledge, like the SMILES string for caffeine in Example A, comes purely from the learned floating point weights.
References
[1] Nongnuch Artrith, Keith T Butler, François-Xavier Coudert, Seungwu
Han, Olexandr Isayev, Anubhav Jain, and Aron Walsh. Best practices
in machine learning for chemistry. Nat. Chem., 13(6):505–508, 2021.
[2] Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk
Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc
Le, et al. Program synthesis with large language models. arXiv preprint
arXiv:2108.07732, 2021.
[3] Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai,
Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, et al. Improving language models by retrieving from trillions of tokens. arXiv preprint
arXiv:2112.04426, 2021.
[4] Chemjobber. Will robots kill chemistry? Chem. Eng. News, 97, 2019.
[5] Iddo Drori, Sunny Tran, Roman Wang, Newman Cheng, Kevin Liu, Leonard Tang, Elizabeth Ke, Nikhil Singh, Taylor L. Patti, Jayson Lynch, Avi Shporer, Nakul Verma, Eugene Wu, and Gilbert Strang. A neural network solves and generates mathematics problems by program synthesis: Calculus, differential equations, linear algebra, and more. arXiv preprint arXiv:2112.15594, 2022.
[6] John A. Keith, Valentin Vassilev-Galindo, Bingqing Cheng, Stefan Chmiela, Michael Gastegger, Klaus-Robert Müller, and Alexandre Tkatchenko. Combining machine learning and computational chemistry for predictive insights into chemical systems. Chem. Rev., 121(16):9816–9872, 2021. PMID: 34232033.
[7] Matthew K. MacLeod and Toru Shiozaki. Communication: Automatic code generation enables nuclear gradient computations for fully internally contracted multireference theory. J. Chem. Phys., 142(5):051103, 2015.
[8] Robert Pollice, Gabriel dos Passos Gomes, Matteo Aldeghi, Riley J. Hickman, Mario Krenn, Cyrille Lavigne, Michael Lindner-D'Addario, AkshatKumar Nigam, Cher Tian Ser, Zhenpeng Yao, et al. Data-driven strategies for accelerated materials design. Acc. Chem. Res., 54(4):849–860, 2021.
[9] Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, et al. Scaling language models: Methods, analysis & insights from training Gopher. arXiv preprint arXiv:2112.11446, 2021.
[10] Thorsten Zirwes, Feichi Zhang, Jordan A. Denev, Peter Habisreuther, and Henning Bockhorn. Automated code generation for maximizing performance of detailed chemistry calculations in OpenFOAM. In High Performance Computing in Science and Engineering '17, pages 189–204. Springer, 2018.




Round 2

Revised manuscript submitted on 06 Jan 2022
 

25-Jan-2022

Dear Dr White:

Manuscript ID: DD-PER-09-2021-000009.R1
TITLE: Natural Language Processing Models That Automate Programming Will Transform Chemistry Research and Teaching

Thank you for submitting your revised manuscript to Digital Discovery. I am pleased to accept your manuscript for publication in its current form. I have copied any final comments from the reviewer(s) below.

You will shortly receive a separate email from us requesting you to submit a licence to publish for your article, so that we can proceed with the preparation and publication of your manuscript.

You can highlight your article and the work of your group on the back cover of Digital Discovery. If you are interested in this opportunity please contact the editorial office for more information.

Promote your research, accelerate its impact – find out more about our article promotion services here: https://rsc.li/promoteyourresearch.

If you would like us to promote your article on our Twitter account @digital_rsc please fill out this form: https://form.jotform.com/213544038469056.

Thank you for publishing with Digital Discovery, a journal published by the Royal Society of Chemistry – connecting the world of science to advance chemical knowledge for a better future.

With best wishes,

Linda Hung
Associate Editor
Digital Discovery
Royal Society of Chemistry


 
Reviewer 2

All comments appear to be satisfactorily addressed.

Reviewer 1

The authors have addressed all my previous comments. I therefore recommend this manuscript be accepted without further revision.

Reviewer 3

My comments have been appropriately addressed. I appreciate the authors' efforts and recommend the publication of this paper.




Transparent peer review

To support increased transparency, we offer authors the option to publish the peer review history alongside their article. Reviewers are anonymous unless they choose to sign their report.

We are currently unable to show comments or responses that were provided as attachments. If the peer review history indicates that attachments are available, or if you find there is review content missing, you can request the full review record from our Publishing customer services team at RSC1@rsc.org.

Find out more about our transparent peer review policy.

Content on this page is licensed under a Creative Commons Attribution 4.0 International license.