From the journal Digital Discovery Peer review history

Understanding and improving zeroth-order optimization methods on AI-driven molecule optimization

Round 1

Manuscript submitted on 25 Apr 2023
 

20-Jun-2023

Dear Dr Chen:

Manuscript ID: DD-ART-04-2023-000076
TITLE: Understanding and improving zeroth-order optimization methods on AI-driven molecule optimization

Thank you for your submission to Digital Discovery, published by the Royal Society of Chemistry. I sent your manuscript to reviewers and I have now received their reports which are copied below.

I have carefully evaluated your manuscript and the reviewers’ reports, and the reports indicate that major revisions are necessary.

Please submit a revised manuscript which addresses all of the reviewers’ comments. Further peer review of your revised manuscript may be needed. When you submit your revised manuscript please include a point by point response to the reviewers’ comments and highlight the changes you have made. Full details of the files you need to submit are listed at the end of this email.

Digital Discovery strongly encourages authors of research articles to include an ‘Author contributions’ section in their manuscript, for publication in the final article. This should appear immediately above the ‘Conflict of interest’ and ‘Acknowledgement’ sections. I strongly recommend you use CRediT (the Contributor Roles Taxonomy, https://credit.niso.org/) for standardised contribution descriptions. All authors should have agreed to their individual contributions ahead of submission and these should accurately reflect contributions to the work. Please refer to our general author guidelines https://www.rsc.org/journals-books-databases/author-and-reviewer-hub/authors-information/responsibilities/ for more information.

Please submit your revised manuscript as soon as possible using this link:

*** PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm. ***

https://mc.manuscriptcentral.com/dd?link_removed

(This link goes straight to your account, without the need to log on to the system. For your account security you should not share this link with others.)

Alternatively, you can login to your account (https://mc.manuscriptcentral.com/dd) where you will need your case-sensitive USER ID and password.

You should submit your revised manuscript as soon as possible; please note you will receive a series of automatic reminders. If your revisions will take a significant length of time, please contact me. If I do not hear from you, I may withdraw your manuscript from consideration and you will have to resubmit. Any resubmission will receive a new submission date.

The Royal Society of Chemistry requires all submitting authors to provide their ORCID iD when they submit a revised manuscript. This is quick and easy to do as part of the revised manuscript submission process.   We will publish this information with the article, and you may choose to have your ORCID record updated automatically with details of the publication.

Please also encourage your co-authors to sign up for their own ORCID account and associate it with their account on our manuscript submission system. For further information see: https://www.rsc.org/journals-books-databases/journal-authors-reviewers/processes-policies/#attribution-id

I look forward to receiving your revised manuscript.

Yours sincerely,
Linda Hung
Associate Editor
Digital Discovery
Royal Society of Chemistry

************


 
Reviewer 1

The paper “Understanding and improving zeroth-order optimization methods on AI-driven molecule optimization” by Lo and Chen describes the application of zeroth-order (ZO) optimizers to the problem of exploring chemical space. In particular, the authors discuss the application of ZO gradient descent (ZO-GD hereafter), ZO-signGD and ZO adaptive momentum (ZO-Adam) to the problem of Query Molecule Optimization (QMO). The authors start from the results of previous studies (particularly Hoffman et al.) and set out to test the three ZO methods in depth on an extensive set of benchmarks from the Guacamol project.
In the Methods section, the authors describe the nature of ZO methods, discuss their advantages/disadvantages with respect to other derivative-free methods such as evolutionary algorithms and Bayes methods, and summarize the basis of ZO estimators and optimizers and the general algorithm. They also discuss the application of ZO methods to a “shallow” problem (finding substantially new molecules), a “deep” problem (improving an existing molecule for a given property) and a mixed approach which combines ZO and other optimization methods. Finally, in the Results section they present the performance of ZO-GD, ZO-signGD and ZO-Adam in three Guacamol tasks: perindopril_mpo (varying the number of aromatic rings in perindopril keeping the properties), zaleplon_mpo (varying from the formula of zaleplon), and deco_hop (maximizing SMILES similarity under the constraint of forbidden SMARTS patterns).
Overall, the paper is well organized and the goals and means are clearly explained. The subject is a perfect match for the Digital Discovery readership and I recommend it for publication in the journal, provided the following points are considered:
1. The authors rightly address a limit of feasibility in their study, i.e. the problem of obtaining a molecule which can be reasonably synthesized. I wonder if this problem can be addressed by some kind of heuristic or score and be introduced in the goal.
2. In an exploration of a random point in the chemical space, it is not possible to get stuck in a local neighbourhood since many moves from the start would fail to generate a stable structure.
3. In figure 1 it is not clear to me whether it refers to any two random vectors or to some sort of average. I would guess the second but I failed to find it in the text. More generally I think that in a study with just three figures it would be beneficial to use more extensive figure captions. I also do not clearly understand how two vectors are enough to describe the landscape.
4. Figure two is perhaps the main result of the paper. However, it is very hard to read, with small symbols and not very sharp colours; it might also be improved by adding panels on the flat zones of the graphs, where the curves overlap. Why did they use so few markers, and on different abscissas? Getting to the point, the authors state that ZO-signGD outperforms the other methods but the difference does not seem that big in several cases; I’d rather say that ZO-GD tends to underperform with respect to the other methods as far as I can read the figure. Also (and this is one of the main weaknesses of the study) since this is one of the two results of the study it should be discussed more quantitatively and on a more extensive set of tests; or at least they should discuss the validity of the three tests and the related computational effort. Along the same lines, the authors should also discuss how and why they determined the number of iterations.
5. Figure 3 presents the performance of baseline methods on their own and combined with ZO methods. First of all, I would like a comment on the flat orange “gpbo” line in the second panel. Second, if the results are not fully conclusive, as the authors themselves state, why not complete the study with some other conclusion? It’s a bit odd to conclude a manuscript with an uncertain message even if the idea of hybrid methods looks quite sensible.
6. The authors should stress the nature of each test with respect to the two categories of problems.

Reviewer 2

This paper builds on earlier work of one of the authors, cited as reference 8, on the optimization of molecular structures with respect to some molecular property. The difficulty of doing this clearly depends on the property of interest. The discussion refers to activity cliffs, infrequent optima and large and flat unfavourable regions of chemical space from "a black box oracle returning a scalar corresponding to a molecular property of interest".

The results section lists three tasks within Guacamol. The link to Guacamol given in the paper to the Benevolent AI website no longer works. https://github.com/BenevolentAI/guacamol should be given in the paper.

The paper applies three optimisation methods to these three examples. This is interesting, but not a major advance. The generality of the conclusions that can be drawn from this study seems to be overstated.

The paper refers to ESI, but this does not seem to be available to me.

Figure 1 shows visualisations based on 'two random directions'. The significance of the three different versions of each task is not explained.

I cannot fully assess this paper without access to the ESI. However, it seems to be a reasonable study which extends the results reported in reference 8.

Reviewer 3

The authors describe an investigation into the use of zeroth-order optimization methods for latent vector space molecule generation methods. The article extends the work of others as noted and referenced.
The work is publishable once the issues below have been addressed.
The methods are clearly described.
Some further details are required on the benchmark studies.
● Include images of the three target structures
● Provide a table of the pi, cj, nj corresponding to the values in equation 1
● Which fingerprint is being used for similarity calculations?
● More detail on Figs 2 and 3. Is this the average score over multiple runs? What does the shading represent?
● Why are the scores not approaching 1? For the Zaleplon case for example adding a Fluorine or methyl group to one of the aromatic rings would lead to very similar structures with a different MF. How would these score?
● Furthermore, showing structures for some of the highest scoring molecules would help readers understand how the search process is working. Is the failure to identify compounds with a higher score a problem with the molecule generation approach, an artifact of the scoring function or a limit of the optimization from random start points getting stuck in local minima?


 

Dear Editor,

Please see our attached response letter for the revision report.

This text has been copied from the PDF response to reviewers and does not include any figures, images or special characters:

Manuscript ID: DD-ART-04-2023-000076
Title: Understanding and improving zeroth-order optimization methods on AI-driven molecule optimization
Authors: Elvin Lo, Pin-Yu Chen
General Response
We thank the editor for the efficient handling of our submission, and we thank the reviewers for their constructive review comments. In the revised version, we have addressed all of the reviewers’ comments and updated the manuscript accordingly. The major changes are highlighted in blue. We summarize the key revisions below and provide separate responses to each reviewer’s comments. The main additions are:
1. We have provided much greater detail of our benchmark setup and rearranged some information on experimental details for better readability. This also includes adding a figure, now Fig. 1, to show the target structures of the Guacamol objectives.
2. We have discussed our experimental results and conclusions with greater specificity to better convey our results without overstating claims. This includes adding a figure, now Fig. 3, to show some QMO-optimized molecules from our experiments. Other figure captions have also been expanded for greater detail.

Response to Reviewer 1 (R1)
We thank the reviewer for their positive feedback on our submission and for recommending our work for publication after their points are addressed. With the reviewer’s detailed feedback, we have elaborated on technical details (including discussion of addressing the synthesizability problem and visualizing high-dimensional loss landscapes), improved figures and captions, and revised our Results and Conclusion sections to provide more detailed and precise discussion of the benchmark studies and results. In what follows, we address each of the reviewer’s comments.

R1Q1: The authors rightly address a limit of feasibility in their study, i.e. the problem of obtaining a molecule which can be reasonably synthesized. I wonder if this problem can be addressed by some kind of heuristic or score and be introduced in the goal.
Response: To address the synthesizability problem, synthesizability scores of molecules may be used to modify the optimization objective function. We have expanded on our discussion of this in the Conclusion section. To provide better insight into QMO’s proposed molecules (and their synthesizability), we have added Fig. 3 to display some QMO-optimized molecules and their synthetic accessibility (SA) scores.
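One way to picture the idea of folding a synthesizability score into the objective is the minimal sketch below. It is purely illustrative: the function name `penalized_score`, the linear penalty and the `weight` parameter are assumptions made here, not the formulation used in the manuscript. It relies only on the usual convention that SA scores run from 1 (easy to synthesize) to 10 (hard to synthesize).

```python
def penalized_score(property_score, sa_score, weight=0.5):
    """Combine a property score in [0, 1] with a synthetic accessibility penalty.

    Hypothetical helper: the linear penalty and the default weight are
    illustrative assumptions, not the authors' objective.
    sa_score is assumed to lie in [1, 10], with 1 meaning easy to synthesize.
    """
    sa_penalty = (sa_score - 1.0) / 9.0  # rescale the SA score to [0, 1]
    return property_score - weight * sa_penalty
```

With such a form, a molecule that is trivial to make keeps its property score unchanged, while a hard-to-make molecule is pushed down the ranking by up to `weight`.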
R1Q2: In an exploration of a random point in the chemical space, it is not possible to get stuck in a local neighbourhood since many moves from the start would fail to generate a stable structure.
Response: ZO methods follow noisy (estimated) gradients to search for the next candidate at each iteration, so they may get stuck if the loss landscape has many saddle points (see our loss landscape visualizations in Fig. 2). We agree with the reviewer that a totally random search can avoid getting stuck at bad local minima, but the hit rate (success of finding a valid candidate) can be quite low, especially with query constraints.
R1Q3: In figure 1 (now Fig. 2) it is not clear to me whether it refers to any two random vectors or to some sort of average. I would guess the second but I failed to find it in the text. More generally I think that in a study with just three figures it would be beneficial to use more extensive figure captions. I also do not clearly understand how two vectors are enough to describe the landscape.
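The notion of following a noisy, estimated gradient can be sketched as follows. This is a generic random-direction ZO estimator written for a loss to be minimized, not the authors' code: the names `zo_gradient` and `zo_step`, the smoothing parameter `beta`, the learning rate and the default number of queries are all assumptions chosen for illustration. Only the plain (ZO-GD) and sign-based (ZO-signGD) updates are shown; ZO-Adam is omitted for brevity.

```python
import numpy as np

def zo_gradient(f, x, num_queries=20, beta=0.1, rng=None):
    """Estimate the gradient of f at x using function values only."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    fx = f(x)
    grad = np.zeros(d)
    for _ in range(num_queries):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # random unit direction
        # Finite difference along u, projected back onto u.
        grad += (f(x + beta * u) - fx) / beta * u
    return d * grad / num_queries

def zo_step(f, x, lr=0.05, method="gd", **kwargs):
    """One descent step: 'gd' uses the estimate as is, 'sign' keeps only its sign."""
    g = zo_gradient(f, x, **kwargs)
    if method == "sign":
        g = np.sign(g)
    return x - lr * g
```

Because each step depends on a handful of random directions, the estimate is noisy; the sign variant discards the magnitude of the estimate and keeps only the per-coordinate direction.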
Response: To visualize the landscapes, we use two random vectors instead of an average. Projecting to two random vectors is a common approach to visualize high-dimensional loss landscapes (see reference 33). We have expanded our caption of the Guacamol function landscapes (now Fig. 2) to describe in detail how the plots were created. Other figure captions have been elaborated also.
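The two-random-direction projection described in this response can be sketched as below. This is a minimal illustration under stated assumptions: the function name `landscape_slice`, the normalization of the directions to unit length, and the grid radius and resolution are choices made here and may differ from how the published plots were produced.

```python
import numpy as np

def landscape_slice(f, center, radius=1.0, steps=21, rng=None):
    """Evaluate f on a 2D grid spanned by two random directions around center."""
    rng = np.random.default_rng() if rng is None else rng
    d1 = rng.standard_normal(center.size)
    d2 = rng.standard_normal(center.size)
    d1 /= np.linalg.norm(d1)  # unit-length directions (an assumption)
    d2 /= np.linalg.norm(d2)
    alphas = np.linspace(-radius, radius, steps)
    betas = np.linspace(-radius, radius, steps)
    values = np.empty((steps, steps))
    for i, a in enumerate(alphas):
        for j, b in enumerate(betas):
            values[i, j] = f(center + a * d1 + b * d2)
    return alphas, betas, values
```

The resulting `values` array can be passed to any surface or contour plotting routine. The slice shows only one random 2D cut through the high-dimensional space, so it conveys qualitative features such as flatness or ruggedness rather than the full landscape.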
R1Q4.1: Figure two is perhaps the main result of the paper. However, it is very hard to read, with small symbols and not very sharp colours; it might also be improved by adding panels on the flat zones of the graphs, where the curves overlap. Why did they use so few markers, and on different abscissas? Getting to the point, the authors state that ZO-signGD outperforms the other methods but the difference does not seem that big in several cases; I’d rather say that ZO-GD tends to underperform with respect to the other methods as far as I can read the figure.
Response: First, readability of plots has been improved by adding more markers. Second, we agree with the reviewer that ZO-Adam performs similarly to ZO-signGD in some cases, specifically on the perindopril mpo task and on the zaleplon mpo task with Q = 100. Accordingly, we have elaborated and modified our discussion (especially in Section 3.2) to (1) emphasize the underperformance of ZO-GD, as the reviewer mentions, and (2) compare ZO-signGD and ZO-Adam in detail. Our discussion of ZO-signGD versus ZO-Adam in Section 3.2 now compares their performance on each of the three Guacamol tasks, providing much greater detail than before and no longer just stating that ZO-signGD outperforms the other methods “in most settings.” Here, our argument now mainly emphasizes the specific evidence that ZO-signGD has improved robustness to the function landscapes of molecular objectives.
R1Q4.2: Also (and this is one of the main weaknesses of the study) since this is one of the two results of the study it should be discussed more quantitatively and on a more extensive set of tests; or at least they should discuss the validity of the three tests and the related computational effort. Along the same lines, the authors should also discuss how and why they determined the number of iterations.
Response: We have added discussion of the benchmarking scheme in the beginning of the Results section. First, we have discussed why Guacamol tasks are a desirable benchmark, also noting that Guacamol tasks are the core of the Practical Molecular Optimization or PMO benchmark by reference 20 (see https://github.com/wenhao-gao/mol_opt), making for convenient comparison to other SOTA methods included in PMO. Second, we have further described each of the selected benchmark tasks to convey their significance.
For our comparison of ZO optimization methods in Section 3.2, our objective was to compare the full optimization trajectories and convergence. Numbers of iterations for each task were chosen to be sufficiently high for the ZO methods to converge. In practice, determining the number of iterations would depend on oracle query budgets.
R1Q5: Figure 3 (now Fig. 5) presents the performance of baseline methods on their own and combined with ZO methods. First of all, I would like a comment on the flat orange “gpbo” line in the second panel. Second, if the results are not fully conclusive, as the authors themselves state, why not complete the study with some other conclusion? It’s a bit odd to conclude a manuscript with an uncertain message even if the idea of hybrid methods looks quite sensible.
Response: The flat “gpbo” line correctly reflects that the algorithm could not find any better molecule as zaleplon mpo is a highly difficult task among Guacamol objectives (see reference 19). Reference 20 also shows the optimization curves of many SOTA algorithms on Guacamol objectives in Appendix A.2, similarly finding that many SOTA algorithms struggle on zaleplon mpo. Regarding our conclusion on hybrid methods, we have elaborated our discussion in Section 3.2 to discuss why our experiments on hybrid methods, while preliminary, sufficiently serve as proof of concept for the potential of hybrid approaches. Our experiments “show that hybrid approaches successfully improve on the convergence speed of QMO, and the capacity of QMO for local search in chemical space makes it a promising option for refining a molecule in more complex design scenarios to satisfy the numerous property constraints of pharmaceutical drugs.”
R1Q6: The authors should stress the nature of each test with respect to the two categories of problems.
Response: We have expanded our discussion of the benchmark tests in our Results section to include discussion of the Guacamol suite, details of each selected test, and the nature of the tests with respect to the categories of problems identified in Section 2.5.

Response to Reviewer 2 (R2)
We thank the reviewer for their constructive feedback. With the reviewer’s comments, we have improved our discussion of our benchmarking examples, reworked our discussion and claims to avoid overstatements, and clarified technical details relating to our figures. We address each of the reviewer’s concerns as given below.

R2Q1: The results section lists three tasks within Guacamol. The link to Guacamol given in the paper to the Benevolent AI website no longer works. https://github.com/BenevolentAI/guacamol should be given in the paper.
Response: The link has been added to the Results section. We also elaborate that we use specifically the implementation of the Therapeutic Data Commons (TDC), reference 28, at https://tdcommons.ai.
R2Q2: The paper applies three optimisation methods to these three examples. This is interesting, but not a major advance. The generality of the conclusions that can be drawn from this study seems to be overstated.
Response: We have added discussion of our benchmarking scheme and its merits in the beginning of the Results section. First, we have described each of the selected benchmark tasks in much greater detail to better convey the scope of their significance. Second, while the benchmarking of molecule optimization algorithms remains a community-wide challenge (and no ubiquitous benchmarks have yet emerged), we have elaborated on the merits of Guacamol. For example, using Guacamol makes for convenient comparison to SOTA methods in the open-source Practical Molecular Optimization or PMO benchmark by reference 20 (see https://github.com/wenhao-gao/mol_opt).
To ensure our claims are not overstated, we have made several revisions. In our Conclusion, we have added to our list of limitations to mention that our results may be biased to similarity-based oracles. We have reworded several sentences to better emphasize that our conclusions are drawn specifically from our experimentation on Guacamol. We also emphasize that our goal is to “better characteriz[e] the performance of ZO methods for molecule optimization and provid[e] preliminary experiments with hybrid approaches as a proof of concept [to better] inform future applications of QMO for drug discovery” (from a new sentence in our Conclusion section).
R2Q3: Figure 1 (now Fig. 2) shows visualisations based on ’two random directions’. The significance of the three different versions of each task is not explained.
Response: We have added to both Section 2.2 (in the Methods) and Section 3.2 (in the Results) to highlight the significance of Q. The caption of the function landscapes figure (now Fig. 2) has been expanded to explain how the visualizations were created with two random directions.

Response to Reviewer 3 (R3)
We thank the reviewer for their detail-oriented suggestions and for recommending our work for publication after the comments are addressed. Most significantly, we have expanded Section 3 to describe the benchmark studies, including discussion of the Guacamol suite and specifics on the selected objectives, and added Fig. 3 showing several QMO-optimized molecules. We have also added clarifications on our methods, details on figures in their captions, and a table in our ESI to more conveniently provide hyperparameter details.

R3Q1: Include images of the three target structures
Response: We have added another figure, Fig. 1, to show the similarity targets and provide greater details on the functions.
R3Q2: Provide a table of the pi, cj, nj corresponding to the values in equation 1
Response: Equation (1) provides a possible convenient formulation of the optimization objective score to consider many properties and constraints. As discussed in Section 2.5, this formulation is used in reference 8. However, we do not use this formulation in our experiments with the Guacamol objectives. We have elaborated to clarify this in the Methods section. On a similar note, we have added details in another table, Table D1, to our ESI to provide better details of our experimental setup.
R3Q3: Which fingerprint is being used for similarity calculations?
Response: We have elaborated on our descriptions of the selected Guacamol objectives in the Results section, including the fingerprints used for similarity calculations.
R3Q4: More detail on Figs 2 and 3. Is this the average score over multiple runs? What does the shading represent?
Response: Yes, the curves are the average score of multiple runs and exact run details are included in the beginning of the Results section. The shading represents the standard deviation of the runs. The figure captions have been expanded to convey this information.
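The averaging described here amounts to the computation sketched below. It is a minimal illustration only: the function name `summarize_runs` and the choice of the population standard deviation (NumPy's default) are assumptions, since the response does not say which variant was used.

```python
import numpy as np

def summarize_runs(scores):
    """Return the mean curve and the mean +/- one standard deviation band.

    scores has shape (num_runs, num_iterations): one row per independent run.
    """
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean(axis=0)
    std = scores.std(axis=0)  # population std; an assumption
    return mean, mean - std, mean + std
```

The mean array gives the plotted curve, and the lower and upper arrays bound the shaded region at each iteration.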
R3Q5: Why are the scores not approaching 1? For the Zaleplon case for example adding a Fluorine or methyl group to one of the aromatic rings would lead to very similar structures with a different MF. How would these score?
Response: We apologize for the confusion; the zaleplon mpo task does not search just for a maximally different MF, but targets specifically the MF of C19H17N3O2 (while the MF of zaleplon is C17H15N5O). Specifically, the zaleplon mpo function returns the geometric mean of two scores: (1) Tanimoto similarity to zaleplon (calculated with ECFC4 fingerprints), and (2) an isomer function that targets the aforementioned molecular formula. The isomer function depends on the number of atoms of each element type in the target formula, and the total number of atoms in the molecule. We have elaborated on our descriptions of the Guacamol objectives to clarify this information. Each of these two scores has maximum value 1, but they cannot be simultaneously maximized. So the global maximum of the zaleplon mpo function, which is their geometric mean, is not 1.
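The effect of taking a geometric mean of two scores that cannot both reach 1 can be seen with a small sketch. The helper `geometric_mean` and the numbers in the usage note are illustrative placeholders, not values from the benchmark.

```python
import math

def geometric_mean(scores):
    """Geometric mean of non-negative scores; zero if any score is zero."""
    return math.prod(scores) ** (1.0 / len(scores))
```

For example, if one component is at its maximum of 1 while the other is only 0.25, `geometric_mean([1.0, 0.25])` evaluates to 0.5. The combined score reaches 1 only when every component equals 1, which the response explains is not attainable for this task.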
R3Q6: Furthermore, showing structures for some of the highest scoring molecules would help readers understand how the search process is working. Is the failure to identify compounds with a higher score a problem with the molecule generation approach, an artifact of the scoring function or a limit of the optimization from random start points getting stuck in local minima?
Response: Fig. 3 has been added to show some QMO-optimized molecules (the same molecules that were used to create Fig. 2, the function landscapes). The failure to identify compounds with a higher score is largely an artifact of the scoring functions, as discussed in our response to R3Q5. The selected Guacamol objectives are the (arithmetic or geometric) means of several scoring functions, which are each bounded by 0 and 1. But as it is impossible to simultaneously maximize every scoring function, the global maxima of the selected Guacamol functions are not 1. QMO’s highest scoring molecules are reasonable when compared with SOTA algorithms, including our paper’s baselines as well as results from other papers like reference 20. But of course, the possibility of QMO getting stuck in local minima is certainly another limiting factor besides the nature of the scoring functions.




Round 2

Revised manuscript submitted on 22 Jul 2023
 

09-Aug-2023

Dear Dr Chen:

Manuscript ID: DD-ART-04-2023-000076.R1
TITLE: Understanding and improving zeroth-order optimization methods on AI-driven molecule optimization

Thank you for submitting your revised manuscript to Digital Discovery. I am pleased to accept your manuscript for publication in its current form. I have copied any final comments from the reviewer(s) below. One of the reviewers has noted a swapped caption. Please resolve this issue when you receive your proof corrections.

You will shortly receive a separate email from us requesting you to submit a licence to publish for your article, so that we can proceed with the preparation and publication of your manuscript.

You can highlight your article and the work of your group on the back cover of Digital Discovery. If you are interested in this opportunity please contact the editorial office for more information.

Promote your research, accelerate its impact – find out more about our article promotion services here: https://rsc.li/promoteyourresearch.

If you would like us to promote your article on our Twitter account @digital_rsc please fill out this form: https://form.jotform.com/213544038469056.

We are offering all corresponding authors on publications in gold open access RSC journals who are not already members of the Royal Society of Chemistry one year’s Affiliate membership. If you would like to find out more please email membership@rsc.org, including the promo code OA100 in your message. Learn all about our member benefits at https://www.rsc.org/membership-and-community/join/#benefit

By publishing your article in Digital Discovery, you are supporting the Royal Society of Chemistry to help the chemical science community make the world a better place.

With best wishes,

Linda Hung
Associate Editor
Digital Discovery
Royal Society of Chemistry


 
Reviewer 3

The authors have addressed satisfactorily the questions raised in the previous review round.
One minor comment.
Figure 1, e) and f) do not match the figure caption (they are swapped).

Reviewer 1

In this revised version, Lo and Chen have addressed all of my concerns and, as far as I understand, globally improved the quality of the manuscript. In particular, they have answered questions R1Q1, R1Q2 (about which I was partly mistaken, as they explain), R1Q3 (the captions now read very well even if I still fail to see the usefulness; that is probably a shortcoming of mine) and R1Q4.1. I also notice that some of my points were raised by reviewer 2 as well (e.g. R2Q3). Concerning R1Q4.2 and R1Q5, which were my main criticisms, the authors somewhat increased the level of detail and readability of the results. I still find the use of tiny markers not the best but here we are in the realm of personal aesthetic taste. Only, if possible, I’d avoid a dark blue highlight for a black text in the future.
To conclude I recommend the paper for publication.




Transparent peer review

To support increased transparency, we offer authors the option to publish the peer review history alongside their article. Reviewers are anonymous unless they choose to sign their report.

We are currently unable to show comments or responses that were provided as attachments. If the peer review history indicates that attachments are available, or if you find there is review content missing, you can request the full review record from our Publishing customer services team at RSC1@rsc.org.


Content on this page is licensed under a Creative Commons Attribution 4.0 International license.