From the journal Digital Discovery Peer review history

Physics-informed models of domain wall dynamics as a route for autonomous domain wall design via reinforcement learning

Round 1

Manuscript submitted on 07 Jul 2023
 

21-Oct-2023

Dear Dr K Vasudevan:

Manuscript ID: DD-ART-07-2023-000126
TITLE: Physics-informed models of domain wall dynamics as a route for autonomous domain wall design via reinforcement learning

Thank you for your submission to Digital Discovery, published by the Royal Society of Chemistry. I sent your manuscript to reviewers and I have now received their reports, which are copied below. (I apologize for the long review period; we had multiple rounds of reviewers "ghost" us before we could accumulate a complete set of reviews.)

I have carefully evaluated your manuscript and the reviewers’ reports, and the reports indicate that major revisions are necessary.

Please submit a revised manuscript which addresses all of the reviewers’ comments. Further peer review of your revised manuscript may be needed. When you submit your revised manuscript please include a point by point response to the reviewers’ comments and highlight the changes you have made. Full details of the files you need to submit are listed at the end of this email.

Digital Discovery strongly encourages authors of research articles to include an ‘Author contributions’ section in their manuscript, for publication in the final article. This should appear immediately above the ‘Conflict of interest’ and ‘Acknowledgement’ sections. I strongly recommend you use CRediT (the Contributor Roles Taxonomy, https://credit.niso.org/) for standardised contribution descriptions. All authors should have agreed to their individual contributions ahead of submission and these should accurately reflect contributions to the work. Please refer to our general author guidelines https://www.rsc.org/journals-books-databases/author-and-reviewer-hub/authors-information/responsibilities/ for more information.

Please submit your revised manuscript as soon as possible using this link:

*** PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm. ***

https://mc.manuscriptcentral.com/dd?link_removed

(This link goes straight to your account, without the need to log on to the system. For your account security you should not share this link with others.)

Alternatively, you can login to your account (https://mc.manuscriptcentral.com/dd) where you will need your case-sensitive USER ID and password.

You should submit your revised manuscript as soon as possible; please note you will receive a series of automatic reminders. If your revisions will take a significant length of time, please contact me. If I do not hear from you, I may withdraw your manuscript from consideration and you will have to resubmit. Any resubmission will receive a new submission date.

The Royal Society of Chemistry requires all submitting authors to provide their ORCID iD when they submit a revised manuscript. This is quick and easy to do as part of the revised manuscript submission process.   We will publish this information with the article, and you may choose to have your ORCID record updated automatically with details of the publication.

Please also encourage your co-authors to sign up for their own ORCID account and associate it with their account on our manuscript submission system. For further information see: https://www.rsc.org/journals-books-databases/journal-authors-reviewers/processes-policies/#attribution-id

I look forward to receiving your revised manuscript.

Yours sincerely,
Dr Joshua Schrier
Associate Editor, Digital Discovery

************


 
Reviewer 1

Benjamin R. Smith et al. demonstrate a reinforcement learning based on the domain wall measured by autonomous PFM platform and verified the results using phase-field simulations. Although domain walls have gathered continuous attention due to many interesting physical properties, the models for the domain walls are not fully compared quantitatively with experimental data. In this regard, the autonomous domain wall manipulations and subsequent reinforcement learning on the domain wall can be very interesting. In addition, the manuscript is well organized. Thus, it can be published in Digital Discovery. However, basic information on the experimental and theoretical conditions is not sufficiently provided.
- As discussed by the authors, the prediction of 2-D image is not feasible due to the limited data currently available. Can the authors discuss more on the limited data?
- More details are necessary for the experimental and theoretical conditions. For examples,
1) Voltage and pulse width ranges for 801 useable transitions on page 8
2) Basic experimental information for the ferroelectricity of PTO thin film such as hysteresis loop
3) Pulse voltage and width for Initial wall in Fig. 3(b)
-There are some typos. For example, inital in Fig. 6.

Reviewer 2

Review:

Physics-informed models of domain wall dynamics as a route for autonomous domain wall design via reinforcement learning


Smith et al. describe the use of autonomous PFM data collection to modify domain wall positions in ferroelectric PbTiO3 films. Trained on PFM data, they were able to predict domain wall displacements for different pulse lengths and strengths. At this stage, it only works for the "1-D structure of the domain wall" and does not consider non-180° walls. This is evident in Fig 4, which shows that the expected domain wall displacement increases with increasing stimulation time or strength— the physically expected result. They further discuss how a "reinforced learning policy" can be used to manipulate ferroelectric domains.

While this work focuses on a highly specialized, unphysical situation, it does provide a prediction that might be of interest to those wishing to model domain wall properties and dynamics. However, this work has several significant challenges, detailed below. Because of these challenges I cannot recommend it for publication in its current form.

Scientific questions:
In bulk ferroelectrics, 180° domain walls are not expected to have any elastic energy, as they separate structurally identical areas and thus are not ferroelastic domain walls. So, what is the origin of the assumed elastic energy across 180° ferroelectric domain walls in thin films? Perhaps a surface area energy argument would be more appropriate to explain why the walls prefer to be smooth rather than "bulge"?

No color scheme is provided for the phase field simulations in Fig 3. Do the blue and green stripes represent different phases of the ferroelectric domain response? If so, does the red and yellow indicate other domain variants? If that's the case, wouldn't this suggest non-180° domain patterns within the bulges? Maybe the authors could comment on how this impacts the assumptions of their model?
("...the writing procedure produces nominally 180° domain walls...")

Do the authors have confidence in the "phase diagram" in Figure 4? If so, could they comment on the structure within the diagrams? For example, why does -4V for 240 ms cause much less movement than either longer or shorter pulses of the same voltage?


General comment:
I was very disappointed that there was no effort to discuss or consider the physical implications of the predicted domain wall dynamics beyond a plot suggesting that increasing the stimulation period, or intensity, results in more movement. This seems like a rather obvious result. It was particularly disheartening as the plots in Fig. 4 appeared to have some unexpected structures that were not discussed. Given that "domain wall dynamics" is part of the title, this seems like a significant oversight.

There is a lot of technical detail in the main text that could be moved to the supplementary. In the reviewer's opinion – after a commendable introduction - it reads too much like a lab project.

Specific comments:
1. Though the abstract is well-written, it does not accurately represent the main text. For instance, I cannot locate a discussion of two of the critical points mentioned in the abstract and the letter to the editor within the main text.

1.i) “Here, we present a reinforcement learning based experimental workflow deployed on an autonomous PFM platform that enables automated data collection of domain walls interacting with defects.”
I am unsure where they discuss the effect of domain walls interacting with defects, as the word “defect” does not appear after the introduction.

1.ii) “The surrogate enables generation of ‘phase diagrams’ of the domain wall, conditional on initial structure, and highlights the importance of domain wall pinning and elastic effects on repeated wall modification attempts.”
Intuitively the statement about pinning and elastic effects seems true but it is unclear how their model highlights the importance of such effects. Outside of the introduction they only mention elastic energy effects twice and they might be conflating elastic (structural) energies with domain wall surface area arguments. (See previous question about why 180° ferroelectric domain walls have an elastic energy cost.)

I also found no discussion about repeatability throughout the main text, only a statement noting that different starting conditions give different results.


2. The introduction is well written and gives a good overview of the ferroelectrics community. Perhaps an additional reference for ferroelastic domain walls, e.g. one of the reviews by Carpenter or Salje, would be nice.


3. “In all cases, the tip follows a user-defined (implicit, e.g., via masks, or explicit) trajectory, and voltage is applied to the tip for specific moments to induce changes in the underlying domain structure.”
This is a little misleading as one can mechanically write ferroelectric domain walls [1]. Please rephrase.

4. “This comprises a sequential decision-making task that is conditional on the pre-existing domain structure, the existence of defects, and the current state of the SPM tip, along with the type of domain wall itself, all of which can change spatially, temporally, or both.”
A nice turn of phrase, although the authors might be underselling themselves a little. It is known that electric fields applied to SPM tips can significantly change the defect structure of the sample, making the situation quite complex. The authors should not feel required to add a comment on this but it might be nice to mention – one recent example of such electric field modification came from the Meier team [2].

5. The textbox in figure 1 (b) is too small, please make it larger.

6. “Since the wall displacement mostly occurs in the section of the wall around where the bias was applied, the dynamics model only predicts displacements for the wall in these local regions”.
As a matter of academic interest, do the authors often see movement of domain walls in areas where the electric field wasn’t applied? It might be a nice topic for a future study to see remote correlations of domain walls.

7. Please include a citation to a work discussing the growth and bulk characterisation of your samples – so others see you are starting from high quality ferroelectric films.

8. “Because images are also captured immediately following when the bias is applied, our model’s predicted domain wall structures can be compared to the actual, observed domain wall structures.”
Does this mean that back-switching was observed in some areas? If so, was the model able to predict which areas back-switched and which were stable?

9. “The first additional term we add to the loss function is a term that emphasizes the agreement of the prediction with known local physics. In this experiment, the direction of displacement for the domain wall should align with the sign of
the voltage for the bias that was applied. When this is not the case and the surrogate model’s predictions do not agree with this physical prior, such predictions should be penalized.”
While the concept is sound and sensible, this example is tricky because some reports have claimed that domains can switch in opposite direction [3]. Perhaps it could be mentioned that such phenomena are not expected in this sample, thus validating your penalisation?

10. “However, the same was not true for pulse width, potentially due to changes in tip condition or other exogeneous variables.”
Again, here the established effects of electric field induced changes to the defect state could be mentioned [2] and the references therein. While the authors might, very commendably, be avoiding self-citations the work by Kalinin et al [4] seems particularly worth mentioning.

11. The text and numbers of Figure 4 are very hard to read, the x axis is particularly challenging. Please make this clearer. They are also encouraged to present a “cartoon” of the phase diagrams, to show the general trends and salient points.

12. Given Figure 4 shows the expected movement as response to stimuli strength and application time, have the authors tried to calculate effective inertia of domain walls? Or any other physically interesting properties?

13. From the referee’s perspective it is unclear what advantage the authors “reinforced learning policy” has over just writing a domain wall? Please make this clearer and perhaps the authors could include some before and after PFM images to show how they can change the domain state? Or show how they can create pattern which would not be trivial with a direct command from a human?

14. For the video links in the supplementary, please label what each video shows – including the scale the false colours represent.

15. “The sample is then imaged again, either with single frequency or band-excitation PFM..”
There is a major difference between the two techniques. Please state which is used for which image. In addition, state AC voltage used and the frequency (-range) used for the PFM.

16. Please provide a scale to show what the false colours indicate in Fig. 1c. If the large vertical boundary shown in the PFM phase is a 180° wall, please explain why the amplitude of the PFM response is different on each side.

17. It would be nice for the readers if the authors noted why they chose to use PbTiO3 as the material to use for this study.

18. Some of the needles in Fig. 1c touch the wall. Surely the wall will have different switching properties at these points, as they represent different orientations of the ferroelastic domain structure? Would this not be quite problematic for training data?


References:
[1] Mechanical Writing of Ferroelectric Polarization
Lu, et al., 2012.
https://doi.org/10.1126/science.1218693

[2] Conductivity control via minimally invasive anti-Frenkel defects in a functional oxide
Evans, et al., 2020.
https://doi.org/10.1038/s41563-020-0765-x

[3] Anomalous Motion of Charged Domain Walls and Associated Negative Capacitance in Copper–Chlorine Boracite
Guy, et al., 2021.
https://doi.org/10.1002/adma.202008068

[4] The role of electrochemical phenomena in scanning probe microscopy of ferroelectric thin films
Kalinin, et al., 2011.
https://doi.org/10.1021/nn2013518

Reviewer 3

In this work, the generation of domain walls was conducted through autonomous experimentation using a PFM. Scanning images were used to develop a physics-informed neural network to predict the response of a domain wall upon an applied bias. This model was then used to conduct reinforcement learning within automated experiments to generate structure-specific domain walls. In all, I think this is a well-written manuscript but it still needs to undergo a few minor revisions before being accepted. Below are a few areas which I think can be addressed in further versions of the paper:

1) In Figures 3 and 4, phase-field simulations and the dynamics model were used to predict displacements in the domain wall upon the application of a bias. Are there any experimental data (from the autonomous experiments) which can be plotted or shown with these predictions to show the performance of these methods. It seems like qualitatively the physics of the displacement (direction) is captured, but would be interesting to see how they perform in a more quantitative sense.

2) Following the first comment, Figure S1 does seem to show a more quantitative comparison of the dynamics model prediction and experimental outputs. By just examining similar graphs in the provided Jupyter notebook of the model training, these seem to perform a lot better. Can you give a comment on the overall error of the model prediction compared to experiment? Especially in terms of either absolute displacement or average displacement?

3) In Figure 4a, I would expect a more 'symmetric' or 'isotropic' behavior of the domain wall phase diagram (when scanning from negative to positive applied potential) when starting with a flat wall. This would go with the comments on monotonic behavior encoded in the loss function, but the phase diagram in Figure 4a doesn't seem to show this. Why do you think this is the case for the predicted model?

4) In Figure 4b, the displacements in response to a negative potential don't seem to change the initial bulged wall back to a more ideal flat-wall state. This is something which was predicted in Figure 3b. Can comments be made on why there is a difference in the predicted behavior here?

5) When performing the reinforcement learning experiments, please comment on what the stopping criterion was for the experiments

6) For the code review, it seems like during the model training, the data was split using a random splitting technique. Just be sure to call out the data splitting method more explicitly either in the manuscript and/or supporting information.


 

Dear Dr. Schrier,

We are delighted to resubmit for consideration our manuscript, “Physics-informed models of domain wall dynamics as a route for autonomous domain wall design via reinforcement learning” to RSC Digital Discovery.

We are very thankful to the three referees who critiqued our initial manuscript. We have responded to their criticisms in detail in the response document, and we trust our paper is now significantly improved from the initial version.

We have performed additional experiments to ascertain the domain structure as suggested by the referees, as well as further simulations and analysis of the surrogate model. As a result, we have added two new figures to the paper, moved one figure to the supplementary, and modified two other figures with additional or newer data. This explains the long delay in our resubmission.

We look forward to hearing your decision on our manuscript soon. We sincerely thank the referees for their comments which have substantially improved our work.

Yours sincerely,

Rama Vasudevan on behalf of the authors.

This text has been copied from the Microsoft Word response to reviewers and does not include any figures, images or special characters:

************
REVIEWER REPORT(S):

Referee: 1


Comments to the Author

Benjamin R. Smith et al. demonstrate a reinforcement learning based on the domain wall measured by autonomous PFM platform and verified the results using phase-field simulations. Although domain walls have gathered continuous attention due to many interesting physical properties, the models for the domain walls are not fully compared quantitatively with experimental data. In this regard, the autonomous domain wall manipulations and subsequent reinforcement learning on the domain wall can be very interesting. In addition, the manuscript is well organized. Thus, it can be published in Digital Discovery.

We thank the referee for their positive appraisal of our manuscript.

However, basic information on the experimental and theoretical conditions is not sufficiently provided.
- As discussed by the authors, the prediction of 2-D image is not feasible due to the limited data currently available. Can the authors discuss more on the limited data?

We thank the referee for this important question. Much more data is needed in the 2D case because we know from prior studies that the surrounding domain structure has a substantial impact on the underlying dynamics at a given position. Given that this is a heterogeneous sample, there are many possible domain configurations in the vicinity of a domain wall, and the additional complicating factor of the wall structure itself (for example, bulged or not) will further affect how it responds to electric fields (for instance, see Physical Review B 82.2 (2010): 024111). Adequately sampling this configurational space would require, at minimum, tens of thousands of transitions, which can only realistically be acquired through high-throughput scanning methods. We discuss this in the revised paper on page [24]. As an alternative, it may be possible to simulate thousands of realizations, although this will also take considerable effort and will not perfectly capture all relevant physics, e.g., such simulations will not correctly capture the distribution of the sub-surface defects that cause wall pinning.

- More details are necessary for the experimental and theoretical conditions. For examples,
1) Voltage and pulse width ranges for 801 useable transitions on page 8

We have added more details about the film in the revised manuscript including the domain structure, an example of the switching spectroscopy, and details of the experiment.

2) Basic experimental information for the ferroelectricity of PTO thin film such as hysteresis loop

We have added vector PFM scans as well as the hysteresis loop to verify the ferroelectricity of the sample and orient the reader as to the type of domain wall that is being studied. This is now included in the new Fig. 1 in the paper, which we reproduce below (Fig. R1).

Fig. R1: Domain structure and switching of the (110) PbTiO3 thin film. Band-excitation lateral PFM scans of the film are shown in panels (a-c) with the cantilever orientation with respect to the sample axes shown on the left. Polarization vectors are shown in the amplitude image. (d) Phase-field simulation of the domain structure with the coordinate transformation shown above. (e) Band-excitation piezoforce spectroscopy measurement showing the amplitude (blue) and phase (red) of an off-field hysteresis loop captured on the film.

3) Pulse voltage and width for Initial wall in Fig. 3(b)
This is now added to the figure caption, 3.26V / 1.15us as well as to the main text.

-There are some typos. For example, inital in Fig. 6.
We have checked the paper for typographic errors.







Referee: 2

Comments to the Author

Review:
Physics-informed models of domain wall dynamics as a route for autonomous domain wall design via reinforcement learning

Smith et al. describe the use of autonomous PFM data collection to modify domain wall positions in ferroelectric PbTiO3 films. Trained on PFM data, they were able to predict domain wall displacements for different pulse lengths and strengths. At this stage, it only works for the "1-D structure of the domain wall" and does not consider non-180° walls. This is evident in Fig 4, which shows that the expected domain wall displacement increases with increasing stimulation time or strength— the physically expected result. They further discuss how a "reinforced learning policy" can be used to manipulate ferroelectric domains.

While this work focuses on a highly specialized, unphysical situation, it does provide a prediction that might be of interest to those wishing to model domain wall properties and dynamics. However, this work has several significant challenges, detailed below. Because of these challenges I cannot recommend it for publication in its current form.

We thank the referee for their comments and criticisms of our manuscript. We have attempted to answer all questions thoroughly in our rebuttal and in the revised manuscript.

Scientific questions:
In bulk ferroelectrics, 180° domain walls are not expected to have any elastic energy, as they separate structurally identical areas and thus are not ferroelastic domain walls. So, what is the origin of the assumed elastic energy across 180° ferroelectric domain walls in thin films? Perhaps a surface area energy argument would be more appropriate to explain why the walls prefer to be smooth rather than "bulge"?

We thank the referee for this comment. We have performed vector PFM experiments and find that these are not 180° walls (which was originally posited, due to seemingly clear vertical phase contrast in the PFM images). Those revised results are shown in the new Figure 1, and we conclude that the poling results in the formation of 90° domain walls. This is now discussed in the main text.

Additionally, we calculated the elastic energy of the domain wall in the different configurations and find there is considerably larger elastic energy at these walls than at a typical 180° domain wall in e.g., (001) PZT thin films. We find the energy associated with the wall is about 8.2E6 J/m3. For comparison, previous calculations (ref. 1) in (001) PZT thin films show that the energy density of a ferroelastic wall in that system is about 2.0E6 J/m3, and for a ferroelectric wall, it is about 1E6 J/m3.

When a bias is applied, the elastic energy density as a function of time after the bias is turned on varies in a complex manner depending on the initial state of the wall (straight or bulged) and whether positive or negative polarity is applied (see Fig. R2 below). Interestingly, the elastic energy density actually reduces for some ‘bulging’ when positive bias is applied (Fig. R2(b)). When a positive bias is applied to an already bulged wall, the overall elastic energy density does not change significantly over longer time frames (Fig. R2(c)). Conversely, applying negative potential appears to change the elastic energy density more. However, for small voltages (-0.65V for example), when the bulge is mostly eliminated, the energy penalty is quite small. Therefore, we agree with the referee here: significant changes to the elastic energy density are not found, and it appears more likely that an overall surface energy argument is more important. This is now discussed on page 20.



Figure R2. Average elastic energy density (a) for stabilized PTO thin film before applying any bias. (b) when positive bias is applied to flat wall. (c) when positive bias is applied to pre-deformed (bulged) wall. (d) when negative bias is applied to pre-deformed (bulged) wall.


No color scheme is provided for the phase field simulations in Fig 3. Do the blue and green stripes represent different phases of the ferroelectric domain response? If so, does the red and yellow indicate other domain variants? If that's the case, wouldn't this suggest non-180° domain patterns within the bulges? Maybe the authors could comment on how this impacts the assumptions of their model?
("...the writing procedure produces nominally 180° domain walls...")

We apologize for this oversight. The colors are now referred to in the text, and the full color scheme of all variants is in the figure. It is true that the simulation suggests that there may be more domain variants within the bulges. However, we do not see any such variations within our PFM images – it looks like a rather straightforward bending of the wall, and no additional interfaces are created that we can visualize (see the included videos). One of the reasons the phase field model shows these additional phase variants is that the underlying code is quite sensitive to local changes in structure. Given that we do not observe evidence of this in the PFM data, we suggest these are the result of this over-sensitivity in the simulations. This is now mentioned in the revised paper.

Do the authors have confidence in the "phase diagram" in Figure 4? If so, could they comment on the structure within the diagrams? For example, why does -4V for 240 ms cause much less movement than either longer or shorter pulses of the same voltage?

We have confidence in the main trends, but there do seem to be discrepancies, where for instance some higher voltages do not produce the expected displacement. One possible reason is that we do not consider the surrounding domain structure and the data are limited: if a few 4V pulses were applied at a domain wall situated next to a strong pinning site (for instance, one of the needle domains), there would be limited to no motion. The model would fit to these instances, leading to this type of seemingly unphysical result. We attempted to counter this via the addition of physics-based loss regularization, but as can be seen, this is not perfect. Moreover, the situation is likely to be particularly problematic at the lower voltages, where fluctuations in the response are likely to be more position dependent. For higher voltages, the wall will likely be displaced regardless of the underlying pinning strength. In those cases, there is simply not much dependence on the pulse width. We now discuss these issues on page 17.
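
For illustration, the kind of physics-based loss regularization referred to above (penalizing predicted displacements whose direction disagrees with the sign of the applied voltage) could be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names, the hinge form of the penalty, the `weight` parameter, and the convention that a positive voltage produces a positive displacement are all assumptions made for the example.

```python
def sign_penalty(pred_displacements, voltage, weight=1.0):
    """Penalize predicted displacements that oppose the applied voltage.

    Hypothetical helper: the hinge form and `weight` are illustrative
    assumptions, as is the convention that a positive voltage should
    give a positive displacement.
    """
    # d * voltage is negative where the predicted direction disagrees
    # with the bias polarity; only those points contribute.
    violations = [max(0.0, -d * voltage) for d in pred_displacements]
    return weight * sum(violations) / len(violations)


def total_loss(pred, target, voltage, weight=1.0):
    """Mean squared error plus the sign-consistency penalty."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return mse + sign_penalty(pred, voltage, weight)
```

A penalty of this form is zero whenever every predicted displacement has the same sign as the voltage, so it only biases the fit where the physical prior is violated; it cannot by itself remove non-monotonic structure in pulse width.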

General comment:
I was very disappointed that there was no effort to discuss or consider the physical implications of the predicted domain wall dynamics beyond a plot suggesting that increasing the stimulation period, or intensity, results in more movement. This seems like a rather obvious result. It was particularly disheartening as the plots in Fig. 4 appeared to have some unexpected structures that were not discussed. Given that "domain wall dynamics" is part of the title, this seems like a significant oversight.

There is a lot of technical detail in the main text that could be moved to the supplementary. In the reviewer's opinion – after a commendable introduction - it reads too much like a lab project.

We thank the referee for this point. We have conducted further analysis of the surrogate model, to better understand the dynamics being learned, and present this new analysis in the new Fig. 5 in the revised manuscript along with additional modeling and discussion. This figure is reproduced here (Fig. R3).


Fig. R3. Switched areas as a function of voltage and pulse width, for an initial flat wall (a) and a positively bulged wall (b). (c) Velocities of the domain wall calculated for different voltages and 200 ms pulse width (blue) with linear fits in this log v vs 1/E plot shown as a blue dashed line. The good fit indicates a creep regime. Compared to data by Tybell et al. (ref. 2) on PZT films in a different geometry, the calculated slope is significantly lower.

We used the surrogate model to predict the switched areas for different pulse widths and pulse amplitudes, for the flat and bulged wall starting points, with the results shown in Fig. R3(a) and (b) respectively. From Fig. R3(b), it is evident that negative voltages applied to a bulged wall create significant regions of switched polarization, more so than for the flat case.

Perhaps most interestingly, we used the model to predict the domain wall velocities for different voltages assuming a pulse width of 200 ms. For this we assume that the electric field E is ~V/d, where we assume a value of d = 20 nm. The real electric field is likely to be quite complex in such structures, but this serves as a decent upper bound. We can compute the domain wall velocities extracted under this approximation and compare them against those of Tybell et al. (ref. 2) for PZT domain growth. Accordingly, we plot the log of the velocity against 1/E. The results are plotted in Fig. R3(c). The data fit well to a straight line, i.e., evidence of a creep regime; however, the slope is clearly much smaller than that of Tybell et al. Note that the two scenarios are not directly comparable, since in the case of the PZT films the experiments by Tybell et al. were performed with nucleation and growth of domains directly underneath the tip, whereas here we are dealing with extension or contraction of domain walls in a different (in-plane) orientation. The slope is about 6.5 times lower than that of Tybell et al. Again, it is difficult to read much into this, other than to note that (a) the wall appears to be governed by creep dynamics, and (b) the mobility is significantly reduced compared to 180° walls in PZT. This is now mentioned in the text on pages 21 and 22.
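
The log v versus 1/E analysis described above can be sketched as a simple least-squares line fit. This is only an illustrative sketch under stated assumptions: the helper names are hypothetical, the field is taken as |V|/d with d = 20 nm as in the text, and the voltages and velocities below are synthetic values generated from made-up parameters, not data from the paper.

```python
import math

D_M = 20e-9  # length scale d = 20 nm, as assumed in the text


def field_from_voltage(voltage_v, d_m=D_M):
    """Approximate field magnitude E ~ |V| / d, in V/m."""
    return abs(voltage_v) / d_m


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def creep_fit(voltages_v, velocities_m_s):
    """Fit ln(v) against 1/E; a good linear fit suggests creep-like motion."""
    inv_field = [1.0 / field_from_voltage(v) for v in voltages_v]
    ln_velocity = [math.log(v) for v in velocities_m_s]
    return fit_line(inv_field, ln_velocity)


# Synthetic illustration only: E0 and V0 are made-up parameters.
E0 = 5e8   # V/m, hypothetical characteristic field
V0 = 1e-3  # m/s, hypothetical velocity prefactor
voltages = [2.0, 3.0, 4.0, 5.0, 6.0]
velocities = [V0 * math.exp(-E0 / field_from_voltage(v)) for v in voltages]

slope, intercept = creep_fit(voltages, velocities)
print(f"slope = {slope:.3e} V/m, intercept = {intercept:.3f}")
```

In this form the fitted slope equals minus the characteristic field of the synthetic data, which is the quantity a slope comparison such as the factor of about 6.5 quoted above would be based on. Because E is taken as V/d, the fitted slope scales directly with the assumed d.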

Specific comments:

1. Though the abstract is well-written, it does not accurately represent the main text. For instance, I cannot locate a discussion of two of the critical points mentioned in the abstract and the letter to the editor within the main text.


1.i) “Here, we present a reinforcement learning based experimental workflow deployed on an autonomous PFM platform that enables automated data collection of domain walls interacting with defects.” I am unsure where they discuss the effect of domain walls interacting with defects, as the word “defect” does not appear after the introduction.

The defects in this case are predominantly the pre-existing needle-like domains, which can act as pinning sites. We have removed the reference to defect in the abstract, and instead use the term ‘pinning site’.

1.ii) “The surrogate enables generation of ‘phase diagrams’ of the domain wall, conditional on initial structure, and highlights the importance of domain wall pinning and elastic effects on repeated wall modification attempts.” Intuitively, the statement about pinning and elastic effects seems true, but it is unclear how their model highlights the importance of such effects. Outside of the introduction they only mention elastic energy effects twice, and they might be conflating elastic (structural) energies with domain wall surface area arguments. (See previous question about why 180° ferroelectric domains have an elastic energy cost.)

As we discuss previously, we agree with the referee. After performing the new simulations we find there is not a significant elastic energy penalty from the bulged structure, and the surface energy argument is more probable. This is now mentioned on page 20.

I also found no discussion about repeatability throughout the main text, only a statement noting that different starting conditions give different results.

We note in the discussion section that the experiments to construct the training dataset were done over two separate days (on two different microscopes) with different tips (though the same type of tip). Therefore, we have confidence that the data is repeatable – we mix all of this data together in forming the training dataset. We also attempted to train the dynamics model using only the data acquired in a single sitting, and found the performance was significantly worse than when using data from both sittings.

2. The introduction is well written and gives a good overview of the ferroelectrics community. Perhaps an additional reference for ferroelastic domain walls, e.g. one of the reviews by Carpenter or Salje, would be nice.

We have added a reference to Salje’s work on page 4, where the text refers to ferroelastics (ref. 40).

3. “In all cases, the tip follows a user-defined (implicit, e.g., via masks, or explicit) trajectory, and voltage is applied to the tip for specific moments to induce changes in the underlying domain structure.”
This is a little misleading as one can mechanically write ferroelectric domain walls [1]. Please rephrase.

We have reworded this to be:

“Typically, when domain walls are written electrically with the SPM tip, the tip follows a pre-determined trajectory that is either implicit or explicitly defined, with voltage values that are set once and not amenable to feedback during the writing process”.

4. “This comprises a sequential decision-making task that is conditional on the pre-existing domain structure, the existence of defects, and the current state of the SPM tip, along with the type of domain wall itself, all of which can change spatially, temporally, or both.”

A nice turn of phrase, although the authors might be underselling themselves a little. It is known that electric fields applied via SPM tips can significantly change the defect structure of the sample, making the situation quite complex. The authors should not feel required to add a comment on this but it might be nice to mention – one recent example of such electric field modification came from the Meier team [2].

We have included a reference to the Meier group’s recent work in our paragraph:

“Moreover, we also note that the application of the voltage could cause changes to the underlying defect structure, such as by injecting or redistributing oxygen vacancies or other mobile ions, or creating other types of defects (for example, see work by Evans et al.). We do not rule out this possibility, but we can control the degree to which we inject this inductive bias by adjusting the strength of this term in the final loss function.”

5. The textbox in figure 1 (b) is too small, please make it larger.

This has been enlarged.

6. “Since the wall displacement mostly occurs in the section of the wall around where the bias was applied, the dynamics model only predicts displacements for the wall in these local regions”.
As a matter of academic interest, do the authors often see movement of domain walls in areas where the electric field wasn’t applied? It might be a nice topic for a future study to see remote correlations of domain walls.

This is of course interesting and something we are investigating in these samples. The written domain walls do not appear to be significantly perturbed far away from the pulsed region, but interestingly, we find that the pre-existing needle domains can sometimes be affected. It is not clear if this is due to the simple act of scanning, or repeated application of bias pulses some distance away, modification of surface defects by pulsing, etc. This is an avenue for future work.

7. Please include a citation to a work discussing the growth and bulk characterisation of your samples – so others see you are starting from high quality ferroelectric films.

We have included a PFM image as well as band-excitation piezoresponse spectroscopy hysteresis loop to show clear switching behavior and a nice clear domain structure. This is now contained in the revised Fig. 1. X-ray diffraction data of the film is also included in the revised supplementary.

8. “Because images are also captured immediately following when the bias is applied, our model’s predicted domain wall structures can be compared to the actual, observed domain wall structures.”
Does this mean that back-switching was observed in some areas? If so, was the model able to predict which areas back-switched and which were stable?

We did not observe back-switching in this sample, but we do observe significantly asymmetric growth of the domains if the wall is near a strong pinning site (i.e., one of the ‘needle’ domains). This can be seen when viewing the ‘data_1.mp4’ video (e.g., see transition 8). However, the reason we needed to incorporate this was that training on data with some asymmetry between positive and negative displacements can lead to scenarios where small negative or positive bias values produce model predictions in the opposite direction to what is expected.

9. “The first additional term we add to the loss function is a term that emphasizes the agreement of the prediction with known local physics. In this experiment, the direction of displacement for the domain wall should align with the sign of the voltage for the bias that was applied. When this is not the case and the surrogate model’s predictions do not agree with this physical prior, such predictions should be penalized.”

While the concept is sound and sensible, this example is tricky because some reports have claimed that domains can switch in opposite direction [3]. Perhaps it could be mentioned that such phenomena are not expected in this sample, thus validating your penalisation?

It is true that there are reports of backswitching, particularly in the boracites. However, it is not generally found in the case of standard prototypical ferroelectrics. We have made a point to note this in the revision on page 12 and added a reference to the work by Marty Gregg’s group.
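To make the penalty discussed in this comment concrete, one possible form is sketched below. The exact functional form used in the manuscript is not given in this correspondence, so the hinge-style term, the function names and the weight parameter are assumptions for illustration only. The idea is that a predicted displacement contributes to the penalty only where its sign opposes the sign of the applied bias, and that the strength of the term is adjustable.

```python
import numpy as np

def sign_penalty(pred_disp, voltage):
    """Mean magnitude of predicted displacement that opposes the bias sign.

    pred_disp: array of shape (batch, n_points), predicted wall displacements.
    voltage:   array of shape (batch,), applied bias for each sample.
    """
    pred_disp = np.asarray(pred_disp, dtype=float)
    voltage = np.asarray(voltage, dtype=float)
    # sign(V) * displacement is negative where the prediction disagrees with the prior.
    violation = np.maximum(0.0, -np.sign(voltage)[:, None] * pred_disp)
    return float(violation.mean())

def total_loss(pred_disp, true_disp, voltage, weight=1.0):
    """Data-fit term (MSE) plus the weighted sign-consistency term."""
    pred_disp = np.asarray(pred_disp, dtype=float)
    true_disp = np.asarray(true_disp, dtype=float)
    mse = float(np.mean((pred_disp - true_disp) ** 2))
    return mse + weight * sign_penalty(pred_disp, voltage)

# Made-up numbers: each row has one point moving against the applied bias.
pred = np.array([[1.0, -2.0], [-3.0, 4.0]])
bias = np.array([5.0, -5.0])
penalty = sign_penalty(pred, bias)
```

Setting the weight to zero recovers a purely data-driven fit, which is one way to read the statement that the strength of this inductive bias can be adjusted.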

10. “However, the same was not true for pulse width, potentially due to changes in tip condition or other exogeneous variables.”

Again, here the established effects of electric field induced changes to the defect state could be mentioned [2] and the references therein. While the authors might, very commendably, be avoiding self-citations the work by Kalinin et al [4] seems particularly worth mentioning.

We thank the referee for this comment. We have cited these works in the revised version and added an appropriate sentence as described above. We also added an extra line that it remains possible that such effects could explain the seemingly unphysical nature of the ‘phase diagram’. On page 12/13:

Moreover, we also note that the application of the voltage could cause changes to the underlying defect structure, such as by injecting or redistributing oxygen vacancies or other mobile ions,3 or creating other types of defects (for example, see work by Evans et al.4). We do not rule out this possibility, but we can control the degree to which we enforce this inductive bias by adjusting the strength of this term in the final loss function if desired.

We also note on page 19:
“Alternatively, as mentioned earlier, it is possible that certain defects, e.g. oxygen vacancies, could be injected or moved by the application of bias pulses. Such a circumstance would lead to anomalous features on the calculated diagrams based on threshold fields required to initiate such electrochemical processes. We cannot entirely rule out this possibility.”

11. The text and numbers of Figure 4 are very hard to read, the x axis is particularly challenging. Please make this clearer. They are also encouraged to present a “cartoon” of the phase diagrams, to show the general trends and salient points.

We have replotted the phase diagram to make it easier to read.

12. Given Figure 4 shows the expected movement as response to stimuli strength and application time, have the authors tried to calculate effective inertia of domain walls? Or any other physically interesting properties?

As described earlier, we plotted the velocity and found that it conforms to the creep regime.

13. From the referee’s perspective it is unclear what advantage the authors’ “reinforced learning policy” has over just writing a domain wall. Please make this clearer, and perhaps the authors could include some before and after PFM images to show how they can change the domain state? Or show how they can create patterns which would not be trivial with a direct command from a human?

We thank the referee for this comment. We note that the RL policy was trained and tested only in the simulated environment. This was noted in the original manuscript, but we further re-iterate it in the revised manuscript to avoid confusion.

There are two main (potential) advantages of this policy over a simple human-based workflow: (1) it can be more adaptable to changes in tip condition than pre-defined policies, and this has recently been shown in STM-based tip manipulation [e.g., see Chen, I-Ju, et al. "Precise atom manipulation through deep reinforcement learning." Nature Communications 13.1 (2022): 7499.], and (2), it can enable more fine-scale manipulation of domain structure than humans might be able to achieve. This is now mentioned in the discussion section:

“The major advantage of this approach over a traditional human-based workflow is the potential for automatically manipulating structures in a reliable and reproducible manner. Although most SPMs can be programmed to perform tip-based lithography, this requires the bias values to apply to be known ahead of time, and no error correction is possible. In contrast, RL approaches have recently been shown in STM to be useful in precise atomic scale positioning, by Chen et al. The RL agent can be continually retrained based on new data and be more adaptable to changing conditions. Perhaps more interestingly, RL agents can be coupled with intrinsic curiosity rewards to enable manipulation and discovery of new types of domain states that are not envisioned by the human operator.”


14. For the video links in the supplementary, please label what each video shows – including the scale the false colours represent.

This has now been fixed in the revision.

15. “The sample is then imaged again, either with single frequency or band-excitation PFM..”
There is a major difference between the two techniques. Please state which is used for which image. In addition, state AC voltage used and the frequency (-range) used for the PFM.

1V AC is used for both BE-PFM and standard PFM measurements. Notably, we performed the single frequency PFM at the time when we did not have the automated setup for BE-PFM to acquire the datasets for training. However, in practice there is little difference for the purpose of this study. For BE-PFM, the frequency range was 333-413 kHz. The first video (“data_1”) is single frequency data, whereas the second video (“data_2”) corresponds to band-excitation PFM data. This is now mentioned in the Supplementary.

16. Please provide a scale to show what the false colours indicate in Fig. 1c. If the large vertical boundary shown in the PFM phase is a 180° wall, please explain why the amplitude of the PFM response is different on each side.

We agree with the referee; a more thorough examination has shown these are in fact ferroelastic (90 degree) domain walls as our new revised Figure 1 demonstrates.

17. It would be nice for the readers if the authors noted why they chose to use PbTiO3 as the material to use for this study.

Our main reason was that PTO is a nice model system; secondly, this is a less explored orientation, so it also offers the chance to observe domain patterns that differ from typical a/c structures in (001)-oriented ferroelectric thin films of tetragonal structure. This is now discussed at the top of page 6.

18. Some of the needles in Fig. 1c touch the wall. Surely the wall will have different switching properties at these points, as they represent different orientations of the ferroelastic domain structure? Would this not be quite problematic for training data?

The referee is correct that the proximity to the needle structures can cause issues. The correct way to deal with this would be to include a patch of the domain structure in the model to account for this; however, we are unable to do this due to insufficient data (it would require at least an order of magnitude more transitions to be captured, which is too slow with our current acquisition rates). The result of neglecting this is that we will obtain a model that predicts the structure for the ‘average’ surrounding, i.e. it will not completely discount the effect of pinning from needle domains, but it will not completely consider them either. We agree this is not ideal, but we are working on methods to improve acquisition rates to enable more reliable predictions accounting for this structure. This is mentioned in the discussion section.

Referee: 3

Comments to the Author

In this work, the generation of domain walls was conducted through autonomous experimentation using a PFM. Scanning images were used to develop a physics-informed neural network to predict the response of a domain wall upon an applied bias. This model was then used to conduct reinforcement learning within automated experiments to generate structure-specific domain walls. In all, I think this is a well-written manuscript but still needs to undergo a few minor revisions before being accepted. Below are a few areas which I think can be addressed in further versions of the paper:

We thank the referee for their appraisal of our manuscript and trust that the revisions will address all remaining concerns.

1) In Figures 3 and 4, phase-field simulations and the dynamics model were used to predict displacements in the domain wall upon the application of a bias. Are there any experimental data (from the autonomous experiments) which can be plotted or shown with these predictions to show the performance of these methods? It seems like qualitatively the physics of the displacement (direction) is captured, but it would be interesting to see how they perform in a more quantitative sense.

The referee brings up an important point. As discussed in the response to the previous referees, we have performed extensive additional analysis of both the simulations and the surrogate model predictions. We find, for example, that the domain wall growth dynamics appear to obey a creep regime, but that the overall velocities are significantly lower than those reported for ferroelectric domain walls in thin PZT films.

2) Following the first comment, Figure S1 does seem to show a more quantitative comparison of the dynamics model prediction and experimental outputs. By just examining similar graphs in the provided Jupyter notebook of the model training, these seem to perform a lot better. Can you give a comment on the overall error of the model prediction compared to experiment? Especially in terms of either absolute displacement or average displacement?

On inspection, we noticed that the Jupyter notebook reset the generator, thereby shuffling the data that was plotted in the initial version. We have replotted the actual test data against the predictions, which is now in the supplementary. We find the mean absolute error on the test dataset is 13.2nm. This is now noted in the manuscript.
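For reference, a mean absolute error of the kind quoted here can be computed as in the short sketch below. The arrays are made-up placeholder values, not the authors' test data, and the function name is an assumption; the point is only that predictions must be compared against the matching, unshuffled test targets.

```python
import numpy as np

def mean_absolute_error_nm(predicted_nm, observed_nm):
    """Average absolute difference between predicted and observed displacements."""
    predicted_nm = np.asarray(predicted_nm, dtype=float)
    observed_nm = np.asarray(observed_nm, dtype=float)
    return float(np.mean(np.abs(predicted_nm - observed_nm)))

# Placeholder values in nm: rows are test transitions, columns are wall points.
observed = np.array([[10.0, -20.0], [30.0, 0.0]])
predicted = np.array([[12.0, -25.0], [30.0, 1.0]])
mae = mean_absolute_error_nm(predicted, observed)
```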

4) In Figure 4b, the displacements in response to a negative potential don't seem to change the initial bulged wall back to a more ideal flat-wall state. This is something which was predicted in Figure 3b. Can comments be made on why there is a difference in the predicted behavior here?

We plotted the expected response for a pulse width of 350ms, and this is in the revised manuscript as Fig. 6(c). Increasing the negative voltage eliminates the bulge and in fact grows it in the opposite direction as shown below:



Fig. R4: Applying bias to an already bulged wall. For a 350ms pulse, the surrogate model predictions for voltages ranging from -10V to 10V are plotted above. For -4.0V, the positive bulge is eliminated, and a small negative displacement is seen.

It is possible that smaller voltages could lead to a perfectly flat structure in principle, but that the lower voltages in practice result in simple pinning near the current location, as this figure suggests. Thus, finding the appropriate voltage/pulse width to return to the flat configuration may be more challenging in the real sample with pinning sites.

3) In Figure 4a, I would expect a more 'symmetric' or 'isotropic' behavior of the domain wall phase diagram (when scanning from negative to positive applied potential) when starting with a flat wall. This would go with the comments on monotonic behavior encoded in the loss function, but the phase diagram in Figure 4a doesn't seem to show this. Why do you think this is the case for the predicted model?

The referee is absolutely correct that we should at least expect that higher pulse widths should result in monotonic increases in the domain wall displacement for a given voltage. We suspect two reasons this may not be the case here. The first is simply that we have insufficient data, and we are overfitting to a few data points of large displacement at low pulse widths, and perhaps low displacements at higher pulse widths (for example, if these were by chance occurring more often near the ‘needle’ domains). The other explanation is that there is indeed something more curious with respect to the physics. As we note in a previous response above, there may be changes to the defect structure induced by the pulses themselves, which make interpretation significantly more fraught. The field distributions are also quite complicated, given these are ferroelastic 90° walls in a (110) film. Interestingly, this discrepancy mainly appears at positive bias. For negative bias, the ‘phase diagram’ appears to follow our intuition more closely. We are still investigating this. The monotonic loss is a soft regularization, and thus although it helps, it is unable to completely overcome this seemingly unphysical result.
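A soft monotonicity term of the kind referred to above might look like the following sketch. The form is an assumption, since the manuscript's actual loss is not reproduced in this correspondence: for a fixed voltage, any decrease in predicted displacement magnitude as pulse width increases is penalized, and increases cost nothing.

```python
import numpy as np

def monotonic_penalty(disp_by_pulse_width):
    """Mean size of the drops in |displacement| along increasing pulse width.

    disp_by_pulse_width: predicted displacements at a fixed voltage, ordered
    from the shortest to the longest pulse width.
    """
    magnitude = np.abs(np.asarray(disp_by_pulse_width, dtype=float))
    # Positive entries mark places where a longer pulse moved the wall less.
    drops = np.maximum(0.0, magnitude[:-1] - magnitude[1:])
    return float(drops.mean())

# Made-up predictions: the third value dips below the second.
penalty_violating = monotonic_penalty([5.0, 8.0, 6.0, 9.0])
penalty_monotone = monotonic_penalty([1.0, 2.0, 3.0])
```

Because the term is added to the data-fit loss with a finite weight, it discourages non-monotonic predictions without forbidding them, which is consistent with the residual non-monotonic behavior described in the response.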


5) When performing the reinforcement learning experiments, please comment on what the stopping criterion was for the experiments

The stopping criterion was simply the step number. In the revised manuscript we simplified the problem to just ten steps – to see how close the agent could manipulate the walls in the simulated environment to the target structure, assuming that only ten pulses could be applied.
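The fixed-step stopping rule can be illustrated with the toy episode loop below. Everything in it is a hypothetical stand-in: the one-line dynamics replaces the trained surrogate, and a hand-written proportional rule replaces the trained RL agent. Only the ten-step limit and the -10V to 10V bias range (the range plotted in Fig. R4) are taken from the response.

```python
import numpy as np

MAX_STEPS = 10  # the episode ends after ten pulses, regardless of the error

def toy_surrogate(wall, voltage):
    """Hypothetical stand-in dynamics: shift the whole wall by 2 nm per volt."""
    return wall + 2.0 * voltage

def proportional_policy(wall, target):
    """Hand-written rule standing in for the trained agent."""
    return 0.25 * np.mean(target - wall)

def run_episode(wall, target, policy, max_steps=MAX_STEPS):
    wall = np.asarray(wall, dtype=float)
    target = np.asarray(target, dtype=float)
    errors = []
    for _ in range(max_steps):  # the step count is the only stopping criterion
        voltage = float(np.clip(policy(wall, target), -10.0, 10.0))
        wall = toy_surrogate(wall, voltage)
        errors.append(float(np.mean(np.abs(wall - target))))
    return wall, errors

final_wall, errors = run_episode(np.zeros(8), np.full(8, 40.0), proportional_policy)
```

With a fixed budget of pulses, the quantity of interest is the remaining distance to the target structure at the final step, here the last entry of errors.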

6) For the code review, it seems like during the model training, the data was split using a random splitting technique. Just be sure to call out the data splitting method more explicitly either in the manuscript and/or supporting information.

Thank you, this is now mentioned in the manuscript on page 13.
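A random split of the kind the referee describes could be written as below. The test fraction, the seed and the function name are assumptions, as the correspondence does not state the values used. Fixing the seed and storing the resulting indices keeps the test set identical between training and plotting, which avoids the kind of reshuffling noted in the response to comment 2.

```python
import numpy as np

def random_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) from a seeded random permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(test_fraction * n_samples))
    return order[n_test:], order[:n_test]

# Hypothetical dataset size of 100 transitions.
train_idx, test_idx = random_split_indices(100, test_fraction=0.2, seed=0)
```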


References

1. R. K. Vasudevan, M. B. Okatan, C. Duan, Y. Ehara, H. Funakubo, A. Kumar, S. Jesse, L. Q. Chen, S. V. Kalinin and V. Nagarajan, Adv. Funct. Mater., 2013, 23, 81-90.
2. T. Tybell, P. Paruch, T. Giamarchi and J.-M. Triscone, Phys. Rev. Lett., 2002, 89, 097601.
3. S. V. Kalinin, S. Jesse, A. Tselev, A. P. Baddorf and N. Balke, ACS Nano, 2011, 5, 5683-5691.
4. D. M. Evans, T. S. Holstad, A. B. Mosberg, D. R. Småbråten, P. E. Vullum, A. L. Dadlani, K. Shapovalov, Z. Yan, E. Bourret and D. Gao, Nat. Mater., 2020, 19, 1195-1200.




Round 2

Revised manuscript submitted on 12 Jan 2024
 

05-Feb-2024

Dear Dr Vasudevan:

Manuscript ID: DD-ART-07-2023-000126.R1
TITLE: Physics-informed models of domain wall dynamics as a route for autonomous domain wall design via reinforcement learning

Thank you for submitting your revised manuscript to Digital Discovery. I am pleased to accept your manuscript for publication in its current form. I have copied any final comments from the reviewer(s) below.

You will shortly receive a separate email from us requesting you to submit a licence to publish for your article, so that we can proceed with the preparation and publication of your manuscript.

You can highlight your article and the work of your group on the back cover of Digital Discovery. If you are interested in this opportunity please contact the editorial office for more information.

Promote your research, accelerate its impact – find out more about our article promotion services here: https://rsc.li/promoteyourresearch.

If you would like us to promote your article on our Twitter account @digital_rsc please fill out this form: https://form.jotform.com/213544038469056.

We are offering all corresponding authors on publications in gold open access RSC journals who are not already members of the Royal Society of Chemistry one year’s Affiliate membership. If you would like to find out more please email membership@rsc.org, including the promo code OA100 in your message. Learn all about our member benefits at https://www.rsc.org/membership-and-community/join/#benefit

By publishing your article in Digital Discovery, you are supporting the Royal Society of Chemistry to help the chemical science community make the world a better place.

With best wishes,

Dr Joshua Schrier
Associate Editor, Digital Discovery

EDITOR'S NOTE:

Please address the formatting issues noted by referee 2 in your page proofs.



 
Reviewer 2

The authors have undertaken significant additional work and have addressed the majority of my concerns. While I would still encourage them to streamline and condense their manuscript, I recognize that this suggestion is subjective. Therefore, in my view, there are now no remaining reasons to withhold publication.

They are encouraged to read their proofs carefully, looking out for formatting challenges (such as the x-axis in Fig. 5(c)) and odd styles like “2.0E6 J/m3”.

Reviewer 3

I feel the authors have done a good job in addressing all the comments from the previous review and the publication is now ready to be accepted. Before final submission, please look over and verify Figure numbers match when referenced throughout the manuscript and Supporting Information. There were a couple cases where these were off.

Reviewer 1

The authors have addressed all of my comments well and properly revised their manuscript. Thus, I would recommend its publication in Digital Discovery.




Transparent peer review

To support increased transparency, we offer authors the option to publish the peer review history alongside their article. Reviewers are anonymous unless they choose to sign their report.

We are currently unable to show comments or responses that were provided as attachments. If the peer review history indicates that attachments are available, or if you find there is review content missing, you can request the full review record from our Publishing customer services team at RSC1@rsc.org.


Content on this page is licensed under a Creative Commons Attribution 4.0 International license.