Open Access Article
Adrien Gallego [a,b], Matthieu Lavayssiere [a], Xavier Bantreil [a,c], Nicolas Pétry [a], Julien Pinaud [b], Olivia Giani* [b] and Frédéric Lamaty* [a]
[a] IBMM, CNRS, ENSCM, Université de Montpellier, France. E-mail: frederic.lamaty@umontpellier.fr
[b] ICGM, CNRS, ENSCM, Université de Montpellier, France. E-mail: olivia.giani@umontpellier.fr
[c] Institut Universitaire de France (IUF), France
First published on 28th November 2025
The formation of amide bonds is of major interest in organic chemistry. Several methodologies have emerged in mechanochemistry to promote this reaction by using coupling agents. Herein, the acylation of unprotected amino acids using an acyl chloride in a ball mill is described, together with different optimization processes. The optimization of reaction conditions is part of every development of a new synthetic pathway. However, depending on the method used, the number of experiments to carry out can increase exponentially. Three different optimization methods were compared in the acylation of amino acids: One Factor at a Time (OFAT), Design of Experiments (DoE) and Bayesian Optimization (BO). The strengths and limitations of each methodology are highlighted, providing new insights and an optimized practical amidation method that takes into account the sustainability of this chemistry.
Our group decided to combine these statistical approaches with solvent-free methodologies (ball-milling, twin-screw extrusion…) to help design a more sustainable chemistry. Indeed, as noted earlier, DoE or BO can determine an optimum with only a few experiments, meaning that fewer reagents are used in the process, as well as less time and energy. To find out which method best reduces the number of experiments needed to optimize a reaction, OFAT, DoE and BO were each applied to the acylation of amino acids by mechanochemistry.
| Entry | x (equiv.) | y (equiv.) | Time (min) | Liquid additive | NMR adjusted yield [b,c] (%) |
|---|---|---|---|---|---|
| 1 | 1 | 1.5 | 60 | none | 55 |
| 2 | 1.2 | 1.5 | 60 | none | 64 |
| 3 | 1.3 | 1.5 | 60 | none | 74 |
| 4 | 1.5 | 1.5 | 60 | none | 79 ± 9 |
| 5 | 1.6 | 1.5 | 60 | none | 82 ± 8 |
| 6 | 2 | 1.5 | 60 | none | 76 ± 2 |
| 7 | 1.5 | 1.5 | 30 | none | 76 |
| 8 | 1.5 | 1.5 | 15 | none | 79 |
| 9 | 1.5 | 1.5 | 10 | none | 68 |
| 10 | 1.5 | 1.5 | 5 | none | 80 ± 8 |
| 11 | 1.5 | 1.4 | 15 | none | 87 ± 17 |
| 12 | 1.5 | 1.3 | 15 | none | 71 ± 5 |
| 13 | 1.5 | 1.2 | 15 | none | 72 ± 17 |
| 14 | 1.5 | 1 | 15 | none | 79 |
| 15 | 1.5 | 1.4 | 15 | Ethyl acetate | 82 ± 12 |
| 16 | 1.5 | 1.4 | 15 | Acetone | 65 ± 3 |
| 17 | 1.5 | 1.4 | 15 | 2-Methyl tetrahydrofuran | 65 ± 4 |
| 18 | 1.5 | 1.4 | 15 | Acetonitrile | 64 ± 19 |
| 19 | 1.5 | 1.4 | 15 | Nitromethane | 88 ± 7 |
| 20 | 1.5 | 1.4 | 15 | Dimethyl carbonate | 80 ± 12 |

[a] The reaction was performed in a Retsch Mixer Mill 400 on 1 mmol scale. [b] See SI section for more details. [c] Values represent the 95% confidence interval, expressed as: mean values of several experiments ± 1.96 × standard error.
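The "mean ± 1.96 × standard error" convention used for the yields in Table 1 can be sketched in a few lines. This is only an illustration: the replicate values below are hypothetical (the individual runs are in the SI, not in this section), and the function name `ci95` is ours.

```python
import math
import statistics


def ci95(replicates):
    """Return (mean, half_width), where half_width = 1.96 * standard error.

    The standard error is the sample standard deviation divided by sqrt(n).
    A single run has no estimable spread, so half_width is None in that case,
    which matches table entries reported without a ± term.
    """
    n = len(replicates)
    mean = statistics.fmean(replicates)
    if n < 2:
        return mean, None
    std_err = statistics.stdev(replicates) / math.sqrt(n)
    return mean, 1.96 * std_err


# Hypothetical duplicate yields (%), NOT taken from the paper
mean, half_width = ci95([80.0, 94.0])
print(f"{mean:.0f} ± {half_width:.0f}")
```

With only two or three replicates the interval is very sensitive to a single outlying run, which is one reason some entries show wide ranges.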
As the amidation reaction is performed with an activated acid, we expected that it could reach completion in a reaction time shorter than 1 h. Therefore, 4 additional experiments were carried out at different milling times. To our delight, milling the reaction mixture for only 15 min provided a very good NMR adjusted yield (Table 1, Entry 8), which was not improved by prolonged milling. The optimization process then focused on the amount of base. Since several acidic species are released during the reaction, it came as no surprise that adjusting the amount of base can influence the outcome. The best result was obtained with 1.4 equiv. of NaHCO3 (Table 1, Entry 11). Above this quantity, no improvement was observed. Finally, noting the high standard deviation in entry 11, attributed to an unfavourable rheology, the addition of a liquid additive was studied to improve the mixing and overcome this issue. While ethyl acetate and dimethyl carbonate (Table 1, Entries 15 and 20) did not alter the outcome of the reaction, the addition of acetone (Table 1, Entry 16), 2-methyl tetrahydrofuran (Table 1, Entry 17) or acetonitrile (Table 1, Entry 18) decreased the NMR adjusted yield by about 20 percentage points. Only nitromethane improved the result, reaching 88% (Table 1, Entry 19), with a standard error reduced by a factor of 2.5. Increasing the quantity of nitromethane in the reaction medium did not lead to any significant improvement in the outcome. Thus, according to the OFAT approach, product 2a was obtained with a very good NMR adjusted yield (88%) by milling for 15 min a mixture of 1 mmol of L-leucine 1a (1 equiv.), chloroacetyl chloride (1.5 equiv.), NaHCO3 (1.4 equiv.) and nitromethane as a liquid additive (η = 0.3 µL mg−1). However, although this additive proved to be efficient, its hazardous nature cannot be ignored. In 2016, it was classified as “Highly Hazardous” in the CHEM21 solvent guide.33 Therefore, since chloroacetyl chloride is a liquid reagent, it was interesting to investigate whether the reaction conditions could be improved without the use of a liquid additive. Such an approach would provide a safer and easier way to prepare acylated amino acid 2a.
OFAT is the most widely used method for optimizing individual steps in organic chemistry. However, although this method has proven its relevance, it is time-consuming and the number of experiments required increases exponentially with the number of parameters studied. Furthermore, it is not guaranteed that the “real” optimum will be found, often because synergistic effects between parameters are not considered. Moreover, the optimum is generally found for one substrate, and changing the reagents might not directly lead to the corresponding optimum. To determine whether the global optimum was reached, we turned our attention to a more efficient and informative methodology: design of experiments.
The experimental space was first defined following the chemist's intuition. Setting suitable ranges in a DoE study is crucial, as poor decisions can significantly reduce the study's effectiveness. If the ranges are too narrow, the relevant trends may be missed, whereas ranges that are too large may compromise precision. We faced exactly this difficulty when we initially chose to adopt the same parameter variation ranges as those used in the OFAT section. The resulting response surface showed the optimum value on one of the corners of the experimental space (see the SI). An extension of the reaction space was necessary to assess whether the previously obtained value was near the optimum or if the global optimum resided in a different region of the experimental domain. Hence, it is important to define sufficiently broad ranges for each factor to ensure the design can explore a wide portion of the reaction space. In our case, we chose to keep the amount of 1a constant at 1 mmol (1 equiv.) and all the mechanochemical parameters (size/material of both jar and ball, milling frequency) identical to the ones in the OFAT optimization. The use of a liquid additive was discarded in this section. The screening study included three factors: amount of chloroacetyl chloride, amount of NaHCO3 and reaction time. A central composite face-centered design (CCF) was constructed in which each parameter has three levels whose values are described in Table 2.
| Factor | Level −1 | Level 0 | Level +1 |
|---|---|---|---|
| Amount of chloroacetyl chloride (equiv.) | 1 | 1.5 | 2 |
| Amount of NaHCO3 (equiv.) | 1 | 2 | 4 |
| Reaction time (min) | 5 | 30 | 60 |
A CCF design offers a practical and efficient balance between model accuracy and feasibility. Such a design consists of 14 experiments and 3 centre points. The centre points consist of three reactions conducted under identical conditions at the midpoint of the design space (average of all factor ranges). These points help assess the reproducibility of the reaction. Ideally, identical conditions should yield consistent results. However, minor inevitable errors in experimental or analytical procedures can introduce some variation in the yields. The model imposes a minimum number of centre points, while no upper limit is defined. Thus, 5 experiments were run as centre points and other experiments were run in duplicate. To simplify the construction and the analysis of the DoE, Ellistat Software (version 7.8.7) was used. The dataset for the 15 distinct sets of conditions is displayed in Table 3. Notably, the five centre points (Table 3, Entry 9) yielded fairly consistent results, suggesting good reproducibility.
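The structure of a face-centered central composite design can be made concrete with a short sketch. The level values come from Table 2; the function and variable names are illustrative only and do not reflect how Ellistat builds its designs internally.

```python
from itertools import product

# Real values for the coded levels -1, 0, +1 (taken from Table 2)
LEVELS = {
    "acyl_chloride_equiv": {-1: 1.0, 0: 1.5, 1: 2.0},
    "nahco3_equiv": {-1: 1.0, 0: 2.0, 1: 4.0},
    "time_min": {-1: 5, 0: 30, 1: 60},
}


def ccf_coded(n_factors, n_centre):
    """Coded points of a face-centered central composite design."""
    # Full two-level factorial: every corner of the cube
    corners = [list(p) for p in product((-1, 1), repeat=n_factors)]
    # Axial points placed on the centre of each face (distance 1 from centre)
    faces = []
    for i in range(n_factors):
        for sign in (-1, 1):
            point = [0] * n_factors
            point[i] = sign
            faces.append(point)
    # Replicated centre points, used to estimate reproducibility
    centres = [[0] * n_factors for _ in range(n_centre)]
    return corners + faces + centres


def decode(point):
    """Map a coded point to real experimental conditions."""
    return {name: LEVELS[name][code] for name, code in zip(LEVELS, point)}


design = [decode(p) for p in ccf_coded(3, n_centre=5)]
distinct = {tuple(d.values()) for d in design}
print(len(design), "runs,", len(distinct), "distinct sets of conditions")
```

For three factors this gives 8 corner points and 6 face points, i.e. the 14 non-centre experiments mentioned above, plus the chosen number of centre replicates.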
| Entry | x (equiv.) | y (equiv.) | Reaction time (min) | NMR adjusted yield [b] (%) |
|---|---|---|---|---|
| 1 | 1 | 1 | 5 | 52 ± 7 |
| 2 | 1 | 1 | 60 | 59 ± 11 |
| 3 | 1 | 4 | 5 | 53 ± 6 |
| 4 | 1 | 4 | 60 | 61 ± 1 |
| 5 | 2 | 1 | 5 | 62 |
| 6 | 2 | 1 | 60 | 66 |
| 7 | 2 | 4 | 5 | 78 ± 35 |
| 8 | 2 | 4 | 60 | 83 ± 1 |
| 9 [a] | 1.5 | 2 | 30 | 91 ± 4 |
| 10 | 1 | 2 | 30 | 56 ± 0 |
| 11 | 2 | 2 | 30 | 87 ± 6 |
| 12 | 1.5 | 1 | 30 | 68 |
| 13 | 1.5 | 4 | 30 | 93 ± 19 |
| 14 | 1.5 | 2 | 5 | 90 ± 8 |
| 15 | 1.5 | 2 | 60 | 83 ± 4 |

[a] Values calculated on the basis of 5 experiments. [b] Values represent the 95% confidence interval, expressed as: mean values of several experiments ± 1.96 × standard error.
As shown in Fig. 1, which compares the NMR adjusted yield calculated by DoE with the experimental value, the statistical model is reliable enough to predict accurately the best conditions for the transformation of 1a into 2a. Despite the remaining substantial variability in the response (R² = 0.86), DoE proves to be a powerful tool for conducting fast and trustworthy optimization studies. Notably, the model is more reliable at high values of NMR adjusted yield. More statistical information is available in the SI.
Finally, the suggested best conditions for the acylation of L-leucine 1a required 1.66 equiv. of chloroacetyl chloride, 3.07 equiv. of NaHCO3 and 40 min milling time. The predicted NMR adjusted yield under these conditions is about 95% (Fig. 2). The experiment was run in duplicate under these conditions and provided an NMR adjusted yield of 89 ± 1%. Notably, entry 13 in Table 3 shows a better result than the conditions suggested by the DoE model. Given the wide confidence interval of this entry (± 19), the observed improvement may be due to random variation rather than a systematic effect.
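To illustrate how a response-surface model turns the DoE data into a predicted optimum, here is a minimal sketch that fits a full quadratic model to the mean yields of Table 3 by ordinary least squares and then searches a grid for the highest prediction. It is an assumption-laden stand-in for the Ellistat analysis: it uses entry means rather than individual replicates, works in natural units rather than coded levels, and applies no term selection, so its coefficients, R² and optimum are not expected to reproduce the published values.

```python
import numpy as np

# Mean NMR adjusted yields from Table 3:
# columns = x (equiv. acyl chloride), y (equiv. NaHCO3), time (min), yield (%)
DATA = np.array([
    [1.0, 1, 5, 52], [1.0, 1, 60, 59], [1.0, 4, 5, 53], [1.0, 4, 60, 61],
    [2.0, 1, 5, 62], [2.0, 1, 60, 66], [2.0, 4, 5, 78], [2.0, 4, 60, 83],
    [1.5, 2, 30, 91], [1.0, 2, 30, 56], [2.0, 2, 30, 87], [1.5, 1, 30, 68],
    [1.5, 4, 30, 93], [1.5, 2, 5, 90], [1.5, 2, 60, 83],
])


def quad_terms(x, y, t):
    """Design matrix of a full quadratic model in three factors."""
    return np.column_stack(
        [np.ones_like(x), x, y, t, x * x, y * y, t * t, x * y, x * t, y * t]
    )


X = quad_terms(DATA[:, 0], DATA[:, 1], DATA[:, 2])
yields = DATA[:, 3]

# Ordinary least-squares fit of the 10 model coefficients
coef, *_ = np.linalg.lstsq(X, yields, rcond=None)
pred = X @ coef
r2 = 1 - np.sum((yields - pred) ** 2) / np.sum((yields - yields.mean()) ** 2)

# Coarse grid search inside the explored domain for the best predicted yield
gx, gy, gt = np.meshgrid(
    np.linspace(1, 2, 21), np.linspace(1, 4, 31), np.linspace(5, 60, 12),
    indexing="ij",
)
grid_pred = quad_terms(gx.ravel(), gy.ravel(), gt.ravel()) @ coef
best = int(np.argmax(grid_pred))
best_conditions = (gx.ravel()[best], gy.ravel()[best], gt.ravel()[best])

print(f"R2 = {r2:.2f}")
print("predicted optimum (x, y, time):", best_conditions)
```

A predicted optimum that lands on the edge of the grid is the same warning sign described earlier for the first, too-narrow design: the true optimum may lie outside the explored space.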
Fig. 2 Correlation between the amount of chloroacetyl chloride, amount of NaHCO3 and NMR adjusted yield with reaction time fixed at 30 min.
Moreover, since 49 experiments (113 if replicates are considered) had been performed in the first part of the optimization (OFAT), we decided to include all these results in the DoE to strengthen our DoE model. This non-classical approach was made possible by the prior availability of a comprehensive library of experimental data. A slight improvement was obtained with this approach: the model suggested that a 94% NMR adjusted yield could be reached if 1 equiv. of 1a was milled for 30 min with 1.9 equiv. of chloroacetyl chloride and 3.4 equiv. of NaHCO3 (see the SI). This result is very close to the one obtained from the DoE model alone, suggesting the high suitability of the model established with only the 19 experiments required by the DoE (32 experiments effectively performed when including the replicates). Running the reaction under these last conditions gave the acylation product 2a with a 93% experimental NMR adjusted yield, confirming the effectiveness of the model. Analysis of the results provides insight into the weight of each parameter in the formation of the product. The amounts of chloroacetyl chloride and NaHCO3 appear to be significant parameters, while the reaction time is less influential. DoE enabled the identification of optimal conditions for converting 1a into 2a more efficiently than the OFAT approach, without requiring the use of any liquid additive.
DoE can easily be applied to mechanochemical systems, helping to optimize experimental conditions in a straightforward manner and providing very accurate results. As a point of comparison, and to confirm the last conditions or even improve them without running tens of additional experiments, we finally turned to an optimization method that is new to mechanochemistry: Bayesian optimization (BO).
In the case of acylation of 1a into 2a, and following the initial five reactions, which were developed using a straightforward DoE approach (centered factorial design), the NMR adjusted yield data were sufficient for the BO algorithm to construct a preliminary surrogate model. For each iteration, the BO algorithm provided a series of 5 suggested experiments. The two showing the highest expected improvement (EI) were selected for running (see the SI). High yield values were easily reached, with two sets of conditions providing product 2a in 93% NMR-adjusted yield (Fig. 3, Entries 8 and 15). The conditions related to each iteration are described in the SI.
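The selection step described above (rank the suggested experiments by expected improvement and run the top two) can be sketched as follows. This assumes the usual closed-form EI for a Gaussian surrogate prediction in a maximization problem; the BO software actually used is described in the SI, and the candidate predictions below are hypothetical.

```python
import math


def expected_improvement(mu, sigma, best, xi=0.0):
    """Closed-form expected improvement for maximization.

    mu, sigma: surrogate mean and standard deviation at a candidate point.
    best: best response observed so far. xi: optional exploration margin.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))        # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
    return (mu - best - xi) * cdf + sigma * pdf


# Hypothetical surrogate predictions (mean yield %, std %) for 5 suggestions
candidates = {
    "A": (90.0, 2.0),
    "B": (85.0, 8.0),
    "C": (92.0, 1.0),
    "D": (80.0, 3.0),
    "E": (88.0, 5.0),
}
best_so_far = 89.0  # hypothetical best yield observed so far

ranked = sorted(
    candidates,
    key=lambda name: expected_improvement(*candidates[name], best_so_far),
    reverse=True,
)
to_run = ranked[:2]  # the two experiments with the highest EI
print(to_run)
```

Note how candidate B, with a lower predicted mean than A but a much larger uncertainty, can still rank higher: EI rewards both promising and poorly explored regions.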
In summary, the optimal conditions for the acylation of L-leucine 1a are presented in Scheme 1. In both optimized cases, the amount of acyl chloride exceeds that of the base. Using approximately three equivalents of chloroacetyl chloride, a liquid reagent, appears to improve the rheology of the system, resulting in a more fluid reaction medium and enhanced product formation. However, increasing the amount of acyl chloride beyond this level, while maintaining a similar base proportion, does not further improve the outcome and instead reduces the yield to 88% (Fig. 3, iterations 14 and 16). Conditions (i), shown in the upper part of Scheme 1, were finally performed in triplicate and the three experiments delivered consistent results, with an NMR adjusted yield of 92 ± 2%, showing the robustness of the methodology.
Scheme 1 Best conditions provided by BO for the synthesis of 2a. (i) refers to Fig. 3, iteration 8. (ii) refers to Fig. 3, iteration 15.
Thanks to BO, the optimal conditions providing product 2a in a very good yield could be determined. Following the initialization phase, this result was obtained after only 3 experiments in the iteration phase, demonstrating the efficiency of such an approach in optimization studies. To confirm that the maximum had been reached, 8 additional iterations were then performed without finding a better optimum.
[Equation (1)]
Fig. 4 Values of NMR adjusted yield and PMI for each iteration of the BO in the case of acylation of L-leucine 1a.
This study revealed a powerful aspect of BO: it determines the conditions for Pareto efficiency between PMI and NMR adjusted yield, that is, conditions where neither objective can be improved without worsening the other. The acylation of 1a into 2a can be performed in a more eco-friendly way, cutting waste production by a third (lowest PMI = 2.6), if a slight decrease of the NMR adjusted yield to 88% is accepted (iteration 8 in Fig. 3). The conditions related to each iteration are described in the SI.
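As a rough illustration of how reagent excess and yield feed into the PMI, the sketch below assumes the usual definition of process mass intensity (total mass of materials put into the reaction divided by the mass of product obtained) and counts only the three solid or liquid reagents. Whether the authors' calculation includes other inputs is not stated in this section, so the numbers it returns should not be compared directly with Fig. 4. The molar masses are standard values supplied here, with 2a taken to be N-chloroacetyl-L-leucine.

```python
# Molar masses in g/mol (standard values, supplied for this sketch)
M_LEUCINE = 131.17      # L-leucine 1a
M_CLCH2COCL = 112.94    # chloroacetyl chloride
M_NAHCO3 = 84.01        # sodium bicarbonate
M_PRODUCT_2A = 207.65   # N-chloroacetyl-L-leucine (assumed identity of 2a)


def pmi(acyl_equiv, base_equiv, yield_fraction, scale_mmol=1.0):
    """Process mass intensity = mass of all inputs / mass of product.

    Assumption: only the amino acid, the acyl chloride and the base are
    counted as inputs (no liquid additive, no work-up solvents).
    """
    mass_in = scale_mmol * (
        M_LEUCINE + acyl_equiv * M_CLCH2COCL + base_equiv * M_NAHCO3
    )
    mass_product = scale_mmol * yield_fraction * M_PRODUCT_2A
    return mass_in / mass_product


# Yield-maximizing BO conditions from this study (x = 2.95, y = 2.43, 92%)
high_yield = pmi(2.95, 2.43, 0.92)
# A hypothetical leaner set of conditions with a slightly lower yield
lean = pmi(1.5, 1.4, 0.88)
print(f"PMI (high yield) = {high_yield:.2f}, PMI (lean) = {lean:.2f}")
```

The comparison shows why the two objectives conflict: the excess reagents that push the yield up also inflate the numerator of the PMI.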
Since the BO method facilitates extrapolation and optimization on other substrates, the acylation of L-phenylalanine 1b was studied. As 1b bears an aromatic ring, the conversion could easily be measured by HPLC at 214 nm. Notably, the consistency of the analytical method was assessed in duplicate by comparing the values of conversion obtained by HPLC and NMR. For the NMR evaluation, the same treatment (liquid–liquid extraction) as for L-leucine 1a was applied to L-phenylalanine 1b. HPLC conversion and NMR adjusted yield values were found to be very similar, highlighting the robustness of both analytical methods. When starting a new optimization, it is recommended to run 2n + 1 experiments, where n is the number of dimensions (in our case, n = 4, i.e. 9 experiments), to initiate the algorithm. However, another possibility is to use prior results and treat them as low-fidelity data to stabilize the model rapidly (see the SI), thereby reducing the number of additional experimental points to run at the beginning of the optimization. Our initialization strategy combined proven-effective conditions from L-leucine 1a with a diverse set of exploratory experiments, providing a solid foundation for the Bayesian model. A weighting factor of 1/1000 was applied, where a weight of 1 was assigned to the values from 1a and a weight of 1000 to the experimental data from 1b. Remarkably, after only six conditions, the model exhibited sufficient stability to transition into the iterative optimization phase. Once the model had stabilized, the resulting surrogate function was queried to systematically identify the most informative experimental conditions (see the SI). This approach aimed to maximize the efficiency of model refinement while simultaneously steering the optimization process toward the most favourable experimental outcomes. For practical reasons, as the experiments were carried out on a Retsch Mixer Mill 400, which requires loading two milling jars, two experiments were suggested at each iteration. These were selected to maximize informational gain by minimizing overlap between the explored conditions (see the SI). Thanks to this strategy, a sufficiently trained and stable model was achieved after just eight cycles of two experiments each. Finally, the model was asked to provide conditions either maximizing the HPLC conversion, minimizing the PMI value, or establishing a compromise between both responses. Gratifyingly, several sets of conditions were identified that gave results with high reproducibility and Pareto efficiency (Fig. 4).
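Pareto efficiency between conversion (to be maximized) and PMI (to be minimized) can be checked with a simple dominance test. The sketch below is generic and uses hypothetical (conversion, PMI) pairs; the function name is ours, and the experimental values for each iteration are in the SI.

```python
def pareto_front(points):
    """Return the non-dominated (conversion, pmi) pairs.

    A point is dominated if another point has conversion at least as high
    AND PMI at least as low, with at least one of the two strictly better.
    """
    front = []
    for i, (conv_i, pmi_i) in enumerate(points):
        dominated = any(
            (conv_j >= conv_i and pmi_j <= pmi_i)
            and (conv_j > conv_i or pmi_j < pmi_i)
            for j, (conv_j, pmi_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((conv_i, pmi_i))
    return sorted(front)  # ordered from lowest-PMI end to highest-conversion end


# Hypothetical (HPLC conversion %, PMI) results, NOT taken from the paper
results = [(100, 4.1), (97, 3.2), (95, 3.5), (90, 2.7), (88, 2.9), (99, 4.5)]
front = pareto_front(results)
print(front)
```

The first and last elements of the sorted front correspond to the two extreme choices discussed in the text: the conditions minimizing the PMI and those maximizing the conversion.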
The Pareto front represents the set of best trade-offs between the conflicting objectives. Once again, two sets of conditions were established: one maximizing the HPLC conversion and another minimizing the PMI, both representing the endpoints of the Pareto front (Fig. 5). Additionally, the final surrogate model can give insights into the impact of the milling load, a critical parameter in mechanochemistry. Since the milling load may influence the level of completion of a reaction, its role was specifically examined in the context of this transformation (Fig. 6). Two distinct trends emerge. The first and most prominent is the decline in conversion at higher milling loads, likely because the energy input is fixed (i.e., constant milling frequency, ball mass, and ball number) while the amount of substrate increases, diluting the energy delivered per unit of material. Additionally, at these milling loads, snowball effects were occasionally observed, which may also account for the decrease in HPLC conversion.41,42 On the other hand, the loss of conversion observed at the lowest milling loads may be attributed to the rheology of the “soft solid” reaction mixture. In this regime, preferred pathways for the milling ball may form, resulting in poor mixing efficiency. This, in turn, can lead to lower conversions and reduced reproducibility. Although poor mixing can lead to erroneous data, these experimental points should not be discarded from the study. In such cases, the response would always be lower than the expected one. However, BO (and DoE more broadly) is able to consider several experimental points for one set of conditions, thus enriching the information available and adding variance to the model. Such an approach would prioritize areas providing high expected values and good reproducibility. Overall, the optimal conversion appears to be achieved at a milling load of approximately 30 mg mL−1. The influence of this parameter should not be underestimated, as a mere 10% variation in milling load can lead to a significant difference in outcome, especially when considering potential scale-up.
Fig. 6 Influence of the milling load (decorrelated from the other variables) in the acylation of L-phenylalanine 1b by chloroacetyl chloride.
Finally, BO also allows the identification of parameter weights. In line with the DoE results, the milling time has a smaller influence on the reaction, whereas the amounts of both chloroacetyl chloride and base have a large one. The quantities of these reactants determine the outcome of the reaction. The milling load, for its part, was easily implemented into the study and showed a moderate impact on conversion and reproducibility; the importance of this parameter might differ when scale-up is envisioned for the studied transformation.
| Method | Number of experiments required by the method | Number of experiments actually performed | Best conditions | NMR adjusted yield [a] (%) | Comment |
|---|---|---|---|---|---|
| OFAT | 49 | 113 [b] | x = 1.5; y = 1.4; Time = 15 min | 87 ± 17; 88 ± 7 [c] | Easiest implementation; suitable for preliminary studies |
| DoE | 19 (14 + 5 centre points) | 32 [d] | x = 1.66; y = 3.07; Time = 40 min | 89 ± 1 | Reliable estimation of the optimum value; a second DoE should be necessary in a narrower space around the given value to be more precise |
| BO | 8 (initialization: 5 + iteration: 3) | 16 | x = 2.95; y = 2.43; Time = 50 min | 92 ± 2 | Faster optimization; inherently optimizes high yield and reproducibility; easier implementation of additional parameters or targets; facilitated exemplification |

[a] Values represent the 95% confidence interval, expressed as: mean values of several experiments ± 1.96 × standard error. [b] Including preliminary tests for the nature of the base, repeated trials for consistency and further investigations into the amount of liquid additives. [c] Adding nitromethane as a liquid additive, η = 0.3 µL mg−1. [d] Including replicates.
Scheme 2 Exemplification of BO describing the conditions maximizing the conversion (left) or minimizing the PMI (right). ML = milling load.
For the substrates bearing an aromatic ring, the conversion could easily be followed by HPLC at 214 nm. The others were subjected to the same treatment as L-leucine 1a. Complete conversions were obtained in a fast and straightforward manner. Moreover, two sets of Pareto-efficient conditions may be proposed: one that maximizes conversion (reaching or nearing 100%) but results in a higher PMI due to the excess reactants, and another that reduces waste generation while maintaining high conversions.
After exploring a selection of primary amines, we turned our attention to a more challenging hindered substrate: L-proline 1g. The corresponding acylation product 2g is of high interest since it is a key intermediate in the synthesis of vildagliptin, an antidiabetic drug used for the treatment of type II diabetes, which was approved by the European Medicines Agency in 2007 (Scheme 3).9
Under mechanochemical conditions, the secondary amine of L-proline 1g could be acylated with an 81% NMR adjusted yield. Although the isolated yield was modest (52%) due to the strong hydrophilicity of the molecule, intermediate 2g was successfully obtained via a solvent-free method, paving the way for a more sustainable synthesis of vildagliptin.
This journal is © The Royal Society of Chemistry 2026