Paul M. Murray *a, Fiona Bellany b, Laure Benhamou b, Dejan-Krešimir Bučar b, Alethea B. Tabor b and Tom D. Sheppard *b
a Paul Murray Catalysis Consulting Ltd, 67 Hudson Close, Yate, BS37 4NP, UK. E-mail: paul.murray@catalysisconsulting.co.uk
b Department of Chemistry, University College London, 20 Gordon St, London, WC1H 0AJ, UK. E-mail: tom.sheppard@ucl.ac.uk; Tel: +44 (0)20 7679 2467
First published on 24th December 2015
This article outlines the benefits of using ‘Design of Experiments’ (DoE) optimisation during the development of new synthetic methodology. A particularly important factor in the development of new chemical reactions is the choice of solvent, which can often drastically alter the efficiency and selectivity of a process. Whilst solvent optimisation is usually done in a non-systematic way based upon a chemist's intuition and previous laboratory experience, we illustrate how optimisation of the solvent for a reaction can be carried out by using a ‘map of solvent space’ in a DoE optimisation. A new solvent map has been developed specifically for optimisation of new chemical reactions using principal component analysis (PCA) incorporating 136 solvents with a wide range of properties. The new solvent map has been used to identify safer alternatives to toxic/hazardous solvents, and also in the optimisation of an SNAr reaction.
The uptake of novel synthetic methodology by researchers in industry and in other scientific fields is much more likely if the chemistry can be demonstrated to be ‘user friendly’. Important factors which can facilitate uptake of a particular reaction include: readily available reagents/catalysts; a wide substrate scope; good functional group compatibility; mild conditions; efficiency; sustainability and a good safety profile. However, such factors are rarely taken into account during the development of new chemistry. As noted by industrial researchers,1 many synthetic methodology papers fail to adequately explore the substrate scope of a new reaction and instead focus on reactions of largely unfunctionalised lipophilic compounds. Furthermore, despite the fact that well established statistical methods for reaction optimisation are widely used in industry,2,3 the uptake of these methods has been very low in academic chemistry.4,5 Often, the ‘optimisation’ process proceeds entirely via a trial and error approach involving the variation of one factor at a time (e.g. solvent, temperature, catalyst, concentration, etc.). This type of process can lead to researchers failing to identify ‘optimal’ conditions for a particular process if interactions between two or more factors are present.6 Thus, an attempt to optimise even two factors via a ‘one variable at a time’ (OVAT) approach can fail to find the optimum conditions if interactions between the factors are present (Fig. 1). For example, initial optimisation of an imaginary reaction via variation of the number of equivalents of reagent and the temperature involves variation of the first variable whilst keeping T = 40. This suggests that 2 equivalents of reagent give the ‘best yield’. Subsequent variation of the temperature whilst keeping eq. = 2 suggests that the optimum conditions are T = 55, eq. = 2. 
However, due to interaction between the factors, this fails to identify the true optimum conditions where a higher yield of product can be obtained using smaller quantities of reagent (T = 105, eq. = 1.25). This is a consequence of the fact that the full reaction space has not been explored and at no point was the combination of high T/low eq. considered.
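The failure mode described above can be reproduced numerically. The following Python sketch uses an invented quadratic yield surface with a negative temperature/equivalents interaction; the function and its coefficients are assumptions chosen purely for illustration and are not a fit to Fig. 1, so the OVAT end point differs from the values quoted above. It compares a single OVAT pass starting at T = 40 with a search over the whole grid.

```python
# Hypothetical yield surface (NOT data from the article): a concave quadratic
# whose true optimum sits at T = 105, eq. = 1.25, with a T x eq. interaction.
def hypothetical_yield(temp, equiv):
    x = temp - 105.0
    y = equiv - 1.25
    return 90.0 - 0.01 * x**2 - 200.0 * y**2 - 2.4 * x * y

temps = list(range(40, 125, 5))              # 40, 45, ..., 120
equivs = [1.0 + 0.25 * i for i in range(9)]  # 1.0, 1.25, ..., 3.0

# OVAT: vary equivalents at fixed T = 40, then vary T at the "best" equivalents.
ovat_equiv = max(equivs, key=lambda e: hypothetical_yield(40, e))
ovat_temp = max(temps, key=lambda t: hypothetical_yield(t, ovat_equiv))
ovat_yield = hypothetical_yield(ovat_temp, ovat_equiv)

# Exploring the whole reaction space instead of one line at a time.
grid_temp, grid_equiv = max(
    ((t, e) for t in temps for e in equivs),
    key=lambda point: hypothetical_yield(*point),
)
grid_yield = hypothetical_yield(grid_temp, grid_equiv)

print(f"OVAT: T={ovat_temp}, eq.={ovat_equiv}, yield={ovat_yield:.1f}")
print(f"Grid: T={grid_temp}, eq.={grid_equiv}, yield={grid_yield:.1f}")
```

On this invented surface the OVAT pass stalls at low temperature and a larger excess of reagent, because it never visits the high T/low eq. region. The exhaustive grid is used here only to locate the optimum; it is not a practical experimental plan.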
The technique of ‘Design of Experiments’ is a statistical approach to reaction optimisation that allows the variation of multiple factors simultaneously in order to screen ‘reaction space’ for a particular process. Importantly, this enables the evaluation of a large number of reaction parameters in a relatively small number of experiments. Whilst this technique is routinely applied by process chemists in a wide range of industries, and also by academics working in engineering disciplines,7 it is rarely used in academic chemistry. This is in spite of the fact that optimisation of particular reactions is often an extremely time-consuming part of any research project focused on the development of new synthetic methodology. A major reason for this is the lack of expertise in the use of this technique in academia which leads to a significant ‘energy barrier’. A relatively common exception is the use of DoE for reaction optimisation in projects carried out in collaboration with industrial partners.5
The pitfall shown in Fig. 1 can readily be avoided using a true DoE approach in which each vertex of reaction space is explored. In combination with a ‘centre point’ experiment, this is then used to evaluate the full multi-dimensional reaction space in order to determine where the highest yield can be obtained (Fig. 2). This provides a great deal more information about the behaviour of the reaction from a similar (or potentially smaller) number of experiments than the traditional approach. The DoE study uses standard statistical techniques to model the effect of each variable (and potential interactions between variables) on the reaction outcome. A further benefit of the statistical approach is that it can provide a built-in ‘cross-check’ of each of the individual screening reactions, enabling any anomalous results to be readily identified. In the traditional OVAT approach, repetition of each experiment is advisable to ensure reproducibility, or the entire ‘optimisation’ could be led astray by a single anomalous result.
Fig. 2 A DoE study covering the entire reaction space will not miss the optimum conditions provided it lies within the space covered.
By switching to a DoE approach, however, much more information could be obtained about the reaction at an early stage of the project. Optimisation of the initial example via DoE should provide greater understanding of the factors underpinning the reaction from a comparable number of experiments to the traditional approach. Using a resolution IV DoE design, which can identify all important factors and determine whether interactions between factors are present or not, up to eight factors can be explored in a total of 19 experiments (including the required centre points). This also provides a good understanding of any interactions between factors that may be present. The scope of these optimised conditions could then be explored with a selection of substrates as in the traditional approach. There is no reason to expect that these conditions will be suitable for all substrates, however, especially those that contain potentially reactive functionality.
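The run count quoted above (eight factors in 19 experiments) corresponds to a 16-run two-level fractional factorial plus three centre points. A minimal Python sketch of such a design follows. The generators used (E = BCD, F = ACD, G = ABC, H = ABD) are an assumption: they are a common textbook choice for this design size, and the article does not state which generators were used. The factor letters A to H are placeholders.

```python
from itertools import combinations, product

FACTORS = "ABCDEFGH"
# Assumed generators (a common textbook choice for a 16-run, 8-factor design);
# the article does not state which generators were used.
GENERATORS = {"E": "BCD", "F": "ACD", "G": "ABC", "H": "ABD"}

def build_run(base_levels):
    """Extend one run of the A-D full factorial with the generated factors E-H."""
    run = dict(zip("ABCD", base_levels))
    for name, word in GENERATORS.items():
        level = 1
        for letter in word:
            level *= run[letter]
        run[name] = level
    return run

factorial_runs = [build_run(levels) for levels in product((-1, 1), repeat=4)]
centre_points = [dict.fromkeys(FACTORS, 0) for _ in range(3)]
design = factorial_runs + centre_points

def main_effects_clear_of_pairs(runs):
    """True if no main effect is correlated with any two-factor interaction."""
    for factor in FACTORS:
        for first, second in combinations(FACTORS, 2):
            if sum(r[factor] * r[first] * r[second] for r in runs) != 0:
                return False
    return True

print(len(factorial_runs), len(design))             # 16 19
print(main_effects_clear_of_pairs(factorial_runs))  # True
```

The check confirms the defining property of a resolution IV design: every main effect is estimated independently of all two-factor interactions, although the interactions themselves remain aliased with one another.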
Further benefit can be obtained, therefore, by taking one of the ‘difficult’ substrates, which gives a low yield under the standard conditions, and using a second DoE process to optimise the reaction. As considerable information has already been obtained from the original optimisation, it is likely that only a few carefully chosen factors will need to be varied in order to provide improvements in the yield. This additional stage of optimisation could serve to greatly increase the potential value of the reaction. By demonstrating that the new methodology can be applied to ‘difficult’ substrates through modification of the reaction conditions, the authors will provide a much better understanding of the versatility of the new reaction that has been developed. Potential end users of the chemistry will also have a good idea how to adapt the reaction conditions to make it work for the substrates they wish to employ.
This approach has been adopted by many chemists in industry, but the required PCA solvent maps are not readily available in the public domain. Industrial users typically have their own proprietary data, and solvent maps that have been published are either not targeted towards reaction optimisation (e.g. crystallisation)9 or are overly complex.10 Different solvent properties are important for different reactions, so it is important that a relatively diverse set of parameters is included. Important considerations include how solvation of compounds, reagents and catalysts is achieved, how the solvent hydrogen bonds with molecules, and how it interacts with solid materials. In this article we report a new PCA solvent map specifically designed for use in new chemistry development, and outline how this PCA map can be used for identifying alternatives to toxic/undesirable solvents and applied in combination with DoE for the optimisation of new synthetic methodology. In industry, the specific properties used in each solvent map differ from company to company, but the terms used in the map below have been found to be widely applicable in many industrial applications of PCA in DoE.
The solvent emerged as the most significant parameter, with the originally chosen solvent (MeCN) promoting the formation of both 4a and 5a. In contrast, the formation of MCR product 4a was strongly promoted in iPrOH, which also disfavoured the formation of the lactone 5a. In addition, a higher loading of Brønsted acid catalyst was shown to promote the formation of lactone 5a, whilst having negligible effect on the formation of the desired product. Thus, by switching the solvent to iPrOH and lowering the catalyst loading the selectivity of the reaction could be improved considerably. The findings from the DoE study subsequently enabled us to identify suitable reaction conditions for carrying out the MCR as a four-component reaction in which the oxazolidinone intermediate was generated in situ from reaction of an aminoalcohol and a carbonyl compound (Scheme 2a). In a second DoE study, alternative reaction conditions (DMSO, 1 eq. TsOH) were identified to give the lactone product in the absence of the carboxylic acid component (Scheme 2b).
Key considerations for the choice of solvents to be included were:
1. Availability from major chemical suppliers
2. Cost
3. Boiling point/melting point
4. Diversity of properties
5. Sustainability/safety issues
We also aimed to include all solvents commonly/traditionally used in academic laboratories, even those whose use is highly undesirable (e.g. CCl4, 1,2-dichloroethane) so that suitable alternatives can readily be identified from the solvent map.
A set of 136 solvents was selected to cover a wide range of different solvent properties (Fig. 3). Approximately twenty physical (e.g. melting point, boiling point) and calculated (e.g. Hansen solubility parameters)12 properties of these solvents were then used to construct a PCA map (Fig. 4). The dataset was analysed using Umetrics SIMCA software13 to produce a principal component model. Approximately 70% of the solvent properties are modelled effectively using three principal components and 80% are modelled by four principal components (Fig. 5). Evaluation of the PCA map indicates that the first principal component correlates to a large extent with solvent polarity, with non-polar solvents having high PCA1 values and polar solvents grouped towards the lower end of the scale. Similarly, PCA2 approximately correlates with polarisability and PCA3 with hydrogen bonding properties. As can be seen from the overview of the solvent map shown in Fig. 4, there is considerably more variation in solvent properties in terms of the first two principal components, with a wide distribution across solvent space (−9 < PC1 < +8; −5 < PC2 < +5). There is much less variation in the third principal component, with the vast majority of solvents lying in the range −3 < PC3 < +2. In both plots, there are some notable outliers including water (136), perfluoromethylcyclohexane (117), perfluorohexane (127), trifluoroacetic acid (133) and hexafluorobenzene (84). In order to apply this type of PCA map in a DoE study, a simplistic model is used in which each principal component is modelled as a separate factor in the design. The exact PC values of the solvents are not used in the design, just their approximate position on the map. Solvents are selected to represent a high (+1) and low (−1) value of each principal component; an additional ‘centre point’ solvent is also chosen which approximately occupies the middle of the solvent space being investigated (0).
Thus, to explore the full range of solvent space in three dimensions, eight solvents at the vertices of a cube are chosen, along with a single centre point (Fig. 6). A basic investigation of the effect of solvent on a reaction can be carried out effectively using only two principal components, depending on which factors (polarity, polarisability or hydrogen bonding interactions) are the most important for the reaction being studied. In this case, only five solvents are used, one from each ‘corner’ of solvent space and a centre point. In either case, the use of the solvent map to select the solvents for the DoE study ensures that they have diverse properties across the 2/3 principal components.
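The two-dimensional selection described above can be sketched in Python as a nearest-neighbour search: for each corner of the region of interest, and for its centre, pick the solvent whose scores lie closest. This is only one possible way to automate the choice and is not presented as the procedure used here. The solvent labels (S1 to S6) and their scores are invented placeholders, and the function name is hypothetical. In practice the shortlist would also be filtered by boiling point, solubility and safety.

```python
import math

# Placeholder labels and PC scores, invented for illustration only; real
# scores would come from the published solvent map data.
scores = {
    "S1": (-4.0, -3.0),
    "S2": (4.0, -2.5),
    "S3": (-3.5, 3.0),
    "S4": (4.5, 3.5),
    "S5": (0.5, 0.0),
    "S6": (2.0, 1.0),
}

def pick_design_solvents(scores):
    """Map each coded design point (+/-1 corners, 0 centre) to the nearest solvent."""
    pc1 = [point[0] for point in scores.values()]
    pc2 = [point[1] for point in scores.values()]
    low = (min(pc1), min(pc2))
    high = (max(pc1), max(pc2))
    targets = {
        (-1, -1): (low[0], low[1]),
        (+1, -1): (high[0], low[1]),
        (-1, +1): (low[0], high[1]),
        (+1, +1): (high[0], high[1]),
        (0, 0): ((low[0] + high[0]) / 2, (low[1] + high[1]) / 2),
    }
    return {
        coded: min(scores, key=lambda name: math.dist(scores[name], target))
        for coded, target in targets.items()
    }

chosen = pick_design_solvents(scores)
for coded, name in chosen.items():
    print(coded, name)
```

Only the coded position (+1, −1 or 0) enters the design, which matches the simplification that the exact PC values are not used. Note that this simple search could return the same solvent for two design points if the map is sparsely populated in one region.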
Fig. 4 The PCA solvent map; for full details see the ESI.†
Fig. 6 The use of solvent space in a DoE study requires the identification of a solvent approximately located at each vertex of a cube spanning the area of solvent space to be investigated.
Suitable solvents on the PCA map which can be used as the vertices for a full exploration of solvent space, or as corner points for a two-dimensional study of the first two principal components, are shown in Table 2. Alternatively, only a subsection of solvent space can be explored (e.g. polar aprotic solvents or non-polar solvents). This can be achieved by selecting solvents at the vertices of a distorted cuboid (or corners of a distorted rectangle) covering the relevant area of solvent space.
Corner | Vertex | Solvent |
---|---|---|
0 | 0 | 1,4-Dioxane (6); 2-ethyl-1-butanol (15); 4-methyltetrahydropyran (30); acetic anhydride (32); methyl isobutyrate (100); toluene (131); trimethyl orthoformate (134); 3-pentanone (27); butanenitrile (41); butyl acetate (42); ethyl butanoate (68); n-propyl acetate (113) |
1 | 1 | 2-Butanol (13); 2-methyl-1-butanol (18); 2-methyl-1-pentanol (17); 2-methylpropan-1-ol (19); 2-methylpropan-2-ol (20); 2-propanol (23); 3-pentanol (26); 1-pentanol (116); 1-propanol (118); propionitrile (119) |
1 | 2 | 1,3-Propanediol (5); 2,2,2-trifluoroethanol (10); acetic acid (31); ethylene glycol (73); formic acid (77); methanol (93); trifluoroacetic acid (133); water (136) |
2 | 3 | 1,1,3,3-Tetramethylurea (1); 1,3-dimethylimidazolidin-2-one (4); 1-ethyl-2-pyrrolidinone (8); 1-methylimidazole (9); dimethylsulfoxide (58); hexamethylphosphoramide (85); N,N′-dimethylpropyleneurea (104); N,N-dimethylacetamide (105); N-methylpyrrolidine-2-one (112); pyridine (122) |
2 | 4 | Benzyl alcohol (40); ethylene carbonate (72); formamide (76); glycerol (78); glycerol carbonate (79); glycerol-1-monobutylether (81); methanesulfonic acid (92); nitrobenzene (109); propylene carbonate (120); sulfolane (125) |
3 | 5 | 1,2-Dimethoxyethane (3); 2-methyltetrahydrofuran (21); diethyl ether (53); diethylamine (54); di-n-propylether (60); ethyl n-butyl ether (71); methyl-t-butyl ether (102); n-butyl methyl ether (107); trimethylamine (132) |
3 | 6 | Heptane (83); hexane (86); methylcyclohexane (101); pentane (115); tert-butyl acetate (126) |
4 | 7 | Dipentene (limonene) (61); di-tert-butyl ketone (64); ethyl amyl ketone (67); dipentyl ether (62) |
4 | 8 | Benzene (37); benzotrifluoride (39); carbon disulphide (44); carbon tetrachloride (45); chlorobenzene (46); cis-decalin (48); decane (51); fluorobenzene (75); hexafluorobenzene (84); mesitylene (91); m-xylene (103); o-xylene (114); perfluoromethylcyclohexane (117); p-xylene (121); tetradecafluorohexane (127); tetralin (130) |
A simple application of the solvent map is to identify alternative solvents for a reaction of interest. This can be particularly useful for substituting highly toxic or otherwise undesirable solvents. In Table 3, a list of potential substitutes for a selection of hazardous solvents is provided. Thus, carbon tetrachloride, which is still often used in radical reactions despite being heavily restricted as an ozone-depleting chemical, can potentially be substituted with trifluorotoluene. Similarly, trifluorotoluene or fluorobenzene can also be used as alternatives to the toxic solvents chloroform and 1,2-dichloroethane, the latter often being used in a variety of metal-catalysed transformations as a higher boiling point alternative to dichloromethane. A number of more attractive alternatives to dichloromethane itself can also be identified from the map including 1,4-dioxane, 4-methyltetrahydropyran and dimethyl carbonate, the latter having very good environmental credentials.16 A selection of alternatives to benzene and to dipolar aprotic solvents such as DMF, DMSO and HMPA are also provided, though it is acknowledged that many of these alternatives are already widely used in this context.
Solvent | Possible alternatives |
---|---|
CH2Cl2 (52) | 1,4-Dioxane (6); dimethyl carbonate (56); 4-methyltetrahydropyran (30) |
CHCl3 (47) | Fluorobenzene (75); trifluorotoluene (39) |
Cl(CH2)2Cl (2) | Fluorobenzene (75); trifluorotoluene (39) |
CCl4 (45) | Trifluorotoluene (39); decalin (48); p-xylene (121) |
Benzene (37) | m-Xylene (103); o-xylene (114); toluene (131); fluorobenzene (75); dipropylene glycol dimethyl ether (63) |
DMSO (58) | 1-Methylimidazole (9); 4-formylmorpholine (28); N-methylpyrrolidinone (112); 1,3-dimethylimidazolidin-2-one (4); ethylene carbonate (72) |
DMF (106) | N,N-Dimethylacetamide (105); pyridine (122); tetramethylurea (1); N-methylpyrrolidinone (112); 1-methylimidazole (9) |
HMPA (85) | DMPU (104); 1-ethyl-2-pyrrolidinone (8); 1,3-dimethylimidazolidin-2-one (4); N-methylpyrrolidinone (112); quinoline (124) |
In order to test the use of the solvent map for substituting chlorinated solvents, we explored alternative solvents for some recently developed gold-catalysed reactions (Scheme 3). The gold-catalysed cyclisation of alkynyl boronic acid 6 to boron enolate 7, originally developed in dichloromethane,17 was shown to take place equally effectively in dimethyl carbonate, a solvent with a considerably better safety profile (a). Similarly, the gold-catalysed hydroamination of cyclohexadiene 8, originally reported in 1,2-dichloroethane,18 could be carried out effectively in trifluorotoluene or fluorobenzene, the latter proving to be a much better solvent for this particular reaction (b). In both of the solvent substitution reactions shown in Scheme 3, no significant lowering of the reaction yield was observed on replacing the undesirable solvent with a safer/greener alternative. This suggests that these alternatives to chlorinated solvents should be routinely screened by researchers during reaction development, as this could significantly reduce the use of chlorinated solvents by their avoidance at an early stage of the process.
Scheme 3 Replacement of toxic/hazardous chlorinated solvents with safer alternatives; a isolated yield; b 1H NMR yield using an internal standard.
The selection of factors and ranges for a DoE study is of great importance, as poor choices can limit the utility of the exercise. Thus, it is essential to select wide-enough ranges for each factor which enable the design to explore a sufficiently large area of ‘reaction space’. However, for useful information to be gained, the reaction should still work (i.e. give a non-zero yield of the product) at the extreme edges of the design space. For an initial DoE study,21 we elected to examine the effect of varying the quantity of 12, DIPEA (1–5 eq.) and sodium iodide (0.1–2.0 eq.), alongside the reaction concentration (2–5 mL of DMF) and the temperature (120–200 °C). This was carried out via a total of 16 experiments plus three centre points to enable the effect of the factors to be determined. This enabled the factors favouring the formation of each of the different reaction products to be elucidated. The centre points are three reactions performed under identical conditions at the centre of the design space (i.e. the mid-point of all of the factor ranges) which provide an indication of the reproducibility of the reaction. Performing the reaction under identical conditions should of course give an identical outcome, but there are inevitably some errors in the experimental/analytical procedures which can lead to variation of the yields. It is therefore important to plan the design carefully to minimise any potential errors. For example, preparing a solution of a reagent of known concentration and dispensing appropriate volumes of this solution into each experiment will generally provide much greater accuracy than weighing out reagents for each reaction separately. In our case, stock solutions of 11 and 12 were prepared in order to minimise any variation in the amount of limiting reagent present in each reaction. Similarly, it is important to identify a reproducible method for measuring the yield. 
Early experiments demonstrated that the aqueous work-up of this reaction led to considerable variation in yield of the products 13a–13d, so in the DoE study all reactions were concentrated directly under vacuum prior to analysis of the crude residue by NMR using an internal standard. For the three centre points this gave fairly consistent yields, as can be seen in Fig. 7 which illustrates the much smaller variation in the replicate experiments (blue) in comparison to the other reactions (green) in terms of the yield of 13a observed.
Analysis of the results provides details of which factors affect the yield of the desired product 13a. These are illustrated by the coefficient plot shown in Fig. 8. Each green bar represents a significant factor in the reaction, illustrating the average effect on the yield of 13a on increasing the factor from the mid-point of the design to the highest value in the design. Thus, the most significant factor in the yield of 13a is the temperature, with the higher temperature (200 °C) giving on average a 2.5% increase in the yield. Notably, increasing the amount of NaI to 2 eq. leads, on average, to a 2% decrease in the yield of 13a, whilst increasing the amount of DMF leads to a 1.7% decrease.
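The meaning of a coefficient in such a plot can be illustrated with a short Python sketch. For a two-level design in coded units, the coefficient of a factor is half the difference between the mean response at its high and low settings, i.e. the average change on moving from the mid-point to the high value. The four yields below are invented placeholders chosen only so that the arithmetic is easy to follow; they are not results from this study, and the function name is hypothetical.

```python
# Coded two-level design: -1 = low setting, +1 = high setting of each factor.
# The yields are invented placeholders, NOT results from the article.
runs = [
    {"temp": -1, "NaI": -1, "yield": 10.0},
    {"temp": +1, "NaI": -1, "yield": 15.0},
    {"temp": -1, "NaI": +1, "yield": 6.0},
    {"temp": +1, "NaI": +1, "yield": 11.0},
]

def coefficient(runs, factor):
    """Average change in yield on going from the mid-point to the high level."""
    high = [r["yield"] for r in runs if r[factor] == 1]
    low = [r["yield"] for r in runs if r[factor] == -1]
    return (sum(high) / len(high) - sum(low) / len(low)) / 2

print(coefficient(runs, "temp"))  # 2.5
print(coefficient(runs, "NaI"))   # -2.0
```

A positive coefficient means that raising the factor increases the yield on average, and a negative one means that it lowers it. Dedicated DoE software additionally estimates confidence intervals, which this sketch omits.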
The factors affecting the yields of each of the products 13a–13d are shown in Table 4. As expected, the NaI additive was not beneficial for the formation of the desired product 13a, and this was therefore omitted from subsequent reactions. Interestingly, the formation of the regioisomeric SNAr products 13b–13c is favoured by increasing the quantity of base used in the reaction, whereas the formation of the desired product 13a is largely unaffected by the amount of base. Furthermore, there is an interaction between the quantity of base used and the temperature: increasing the quantity of base leads to a much larger quantity of the side products 13b and 13c at higher temperature. It was therefore concluded that removing the DIPEA entirely in future reactions would be beneficial both in terms of improving the selectivity of the reaction and facilitating purification. During the course of this initial DoE study, a pure sample of the byproduct 13d was isolated and the structure confirmed. This byproduct is evidently formed through thermal decomposition of the solvent to generate dimethylamine, which then undergoes an SNAr reaction with the chloropyrimidine 11. A switch in solvent was therefore necessary to avoid the formation of 13d, and we elected to make use of our newly developed PCA solvent map to evaluate an area of solvent space for this transformation, alongside temperature and concentration as the other important variables. We chose to incorporate solvent as a two-dimensional parameter in the design to provide a useful preliminary insight into the effect of the main two solvent principal components on the reaction (t1 and t2). Although these first two principal components only accurately model 55% of the original solvent properties, this is sufficient to provide an insight into which areas of solvent space are suitable for a particular reaction, and a further more detailed solvent optimisation can then be carried out subsequently if required.
Solvents were selected approximately in each quadrant of the map, taking into account the temperature range to be studied, their compatibility with microwave heating and their ability to solubilise the reagents. Dimethylacetamide (105), 1-butanol (7), cyclopentyl methyl ether (50) and dipropyl ether (60) were selected as ‘corner’ points, with propionitrile (119) as a centre point (Fig. 9).
Product | Favoured by | Disfavoured by |
---|---|---|
13a | Increasing temp. | Increasing NaI |
Increasing solvent vol. | ||
13b | Increasing base | |
Increasing temp. | ||
13c | Increasing base | |
Increasing temp. | ||
13d | Increasing temp. | Increasing NaI |
Increasing base |
The DoE study also included temperature (100–140 °C) and concentration (0.1–0.5 M) as factors, and this required a total of eight experiments plus three centre points to give a resolution IV design in which individual factors are well resolved but interactions between factors are confounded.
As expected, the solvolysis product 13d was not observed in most solvents, although it was still formed in one of the high temperature reactions carried out in DMA. An excellent model was obtained using multiple linear regression (MLR) for predicting the yield of the desired product 13a (Fig. 10). The factors affecting the yield of 13a are shown in the coefficient plot (Fig. 11). Temperature and concentration are the most significant factors, with higher temperature and higher concentration leading to an improvement in yield as might be expected. The interactions between factors are not fully resolved using a resolution IV design, so care must be taken in interpreting the results. The solvent dependence is somewhat complicated, with the two principal components potentially showing a significant interaction (though as the interactions are not resolved this interaction between t1 and t2 is confounded with the interaction between temperature and concentration; similarly the interaction between t1 and concentration is confounded with the interaction between t2 and temperature). However, subsequent preparative experiments confirmed that the solvent was an important factor.22 Thus, whilst neither principal component is very important as a factor in its own right, there is a strong interaction with the most favourable areas of solvent space being either high t1/low t2 or low t1/high t2. This interaction is illustrated by the plot in Fig. 12. This suggests that either DMA or Pr2O is preferable for the reaction, with the latter being slightly more effective – a somewhat unusual choice of solvent for an SNAr reaction! Furthermore, it was observed that the product 13a (and unreacted amine 12) often precipitated out of Pr2O at the end of the reaction, facilitating purification of the product.
Satisfyingly, by carrying out the reaction at high concentration/temperature in Pr2O, a 57% isolated yield of product 13a was obtained (Scheme 5), along with small amounts of products 13b and 13c (and 13% recovered starting material). This gave material in sufficient quantity and purity for the project, so no further optimisation was carried out from this point onwards.
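The confounding pattern described above for the eight-run, four-factor design can be checked with a few lines of Python. The sketch assumes that the fourth factor (concentration) is generated as the product of the other three coded factors. This generator is an assumption, since the design matrix is not given here, but it reproduces the aliasing stated in the text.

```python
from itertools import combinations, product

factors = ["t1", "t2", "temp", "conc"]

# Half-fraction of a four-factor design: conc is set by the (assumed)
# generator conc = t1 * t2 * temp, giving 8 runs instead of 16.
runs = [
    {"t1": t1, "t2": t2, "temp": temp, "conc": t1 * t2 * temp}
    for t1, t2, temp in product((-1, 1), repeat=3)
]

def interaction_column(pair):
    """Coded column for a two-factor interaction across all runs."""
    return tuple(r[pair[0]] * r[pair[1]] for r in runs)

# Interactions whose columns are identical cannot be told apart by the design.
groups = {}
for pair in combinations(factors, 2):
    groups.setdefault(interaction_column(pair), []).append(pair)
alias_groups = list(groups.values())

for group in alias_groups:
    print(" = ".join(f"{a} x {b}" for a, b in group))
```

Each printed line lists interactions that share a single column in the design, so a significant effect on that column could be due to either term. Follow-up experiments, such as the preparative reactions mentioned above, are needed to decide between them.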
Fig. 11 Coefficient plot generated from the MODDE experimental design showing the factors influencing the yield of 13a.
Footnotes
† Electronic supplementary information (ESI) available: Principal component values for the new PCA solvent map, experimental procedures, spectroscopic data and 1H and 13C NMR spectra. CCDC 1423524 and 1423525. For ESI and crystallographic data in CIF or other electronic format see DOI: 10.1039/C5OB01892G
‡ Arguably, most reactions that are developed use one of the following ten common laboratory solvents: Et2O, THF, MeCN, DMF, DMSO, EtOH, MeOH, CH2Cl2, PhMe and acetone.
§ CCDC 1423524 and 1423525 contain the supplementary crystallographic data for compounds 13b and 13c.
This journal is © The Royal Society of Chemistry 2016