Identifying high-performance catalytic conditions for carbon dioxide reduction to dimethoxymethane by multivariate modelling

An efficient algorithmic workflow was developed to optimize seven process parameters of a homogeneous catalytic system with minimal experimental effort.


Introduction
In this century, our ecosystem faces severe problems such as global warming, environmental pollution and resource depletion. The negative impact of humankind as well as its responsibility in solving these problems can no longer be ignored. 1,2 Besides the necessity for future-oriented global politics, 3,4 both the scientic community and the chemical industry must provide answers to crucial questions regarding sustainable process development, alternative energies as well as recycling of waste and pollutants. [5][6][7][8][9] In this context, more efficient techniques must be developed and applied in research to reduce time, cost and resources. [10][11][12] In catalytic investigations, system optimisation is typically approached by one-factor-at-a-time (OFAT) methods, successively screening along one parameter axis. Once optimised, a parameter is kept constant for the subsequent experiments. In this univariate analysis, variables are treated as being independent of each other. Beside the vast number of experiments that must be performed, local maxima with higher performances might be missed. Consequently, algorithm-based screening and optimisation techniques have been among the fastest growing research areas in recent years. [13][14][15][16][17][18][19][20][21][22][23][24] Considering the interactions of parameters, optimised results can be achieved with minimal experimental effort. 25,26 Recently, we applied a univariate optimisation approach including several hundred catalytic reactions to improve the selective ruthenium-catalysed transformation of carbon dioxide to dimethoxymethane (DMM) reaching a turnover number (TON) of 786 (Scheme 1). 27,28 The product DMM itself is a high value feedstock for biofuels, but can also be hydrolysed yielding formaldehyde and methanol or directly employed as a formaldehyde synthon. 29,30 Beside the desired product, only methyl formate (MF) was formed with TONs of up to 1290 (Scheme 1). 27 Previously, two studies on the selective hydrogenation of CO 2 by the group of Klankermayer showed the formation of DMM and MF by using a homogeneous ruthenium catalyst 31 with TONs of 214 and 104 or a cobalt catalyst 32 with TONs of 157 and 37, respectively. Further selective reductions toward the formaldehyde oxidation state were reported utilising hydroboration, 33-37 hydrosilylation, 38,39 and frustrated Lewis pairs, 40,41 however, being mainly of academic interest due to the stoichiometric use of reducing reagents.
Inspired by previous reports, which underline the importance of a holistic view on the results obtained, 13,19 we now utilised multivariate analysis 42,43 to further optimise the investigated catalysis. Based on the large amount of experimental data and the dependency of the catalytic system on seven different parameters, we envisioned to increase its performance by modelling and predicting conditions optimally accounting for parameter interaction. For this purpose, we devised a multistep process combining elements from machine learning, optimisation and experimental design (design of experiments, DoE). Fig. 1 gives a schematic overview of the easy-to-use algorithmic workow, which was developed in this work and is applicable for any catalytic screening. Experimental data ( Fig. 1(I)) was modelled using the random forest (RF) algorithm ( Fig. 1(II)) to identify promising subspaces with high catalytic performance ( Fig. 1(III)). To better understand the origin of the exceptional activity and to extend the amount of data for further modelling, the subspace was augmented ( Fig. 1(IV)) by additional experiments ( Fig. 1(V)) that were based on experimental design. Starting from these rst experiments, an iterative workowconsisting of DoE ( Fig. 1(V) and (VI)), optimisation (relaxation; Fig. 1(VII) and (VIII)) and evaluation ( Fig. 1(IX))was applied until the nal optimum was reached ( Fig. 1(X)).

Catalytic system
The selective transformation of CO 2 to DMM was performed using the ruthenium catalyst [Ru(N-triphos Ph )(tmm)] (tmm ¼ trimethylenemethane dianion) for hydride transfer and Al(OTf) 3 to facilitate the acetalisation with methanol (Scheme 1). Benecially, the robust catalyst can be obtained on a large scale in a simple three-step procedure, which underlines its suitability for industrial and large scale applications. 27 The catalysis is mainly affected by the reaction temperature (T), partial pressure of H 2 (p H 2 ) and CO 2 (p CO 2 ), reaction time (t), the amount of catalyst (n cat ) and Lewis acid (n add ) as well as the reaction volume (V).

Experimental data analysis
Following the rationale outlined in the previous section, the optimisation started with a corpus of 49 unique settings of the seven process factors (X: X 1 , X 2 , .,  27 Each setting was run as a triple replicate, except for three settings that were realised as simple replicates, thus making up the 144 cases of the dataset. Goal of the optimisation was nding reaction conditions X * : X * 1 ; X * 2 ; .; X * 7 , that maximise TON DMM (for a theoretical introduction on this approach, see the Methods section). As process factors, six of the previously investigated parameters, namely temperature, partial pressure of H 2 and CO 2 , reaction time as well as the amount of catalyst and Lewis acid, proved suitable for the optimisation. The inuence of varying the nature of the Lewis acid was neglected here, because statistical analysis of the dataset revealed Al(OTf) 3 as the only promising candidate. Instead, the volume of the catalysis solution was considered for the multivariate analysis to account for mass transfer phenomena and gas solubility effects (concentration effects) potentially inuencing the performance of the catalyst.
Modelling and analysis of datasets based on empirical optimisation (e.g. OFAT) is complex and more difficult than estimating parametric surrogate models from well-designed data. The difficulties arise primarily from non-linearities, which simple parametric models cannot appropriately describe. Furthermore, OFAT variations oen and inadvertently lead to co-varying and thereby confounding the effects of process parameters. For instance, in the present dataset the process parameters n cat and n add were found to be highly correlated with a Pearson correlation coefficient r(n cat À n add ) ¼ 0.895, rendering the data non-informative in terms of the independent effects of n cat and n add .  Next, the problem for these datasets is nding an appropriate functional representation,f (), of the target response as a function of the process factors X i , that is TON DMM ¼ f(T,p H 2 ,p CO 2 ,t,n cat ,n add ,V) + 3, without making any prior assumptions about the analytical form of the 'true' function f().
Here, we used RF as a powerful non-linear and easy-to-use method for the empirical model building of complex datasets ( Fig. 1(II)). 44,45 The RF models describe 79% of the responses' variance on average (R 2 (TON DMM ) ¼ 0.83; R 2 (TON MF ) ¼ 0.74), which is a good result given the heterogeneous nature of the dataset. Further, tuning the RF hyperparameters using 10-fold cross validation with consecutive blocks (mtry* ¼ 2 with R 2 ¼ 0.83) revealed that the default RF hyperparameters describe the data appropriately. The effect structure of the models can be conveniently explored by plotting the RF model predictionsf (X) against the process factors X 1 , X 2 , ., X 7 . Fig. 2 shows the effects of the process parameters T, p CO 2 , n cat and n add , given the median values p H 2 ¼ 90 bar, t ¼ 18 h and V ¼ 0.5 mL, as a conditional trellis plot.
As a tree-based ensemble method, RF splits and meanaggregates the experimental space into hyperrectangles, the consequence being that the RF model surface becomes nonsmooth (Fig. 2). This property of RF is a particular strength when it comes to identifying promising domains of the experimental space and was another motivation for choosing RF as modelling technique.
Evidently, n cat and n add exert strong non-linear, step like negative effects on the process performance (TON DMM ), dividing the n cat Â n add space into domains of different performance. The pressure, p CO 2 , reveals a positive and temperature, T, a small convex effect, both, however, negligible compared to the dominant effects of n cat and n add . At this point, the erroneous impression may arise that the heuristic interpretation of the experimental data could lead to a further improvement by a trivial reduction of the amount of catalyst, which is oen reduced to minimise catalyst deactivation. However, a closer look at the experimental data shows that a reduction in the amount of catalyst without changing other parameters leads to a decrease in catalytic activity with respect to DMM. This case shows exemplarily how important multivariate analytical methods can be, because they can easily identify complex correlations of process parameters that are not detectable by a univariate analysis of the scientist. With the condition TON DMM > 400 deduced from the landscape of TON DMM over n cat Â n add (Fig. 2), a subset of 26 observations with 9 unique settings was selected from the data ( Fig. 1(III)). In this subset, the parameters T, p H 2 , p CO 2 and t were found constant at T ¼ 90 C, p H 2 ¼ 90 bar, p CO 2 ¼ 20 bar and t ¼ 18 h, thereby suggesting to conditionally optimise n cat , n add and V rst, while keeping T, p H 2 , p CO 2 and t constant for the time being.

First DoE
These 9 candidate points (identied unique settings mentioned above) were subsequently augmented by an additional number of 6 points to fully support all second order effects of n cat , n add and V (Fig. 1(IV)). The intention behind this augmentation was to render the formerly confounded effects of n cat and n add estimable as well as, assuming high complexity, to allow for interactions and non-linearities of the process parameters as a solid basis for further optimisation (Fig. 1(V)).
The 6 augmentation trials were realised as triple replicates in the lab, added to the 9 settings already available and the responses (TON DMM , TON MF ) were modelled as a function of the least square parameters with stepwise ordinary least squares (OLS; Fig. 1(VI)). Both models accurately describe the data within the replication error and explain 91% (R 2 (TON DMM ) ¼ 0.91; DF ¼ 37) and 97% (R 2 (TON MF ) ¼ 0.97; DF ¼ 35) of the responses' variance with DF denoting the degrees of freedom (number of data points minus number of estimated model parameters).
The local effects of n cat , n add and V are depicted in Fig. 3A as response surface trellis plot. Again, there is a strong negative effect for n cat along with positive effects for n add and V. Together with the positive, synergistic effect between n add and V, the effect structure suggests to decrease n cat and to increase n add and V to further maximise the catalytic performance beyond the best result of the rst DoE (Fig. 4A, entry 4).

First relaxation
To optimise towards the direction of maximal improvement, following the procedure outlined in the Methods section [eqn (5)], the experimental space was relaxed with 10% step size. The triple obtained from relaxing the design space together with the achieved experimental results are listed in Fig. 4B (Fig. 1(VII) and (VIII); see the Methods section).
The joint condition max(TON DMM ), max(TON MF ) was best met by the 20% relaxation trial (Fig. 4B, entry 2), and the factor setting n * cat ¼ 0:075 mmol, n * add ¼ 3:576 mmol and V* ¼ 0.550 mL thus became the reference point for further optimisation. The 30% relaxation trial was very poor, indicating that a local maximum had been exceeded (Fig. 4B, entry 3). The pronounced drop in activity, resulting most likely from the reduced catalyst loading, indicates a molecular deactivation pathway of the catalytically active species due to potential inhibitors, such as carbon monoxide, moisture and oxygen, which are probably present in low concentrations.
At this point, there were two alternatives to proceed: (1) create an experimental design around n * cat , n * add and V* to fully identify the topology around the relaxation point at the desired resolution (complexity). (2) Consider n * cat ¼ 0:075 mmol, n * add ¼ 3:576 mmol and V* ¼ 0.550 mL as locally optimal and switch to optimising the candidates T, p H 2 , p CO 2 and t, which had so far been kept constant.
The poor outcome of the third relaxation trial (Fig. 4B, entry 3) showed that not much was to be expected from exploring the three-dimensional environment of the 20% relaxation trial any further. Therefore, option 2 was chosen and the 20% relaxation trial became the reference point for optimising the candidates T, p H 2 , p CO 2 and t.

Second DoE
A small linear design [eqn (2)] with 5 runs in the ranges listed in Fig. 4C was created with the reference point n * cat ¼ 0:075 mmol, n * add ¼ 3:576 mmol and V* ¼ 0.550 mL, T* ¼ 90 C, p * H2 ¼ 90 bar, p * CO2 ¼ 20 bar bar, t* ¼ 18 h at the design centre as replicate (Fig. 1(V); see the Methods section). The experiments were realised in the lab as triple replicate to provide a measure of accuracy ( Fig. 1(VI)).
The measured responses (TON DMM , TON MF ) were linearly modelled as a function of the process parameters with stepwise OLS. The models explain 92% (R 2 (TON DMM ) ¼ 0.92; DF ¼ 16) and 78% (R 2 (TON MF ) ¼ 0.78; DF ¼ 18) variance of the responses thus indicating a large signal-to-noise ratio for TON DMM and to a lesser extent for TON MF . Fig. 3B shows the linear effects of the process parameters on TON DMM as trellis response surface plot. The factors T and p H 2 both have strong positive effects on TON DMM , whereas t reveals only a small positive and p CO 2 a moderate negative effect on TON DMM . Optimal conditions were found in the upper le panel and these are the conditions of the top candidate found in the design list with T, p H 2 , t at the upper and p CO 2 at the lower bound, yielding respective TONs for DMM and MF of 2610 and 2356 (Fig. 4C, entry 4).

Second relaxation
Following eqn (5), the experimental space was relaxed in 25% and 50% steps and the relaxation trials were experimentally realised in the lab ( Fig. 4D and 1(VII); see the Methods section).
Again, we saw a small improvement of the 25% relaxation trial (Fig. 4D, entry 1) compared with the best candidate from the second DoE (Fig. 4C, entry 4), whereas the 50% relaxation candidate performed comparatively poorly (Fig. 4D, entry 2; Fig. 1(VIII)). With these relaxation trials, we reached the technical limits of our setup regarding hydrogen gas pressure, and therefore, the conditions of the 25% relaxation experiment can be considered locally optimal given the constraints of technical feasibility ( Fig. 1(IX) and (X)). We would like to point out that the catalytic conditions optimised by the here presented strategy may still represent a local maximum. Modication of the catalyst and the additive might also result in further improvement, but as demonstrated in this work the identication of high-performance catalytic conditions should be a prerequisite for a design strategy of new catalysts.
A complete overview of the results obtained at each step of the optimisation project is given in Fig. 4E, overall tripling the initial TON DMM value of 786 to a nal value of 2761. As illustrated, the standard deviation (SD) of TON DMM tends to increase with increasing mean value of TON DMM (Fig. 4E), which might be a joint effect from decreasing n cat and increasing both n add and V over the course of the optimisation project. Decreasing n cat and increasing V is equivalent to reducing the concentration of the catalyst, which presumably renders the system more susceptible to random disturbances by catalyst deactivation, thereby providing an explanation for the observed increase of the standard deviation. Simple Spearman rank correlation analysis of the relationship between SD(TON DMM ), SD(TON MF ) and the process parameters X i supports this hypothesis by revealing a negative and a positive association of SD(TON DMM ) with n cat and V, respectively (ESI Table 26 †). Another interesting result in Fig. 4E refers to the independent replication error. The linear design of the second DoE includes two independent settings of the 20% reference as centre points with each point measured as triple replicate (ESI Tables 11 and 17 †). This sextet from the second DoE (Fig. 4C, entry RS1) excellently matches the 20% relaxation triple (Fig. 4B, entry 2) indicating good repeatability and reliability of the system (Fig. 4E, ESI Tables 27 and 28 †). In a similar way, the best candidate from the second DoE (Fig. 4C, entry 4) has been independently replicated when running the second relaxation sequence (Fig. 4D, entry RS2) and both replicates turned out identical within the experimental error (Fig. 4E, ESI Tables 29 and 30 †).

Technical adaption
The results of the non-biased mathematical modelling approach presented in this study revealed that a better catalytic performance is inter alia strongly correlated to a combination of lower catalyst loadings and higher reaction volumes. The heuristic interpretation of this nding indicates that the poor solubility of hydrogen gas in methanol and mass transfer might be limiting factors within our experimental setup. We anticipated that a larger reaction volume, a higher surface to volume ratio and improved mixing would lead to better mass transfer and thus designed an experimental setup accordingly to enhance the catalytic performance. With this upscale autoclave, the TON for DMM increased to 3874, while the TON for MF reached a value of 1445, resulting in a higher selectivity toward DMM (ESI Table 21 †). The result of this technical adaption shows that while the reaction is already optimised to a high degree with respect to reaction parameters, further improvements can be expected by focusing on engineering aspects of the reaction setup.

Conclusion
We demonstrated the power of multivariate optimisation for catalytic processes over the usually applied cumbersome onefactor-at-a-time method. In the homogeneously catalysed transformation of CO 2 to DMM (dimethoxymethane) and MF (methyl formate) using the ruthenium-triphos complex [Ru(Ntriphos Ph )(tmm)], the TON (turnover number) for DMM was drastically increased to 2761 (with it: TON MF 1769) by an easy-touse algorithmic workow combined with only a small number of catalytic experiments. Given the complexity of the transformation, which depends on seven parameters, conventional OFAT screening techniques would have been very costly and time-consuming, with uncertain outcome.
Starting from catalytic data using RF (random forest) for empirical model building, an experimental subspace was identied and subsequently augmented to render the effects of a rst set of three process factors estimable. Modelling and optimisation, followed by relaxation led to a sequence of relaxation trials with one candidate assumed to be locally optimal. With this candidate as reference, a linear design of the remaining four variables yielded another substantial improvement. Relaxation of the second design further enhanced the catalytic performance, thereby reaching the technical limits of the setup.
The optimised conditions were used in a specically designed experimental setup and the highest TON for DMM of 3874 (with it: TON MF 1445) was obtained, which is, to the best of our knowledge, the by far highest value reported in the investigated catalysis.

Theoretical introduction
Experimental design (design of experiments, DoE) methods can be used to study the joint effects of several parameters X: X 1 , X 2 , ., X I on response Y. 26,42,43 This can formally be written as: (1) f() denotes the true, however unknown function, linking the responses Y with the process conditions X: X 1 , X 2 , ., X I , whereas 3 is a random element taken from a normal distribution with variance s 2 , 3 $ N(0,s 2 ) to account for experimental uncertainties. Conceptually, nature evaluates in an experiment the function f() known to her only at reaction conditions X, then adds some random noise 3 and returns the experimental results Y [eqn (1)]. Under the weak assumption that f() is smooth and continuous, f() can be locally approximated as polynomial parametric surrogates, g 1,2,3 (), of increasing complexity, formally: These are linear [eqn (2)], bilinear [eqn (3)] or quadratic [eqn (4)] parametric surrogates of the true function f(). With the experimental values Y, X 1 , X 2 , ., X I available, the unknown parameters a i , a ii , a ij can be estimated from the data using ordinary least squares (OLS). 46 Given process factors X 1 , X 2 , ., X I , their ranges X i˛{ LB,UB} with LB, UB denoting the lower and upper bounds of the process factors X i and, depending on the expected complexity, the parametric form of the surrogate model, the design points X (experimental design) optimally supporting the chosen model can be calculated. However, in an early project phase it is oen unclear which factors X i and ranges should be chosen and what levels of complexity must be assumed for the domain under investigation. Therefore, DoE can benet from experimental data analysis, with the latter helping to answer the questions arising in the former.
Aer rst optimisation by an experimental design, ascending in the direction of maximal improvement can be easily achieved by increasing (relaxing) the experimental space in discrete steps and by solving a maximisation problem subject to a sequence of hypercubical constraints, that is max(g(X)) subject to LB À kDX < X < UB + kDX (5) with DX being the step size of the relaxation, here taken to be 10% of the initial factor ranges, that is DX ¼ 0.1[UB À LB], and LB, UB denoting the lower and upper bounds of the process factors X i . Varying k ¼ 1, 2, ., K leads to a sequence of relaxation trials X * 1 ; X * 2 ; .; X * K to be realised in the lab.

Additional information
The detailed description of all experiments, the performed multivariate analysis, the spectroscopic data of compounds as well as the NMR spectra of compounds and catalysis samples can be found in the ESI. † In order to improve comprehensibility, simplied names were used in some cases rather than using exact IUPAC names. All calculations were done using the statistical soware R. 47 Random forest modelling was performed with the R-package 'randomForest'. 48 Experimental designs were calculated with the D-optimal criterion of the function optFederov() in the Rpackage 'AlgDesign'. 49 Optimisation was achieved with the