Max
Siebert
,
Gerhard
Krennrich
,
Max
Seibicke
,
Alexander F.
Siegle
and
Oliver
Trapp
*
Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377 München, Germany. E-mail: oliver.trapp@cup.uni-muenchen.de
First published on 24th October 2019
In times of a warming climate due to excessive carbon dioxide production, catalytic conversion of carbon dioxide to formaldehyde is not only a process of great industrial interest, but it could also serve as a means for meeting our climate goals. Currently, formaldehyde is produced in an energetically unfavourable and atom-inefficient process. A much needed solution remains academically challenging. Here we present an algorithmic workflow to improve the ruthenium-catalysed transformation of carbon dioxide to the formaldehyde derivative dimethoxymethane. Catalytic processes are typically optimised by comprehensive screening of catalysts, substrates, reaction parameters and additives to enhance activity and selectivity. The common problem of the multidimensionality of the parameter space, leading to only incremental improvement in laborious physical investigations, was overcome by combining elements from machine learning, optimisation and experimental design, tripling the turnover number of 786 to 2761. The optimised conditions were then used in a new reaction setup tailored to the process parameters leading to a turnover number of 3874, exceeding by far those of known processes.
In catalytic investigations, system optimisation is typically approached by one-factor-at-a-time (OFAT) methods, successively screening along one parameter axis. Once optimised, a parameter is kept constant for the subsequent experiments. In this univariate analysis, variables are treated as being independent of each other. Beside the vast number of experiments that must be performed, local maxima with higher performances might be missed. Consequently, algorithm-based screening and optimisation techniques have been among the fastest growing research areas in recent years.13–24 Considering the interactions of parameters, optimised results can be achieved with minimal experimental effort.25,26
Recently, we applied a univariate optimisation approach including several hundred catalytic reactions to improve the selective ruthenium-catalysed transformation of carbon dioxide to dimethoxymethane (DMM) reaching a turnover number (TON) of 786 (Scheme 1).27,28 The product DMM itself is a high value feedstock for biofuels, but can also be hydrolysed yielding formaldehyde and methanol or directly employed as a formaldehyde synthon.29,30 Beside the desired product, only methyl formate (MF) was formed with TONs of up to 1290 (Scheme 1).27 Previously, two studies on the selective hydrogenation of CO2 by the group of Klankermayer showed the formation of DMM and MF by using a homogeneous ruthenium catalyst31 with TONs of 214 and 104 or a cobalt catalyst32 with TONs of 157 and 37, respectively. Further selective reductions toward the formaldehyde oxidation state were reported utilising hydroboration,33–37 hydrosilylation,38,39 and frustrated Lewis pairs,40,41 however, being mainly of academic interest due to the stoichiometric use of reducing reagents.
Scheme 1 Reductive transformation of CO2 towards the formaldehyde oxidation level yielding methyl formate (MF) and dimethoxymethane (DMM). |
Inspired by previous reports, which underline the importance of a holistic view on the results obtained,13,19 we now utilised multivariate analysis42,43 to further optimise the investigated catalysis. Based on the large amount of experimental data and the dependency of the catalytic system on seven different parameters, we envisioned to increase its performance by modelling and predicting conditions optimally accounting for parameter interaction. For this purpose, we devised a multi-step process combining elements from machine learning, optimisation and experimental design (design of experiments, DoE). Fig. 1 gives a schematic overview of the easy-to-use algorithmic workflow, which was developed in this work and is applicable for any catalytic screening. Experimental data (Fig. 1(I)) was modelled using the random forest (RF) algorithm (Fig. 1(II)) to identify promising subspaces with high catalytic performance (Fig. 1(III)). To better understand the origin of the exceptional activity and to extend the amount of data for further modelling, the subspace was augmented (Fig. 1(IV)) by additional experiments (Fig. 1(V)) that were based on experimental design. Starting from these first experiments, an iterative workflow – consisting of DoE (Fig. 1(V) and (VI)), optimisation (relaxation; Fig. 1(VII) and (VIII)) and evaluation (Fig. 1(IX)) – was applied until the final optimum was reached (Fig. 1(X)).
Fig. 1 Outline of the algorithm-based workflow combining experimental data analysis, DoE and optimisation. |
Entry | Variable | Mean | Median | Min | Max |
---|---|---|---|---|---|
1 | T (°C) | 89.2 | 90 | 20 | 120 |
2 | p H2 (bar) | 82.1 | 90 | 40 | 100 |
3 | p CO2 (bar) | 17.9 | 20 | 5 | 40 |
4 | t (h) | 26.5 | 18 | 1 | 168 |
5 | n cat (μmol) | 1.3 | 1.5 | 0 | 3 |
6 | n add (μmol) | 5.4 | 6.25 | 0 | 12.5 |
7 | V (mL) | 0.5 | 0.5 | 0.25 | 0.5 |
8 | TONDMM | 275 | 263 | 0 | 906 |
9 | TONMF | 145 | 71 | 0 | 1377 |
Modelling and analysis of datasets based on empirical optimisation (e.g. OFAT) is complex and more difficult than estimating parametric surrogate models from well-designed data. The difficulties arise primarily from non-linearities, which simple parametric models cannot appropriately describe. Furthermore, OFAT variations often and inadvertently lead to co-varying and thereby confounding the effects of process parameters. For instance, in the present dataset the process parameters ncat and nadd were found to be highly correlated with a Pearson correlation coefficient r(ncat − nadd) = 0.895, rendering the data non-informative in terms of the independent effects of ncat and nadd.
Next, the problem for these datasets is finding an appropriate functional representation, (), of the target response as a function of the process factors Xi, that is TONDMM = f(T,pH2,pCO2,t,ncat,nadd,V) + ε, without making any prior assumptions about the analytical form of the ‘true’ function f(). Here, we used RF as a powerful non-linear and easy-to-use method for the empirical model building of complex datasets (Fig. 1(II)).44,45
The RF models describe 79% of the responses' variance on average (R2(TONDMM) = 0.83; R2(TONMF) = 0.74), which is a good result given the heterogeneous nature of the dataset. Further, tuning the RF hyperparameters using 10-fold cross validation with consecutive blocks (mtry* = 2 with R2 = 0.83) revealed that the default RF hyperparameters describe the data appropriately. The effect structure of the models can be conveniently explored by plotting the RF model predictions (X) against the process factors X1, X2, …, X7. Fig. 2 shows the effects of the process parameters T, pCO2, ncat and nadd, given the median values pH2 = 90 bar, t = 18 h and V = 0.5 mL, as a conditional trellis plot.
Fig. 2 Trellis plot of the random forest predictions (T,pH2,pCO2,t,ncat,nadd,V) with pH2 = 90 bar, t = 18 h and V = 0.5 mL. Note the difference of max(TONDMM) in the experimental data (Table 1, entry 8) and the model. The RF values are predictions causing the range of the z-axis to be smaller than the range of the empirical data. |
As a tree-based ensemble method, RF splits and mean-aggregates the experimental space into hyperrectangles, the consequence being that the RF model surface becomes non-smooth (Fig. 2). This property of RF is a particular strength when it comes to identifying promising domains of the experimental space and was another motivation for choosing RF as modelling technique.
Evidently, ncat and nadd exert strong non-linear, step like negative effects on the process performance (TONDMM), dividing the ncat × nadd space into domains of different performance. The pressure, pCO2, reveals a positive and temperature, T, a small convex effect, both, however, negligible compared to the dominant effects of ncat and nadd. At this point, the erroneous impression may arise that the heuristic interpretation of the experimental data could lead to a further improvement by a trivial reduction of the amount of catalyst, which is often reduced to minimise catalyst deactivation. However, a closer look at the experimental data shows that a reduction in the amount of catalyst without changing other parameters leads to a decrease in catalytic activity with respect to DMM. This case shows exemplarily how important multivariate analytical methods can be, because they can easily identify complex correlations of process parameters that are not detectable by a univariate analysis of the scientist.
With the condition TONDMM > 400 deduced from the landscape of TONDMM over ncat × nadd (Fig. 2), a subset of 26 observations with 9 unique settings was selected from the data (Fig. 1(III)). In this subset, the parameters T, pH2, pCO2 and t were found constant at T = 90 °C, pH2 = 90 bar, pCO2 = 20 bar and t = 18 h, thereby suggesting to conditionally optimise ncat, nadd and V first, while keeping T, pH2, pCO2 and t constant for the time being.
The 6 augmentation trials were realised as triple replicates in the lab, added to the 9 settings already available and the responses (TONDMM, TONMF) were modelled as a function of the least square parameters with stepwise ordinary least squares (OLS; Fig. 1(VI)). Both models accurately describe the data within the replication error and explain 91% (R2(TONDMM) = 0.91; DF = 37) and 97% (R2(TONMF) = 0.97; DF = 35) of the responses' variance with DF denoting the degrees of freedom (number of data points minus number of estimated model parameters).
The local effects of ncat, nadd and V are depicted in Fig. 3A as response surface trellis plot. Again, there is a strong negative effect for ncat along with positive effects for nadd and V. Together with the positive, synergistic effect between nadd and V, the effect structure suggests to decrease ncat and to increase nadd and V to further maximise the catalytic performance beyond the best result of the first DoE (Fig. 4A, entry 4).
The joint condition max(TONDMM), max(TONMF) was best met by the 20% relaxation trial (Fig. 4B, entry 2), and the factor setting , and V* = 0.550 mL thus became the reference point for further optimisation. The 30% relaxation trial was very poor, indicating that a local maximum had been exceeded (Fig. 4B, entry 3). The pronounced drop in activity, resulting most likely from the reduced catalyst loading, indicates a molecular deactivation pathway of the catalytically active species due to potential inhibitors, such as carbon monoxide, moisture and oxygen, which are probably present in low concentrations.
At this point, there were two alternatives to proceed: (1) create an experimental design around , and V* to fully identify the topology around the relaxation point at the desired resolution (complexity). (2) Consider , and V* = 0.550 mL as locally optimal and switch to optimising the candidates T, pH2, pCO2 and t, which had so far been kept constant.
The poor outcome of the third relaxation trial (Fig. 4B, entry 3) showed that not much was to be expected from exploring the three-dimensional environment of the 20% relaxation trial any further. Therefore, option 2 was chosen and the 20% relaxation trial became the reference point for optimising the candidates T, pH2, pCO2 and t.
The measured responses (TONDMM, TONMF) were linearly modelled as a function of the process parameters with stepwise OLS. The models explain 92% (R2(TONDMM) = 0.92; DF = 16) and 78% (R2(TONMF) = 0.78; DF = 18) variance of the responses thus indicating a large signal-to-noise ratio for TONDMM and to a lesser extent for TONMF.
Fig. 3B shows the linear effects of the process parameters on TONDMM as trellis response surface plot. The factors T and pH2 both have strong positive effects on TONDMM, whereas t reveals only a small positive and pCO2 a moderate negative effect on TONDMM. Optimal conditions were found in the upper left panel and these are the conditions of the top candidate found in the design list with T, pH2, t at the upper and pCO2 at the lower bound, yielding respective TONs for DMM and MF of 2610 and 2356 (Fig. 4C, entry 4).
Again, we saw a small improvement of the 25% relaxation trial (Fig. 4D, entry 1) compared with the best candidate from the second DoE (Fig. 4C, entry 4), whereas the 50% relaxation candidate performed comparatively poorly (Fig. 4D, entry 2; Fig. 1(VIII)). With these relaxation trials, we reached the technical limits of our setup regarding hydrogen gas pressure, and therefore, the conditions of the 25% relaxation experiment can be considered locally optimal given the constraints of technical feasibility (Fig. 1(IX) and (X)). We would like to point out that the catalytic conditions optimised by the here presented strategy may still represent a local maximum. Modification of the catalyst and the additive might also result in further improvement, but as demonstrated in this work the identification of high-performance catalytic conditions should be a prerequisite for a design strategy of new catalysts.
A complete overview of the results obtained at each step of the optimisation project is given in Fig. 4E, overall tripling the initial TONDMM value of 786 to a final value of 2761. As illustrated, the standard deviation (SD) of TONDMM tends to increase with increasing mean value of TONDMM (Fig. 4E), which might be a joint effect from decreasing ncat and increasing both nadd and V over the course of the optimisation project. Decreasing ncat and increasing V is equivalent to reducing the concentration of the catalyst, which presumably renders the system more susceptible to random disturbances by catalyst deactivation, thereby providing an explanation for the observed increase of the standard deviation. Simple Spearman rank correlation analysis of the relationship between SD(TONDMM), SD(TONMF) and the process parameters Xi supports this hypothesis by revealing a negative and a positive association of SD(TONDMM) with ncat and V, respectively (ESI Table 26†).
Another interesting result in Fig. 4E refers to the independent replication error. The linear design of the second DoE includes two independent settings of the 20% reference as centre points with each point measured as triple replicate (ESI Tables 11 and 17†). This sextet from the second DoE (Fig. 4C, entry RS1) excellently matches the 20% relaxation triple (Fig. 4B, entry 2) indicating good repeatability and reliability of the system (Fig. 4E, ESI Tables 27 and 28†). In a similar way, the best candidate from the second DoE (Fig. 4C, entry 4) has been independently replicated when running the second relaxation sequence (Fig. 4D, entry RS2) and both replicates turned out identical within the experimental error (Fig. 4E, ESI Tables 29 and 30†).
Starting from catalytic data using RF (random forest) for empirical model building, an experimental subspace was identified and subsequently augmented to render the effects of a first set of three process factors estimable. Modelling and optimisation, followed by relaxation led to a sequence of relaxation trials with one candidate assumed to be locally optimal. With this candidate as reference, a linear design of the remaining four variables yielded another substantial improvement. Relaxation of the second design further enhanced the catalytic performance, thereby reaching the technical limits of the setup.
The optimised conditions were used in a specifically designed experimental setup and the highest TON for DMM of 3874 (with it: TONMF 1445) was obtained, which is, to the best of our knowledge, the by far highest value reported in the investigated catalysis.
Y = f(X) + ε | (1) |
Under the weak assumption that f() is smooth and continuous, f() can be locally approximated as polynomial parametric surrogates, g1,2,3(), of increasing complexity, formally:
(2) |
(3) |
(4) |
These are linear [eqn (2)], bilinear [eqn (3)] or quadratic [eqn (4)] parametric surrogates of the true function f(). With the experimental values Y, X1, X2, …, XI available, the unknown parameters ai, aii, aij can be estimated from the data using ordinary least squares (OLS).46
Given process factors X1, X2, …, XI, their ranges Xi ∈ {LB,UB} with LB, UB denoting the lower and upper bounds of the process factors Xi and, depending on the expected complexity, the parametric form of the surrogate model, the design points X (experimental design) optimally supporting the chosen model can be calculated. However, in an early project phase it is often unclear which factors Xi and ranges should be chosen and what levels of complexity must be assumed for the domain under investigation. Therefore, DoE can benefit from experimental data analysis, with the latter helping to answer the questions arising in the former.
After first optimisation by an experimental design, ascending in the direction of maximal improvement can be easily achieved by increasing (relaxing) the experimental space in discrete steps and by solving a maximisation problem subject to a sequence of hypercubical constraints, that is
max(g(X)) subject to LB − kΔX < X < UB + kΔX | (5) |
All calculations were done using the statistical software R.47 Random forest modelling was performed with the R-package ‘randomForest’.48 Experimental designs were calculated with the D-optimal criterion of the function optFederov() in the R-package ‘AlgDesign’.49 Optimisation was achieved with the augmented Lagrange method from the R-package ‘Rsolnp’.50 Graphics were produced with the R-package ‘lattice’.51
All catalyses and the corresponding analyses were performed following a procedure previously reported by our group.27 The catalysis was also performed in the absence of a catalyst, a co-catalyst or both to demonstrate the need of the catalytic system for the formation of DMM and MF. In all cases, no significant conversions for both of these compounds were observed (ESI†).
Footnote |
† Electronic supplementary information (ESI) available: Synthetic procedures, multivariate modelling procedure, additional tables, catalysis data, R-code. See DOI: 10.1039/c9sc04591k |
This journal is © The Royal Society of Chemistry 2019 |