Bruno
Lacerda de Oliveira Campos
*a,
Andréa
Oliveira Souza da Costa
b,
Karla
Herrera Delgado
a,
Stephan
Pitter
a,
Jörg
Sauer
a and
Esly
Ferreira da Costa Junior
*b
aInstitute for Catalysis Research and Technology (IKFT), Karlsruhe Institute of Technologie (KIT), Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany. E-mail: bruno.campos@kit.edu
bChemical Engineering Department, Federal University of Minas Gerais (UFMG), Av. Presidente Antônio Carlos 6627, Pampulha, 31270-901, Belo Horizonte, MG, Brazil. E-mail: esly@deq.ufmg.br
First published on 15th January 2024
Microkinetic models allow the description of complex reaction kinetics but require high computational costs, hindering their combination with detailed reactor models. In this contribution, a methodology to develop a surrogate artificial neural network (ANN) was proposed and demonstrated for methanol synthesis on Cu/Zn-based catalysts. The resulting model accurately reproduces the simulations of the original microkinetic model, reducing the computational costs by orders of magnitude. In the developed methodology, the ANN learns only the kinetics of the global reaction rates, thereby decreasing model complexity and computational costs while ensuring thermodynamic consistency. In addition, an improved activation function for the ANN was designed in this work to minimize computational costs and to smooth out calculations. The proposed approach creates a bridge to integrate microkinetics into applications in the field of reaction engineering, such as reactor design, process optimization, and scale-up.
However, due to the high computational costs of microkinetic models (MKMs), their application in combination with detailed reactor models is hindered. A promising alternative is the development of surrogate models that are able to learn the behavior of the microkinetic model and reproduce its simulations at much lower computational costs.1
Spline functions have been proposed to fulfill this purpose and have shown accurate results for a variety of reaction systems but have high storage requirements that scale exponentially with the number of inputs.5–7 To overcome these storage limitations, the application of an in situ adaptive tabulation technique was shown to be successful.8,9
As an alternative to spline interpolation, machine-learning techniques have been investigated for the development of surrogate models of microkinetics.10–12 In comparison with spline functions, it has been demonstrated that artificial neural networks (ANNs) had much lower computational costs and storage requirements, while higher accuracy was achieved with a smaller training database.12
For reaction rates of a given system that span several orders of magnitude, model accuracy was improved by using the logarithmic function of the reaction rates and concentrations and the inverse of temperature.5,10 As the logarithmic function requires strictly positive inputs, a question arises on how to deal with reaction rates that are sometimes positive and sometimes negative. Therefore, Partopour et al.11 proposed to separately learn the adsorption and desorption of each participating species. Recently, Döppel and Votsmeier12 proposed to map out the rate-determining steps and learn the forward and the reverse reactions separately.
In this contribution, a comprehensive methodology is presented to transfer information from a microkinetic model MKM to an ANN, maintaining model accuracy while sharply decreasing computational costs. The kinetics and thermodynamics of the global reaction rates are split, so that the former are learned by the ANN, while the latter are calculated afterward by known equations. This approach has two direct benefits: the kinetics are strictly positive (addressing the question posed in the last paragraph), and the thermodynamic consistency of the model is ensured. In addition, a new activation function for the ANN is proposed here, which was designed to minimize computational costs. To demonstrate this methodology, our recently published microkinetic model of the methanol synthesis on Cu/Zn-based catalysts13 was taken as a case study.
![]() | (1) |
![]() | (2) |
![]() | (3) |
![]() | ||
Fig. 1 Reaction network of the microkinetic model. Adapted with permission from Campos et al.13 Copyright 2021 Royal Society of Chemistry. |
In this model, three different active sites were considered: site (a) (pure Cu), site (b) (Cu/Zn), and site (c) (Cu or Cu/Zn available only for H2 and H2O adsorption). Assuming a mean-field approximation, the rate of a reversible elementary step k (rk, in mol kgcat−1 s−1) was calculated following the Eyring–Polanyi equation.16,17 This rate equation can be split into intrinsic kinetics (i.e. reaction constant k), gas phase dependency (FG), and catalytic surface dependency (FC).
The reaction constant (k) was calculated as follows:
![]() | (4) |
Here, T is the reaction temperature (K), βk is a correction term to ensure thermodynamic consistency,13,18kb is the Boltzmann constant (kJ K−1), h is the Planck constant (kJ s), EA,k is the activation energy (kJ mol−1), ΔS≠k is the entropy barrier (kJ mol−1 K−1), and R is the universal gas constant (kJ mol−1 K−1).
The gas phase and catalyst surface dependencies were calculated with the following expressions:
![]() | (5) |
![]() | (6) |
Finally, the rate of a reversible reaction k is given by combining eqn (4)–(6).
rk = k+·F+G·F+C − k−·F−G·F−C | (7) |
![]() | (8) |
Here, the superscripts + and − refer to the forward and reverse reactions, respectively.
The surface coverages were obtained with balance equations, resulting in a system of ordinary differential equations (eqn (9)).
![]() | (9) |
In general, the time needed for the reactive surface to reach a steady-state after a perturbation is orders of magnitude lower than the time needed for the reactor to stabilize after a change in the process conditions.23 Because of that, the steady-state solution of the kinetic model (eqn (9) with dθi/dt = 0) should be useful to simulate reactors in both steady-state and transient conditions. There are two ways to find the steady-state solution of eqn (9):
• To integrate the equations in time until steady state is achieved. MATLAB function ode23s was chosen to solve this stiff system.
• To make dθi/dt = 0, deriving a non-linear algebraic system of equations, which can be solved, e.g. with MATLAB function fsolve.
The second option has lower computational costs than the first one, but it requires educated initial values to converge to the correct solution. If these values are not known, then the time integration should be performed. In this work, the time integration from 0 to 15 s was performed, which was sufficient to reach steady state in all cases. The initial condition for the time integration was chosen case-by-case based on an initial guess method developed in a previous work (see Fig. S1 in the ESI†).13 With these educated guesses, a typical integration time to reach steady state was 1.5 s, while random initial conditions took longer but reached the same results (e.g. if all coverages were set to 0.01 as the initial condition, a typical time to reach steady state was 8 s).
Independently if the time integration or the algebraic system resolution is chosen, it is highly recommended to provide the analytical Jacobian matrix, as computational costs are significantly lowered and numerical instability is avoided. In addition, the analytical derivation of the Jacobian matrix for this type of mathematical system is relatively easy due to the nature of the equations (either linear or quadratic dependency on the variables).
The rates of the global reactions (eqn (1)–(3)) are equal to the rate of the following elementary steps (see Fig. 1):
rCO = r10 | (10) |
rCO2 = r14 + r15 | (11) |
rWGS = r22 + r23 | (12) |
The kinetic parameters (EA,k, ΔS≠k) were taken from DFT calculations,14,15 and only nM,Cat was fitted to experimental data. A total of 690 steady-state experiments from different setups13,24–26 were used to validate this model at a wide range of operating conditions. The kinetic parameters (EA,k, ΔS≠k) and the correction terms (βk) are provided in the ESI,† section S1.
In our case study, the microkinetic model of the methanol synthesis requires six inputs to calculate the reaction rate of a certain reaction condition: temperature (T) and the fugacity of each gas species participating in the reactions (fH2, fCO, fCO2, fCH3OH, fH2O). Therefore, these same inputs are considered for the ANN. Note here that the total pressure (p) must not be explicitly given as an extra input because the effect of non-ideality is already contained within the fugacities. Besides, the fugacity of (possible) inerts is only necessary in the reactor model and must not be given as extra inputs in the reaction model.
It is known that the rate of a global reversible reaction consists in a kinetic part (rK) and a thermodynamic part (rT), which are multiplying each other. This is explicit in formal kinetic models, exemplified here for CO2 hydrogenation to methanol:27
r = [Kin. term]·(Therm. term) = rK·rT | (13) |
![]() | (14) |
![]() | (15) |
![]() | (16) |
In microkinetic models, it is possible to isolate the kinetic and thermodynamic terms if there is a single global reaction (see the mathematical demonstration in the ESI,† section S2). For multiple global reactions, which is the case for the majority of the systems, the kinetic term of global reaction n (rKn) can be indirectly obtained by dividing the reaction rate (rn) by the thermodynamic term (rTn):
![]() | (17) |
![]() | (18) |
The kinetic term is the most complex part of the reaction rate estimation. It is the subject of thorough theoretical and experimental investigations, and, therefore, the reason for a variety of kinetic models in the literature describing the same system. In contrast, the thermodynamic term of a global reaction, often called “the affinity of the reaction towards the equilibrium”,28 can be easily calculated with fugacities and global equilibrium constants, which are generally described by known thermodynamic equations.
Therefore, the ANN should be used to learn and predict only the complex part (i.e. the kinetics, rK), while the thermodynamics (rT) can be calculated afterward with simple known equations (eqn (18)). The application of this procedure brings three benefits:
• A priori information is being used, reducing the amount of information the ANN has to learn and, therefore, its complexity.
• The thermodynamic consistency of the model is ensured.
• The kinetic terms (rK) are strictly positive, giving a suitable solution to the issue discussed in the introduction, i.e. manipulating the reaction rates with the logarithmic function, which requires strictly positive values.
Hence, it makes sense to set the outputs of the ANN to be rKn instead of rn. In this case study, the outputs of the ANN are then rKCO, , and rKWGSR. For the methanol synthesis, the thermodynamic terms and the equilibrium constants are given as follows:13,27,29
![]() | (19) |
![]() | (20) |
![]() | (21) |
K0P,CO = T−3.384·exp(10![]() | (22) |
![]() | (23) |
K0P,WGS = T1.097·exp(5337.4T−1 − 12.569) | (24) |
In Fig. 2, the pathways from inputs to outputs for both the MKM and the ANN are illustrated.
![]() | ||
Fig. 2 Pathway from inputs to outputs for the microkinetic model (MKM) (dashed lines) and the proposed artificial neural network (ANN) (solid lines). |
The microkinetic model was coupled with a reactor model of a plug flow reactor (PFR), considering variations only along the reactor length (1D model) and assuming isobaric and isothermal conditions. These considerations resulted in the following balance equations along the reactor length:
![]() | (25) |
![]() | (26) |
To model the real gas behavior and calculate the fugacities, the Peng–Robinson equation of state was used.19 Binary interaction parameters (kij) and other necessary data reported in the literature are considered,20,21 and an effective hydrogen acentric factor of ω = −0.05 was assumed.22
For a certain operating condition, first the equilibrium H2 conversion (XH2,eq.) was calculated (the methodology is provided in the ESI,† section S2). Then, the operation of a long PFR was simulated in MATLAB (integration with the function ode45), with the hydrogen conversion (XH2) being calculated in each step (eqn (21)), and an event function to stop the integration if the system is close to chemical equilibrium (eqn (22)).
![]() | (27) |
Stop integration if: XH2 ≥ 0.95·XH2,eq. | (28) |
Here, XH2 is the H2 conversion, ṅH2 is the hydrogen mole flow (mol s−1), while the subscripts indicate the axial position.
Finally, data points were collected from the PFR simulation at different XH2/XH2,eq values, totalizing 21 points for each simulation. The differential equations (eqn (19) and (20)) were solved in MATLAB with the function ode45, with relative and absolute tolerances set to 10−8. In Table 1, the chosen operating conditions are summarized, amounting to a total of 5720 simulations and 120120 data points.
Parameter | Values | No. of conditions |
---|---|---|
Simulation conditions | ||
Pressure (bar) | 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70 | 11 |
Temperature (K) | 483.15, 488.15, 493.15, 498.15, 503.15, 508.15, 513.15, 518.15, 523.15, 528.15, 533.15, 538.15, 543.15 | 13 |
H2/COX in feed | 0.67, 1.00, 1.50, 2.33, 4.00 | 5 |
CO2/COX in feed | 0.02, 0.08, 0.15, 0.22, 0.29, 0.36, 0.43, 0.50 | 8 |
Total number of simulations | 5720 | |
Data point collection in each simulation | ||
X H2/XH2,eq | 0.00, 0.025, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95 | 21 |
Total number of data points | 120![]() |
CO2/COX in feed between 0.02 and 0.50 were chosen, because this is the region where this microkinetic model excels.13 The feed mole fractions of H2, CO, and CO2 (respectively , yinCO, and
) were defined as follows:
![]() | (29) |
![]() | (30) |
![]() | (31) |
The information saved in each data point are the six inputs (T, fH2, fCO, fCO2, fCH3OH, fH2O) and the reaction rates (rCO, rCO2, rWGS). Since the microkinetic model does not explicitly calculate the kinetic part of the reaction rates (rKn), this was calculated separately:
![]() | (32) |
![]() | (33) |
![]() | (34) |
After this last calculation (eqn (26)–(28)), each data point contains the necessary information to train the ANN: six inputs (T, fH2, fCO, fCO2, fCH3OH, fH2O) and three outputs (rKCO, , rKWGS).
As mentioned in the introduction, the usage of the logarithmic function for both the reaction rates and the component concentrations is beneficial if the reaction rates span over several orders of magnitude.5,10 As it was not the case for the present system, this approach was not followed in this case study. Instead, a normalization of input j in each cell j of the input layer (IL) was performed:
![]() | (35) |
Here, Ij is the original value of input j, Ij,min is the minimum value of input j, Imax is the maximum value of input j, and Ij,norm, is the normalized value of input j. The minimum and maximum allowed values for each input are given in Table 2, which correspond to the operating window of Table 1 with a slight extension (5–10%).
In the hidden layer (HL), two mathematical operations occur in series in each cell w: first an aggregation function, then an activation function. Typically, the aggregation function is a sum of weighted inputs, which was also the approach considered here:
![]() | (36) |
Generally, activation functions, named here fa(x), are selected to give non-linearity to the model and to saturate at high absolute values of x (i.e. fa tends to a constant value for high x on both positive and negative directions). Typical non-linear activation functions are the hyperbolic tangent (tanh) and the sigmoid function.
Our interest was that the activation function and its first derivative are continuous in the whole x domain, thereby smoothing out calculations. Finally, the main goal of this work was to minimize computational costs, which should be remembered when choosing or designing an activation function.
Considering these characteristics, a new activation function was developed in this work, which has not been proposed elsewhere, to the best of our knowledge. The function, presented in Table 3, has three regions: constant fa = 1 for x ≥ 1 , constant fa = −1 for x ≤ −1, and a ninth-degree polynomial for −1 < x < 1. The polynomial was designed to be an odd function, i.e. f(−x) = −f(x), and to provide a smooth transition between the three different regions, so that the activation function and its first derivative are continuous along the complete real domain. The continuity and smoothness of fa(x) and can be seen in Fig. 3.
The computational time of the proposed function was compared to that of typical activation functions used in the literature for this field.12,30 The methodology was to evaluate the function for a random input 108 times in the range |x| ≤ 1.2. Both the sigmoid and the hyperbolic tangent were written using MATLAB function exp(x), instead of directly calling built-in functions tanh(x) or sigmoid(x), which are slower. Calculation time was given by MATLAB function tic toc, and the procedure was performed ten times to obtain average values and confidence intervals. All calculations were performed in MATLAB 2018a with the same computer (processor: Intel Core i7-7700 CPU @ 3.60 GHz, installed RAM: 32 GB, operating system: Windows 10 64 bit).
In Fig. 4, the computational time of the different functions is displayed. Our proposed function is more than two times faster than the hyperbolic tangent and the sigmoid function. Similar results were obtained by lowering the range of x to |x| ≤ 1, increasing it to |x| ≤ 2 or even to |x| ≤ 10.
The computational costs of the developed activation function are significantly lower due to the following characteristics:
• In contrast to highly non-linear functions which must be calculated iteratively (e.g. sin(x), ex), a polynomial is analytically calculated with simple operations (i.e. addition, subtraction and multiplication), requiring much lower computational costs.
• The polynomial was written in a way that minimizes the number of mathematical operations to only 10. This alternative description also avoids roundoff errors caused by finite-precision rounded arithmetic, which could appear in a typical polynomial calculation when adding up terms with large differences in the exponential factor.
• When the function is in the saturation region (|x| ≥ 1), the polynomial evaluation is not necessary, leading to minimal computational costs. This is another advantage of our activation function in relation to equations that tend to a constant value, such as tanh(x) and sigmoid(x), but still needs to be calculated in the whole function domain.
The output value of each cell w of the HL (sw,out) is the outcome of the activation function:
sw,out = fa(sw,in) | (37) |
In the output layer (OL), the aggregation function (eqn (38)) and the activation function (eqn (39)) are also present:
![]() | (38) |
ŝw,out = fa(ŝw,in) | (39) |
With the output values of the activation function (ŝn,out), the kinetic part of the reaction rates (rKk) were then obtained:
rKn = 0.5·rKn,max·(ŝn,out + 1) | (40) |
Here, rKn,max is the maximum value allowed for rKn. The maximum allowed values of the outputs were set to values moderately higher than the operating window covered by the generated data, and they are summarized in Table 2. The structure of the ANN is illustrated in Fig. 5.
Finally, to obtain the reaction rates, the kinetic term of global reaction n (eqn (40)) and the corresponding thermodynamic term (eqn (19)–(21)) are multiplied, as already shown in Fig. 2.
rn = rKn·rTn | (41) |
• One for the hidden layer (pwj, eqn (36)), with NHL × NI elements
• One for the output layer (nw, eqn (38)), with NHL × NO elementswhere NHL is the number of cells in the hidden layer, and NO is the number of outputs.
In order to estimate these parameters, the data points generated with the MKM (120120 points) were randomly divided between training (80%), test (10%) and validation (10%). The training points were used to solve the optimization problem, whose goal was to minimize the deviations between the MKM and the ANN predictions of the rKn:
![]() | (42) |
Here, NGR is the number of global reactions, NTP is the number of training points, wni are selected weights, and the subscripts MKM and ANN refer to simulations performed by the microkinetic model and the artificial neural network, respectively.
As the main interest is the prediction of the reaction rates (rn), the weights chosen were the thermodynamic term of the reaction rates (rTn), reducing the influence of points close to the equilibrium. While rTn is limited to +1 in the positive direction, it is unlimited in the negative direction. Therefore, a limit of 10 was set to wni:
wni = |rTni| | (43) |
if wni > 10, then wni = 10 | (44) |
In this work, ANNs with 10, 20, 30, and 40 cells in the hidden layer (NHL) were trained. They are named here ANN10, ANN20, ANN30, and ANN40, respectively. First, ANN10 was trained with random initial guesses. The parameters of the optimized ANN10 were used as initial guesses to train ANN20, with the new additional parameters (present only in ANN20) receiving random initial guesses with low absolute values. The same procedure was applied when training ANN30 and ANN40.
To solve this minimization problem, the quasi-Newton method was applied. To find the search direction, the gradient of the objective function was analytically calculated, and the inverse of the Hessian matrix was iteratively updated with the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. After the search direction was defined, the optimal step size was determined by line search. The optimization problem was written and solved in C language.
For the plug flow reactor (PFR), the same procedure described in section 2.4 is made (eqn (23) and (24)), the only (obvious) difference being that rCO, rCO2, and rWGS were calculated by the ANN instead.
For the continuous stirred tank reactor (CSTR), considering steady-state operation of a gradient-free reactor, the following balance equations for each component and the total mole flow exiting the reactor (ṅout) were solved:
![]() | (45) |
ṅout = ṅin − 2·mCat·(rCO + rCO2) | (46) |
To solve this algebraic system, the function fsolve from MATLAB was used. The function and step tolerances were set to 10−8, and the experimental values were used as initial guesses of the mole fractions.
It should be noted here that the reaction rates and derivatives were written one-by-one in the programming of the microkinetic model instead of using sparse coefficient matrices and long “for” loops, which would increase computational time. Besides, the system of equations (eqn (8) and (9)) were solved by integrating the equations from 0 to 15 s.
![]() | (47) |
![]() | (48) |
![]() | (49) |
In Fig. 6, statistical data of the ANN performance is presented for the different hidden layer (HL) sizes investigated. All ANNs present significantly low mean deviations, with the accuracy being considerably improved by increasing the HL size. In particular, ANN30 and ANN40 show excellent results, with R2 close to 1, MSE close to 0, and ME below 0.15 mol kgcat−1 h−1 (ca. 1% of mean reaction rate values). Since the accuracy of ANN30 is significantly high and close to that of ANN40, while its computational costs are lower than that of ANN40 due to the former's lower complexity, ANN30 seems to provide the best compromise between accuracy and computational costs. Therefore, ANN30 is the model chosen for further analysis and validation.
![]() | ||
Fig. 6 Statistical data of the ANN as a function of the HL size. (a) Mean error (ME). (b) Mean squared error (MSE). (c) R2. |
In Fig. 7, parity plots of the 120120 reaction rate points (rn) are shown for the ANN30. The points are close to the bisector for the three global reactions, confirming that the model correctly simulates the data in the operating window studied. Besides, no significant systematic deviation was observed.
![]() | ||
Fig. 7 Parity plots of the 120![]() |
In Fig. 8, the effect of each input in the reaction rates is shown for the MKM and the ANN30. The ANN30 outputs are almost a perfect match to the MKM simulations, showing that the black box model correctly learned the influence of each input on the reaction rates, and no indication of overfitting is found. Some of the effects adequately learned by the ANN are, for example:
• Significant temperature influence in all reactions, including a negative influence in the WGSR (as its reverse reaction is favored).
• Positive effect of H2 fugacity in the methanol synthesis and strong negative effect of this input in the WGSR.
• Positive effect of CO fugacity on CO hydrogenation to methanol and the WGSR, while CO content does not affect CO2 hydrogenation.
• Limited positive effect of CO2 fugacity on CO2 hydrogenation to methanol (as formate coverage becomes close to 100%).
• Negative effect of CO2 fugacity on CO hydrogenation to methanol (inhibition by formate poisoning) and on the WGSR (formate inhibition and improved reverse reaction).
• Negative effect of CH3OH fugacity on methanol synthesis (product inhibition) and no effect on the WGSR.
• Strong positive effect of H2O fugacity on the WGSR, and negative effect on CO2 hydrogenation to methanol (product inhibition). An inhibition of the methanol synthesis via a water intermediate (e.g. OH* or H2O*) might probably be significant at higher water concentration, which is typical for the methanol synthesis at high CO2 content.
The accurate reproduction of the input influence on the reaction rates is also an evidence that the ANN30 can correctly predict reactor conditions at non-stoichiometric conversion. This is the case of simulations including diffusion in the mesoscale, in which the components may diffuse with different velocities and change the gas composition along the catalyst pores.
After this thorough model validation regarding the reaction rates, the ANN30 was coupled with a reactor model (eqn (25) and (26)) and tested. Simulations of a plug flow reactor (PFR) at different conditions were performed by both models (ANN and MKM), and the axial profiles are presented in Fig. 9. The ANN30 curves matched the MKM simulations, including correct prediction of the equilibrium. The influence of the total pressure, the temperature, and CO2 content in both the kinetic and the thermodynamic regime was correctly described. In addition, no strange behavior is observed (e.g. oscillating behavior close to the equilibrium).
After all these validation steps, a final test was performed with the ANN30: the simulation of real steady-state experimental data from two different setups: 234 points performed in a fixed-bed tube reactor (plug flow reactor, PFR),13 and 46 points performed in a Berty-type reactor (continuous stirred tank reactor, CSTR).24 The operating window was wide: 30–60 bar, 210–260 °C, 3.6–40 m3 h−1 kgcat−1, H2/COX in feed: 0.6–5.0, CO2/COX in feed: 0.09–0.50. Parity plots of the results are shown in Fig. 10.
![]() | ||
Fig. 10 Validation of the ANN30 with steady-state experimental data performed in a PFR10 and in a CSTR.18 (a) CO. (b) CO2. (c) CH3OH. |
The ANN30 coupled with reactor models correctly predicted the experimental data (see Fig. 10), with all points close to the bisection and most points predicted with less than 10% error. Low deviations are seen for the different reactor geometries (PFR and CSTR) and for the varied amounts of methanol produced. A systematic slight overestimation (5–15%) of the methanol points at high production can be observed, with an identical behavior occurring for the MKM (see Fig. S2†).
After confirming that the ANN30 accurately reproduces both the MKM simulations and the experimental data, the computational costs were addressed. In Fig. 11, the average computational time of the MKM and all developed ANNs for one evaluation of the reaction rates is presented. Since the computational time to solve the MKM varies depending on the Jacobian matrix calculation method, it was provided for both the analytical Jacobian (AJ) and numerical Jacobian (NJ) evaluation. Additionally, the computational costs of the formal kinetic model (FKM) from Graaf et al.31 were also included as a reference. The confidence interval for a 0.05 level of significance was approximately ±0.7% of the corresponding average value in all cases.
![]() | ||
Fig. 11 Average computational time of different models to calculate the reaction rates (μs). FKM – Graaf refers to the formal kinetic model developed by Graaf et al.24 The microkinetic model was solved by two approaches: providing the analytical Jacobian matrix (AJ) and calculating the Jacobian matrix numerically (NJ). |
The computational costs of the developed ANN30 were two orders of magnitude lower in comparison with the MKM (three, if no analytical Jacobian is provided), a significant improvement. Besides, the computational time of ANN30 is relatively close to the one of a well-known formal kinetic model in the literature31 (which is a simplified kinetic approach with lumped parameters). Therefore, the developed ANN is adequate for the combination with complex reactor models, e.g. computational fluid dynamics. In addition, the computational time of the ANN scales linearly with the hidden layer size, being no bottleneck in case larger ANNs are required to learn more complex reaction mechanisms.
The computational costs were reduced by 2–3 orders of magnitude when using the ANNs, and they were relatively close to formal kinetic models describing the same system. An additional relevant contribution of this work was the design of a new activation function, which provides non-linearity to the model at significantly lower costs compared to common functions used in ANN development. In addition, neither the proposed function nor its first derivative contain any discontinuity, therefore smoothing out calculations, which is especially beneficial when solving optimization problems.
The proposed methodology should be applicable to reduce the computational costs of other microkinetic models. The resulting ANN enables coupling with catalyst deactivation models and complex reactor models.
ANN | Simulated with the artificial neural network |
eq. | Equilibrium |
in | Inlet stream or input value |
max | Maximum allowed value |
min | Minimum allowed value |
MKM | Simulated with the microkinetic model |
norm | Normalized value |
out | Outlet stream or output value |
K | Kinetic term |
T | Thermodynamic term |
+ | Forward reaction |
− | Reverse reaction |
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3re00409k |
This journal is © The Royal Society of Chemistry 2024 |