Reaction dynamics as the missing puzzle piece: the origin of selectivity in oxazaborolidinium ion-catalysed reactions

The selectivity in a group of oxazaborolidinium ion-catalysed reactions between aldehydes and diazo compounds cannot be explained using transition state theory. VRAI-selectivity, developed to predict the outcome of dynamically controlled reactions, can account for both the chemo- and the stereo-selectivity in these reactions, which are controlled by reaction dynamics. Subtle modifications to the substrate or catalyst substituents alter the potential energy surface, leading to changes in predominant reaction pathways and altering the barriers to the major product when reaction dynamics are considered. In addition, this study suggests an explanation for the mysterious inversion of enantioselectivity resulting from the inclusion of an ortho-iPrO group in the catalyst.


A. Density functional theory (DFT) calculations
The optimised structures obtained from DFT calculations were validated through frequency analyses, confirming that they correspond to either a minimum or a first-order saddle point on the potential energy surface (PES). Quick reaction coordinate (QRC) calculations 1 were conducted to further verify that the identified TSs correspond to the relevant processes of interest.

B. Conformational searching calculations
Conformational searching calculations were conducted in MacroModel (v13.4, release 2021-4). 2 The OPLS4 3 force field was employed with the mixed torsional/low-mode sampling method and a maximum of 2000 steps as the limit. Conformers with energies within a window of 41 kJ mol⁻¹ (approximately 10 kcal mol⁻¹) were saved for subsequent analyses.
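The energy window described above can be illustrated with a minimal Python sketch. The function name and the example energies are hypothetical and are not taken from the MacroModel output; the sketch only shows how a 41 kJ mol⁻¹ cut-off relative to the lowest-energy conformer would select structures.

```python
KJ_PER_KCAL = 4.184  # thermochemical conversion factor


def filter_by_energy_window(energies_kj, window_kj=41.0):
    """Return the indices of conformers within window_kj of the lowest energy."""
    lowest = min(energies_kj)
    return [i for i, e in enumerate(energies_kj) if e - lowest <= window_kj]


# Hypothetical force-field energies in kJ/mol, for illustration only.
example_energies = [-120.0, -118.5, -75.0, -80.0]
kept = filter_by_energy_window(example_energies)
window_kcal = round(41.0 / KJ_PER_KCAL, 1)
print(kept, window_kcal)
```

The conversion at the end shows that 41 kJ mol⁻¹ corresponds to roughly 9.8 kcal mol⁻¹.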

C. Data analyses and scripts
Data analyses, such as processing outputs from DFT calculations, were conducted with Python (3.8.12). Graphs were created with Matplotlib (3.3.2) or Plotly (5.1.0). 4

D. The use of CONFPASS
CONFPASS (https://github.com/Goodman-lab/CONFPASS) 5 was used to assist in re-optimisations of force field structures at the DFT level with confidence that key stable structures are obtained. The default setting (pipeline-mix, x = 0.8 and Q = 0.2) was used for generating the priority list for re-optimising conformers. We ensured that the %Conf at r_opt was greater than 80% before terminating the re-optimisation process. r_opt is the number of re-optimised conformers over the total number of conformers in the conformational searching output file. %Conf refers to the confidence that the re-optimisation process can be terminated.

E. VRAI-selectivity extension: VRAI-multi
VRAI-multi.py is an extension to the VRAI-selectivity.py script (https://github.com/Goodman-lab/VRAI-selectivity). 6,7 VRAI-multi automates the process of VRAI-selectivity analyses for treating systems with complex PES, ie with more than two products that share the same intermediate structure on their reaction pathway.

Workflow of VRAI-multi (figure): perform VRAI-selectivity calculations; process the binary product ratios to deduce product percentages; output a pd.DataFrame of product percentages.

The input to VRAI-multi is the path to the folder that contains the required files for the VRAI-selectivity analyses. For the script to correctly identify and process the files, the following file naming formats must be adopted:
- Geometry of the first TS in mol file must have a 'TS1.mol' suffix.
- Gaussian16 frequency calculation output file for the first TS must adopt the same filename as the corresponding mol file but with a '.out' suffix.
o Example: ol17ryu_e_int2_SSR_11_TS1.mol and ol17ryu_e_int2_SSR_11_TS1.out

VRAI-multi generates a data frame that comprises rows containing input file information for conducting VRAI-selectivity calculations. The data frame is constructed to encompass all possible binary combinations of products and TS1s.
Following each row in the input file data frame, the VRAI-selectivity calculations are executed. Binary product ratios are obtained and recorded into a raw result data frame. The raw result table is processed to calculate the product percentages. For results that involve the same intermediate and TS1:
1. The calculation results that share a common product are identified.
2. The product percentages are determined by calculating the ratio of the common product in different sets of results.
3. The above process is repeated for all possible products.
The output of the pipeline is a data frame that contains the product percentages, which are calculated by using different products as the reference in the calculation. It is crucial to verify the consistency of the product percentages obtained from different sets of results to accept them as reliable outcomes.
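The steps above can be sketched in a few lines of Python. This is an illustration only, not the VRAI-multi implementation: the function names, the dictionary layout of the binary ratios (amount of product a divided by amount of product b) and the tolerance are assumptions made for the example, and the ratios are made-up numbers.

```python
from itertools import combinations, product


def binary_jobs(ts1_labels, product_labels):
    """Enumerate every (TS1, product A, product B) binary calculation."""
    return [(ts1, a, b)
            for ts1, (a, b) in product(ts1_labels, combinations(product_labels, 2))]


def pair_ratio(ratios, a, b):
    """Return the a:b ratio, using the inverse entry if only b:a is stored."""
    if a == b:
        return 1.0
    return ratios[(a, b)] if (a, b) in ratios else 1.0 / ratios[(b, a)]


def percentages(ratios, products, reference):
    """Deduce product percentages using one common product as the reference."""
    relative = {p: pair_ratio(ratios, p, reference) for p in products}
    total = sum(relative.values())
    return {p: 100.0 * v / total for p, v in relative.items()}


def is_consistent(ratios, products, tol=1.0):
    """Check that every choice of reference gives the same percentages."""
    tables = [percentages(ratios, products, ref) for ref in products]
    return all(abs(t[p] - tables[0][p]) <= tol for t in tables for p in products)


# Hypothetical binary ratios for three products sharing one intermediate and TS1.
products_ = ["P1", "P2", "P3"]
ratios_ = {("P1", "P2"): 2.0, ("P1", "P3"): 6.0, ("P2", "P3"): 3.0}
jobs = binary_jobs(["TS1_a"], products_)
result = percentages(ratios_, products_, reference="P1")
print(len(jobs), result, is_consistent(ratios_, products_))
```

Repeating the calculation with each product as the reference, as in `is_consistent`, mirrors the consistency check recommended above.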

The uncatalysed pathway
The uncatalysed pathways were studied using substrates from the ketone-selective reaction. We simplified the substrates by replacing phenyl with methyl groups. The mechanism for the uncatalysed pathway is distinctly different from the COBI-catalysed pathway. The C-C bond formation leads to cyclic intermediate structures. The pathway via the 5-membered ring intermediate is kinetically more favourable than the pathway via the 4-membered ring intermediate.

Calculation breakdowns
Notes on the result presentations: Unless otherwise specified, the following notations apply to all the tables below.
- Result tables for 'with VRAI' predictions: %(TS1) refers to the percentage population of the TS1. %(VRAI) is the predicted product percentage from VRAI-selectivity calculations. The frequency analyses required for VRAI-selectivity calculations were conducted at the B3LYP-D3/6-31G(d) level of theory. In the VRAI-selectivity calculation, the given TS1 corresponds to the first transition state and the product structures come from the QRC calculations of the given TS2. %(Pathway) values were calculated as a product of %(TS1) and %(VRAI).
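The arithmetic behind %(Pathway) can be sketched as follows. The sketch assumes that %(TS1) is a Boltzmann population derived from relative free energies and that the temperature is 298.15 K; neither the weighting scheme nor the temperature is stated in this section, so both are assumptions, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def boltzmann_percentages(ddg_kcal, temperature=298.15):
    """Convert relative free energies (kcal/mol) into percentage populations."""
    weights = [math.exp(-g / (R_KCAL * temperature)) for g in ddg_kcal]
    total = sum(weights)
    return [100.0 * w / total for w in weights]


def pathway_percentage(ts1_pct, vrai_pct):
    """%(Pathway) as the product of %(TS1) and %(VRAI), both given in percent."""
    return ts1_pct * vrai_pct / 100.0


# Two degenerate TS1 structures share the population equally.
populations = boltzmann_percentages([0.0, 0.0])
# A TS1 holding 80% of the population with a 50% VRAI prediction.
pathway = pathway_percentage(80.0, 50.0)
print(populations, pathway)
```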

RMSD calculations
Root-mean-square deviation (RMSD) calculations were performed with the GetBestRMS function under the rdkit.Chem.rdMolAlign module on optimised TSs to identify the duplicate structures, which should also be degenerate in energy. A pair of conformers was considered to have the same structure if the RMSD value was < 0.005 Å. The duplicate structures were removed from the data set before calculating product percentages.
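A minimal sketch of the duplicate-removal step is given below. To keep the example self-contained, the RMSD function is passed in as an argument and a plain, unaligned RMSD on toy coordinates is used; in practice a wrapper around rdkit.Chem.rdMolAlign.GetBestRMS could be supplied instead. The function names and coordinates are hypothetical.

```python
import math


def remove_duplicates(structures, rmsd_fn, threshold=0.005):
    """Keep a structure only if its RMSD to every kept structure is >= threshold."""
    unique = []
    for candidate in structures:
        if all(rmsd_fn(candidate, kept) >= threshold for kept in unique):
            unique.append(candidate)
    return unique


def plain_rmsd(a, b):
    """Unaligned RMSD between two equally ordered coordinate lists (toy stand-in)."""
    squared = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(squared / len(a))


# Toy two-atom "structures" in angstrom; s2 is a near-copy of s1.
s1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
s2 = [(0.0, 0.0, 0.0), (1.001, 0.0, 0.0)]
s3 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
unique_structures = remove_duplicates([s1, s2, s3], plain_rmsd)
print(len(unique_structures))
```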

The ketone-selective reaction with Cat-B
TST only:

The epoxide-selective reaction
TST: Table 3.7. Calculation breakdowns for the predicted product percentages of the epoxide-selective reaction based on TST only. %(TS3) corresponds to the percentage population of the TS3. %(Pathway) values are calculated as a product of %(TS1), %(TS2) and %(TS3). Adding 'agw21ryu_d_int2_S' to labels given in the TS1, TS2 and TS3 columns gives the Gaussian 16 output filename of the structure. NA implies that the process is barrierless and hence no corresponding TS structure is obtained via this pathway.

Explaining selectivity
A. Distortion-interaction analyses

We performed distortion-interaction analyses on key TS1 structures for each reaction following the procedure illustrated below.
Figure. Illustrations for distortion-interaction analyses.

The inclusion of solvent models 8,9 in single-point energy calculations was considered for key TS1 and TS2 structures with a ΔΔG‡ < 2.5 kcal mol⁻¹. We re-calculated the product percentages with the new energy values.

*In the energy check for the VRAI-selectivity test on TS0 as the first transition state, the second transition state (ie TS1) is higher in energy by 0.9 kcal mol⁻¹ than the first transition state. In this scenario, the program would proceed to calculate the product percentages with transition state theory (TST).

Table 5.3. Benchmarking - inclusion of solvent models: This table presents the mean absolute error (MAE) data of the calculated percentages compared to the experimental percentages. Unless otherwise specified, the level of theory for the calculations is ωB97XD/6-311++g(d,p)/SMD/toluene//B3LYP-D3/6-31g(d).

Table columns: MAE compared to the experimental percentage; Calculated Percentage (with VRAI); Calculated Percentage (TST).

C. Structure re-optimisations
Key TS1, TS2 and INT2 structures with a ΔΔG‡ < 2.5 kcal mol⁻¹ at the ωB97XD/6-311g(d,p)//B3LYP-D3/6-31g(d) theory level were re-optimized at other levels of theory. The product percentages were calculated with the new energy values.

A. Molecular dynamics simulations
Quasi-classical molecular dynamics (MD) simulations were conducted using Jprogdyn 10 in conjunction with Gaussian. The input structure for the MD simulation was the lowest energy TS1 of the ketone-selective reaction with Cat B. This is the simplest reaction system that we have investigated in this study.
A trial run was conducted with a timeframe from -500 fs to 500 fs. The default setting was employed (ie timestep = 1 fs). The result confirmed that the backward direction from the TS1 leads to the formation of the INT2 adduct.
An additional set of 10 trials was performed. Each trial commenced from the chosen TS1, with trajectories going in the backward direction. The MD calculations were terminated either upon the generation of a product or in the event of recrossing, characterized by the distance between the atoms of the forming C-C bond exceeding 3.4 Å. Details of the calculation are in the table below.
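The termination rule for the trajectories can be sketched as a simple classifier. This is a hedged illustration and not part of Jprogdyn: the function name and the frame format are hypothetical, and the product label for each frame is assumed to come from a separate geometric check that is not specified in this document. Only the 3.4 Å recrossing cut-off is taken from the text.

```python
def classify_trajectory(frames, recross_cutoff=3.4):
    """Classify a trajectory given as (C-C distance in angstrom, product label or None).

    Returns a tuple (outcome, product label, index of the terminating frame).
    """
    for step, (cc_distance, product_label) in enumerate(frames):
        if product_label is not None:
            return ("product", product_label, step)
        if cc_distance > recross_cutoff:
            return ("recrossing", None, step)
    return ("unfinished", None, len(frames) - 1)


# Hypothetical trajectories, one frame per time step.
forms_product = [(2.1, None), (1.8, None), (1.6, "ketone")]
recrosses = [(2.1, None), (2.9, None), (3.5, None), (3.9, None)]
outcome_a = classify_trajectory(forms_product)
outcome_b = classify_trajectory(recrosses)
print(outcome_a, outcome_b)
```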

Notes on previous works
The reaction with the greatest number of atoms comes from the work of Davies et al. 12

Key Structures
The structural information is available in the Cambridge Apollo Repository: https://doi.org/10.17863/CAM.96901
The key structures are included in the 'SI_key_structure_cobi' folder as opt+freq or opt Gaussian calculation output files. The optimisations were conducted at the B3LYP-D3/6-31G(d) level of theory. For the TSs, the filename listed below can be mapped to the label given in the calculation breakdown tables in Section 3 of this document.

Figure 1.1. Distribution of the final %Conf and r_opt from the CONFPASS test for confirming the completion of the re-optimisation process (ie the global minimum has been obtained).

Figure 2.1. The energy profile of the uncatalysed pathway. The ΔG values relative to the reactant are labelled on the diagram in kcal mol⁻¹. The stereochemistry in INT2 is RR.

Figure 7.1. Distribution of the size of the chemical systems covered in our previous work. 6,11 The reaction with the greatest number of atoms comes from the work of Davies et al. 12

For VRAI-multi, the designated folder must contain a minimum of one set of TS1 files, one set of INT files and two product mol files. It is acceptable to include additional sets of product or TS1 files, but the number of intermediate files in the folder should not exceed one set (ie a mol and out file).
- Geometry of the intermediate in mol file must have an 'int.mol' suffix.
- Gaussian16 frequency calculation output file for the intermediate must adopt the same filename as the corresponding mol file but with a '.out' suffix.
o Example: ol17ryu_e_int2_SSR_6_int.mol and ol17ryu_e_int2_SSR_6_int.out
- Gaussian16 frequency calculation output files for the second TS must have a 'TS2.out' suffix.
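The folder requirements above can be checked with a short script before running an analysis. The sketch below is an assumption-laden illustration rather than the file handling used by VRAI-multi: the function name is hypothetical, and because the naming of product mol files is not specified here, any mol file that is neither a TS1 nor an intermediate is treated as a product.

```python
import tempfile
from pathlib import Path


def collect_vrai_multi_inputs(folder):
    """Group the files in a folder by the suffix rules and validate the counts."""
    folder = Path(folder)
    names = sorted(p.name for p in folder.iterdir() if p.is_file())
    ts1 = [n for n in names if n.endswith("TS1.mol")]
    intermediates = [n for n in names if n.endswith("int.mol")]
    ts2_outputs = [n for n in names if n.endswith("TS2.out")]
    # Assumption: every remaining mol file is a product geometry.
    products = [n for n in names
                if n.endswith(".mol") and n not in ts1 and n not in intermediates]

    # Each TS1 or intermediate mol file needs a matching Gaussian .out file.
    missing = [n[:-4] + ".out" for n in ts1 + intermediates
               if n[:-4] + ".out" not in names]
    if missing:
        raise ValueError(f"missing frequency outputs: {missing}")
    if not ts1 or len(intermediates) != 1 or len(products) < 2:
        raise ValueError("need >= 1 TS1 set, exactly 1 intermediate set, >= 2 products")
    return {"TS1": ts1, "INT": intermediates,
            "TS2_out": ts2_outputs, "products": products}


# Demonstration with empty placeholder files; the TS2 and product names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["ol17ryu_e_int2_SSR_11_TS1.mol", "ol17ryu_e_int2_SSR_11_TS1.out",
                 "ol17ryu_e_int2_SSR_6_int.mol", "ol17ryu_e_int2_SSR_6_int.out",
                 "example_TS2.out", "product_a.mol", "product_b.mol"]:
        (Path(tmp) / name).touch()
    inputs = collect_vrai_multi_inputs(tmp)
print(inputs)
```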

Table 3.1. Calculation breakdowns for the predicted product percentages of the ketone-selective reaction with Cat-B based on TST only. Adding 'ol17ryu_b_int2_S' to labels given in the TS1 and TS2 columns gives the Gaussian 16 output filename of the structure.

Table 3.2. Calculation breakdowns for the predicted product percentages of the ketone-selective reaction with Cat-B incorporating VRAI-selectivity calculations (see Figure 5 in the main text for detail). Adding 'ol17ryu_b_int2_S' to labels given in the TS1 and TS2 columns gives the Gaussian 16 output filename of the relevant structure.

The ketone-selective reaction with Cat-C
TST only: Table 3.3. Calculation breakdowns for the predicted product percentages of the ketone-selective reaction with Cat-C based on TST only. Adding 'ol17ryu_c_int2_S' to labels given in the TS1 and TS2 columns gives the Gaussian 16 output filename of the structure.

Table 3.4. Calculation breakdowns for the predicted product percentages of the ketone-selective reaction with Cat-C incorporating VRAI-selectivity calculations. Adding 'ol17ryu_c_int2_S' to labels given in the TS1 and TS2 columns gives the Gaussian 16 output filename of the relevant structure.

Table 3.5. Calculation breakdowns for the predicted product percentages of the ketone-selective reaction with Cat-D based on TST only. Adding 'ol17ryu_e_int2_S' to labels given in the TS1 and TS2 columns gives the Gaussian 16 output filename of the structure.

Table 3.6. Calculation breakdowns for the predicted product percentages of the ketone-selective reaction with Cat-D based on TST and VRAI-selectivity calculations. Adding 'ol17ryu_e_int2_S' to labels given in the TS1 and TS2 columns gives the Gaussian 16 output filename of the relevant structure.

Table 3.8. Calculation breakdowns for the predicted product percentages of the epoxide-selective reaction based on TST and VRAI-selectivity calculations. Adding 'agw21ryu_d_int2_S' to labels given in the TS1 and TS2 columns gives the Gaussian 16 output filename of the relevant structure. NA implies that the process is barrierless and hence no corresponding TS structure is obtained via this pathway.

Table 3.9. Calculation breakdowns for the predicted product percentages of the aldehyde-selective reaction based on TST only. Adding 'jacs13ryu2_d_int2_S' to labels given in the TS1 and TS2 columns gives the Gaussian 16 output filename of the structure. Some files may have additional suffixes.

Table 3.10. Calculation breakdowns for the predicted product percentages of the aldehyde-selective reaction based on VRAI-selectivity calculations only. See Figure 7.C in the main text for elaborations on the VRAI-selectivity calculations. %(Pathway) values are calculated as a product of %(VRAI1) and %(VRAI2). Adding 'jacs13ryu2_d_int2_S' to labels given in the TS0, TS1 and TS2 columns gives the Gaussian 16 output filename of the relevant structure. Some files may have additional suffixes.

Table 3.11. Mean absolute error (MAE) calculations: This table presents the mean absolute error (MAE) data of the calculated percentages compared to the experimental percentages. The level of theory for the

Table 4.1. Distortion-interaction analysis results. Negative ΔU‡ values were obtained because the association between INT1 and the diazo compound, which is likely to be very exothermic, was not accounted for. Adding the text in the bracket to the TS1 label in each section gives the Gaussian 16 output filename of the relevant structure.

Wei et al. have previously reported a mechanistic study on our chosen reactions for the aldehyde-selective reaction at the B3LYP/6-31g(d)/propionitrile/PCM level of theory. We extracted the xyz coordinates of the key TS1 structures from the supporting information of their work. We re-optimised the structures and calculated ΔΔG‡ with the theory level used in this study.

Figure 4.4. Hirshfeld charge analyses on TS1 structures. Pathways linked with the above TS1 contribute to more than 50% of the final product composition.

Figure. Hirshfeld charge analyses on TS2 structures. The chosen TS2 structures have the lowest ΔG‡ among the TS2 structures that lead to the same product. Level of theory = ωB97XD/6-311g(d,p)//B3LYP-D3/6-31g(d)

D. INT1
Figure 4.6. The INT1 diastereoisomer stability in ketone-selective reactions: The catalysts have a stereochemistry of SS. If the formed B-O bond is syn to the N-H bond, the 'D' label is given. The 'U' label refers to the opposite stereochemistry, ie the formed B-O bond is anti to the N-H bond.

Table 5.1. Comparisons with previous studies. Adding the text in the bracket to the TS1 label in each section gives the Gaussian 16 output filename of the relevant structure. ΔΔG‡(TS1) values are calculated based on the ΔG‡(TS1) of the jacs13ryu2_d_int2_SRR_1_TS1_2 structure from this work.

Table 5.2. Benchmarking - inclusion of solvent models: Calculated percentage (TST) results are derived based on the TST assumption, in which the selectivity of the reaction depends on the kinetics of TS1 and TS2. Calculated percentage (with VRAI) results incorporate the VRAI-selectivity calculation outcomes. We assume that the stereochemistry is controlled by kinetics via TS1 and that the chemoselectivity, ie processes beyond TS1, is controlled by reaction dynamics. Unless otherwise specified, the level of theory for the calculations is

Table 5.5. Benchmarking - structure re-optimizations. Calculated percentage (TST) results are derived based on the TST assumption, in which the selectivity of the reaction depends entirely on the kinetics of TS1 and TS2. Calculated percentage (with VRAI) results incorporate the VRAI-selectivity calculation outcomes. We assume that the stereochemistry is controlled by kinetics via TS1 and that the chemoselectivity, ie processes beyond TS1, is controlled by reaction dynamics.

Table 5.6. Benchmarking - structure re-optimizations: This table presents the mean absolute error (MAE) data of the calculated percentages compared to the experimental percentages.
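For reference, the MAE used in the benchmarking tables is the mean of the absolute differences between calculated and experimental percentages. A minimal sketch, with a hypothetical function name and made-up percentages that are not results from this study:

```python
def mean_absolute_error(calculated, experimental):
    """Mean of |calculated - experimental| over paired percentage values."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(c - e) for c, e in zip(calculated, experimental)) / len(calculated)


# Illustrative values only: two products, calculated vs experimental percentages.
mae = mean_absolute_error([90.0, 10.0], [80.0, 20.0])
print(mae)
```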

Table 6. With the lowest energy TS1 of the ketone-selective reaction with Cat B as an example, we compare the CPU time required to run the corresponding VRAI-selectivity calculation and the MD simulation for producing reaction trajectories.