Mechanistically informed predictions of binding modes for carbocation intermediates of a sesquiterpene synthase reaction

Predicting the binding mode of carbocations produced in sesquiterpene synthase enzymes is not unlike finding a piece of hay in a haystack. A new method for tackling this problem is described.

Two-dimensional Scan for Identifying the 3 to 5 Transition State Structure.
In our attempt to find a discrete transition state structure (TSS) linking 3 to 5, a two-dimensional scan was performed over the carbon-carbon bond-forming event (red in Figure A) and the carbon-hydrogen bond-forming event (blue in Figure A). The hydrogen was modeled as being donated from an activated phenol (a placeholder for tyrosine). The carbon-carbon distance was scanned from 1.8 Å to 3.5 Å, and the carbon-hydrogen distance from 1.0 Å to 2.0 Å. The energy was then plotted against the two scanned distances (Figure B). Although it is hard to see in Figure B, the surface contains a clear ridge, which may correspond to a TSS. A limited portion of the scan that better illustrates the ridge is shown below as Figure C. Multiple points along the ridge were submitted to TSS optimizations, but none converged to a true TSS.
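As a rough illustration of how such a scan can be screened for TSS guesses, the sketch below builds the two distance axes and flags grid points that are a maximum along one scanned coordinate and a minimum along the other. The 0.1 Å step size, the function names and the axis-aligned saddle test are assumptions made for this example only; they are not taken from the calculations described above, and a ridge that runs diagonally across the grid would need a more careful search.

```python
def scan_axis(start, stop, step):
    """Return evenly spaced scan distances (in angstroms), both endpoints included."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, 6) for i in range(n + 1)]


def saddle_candidates(energy):
    """Flag interior grid points that look like first-order saddles.

    energy[i][j] is the energy at C-C distance index i and C-H distance index j.
    A point is flagged if it is a strict maximum along one axis and a strict
    minimum along the other (an axis-aligned heuristic, assumed for illustration).
    """
    hits = []
    for i in range(1, len(energy) - 1):
        for j in range(1, len(energy[i]) - 1):
            e = energy[i][j]
            up, down = energy[i - 1][j], energy[i + 1][j]
            left, right = energy[i][j - 1], energy[i][j + 1]
            max_i = e > up and e > down
            min_i = e < up and e < down
            max_j = e > left and e > right
            min_j = e < left and e < right
            if (max_i and min_j) or (min_i and max_j):
                hits.append((i, j))
    return hits


# Scan ranges from the text; the 0.1 angstrom step is an assumed value.
cc_distances = scan_axis(1.8, 3.5, 0.1)
ch_distances = scan_axis(1.0, 2.0, 0.1)
```

Each flagged index pair maps back to a (C-C, C-H) distance pair through `cc_distances[i]` and `ch_distances[j]`, which could then serve as a starting geometry for a TSS optimization.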

Coordination Constraints
Nine different constraint files were used during docking, one for each catalytic motif. The majority of the constraints were used to dock the Mg/PPi complex into the active site in an orientation similar to those observed in crystal structures. These are called coordination constraints because they ensure protein coordination to the magnesium ions. The coordination constraints were generated by measuring the distances, angles and dihedrals in crystal structures that contain all three magnesium ions; the average of each measurement was taken as the constraint value and twice its standard deviation as the constraint window (Table 1).
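The averaging step can be sketched as follows. This is a minimal example, assuming the sample standard deviation is meant (the text does not say whether sample or population statistics were used); the function name is hypothetical and the numbers in the usage line are made up, not values from Table 1. Plain averaging also assumes that dihedral measurements do not wrap around ±180°.

```python
from statistics import mean, stdev


def constraint_from_measurements(values):
    """Return (constraint value, constraint window) for one geometric parameter.

    values: the same distance, angle or dihedral measured in several crystal
    structures. The constraint value is the mean and the window is twice the
    sample standard deviation (the choice of sample statistics is assumed).
    """
    center = mean(values)
    window = 2 * stdev(values)
    return center, window


# Made-up Mg-O distances in angstroms, for illustration only.
value, window = constraint_from_measurements([2.0, 2.1, 2.2])
```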

Heatmaps
For a more detailed explanation of how the heatmap was constructed, see Figure D. In the main text of the paper only the percentages of low-energy structures were shown. Included below are the absolute numbers of structures found for each catalytic orientation, or motif, for docking into the 5EAT crystal structure.
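The relationship between the absolute counts and the percentages can be sketched as below. This assumes that each percentage is a motif's share of all low-energy structures found in one docking run; the exact normalization used for the main-text heatmap is described in Figure D, so treat the denominator here as an assumption. The function name and the counts in the usage line are hypothetical.

```python
def motif_percentages(counts):
    """Convert per-motif structure counts to percentages of the total.

    counts: dict mapping a motif label to the number of low-energy
    structures found for that motif.
    """
    total = sum(counts.values())
    if total == 0:
        # No low-energy structures at all: report zero for every motif.
        return {motif: 0.0 for motif in counts}
    return {motif: 100.0 * n / total for motif, n in counts.items()}


# Made-up counts, for illustration only.
example = motif_percentages({"motif 1": 1, "motif 2": 3})
```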

Misleading Crystal Structures.
Based on the orientations observed in crystal structures, one would expect motif 2 to be the most likely to score well (Figure G1). Three different crystal structures have similar binding orientations (Figure G2). With the methodology used here, however, that orientation scores poorly.
Figure G. (1) Crystal structure 3LZ9 with the three oxygens labeled according to the constraints chart (Figure 3) in the main text. The distance labeled is the distance from oxygen B to the carbon to be deprotonated. (2) The three crystal structures cited in the main text that all appear to support orientation 2 as the most likely. The green structure is 3LZ9, the cyan structure is 3M01 and the magenta structure is 5EAU. The substrate analog in 3LZ9 and 3M01 is 2-fluorofarnesyl diphosphate. The substrate in 5EAU is 3-trifluoromethylfarnesyl diphosphate.

Included here are representative structures for the populations that make up the trimodal distribution found in the 6 to 7 RMSD calculation (Figure H).
Figure H. The three binding modes found in the trimodal distribution in the RMSD calculation.
(1) The low-RMSD population, which aligns well with structures from intermediate 6. (2) The docking orientation with RMSD values of ~3 Å. This population is rotated 180° from the orientation in (1).
(3) A representative of the highest-RMSD docking orientation. (4) All three docking orientations aligned together.

Bimodal Distribution
In the main text, the bimodal appearance of the 2 to 3 transition is largely a result of the conformational freedom of the isopropylene tail in intermediate 3 (Figure I). When the RMSD is recalculated for that transition, the distribution becomes more Gaussian-like (Figure J).
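A minimal sketch of an RMSD calculation with an optional atom exclusion list is given below. It assumes that the recalculation leaves out the atoms of the flexible tail, which the text implies but does not state, and that no superposition is needed because both docked poses sit in the same protein frame. The function name and argument layout are hypothetical.

```python
import math


def rmsd(coords_a, coords_b, exclude=()):
    """Root-mean-square deviation between two poses of the same ligand.

    coords_a, coords_b: lists of (x, y, z) tuples in matching atom order.
    exclude: atom indices to leave out (for example, a flexible tail).
    No superposition is performed; both poses are assumed to share a frame.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("poses must have the same number of atoms")
    skip = set(exclude)
    kept = [i for i in range(len(coords_a)) if i not in skip]
    if not kept:
        raise ValueError("no atoms left after exclusion")
    squared = sum(
        (a - b) ** 2
        for i in kept
        for a, b in zip(coords_a[i], coords_b[i])
    )
    return math.sqrt(squared / len(kept))
```

Calling `rmsd(pose_1, pose_2)` and `rmsd(pose_1, pose_2, exclude=tail_indices)` on the same pair of poses shows how much of the deviation comes from the excluded atoms alone.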

Partial charges of cations in Rosetta
Rosetta does not have discrete terms for handling carbocations in its scoring function. Rosetta does assign partial charges to all atoms, but those partial charges, which are derived from QM calculations, are designed for handling proteins. To investigate the impact of the partial charges on the results, the partial charges that Rosetta assigns were overwritten with two different sets of partial charges from QM calculations. The first set was the Mulliken charges; the second was the electrostatic potential charges calculated with the pop=chelpg option in Gaussian09. Intermediate 7 was docked with each of the three partial charge options (Figure K). The type of partial charge does not change the outcome of the docking: all three options identify the same catalytic motif as the most likely. This result led us to use the default Rosetta charges in order to make the method as generalizable as possible. That the partial charges do not make a bigger difference leads us to conclude that docking in our simulations is dominated by shape. In addition, many of the physical properties that chemists attribute to electrostatics are included in other portions of the scoring function; e.g., hydrogen bonding is covered by its own energy term. As yet, Rosetta does not include terms for non-classical hydrogen bonding interactions, such as cation-π interactions; that is something our group is working on adding.
3) (blue header) Docking results when the calculated electrostatic potential charges replaced the default charges used in Rosetta. Although specific numbers change slightly, there was no significant difference in the results regardless of which partial charges were used.
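One way to overwrite the default charges is to edit the ligand params file directly, sketched below. This assumes the usual layout of ATOM lines in a Rosetta ligand params file (the keyword ATOM, then the atom name, the Rosetta atom type, the MM atom type and the partial charge); it is not necessarily how the charges were replaced in this work. The function name, the atom names and the charge values are hypothetical.

```python
import re

# ATOM <name> <rosetta type> <mm type> <charge> [anything else]
_ATOM_LINE = re.compile(r"^(ATOM\s+(\S+)\s+\S+\s+\S+\s+)(\S+)(.*)$")


def override_charges(params_text, new_charges):
    """Return params_text with partial charges replaced for the named atoms.

    new_charges: dict mapping atom name to the QM-derived partial charge
    (for example Mulliken or CHelpG values). Lines for atoms not listed,
    and all non-ATOM lines, are passed through unchanged.
    """
    out = []
    for line in params_text.splitlines():
        match = _ATOM_LINE.match(line)
        if match and match.group(2) in new_charges:
            prefix, name, _old, rest = match.groups()
            # Keep the columns before the charge exactly as they were.
            line = f"{prefix}{new_charges[name]:.2f}{rest}"
        out.append(line)
    return "\n".join(out)
```

Because only the charge field is rewritten, the same docking input can be reused for each charge set, with the params file being the only thing that differs between runs.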