Comparative Conformational Analyses and Molecular Dynamics Studies of Glycylglycine Methyl Ester and Glycylglycine N-methylamide

Compared to their amide analogs, peptidic esters have a lower propensity for intramolecular hydrogen bonding, and thus most likely quite different stable geometries. On the other hand, their similarity and facile conversion to peptides has led to their broad use in synthetic and biological applications. This dichotomy creates a need to understand their conformational properties. Here, we study the geometries of glycylglycine methyl ester (GGMe, the simplest dipeptide ester) and its amide counterpart (GGAm) using density functional methods. The optimized conformational states were analysed in gas phase and also using a dielectric continuum aqueous phase model. In addition, molecular dynamics studies were carried out to explore effects of molecular water solvation on structure and conformational flexibility. The two atom change, from amide to ester, results in significantly different conformational profiles and solvation characteristics. In gas phase calculations, the strength of the CO–HN (3→1) intramolecular hydrogen bond in GGAm determines its minimum energy conformation, while GGMe is extended; cis-geometries are more energetic by 6 or 5 kcal mol−1 for the two molecules, respectively. The addition of a continuum dielectric to model an aqueous phase environment weakens hydrogen bonding such that the intramolecular H-bonds are replaced by geometries with less internal strain and more ideal chemical topologies. As a further consequence of the electrostatic shielding, the relative energies of the cis-geometries are reduced by more than half. Molecular dynamics simulations predict GGAm to be more flexible and more extensively solvated than GGMe. Roughly 40% of the increased solvation is due to the additional hydrogen bond donor NH group of the amide; the rest is due to increased hydrogen bonding to the amide oxygen. These analyses of the solvent dependent structural characteristics of simple peptides and peptide esters provide a basis for understanding and design applications in biological recognition, drug design, and synthetic chemistry.


Introduction
Peptides and peptide-like molecules have long been of interest for chemists worldwide. 1,2As methods have advanced, attempts have been made regularly to rationalize local conformations of amino acid residues and relative stabilities of peptide chains using theoretical chemistry and small prototypes such as dipeptides [3][4][5][6][7][8] of glycine and/or alanine.Such studies have successfully established the importance of amidic hydrogen bonds in peptide chain folding, for example, the greater stability of C 7 -type conformations 4,6,7 compared to the fully extended (C 5 -type) ones.In contrast, peptidic esters lack amidic hydrogen bonding capabilities at the ester group, leading to an expectation 9,10 of significantly different conformational stability proles.Indeed, the replacement of amide groups by ester groups has oen been used to study the role of amidic hydrogen bonding experimentally [11][12][13][14][15][16][17][18][19] in peptide folding and protein-protein interactions.However, peptidic esters have not received as much attention for theoretical studies, despite their wide use in peptidomimetic chemistry [20][21][22][23][24] and as starting materials [25][26][27] for synthesis of cyclic peptides.Thus, theoretical studies describing the implications of amide-to-ester alteration on a peptidic scaffold can be useful for various purposes, including peptidomimetic drug design and synthetic chemistry applications involving cis/trans isomerization or cyclization reactions.Such studies would benet from a solid foundation of detailed conformational studies on simple peptidic esters, including analyses of trans and cis geometries, conformational exibility and solvent interactions.
Recently, we reported 28 theoretical studies on cis/trans isomerization in secondary amides using density functional calculations, where we used N-methylacetamide (NMA) and glycylglycine methyl ester (GGMe) as model molecules to understand peptide bond isomerization.We found systematic theoretical studies on conformational properties of GGMe, its substituted derivatives, or other dipeptide esters or peptidic esters to be lacking.As the simplest peptidic ester, theoretical studies and conformational analyses on GGMe should provide a solid foundation for detailed comparative studies on substituted dipeptide esters.NMA provides a good comparison to study the properties and geometry of the amide functionality, 29,30 while comparison with its amide counterpart glycylglycine N-methylamide (GGAm) would demonstrate the effect of the change from amide to ester in terms of conformational stability.With this background, we present our theoretical studies on GGMe, and compare the ndings with NMA and GGAm (molecules shown in Fig. 1).

Experimental
Generation of structures NMA, GGMe and GGAm structures were initially generated as 2D structures with trans orientations using MarvinSketch (ChemAxon, Budapest, Hungary) and were imported to the Maestro 31 module of the Schrodinger suite (Schrödinger, LLC.New York City, USA) from which all further studies were performed.The cis geometries were generated by manual adjustment of u dihedral.

Conformational search
Initial rotamer sets for both the trans and cis geometries were generated with the MacroModel 32 conformational search algorithm using the MMFFs force-eld.The resultant conformational search geometries with energies less than 21 kJ mol À1 were subjected to geometric optimization.For gas phase calculations, the conformational search gave 39 trans and 30 cis rotamers of GGMe, and 12 trans and 39 cis rotamers of GGAm.For the water phase, the conformational search gave 37 trans and 29 cis rotamers of GGMe, and 32 trans and 37 cis of GGAm.

Geometric optimization
Geometric optimizations were performed using Jaguar. 33,34The conformational search rotamers were subjected to geometric optimization at B3LYP/6-31++G** level with maximum grid density and the "accurate" accuracy level of SCF.Frequency analyses were carried out to conrm convergence to optimized minimum energy geometries with no imaginary frequencies.For NMA, "tight" convergence criteria were required to determine the optimized geometry.For GGMe and GGAm, default Fig. 1 GGMe, NMA and GGAm structures, atom numbers and important measurements that will be used in the following discussion.convergence criteria were sufficient.Besides these settings, the PBF solvent model with the default setting for water was used for optimization in a water dielectric continuum.For GGMe, 18 trans and 20 cis unique optimized geometries were found for the gas phase, and 23 trans and 14 cis unique optimized geometries were found for the water phase.For GGAm, 12 trans and 20 cis unique optimized geometries were found for the gas phase, and 21 trans and 18 cis unique optimized geometries were found for the water phase.

Single point energy calculations
Accurate single point energies were calculated using Jaguar 33,34 for all optimized geometries at the B3LYP/6-311++G (3df, 3pd) level, with maximum grid density and the "accurate" accuracy level of SCF.Along with these settings, the PBF solvent model with the default settings for water was used for the calculations in water dielectric continuum.Vibrational analyses were carried out at the B3LYP/cc-pVTZ(-f)++ level and the free energy values obtained for 298.15K were used to calculate relative free energies.

Molecular dynamics
The molecular dynamics simulations were carried out using the program Desmond.Using the optimized minimum energy geometry of GGMe/GGAm, the MD systems were generated with System Builder module based on TIP4Pew solvent model as a 20 A Â 20 A Â 20 A orthorhombic box with default settings.The systems thus generated were then subjected to NPT molecular dynamics simulation at 300 K and 1.01325 bar for total 12 ns with 1.2 ps recording time and trajectory spanning 4.8 ps, resulting into 2500 frames each.a Gas phase energy relative to the minimum energy trans geometries, calculated in kcal mol À1 at B3LYP/6-311++G (3df, 3pd) level.b Relative free energy at 298.15 K relative to minimum energy trans geometries, calculated in kcal mol À1 at B3LYP/cc-pVTZ(-f)++ level.

Conformations in gas phase
The most stable trans and cis geometries of GGMe, NMA and GGAm in the gas phase are shown in Fig. 2, and their key structural parameters have been tabulated in Table 1.As we have reported 28 recently, the optimized trans and cis geometries of NMA were found to be more planar than those reported previously 35 at SCF and MP2 levels, and the relative stabilities of trans and cis geometries were in excellent agreement. 29,36With the absence of amidic hydrogen on C-terminal, GGMe was found to be most stable in an extended (C 5 ) trans form, which is in sharp contrast to GGAm, whose most stable conformation was the trans form with usual C 7 -type (g-turn) folding. 7,37The characteristics of the C 7 -form of GGAm, e.g. an intramolecular hydrogen bond between O10 and H20 (2.05 A), were consistent with previously reported C 7 -forms of similar compounds, such as the glycine dipeptide.Other folding geometries (such as bturn, a-turn, d-turn etc.) require longer peptide chains. 37,38nstead, both GGMe and GGAm showed 2/1 hydrogen bonding (2.24A and 2.18 A, respectively) between the pyramidal amino nitrogen of N-terminal and the amide hydrogen of the peptide bond.The extended C 5 -form of GGMe also implied a 2/2 intramolecularly H-bonded conformation, and thus the hydrogen bond here (2.35 A) for the C-terminal glycine residue appeared bifurcated (and therefore longer than 2.3 A) because of simultaneous 2/1 hydrogen bonding. 37In contrast, with no such bifurcation, the cis form of GGMe (also in an extended conformation) showed a clearer 2/2 intramolecular hydrogen bonding (2.27 A).While the "peptide backbone" in both trans and cis isomers of GGMe and NMA remained largely planar, in contrast to GGAm, it deviated only somewhat from ideal planarity, with all consecutive dihedral angles within a range of AE5 of perfect planarity (dihedral values of AE180 or 0 ).The cis and trans forms represent two distinct peptide geometries, each of which is similar for all three compounds.The C-N peptide bond lengths as well as bond angles :CNC were larger for the cis forms than those for the trans forms, reecting the repulsion between two carbons on the two sides of the peptide unit As we have analysed previously, 28 NMA is a small and simple molecule with only methyl groups attached to the amide moiety, and lacks asymmetric rotatable bonds and substitutions that create multiple local minima rotamers.Hence, a particular optimized state (such as trans or cis) can be effectively represented by a single respective geometry.However, this is not the case for larger molecules, such as GGMe and GGAm, as multiple rotatable bonds give rise to multiple local minimum rotamers (see Experimental).Fig. 3 shows the ensembles of such optimized structures of GGMe and GGAm in gas phase.This serves as an illustration of a rst level of complexity in real peptidic and peptidomimetic systems, with far more rotatable bonds, giving rise to multiple rotamers.
As shown in Fig. 3, the cis geometries are signicantly unstable compared to the trans geometries (by >4 kcal mol À1 ), so conformational analysis of major forms requires detailed discussion of only trans geometries.
For the trans GGAm geometries, the C 7 -type conformation is characterized by an intramolecular 3/1 amidic hydrogen Table 2 Intramolecular hydrogen bonds for the optimized conformations of trans GGAm in gas phase a a HB: hydrogen bond between two atoms; "+": present; "À": absent.
Table 3 Intramolecular hydrogen bonds for the optimized conformations of trans GGMe in gas phase a a HB: hydrogen bond between two atoms; "+": present; "À": absent.bond, while C 5 -type conformations are characterized by weaker 2/2 intramolecular hydrogen bonding.Among the 12 optimized conformations identied for trans GGAm (Table 2), only 2 are C 5 (entries 5 and 6), and other 10 are C 7 geometries.The most stable C 5 conformer was predicted to be $1.63 kcal mol À1 more energetic than the most stable C 7 conformer of GGAm.Additionally, the 2/1 hydrogen bonding also appeared to be important, as all 5 conformers of GGAm with such hydrogen bonding were found to be more stable than the other 7, which lack the hydrogen bond.In terms of energy, this additional stabilization seemed to affect the C 7 conformation more than C 5 .While the lack of 2/1 hydrogen bonding destabilized the C 5 geometry by only 0.5 kcal mol À1 (entries 5 vs. 6), the same difference among C 7 conformers resulted in destabilization by $2.9 kcal mol À1 (entries 1-4 vs. 11 and 12).
Fig. 4 The ensembles of optimized rotamers of trans (upper two rows) and cis geometries (lower two rows) of GGMe (left group) and GGAm (right group) in the water phase dielectric continuum model.The geometries are aligned at the amide bond and projected from side (left), Nterminus (middle) and above (right) within each group.The second and fourth rows display the conformers colored according to their relative stabilities.
Fig. 5 Optimized geometries of GGMe (upper row) and GGAm (lower row) with energies less than 0.6 kcal mol À1 (equiv.to 1 RT at 300 K) in gas phase (left) and in water dielectric continuum (right).GGMe, lacking of the amidic hydrogen on C-terminus, shows no C 7 -type stable conformation, leaving the most stable conformer to be a C 5 conformer.As listed in Table 3, the j dihedral of the C-terminal glycine residue (dihedral 4-5-6-7) was observed to be energetically decisive, with values close to AE180 or within 10 of 0 .The 7 geometries with dihedral 4-5-6-7 values close to AE180 were more stable than the other 11 geometries with the value close to 0 .The absence of 2/1 hydrogen bonding destabilized the extended forms by $1 kcal mol À1 (entries 1 and 2 vs. 4 and 5 in Table 3).

Conformations in water phase
The calculations in a solvent phase require consideration of solvent interactions.Because QM studies with explicit solvent molecules are not feasible, such calculations are typically carried out with implicit solvent models, whereby a solvent is modelled as a dielectric continuum.However, this simplication eliminates any consideration of the effects of molecular water, especially hydrogen bonding into the water structure.Therefore, such studies may be used to evaluate a low energy set of conformational states, rather than identication of a specic minimum energy conformation.
For initial evaluation of aqueous phase geometries, 23 trans and 14 cis optimized rotamers of GGMe, as well as 21 trans and 18 cis optimized rotamers of GGAm (Fig. 4) were identied using a PBF solvent model.From the geometries, it is evident that intramolecular H-bonds are no longer decisively important conformational determinants in the polar water dielectric continuum.The conformations with higher propensity to interact with solvent molecules are estimated to be more stable  than the conformations with intramolecular H-bonding, thereby resulting in a signicantly different conformational prole compared to the gas phase.Thus, in striking contrast to the gas phase geometries (see above), none of the optimized GGAm geometries showed a C 7 conformation, and only one optimized geometry showed an extended C 5 conformation (relative energy: 0.68 kcal mol À1 ).Similarly, a C 5 conformation of GGMe is no longer the most stable conformation (relative energy: 0.53 kcal mol À1 ).The solvent interactions also stabilize the cis geometries, and hence the most stable cis geometry is estimated to be <2 kcal mol À1 , compared to >4 kcal mol À1 in the gas phase.Overall, the geometries of GGMe and GGAm show very different conformational properties in both the gas phase and the aqueous dielectric continuum model, especially for stable geometries (up to relative energies $0.6 kcal mol À1 ) at room temperature (Fig. 5).

Solvation and hydrogen bonding with water
Physical aqueous phases involve interactions with molecular water, requiring modelling techniques beyond simple continuum dielectric models, especially when hydrogen bonding with water occurs.A major question thus becomes the average structure of hydrogen bonding interactions.For this study, we addressed this question in terms of average hydrogen bonding over time using molecular dynamics simulations of GGMe and GGAm solvated by TIP4Pew water at 300 K.
The average hydrogen bonding values were calculated over the simulation times as a ratio of total number of hydrogen bonds observed for a specic atom to the total number of frames.Different atoms showed different propensities to form hydrogen bonds (Table 4).Overall, for an average frame, GGMe showed moderate solvation by the water phase with an average total of 5.6 hydrogen bonds.In contrast, GGAm is better solvated, with a total of 6.5 hydrogen bonds on average.On a closer look, it is evident that the difference results from the altered hydrogen bond forming capabilities of amide and ester functionalities.The amide group in GGAm provides an additional hydrogen with a higher hydrogen bond forming propensity (0.46) relative to the ester sp 3 oxygen (0.09) of GGMe.The remainder of the difference can be explained by the relative propensities of the amide and ester carbonyl oxygen atoms to form hydrogen bonds with water.The GGMe O9 atom appears in 1364, 993, and 57 frames (of the total 2500 frames) as singly, doubly, or triply hydrogen bonded to water, whereas GGAm structures show these frequencies to be 313, 1798, and 377 frames, respectively.(The OPLS3 force eld assigns a more negative charge to the amide carbonyl oxygen compared to the ester carbonyl oxygen; whether this or other parameterization differences are decisive cannot be evaluated, as the details are proprietary.)The triply hydrogen bonded structures reect a tetrahedral geometry (Fig. 6) of the oxygen atom, consistent with an amide resonance structure that places greater negative charge on it.Overall, the force eld parameterization, in combination with structural details, predict the stronger solvation of carbonyl O9 of GGAm (2.03) compared to the carbonyl oxygen of the ester group in GGMe (1.41).

Conformational exibility
The comparative conformational exibility of peptidic esters is an important consideration for their use as peptidomimetics.Conformational exibility is also related to the reactivity for intramolecular cyclization.As shown in Fig. 7, the MD studies revealed a high degree of conformational variability for GGAm, represented by frequent transitions between low RMSD (<0.5) geometries and high RMSD ($1.5) geometries, as well as a broader distribution of radius of gyration (rGyr).On the other hand, GGMe geometries displayed relatively lower conformational variability, represented by less frequent transitions between one cluster of low RMSD (<0.5) geometries and another of high RMSD ($1.5) geometries, as well as a narrower distribution of radius of gyration (rGyr).
This difference in conformational exibility was analysed in terms of specic scaffold dihedrals, i.e. frequency histograms of the scaffold dihedral angles for the MD geometries (Fig. 8).Here also, GGMe showed less exibility than GGAm.The most noteworthy difference was observed for dihedral 4-5-6-7 (j dihedral of the C-terminal glycine residue, row 5), where GGAm showed many more structures with the dihedral values between À90 to 90 ; GGMe had virtually none.Interestingly, this is also consistent with the observations from the gas phase geometries, that the dihedral 4-5-6-7 values near 0 are unfavourable compared to values near AE180 .GGMe also demonstrates relatively restricted preference for dihedral 3-4-5-6 values dihedral of the C-terminal glycine residue, row 4) close to AE90 , having few geometries with the dihedral values close to 0 or AE180 .On the other hand, GGAm shows a considerable proportion of geometries close to AE180 .This is clearly reected in the Ramachandran plot for the C-terminal glycine residue of GGMe and GGAm geometries (Fig. 9).GGAm geometries are spread across all 5 regions (a, a L , b P , b L , b S ), and predominantly within the b P and b L regions (unlike the distribution in proteins, which is predominantly within the a and a L regions 39 ).In contrast, the GGMe conformations are almost exclusively within the b P and b L regions.Interestingly, QM studies with the aqueous dielectric continuum model for GGMe provide a few moderately stable geometries in (180, 0) region, but no such geometries are observed in molecular dynamics.

Conclusions
Peptidic esters are important, as prodrugs or peptidomimetics, and also as precursors of cyclic peptides.Here we have described conformational studies of the simplest peptidic ester, glycylglycine methyl ester (GGMe) using density functional studies and compared the results with its amide counterpart (GGAm).Multiple optimized rotamers of trans and cis geometries in gas phase and water dielectric continuum were identi-ed, showing how intramolecular hydrogen bonding determined lowest energy conformations in the gas phase, but not in the water phase.To study effects of molecular water, solvation properties and conformational exibility were analysed with molecular dynamics.These simulations predicted weaker solvation at both oxygen atoms of the ester group, along with lower exibility of GGMe.These analyses of the solvent dependent structural characteristics of simple peptides and peptide esters provide a basis for further theoretical studies, such as on substituted derivatives of GGMe or other larger peptidic esters, with potential implications for the overall extent of exibility and reactivity.Such studies are important for understanding and design applications in biological recognition, drug design, and synthetic chemistry.9 Ramachandran plot in terms of shifted f (0 to 360 ) and j (À90 to 270 ) dihedral coordinates of the C-terminal glycine residue in GGMe and GGAm geometries observed in molecular dynamics at 300 K for 12 ns as well as in the QM derived geometries in gas phase and water dielectric continuum.The relative energy scale is applicable to the QM derived geometries only.

Fig. 3
Fig.3The ensembles of optimized rotamers of trans (upper two rows) and cis geometries (lower two rows) of GGMe (left group) and GGAm (right group) in gas phase calculations.The geometries are aligned at the amide bond and projected from side (left), N-terminus (middle) and above (right) within each group.The second and fourth rows display the conformers colored according to their relative stabilities.The 3/1 hydrogen bond in C 7 -geometries of trans GGAm is shown as yellow dashed lines.
. The ester group in GGMe shows two distinct bond-lengths for carbonyl C]O and methoxy C-O bonds, with a shorter C]O bond (1.21 A) and a longer C-O bond (1.34 A) for both cis and trans geometries.The bond angles :C 5 -C 6 -O 7 and :C 6 -O 7 -C 8 are measured $111 and $116 respectively.The most stable cis geometry for GGMe is less unstable than that for GGAm relative to the respective trans isomers.

Fig. 6 (
Fig. 6 (A) Clustering of water molecules for the 17 of the first 100 frames that show simultaneously three solvent hydrogen bond interactions with O9 of GGAm.The geometries are aligned at the Cterminal amide groups (only the atoms neighbouring the amide groups are shown).(B) The amide carbonyl O9 of GGAm displays tetrahedral geometry while making simultaneously three hydrogen bonds with water molecules.

Fig. 7
Fig. 7 Different properties of GGMe (left) and GGAm (right) geometries observed during 300 K NPT molecular dynamics simulation plotted along time as well as displayed as frequency distribution histogram.MolSA: molecular surface are calculated with 1.4 A probe radius, equivalent to a van der Waals surface area; SASA: solvent accessible surface area; PSA: polar surface area i.e. solvent accessible surface area contributed only by oxygen and nitrogen atoms.

Fig. 8
Fig. 8 Distribution of scaffold dihedrals during the molecular dynamics studies on GGMe and GGAm.The plots summarize the conformational evolution throughout the simulation trajectory (0.00-12.00 ns).The colour-coded scaffold dihedrals are highlighted on 2D structures of GGMe and GGAm on top.Each dihedral is accompanied by a dial plot and bar plots of the same color.Dial plots describe the conformation of the torsion throughout the course of the simulation.The beginning of the simulation is in the center of each dial plot and the time evolution is plotted radially outwards.The bar plots represent the frequency distribution of the dihedral angle values.

Fig.
Fig.9Ramachandran plot in terms of shifted f (0 to 360 ) and j (À90 to 270 ) dihedral coordinates of the C-terminal glycine residue in GGMe and GGAm geometries observed in molecular dynamics at 300 K for 12 ns as well as in the QM derived geometries in gas phase and water dielectric continuum.The relative energy scale is applicable to the QM derived geometries only.

Table 1
Optimized minimum energy geometries of trans and cis GGMe, NMA and GGAm

Table 4
Atom-wise distribution of hydrogen bonds formed by GGMe and GGAm with surrounding water molecules during NPT molecular dynamics at 300 K for 12 ns, analyzed in terms of 2500 frames.Each frame corresponds to 4.8 ps