Francisco
Adasme-Carreño
*ab,
Alvaro
Ochoa-Calle
*c,
Marcelo
Galván
c and
Joel
Ireta
*c
aCentro de Investigación de Estudios Avanzados del Maule (CIEAM), Vicerrectorá de Investigación y Postgrado Universidad Católica del Maule, Talca 3480112, Chile. E-mail: fadasme@ucm.cl
bLaboratorio de Bioinformática y Química Computacional (LBQC), Departamento de Medicina Traslacional, Facultad de Medicina, Universidad Católica del Maule, Talca 3480112, Chile
cDepartamento de Química, División de Ciencias Básicas e Ingeniería, Universidad Autónoma Metropolitana-Iztapalapa, Ciudad de México 09340, Mexico. E-mail: aochoagauge@gmail.com; iret@xanum.uam.mx
First published on 7th February 2024
Proper description of solvent effects is challenging for theoretical methods, particularly if the solute is a zwitterion. Here, a series of theoretical procedures are used to determine the preferred solvated conformations of twelve hydrophobic dipeptides (Leu-Leu, Leu-Phe, Phe-Leu, Ile-Leu, Phe-Phe, Ala-Val, Val-Ala, Ala-Ile, Ile-Ala, Ile-Val, Val-Ile and Val-Val) in the zwitterionic state. First, the accuracy of density functional theory (DFT), combined with different implicit solvent models, for describing zwitterions in aqueous solvent is assessed by comparing the predicted against the experimental glycine tautomerization energy, i.e., the energetic difference between canonical and zwitterionic glycine in aqueous solvents. It is found that among the tested solvation schemes, the charge-asymmetric nonlocally determined local-electric solvation model (CANDLE) predicts an energetic difference in excellent agreement with the experimental value. Next, DFT-CANDLE is used to determine the most favorable solvated conformation for each of the investigated dipeptide zwitterions. The CANDLE-solvated structures are obtained by exploring the conformational space of each dipeptide zwitterion concatenating DFT calculations, in vacuum, with classical molecular dynamics simulations, in explicit solvents, and DFT calculations including explicit water molecules. It is found that the energetically most favorable conformations are similar to those of the dipeptide zwitterions in their respective crystal structures. Such structural agreement is indicative of the DFT-CANDLE accomplishment of the description of solvated zwitterions, and suggests that these biomolecules self-assemble as quasi-rigid objects.
All hydrophobic natural dipeptides have been crystallized (Fig. 2).14 Some of these dipeptides crystallize forming pores, and others do it as a set of layers.14 In these crystals, dipeptides assemble as zwitterions, i.e., with –NH3+ and –COO− as the capping groups (Fig. 1). In aqueous solvents, dipeptides prefer to be in the zwitterionic state too.15 Therefore, we would like to get insight into the influence that such locally charged but globally neutral state has on dipeptide conformations, which may be crucial along the self-assembly pathway of these systems. In this work, we use theoretical methods for investigating the influence of the zwitterionic state in the conformational preferences of two sets of hydrophobic natural dipeptides. Both sets of dipeptides crystallize forming pores. The set here named class V–A comprises the Ala-Val (AV), Val-Ala (VA), Ala-Ile (AI), Ile-Ala (IA), Ile-Val (IV), Val-Ile (VI) and Val-Val (VV) dipeptides, and the set labelled as class F–F includes the Phe-Phe (FF), Leu-Leu (LL), Leu-Phe (LF), Phe-Leu (FL) and Ile-Leu (IL). Determining peptide zwitterion conformations in solution is challenging using theoretical methods. This is primarily due to the inadequacy of the procedures commonly used for describing solvent effects, particularly in situations in which the solute is locally or globally charged.
Commonly, the non-zwitterionic N-acetyl-alanyl-N′-methylamide (Ac-Ala-NHMe) molecule is used to investigate peptide conformational preferences.16 Ac-Ala-NHMe is a capped peptide comprising two peptide bonds flanking a single side chain; hence, it is usually called dipeptide. The crystallized natural peptides mentioned above, however, are constituted by two amino-acid residues linked by one peptide bond. Thus these are named dipeptides too. Hereinafter the term dipeptide is used solely for referring to the latter systems. Dipeptide conformations are here characterized in terms of the ϕ and ψ torsional angles (Fig. 1), in the same way as these torsional angles are ordinarily used to describe the Ac-Ala-NHMe conformations.
According to calculated Ramachandran maps for Ac-Ala-NHMe, i.e., plots of its potential energy surface (PES) in terms of ϕ and ψ, the most stable conformation in aqueous solvent is the so-called polyproline II conformation,17–22 outcome that has been confirmed using Raman optical activity experiments.23 If Ac-Ala-NHMe is considered to be in vacuum, the preferred conformation is predicted to be an extended one,24,25 a geometry similar to that of protein residues in strands forming β-sheets.26 The β-sheets are common protein backbone conformations characterized by ϕ, ψ mean values around −130°, 144° in antiparallel β-sheets and −122°, 137° in parallel β-sheets, respectively.27 Thus, either in aqueous solvent or in vacuum, Ac-Ala-NHMe prefers to adopt conformations that lie in the upper left quarter of the Ramachandran map (Fig. 3).28 Molecules containing glycine (Gly) instead of Ala behave similarly.15,29 The conformations of class V–A dipeptides in crystals grown in aqueous solutions are similar to those of a β-strand segment (Fig. 3). These conformations are characterized by ϕ, ψ dihedral angles with values around −137° and 155°, respectively.30 However, conformations of class F–F dipeptides in their crystals are quite different from those of the class V–A (Fig. 3). Conformations of class F–F dipeptides lie in the upper right quarter of the Ramachandran map and are characterized by a ϕ angle around 60°, and a ψ angle with values between 120° and 170° (Fig. 3).30
![]() | ||
Fig. 3 Ramachandran plot and dipeptide conformations in the crystals. The + symbols at the upper left quarter correspond to the V–A class conformations in the crystals (side chains in trans orientation). The + symbols at the upper right quarter correspond to the F–F class conformations in the crystals (side chains in cis orientation). Solid lines account for the 95% of residue conformations in protein crystals, and dashed lines for 99%.31 Apart from the extended conformations, that usually form part of β-sheets and polyproline II motifs, regions in which residue conformations correspond to right- and left-handed helices, other major backbone motifs of proteins, are also indicated. |
Despite the conformational differences described above, both classes of dipeptides, the F–F and V–A, crystallize as a bundle of tubes in aqueous solutions.32,33 In such crystal structures, however, class V–A dipeptides accommodate their side chains in a trans arrangement, while the class F–F dipeptides do it in a cis conformation, with respect to the peptide plane (Fig. 2). Consequently, the interior of the formed tubes is hydrophobic and narrow in class V–A crystals, whereas hydrophilic and wide in the class F–F ones. These crystal morphologies seem to be dictated by inter- and intra-peptide interactions, as well as by interactions with the solvent.34–37 For example, if the FF crystal is grown in methanol or tetrahydrofuran instead of water, the crystal structure changes as well as the FF conformation in it, which becomes similar to an extended one, like the arrangement of amino-acid residues in strand-forming β-sheets.35,37
To determine the conformational preferences of dipeptide zwitterions in aqueous solvent, we use density functional theory (DFT) calculations and classical molecular dynamics (MD) simulations, combined with explicit and implicit solvent models. As mentioned above, proper description of solvent effects is challenging for theoretical methods. Particularly if the solute is a zwitterion, since it is a neutral molecule possessing both positive and negative electrical charges. Therefore, care should be taken in selecting the solvent model for investigating such kinds of systems. If an implicit solvent model is chosen, it should describe equally well both the cationic and anionic environments coexisting in the same molecule. The latter is not the case for commonly used implicit solvent models.38 If an explicit solvent model is used, it should be considered that a minimum number of water molecules is required to stabilize the zwitterionic state of a peptide.39,40 Inclusion of explicit solvent in DFT calculations has been mostly done for investigating microsolvation effects on the conformation of peptides and their relative stability at 0 K.27,41,42 Implicit solvent models, however, can be used to estimate relative stabilities at room temperature, as these models are usually parameterized to estimate free energies. Here we use both solvent models together with DFT calculations to determine the optimal conformation of the zwitterions in solvent and implicit solvent to determine relative stabilities at room temperature.
To choose a suitable implicit solvent model that adequately describes solvated dipeptide zwitterions, we first investigate the tautomerization reaction of Gly (Fig. 4). It is found that among the different implicit solvation models tested, the charge-asymmetric nonlocal determined local-electric (CANDLE) solvation model43 gives the best result for the free energy difference between the canonical and the zwitterionic state of Gly, as compared to the experimental value. Thus, geometries obtained for each dipeptide zwitterion, obtained combining MD simulations and DFT calculations, are solvated using CANDLE. In this manner, it is found that the most favorable conformation for dipeptide zwitterions in aqueous solution is quite similar to the conformation of the dipeptide in the corresponding crystal structure, and that the zwitterion–solvent interaction is fundamental in determining the preferred conformation of, primarily, the class F–F dipeptides. Furthermore, it is also found that the conventional continuum solvation model such as the integral equation formalism of the polarizable continuum model (PCM)44–46 predicts that only the V–A class dipeptides adopt conformations like these in their respective crystal structures. Lastly, it is argued that the obtained results help to understand the self-assembly mechanism of the investigated systems.
The PES maps are obtained varying the torsion angles ϕ and ψ from −180° to 180° in twelve steps of 30° each, yielding a total of 144 distinct geometries. The latter are subjected to geometry optimization using the RMM-DIIS algorithm62 (or the conjugate gradient algorithm upon convergence issues) with a convergence criterion of a maximum force of 1.2 × 10−2 eV Å−1. The atoms defining ϕ and ψ are fixed and the others fully optimized. The positions for the fixed atoms are chosen such that bond distances and angles associated with them equal those of the optimal geometry of the fully extended conformation in vacuum. The optimal orientation of the carboxyl capping group is only evaluated for the FF dipeptide varying the N–C–C–O dihedral angle. The –COOH orientation with the lowest energy for each (ϕ, ψ) pair in FF is used in the initial geometry for all the other dipeptides. The full PES maps are generated using a cubic spline interpolation.
The structures connected to the minima thus obtained are further relaxed without constraints. Afterwards, these optimized canonical dipeptides are turned into zwitterions, solvated with explicit water molecules by inserting them into a pre-equilibrated cubic box of water molecules with a padding of 10 Å. For this the VMD software is used.63 The solvent is fully relaxed around the dipeptides using a classical MD equilibration protocol. The water molecules are first energetically minimized by 20000 steps using the conjugate gradient and line search algorithm. The CHARMM36 force field64 with the CMAP correction is used to describe the dipeptides together with the TIP3P water model.65 The particle mesh Ewald algorithm66 is chosen to treat the electrostatic interactions under periodic boundary conditions, and the van der Waals forces are smoothly switched off between 9 and 10 Å. The minimization is followed by two 100 ps NVT and 500 ps NPT MD simulations. A 2 fs integration time step is used with the water covalent bonds restrained by the SETTLE algorithm.67 The r-RESPA algorithm68 is used to integrate the equations of motion, with non-bonded short-range forces computed every time step and a full electrostatic evaluation every five time-steps. The temperature is kept at 300 K using a Lanvegin thermostat with a damping coefficient of 2 ps−1. Pressure is kept at 1 atm by the Nose–Hoover Langevin piston method,69,70 with an oscillation period of 400 fs and a damping timescale of 300 fs. The dipeptide atoms are fixed at the initial coordinates. These calculations are performed using the NAMD v2.14 software.71
A single frame from the trajectory is selected for each dipeptide such that the backbone hydrogen bond (H-bond) acceptors and donors form an expected minimum number of hydrogen bonds; i.e., 3 for –NH3+ cap, 6 for –COO− cap, 2 for backbone –CO, and 1 for backbone –NH. Water molecules that do not interact with the dipeptide are removed. The microsolvated geometries are then geometry optimized with PBE-TS, first with the dipeptide atoms fixed to optimize the orientation of the water molecules, and next fully optimizing all the atomic coordinates. These calculations are carried out using VASP with the same parameters as those used in the PES calculations.
Finally, the optimized geometries for the microsolvated dipeptide zwitterions are solvated with the implicit solvent models CANDLE and PCM, and fully geometry optimized under each scheme. As is shown below, CANDLE is chosen owing to the adequate description of the Gly tautomerization reaction. PCM is also chosen for contrasting the CANDLE results against a commonly used solvation scheme. In this latter step microsolvation water molecules are not considered.
Starting from the geometries reported in ref. 77, we calculate the relative stability of zwitterionic Gly with respect to the canonical one, using the solvent models mentioned in the methodology section and two models for representing the solute: the non-assisted model (NAM), in which the solute is the bare Gly, and the assisted-model (AM) in which the solute is Gly plus a water molecule. Both, the canonical and zwitterionic Gly conformations here investigated are such that all heavy atoms lay in the same plane (Fig. 4). In the canonical glycine, the –OH group lays in the same plane as the heavy atoms, and it is located such that it can form a hydrogen bond with the –NH2 group (Fig. 4(a)). In the AM, the water molecule is located between the amine and carboxilate groups such that it forms hydrogen bonds with both groups either in the canonical and in the zwitterionic state (Fig. 4(b)). The solvated canonical and zwitterionic AM and NAM are fully optimized using each of the solvation schemes.
According to the results for ΔEtot listed in Table 1, all the solvation schemes predict the zwitterionic Gly as the energetically favorable tautomer in solution, either with NAM or AM. Except for PCM which only in combination with the AM forecasts the expected ordering. We compare ΔEtot directly against the experimental free energies as the implicit solvation schemes here used were parameterized to predict free solvation energies. Notably, ΔEtot calculated with CANDLE and the AM differ by less than 1 kcal mol−1 from the experimental tautomerization free energy, and CANDLE in combination with NAM predicts that only the zwitterionic state exists in aqueous solvent. Hence, to be able to calculate ΔEtot with CANDLE and the NAM we perform single-point calculations combining the PCM-optimized geometries with CANDLE. The latter is also done using the GLSSA13 and CDFT solvation schemes. Unexpectedly, all but one of the ΔEtot values calculated in this manner are in better agreement with the experimental value; e.g., we get a difference between experiment and theory of less than 0.1 kcal mol−1 solvating with CANDLE the PCM-optimized NAM. Nevertheless, the PCM-optimized NAM, solvated with the other solvation schemes, still underestimates the experimental value (in absolute value) by ∼2.5 kcal mol−1 (34%) or more. Conversely, all the solvation schemes improve the estimation of the tautomerization energy if the AM is used instead of the NAM. Except for CANDLE in combination with the PCM-optimized geometries. Yet, CANDLE together with the AM provides a better estimation of the tautomerization energy than the other solvation schemes combined with the AM. The appropriate performance of CANDLE is related to the fact that it is calibrated to describe solvation free energies at 298 K of negative and positive species with the same accuracy. For calibrating the CANDLE parameters a set of 346 free energies (240 of neutral molecules, 54 of cations and 55 of anions) obtained at 298 K was used, as reported in ref. 43, 78 and 79. Accordingly, for investigating the most stable conformation of the V–A and F–F dipeptides in aqueous solvent, we use PCM-optimized NAM geometries solvated with CANDLE, and further re-optimized under the CANDLE solvation scheme. The energies thus obtained are used for the conformational ordering described below. For comparison purposes we also report the energetic analysis obtained with PCM and the NAM, as PCM is a commonly used implicit solvation model.
![]() | ||
Fig. 5 Ramachandran PES of each the twelve investigated dipeptides in vacuum. Relative energies are referred to the lowest minimum at each PES. + symbols indicate the conformation observed in the corresponding crystals. The minimum position at each basin is labelled with t (in yellow) if the relative side chain orientation is trans, or with c (in red) if such orientation is cis. The labels ![]() ![]() ![]() |
The chart in Fig. 6 shows the relative energies for the conformations connected to the minimum at each basin. These relative energies are referred with respect to the lowest one. All the lowest energy conformations of the V–A class dipeptides in vacuum belong to t basins, except for the V–A dipeptide whose lowest energy conformation is located in the corresponding c′ basin (Fig. 6). On the contrary, the lowest energy conformations of the F–F class belong to c′ basins, except for the IL dipeptide whose lowest energy conformation is located in the corresponding t basin. Notably, the energy difference between the t and the c′ minima is around 1 to 3 kcal mol−1 for all these canonical dipeptides in vacuum. Nevertheless, none of these dipeptides crystallize adopting a conformation connected to a c′ basin. In Fig. 6 are also depicted the relative energies for the canonical dipeptides in vacuum investigated by Gonzalez-Diaz et al.30 (the + symbols). As it can be seen, some of these conformations are lower in energy than the ones found here, although by less than 1 kcal mol−1. The discrepancy is attributed to the orientation of the –COOH capping group. Here it was only sampled for FF, but the ideal orientation may vary depending on the dipeptide and conformation. That may account for the inconsistencies found for VA and IL.
![]() | ||
Fig. 6 Relative energies, ΔE in kcal mol−1, of the canonical dipeptide conformations in vacuum connected to the t (in yellow), c (in red), t′ (in pink) and c′ (in green) minima along the potential energy surfaces shown in Fig. 5. The reference energy corresponds to the lowest energy conformation found in this work for each system. Left panel class V–A dipeptides. Right panel class F–F dipeptides. For LF, the energy of t and c conformations is the same, therefore a yellow mark is not distinguishable for this dipeptide. The + symbols stand for the ΔE values corresponding to the canonical gas-phase dipeptides reported in ref. 30. |
In the second step for evaluating the solvation effects, the microsolvation shell of the optimized dipeptide zwitterions is stripped out. Then, these are solvated with implicit solvent. In this step we have also considered other dipeptide conformations belonging to the t and c basins, namely the dipeptide zwitterions obtained from the conformations marked with + symbols in Fig. 5. All these structures are fully optimized considering the PCM implicit solvent. Next, the lowest energy conformations found at each basin are chosen for the energy comparison presented in Fig. 7. According to these results, all class V–A PCM-solvated dipeptide zwitterions and some of the class F–F (IL and LL) prefer to adopt a conformation belonging to the t basin. The other class F–F PCM-solvated dipeptide zwitterions prefer to adopt conformations belonging to the c′ basin. Lastly, the PCM-optimized structures are further optimized considering them solvated with the CANDLE implicit solvent model. The energy comparison of the lowest energy conformations found at each basin is presented in Fig. 8. For class V–A, all dipeptide zwitterions prefer to adopt conformations belonging to the t basin, as it is found using PCM. However, for class F–F, the outcome is different; all dipeptide zwitterions prefer to adopt conformations belonging to the c basin. Differences in the results obtained with PCM and CANDLE are also noticeable in the energetic ordering of the conformations belonging to the other basins. While CANDLE predicts that the second most stable conformation is around 1 to 2 kcal mol−1 above the most stable ones, PCM predicts differences ranging from 1 to 7 kcal mol−1. Moreover, CANDLE predicts that the two lowest energy conformations are connected either to basin t or basin c, except for LF that prefer basin c′ for the second lowest energy conformation, whereas PCM predicts that the two lowest energy conformations belong to basin t, c′ or c. Yet, both implicit solvation schemes predict that for all dipeptide zwitterions but one (solvated with PCM), the least stable conformation is the connected to the t′ basin.
It is noteworthy that the lowest energy conformations for all the dipeptide zwitterions solvated with CANDLE, except one, are alike to those observed in their corresponding crystal structures, as is illustrated by the Ramachandran-like plot depicted in Fig. 9. The exception is the conformation for the AV dipeptide, which still belongs to the t basin but with a ϕ angle differing about 100° with respect to the experimental value. For that structure, however, an alternative conformation is found in the t basin that is solely ∼0.3 kcal mol−1 higher in energy than the lowest one (see Fig. 8). This second structure has a ϕ angle that differs by less than 20° from the ϕ value for the dipeptide zwitterion in the crystal structure (brown open dot in Fig. 9).
According to the results presented above, the CANDLE solvation scheme is the one that predicts accurately the tautomerization free energy for glycine. Hence, one may assume that it also correctly describes class V–A and F–F dipeptide zwitterions in aqueous solvent. The latter is supported by our findings about the most stable dipeptide conformations in aqueous solvent, i.e., the CANDLE-solvated dipeptide zwitterions prefer conformations similar to the ones in their corresponding crystal structures.
These results suggest that along the self-assembling path for these systems, the dipeptide zwitterions behave like semi-rigid objects in aqueous solvents, changing little their backbone conformation and side chains relative orientations, thus avoiding large conformational changes that could prevent them from crystallizing in a reasonable time. Therefore, it can be suggested that diffusion and desolvation, but not the conformational changes, should play a major role in the self-assembly process of such systems.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3cp05742a |
This journal is © the Owner Societies 2024 |