Open Access Article
Kasra Hanifi
and
Lars Konermann
*
Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5B7, Canada. E-mail: konerman@uwo.ca
First published on 13th May 2026
The transition of proteins from solution into the gas phase during native electrospray ionization (ESI) continues to be an enigmatic topic. Molecular dynamics (MD) simulations are an important avenue for gaining mechanistic insights, but most previous MD studies did not consider the role of proton transfer (PT) events. Here, we overcome this limitation by implementing a droplet mobile-proton MD (dMPMD) protocol for aqueous ammonium acetate solutions. The droplet radius (up to 8.5 nm) and protein size (up to 55 kDa) are the largest studied to date, bringing the simulations closer to experimental conditions than any earlier modeling studies. Consistent with experiments, our data show that PT among NH4+, acetate (Ac−), and protein sites produces NH3 and acetic acid (HAc). NH3 vaporizes rapidly, while HAc accumulation causes Ac−/HAc buffering at pH 4.76 ± 1. Droplet evaporation near the Rayleigh limit produces gaseous proteins with a net charge close to that of protein-sized water droplets, in line with the charged residue model (CRM). The CRM/Rayleigh framework applies regardless of amino acid sequence and residue basicity. For proteins with very few basic sites, charging involves carbonyl-trapped NH4+ adducts. Switching to gas-phase mobile-proton MD (gMPMD) simulations, we track proteins after their release from droplets, culminating in native-like conformers with numerous salt bridges. Our data provide unprecedented insights into the ESI process, and the maturation of nascent protein ions to metastable structures that are detectable in experiments.
The ESI process starts with solution in a high voltage capillary. Droplets emitted from a Taylor cone at the capillary outlet undergo repeated cycles of fission and evaporation. Throughout these events, the net number of charges on each droplet stays close to the Rayleigh limit21–23
| zR = 8π/e × (ε0γr3)1/2 | (1) |
Bulk solution experiments promote the stability of native proteins by using neutral aqueous buffers with background electrolytes (e.g., 50 mM phosphate with 100 mM NaCl at pH 7). Unfortunately, such non-volatile solutes tend to associate with proteins during ESI, culminating in heterogenous adduct clusters instead of “clean” [M + zH]z+ ions.21,43 Several ESI-compatible (volatile) buffers and additives have been tested,37,44–48 but most native ESI studies continue to rely on aqueous ammonium acetate (NH4Ac).7 This salt decomposes via proton transfer (PT). The resulting products, ammonia and acetic acid, have a high evaporation propensity (NH4+ + Ac− → NH3 + HAc).21,43 Protein-bound NH4+ can evaporate as NH3, leaving behind a proton. Similarly, protein-bound Ac− can evaporate as HAc after acquiring a proton.4,21,49 Thus, NH4Ac solutions favor the formation of [M + zH]z+ ions without undesired adducts.21
Despite its widespread use, NH4Ac is not an ideal additive for ESI. NH4Ac produces a physiologically relevant near-neutral environment in water,50,51 but NH4Ac solutions do not have buffering capacity at pH 7.43 Instead, they buffer ±1 pH units around the pKa values of HAc (4.76) and NH4+ (9.25). Thus, addition of acid or base to neutral NH4Ac solution can cause large pH changes that destabilize native proteins.38,40,43,46 Two factors contribute to acidification in positive ESI mode: (i) water oxidation in the emitter releases protons.52 (ii) For NH4Ac-containing droplets, PT generates NH3, a basic gas that evaporates rapidly. The other product (HAc) is a liquid acid that evaporates more slowly. Thus, HAc accumulation lowers the droplet pH until acetate buffering starts to take place, from the initial value of pH 7 down to 4.76 ± 1.43 How exactly this mildly acidic droplet milieu affects proteins during the ESI process remains to be explored.31,53
Molecular dynamics (MD) simulations of nanodroplets have become an important tool for probing ESI mechanisms.35,54–65 Instead of focusing on [M + zH]z+ ions that are most relevant in experiments, many earlier MD studies modeled analytes that were charged by metal adduction, e.g., [M + zNa]z+. This approach sidesteps a limitation of classical MD force fields, i.e., their inability to capture the formation and dissociation of covalent bonds (such as PT events involved in [M + zH]z+ formation). Over the past few years, mobile-proton MD (MPMD) methods have opened the door to simulations on ESI droplets43,66,67 and gaseous protein ions13,68,69 with PT.
We recently proposed an MPMD approach for modeling the behavior of NH4Ac in ESI droplets (referred to as dMPMD, where the “d” stands for “droplet”).43 For the first time, that work captured the PT-mediated formation of NH3 and HAc. However, our study43 only examined analyte-free ESI droplets. Our subsequent goal was to “extend these simulations to ESI droplets containing proteins […], to obtain a truly molecular understanding of native ESI”.43 The present study accomplishes this goal.
While the current work was in progress, we became aware of a paper by Cordes and Gallagher.67 Building on our recently developed methodology,43,66 ref. 67 also modeled protein ion formation from NH4Ac-containing aqueous droplets.67 By pursuing a strategy complementary to that of ref. 67, the current work provides novel mechanistic insights into native ESI via a two-step strategy. Initially, we performed dMPMD simulations for modeling the release of protein ions from ESI droplets into the gas phase. Subsequently, the fate of these nascent ions was explored using gas-phase MPMD simulations (gMPMD, where the “g” stands for “gas phase”), to explore their structural evolution in vacuo. Our approach allows computational studies on droplets and proteins in an unprecedented size range. For the first time, these simulations capture the complete sequence of ESI events, from proteins in droplets to metastable gas-phase structures that can be compared to IMS experiments, with inclusion of PT events along the entire pathway.
Droplets contained water, NH4+, Ac−, and HAc, with a protein initially centered. Parameters for non-protein solutes were taken from ref. 43. Molalities were used to ensure consistent NH4Ac concentrations in the non-protein part of the droplets. Ubiquitin droplets with an initial radius of r0 = 3 nm had a NH4Ac molality of 2.3 m, corresponding to a molarity of 1.6 M when considering the entire droplet volume. For 3 nm lysozyme droplets, we used 2.3 m (1.5 M). Molalities of 6 nm droplets were 1.15 m (1.00 M, 0.99 M, 0.96 M, and 0.93 M for ubiquitin, lysozyme, pepsin, and TTR). Ubiquitin runs with r0 = 8.5 nm employed 1.05 m (0.92 M). These NH4Ac concentrations are higher than those of bulk ESI solution, where [NH4Ac] is typically 10–100 mM.7 However, the droplets simulated here represent a regime where water evaporation has increased solute concentrations.21 The [NH4Ac] of 1.6 M for r0 = 3 nm matches the conditions of our earlier work.43 NH4Ac concentrations for larger droplets were chosen such that [NH4Ac] approached 1.6 M once the droplets had shrunk to 3 nm. Additional HAc were included to ensure an initial [Ac−]/[HAc] ratio consistent with the Henderson-Hasselbalch equation (eqn (2)) for pH 5.4.43
![]() | (2) |
Droplets underwent energy minimization, followed by 500 ps of equilibration during which the temperature was raised to 370 K. After maintaining 370 K for much of the run, the temperature was raised to 450 K for promoting the final evaporation steps. The 370 K/450 K periods were 50 ns/25 ns for r0 = 3 nm, 112.5 ns/37.5 ns (6 nm), and 137.5 ns/37.5 ns (8.5 nm). Although typical ESI interfaces employ droplet heating, the MD temperatures used here are likely higher than those in experiments, where droplet temperatures have been estimated to be ∼320 K.73 Unfortunately, droplet runs at T ≤ 320 K yield slow evaporation rates that preclude studies of ESI events.55 This is especially true for TIP4P/2005 water which evaporates more slowly than other models, keeping in mind that those other models have unrealistically low surface tension.74 Using TIP4P/2005 water at 370 K/450 K overcomes these problems, as shown previously.60,61,67,71 A possible concern with the temperature regime in our simulations is the occurrence of thermal protein unfolding. However, the data discussed below (as well as earlier experiments75 and simulations60,61,67,71) demonstrate that the short time scale of ESI precludes thermal unfolding under the conditions used here. Droplet volume was determined as Vdroplet = [mass of water + solutes + protein (kg)]/[1000 kg m−3], and droplet radii for eqn (1) were estimated using Vdroplet = 4/3πr3.
![]() | ||
| Fig. 1 Proton donors and acceptors considered for the dMPMD algorithm of this work. Arrows indicate possible PT reactions. Snapshots of all PT events are exemplified in Fig. S4. | ||
Donor/acceptor pairs were identified as PT candidates if they were H-bonded to one another, with a distance less than a specific cutoff (PT_dist_max, see below). Unless noted otherwise, PT was executed with a probability PT_prob = 0.1 for each candidate pair. NH4+/NH3 and Ac−/HAc conversion was performed as described43 (Fig. S4A, top panel). Protonation/deprotonation of side chains and termini was performed using GROMACS pdb2gmx. Rapid NH3 evaporation from the droplet justifies modeling PT from NH4+ to Ac− as an irreversible step.43 Protein sites were able to switch back and forth between protonation states, e.g. K+ deprotonation by Ac−, with subsequent K0 protonation by NH4+. For determining the maximum distance for which PT was permitted (PT_dist_max), we generated H-bond distance histograms for all possible donor/acceptor pairs (Fig. 1) using the strategy of Fig. S5. Histogram maxima were used as PT_dist_max for each pair (Fig. S6).
| EgMPMD = ECoul + EPA | (3) |
![]() | (4) |
![]() | (5) |
(i) Ac−/HAc buffering stabilizes the droplet pH at 4.76 ± 1 in native ESI experiments and in our simulations (eqn (2)).43 The sparsity of H3O+ under these conditions prompted us to model droplets without explicitly including H3O+ (see Section 2.2). This strategy eliminated the need to simulate H3O+ Grotthuss diffusion,66,67 thereby streamlining the code and facilitating its application to large systems. In contrast, ref. 67 modeled ESI droplets with a high H3O+ abundance that is difficult to reconcile with a Ac−/HAc-buffered milieu.43 It cannot be ruled out that in ESI experiments a small fraction of protein protonation events involve transiently formed H3O+. However, for any H3O+ that might be present, Ac− is a much more likely proton acceptor (e.g., ubiquitin only has ≤12 vacant protonation sites at pH 5.4 [Fig. S1], while a r0 = 6 nm droplet contains 544 Ac−). Thus, the direct participation of H3O+ in protein protonation will be minimal, providing additional justification for not explicitly considering H3O+ in our MD strategy.
(ii) Ref. 67 executed PT on the basis of ΔG° estimates. Their eqn (2) predicts that NH4+ + Ac− → NH3 + HAc in droplets has ΔG° = +26 kJ mol−1. This unfavorable value is at odds with the fact that NH3 and HAc readily form in evaporating NH4Ac solutions,43 illustrating that bulk thermodynamics may be unsuitable for predicting the behavior of non-equilibrium systems with liquid/vapor interfaces. Similarly, ΔG° values estimated from gas-phase basicities (GPBs) in ref. 67 did not consider electrostatic charge solvation effects which are key for determining PT favorability.13,53,78,80 The use of the empirical probability parameter PT_prob in our work sidesteps these issues (see Section 2.3).
Ubiquitin (8656 Da, pI 6.6) is a widely studied model protein.11,67,78,81 Fig. 2A illustrates trajectory snapshots in positive ion mode. Water evaporation caused droplet shrinkage. PT-generated NH3 and HAc evaporated as well. Charge loss from the evaporating droplets took place via NH4+ IEM ejection.26,27 These events culminated in 6+ gaseous ubiquitin. The observed droplet evaporation to dryness implies that this protein ion was formed via the CRM.21,24
![]() | ||
| Fig. 2 Ubiquitin native ESI (dMPMD) simulations in r0 = 3 nm aqueous NH4Ac droplets, using positive ESI mode. (A) Trajectory snapshots. NH4+, Ac−, NH3, and HAc are shown in spacefill representation, water is shown as lines that are barely discernible, the protein is colored green. All other panels show data from 20 repeat runs. (B) Number of H2O in droplet, (C) net droplet charge, (D) number of solutes in droplet, (E) cumulative number of PT events from NH4+ to Ac−, and PT events involving the protein, (F) droplet pH, (G) relative droplet charge. Panels F and G only show the time range down to ∼50 H2O, because the corresponding eqn (1) and (2) refer to aqueous systems. Also included in panel G is the average z/zR in this 20 ns window. (H) Protein adducts at the end of droplet evaporation. (I) Simulated protein ESI charge state distribution (including adducts). | ||
Fig. 2 includes data from 20 repeat runs, all of which followed a pattern similar to Fig. 2A. Most water evaporated within 25 ns (Fig. 2B), concomitant with a stepwise decrease in droplet charge via NH4+ IEM events (Fig. 2C). PT produced NH3 and HAc (Fig. 2D and E). The high volatility of NH3 translated in a low NH3 steady-state concentration (Fig. 2D). In contrast, the lower volatility of HAc caused accumulation of this acid, resulting in a pH decrease from 5.4 to ca. 4.5 (Fig. 2D, F and eqn (2)). The droplet charge remained around 80% of the Rayleigh limit (Fig. 2G and eqn (1)). Some proteins retained one or two solutes at the end of the 75 ns simulations, with binding of a single NH4+ as the most common outcome (Fig. 2H). The ESI charge state distribution, obtained by tallying all charges at the end of the 20 runs (side chains, termini, and adducts) ranged from 5+ to 7+, consistent with experiments67,78 (Fig. 2I).
PT of donor-acceptor candidate pairs was executed with a probability PT_prob (see Section 2.3). This probability represents an empirical parameter rather than a physically derived property. The value of PT_prob = 0.1 used for Fig. 2 matches the settings employed earlier for protein-free droplets.43 To test the effects of PT_prob, we performed ubiquitin simulations under different PT_prob regimes (Fig. S7). The different settings affected how the droplet composition changed over time. However, all simulations ultimately produced similar protein charge states (Fig. S7, bottom), covering the range of 5+ to 7+ that matches experimental spectra.67,78 In other words, the simulation outcomes were largely insensitive to the choice of PT_prob, prompting us to retain PT_prob = 0.1 for all data discussed below.
Simulations were also conducted on lysozyme (14
305 Da, pI 10.5), another common model protein.67,78 ESI events for lysozyme resembled those for ubiquitin (Fig. S8). dMPMD-generated lysozyme charge states ranged from 7+ to 9+, matching experimental data.67,78
Native ESI experiments in negative ion mode tend to produce slightly lower charge states |z| than in positive ion mode.67,78,82,83 This effect was reproduced in our simulations, where average z values were 5.5- vs. 6.1+ for ubiquitin, and 7.0- vs. 8.1+ for lysozyme (Fig. 2 and Fig. S8–S10). Our simulations reveal that this disparity is rooted in the ejection behavior of excess charge carriers, NH4+ vs. Ac−, illustrated for ubiquitin in Fig. S11. Charge ejection always occurred with a solvation shell. For droplets that still contained abundant water, ejected NH4+ and Ac− were mainly solvated by H2O (Fig. S11C and J). As the droplets gradually dried out, solvation by HAc became more prevalent (Fig. S11G and N). The absolute system charge in positive and negative ion mode was indistinguishable at t ≈ 30 ns (Fig. S11B), implying that “wet” droplets do not hold the key for explaining the charge disparity between positive and negative ion mode. Instead, differences manifested themselves when the systems were almost anhydrous. Charge loss in positive ion mode became rare at t > 30 ns, with only a single ejection in 20 simulations (Fig. S11C–G). Negative ion mode exhibited a higher charge loss propensity at this late stage, with 14 ejections for t > 30 ns (Fig. S11J–N). Thus, our data show that toward the end of droplet evaporation, Ac−-mediated charge loss in negative ion mode is more facile than NH4+-mediated charge loss in positive ion mode. This trend was also seen for lysozyme (Fig. S12).
In summary, the dMPMD strategy developed here is suitable for modeling native protein ESI in positive and negative ion mode. All r0 = 3 nm droplet runs showed protein ion production via the CRM, accompanied by IEM charge ejection (solvated NH4+ or Ac− in positive or negative ion mode, respectively). Simulated protein charge states were in agreement with experiments.78 Except for Fig. S9–S12, all data in this work represent positive ion mode.
dMPMD simulations on r0 = 6 nm droplets were performed for ubiquitin and lysozyme, as well as pepsin which represents a large monomeric protein (35 kDa, pI ≈ 3, protein radius ≈ 3 nm). We also modeled droplets containing transthyretin (TTR), a 55 kDa noncovalent tetramer (pI ≈ 5, protein radius ≈3.5 nm). Results for TTR are illustrated in Fig. 3. Data for the other three proteins are summarized in Fig. S13–S15. All r0 = 6 nm trajectories showed droplet evaporation to dryness, following a sequence of events similar to that seen for the r0 = 3 nm droplets in Fig. 2 and Fig. S8–S10. Simulated charge state distributions for ubiquitin were virtually identical for 3 nm and 6 nm droplets (Fig. 2I and Fig. S13I). The same was true for lysozyme (Fig. S8I and Fig. S14I). Overall, our simulations reveal that the CRM is operative for proteins across a wide size range (we tested 8.5 kDa to 55 kDa), with different acid/base properties (we examined pI 3 to 10.5), and for droplet radii up to at least 6 nm. TTR represents the largest protein to date that has been subjected to ESI simulations (Fig. 3).
![]() | ||
| Fig. 3 TTR native ESI (dMPMD) simulations in r0 = 6 nm aqueous NH4Ac droplets, using positive ESI mode. (A) Representative trajectory snapshots. Panels B-I summarize data from 20 replicates. For additional details, see the caption of Fig. 2. | ||
Simulated ESI charge state distributions for ubiquitin, lysozyme, and pepsin from r0 = 6 nm droplets were almost identical to the results of native ESI-MS experiments (Fig. 4A–F). Published TTR spectra show some variability, reflecting the TTR propensity to undergo chemical modifications (Fig. S2), and the capability of the flexible N-termini to adopt different conformations (Fig. S3). Experimental TTR charge states in the literature cover the range of 13+ to 17+.17,84,85 Our experiments yielded 15+ to 17+ (Fig. 4G). The simulated TTR charge states of 14+ to 16+ are consistent with the experimentally observed range (Fig. 4H). The overall agreement between experimental spectra and simulation results in Fig. 4 is remarkable, attesting to the robustness of our dMPMD approach.
![]() | ||
| Fig. 4 Native ESI mass spectra compared to dMPMD-generated protein charge states for r0 = 6 nm droplets. (A) and (B) Ubiquitin, (C) and (D) lysozyme, (E) and (F) pepsin, and (G) and (H) TTR. Each simulated spectrum is based on 20 replicates. Dashed red lines and red numbers indicate ESI charge states predicted as 0.85 × zR, where zR is the Rayleigh charge of a protein-sized water droplet (eqn (1)), using radii determined from the protein mass assuming spherical shape and a 1 g cm−3 density.24 Shown along the right are representative protein structures after 150 ns of droplet evaporation. | ||
O groups (Fig. 5A).
![]() | ||
| Fig. 5 Carbonyl trapping of NH4+. (A) NH4+ binding to ubiquitin under the conditions of Fig. 2. Red labels indicate backbone (bold) and side chain (italicized) oxygens. (B) Pepsin 11+ produced in the dMPMD runs of Fig. S15, with 14 bound NH4+ ions. (C) and (D) Close-up of NH4+ binding to pepsin. (E) and (F) MS/MS data acquired with gentle activation (trap CV 2 V, panel E) and with moderate activation (15 V, panel F). Theoretical isotope distributions of [pepsin + (11-n)H + nNH4]11+ are shown for n = 0 (red), and n = 1, …, 5 (blue). Panel F also shows water loss, a common event during collisional activation of S/T-containing proteins (the S/T content of pepsin is 21%).87 | ||
Carbonyl-trapped NH4+ was particularly abundant in pepsin simulations, with occasional participation of other binding sites such as Ser-OH (Fig. 5B–D and Fig. S15H). Consistent with their high abundance in our MD data, NH4+ adducts were observed in native ESI mass spectra of pepsin (Fig. 5E and Fig. S16). Collisional activation caused a gradual shift to lower m/z (Fig. 5E and F). The process did not change the protein charge, implying NH3 loss rather than NH4+ ejection (Fig. S17). This interpretation is consistent with earlier experiments.4,49 The MS/MS data of Fig. 5 suggest that collisional heating triggers PT from residual NH4+ to acceptor sites on the protein (acceptor + NH4+ → acceptor-H+ + NH3).21 The scarcity of canonical basic acceptors in pepsin (a total of 5 sites, Fig. S1) implies the involvement of relatively unfavorable acceptors after collisional activation, such as side chains of Asn, Gln, Pro, and possibly even backbone carbonyls.86 These unfavorable proton acceptors likely also participate under denaturing conditions that produce [pepsin + zH]z+ ions without NH4+ adducts (Fig. S2D).
Native ESI mass spectra of large protein complexes tend to show peak broadening, an effect often attributed to “salt adduction”.7 Adduct-mediated broadening is exemplified by the TTR experimental data in Fig. 4G. Our dMPMD results show that residual HAc is a main contributor to peak broadening (Fig. 3H). Some HAc adducts were buried in internal cavities of TTR, illustrating why native ESI can produce adducted ions even in the absence of salt contaminants (Fig. 3A, t = 150 ns). Collisional activation during ion sampling promotes partial loss of these adduct (Fig. S18).
To access a regime that covers a wider range of behaviors, we performed dMPMD on ubiquitin-containing droplets with r0 = 8.5 nm (∼76
000 H2O). To our knowledge, these are the largest ESI droplets simulated using MD techniques (Fig. 6A). Three runs were performed, each of which required ∼1 month of wall clock time. Each run produced a gaseous protein via the CRM (Fig. S19), similar to the smaller droplets discussed earlier (Fig. 2 and Fig. S13). Protein charge states were 5+ (twice) and 7+, consistent with the experimentally observed range (Fig. 4).
The shrinking r0 = 8.5 nm droplets retained a spherical shape throughout most of their life cycle (Fig. S19A). NH4+ and Ac− were distributed uniformly in the aqueous phase (Fig. S20A–C). The NH3 radial distribution showed a local maximum at the droplet surface, representing PT products that had diffused to the droplet periphery and that were about to evaporate (Fig. S20D). Highly pronounced surface accumulation was seen for HAc (Fig. S20E). The high surface affinity reflects the less favorable water solvation of HAc compared to Ac−. HAc enrichment at the liquid/vapor interface is consistent with the ESI behavior of other partially nonpolar species.88,89 Most of the HAc molecules at the surface had their carboxylic acid group in contact with water, while the desolvated methyl group protruded into the vapor phase (Fig. S20H). The protein tended to reside relatively close to the surface as well, instead of retaining its initial centered position in the droplet (Fig. S20F).
Strikingly, each of the large droplets underwent a jet fission event, causing a sudden reduction in overall charge (Δz = 9 in Fig. 6B and C). Fission started with elongation of the initially round droplet (Fig. 6A, t = 0 in Fig. S19A), followed by formation of a Taylor cone-like protrusion that released a burst of progeny droplets from its tip (Fig. 6D and E). Several progenies were multiply charged; the largest one in Fig. 6E had a 3+ charge and contained ∼200 molecules. This progeny size is in contrast to the IEM ejection of singly charged ions (typically NH4+ with ∼10 H2O) that represents a competing charge loss mechanism (Fig. 2A and 3A and Fig. S19A). None of the progeny droplets contained a protein under the conditions studied here, although we cannot rule out the feasibility of such protein IEM or “intermediate regime ESI” events.25,31
The MD data in Fig. 6 resemble experimentally observed jet fission events.21–23 Only very few past MD studies ventured into a droplet size regime where such events become feasible.90,91 The unprecedented size used here for the first time produced trajectories that captured the entire range of droplet behaviors, i.e., jet fission, evaporation to dryness with numerous IEM events, culminating in a CRM-generated gaseous protein ion (Fig. 6 and Fig. S19, S20).
Simulation data supporting the CRM were previously obtained for much smaller droplets.65 A weakness of those earlier studies is the possibility that their outcomes might have been biased by the initial droplet size used. It is gratifying that trajectories consistent with the CRM were also obtained for the very large droplets simulated here, confirming that earlier mechanistic insights were not tainted by size-related artifacts.
Fig. 4 demonstrates that the CRM/Rayleigh framework applies to proteins that have vastly different sizes and basicities. An interesting scenario arises for proteins that possess very few basic sites. Specifically, pepsin contains only five canonical basic moieties (1nt, 2R, 1K, 1H, Fig. S1). From this amino acid composition, one might expect that pepsin should be unable to carry a charge greater than 5+. However, the CRM/Rayleigh framework predicts a native ESI charge of ∼11+, which is in agreement with experiments and simulations (Fig. 4E and F). Our data show that native ESI charging beyond 5+ in pepsin is mediated by carbonyl-trapped NH4+ ions (Fig. 5 and Fig. S15H), as previously suggested on the basis of experimental data.49
It has been proposed that residue basicity plays a key role for protein charging in native ESI.67 This assertion contradicts experiments that demonstrated major disparities between analyte basicity and ESI charge states.94 Instead, the CRM/Rayleigh framework implies that protein size is the single most important determinant of native ESI charge states.24,49,93,94 The expected size dependence was reproduced in our simulations for proteins representing a wide range of basicities, demonstrating that basicity is not a key determinant of native ESI protein charge states.24,49,93,94
Following their CRM production from r0 = 6 nm droplets (Section 3.3), ubiquitin, lysozyme, and TTR in the most abundant charge states were modeled by gMPMD. Fig. 7 illustrates data for TTR. The radius of gyration (Rg, reflecting the overall protein size) remained close to the initial solution value throughout the droplet stage of the simulations (0–150 ns). During the subsequent 200 ns in the gas phase, TTR underwent a slight compaction, culminating in a Rg 6.5% lower than for the solution structure (Fig. 7A). Similar compaction events have been reported for other complexes.100,101,103 Structural changes throughout the droplet and gas-phase simulations resulted in a ∼1 nm RMSD relative to the solution structure (Fig. 7B). Fig. 7C puts this RMSD into context, highlighting that most aspects of the solution architecture were retained in the gas phase, albeit with some distortion of β sheets and loops.
The final gaseous TTR structures exhibited an extensive salt bridge network, involving protonated basic sites and R-COO− moieties (Fig. 7D). The presence of these R-COO− moieties in [M + zH]z+ ions may seem counterintuitive, prompting many earlier MD studies to employ all-positive protonation patterns. However, it is now well established that intramolecular charge solvation and nearby positive charges can stabilize R-COO− groups (eqn (4)).53,105,106 The zwitterionic motifs formed in our gMPMD simulations are consistent with those data (Fig. 7D). Starting from the initial pH 5.4 droplet pattern (Fig. 7E), the zwitterionic character of TTR initially diminished until the droplet stage of the simulations had terminated (Fig. 7F). During the subsequent 200 ns in the gas phase, intramolecular PT generated additional positively charged sites and R-COO− groups. All of the latter were involved in salt bridges (Fig. 7G and Fig. S21).
gMPMD simulations on ubiquitin and lysozyme yielded data similar to those for TTR, with numerous salt bridges, and with retention of native-like structures (Fig. S21–S23). A few ubiquitin and lysozyme runs generated slightly unfolded conformers, indicating that the gas phase resilience of these low MW proteins is somewhat lower than that of TTR, likely reflecting their larger surface-to-volume ratio. The propensity of ubiquitin to produce a small fraction of semi-unfolded conformers has been noted previously in IMS experiments,11,78 although most ions retained compact structures (consistent with our simulations, Fig. S22).
In the case of pepsin, the scarcity of basic residues and the overabundance of acidic sites called for a slightly different simulation strategy. Consistent with experiments,49 dMPMD-generated ions showed complete protonation of acidic sites. Protonation of the five basic sites, along with six NH4+ adducts, resulted in a 11+ overall charge. Complete protonation of carboxylates implies the absence of salt bridges and mobile protons. Gas-phase simulations on pepsin were therefore performed with static protons. Pepsin retained a native-like but slightly collapsed gas-phase structure (5.9% lower Rg than the X-ray structure, Fig. S24).
The gMPMD structures of ubiquitin and lysozyme were in excellent agreement with IMS experiments (Fig. S25A–D). For pepsin and TTR, experimental collision cross sections were slightly lower than for the gMPMD structures, indicating that gas-phase collapse of these proteins did not fully go to completion on the 350 ns time scale of our simulations (Fig. S25E–H).
The dMPMD approach used here captures the experimental finding43 that PT converts NH4Ac solution into an Ac−/HAc buffer that stabilizes the droplet pH at 4.76 ± 1 (see, e.g., Fig. 2F and 3F). Our data demonstrate that this mildly acidic milieu allows the retention of folded protein conformations on the short (ns to μs)21 time scale of nanodroplet evaporation, without acid-induced denaturation.38 Even the subsequent loss of solvent only causes relatively small changes. The final gas-phase structures still resemble the initial solution conformations, albeit with a slight compaction (Fig. 7C and Fig. S22C, S23C, S24C).100,101,103 Thus, our findings bolster the view that solution-like structures can survive the transition from solution into the gas phase during native ESI,11,12,14–18
Our simulations support the CRM/Rayleigh framework, which offers a simple way to predict native ESI charge states.21,24,49,93 The model entails protein release into the gas phase upon droplet evaporation to dryness. Because the droplets stay close to the Rayleigh limit throughout their life cycle, the “residue” formed when the last solvent disappears has a net charge close to that of a protein-sized aqueous droplet. We demonstrate that this concept applies regardless of protein size and physicochemical parameters, for a pI range of at least 3.2 to 10.5. In addition, this model applies to native ESI in both positive and negative ion mode, provided that subtle stability differences of NH4+-charged vs. Ac−-charged systems during the final stages of desolvation are taken into account.
Our findings contrast the proposal67 that protein charging is governed by residue basicity. Support for our view that basicity is not critical for native ESI charging comes from experiments on proteins that behave according the CRM/Rayleigh framework, despite not possessing any titratable side chains (such that there is no PT between protein and solvent during ESI).93 Native ESI charging for such proteins is mediated by NH4+ adducts, similar to the pepsin behavior seen here. Pertinent insights also come from MD data on aqueous ESI droplets containing Na+ as the only type of charge carrier. [M + zNa]z+ ions formed from such droplets have charge states matching those of experimentally observed [M + zH]z+ ions.65 This consistent charging behavior is despite the fact that protonation and Na+ binding are very different types of events, the latter not being governed by basicity.
Earlier investigations highlighted possible alternatives to the CRM/Rayleigh framework, demonstrating that small proteins may undergo IEM ejection.25 However, protein IEM is viable only for compact conformers that carry a large charge (e.g. ubiquitin12+ at pH 3).25 For the conditions studied here, droplets were buffered at pH 4.76 ± 1, thereby precluding the accumulation of a large solution charge on ubiquitin. It is therefore unsurprising that the current simulations did not show protein IEM. Instead, we observed CRM behavior regardless of protein and droplet size. The persistence of compact structures under native ESI conditions also precludes ion formation via the chain ejection mechanism (CEM), a pathway pursued by unfolded proteins.65
The streamlined dMPMD algorithm used here allows ESI droplet simulations in an unprecedented size regime. By tracking the behavior of these large systems through the entire sequence of events (fission, evaporation, gas-phase ion formation), we were able to capture a much larger segment of the overall ESI process than previous modeling studies. In the future, we plan to extend this range even further, all the way to the ESI capillary and the formation of ESI droplets at the Taylor cone. Initial progress toward this goal has been achieved,107 albeit not for macromolecular analytes such as proteins.
The code is available on request from the authors.
| This journal is © the Owner Societies 2026 |