Open Access Article
Ibrahim A. Imam†a, Trevor Morey†ab, Yuexu Jiangac, Duolin Wangc, Dong Xuc and Qing Shao*a
aDepartment of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA. E-mail: qshao@uky.edu
bDepartment of Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
cDepartment of Electrical Engineering and Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
First published on 4th March 2026
Antifreeze peptides inhibit ice crystal growth and recrystallization, and are promising components of cryoprotective formulations for cell, tissue, and food preservation, as well as anti-icing surface coatings. However, the discovery of new antifreeze peptides has been hindered by their sequence diversity and the limited scalability of experimental screening. In this study, we identify novel antifreeze peptide candidates from a microbiome-derived sequence library using ensemble machine learning and molecular dynamics (MD) simulations. We developed an ensemble classifier composed of 10 adapter-tuned protein-language models and a random forest meta-learner. After training on a curated dataset of 73 766 sequences, we applied this ensemble to 56 008 amino acid sequences from an Arctic microbiome library to identify antifreeze peptide candidates. Structural prediction yields a diverse range of conformations for six selected candidates, including α-helices, coils, and combinations of both. To evaluate their functional relevance, atomistic MD simulations were conducted to assess conformational stability and solvent interactions under freezing conditions. One candidate shows persistent helicity, surface amphipathicity, and an organized hydration pattern consistent with structural signatures reported for ice-binding helices. These findings expand the known landscape of antifreeze peptides and highlight a scalable strategy for discovering functional peptides from complex biological sources.
The functional potency of AFPs arises from their structural diversity and distinct mechanisms of interaction with ice.19–22 These proteins and peptides are broadly classified into four types based on their structures: (I) short, alanine-rich α-helices defined by a canonical Thr-Ala-X repeating motif that organizes their ice-binding surface (IBS);23,24 (II) globular, cysteine-rich with complex, non-repetitive IBS stabilized by multiple disulfide bridges;25 (III) globular with a compact β-sandwich fold to project a flat and highly repetitive IBS;26,27 and (IV) glutamate and glutamine-rich, composed of multiple α-helices, and featuring a structurally irregular ice-binding surface compared to other types.28,29 Despite this structural variance, their function converges on a shared mechanistic principle: irreversible adsorption to specific ice crystal planes via well-defined, flat, and often amphipathic IBS.30 This adsorption locally increases the curvature of the ice-water interface via the Kelvin effect, inhibiting further crystal growth and recrystallization.31 While this structural arrangement determines antifreeze activity, reliance on specific motif patterns such as regularly spaced threonine/alanine arrays on flat ice-binding surfaces constrains the range of biophysical properties (e.g., solubility, thermal stability, ice-binding kinetics) available for material design.23
Efforts to discover AFPs and antifreeze peptides have focused on bioprospecting organisms from extreme environments. Indeed, this approach has successfully identified numerous antifreeze peptides through phenotypic screening and sequence homology searches.32,33 However, its focus is on peptides sharing homology with canonical ice-binding motifs. Exploring uncharacterized sequence space through computational prediction, therefore, offers a powerful complementary strategy to discover peptides with potent ice-inhibiting properties that do not conform to these established structural classes. Recent machine-learning predictors have accelerated AFP discovery by learning sequence-derived representations beyond motif homology.34,35 Such novel sequences could possess superior biophysical profiles, thereby overcoming material design challenges such as limited solubility, aggregation propensity, and suboptimal thermal stability. Realizing this potential might require moving beyond homology-based methods toward models that identify function independent of sequence similarity.
Protein language models (PLMs) have transformed the process of discovering proteins and peptides with desired properties by learning deep contextual representations directly from sequence data.36–38 These models vary in their design and training objectives. ProtBERT and ProtT5, members of the ProtTrans39 family, exemplify this diversity. ProtBERT is optimized for masked language modelling, and ProtT5 employs an encoder–decoder framework for generative tasks. Beyond ProtTrans, other architectures explore distinct directions. ProteinBERT40 integrates functional annotations into its training to enhance biological interpretability, while ProGen41 adopts a generative GPT-style architecture to design novel proteins. The Evolutionary Scale Modeling (ESM)42 family, in contrast, was trained on massive, diverse sequence databases specifically to capture deep evolutionary context and co-variation signals, which are correlated with protein function. The capacity to learn functional representations directly from sequence, without reliance on homology, paves the way for powerful pipelines that screen vast, unannotated metagenomic databases.
ESM243 has shown its extraordinary ability to use rotary positional embeddings to efficiently capture long-range amino acid dependencies.44–46 The 650-million parameter variant (ESM2-650M) provides an optimal balance between predictive accuracy and computational throughput for large-scale screening.47 However, the ESM2 model presents a significant challenge. Full fine-tuning is not only computationally costly but also risks model degradation by overwriting the powerful, generalizable representations learned during pretraining. To strike a balance between specialization and efficiency, we employed parameter-efficient adapter-based fine-tuning.48 The strategy preserves the model's core knowledge by freezing the 650M-parameter backbone and training only a small, specialized set of adapter weights. This enables the development of a downstream binary classifier to identify functional peptides.
Ensemble learning is a method that aggregates predictions from multiple independent models to improve overall performance.49 An ensemble of models can attenuate the bias of individual models. Common approaches to developing an ensemble of models include voting, where predictions are aggregated via majority voting or probability averaging; bagging, which trains models on different data subsets to reduce variance; boosting, which sequentially corrects errors from previous models; and stacking, which uses a meta-learner to optimally weight individual predictions. These strategies have been applied to the prediction of functional peptides. Studies have reported improved accuracy using ensembles for antimicrobial peptides,50 anticancer peptides,51 antibiofouling peptides,47 and neuropeptides.52 For antifreeze peptide discovery, where experimentally validated examples are scarce and structural diversity is high, ensemble methods provide a practical way to maximize prediction reliability.
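As a minimal illustration of the two voting rules named above, the sketch below aggregates per-model probabilities by averaging (soft voting) and by strict majority (hard voting). The function names and the 0.5 threshold are assumptions made for this example; stacking, which the present work uses, would replace these fixed rules with a trained meta-learner and is not shown here.

```python
def soft_vote(probabilities):
    """Average the per-model probabilities (probability averaging)."""
    return sum(probabilities) / len(probabilities)


def hard_vote(probabilities, threshold=0.5):
    """Return True when a strict majority of models predict the positive class.

    The threshold is an assumed default; a tie counts as a negative call.
    """
    positive_votes = sum(p >= threshold for p in probabilities)
    return positive_votes * 2 > len(probabilities)


# Three hypothetical base-model outputs for one sequence.
example = [0.9, 0.8, 0.4]
print(soft_vote(example), hard_vote(example))
```

With an even number of base models, the tie-handling rule matters, which is one practical reason to prefer a learned meta-learner over a fixed vote.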
All-atom molecular dynamics (MD) simulation is a powerful method for further analysing the potential of deep learning predicted candidates and characterizing antifreeze protein (AFP) and peptide mechanisms at an atomic resolution that static structural methods cannot access.53–55 Pioneering studies used MD to elucidate how Type I AFPs pre-order water on their IBS to template ice recognition,56–58 to quantify the thermodynamic contributions of individual residues to adsorption,59,60 and to demonstrate that antifreeze glycoproteins bind via dynamic hydrogen-bonding networks rather than rigid complementarity.61,62 Subsequent work has used MD to resolve the hydration dynamics governing binding affinity,63,64 to define the molecular basis of hyperactive versus moderate activity,65 to identify conformational transitions at the ice interface,66 and to validate the mechanisms of computationally designed peptides.67 These studies have established MD as a versatile computational framework for the rational design of AFPs and antifreeze peptides, enabling assessment of structural descriptors like stability, amphipathic profile, and solvent interaction required for ice-binding function.
In this study, we integrated PLM adaptation, ensemble learning and MD simulation to identify and characterize novel antifreeze peptide candidates from a library of amino acid sequences identified in the Arctic microbiome. We first compiled a database of amino acid sequences derived from the Arctic microbiome. We then applied an ensemble of adapter-tuned ESM2-650M classifiers to identify high-confidence candidates with physicochemical properties distinct from those of the known AFP motifs. These candidates were then subjected to a two-step physics-based in silico procedure for further analysis of their chemicophysical properties. In the first stage, we predict their possible 3D structures using AlphaFold3. In the second stage, we perform all-atom MD simulations to analyse their conformational stability, quantify their solvent exposure and amphipathic character, and unveil their hydrogen bonding patterns. This computational analysis provides a comprehensive structural and dynamic filter, allowing us to prioritize the candidates and identify a prime candidate for experimental validation. The rest of the paper is organized as follows. Section 2 will focus on the data construction, deep learning methods, and MD simulation details. Section 3 will focus on the results and discussion. Section 4 will present a summary.
To mitigate bias arising from unequal sequence lengths between AFP and non-AFP proteins, all sequences longer than 200 residues were excluded. This cutoff was chosen based on the observed distribution of AFPs to avoid introducing sequence-length bias into the classifier. In total, 73 766 sequences were used for model development. A balanced subset of 1100 AFP and 1100 non-AFP sequences was reserved for training and evaluating the random forest-based ensemble, with a stratified split into training and testing sets at a 4 : 1 ratio.
For adaptation of the protein language models (PLMs), the remaining 6506 AFP and 65 060 non-AFP sequences were partitioned into ten balanced groups. Non-AFP sequences were randomly divided into groups of 6506 sequences each, and the complete AFP set was added to every group to ensure consistent representation and reduce class-imbalance bias. Each group was subsequently clustered using MMseqs2, and the resulting clusters were split into training, validation, and testing subsets in a 70 : 15 : 15 ratio. Sequence similarity across subsets was restricted to below 30% (global identity), thereby reducing redundancy and ensuring robust evaluation of model generalization. A detailed summary of dataset composition and splitting for all groups is provided in Table S1.
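The grouping step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' script: the function name, the seed, and the dictionary layout are assumptions, and the subsequent MMseqs2 clustering and 70 : 15 : 15 cluster-level split are not reproduced.

```python
import random


def make_balanced_groups(afp, non_afp, n_groups=10, seed=0):
    """Split non-AFP sequences into equal random chunks and pair each with all AFPs.

    Any remainder left after integer division is dropped (an assumption;
    65 060 divides evenly into ten chunks of 6506).
    """
    rng = random.Random(seed)
    shuffled = list(non_afp)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_groups
    groups = []
    for g in range(n_groups):
        chunk = shuffled[g * size:(g + 1) * size]
        groups.append({"afp": list(afp), "non_afp": chunk})
    return groups


# Toy usage with placeholder identifiers instead of real sequences.
toy = make_balanced_groups(["a1", "a2"], ["n%d" % i for i in range(20)])
print(len(toy), len(toy[0]["afp"]), len(toy[0]["non_afp"]))
```

Because every group reuses the full AFP set, the ten classifiers see identical positives but disjoint negatives, so each group is class-balanced without discarding any non-AFP data.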
| Hyperparameters | Values |
|---|---|
| Max seq length | 220 |
| Adapter layer number | 16 |
| Epoch number | 50 |
| Batch size | 32 |
| Optimizer | Adam |
| Learning rate | 0.00001 |
The final representation for classification is obtained by averaging the hidden states across all residue positions in each sequence, yielding a fixed-length embedding. This pooled representation is then passed through a single-layer feed-forward classifier with sigmoid activation to produce a binary probability score for AFP. Training is conducted for a maximum of 50 epochs with early stopping based on validation loss, using the curated dataset described in Section 2.1 for monitoring.
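The pooling and classification head described above can be sketched in a few lines. This is a framework-free illustration under stated assumptions: `hidden_states` stands for the per-residue vectors of one sequence with padding already removed, and `weights` and `bias` are hypothetical trained parameters of the single-layer classifier.

```python
import math


def mean_pool(hidden_states):
    """Average per-residue vectors (length L, dimension D) into one D-vector."""
    n_residues = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(row[d] for row in hidden_states) / n_residues for d in range(dim)]


def afp_probability(hidden_states, weights, bias):
    """Single linear layer on the pooled embedding, followed by a sigmoid."""
    pooled = mean_pool(hidden_states)
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Two residues with 2-dimensional hidden states (toy numbers).
states = [[1.0, 0.0], [3.0, 2.0]]
print(mean_pool(states), afp_probability(states, [0.5, -1.0], 0.0))
```

Mean pooling makes the embedding length independent of sequence length, which is what allows one classifier to score peptides of different sizes.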
Each simulation box was arranged along the z-axis in a solvent-ice slab-solvent configuration, where the ice layer was sandwiched between two water phases to mimic a quasi-symmetric environment. Along the z-axis, the box length was chosen such that the water–ice–water slab was separated from its periodic images by a 3 nm thick vacuum buffer region, preventing interactions between replicated interfaces. The peptide was placed in the solvent 1.5 nm away from the ice surface to prevent steric interference while allowing spontaneous peptide diffusion toward the interface during the simulation run. Fig. 2 presents a representative snapshot of the simulation box, while Table S2 summarizes the dimensions and composition of all simulated systems.
Fig. 2 Representative snapshot of the simulation system. The peptide candidate (cyan) is 1.5 nm away from the ice-slab (licorice) and solvated with water molecules (red).
The basal-plane ice-water interface was chosen as a well-characterized, structurally simple model system for probing peptide structure and hydration under sub-zero conditions, rather than as a detailed model of the preferred binding plane of a particular AFP. The basal-plane interface serves as a generic, strongly structured hydration environment for stress-testing the conformational stability of the helical peptide candidates.
All peptide molecules were described using the Amber14SB75 all-atom force field, which reliably represents both backbone and side-chain conformations. The TIP4P/Ice76 water model was used to describe the ice and solvent molecules, as it is well-suited for simulations involving ice-water phase equilibria. To maintain overall charge neutrality, sodium (Na+) and chloride (Cl−) counterions were added to each system.
Non-bonded interactions were calculated as the sum of a Lennard-Jones 12-6 potential and a Coulombic potential, as shown in eqn (1):
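$$E_{ij} = 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \frac{e_i e_j}{4\pi\varepsilon_0 r_{ij}} \qquad (1)$$

where $r_{ij}$ is the distance between atoms $i$ and $j$, $\varepsilon_{ij}$ is the energetic parameter, $\sigma_{ij}$ is the geometric parameter, $e_i$ is the partial charge of atom $i$, and $\varepsilon_0$ is the vacuum permittivity.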
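For concreteness, the sum of a Lennard-Jones 12-6 term and a Coulomb term for one atom pair can be evaluated as in the sketch below. It assumes the standard textbook form of both terms and GROMACS-style units (nm, kJ mol⁻¹, elementary charges); the function name and the example parameters are illustrative and are not taken from the force field files used in this work.

```python
# Electric conversion factor 1/(4*pi*eps0) in kJ mol^-1 nm e^-2 (GROMACS-style units).
COULOMB_CONSTANT = 138.935458


def pair_energy(r, epsilon, sigma, q_i, q_j):
    """Non-bonded energy of one atom pair: Lennard-Jones 12-6 plus Coulomb.

    r and sigma in nm, epsilon in kJ/mol, charges in units of e.
    """
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_CONSTANT * q_i * q_j / r
    return lennard_jones + coulomb


# Neutral pair at the Lennard-Jones minimum, r = 2^(1/6) * sigma: energy is -epsilon.
print(pair_energy(2 ** (1 / 6) * 0.3, 0.5, 0.3, 0.0, 0.0))
```

In a production MD engine the Lennard-Jones term is truncated at a cutoff and the Coulomb term is handled by a long-range method such as PME, so this direct pairwise form is only the underlying definition.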
Bonded interactions, including bond, angle, and dihedral terms, were quantified according to the AMBER force field specifications. Visual Molecular Dynamics (VMD) software77 was used for comprehensive visual analysis.
To maintain a solid ice slab in contact with a supercooled aqueous region under sub-zero conditions, the system was coupled to two independent thermostats: the ice layer was maintained at 200 K, and the liquid water, ions, and peptide (non-ice) were maintained at 250 K. This dual-thermostat scheme keeps the ice slab well below its model melting point while maintaining the surrounding water in a supercooled state with sufficient mobility to sample peptide conformational dynamics at the ice-water interface. Temperature control in both groups employed the V-rescale thermostat79 with 0.1 ps time constant to ensure stable regulation. All covalent bonds involving hydrogen atoms were constrained using the LINCS algorithm.80 Lennard-Jones interactions were truncated at 1.2 nm, and long-range electrostatics were computed using the particle-mesh Ewald81 (PME) method.
Trajectory coordinates were saved every 100 ps, and all analyses were performed on the final 350 ns of each trajectory, which corresponds to the period after an initial equilibration phase where the system reached stability, as confirmed by convergence of temperature and potential energy.
Because the oxygen atoms in the ice slab were position-restrained and separate thermostats were applied to the ice and water-peptide regions, these simulations represent a quasi-static model of an ice-water interface. They are therefore used to probe peptide conformational behaviour and hydration under sub-zero interfacial conditions, rather than to provide quantitative information on ice growth or melting kinetics. Post-processing and quantitative analyses were performed using both GROMACS analysis utilities and custom Python scripts developed in-house to automate time averaging of structural and hydration descriptors and to assess convergence.
To further assess the specific influence of the ice-water interface, we also performed control simulations of all six peptides in bulk supercooled water at 250 K (no ice slab). These bulk-water systems contained only the peptide, water, and ions, and were run with simulation settings identical to those used for the ice–water interface trajectories.
The confusion matrix (Fig. 3) further illustrates this balance: 213 non-AFP sequences and 203 AFP sequences were correctly classified, with only 24 total misclassifications. The ensemble achieved an overall accuracy of 0.945 while maintaining balanced performance across both classes, making it a reliable screener for identifying novel antifreeze peptide candidates. The ensemble's balance is critical for downstream applications, where minimizing bias ensures that promising peptide candidates are not overlooked.
Fig. 3 Confusion matrix showing the performance of the ensemble model on AFP classification, comparing predicted and true labels for AFP and non-AFP sequences.
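The reported accuracy follows directly from the counts quoted for the confusion matrix, as the short check below shows. Only the total number of misclassifications is stated in the text, so the per-class split of the 24 errors (and hence precision and recall) is not reconstructed here; the variable names are illustrative.

```python
# Counts quoted in the text for the held-out ensemble test set.
correct_non_afp = 213
correct_afp = 203
misclassified = 24

total = correct_non_afp + correct_afp + misclassified
accuracy = (correct_non_afp + correct_afp) / total

print(total, round(accuracy, 3))
```

The total of 440 sequences is what a 4 : 1 split would leave for testing if the reserved balanced subset holds 2200 sequences in all.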
First, the ensemble was used to search 56 008 protein sequences derived from an Arctic microbiome82 library, selected because organisms in extreme cold environments are likely to possess proteomes enriched in antifreeze proteins. This search identified 11 286 candidate antifreeze sequences. Second, the candidate list was refined to retain only sequences with 30–50 amino acid residues, yielding 144 candidates. Third, these 144 sequences were subjected to statistically informed outlier-based filtering to identify candidates most distinct from the AFP training distribution.
To evaluate statistical distinctness, sequences were embedded in a five-dimensional physicochemical descriptor space using the E-descriptor system defined by Venkatarajan and Braun.83 The system derives five principal components (E1–E5) from multidimensional scaling of 237 physicochemical parameters, where E1 represents hydrophobicity, E2 captures molecular size and steric properties, E3 encodes α-helix propensity, E4 relates to partial specific volume, and E5 indicates β-strand propensity. The E-descriptor vectors for each amino acid are presented in Table S3. For every peptide, per-residue E-values were averaged to yield a single five-component vector describing its position in descriptor space.
E-Descriptor vectors were first computed for AFPs in the training dataset, from which the mean vector and covariance matrix in E-space were estimated. Model-predicted peptide candidates were then embedded in this same space, and the Mahalanobis distance from each candidate to the training distribution mean was calculated. To assess whether candidates represented statistically significant outliers, deviations were evaluated using an upper-tail chi-square test with 5 degrees of freedom, corresponding to the five-dimensional E1–E5 representation. Sequences with significant deviation (p ≤ 0.01) were flagged as physicochemically distinct from the characterized AFP training set.
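The outlier test described above can be sketched with the standard library alone. The sketch assumes, as is conventional, that the squared Mahalanobis distance is compared against the chi-square distribution; the helper names are hypothetical, the covariance matrix is assumed to be non-singular, and the closed-form tail probability is specific to 5 degrees of freedom.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_squared(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^-1 (x - mean)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    y = solve_linear(cov, diff)
    return sum(d * yi for d, yi in zip(diff, y))


def chi2_sf_5dof(d2):
    """Upper-tail probability of a chi-square variable with 5 degrees of freedom."""
    if d2 <= 0:
        return 1.0
    return (math.erfc(math.sqrt(d2 / 2))
            + math.sqrt(2 * d2 / math.pi) * math.exp(-d2 / 2) * (1 + d2 / 3))


def is_outlier(x, mean, cov, alpha=0.01):
    """Flag a five-component descriptor vector whose tail probability is <= alpha."""
    return chi2_sf_5dof(mahalanobis_squared(x, mean, cov)) <= alpha
```

Here `x` would be a candidate's averaged E1–E5 vector, and `mean` and `cov` the statistics estimated from the AFP training set. At the 0.01 level with 5 degrees of freedom, the cutoff corresponds to a squared distance of about 15.09.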
By focusing on statistically significant outliers in this five-dimensional E-descriptor space, we prioritized candidates that occupy physicochemical regions underrepresented among characterized AFPs. This strategy is designed to probe the periphery of known AFP chemical space and potentially reveal novel sequence solutions to antifreeze function, while recognizing that such outliers may also include false positives that require further computational and experimental scrutiny.
Table 2 presents the six (6) peptide sequences that satisfied this statistical criterion. For visualization, principal component analysis (PCA) was performed on the E-embeddings. The first three components captured more than 75% of the variance, and the selected sequences occupied marginal regions relative to the training cloud. Fig. S1 shows the 3D PCA projection of the AFP training set, the overlapped model-predicted peptides, and the six diverse outlier candidates, highlighting their peripheral positioning and potential as unconventional AFP leads.
| # | Amino acid sequence | Sequence length |
|---|---|---|
| 1 | MGNEQKQHHEPEREEHRQKPEEEKPQTWKHPDDGTELSERDQERPLKP | 48 |
| 2 | MVVIMTAVMITNVVMTVAMIGDMTMTAGVIAMVETRIIAGKLLQ | 44 |
| 3 | MLSLNGCTVLAIADVAVATTVKVGGAVVGTAVDVTKAGVGAVTGSAAK | 48 |
| 4 | MSKDRKTVKEVKKQPTVNVNKKQSAYQSGKGSASSDLGKK | 40 |
| 5 | MKKIYISAAVLLAVAITEGCIKQRVAESSAAKHISVRKAL | 40 |
| 6 | MLLIMALITSVQTTLLLMVLTTFLLTGLLLTALITWALKA | 40 |
Fig. 4 AlphaFold3 structural predictions for six candidate peptides identified by the ensemble model.
The biophysical properties and dynamic behaviour of these peptides were investigated using 400 ns all-atom MD simulations. Secondary structure evolution was analysed using the Dictionary of Secondary Structure of Proteins (DSSP) algorithm,86 which assigns structure based on backbone hydrogen bonding patterns (Fig. 5). This revealed dynamic diversity in conformational stability, particularly within the main helical class. Peptides #2 and #5 demonstrated notable conformational rigidity, maintaining end-to-end α-helical structure, with minimal fraying at the termini, throughout the simulation. In contrast, peptides #3 and #6, also predicted to be fully helical, exhibited different dynamic profiles. Peptide #3 retained its central helical core but displayed clear disorder at both its N- and C-termini. Peptide #6 initially adopted a stable helix comparable to peptides #2 and #5, after which it exhibited structural instability, marked by the partial unfolding of its C-terminal residues into a disordered coil after approximately 280 ns. The simulations also affirmed the non-canonical predictions. Peptide #4 maintained its stable central α-helix while its termini remained highly disordered, and peptide #1 behaved as a dynamic disordered system, sampling mostly helical segments and irregular secondary structures. The DSSP dynamic profiles reveal a hierarchy of structural stability within the helical class. The rigid architectures of peptides #2 and #5 contrast sharply with the conditional stability of peptides #3 and #6 and the persistent disorder of peptide #1. Such diversity in conformational behaviour may reflect distinct stability and amphipathic presentation relevant for downstream validation. To illustrate the structural diversity among predicted peptides, representative MD snapshots at multiple time points are provided for two contrasting candidates, peptide #1 and peptide #5 (Fig. S2), showing both peptide conformational evolution and the development of interfacial water structuring during equilibration.
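A helicity time series of the kind summarized above can be derived from per-frame DSSP strings, as in the hedged sketch below. The input format (one string of one-letter DSSP codes per frame) and the function name are assumptions for illustration, and only the code `H` (α-helix) is counted by default.

```python
def helix_fraction_per_frame(dssp_frames, helix_codes=("H",)):
    """Fraction of residues assigned a helical DSSP code in each trajectory frame."""
    return [
        sum(code in helix_codes for code in frame) / len(frame)
        for frame in dssp_frames
    ]


# Hypothetical 6-residue peptide whose C-terminal half unfolds between two frames.
frames = ["HHHHCC", "HHCCCC"]
print(helix_fraction_per_frame(frames))
```

Averaging this series over the analysis window gives a single helicity value per peptide, while a sustained drop late in the trajectory would mark an unfolding event such as the one reported for peptide #6.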
Fig. 5 Secondary structure evolution of (A) peptide #1, (B) peptide #2, (C) peptide #3, (D) peptide #4, (E) peptide #5, and (F) peptide #6, during MD simulations.
To assess peptide proximity to the ice interface, we quantified the peptide–ice separation (interfacial gap) along the slab normal throughout the trajectories (Fig. S3). All peptides begin at an initial separation of approximately 1.4–1.5 nm and relax over the first 100–150 ns to separations of 1.9–2.2 nm, after which the distance fluctuates around this larger value (Fig. S3). This indicates that there was no persistent adsorption of any of the candidates on the restrained basal surface. This absence of spontaneous approach on the basal plane is consistent with expectations for α-helical candidates and is not interpreted here as evidence against antifreeze potential.
Fig. 6 Peptide solvent accessibility; (A) distribution of total SASA across peptides. (B) Average solvent exposure of hydrophobic and hydrophilic residues.
These SASA distributions reflect the peptides' conformational dynamics and helical stability. The large, broadly distributed SASA of peptide #1 is a consequence of its disordered, solvent-exposed conformational ensemble. At the opposite end of the stability spectrum, the minimal SASA values for peptides #2 and #5 imply the formation of compact, stable helical structures that minimize solvent contact through efficient residue packing. The intermediate values of peptides #3, #4, and #6 are also non-random; they reflect the partial disorder (flexible termini or C-terminal unfolding) observed in the DSSP analysis (Fig. 5, Section 3.3), which increases their average solvent exposure.
Decomposition of SASA into hydrophobic and hydrophilic components revealed two surface chemistry profiles among the peptides (Fig. 6B). Peptides #1 and #4, both structurally disordered, exposed predominantly hydrophilic surfaces. Peptide #1 showed a hydrophobic-to-hydrophilic exposed area of 14.6 : 49.1 nm², while peptide #4 showed 9.7 : 36.5 nm². The remaining peptides exposed greater hydrophobic surface area, with varying hydrophobic-to-hydrophilic ratios. Peptide #6 displayed the largest hydrophobic exposure (33.3 : 10.1 nm² hydrophobic to hydrophilic), followed by peptide #2 (29.6 : 13.1 nm²) and peptide #3 (24.7 : 19.8 nm²). These variations in surface composition reflect the interplay between sequence composition and conformational stability.
Peptide #5 diverged clearly from both classes, presenting a balanced hydrophobic-to-hydrophilic ratio (20.6 nm² vs. 21.3 nm²). For a stable and extended helical structure, this near-equal distribution indicates spatial segregation of hydrophobic and hydrophilic residues onto opposing helical faces. Such amphipathic organization is commonly observed in type I antifreeze proteins, where the balanced surface architecture facilitates oriented binding to ice crystal planes. The combination of sustained helical stability and balanced surface chemistry distinguishes peptide #5 from the other candidates and suggests structural compatibility with ice-binding function, though experimental validation remains necessary to confirm activity. Matched bulk-water control simulations at 250 K reproduce highly similar total SASA distributions and hydrophobic/hydrophilic SASA partitioning, preserving the same peptide ordering (Fig. S4A and B), indicating no systematic interface-induced shift in solvent exposure.
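The surface balance discussed above can be made explicit by converting the reported mean SASA components into hydrophobic fractions. The values below are the ones quoted in the text; the dictionary layout and function name are illustrative. A fraction near 0.5 indicates balanced exposure only; it does not by itself establish that the two surface types sit on opposite helical faces.

```python
# Mean exposed area in nm^2 as (hydrophobic, hydrophilic), keyed by peptide number.
sasa_nm2 = {
    1: (14.6, 49.1),
    2: (29.6, 13.1),
    3: (24.7, 19.8),
    4: (9.7, 36.5),
    5: (20.6, 21.3),
    6: (33.3, 10.1),
}


def hydrophobic_fraction(hydrophobic, hydrophilic):
    """Share of the exposed surface contributed by hydrophobic residues."""
    return hydrophobic / (hydrophobic + hydrophilic)


fractions = {p: hydrophobic_fraction(*areas) for p, areas in sasa_nm2.items()}
most_balanced = min(fractions, key=lambda p: abs(fractions[p] - 0.5))

for peptide, value in sorted(fractions.items()):
    print(peptide, round(value, 3))
print("most balanced:", most_balanced)
```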
As shown in Fig. 7, peptide #1 exhibited the highest solvent interaction, forming approximately 137 side-chain and 76 backbone hydrogen bonds. This reflects the extended, disordered structure earlier attributed to peptide #1, which lacks internal hydrogen bonding and exposes most polar and non-polar groups. Peptide #4 formed around 73 side-chain and 51 backbone hydrogen bonds with surrounding solvent molecules, indicating a flexible conformation with partial internal organization.
Fig. 7 Average number of hydrogen bonds formed by backbone atoms (C-α, C, N, O) with water molecules compared to those formed by side-chain atoms.
In contrast, peptides #2, #3, #5, and #6 showed lower backbone hydration, ranging from 34 to 49 hydrogen bonds. This reduction reflects the formation of intramolecular backbone hydrogen bonds in helical or ordered structures, which limits backbone accessibility to solvent. Peptide #6 showed the lowest side-chain hydration at 18 hydrogen bonds, indicating a predominant hydrophobic surface character with minimal polar–solvent interaction. Peptide #3 maintained higher side-chain hydration at around 33 hydrogen bonds, likely reflecting its disordered termini or exposed loop regions identified in the DSSP analysis (Section 3.3). Peptide #2 displayed 32 sidechain and 37 backbone hydrogen bonds. This suggests moderate surface polarity within a stable helical framework.
Peptide #5 displayed a highly distinct hydration profile, forming approximately 44 side-chain and 34 backbone hydrogen bonds. This moderate asymmetry highlights a critical structural equilibrium: the peptide core remains effectively shielded, while polar residues maintain interactions with the solvent. This hydration pattern serves as a key indicator of surface polarity, suggesting an amphiphilic nature consistent with a structural pattern that could, in principle, support ice interaction on appropriate crystallographic planes.
While these simulations do not directly measure ice-binding activity, the structural properties of peptide #5, namely a stable helical structure, balanced solvent exposure, and side-chain-dominated hydration consistent with organized polar surface regions, align with the biophysical features characteristic of functional Type I antifreeze proteins. This convergence of structural and hydration signatures supports peptide #5 as a high-priority candidate for experimental ice-binding testing.
This study demonstrates the utility of combining protein language models with MD simulations for functional peptide discovery. The presented approach enables rapid screening of large unlabelled peptide datasets and provides atomic-level insights into structure–function relationships. Future extensions using enhanced sampling or explicit simulations of ice nucleation resistance may refine our understanding of antifreeze mechanisms and guide rational peptide optimization for cryopreservation and materials science applications. Because α-helical AFPs often bind non-basal planes, future work will also examine prism and pyramidal interfaces and may compute adsorption free energies to assess plane specificity. Experimental studies will be crucial for validating the predicted peptide candidates and for refining the computational design rules that emerge from this work.
Supplementary information: Table S1 Summary of AFP and non-AFP sequence counts; Table S2 The details of initial system setup; Table S3 The five-component E-descriptor vectors of each amino acid proposed by Venkatarajan & Braun; Fig. S1 The 3D PCA visualization of AFP predicted peptides overlapped with AFP from the training data; Fig. S2 Representative VMD snapshots of the ice–water slab simulation at 0, 50, 100, 200, and 400 ns for (A) peptide #1, (B) peptide #5. The restrained basal-plane ice slab maintains an ordered lattice, while the adjacent liquid water exhibits interfacial ordering; Fig. S3 Peptide–ice interfacial gap (surface-to-surface separation along the slab normal, z) as a function of simulation time for the six peptide candidates in the basal-plane ice–water slab system (peptides initially placed ∼1.5 nm from the ice surface); Fig. S4 Comparison of key structural/solvation descriptors between peptide–ice–water slab simulations and matched bulk-water controls. Left panels: basal-plane ice–water slab; right panels: bulk water (no ice), both at 250 K for the liquid/peptide region (ice lattice restrained at 200 K in slab runs). Metrics computed over the production window (t ≥ 50 ns). (A) Distributions of total solvent-accessible surface area (SASA). (B) Mean hydrophobic and hydrophilic SASA components. (C) Mean peptide–water hydrogen-bond counts partitioned into backbone–water and side-chain–water contributions. See DOI: https://doi.org/10.1039/d5tb02758f.
Footnote
† The two authors contributed equally to this paper.
This journal is © The Royal Society of Chemistry 2026