Open Access Article
Michael A. Herrera*a, Grace K. King†b, Zoe Ozols‡b, Gioele A. Tiburtini‡c, Nicoletta Schiavo‡c, Francesca Spyrakis*c, Louise K. Charkoudian*b and Dominic J. Campopiano*a
aSchool of Chemistry, The University of Edinburgh, Edinburgh EH9 3FJ, UK
bDepartment of Chemistry, Haverford College, Haverford, Pennsylvania 19041, USA
cDepartment of Drug Sciences and Technology, University of Turin, Turin 10125, Italy
First published on 24th April 2026
Acyl carrier proteins (ACPs) are dynamic, structurally conserved α-helical proteins central to many primary and secondary metabolic processes. Whilst prior engineering efforts have focused on strategic mutagenesis and “helix swaps”, much of the ACP sequence design space remains underexplored. Here, we create diverse variants of the archetypal ACP subclass – AcpP – using a bespoke sequence-generating algorithm (ALGO-CP), which utilises a combined evolutionary and physicochemical design approach. Using ALGO-CP, we generated two soluble candidates – ALGO-055 and ALGO-059 – that can undergo full post-translational modification from apo → holo → acyl forms in vitro, using recombinant modifying enzymes. Building on these successful designs, we further adapted ALGO-CP to produce several chimeras, two of which – chALGO-012 and chALGO-024 – also exhibit full modifiability. We explore the structural plasticity of our ALGO variants via robust molecular dynamics simulations, and we further reveal by circular dichroism spectroscopy that ALGO-055 and ALGO-059 lack the canonical α-helical fold of an ACP, whilst remaining soluble and readily modifiable. Upon acylation of ALGO-055 and ALGO-059, we observe a marked increase in helicity indicative of protein restructuring. Of note, both ALGO-055 and ALGO-059 harbour several rare amino acid variations across their sequences, whilst preserving many important acidic “hotspots” involved in key protein–protein interactions. By testing the limits of the AcpP design space, our findings suggest that some key aspects of ACP behaviour (specifically post-translational modification) can be retained independently of the canonical structure. This work establishes a foundation for probing ACP sequence diversity through a hybrid computational-experimental approach. ALGO-CP is available under the AGPL-3.0 licence: https://github.com/MAHerrera-94/ALGO_CP.
ACPs are remarkably diverse at the sequence level, with many homologues sharing as little as ∼20–30% identity. Despite this, ACPs generally conserve a compact helical fold, comprising four antiparallel helices (αI–IV) connected by flexible loops (Fig. 1A). For an ACP to be functionally competent, it must undergo post-translational modification (PTM) to its holo-form via the attachment of a 4′-phosphopantetheinyl (4′-PP) prosthetic group, derived from coenzyme A (CoASH, Fig. 1B). This PTM is catalysed by a 4′-phosphopantetheinyl transferase (4′-PPTase), which fixes the 4′-PP group to an invariant serine residue situated before helix αII. In this holo-form, substrates are covalently tethered to the terminal thiol group of the 4′-PP; bound lipophilic substrates can be sequestered within the hydrophobic cavity of the ACP, thereby protecting them from premature hydrolysis or undesired side-reactions (Fig. 1C).8,9 The surface of the ACP mediates interactions with partner enzymes, with solvent-exposed residues along helix αII (often called the “universal recognition helix”) playing a major role in protein–protein interactions (PPIs). The nature and precision of these PPIs are governed by the ACP surface charge distribution, hydropathy and topography, as evidenced by numerous structural and NMR analyses of ACP complexes.10–17 As with many ancient protein families, such as the short-chain dehydrogenases/reductases,18 ACPs are widely held to be robust to sequence variation, provided these key structural, functional and biophysical features are retained.
Efforts to engineer ACPs include rational mutagenesis19,20 and the creation of chimeras,21,22 in which helical segments are exchanged between natural homologues to introduce new functions and/or re-route biosynthetic pathways. Whilst such strategies have produced functional variants, they explore only a narrow corner of ACP design space, leaving many novel ACP designs – and functions – out of reach. A more rigorous exploration of this sequence space could deepen our understanding of ACP evolution and uncover new insights for engineering. Computational biology is well-placed to address this challenge, yet to the best of our knowledge, no such strategy has been applied to the design of new ACPs.
In this work, we developed a simple and tuneable sequence generation algorithm (ALGO-CP) designed to map the sequence space of ACPs. Rather than strictly relying on phylogenetic constraints, ALGO-CP takes a softened design approach that blends evolutionary conservation with local physicochemical information. In theory, this would permit the controlled sampling of amino acids that are rarely observed in natural homologues, thereby pushing the limits of natural sequence diversity. Using the prokaryotic housekeeping ACP (AcpP) subclass as a template, we applied ALGO-CP to generate diverse de novo ACP-like sequences, two of which – ALGO-055 and ALGO-059 – could undergo PTM to their holo- and acylated forms in vitro. By re-configuring ALGO-CP, we created two soluble chimeric variants of ALGO-055 and ALGO-059 – named chALGO-012 and chALGO-024 – that also undergo complete PTM. We further reveal by circular dichroism (CD) spectroscopy that ALGO-055 and ALGO-059 are modifiable despite lacking the canonical α-helical structure of an ACP, and we report evidence of structural recovery upon acylation. We propose that these unique properties arise from rare amino acid combinations not featured in natural AcpP homologues, alongside the preservation of key acidic regions necessary for productive PPIs.
000 copies per cell, Escherichia coli AcpP (EcAcpP) can interact with >25 other proteins to synthesise several life-critical lipids (fatty acids, phospholipids, lipid A, lipoic acid, biotin),10,24–26 mediate nutrient stress responses (via communication with SpoT)27,28 and promote chromosome organisation (via interaction with the chromosome partition protein MukB).29 The multi-functional “pocket knife” versatility of AcpP, together with the sheer abundance of sequence homologues, makes it an ideal starting template for de novo ACP design using ALGO-CP.
To seed the design process, we first retrieved 2558 AcpP homologues from UniRef90 using 17 validated AcpP sequences as queries (Table S3). This dataset was cleaned to remove duplicate and incomplete sequences, bringing the new total to 2167. These homologues were aligned using MAFFT,30 and the resulting MSA was used as input for ALGO-CP. We configured ALGO-CP to generate sets of 5000 unique sequences, with each iteration raising the weighting coefficient r from 0.00 to 1.00 in increments of 0.05. For each set, we computed the mean positional values of HKD, pI, and vdW volume, and compared these to the input MSA via pairwise correlation (Fig. S1). As anticipated, the sequences generated by ALGO-CP became more divergent from the input MSA with increasing r, reflecting a shift towards sequence novelty permitted by the softer computational strategy of route-AP (Fig. 2A).
To approximate the balance point between the two sub-algorithms, we averaged the pairwise property correlations within each sequence set and analysed their residual proximity to the grand property mean across all sequence sets (Fig. 2B, C). Negative residuals indicate relatively conservative sequences, whereas positive residuals indicate relatively divergent sequences. Based on this proximity analysis, we estimated that sequences generated using r = 0.60, which returned the smallest residual value (8.75 × 10−4), may best reflect this balance between conservative vs. explorative sequence design. A subset of 100 randomly-sampled, unique sequences generated at r = 0.60 were found to exhibit intermediate overall sequence properties relative to those sampled from pure AC- and AP-guided designs (Fig. 2D, E and Tables S4, S5, S7, S8, S10, S11). Furthermore, this subset could be accurately folded using AlphaFold331 (pLDDT = 90.1 ± 2.75), suggesting that they may adopt ACP-like folds if expressed in soluble form (Fig. 2F and Tables S6, S9, S12). However, since ALGO-CP does not explicitly predict solubility or biochemical function, we proceeded to characterise a sample of these sequences experimentally.
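The proximity analysis described above can be sketched in a few lines of Python. This is a minimal illustration rather than the ALGO-CP implementation: the function name `balance_point` is hypothetical, the residual is assumed to be the grand mean minus the set mean (chosen so that conservative sets give negative residuals, as stated in the text), and the correlation values are invented toy numbers, not data from this study.

```python
from statistics import mean

def balance_point(mean_corr_by_r):
    """Return the r whose mean property correlation lies closest to the grand mean.

    mean_corr_by_r maps each weighting coefficient r to the averaged pairwise
    property correlation of its sequence set against the input MSA.
    Assumed sign convention: residual = grand mean - set mean, so that highly
    correlated (conservative) sets give negative residuals.
    """
    grand_mean = mean(mean_corr_by_r.values())
    residuals = {r: grand_mean - corr for r, corr in mean_corr_by_r.items()}
    best_r = min(residuals, key=lambda r: abs(residuals[r]))
    return best_r, residuals

# Invented toy correlations that fall as r increases (illustration only).
toy = {0.0: 0.98, 0.2: 0.95, 0.4: 0.90, 0.6: 0.86, 0.8: 0.78, 1.0: 0.70}
best_r, residuals = balance_point(toy)
```

With these toy inputs the smallest absolute residual falls at the middle of the sweep, and the sign of each residual separates the conservative end of the sweep from the divergent end.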
From the previous subset of r = 0.60 sequences sampled for property analysis, seven candidates – ALGO-(013, 023, 040, 044, 055, 057, 059) – were randomly selected for in vitro characterisation (Table S13 and Fig. S2). The candidate sequences were codon-optimised for E. coli, synthesised and cloned into pET28a with a C-terminal His-tag for ease of purification. Following heat-shock transformation and antibiotic selection, overnight protein expression could be induced with 1 mM IPTG at low temperatures (18 °C). Of the seven candidates, ALGO-(023, 044, 057) were expressed as insoluble inclusion bodies, and were not taken further. The remaining candidates could be captured using Ni2+-affinity resin and further cleaned using a PES centrifugal filtration unit (Fig. 3B and Fig. S5–S8).
Following dialysis, the identity and PTM status of each soluble ALGO candidate was confirmed by LC/ESI-MS (Table S14 and Fig. S18–20, S22). Notably, both ALGO-055 and ALGO-059 were purified partly in holo- form, as indicated by a molecular weight (MW) increase of 340 Da consistent with the covalent attachment of 4′-PP. Encouragingly, this implied some in vivo interaction with the endogenous EcAcpS during protein expression. Both candidates could be readily converted to their respective holo forms in vitro using purified EcAcpS, in the presence of Mg2+, CoASH and dithiothreitol (DTT, see Fig. 3C, D and Fig. S21, S23). Interestingly, both ALGO-055 and ALGO-059 could not be fully converted by EcAcpS when immobilised on Ni2+-affinity resin, which is an established method for the reliable and complete PTM of His-tagged ACPs.40 This observation suggests that immobilisation could impose a significant conformational constraint on these ALGO sequences, hampering productive engagement with EcAcpS (Fig. S24, S25). Furthermore, only partial conversion was achieved using BsSfp, both in vitro (using purified recombinant enzyme) and in vivo (using the sfp knock-in strain E. coli BAP1),41 suggesting that these sequences struggle to establish productive PPIs with this 4′-PPTase (Fig. S26, S27). Conversely, neither ALGO-013 nor ALGO-040 could undergo apo → holo PTM at all, and these candidates were not examined further (Fig. S28, S29).
Using our standout candidates ALGO-055 and ALGO-059, we next attempted the covalent attachment of a lauroyl (C12) acyl chain in vitro using VhAasS. The acylation of both holo-ALGO-055 and holo-ALGO-059 proceeded in the presence of Mg2+, ATP, DTT, lauric acid and 1% DMSO. Gratifyingly, we observed >99% acylation of holo-ALGO-055 and holo-ALGO-059 by LC/ESI-MS, as signified by a MW increase of 182 Da, after only 1 hour of room temperature incubation (Fig. 3C, D and Fig. S30, S31). Thus, we identified two unique ACP-like sequences that can not only acquire 4′-PP, but can also fix acyl cargo.
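The mass shifts quoted above (+340 Da for 4′-PP attachment and +182 Da for lauroylation) lend themselves to a simple bookkeeping check when assigning deconvoluted LC/ESI-MS masses. The sketch below is an assumption-laden illustration: the function name `assign_state`, the 2 Da tolerance and the 10 000 Da apo mass are hypothetical and do not correspond to any protein in this study.

```python
PP_SHIFT_DA = 340.0       # apo -> holo shift, as reported in the text
LAUROYL_SHIFT_DA = 182.0  # holo -> C12-acyl shift, as reported in the text

def assign_state(observed_da, apo_da, tol_da=2.0):
    """Assign a PTM state to an observed mass, or 'unassigned' if nothing fits.

    tol_da is a hypothetical matching tolerance, not a value from the study.
    """
    expected = {
        "apo": apo_da,
        "holo": apo_da + PP_SHIFT_DA,
        "C12-acyl": apo_da + PP_SHIFT_DA + LAUROYL_SHIFT_DA,
    }
    # Pick the state whose expected mass is nearest to the observation.
    state, mass = min(expected.items(), key=lambda kv: abs(kv[1] - observed_da))
    return state if abs(mass - observed_da) <= tol_da else "unassigned"

# Hypothetical apo mass used purely for illustration.
APO_EXAMPLE_DA = 10000.0
states = [assign_state(m, APO_EXAMPLE_DA) for m in (10000.3, 10340.4, 10522.0, 10100.0)]
```

A partly holo preparation, as described for ALGO-055 and ALGO-059, would simply show both the apo and holo masses in the same spectrum.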
Encouraged by these results, we configured ALGO-CP (via route-AC) to stochastically generate chimeric variants of ALGO-055 and ALGO-059, mimicking the process of molecular breeding to expand sequence diversity within the AcpP design space. A sequence alignment of both ALGO candidates was used as input. From 5000 generated sequences, five (chALGO-[009, 012, 024, 044, 097]) were sampled from a random subset (n = 100) for experimental testing (Fig. 3E and Table S13). The chimeras chALGO-012 and chALGO-024 (Fig. 3F) were expressed in soluble form and could be further purified by Ni2+-affinity chromatography (Fig. 3G and Fig. S9, S10). As before, these proteins were also recovered partly in holo- form, signalling in vivo PTM by endogenous EcAcpS. Like their parental sequences, both candidates could readily undergo complete PTM (apo → holo → acyl) in vitro using EcAcpS and VhAasS (Fig. 3H, I and Fig. S32–S37, Table S14).
To the best of our knowledge, both ALGO-055 and ALGO-059, and their chimeric variants chALGO-012 and chALGO-024, are the first examples of expressible, soluble and modifiable ACPs designed entirely de novo by a computer algorithm. More broadly, this result showcases how ALGO-CP can be tuned to probe the sequence design space and to create further diversity from initial successful designs.
Secondary structure prediction using PSIPRED42 indicates that both ALGO sequences are likely to be predominantly α-helical (Fig. S38, S39), with complementary DISOPRED343 analysis predicting minimal intrinsic disorder (Fig. S40, S41). We further generated high-confidence structural models of our ALGO variants in their apo- form (created using AlphaFold3, pLDDT = 93.74 and 93.52 for ALGO-055 and ALGO-059, respectively, Fig. S42). These models display well-ordered α-helices typical of an ACP fold, with solvent-exposed surfaces that exhibit similar Coulombic electrostatic potential and molecular lipophilicity potential to the 1.1 Å crystal structure of EcAcpP (PDB: 1T8K, Fig. S43, S44).44
These predicted structural models were subjected to multiple 1 µs molecular dynamics (MD) simulations, with EcAcpP used as the reference system. The structural stability of all proteins was assessed in apo-, holo- and C12-acyl states. In brief, the protein backbone of both holo-EcAcpP and holo-ALGO-059 appears to be partly destabilised by the presence of the 4′-PP group, as reflected by larger RMSD fluctuations relative to their apo-forms (Fig. 4). A partial recovery of conformational stability is observed upon acylation, especially ALGO-059, likely due to favourable interactions between the C12 alkyl chain and the hydrophobic residues lining the ACP core. In contrast, ALGO-055 equilibrated rapidly and remained comparatively consistent across apo-, holo- and acyl- forms, indicating intrinsic sequence-structure features that buffer 4′-PP-induced perturbations. To investigate the relationship between protein stability and 4′-PP dynamics, the RMSD of the Ser-4′-PP moiety was calculated relative to a pocket-bound reference conformation (Fig. S45A). Indeed, replica 2 of holo-EcAcpP (Fig. S45B), and replica 1 of holo-ALGO-059 (Fig. S45C) display a concomitant increase in the RMSD of the 4′-PP moiety and the backbone RMSD of the protein, associated with the progressive displacement of the prosthetic group from the ACP core (∼230 ns and ∼250 ns for holo-EcAcpP and holo-ALGO-059, respectively). In contrast, the movement of 4′-PP does not induce backbone destabilisation in holo-ALGO-055. This analysis reveals a correlation between the fluctuation of 4′-PP and protein RMSD, although it cannot definitively determine whether 4′-PP displacement precedes or follows protein destabilisation.
Further DSSP analysis revealed that both holo- and acyl- forms exhibit localised secondary structure rearrangements, primarily transient transitions from α-helical regions into turns and bends (Fig. S46). These structural changes are more evident in EcAcpP and ALGO-059, whilst ALGO-055 shows a more conserved secondary-structure pattern, consistent with its enhanced conformational stability. To further dissect helix-specific contributions to ACP stability, we analysed the temporal evolution of α-helical content for the four canonical helices (αI–αIV) across all proteins and states (Fig. 5). This per-helix analysis extends the global DSSP profiles by isolating secondary structure dynamics within individual helical segments. ALGO-059 exhibits the most pronounced helix destabilisation, particularly in helices αI and αIII. Helix αII, which harbours the 4′-PP attachment site, shows a marked decrease in α-helical content across all proteins in their holo- forms. This is attributable to the dynamic motion of the 4′-PP prosthetic group, which repeatedly enters and exits the hydrophobic binding pocket, as confirmed by principal component analysis/clustering centroid analysis (Fig. S47).
In general, whilst ALGO-055/059 may capture key global properties of natural AcpP homologues, our MD simulations point toward distinctive conformational behaviours. Relative to EcAcpP, ALGO-055 demonstrates superior stability across all states, whilst ALGO-059 suffers pronounced plasticity and α-helical loss, occasionally approaching complete unfolding in select helices. These differences appear to be driven by the extent of coupling between 4′-PP dynamics and the protein backbone, with ALGO-059 more perturbed by 4′-PP motions. Collectively, these results suggest that subtle sequence-dependent effects give rise to markedly different dynamic responses.
However, to our surprise, neither apo/holo-ALGO-055 nor apo/holo-ALGO-059 demonstrated the diagnostic signatures of an α-helix when studied by CD spectroscopy (Fig. 6A–C). In stark contrast to our in silico predictions, the CD spectra of apo/holo-ALGO-055 and apo/holo-ALGO-059 deviate significantly from the characteristic double minima (208 nm and 222 nm), strong maximum (193 nm) and isodichroic point (201 nm) of a well-folded helical protein, as exemplified by apo/holo-EcAcpP (Fig. 6A). Given that some ACPs, such as V. harveyi AcpP, are natively unfolded in the absence of Mg2+, we supplemented the buffer with 5 mM MgCl2 to evaluate whether the ALGO variants can recover their helicity with the assistance of divalent cations. Remarkably, this supplement only slightly perturbed the CD spectra and did not clearly restore helicity to either ALGO protein (Fig. S48), indicating that they may well exist as minimally structured, predominantly non-helical ensembles in these forms.
Prompted by this discrepancy, we then examined whether the acylation of our ALGO proteins could induce structural ordering, either in-full or in-part. Following holo → acyl- conversion with VhAasS and C12-acid, we observed a marked increase in the helical content of both ALGO variants and EcAcpP. We further purified C12-EcAcpP and C12-ALGO-055/059 from the acylation mixture and confirmed that these proteins retain their helicity when isolated from other assay components (Fig. 6A–C and Fig. S11–S14), thus showing that this gain-of-structure is intrinsic to the C12-acylated state. Binary order–disorder classification,45 based on our CD data, predicts that both ALGO variants transition from disordered (apo/holo) to ordered states upon C12-acylation (Table S15).
A comparative analysis of the estimated helical content of EcAcpP, ALGO-055 and ALGO-059 (Fig. 6D and Table S16) reveals a consistent drop in helicity upon apo → holo conversion, in agreement with our MD simulations of pre-folded models. However, we provide further evidence that C12-acylation significantly promotes structural organisation across all proteins studied. Whilst this observation aligns with previous literature,46,47 the extent of structural rehabilitation in our ALGO variants (especially ALGO-059) is noteworthy, although our analysis cannot definitively establish complete or canonical folding. Nevertheless, our combined CD and MD data support the view that: (i) acyl cargo bound to the 4′-PP arm can act as a structural chaperone, most likely by organising and burying hydrophobic residues within the protein core, and (ii) 4′-PP modification enhances the structural plasticity of natural ACPs by partially destabilising α-helices, which may help to promote promiscuous PPIs.
The discrepancy between our apo/holo CD spectroscopy results and established findings on V. harveyi AcpP prompted a closer inspection of these ALGO sequences. It has been reported that a single A76H mutation in V. harveyi AcpP restores its helicity via π-stacking interactions with Y72, even in the absence of divalent cations.48 Whilst ALGO-055 and ALGO-059 retain the equivalent Y72 residue, ALGO-055 harbours the “destabilising” A76 variant, whereas ALGO-059 carries the “stabilising” H76. Taken together with our Mg2+ supplementation experiment, this suggests that ALGO-055 and ALGO-059 may be too divergent from natural AcpP homologues for this stabilisation mechanism to be effective. Alongside our PTM results, this points toward an underlying, hitherto undescribed sequence-structure–function relationship. Accordingly, we performed a deeper post hoc analysis of our ALGO sequences to explore this relationship further.
Our analysis of ALGO-055 and ALGO-059 revealed that both proteins harbour at least 16 amino acid variations that are represented by <5% of the AcpP homologues used in the input MSA. Of these variations, ten (ALGO-055) and seven (ALGO-059) are present in <1% (Tables S17, S18 and Fig. 7A). These variations occur across the sequence length of the proteins, with the regions typically associated with helix αII and αIII being the best-preserved by ALGO-CP. By BLASTp, the closest homologues of ALGO-055 and ALGO-059 belonged to Providencia heimbachae (69.23% identity) and Lacimocrobium alkaliphilum (65.38% identity), respectively. When compared to EcAcpP, both proteins contain ≥30 amino acid variations, approximately half of which are non-conservative by BLOSUM62 analysis with median Grantham distance scores of 89 (ALGO-055) and 78 (ALGO-059) (Table S17, S18 and Fig. 7B, C). Remarkably, ALGO-055 and ALGO-059 share only 51.28% sequence identity with each other, with 17 non-conservative variations scoring a median Grantham distance of 91 (Table S19).
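The two quantities used in this analysis, pairwise identity and the frequency of a residue at an alignment column, can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, sequences are assumed to be pre-aligned strings of equal length with `-` as the gap character, and the toy alignment is invented.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned columns to compare")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def residue_frequency(msa, column, residue):
    """Fraction of ungapped MSA sequences carrying `residue` at `column`."""
    observed = [seq[column] for seq in msa if seq[column] != "-"]
    return observed.count(residue) / len(observed) if observed else 0.0

def rare_positions(design, msa, threshold=0.05):
    """Columns where the designed residue is seen in fewer than `threshold` of homologues."""
    return [
        i for i, aa in enumerate(design)
        if aa != "-" and residue_frequency(msa, i, aa) < threshold
    ]

# Invented two-column toy alignment (illustration only).
toy_msa = ["SL", "SL", "SI", "TL"]
rare_in_toy = rare_positions("TI", toy_msa, threshold=0.30)
identity_toy = percent_identity("ACDE", "ACDF")
```

As an arithmetic aside, an identity of 51.28% is what 40 identical residues over 78 compared positions would give, although the alignment length behind the reported value is not stated here.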
Overall, the Grantham distance scores of many of these variations fall within the lower 50th percentile of all possible amino acid pairings (i.e., <96), suggesting moderate, rather than extreme, physicochemical divergence. Nevertheless, it remains possible that the accumulation of these rare and non-conservative amino acid variations contributes to the observed destabilisation of the canonical α-helical structure, though the precise variants/mechanisms responsible for this effect are difficult to predict. We note, however, that ALGO-055 and ALGO-059 preserve several acidic “hotspots”, clustered mostly downstream from the invariant serine, that are known to interface with the highly electropositive docking surfaces of EcAcpS and VhAasS (Fig. S49).39,51 This electrostatic complementarity may still provide adequate priming/orientation for productive PPIs, even in the absence of an ordered structure.
Similarly, it is difficult to ascertain deleterious amino acid variations (or combinations of mutations) from such a limited set of de novo sequences. However, a meta-analysis of our initial ALGO candidates identified 95 total variations that are exclusively found in unsuccessful ALGO sequences (i.e., sequences that failed to express in soluble form or undergo PTM) versus EcAcpP, ALGO-055 and ALGO-059 (Table S20 and Fig. 7D). Of these, 69 are featured in <10% of all 2167 AcpP homologues studied, including 38 that are found in <1%. In total, we identified 11 non-conservative amino acid variations with Grantham distances >120 when compared to EcAcpP; many of these are clustered around αI and αIV, with none occurring along the crucial recognition helix αII (Table S20 and Fig. 1A–C, 7E, F). We propose that sequences enriched with these rare and/or unorthodox amino acid variants may be predisposed to poor solubility or functionality, which may explain why such variations were selected against during AcpP evolution.
Both ALGO-055 and ALGO-059 occupy a peculiar region of an otherwise fragile sequence design space, in which some biochemical function (specifically, essential PTM) is retained in the absence of a hallmark α-helical structure. In 2005, Christopher Walsh and co-workers used phage display to identify an 11-mer PCP fragment (ybbR-13) that undergoes apo → holo PTM via interaction with BsSfp.52 Notably, even this highly truncated peptide adopts a clear α-helical conformation in solution, further highlighting the idiosyncrasy of these ALGO designs. The behaviour of these proteins may reflect a disordered “precursor-like” state to the canonical ACP fold, one in which an amorphous configuration of key contact residues is still sufficient for PTM. Indeed, it is widely appreciated that intrinsically disordered proteins can adopt three-dimensional folds via proximity to/engagement with a protein partner, and it is this structural malleability which enables promiscuous interactions.53 As biosynthetic assemblies and physiological requirements grew more complex, it is possible that the emergence of a defined α-helical fold may have conferred advantages in partner selectivity and/or kinetic efficiency through more nuanced PPIs, a hypothesis which could be explored via further phylogenetic/co-evolutionary analyses. Based on our current data, we postulate that ACP sequences with greater helical propensity – with or without the assistance of chemical chaperones – were gradually selected to meet the demands of biosynthetic precision and function, rather than out of necessity for PTM alone. Our findings also provide further evidence that the canonical ACP fold may have partially emerged as a consequence of cargo engagement, which in turn would facilitate substrate sequestration and protection.
Assuming that the canonical helical structure of AcpP is essential for cell fitness, we propose that ALGO-055 and ALGO-059 could be evolved, under in vivo selective pressure, to determine whether and how this fold can be re-acquired to support complex physiological functions. To this end, the E. coli ΔacpP knockout strain CY1877 would make an excellent background for a live/death complementation screen.22
Future work could investigate further structural and biophysical details using 1H–15N HSQC correlation spectra and thermal denaturation experiments, as well as kinetic differences in PTM between our ALGO variants and natural ACPs. Substrate sequestration behaviour of our acyl-ALGO designs could also be investigated further using vibrational spectroscopy.9,57 Finally, the systematic sampling of sequence designs across different r weighting coefficients may provide additional insights into sequence–structure–function relationships.
The GROMACS 2023 package63 was used to run MD simulations of the apo-, holo- and acylated systems. AMBER99sb was used as the force field and TIP3P as the water model. The solvated system was neutralised with counterions, and NaCl was added to a concentration of 0.1 M. Energy minimisation was performed using a maximum of 50 000 steps of steepest descent with grid neighbour searching, no constraints, and no thermostatting. Long-range electrostatic interactions were treated using the PME method, while short-range Lennard-Jones and van der Waals interactions were calculated using a 12 Å distance cutoff. The system was equilibrated in four steps. Three consecutive NVT simulations were used to gradually heat the system from 0 K to 300 K in increments of 100 K, each lasting 0.1 ns, followed by a 1 ns NPT equilibration at 300 K. Harmonic positional restraints were applied to the heavy atoms of the protein and, where present, the prosthetic group throughout the equilibrations, using a force constant of 1000 kJ mol−1 nm−2. Velocities were propagated between equilibration steps. Exclusively for the holo- simulations of ALGO-059 and ALGO-055, three NPT equilibration phases were performed, during which harmonic positional restraints on the protein backbone were gradually reduced from 1000 to 100 kJ mol−1 nm−2.
Production runs were carried out in the NPT ensemble at 300 K without any restraints, using a 2 fs integration step. Each ACP was simulated in three independent replicas (1 µs each). The temperature was maintained at 300 K using a velocity-rescale thermostat.64 Long-range electrostatics were treated with PME, bonds to hydrogens were constrained with LINCS,65 and van der Waals interactions used a Verlet cutoff with force-switching. Trajectories and energies were saved every 10 ps, and PBC were applied in all directions.
The covalently attached prosthetic group (4′-phosphopantetheine) and its acylated derivative (lauroyl-/C12) were parametrised prior to MD simulations. Their structures were extracted from PDB entries 5H9H and 2FAE, respectively. Missing atoms were added, partial atomic charges were calculated using the DFT method, and bonded and non-bonded parameters were assigned using the GAFF2 force field.66 ACPYPE67 was employed to generate GROMACS-compatible topologies for the ligands. The ligands were incorporated into the target ACPs by extracting the 4′-phosphopantetheine and lauroyl groups from the donor X-ray structures using PyMOL, manually positioning them into the binding pockets, and building the covalent linkage to the invariant serine residue to model holo- and acyl states; positional restraints were applied to ensure stability during equilibration.
Trajectories were centred and fitted to the protein backbone prior to analysis. Secondary structure was analysed using the DSSP dictionary, integrated in GROMACS. Structural stability was evaluated through backbone RMSD. Trajectories were inspected using VMD.68
To further characterise the conformational space of the acyl- and holo-ACP systems, principal component analysis (PCA) and clustering were performed on all three independent replicas for each system using MDTraj.69 Trajectories were merged and non-hydrogen atoms of the covalently attached prosthetic group or acyl chain were analysed. Cartesian coordinates were projected onto the first two principal components and clustered using a quality-threshold-based approach with a 4.5 Å cutoff for acyl- and 5.5 Å for holo-. Representative frames were selected as those closest to each cluster centroid, and cluster populations were computed as fractions of frames.
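The clustering step can be illustrated with a small pure-Python sketch of quality-threshold (QT) clustering on points already projected onto two principal components. It is not the MDTraj-based workflow used here: the function names are hypothetical, the cutoff is assumed to act as a maximum cluster diameter, and the toy coordinates are invented.

```python
import math

def qt_cluster(points, max_diameter):
    """Greedy quality-threshold clustering; returns lists of point indices."""
    remaining = list(range(len(points)))
    clusters = []
    while remaining:
        best = []
        for seed in remaining:
            candidate = [seed]
            others = [i for i in remaining if i != seed]
            while others:
                # Largest distance from point i to the current candidate members.
                def reach(i):
                    return max(math.dist(points[i], points[j]) for j in candidate)
                nxt = min(others, key=reach)
                if reach(nxt) > max_diameter:
                    break  # no remaining point fits within the diameter limit
                candidate.append(nxt)
                others.remove(nxt)
            if len(candidate) > len(best):
                best = candidate
        clusters.append(best)
        remaining = [i for i in remaining if i not in best]
    return clusters

def representative(points, cluster):
    """Index of the cluster member closest to the cluster centroid."""
    dim = len(points[0])
    centroid = [sum(points[i][d] for i in cluster) / len(cluster) for d in range(dim)]
    return min(cluster, key=lambda i: math.dist(points[i], centroid))

# Invented 2D projections (illustration only); 4.5 mirrors the acyl- cutoff in the text.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (30, 30)]
clusters = qt_cluster(points, max_diameter=4.5)
populations = [len(c) / len(points) for c in clusters]
rep_first = representative(points, clusters[0])
```

The sketch scales poorly with the number of frames, so a real trajectory would need subsampling or an optimised implementation.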
Here, a_i is the normalised similarity score for a given amino acid i, and Y denotes the set of all 20 canonical (proteinogenic) amino acids. This scoring function favours amino acids whose physicochemical properties closely match the local average at each position, enabling route-AP to explore substitutions that are not necessarily observed in natural sequences but are compatible with the general physicochemical properties of the sequence position.
The distributions generated by route-AC and route-AP can be weighted and combined into a unified probability profile $A_U$, allowing the user to bias amino acid selection towards conservation or physicochemical-guided design:

$$A_U = r \cdot A_P + (1 - r) \cdot A_C$$
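The weighted combination of the two profiles, followed by per-position sampling, can be sketched as below. This is an illustrative sketch only and should not be read as the ALGO-CP source code: the function names, the dictionary representation of each per-position profile and the toy probabilities are all assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mix_profiles(a_c, a_p, r):
    """Combine conservation (a_c) and physicochemical (a_p) profiles for one position.

    Implements A_U = r * A_P + (1 - r) * A_C; residues missing from a profile
    are treated as having zero probability.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    return {aa: r * a_p.get(aa, 0.0) + (1.0 - r) * a_c.get(aa, 0.0) for aa in AMINO_ACIDS}

def sample_sequence(profiles_c, profiles_p, r, rng):
    """Draw one residue per position from the combined profile."""
    residues = []
    for a_c, a_p in zip(profiles_c, profiles_p):
        a_u = mix_profiles(a_c, a_p, r)
        weights = [a_u[aa] for aa in AMINO_ACIDS]
        residues.append(rng.choices(AMINO_ACIDS, weights=weights)[0])
    return "".join(residues)

# Invented two-position profiles (illustration only).
profiles_c = [{"S": 0.9, "T": 0.1}, {"L": 1.0}]
profiles_p = [{"S": 0.5, "T": 0.3, "A": 0.2}, {"L": 0.6, "I": 0.4}]
mixed = mix_profiles(profiles_c[0], profiles_p[0], r=0.60)
conservative = sample_sequence(profiles_c, profiles_p, r=0.0, rng=random.Random(0))
```

At r = 0 only residues present in the conservation profile can be drawn, whereas at r = 0.60 the alanine in the toy example, absent from the conservation profile, acquires a probability of 0.12.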
000 rpm, 45 minutes, 4 °C). The cell-free extract was clarified by filtration (Millex-HP 0.45 µm polyethersulfone, Merck). The recombinant protein was captured from cell-free extract using a HisTrap HP (1 mL) column and washed (35 mM imidazole, 15–20 mL). Protein was eluted in fractions (300 mM imidazole, 1 mL) and pooled. ALGO proteins were passed through a centrifugal filtration unit (30–100 kDa MWCO) to remove protein contaminants, and the flowthrough was dialysed in imidazole-free buffer. EcAcpP was polished by size exclusion chromatography using a HiLoad 16/600 Superdex 75 pg (120 mL) column. For long-term storage, protein aliquots were flash frozen in liquid nitrogen and stored at −80 °C.
000 rpm, 45 minutes, 4 °C). The cell-free extract was clarified by filtration (Millex-HP 0.45 µm polyethersulfone, Merck). The recombinant protein was captured from cell-free extract using a HiTrap TALON Crude (1 mL) column and washed (10 mM imidazole, 15–20 mL). BsSfp was eluted in fractions (150 mM imidazole, 1 mL), pooled and dialysed in imidazole-free buffer. For long-term storage, protein aliquots were flash frozen in liquid nitrogen and stored at −80 °C.
| ATP | Adenosine triphosphate |
| DSSP | Dictionary of secondary structure of proteins |
| IPTG | Isopropyl β-D-thiogalactopyranoside |
| RMSD | Root mean square deviation |
The code for ALGO-CP and related data analysis can be found at https://github.com/MAHerrera-94/ALGO_CP. The version of the code employed for this study is version 1.0.
Supplementary data have been deposited with the Protein Data Bank.72–74
Footnotes
† M. A. H. dedicates this manuscript to his close friends and family.
‡ These authors contributed equally.
This journal is © The Royal Society of Chemistry 2026