Unlocking novel therapies: cyclic peptide design for amyloidogenic targets through synergies of experiments, simulations, and machine learning

Existing therapies for neurodegenerative diseases like Parkinson’s and Alzheimer’s address only their symptoms and do not prevent disease onset. Common therapeutic agents, such as small molecules and antibodies, struggle with insufficient selectivity, stability and bioavailability, leading to poor performance in clinical trials. Peptide-based therapeutics are emerging as promising candidates, with successful applications for cardiovascular diseases and cancers due to their high bioavailability, good efficacy and specificity. In particular, cyclic peptides have long in vivo stability while maintaining a robust antibody-like binding affinity. However, the de novo design of cyclic peptides is challenging due to the lack of long-lived druggable pockets on the target polypeptide, the absence of exhaustive conformational distributions of the target and/or the binder, unknown binding sites, methodological limitations, associated constraints (failed trials, time, money) and the vast combinatorial sequence space. Hence, efficient alignment and cooperation between disciplines, and synergies between experiments and simulations complemented by popular techniques like machine learning, can significantly speed up the development of therapeutic cyclic peptides for neurodegenerative diseases. We review the latest advancements in cyclic peptide design against amyloidogenic targets from a computational perspective, in light of recent advancements in and the potential of machine learning to optimize the design process. We discuss the difficulties encountered when designing novel peptide-based inhibitors and propose new strategies incorporating experiments, simulations and machine learning to design cyclic peptides that inhibit the toxic propagation of amyloidogenic polypeptides. Importantly, these strategies extend beyond the mere design of cyclic peptides and serve as a template for the de novo generation of (bio)materials with programmable properties.


a University of Amsterdam, van 't Hoff Institute for Molecular Sciences, Science Park 904, P.O. Box 94157, 1090 GD Amsterdam, The Netherlands. E-mail: i.m.ilie@uva.nl
b Amsterdam Center for Multiscale Modeling (ACMM), University of Amsterdam, P.O. Box 94157, 1090 GD Amsterdam, The Netherlands

Daria De Raffele
Daria De Raffele is a postdoctoral researcher in the group of Dr Ioana M. Ilie. She obtained her PhD from Universitat Jaume I, where she focused on the QM/MM theoretical study of enzyme promiscuity. Her research focuses on the design of cyclic peptides to inhibit the toxic propagation of the prion protein.

Ioana M. Ilie
Dr Ioana M. Ilie is an Assistant Professor in the Computational Chemistry group at the University of Amsterdam. The research in her group focuses on multiscale simulations of biomolecular systems. In particular, her team relies on rational design and multiscale simulations to (1) understand and control the aggregation mechanisms of polypeptides and their response to the biological environment, (2) design peptide-based therapeutics for neurodegenerative diseases and (3) develop smart (bio)materials with responsive properties.

Introduction
Neurodegenerative disorders, such as Alzheimer's, Parkinson's and Creutzfeldt-Jakob disease, affect over 50 million people worldwide, with over 10 million new cases a year. The World Health Organization projects that by 2040, neurodegeneration will become the second leading cause of death after cardiovascular disease. 1 The associated polypeptides are intrinsically disordered proteins (IDPs) or rich in intrinsically disordered regions (IDRs), characterized by the lack of stable secondary structures in the native state. 2 What they all share is the ability to accumulate in liquid-like membraneless organelles and/or form insoluble solid-like aggregates. 5-7 Various approaches have been developed to interfere with the accumulation processes by stabilizing or eliminating specific monomeric or aggregated forms of the responsible polypeptides. 8,9,12,13 Traditional small molecule drugs and protein-based therapeutics have made valuable contributions, yet their limitations in terms of selectivity, stability, and bioavailability, 14,15 as well as their repeated failure in clinical trials, 16,17 have inspired the search for alternative therapeutic approaches. Among these, peptides, and particularly cyclic peptides, are attracting considerable attention due to their unique structural properties and diverse biological activities. 18,19 Their cyclic nature confers enhanced stability and resistance to proteolytic degradation, while maintaining a robust binding to the target. 20 Cyclic peptides have proven to be excellent candidates for cancer therapy, 21 organ transplantation 22 and inhibition of amyloid aggregation. 23,24 Their size and functional properties ensure that the contact area is large enough to provide high selectivity, 25 their ability to form salt-bridges and hydrogen bonds can lead to strong binding affinities, 26 and cyclization increases their proteolytic stability.
27 Amyloid-forming polypeptides, such as amyloid-β (Aβ42), α-synuclein (α-syn) and amylin (hIAPP), share the intrinsic disorder independently of their size or residue sequence. The cellular prion protein (PrPC) consists of a membrane-anchored ordered globular domain, composed of three α-helices and a two-stranded anti-parallel β-sheet, preceded by a 100-residue unstructured flexible tail. Despite the well-defined secondary structure in its monomeric form, the cellular prion protein lacks a specific binding site accessible to potential small molecule inhibitors. 8,28,29 Due to their properties, cyclic peptides can selectively intervene in the folding and aggregation process, bind even to targets lacking an easily accessible druggable pocket 30 or to heterogeneous and dynamic species, 31 regulate the conformational stability of the target polypeptide and potentially halt or slow down disease onset or progression. Furthermore, the stability and permeability of cyclic peptides enable them to cross the blood-brain barrier, 32 a crucial requirement for effective neurodegenerative disease therapies.
Advances in peptide synthesis techniques, combinatorial chemistry, and computational tools allow the de novo design and tuning of the structural elements, target specificity, binding affinities, solubility, cell permeability and proteolytic stability of natural and synthetic cyclic peptides. De novo design of cyclic peptides often relies on protein engineering strategies, such as rational design and directed evolution, which aid in the discovery and/or improvement of peptides for drug-related applications. 33 Over the past years, computer simulations and machine learning have enabled the exploration of a vast chemical space, accelerating the design and optimization of lead peptidomimetic candidates. 34,35 Combined with directed evolution, they are versatile tools that enable an initial in silico screening step to scan the full combinatorial libraries and propose mainly small molecules to be tested in vitro. 36 While most of these models are trained on experimental data, more recently machine learning combined with molecular dynamics simulations has successfully proposed and optimized chemical compounds, and reduced the number to be tested experimentally at a later stage. 37 In contrast to small molecule and protein optimization, the use of machine learning for de novo peptide design is still in its early stages 38-40 and its potential has been demonstrated mainly in non-therapeutic applications. 
39 In this paper, we provide an overview of recent advances in the use of cyclic peptides as therapeutic or imaging agents for neurodegenerative diseases, focusing particularly on the amyloid-β peptide, α-synuclein, amylin and the cellular prion protein. We emphasize the importance of the synergy between computer simulations and experiments in light of the latest developments in machine learning for cyclic peptide design and optimization. Additionally, we provide a recipe for a potential approach to capitalize on the predictive power and results of computer simulations and AI in the development of cyclic peptide-based therapeutics.
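To give a sense of scale for the combinatorial sequence space that any in silico screening step has to cover, the following minimal sketch counts linear sequences of a given length and, for head-to-tail cyclized peptides, the sequences that remain distinct once rotations of the ring are treated as the same molecule (Burnside's lemma). It is an illustration only: the function names are ours, and the count assumes the 20 canonical amino acids with no chemical constraints, no D-residues and no non-canonical building blocks.

```python
from math import gcd


def linear_sequences(n: int, alphabet: int = 20) -> int:
    """Number of linear sequences of length n over `alphabet` residue types."""
    return alphabet ** n


def cyclic_sequences(n: int, alphabet: int = 20) -> int:
    """Number of length-n sequences that are distinct up to cyclic rotation.

    A head-to-tail cyclized peptide has no defined first residue, so all
    rotations of a sequence describe the same ring. Reversal is NOT merged,
    because the N-to-C direction of the backbone is preserved.
    Burnside's lemma: (1/n) * sum_k alphabet**gcd(n, k), k = 0..n-1.
    """
    total = sum(alphabet ** gcd(n, k) for k in range(n))
    return total // n


if __name__ == "__main__":
    for length in (6, 8, 10, 12):
        print(length, linear_sequences(length), cyclic_sequences(length))
```

For a hexapeptide this gives 64 000 000 linear sequences and 10 668 140 distinct rings, and the numbers grow by a factor of roughly 20 per added residue, which is why exhaustive experimental testing quickly becomes impractical.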

Small molecules and antibodies
This journal is © The Royal Society of Chemistry 2024
To date, extensive research efforts have been dedicated to the development and advancement of small molecules and antibodies targeting neurodegenerative targets into clinical trials. 16,17,43,44 Small molecules have a low molecular weight (<900 Da) and hydrophobicity, and can therefore more readily traverse cell membranes and distribute throughout the body. 45 They offer advantages such as oral administration and scalability for mass production. 46 Nevertheless, they often bind to rigid targets with accessible druggable pockets, i.e. active sites or cavities on the surface of a protein with a well-defined structure that can accommodate small molecules. Proteins and peptides associated with neurodegenerative disorders are often intrinsically disordered and do not possess a well-defined structure in their native state, 2,47,48 which prevents the existence of druggable pockets and implicitly access to long-lived cavities for small-molecule binding. Despite their shortcomings, small molecules have been at the forefront of drug development against amyloid-β42, α-synuclein, amylin and prion protein condensation. Particularly, natural products and their degradation products were shown to alter the aggregation of the target polypeptide or modulate its toxic behavior 49-64 (Fig. 1(a)). One example is curcumin, a natural product with various benefits such as preventing amyloid formation and promoting the formation of ''off-pathway'' soluble oligomers and prefibrillar aggregates; 65,66 it can also disrupt Aβ40 fibrils and break down tau tangles. 67,68 Additionally, the degradation products of curcumin, ferulic acid and vanillin, have a better solubility than curcumin. Ferulic acid can destabilize Aβ40 and Aβ42 fibrils 69 and vanillin can prevent amyloid aggregation in living organisms. 
70 Likewise, cholic acid, a primary bile acid, shows anti-amyloidogenic behavior, inhibiting amyloid formation and preventing secondary nucleation. 59 Vitamin K3, known for its anti-inflammatory and anti-cancer properties, inhibits the Aβ42 aggregation process and reduces cytotoxicity in a human neuronal cell line. 62 We refer the interested reader to comprehensive reviews of small molecule inhibitors for amyloid-β 71 and for amyloidogenic polypeptides in general. 72

Antibodies are Y-shaped proteins of larger molecular weight than small molecules (>150 kDa), which can recognize and bind to protein targets with high specificity and modulate their toxic behavior. 73 In particular, monoclonal antibodies have been designed both for therapeutic and diagnostic applications. 12,13 For instance, the DesAbs single-domain antibodies targeting Aβ42 epitopes 74 interact with the monomeric peptide, bind with high affinity to the oligomeric species but not to the fibrillar structures, inhibit secondary nucleation 12 and suppress Aβ42-mediated toxicity in C. elegans. 74 Aducanumab and lecanemab, approved anti-Aβ42 agents, are monoclonal antibodies effective for patients in the early stages of AD due to their ability to reduce amyloid deposits in the brain. 43 Aducanumab (Fig. 1(b)) binds at the N-terminal residues 3 to 7 and can discriminate between monomers and aggregated species, 44 while lecanemab binds to soluble Aβ42 protofibrils. 75 Cinpanemab and prasinezumab, two monoclonal antibodies directed against α-synuclein aggregates, failed in clinical trials due to the lack of positive effects on disease progression. 16,17 The POM family of antibodies (POMologues) has been developed to recognize a variety of epitopes along the sequence of the cellular prion protein 10 and modulate its toxic effects. 
76 Notwithstanding, no drug against prion diseases is currently in clinical trials. Despite the recent success in the Alzheimer's field with aducanumab 43 and the potential of gantenerumab, 77 antibodies have limitations as therapeutics, including stability and immunogenicity, 14,15 which can impact clinical efficacy.

Cyclic peptides
Cyclic peptides are naturally occurring or chemically synthesized macrocycles consisting of circular sequences of amino acids 78-81 (Fig. 1(c)). Compared to small molecules, they can bind larger, more polar and solvent-exposed protein surfaces. 20 To mimic naturally occurring cyclic peptides, the macrocycles can be experimentally synthesized from linear precursors by connecting their N- and C-terminal residues via covalent bonds (head-to-tail cyclization). 82 Chemically, cyclization can be achieved through lactamization or via disulfide bond formation, ensuring the link between the two termini of the linear precursor. 83 Head-to-tail cyclization restricts the dynamics of a peptide and can stabilize the formation of β-hairpins. Other conformations, such as α-helices, can be favored via stapling, i.e. cross-linking of two or more side chains. We refer the interested reader to detailed reviews on the synthesis of cyclic peptides 84 and stapled peptides, 85,86 and focus in the following paragraphs on head-to-tail cyclized peptides.
Because of their physicochemical properties, cyclic peptides present a series of advantages compared to their linear precursors, small molecules and biological therapeutic agents such as antibodies. First, the rigidity obtained through cyclization provides increased stability, higher resistance to proteolysis 27 and enhanced cell permeability compared to linear peptides. 20,87,88 Second, their size and functional properties ensure that the contact area is large enough to provide high selectivity, and their ability to form salt-bridges and hydrogen bonds can lead to strong binding affinities. 26 Hence they can maintain a robust antibody-like binding to (undruggable) interfaces with high affinity, 20,79,89 due to their larger surface and implicitly the higher number of hydrogen bonding partners. Third, cyclic peptides have good in vivo stability, which contributes to enhanced retention and circulation, particularly if they are rich in non-canonical amino acids. 20

Cyclic peptide-based therapeutics also face a series of challenges. Orally administered cyclic peptide drugs suffer from poor bioavailability, 78 because of their susceptibility to proteolytic degradation in the gastrointestinal tract. 27 Nevertheless, different routes of administration, such as subcutaneous or intravenous injection, overcome these difficulties and aid in the efficient delivery of the peptide drug to the target. 84 Another obstacle involves preventing off-target interactions, a challenge often tackled by selectively replacing natural amino acids in the sequence with non-natural ones.

Fig. 1 (caption, partially recovered): naturally occurring cyclotide kalata B1 (MW 2.92 kDa) [...] derived from residues 306-311 of tau. 42

In the amyloid field, cyclic peptide development has been growing over the past decade. 
30,78,80 Typical methods involve designing peptides rich in aromatic moieties, hydrophobic amino acids, or D-amino acids (due to the stereoselectivity of proteases for L-amino acids) that disrupt the aggregation process, i.e. β-breakers, or agents that bind to monomeric and oligomeric species, competing with the responsible polypeptide to hinder its aggregation and/or toxic transformation. 90 For instance, the RD2D3 D-peptide (H-ptlhthnrrrrrrprtrlhthrnr-NH2 †), designed to modulate the binding of PrPC to Aβ42 oligomers, interferes with Aβ42-PrPC heteroassembly in a concentration-dependent manner. 91 Its cyclic successor presents better in vitro potency and pharmacokinetic properties 92 and could potentially alter Aβ42 aggregation. The bicyclic DesBP peptide (RAACKLGIKACTSVYHACGGKRR) was rationally designed to bind monomeric Aβ42 at residues 31-36 and 38-42 24,93 and was shown to alter the morphology of Aβ42 aggregates in a dose-dependent manner. In particular, higher peptide concentrations lead to increased aggregate disorder and reduced cytotoxicity. 93 Similarly, the BD1 cyclic peptide (O-ySGLIKWTTALLRTYC-NH2) was shown to inhibit α-synuclein fibril formation in vitro. 94 The cyclic D,L-α-peptide CP-2 (IJwHsK ‡) prevents α-syn aggregation into toxic oligomers by an ''off-pathway'' mechanism. 95 Particularly, it interacts with the N-terminus and the non-amyloidogenic region, altering the protein's membrane interaction properties and fibril morphologies, thereby preventing the toxic membrane disruption. The macrocyclic inhibitory peptides (MCIPs) were designed to bind to amyloids by mimicking human IAPP (hIAPP) interaction surfaces while maintaining only minimal hIAPP-derived self-/cross-recognition elements. 96 Inhibitor selectivity was tuned by chirality, which led to nanomolar binding affinities to hIAPP and to both the amyloid-β40 and amyloid-β42 peptides, high proteolytic stability in human plasma and the ability to cross the human blood-brain barrier. 
96 Also, disulfide-rich macrocyclic peptides are versatile scaffolds for stable biochemical tool development. Two examples are SFTI-1 (GRCTKSIPPICFPD, disulfide connectivity: Cys3-Cys11), a cyclic peptide that inhibits trypsin, and the kB1 cyclotide (GLPVCGETCVGGTCNTPGCTCSWPVCTRN, disulfide connectivity: Cys5-Cys19, Cys9-Cys21 and Cys14-Cys26), both of which have an inherent ability to inhibit the fibril growth of the tau-derived hexapeptide Ac-VQIVYK-NH2 (AcPHF6). 42 Particularly, kB1 is a stronger inhibitor of tau fibrillization than SFTI-1, enabling better binding and/or disruption of AcPHF6 fibrils. Recently, tau mimetic peptides (β-bracelets) have been designed starting from the high-resolution structure of the tau fibril fold by extracting β-strand sequences linked by β-arcs. 97 The newly generated peptides self-assemble into parallel β-sheet fibrils and can serve as templates for the design of soluble inhibitors of tau seeding.
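As a small consistency check on the sequences and disulfide connectivities quoted above for SFTI-1 and kB1, the sketch below verifies that each listed pair of residue numbers really points at two cysteines. The helper names are ours and purely illustrative; the check says nothing about whether a given pairing is chemically or structurally correct, only that the numbering matches the sequence.

```python
def cysteine_positions(sequence: str) -> list[int]:
    """1-based positions of all cysteines in a one-letter sequence."""
    return [i for i, aa in enumerate(sequence, start=1) if aa == "C"]


def check_disulfides(sequence: str, pairs: list[tuple[int, int]]) -> bool:
    """True if every (i, j) pair (1-based) refers to two cysteine residues."""
    return all(
        sequence[i - 1] == "C" and sequence[j - 1] == "C" for i, j in pairs
    )


# Sequences and connectivities as given in the text.
SFTI1 = "GRCTKSIPPICFPD"
KB1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"

SFTI1_BONDS = [(3, 11)]
KB1_BONDS = [(5, 19), (9, 21), (14, 26)]

if __name__ == "__main__":
    print(cysteine_positions(SFTI1), check_disulfides(SFTI1, SFTI1_BONDS))
    print(cysteine_positions(KB1), check_disulfides(KB1, KB1_BONDS))
```

The same kind of check is useful when peptide sequences are copied between papers, databases and simulation input files, where off-by-one numbering errors are easy to introduce.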
In terms of the cellular prion protein, no progress has been made on the therapeutic cyclic peptide front, despite its well-defined secondary structure in the soluble form. Potential causes are the lack of druggable pockets or of a stable unique binding region in the globular domain, and the intrinsically disordered nature of the tail. However, the existence of monoclonal antibodies that bind in the nanomolar regime to PrPC indicates that putative interaction sites are available. 10 We hypothesise that available high resolution structures of PrPC in complex with monoclonal antibodies may serve as starting points for the rational design of cyclic peptides that can potentially stabilize the soluble isoform of the protein and therefore prevent its toxic transformation. Alternatively, by tweaking the environmental conditions through mild solvent alteration, e.g. by replacing water with D2O 98 or by adding organic compounds, 99,100 one can delicately alter the conformational landscape of the protein to reveal new (allosteric) druggable pockets without disturbing the protein's secondary structure. We refer the interested reader to a series of reviews on peptide-based strategies to interfere with protein misfolding and aggregation, 101,102 and to reviews on the therapeutic potential of cyclic 90 and bicyclic peptides. 103 Studies older than 10 years focusing on anti-amylin cyclic peptides and peptide-based inhibitors have been reviewed elsewhere. 104

Design methods and pitfalls

Conventional peptide design approaches
Designing a soluble peptide-based binder with simultaneously high target specificity, binding affinity, cell permeability and proteolytic stability requires prior knowledge of the molecular target and its environmental conditions. Experimentally, genetically encoded methods such as phage display 105 or mRNA display 106,107 allow the generation of libraries of cyclic peptides that bind with high affinity to the target. 81,108 While these libraries offer a vast array of molecules, the chemical synthesis step as well as the numerous experimental trials are time and resource consuming. Furthermore, translating cyclic-peptide hits obtained through display technologies into clinical applications has proven challenging due to potential shortcomings in their pharmacological properties, including limited oral bioavailability, cell permeability, and solubility. 108 Other approaches, such as directed evolution, mimic the natural evolution process of a peptide by creating a diverse library to screen for mutants with improved characteristics. 109 Directed evolution does not require information on the structure-function relationship of the substrate, and relies on an iterative procedure of random mutations and artificial selection to discover new and useful proteins, but is limited by the vast pool of possibilities to be tested.
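The iterative mutate-and-select logic of directed evolution can be illustrated with a toy in silico loop. The sketch below is an assumption-laden illustration, not an experimental protocol: the scoring function `toy_fitness` and its target string are invented placeholders standing in for a real binding or activity assay, and all parameter names and defaults are ours.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical residues


def toy_fitness(seq: str, target: str = "KLVFFA") -> int:
    """Hypothetical score: positions identical to a made-up target sequence.

    In a real campaign this would be an experimental readout or a
    physics-based or learned scoring function.
    """
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq: str, rng: random.Random) -> str:
    """Return a copy of seq with one randomly chosen position resampled."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def directed_evolution(start, fitness, rounds=50, library_size=100, seed=0):
    """Repeat: build a library of single mutants, keep the best variant."""
    rng = random.Random(seed)
    best = start
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        candidate = max(library, key=fitness)
        # Selection step: accept only variants that are at least as fit.
        if fitness(candidate) >= fitness(best):
            best = candidate
    return best


if __name__ == "__main__":
    evolved = directed_evolution("AAAAAA", toy_fitness)
    print(evolved, toy_fitness(evolved))
```

Even in this toy setting each round only samples a small part of the possible single mutants, which mirrors the limitation noted above: the search is local and the pool of untested variants remains very large.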
Over the past years, rational design approaches for de novo peptide design have gained momentum. Rational design relies on a detailed understanding of the amino acid sequence, high resolution protein structure, function and interaction mechanisms. 33,111,112 Rational design relies on human intervention, which often offers an informed and efficient means to narrow down the search space for amino acids, resulting in a smaller and more manageable pool of effective peptides. De novo rational cyclic peptide design requires (a) high resolution three-dimensional structures and biochemical/biophysical information of the target protein, and/or (b) detailed information on the ligand properties (e.g. hydrogen bonding abilities to the target, hydrophobicity, cyclization chemistry, existence of natural and non-natural residues) and conformations (i.e. the designed peptide may assume different conformations in the bound and unbound states). 113 Recent advancements in cryo-electron microscopy (cryo-EM) have enabled the determination of the three-dimensional (3D) high resolution structures of new amyloidogenic aggregates and their monomeric precursors. 114 These 3D structures, combined with a comprehensive understanding of molecular interactions and structure-function relationships, could enable the rational design of (cyclic) peptides tailored for amyloidogenic targets. As a matter of fact, rational design has been successfully used to generate the DesAbs antibodies targeting amyloid-β 12 or specific α-synuclein and hIAPP epitopes. 115 Starting from the high resolution structures of amyloid fibrils of tau, α-synuclein, and amyloid-β, miniproteins ranging from 35 to 48 residues were successfully designed to bind to the fibrillar tips of the targets and inhibit aggregation in vitro and in vivo. 
116 First, a library of peptide-based inhibitors was created using Rosetta. Subsequently, Rosetta's MotifGraft protocol 117 was used to dock the inhibitors onto the fibrils, and the resulting complexes were energy minimized. The top-ranking inhibitors, i.e. the best binders, were subjected to molecular dynamics simulations to assess the stability of the complexes. Lastly, Rosetta's ab initio structure prediction algorithm 118 was employed for the final screening of inhibitors. Inhibitors with the most favorable energy predictions and the smallest root mean squared deviations from the original design were selected for experimental validation.
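The final selection step described above, keeping designs with favorable predicted energies and small root mean squared deviations (RMSD) from the original design, can be sketched generically as follows. This is not Rosetta code and does not reproduce the cited protocol: the `Design` record, the cutoff values and the example numbers are hypothetical, and the RMSD helper assumes the two coordinate sets are already superimposed.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Design:
    name: str
    energy: float  # predicted energy, lower is better (arbitrary units)
    rmsd: float    # RMSD to the original design model, in angstrom


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared = sum(
        (a - b) ** 2 for p, q in zip(coords_a, coords_b) for a, b in zip(p, q)
    )
    return sqrt(squared / len(coords_a))


def select_designs(designs, rmsd_cutoff=2.0, top_n=3):
    """Discard designs that drift from the model, then rank by energy."""
    kept = [d for d in designs if d.rmsd <= rmsd_cutoff]
    return sorted(kept, key=lambda d: d.energy)[:top_n]


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    pool = [
        Design("d1", -50.0, 1.2),
        Design("d2", -60.0, 3.5),
        Design("d3", -45.0, 0.8),
        Design("d4", -55.0, 1.9),
    ]
    for design in select_designs(pool, top_n=2):
        print(design)
```

Filtering on RMSD before ranking on energy reflects the idea that a low-energy design is only trustworthy if the predicted fold stays close to the intended one; other ways of combining the two criteria are equally possible.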
From a computational perspective, virtual screening allows fast screening of millions of compounds prior to experimental testing, thereby reducing cost and saving time. Virtual screening using cyclic peptides is limited by the availability of three-dimensional structures of the targets, by the absence of druggable pockets and by the lack of information on the structure of the designed cyclic peptide. To overcome some of these limitations, different computational techniques have been combined with machine learning to predict protein structures and complexes thereof. Notable examples include HADDOCK (High Ambiguity Driven protein-protein Docking), 119-121 RoseTTAFold 122 and AlphaFold2. 123,124 HADDOCK uses biochemical and biophysical interaction data, such as nuclear magnetic resonance titration experiments or mutagenesis data, to facilitate the protein-protein docking process. 119 Recent developments include the generation of cyclic peptide conformations and docking to the protein target using knowledge of the binding site on the protein side to drive the modeling. 125 AlphaFold2 is a deep-learning algorithm that incorporates neural network architectures inspired by the physical and geometric aspects of protein structures. 126 It employs insights from evolutionary conservation through the analysis of multiple sequence alignments. These alignments are generated by considering information from evolutionarily related proteins, along with the 3D coordinates of a few homologous structures known as templates. Similarly, RoseTTAFold also utilizes multiple sequence alignments and a set of initial templates to accurately predict folded structures 122 and protein-protein complexes. 
40,122 These technological advancements contribute significantly to the prediction of protein structures through computational means. The intrinsic disorder associated with amyloidogenic polypeptides implies that the target protein lacks a stable structure and that its native state is better described by a diverse conformational ensemble rich in disordered structures. 2,127,128 In this context, AlphaFold2 fails to predict such conformations and often yields unrealistic structures that do not accurately capture the states in the ensemble 127-129 (Fig. 2). The lack of realistic and physically accurate ensembles of structures hampers the design of any type of inhibitor, which represents a limitation of these novel deep learning techniques.

Molecular dynamics simulations in peptide design
Computer simulations are powerful tools that enable the generation of quantitative conformational ensembles for the target (intrinsically disordered) protein with properties comparable to experimental results. 127,128,130 Moreover, molecular dynamics simulations can go beyond experimental resolution to provide valuable insights into the stability and dynamics of cyclic peptides, 131 structural detail on their membrane permeability, 132 as well as quantitative distributions of their target-bound and target-unbound states. 128 Recently, extensive molecular dynamics simulations at full atomistic resolution (Table 1) have been used to successfully identify transient monomeric Aβ42 conformations that have characteristics of fibrillar structures. 133 States of the monomeric, dimeric, oligomeric and fibrillar amyloidogenic polypeptides have been thoroughly characterized and have been reviewed elsewhere. 2,128,130 The identified pool of structures could potentially be used for small molecule or cyclic peptide docking and design. Ideally, access to a well organized, reliable, and consistently maintained database of molecular dynamics trajectories of amyloidogenic polypeptides would avoid the repeated generation of similar trajectories and enable more [...]

For small molecule docking, snapshots from molecular dynamics simulations of the Aβ42 monomer, 134 dimer 50 or multimers 52,135 have been clustered to generate representative ensembles to be prepared for docking, which can be experimentally validated. 62 Briefly, curcumin and a set of curcumin derivatives were docked onto Aβ42 multimeric conformations generated with molecular dynamics simulations. 50,52 Results revealed that the small molecules interact with high probability with the amyloidogenic driving domains KLVFF (residues 16-20) and GAIIG (residues 29-33) of Aβ42 and disrupt their secondary structure in the hexameric 52 and dimeric arrangements. 
50 Interestingly, Silybin A (Sil A) and Silybin B (Sil B), two diastereoisomers of silibinin, were shown to have different interaction preferences for Aβ40 and to elicit distinct biological responses. 51 Sil A, binding the aromatic residues F19 and F20, slowed down aggregation, while Sil B, interacting primarily with the C-terminus of the polypeptide, fully abolished amyloid aggregation. Compelling evidence suggests that Silybin B is also a powerful inhibitor of the toxic self-assembly of hIAPP. 53 Simulation and experimental work revealed that the frequent interactions of Sil B with the S20-S29 sequence induce disorder in the amyloidogenic core and attenuate hIAPP toxicity and aggregation propensity. 53 Myricetin, another polyphenolic flavonoid, was shown to bind hIAPP at the amyloidogenic core and its C-terminus, preventing aggregation and distorting the fibrils. 136 The differential binding score (DIBS) was introduced to determine the binding preferences of ligands to an ensemble of IDP conformations by comparison against random coil ensembles of the same protein extracted from MD simulations. 137 The validation was performed on epigallocatechin-3-gallate (EGCG) binding to the unstructured N-terminus of the tumor suppressor p53 protein, which compared favorably to experimental results. The predictive ability of simulations has been demonstrated in a translational study, in which atomistic simulations were used to design new polythiophene derivatives against prion aggregation, prior to in vivo testing. 54 The compounds subsequently showed substantial prophylactic and therapeutic potency in prion-infected mice. Hence, simulations are powerful tools to generate conformational ensembles of the target polypeptide, which can act as scaffolds for the docking and design of molecules to target specific amyloid-forming regions.
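The step of clustering simulation snapshots into a small set of representative structures for docking can be sketched with a simple leader (greedy) algorithm acting on a precomputed pairwise distance matrix, for example pairwise RMSD values between frames. This is one of many possible clustering schemes and is not claimed to be the method used in the cited studies; the function name, the cutoff and the example matrix are illustrative assumptions.

```python
def leader_cluster(dist, cutoff):
    """Greedy leader clustering on a symmetric distance matrix.

    Frames are visited in order. A frame joins the first existing cluster
    whose representative lies within `cutoff`; otherwise it becomes the
    representative of a new cluster.

    Returns (representatives, assignment), where `representatives` holds the
    frame index of each cluster centre and `assignment[i]` is the cluster
    index of frame i.
    """
    representatives = []
    assignment = []
    for i in range(len(dist)):
        for cluster_index, rep in enumerate(representatives):
            if dist[i][rep] <= cutoff:
                assignment.append(cluster_index)
                break
        else:  # no representative close enough: open a new cluster
            representatives.append(i)
            assignment.append(len(representatives) - 1)
    return representatives, assignment


if __name__ == "__main__":
    # Hypothetical pairwise RMSD values (angstrom) between four snapshots.
    rmsd_matrix = [
        [0.0, 0.5, 3.0, 3.1],
        [0.5, 0.0, 2.9, 3.0],
        [3.0, 2.9, 0.0, 0.4],
        [3.1, 3.0, 0.4, 0.0],
    ]
    reps, labels = leader_cluster(rmsd_matrix, cutoff=1.0)
    print(reps, labels)
```

The representatives returned by such a procedure would then be the structures prepared for docking. The result of a leader algorithm depends on the frame order and on the cutoff, so in practice the sensitivity to both should be checked.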
Simulations of the effects of antibodies on the structural and dynamic properties of amyloidogenic polypeptides have also provided valuable insight into their modulating properties. Specifically, molecular dynamics simulations of aducanumab in complex with Aβ42 revealed that the antibody sterically binds to monomeric, oligomeric and fibrillar species, with the binding site at the N-terminus (residues 2-7) preserved across all systems. 41 Additionally, the results showed that the monomer unfolds and hydrophobically collapses on the antibody's surface, while in the complexes with aggregated species, the β-sheet structure of the peptide remains conserved. 41 All-atom simulations of PrPC in complex with the neurotoxic POM1 and the innocuous POM6 antibodies revealed that the two antibodies, despite targeting similar epitopes, modulate differently the intrinsic flexibility of the protein 28 and its orientation with respect to the cellular membrane. 29 The information extracted from the simulations of amyloidogenic polypeptides in complex with antibodies could serve as a starting point for the optimization and design of agents (e.g. antibodies, peptides) that bind with higher affinity to selected species, or for the rational design of cyclic peptides that modulate the target's conformational landscape, enabling access to new binding sites.

Aside from the structure and the conformational landscape of the target, the conformations of the designed cyclic peptide in the target-bound and target-unbound states play an important role. Essentially, the designed cyclic peptides often adopt different conformations in solution compared to the target-bound state. To design an efficient peptide-based inhibitor, one needs to understand the conformational transitions of the cyclic peptides between the different states. While some peptide-protein complex structures are available, obtaining high resolution structures of cyclic peptides in solution is hampered by their low core-to-surface ratio, absence of 
specific couplings (e.g. NH-Hα) and diverse conformations in solution. 131 Hence, molecular dynamics simulations have been successfully used to predict the energetically relevant conformational ensembles of cyclic peptides in solution, which compare favorably to available experimental data (e.g. NMR chemical shifts). 138 We refer the interested reader to a comprehensive review of computational methods to characterize the behavior of cyclic peptides in solution 131 and underline the synergistic effects of experimental and computational works.
Regarding the implications for cyclic peptide design, molecular dynamics simulations exceed experimental resolution and can provide insight into the structural interactions between the peptide and the target at an atomistic level of detail. For instance, macrocyclic peptides found in plants (cyclotides) have been experimentally shown to inhibit the aggregation of tau and amyloid-β42 fibrils.139 The peptide was subsequently docked onto 3D structures of Aβ42 fibrils and subjected to molecular dynamics simulations.140 The results explained the experimental observations, revealing that the Cter-M cyclotide from C. ternatea (GLPTCGETCTLGTCYVPDCSCSWPICMKN) binds the Aβ42 fibril via hydrogen bonding, hydrophobic, electrostatic and π-π interactions, thereby inhibiting aggregation.140 In particular, the peptide disrupts intermolecular hydrogen bonds and salt bridges in the Aβ42 fibril, which are crucial for its structural integrity. The effects occur within the first 50 ns of the simulations, with disruptions in the fibril secondary structure at residues 2-7 and 38-41 resulting in the loss of extended β-sheet conformations. Importantly, the Aβ42 fibril in the absence of the peptide maintains stable extended β-sheet conformations throughout the simulation trajectory.
Other approaches rely on available high resolution structures of protein complexes to identify linear interface motifs with appropriate distances between residues to facilitate subsequent cyclization.141 In particular, backbone motifs of epitopes within protein-protein interfaces were identified and compared against available cyclic peptide databases to pinpoint promising candidates with the desired structural features.141 Subsequently, the generated cyclic peptide-protein complexes were refined through molecular dynamics simulations in explicit solvent to determine the binder with the highest target affinity. To validate the efficacy of this method, initial tests were conducted on the complex formed by the bovine pancreatic trypsin inhibitor (BPTI) protein and the trypsin protease. The method identified a cyclic peptide that resembled the BPTI protein backbone at the interface, in agreement with experimentally known structures.
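The distance criterion for cyclization can be illustrated with a minimal sketch. It scans a chain of Cα coordinates for contiguous segments whose terminal residues are close enough in space to be linked. The function name, the segment length limits and the 6 Å cutoff are assumptions chosen for illustration and are not taken from ref. 141; the coordinates are synthetic.

```python
import numpy as np

def cyclizable_windows(ca_coords, min_len=5, max_len=12, max_gap=6.0):
    """Return (start, end, distance) for contiguous segments whose
    terminal C-alpha atoms lie within max_gap (in Angstrom)."""
    ca = np.asarray(ca_coords, dtype=float)
    n = len(ca)
    hits = []
    for start in range(n):
        # end index is inclusive; segment length is end - start + 1
        for end in range(start + min_len - 1, min(start + max_len, n)):
            dist = float(np.linalg.norm(ca[end] - ca[start]))
            if dist <= max_gap:
                hits.append((start, end, dist))
    return hits

# Synthetic example 1: an extended strand, consecutive C-alpha atoms 3.8 A apart.
strand = np.array([[3.8 * i, 0.0, 0.0] for i in range(10)])

# Synthetic example 2: eight residues on a ring, neighbours 3.8 A apart,
# so the first and last residues are spatially adjacent.
radius = 3.8 / (2.0 * np.sin(np.pi / 8))
angles = 2.0 * np.pi * np.arange(8) / 8
ring = np.column_stack([radius * np.cos(angles), radius * np.sin(angles), np.zeros(8)])

strand_hits = cyclizable_windows(strand)
ring_hits = cyclizable_windows(ring)
```

In this toy setting the extended strand yields no candidate segment, whereas the ring-like motif yields the full eight-residue segment. A realistic workflow would read interface residues from a structure file and would also consider side-chain orientation and linker chemistry.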
Despite extensive simulations, challenges remain when exploring the conformational space of IDPs both in the presence and in the absence of modulators.142 Convergence is an issue due to the rugged free energy landscapes of the polypeptides, their size (which at times imposes the use of large simulation boxes) and/or kinetic traps. Some of these difficulties are overcome by using enhanced sampling techniques, implicit solvents and/or coarse-grained models, which together with advances in computing power enable access to longer time- and length-scales. Alternative approaches include reducing the size of the system by simulating fragments of the polypeptide of interest and using statistical mechanics tools to derive the conformational free energies of the full IDP.143 Current force fields tend to over- or underestimate the properties of an IDP as compared to experimental quantities. Here, the IDP-tailored choices are the all-atom additive Charmm36m 144 and Amber ff14IDPSFF,145 which have been fine-tuned to reach experimental agreement and improve the conformational sampling of intrinsically disordered proteins.146 More recently, machine learning has been integrated into the development and improvement of force fields 147,148 and novel techniques are emerging for IDP-specific force fields. An example is Charmm-NN, which uses atom-type based neural networks to calculate energies and forces 149 and is subject to further improvements. A detailed overview of the challenges associated with IDP simulations and their reconciliation with experimental data is given in refs. 2 and 142. On the methodology side, the determination of the binding free energy of the cyclic peptide to the target also requires special attention. For instance, using free energy perturbation calculations, a popular method for small molecules, one can determine the relative binding free energies and mechanistic detail, while preserving the flexibility of the complex.150 Nevertheless, convergence remains an issue. Alternatively, umbrella sampling, a technique that provided valuable insight into the thermodynamics of monomer attachment to amyloid fibrils,143,151,152 would be a suitable choice for the determination of the binding free energy of a peptide to the target.
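To make the idea of a relative binding free energy concrete, the following minimal sketch applies the exponential averaging (Zwanzig) estimator to the two legs of a thermodynamic cycle, i.e. the same peptide mutation performed in the bound and in the unbound state. The function names and the energy differences are hypothetical placeholders. Production calculations typically split the perturbation into many intermediate windows and use more robust estimators, which is not shown here.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def zwanzig_free_energy(delta_u, temperature=300.0):
    """Free energy difference A -> B from energy differences U_B - U_A
    (kcal/mol) sampled in state A: dG = -kT ln < exp(-dU / kT) >_A."""
    beta = 1.0 / (KB_KCAL * temperature)
    x = -beta * np.asarray(delta_u, dtype=float)
    shift = x.max()  # log-mean-exp trick for numerical stability
    return float(-(shift + np.log(np.mean(np.exp(x - shift)))) / beta)

def relative_binding_free_energy(delta_u_bound, delta_u_free, temperature=300.0):
    """ddG_bind = dG(A -> B, bound) - dG(A -> B, free)."""
    return (zwanzig_free_energy(delta_u_bound, temperature)
            - zwanzig_free_energy(delta_u_free, temperature))

# Hypothetical, noise-free energy differences (kcal/mol) for illustration only.
du_bound = [1.0, 1.0, 1.0, 1.0]
du_free = [2.5, 2.5, 2.5, 2.5]
ddg = relative_binding_free_energy(du_bound, du_free)
```

With these placeholder values the mutation is 1.5 kcal/mol more favorable in the bound state, so `ddg` is negative and the mutated peptide would be predicted to bind more strongly. With real, fluctuating samples the estimate converges slowly when the two states overlap poorly, which is one origin of the convergence issue noted above.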

Computational methods for cyclic peptide design
Various computational methods have proven essential in the design of cyclic peptides for amyloidogenic targets. TANGO,153 an algorithm developed to identify amyloidogenic sequences in proteins, was used to guide the search for cyclic peptides with improved binding affinity to Aβ40 oligomers.154 Residues 102-117 (PRRYTIAALLSPYSWS) from the G strand of the protein transthyretin (TTR) were used as a starting point to generate a peptide that binds Aβ40 and redirects it towards protease-sensitive, nonfibrillar aggregates. The peptide was head-to-tail cyclized and TANGO was used to select specific mutations that would retain or stabilize the antiparallel two-stranded β-sheet, resulting in a series of cyclic G (CG) peptides. Out of the five newly synthesized peptides, CG8 (TKVVTpPRYTIAKLSSPYSYSQ)§ had the most pronounced affinity towards Aβ40, a result confirmed by ThT fluorescence analyses.154 Cyclization of CG8 via a disulfide bond using the simple cyclic peptide application (SCPA) 155 within ROSETTA and the addition of a second D-proline (TKVVTpPRYTIAKLSSpPSYSQ) led to increased peptide stability, enhanced conformational uniformity and a higher β-sheet content.156 These findings highlight the added value of cyclization and conformational homogeneity as design strategies.
Des3PI (design of peptides targeting protein-protein interactions) is a novel computational fragment-based approach for designing cyclic peptides with high target specificity.157 This algorithm performs docking calculations of an amino acid library onto the targeted protein surface and subsequently connects residues with favorable target binding affinities to generate novel cyclic peptide sequences and structures. We envision that the potential of this method can be exploited to the maximum when combined with quantitative representations from molecular dynamics simulations to generate novel amyloid-binding cyclic peptides.
Among the computational methods employed in designing peptides, FoldX emerges as a powerful tool due to its ability to determine the free energy contributions of each atom at protein interfaces based on its own position relative to neighbours in the complex.158 It can thereby predict the impact of mutations on protein stability and optimize protein sequences for improved stability and desired functional properties. Relying on FoldX to perform an exhaustive thermodynamic profiling, the tandem peptide CAP1 was designed to inhibit tau aggregation.159 Both in vitro and in vivo experiments confirmed computational predictions by showing that CAP1 binds with high specificity and affinity (EC50 = 145 ± 49 nM) to tau aggregates, impeding their spread within cells. Additionally, CAP1 proved effective in hindering the ability of tau polymorphs obtained from the brains of Alzheimer's disease patients to initiate aggregation.

Machine learning for cyclic peptide design, property and activity prediction
Machine learning enables the rapid in silico screening and development of small molecules with therapeutic applications.36 On the peptide engineering side, machine learning has found recent applications in antibody optimization 160,161 and enzyme evolution.34 Its application to cyclic peptide design is still in its early stages and requires accurate and reliable training data. For instance, conformational ensembles from molecular dynamics can be used to generate data for training sets, yielding machine learning models able to accurately predict structural ensembles of peptides and their complexes or to generate peptide sequences with improved physico-chemical properties. For improved performance and increased accuracy, experimental data (e.g. binding, toxicity) can be incorporated.
In fact, by using data from molecular dynamics simulations of cyclic pentapeptides with diverse sequences and structural attributes as training datasets, machine learning models have been trained to predict structural ensembles for novel cyclic-peptide sequences, a method known as structural ensembles achieved by molecular dynamics and machine learning (StrEAMM).162 Alternative methods rely on generating comprehensive training datasets comprising sequences of blood-brain barrier penetrating linear peptides (BBPs) sourced from established databases and scientific literature, alongside non-BBP peptides from UniProt, to predict and explore novel BBPs with improved properties.163
AbDiffuser introduced a diffusion model tailored for the generation of three-dimensional antibody structures and corresponding sequences for biotechnological applications.164 Large protein families can be reliably mapped to a sequence ordinate using sequence alignment. AbDiffuser is an equivariant diffusion model designed to take advantage of these properties. The model adheres to physics-based constraints (e.g. bonds, torsional angles) and can accommodate different sequence lengths, thereby reducing the memory complexity. AbDiffuser relies on the Aligned Protein Mixer (APMixer), a neural network operating within the SE(3) equivariance framework to ensure consistent behavior when subjected to rotations and translations in three-dimensional space. Validation of the predictions through in silico and in vitro work underlines the importance of computational and experimental synergies when designing molecules with tailored properties.
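The notion of SE(3) equivariance can be made concrete with a toy numerical check. The sketch below is not APMixer or any part of AbDiffuser; it uses a deliberately simple, hypothetical coordinate update (moving every atom halfway towards the centroid) and verifies that rotating and translating the input produces the same result as rotating and translating the output.

```python
import numpy as np

def contract_toward_centroid(coords, step=0.5):
    """Toy coordinate update: move every point a fraction `step`
    of the way towards the centroid of the point cloud."""
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0, keepdims=True)
    return coords + step * (centroid - coords)

def random_rotation(rng):
    """Random proper rotation matrix (determinant +1) via QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0  # flip one axis to turn a reflection into a rotation
    return q

rng = np.random.default_rng(0)
x = rng.normal(size=(12, 3))      # 12 hypothetical atom positions
rot = random_rotation(rng)
shift = rng.normal(size=3)

# Transform first, then update, versus update first, then transform.
transformed_then_updated = contract_toward_centroid(x @ rot.T + shift)
updated_then_transformed = contract_toward_centroid(x) @ rot.T + shift
is_equivariant = np.allclose(transformed_then_updated, updated_then_transformed)
```

Both orders give the same coordinates, so the update is equivariant. A network built from such operations does not need to learn separately how a structure looks in every orientation, which is the practical benefit of the equivariance framework.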
Within the landscape of neurodegenerative diseases, MobiDB emerges as a resource that provides a comprehensive view of polypeptide disorder.165 This repository compiles an array of data related to intrinsically disordered proteins (IDPs) and regions (IDRs), encompassing both experimental and computational information on protein disorder (e.g. sequences, structures and functional annotations). Its utility extends to experimental scientists seeking detailed information about individual protein systems, as well as to bioinformaticians who require substantial, unified protein datasets for building statistical classifiers. More recently, MobiDB has integrated AlphaFold predictions sourced from AlphaFoldDB.166
A recent study highlighting the synergy between modern computational techniques and experiments focused on developing a versatile method for designing proteins capable of targeting specific peptide sequences derived from armadillo proteins.167 Using no known structure, Monte Carlo simulations were employed to construct a hash table for bidentate side-chain-backbone interactions, to ensure the stability of the desired protein-peptide interface. The identified key residues were optimized using Rosetta to construct both the protein and peptide sequences while keeping the identified residues unchanged. To enhance the binding affinity and specificity, alanine scanning was performed and the binding free energies were determined to select the most favorable binders, which were validated by experimental techniques (e.g. X-ray crystallography, circular dichroism and biolayer interferometry). For IDPs, a similar approach may aid in the initial generation of polypeptide-cyclic peptide complexes that can then be investigated and optimized via molecular dynamics simulations.
§ Small letters indicate D-enantiomeric amino acids.
An alternative approach known as hallucination relies on reversing deep neural networks trained to predict native protein structures, to design novel protein sequences and structures.168 Briefly, the parameters of protein structure prediction networks encode learned representations and patterns that capture various aspects of protein structures, including amino acid interactions and statistical relationships; this information is used to create realistic protein backbones and their corresponding amino acid sequences. First, random amino acid sequences are input into the trRosetta structure prediction network 169 to predict distance maps. Then Monte Carlo sampling is employed in residue space to refine the sequences and improve the predicted structures. This process generates a diverse array of proteins with varied sequences and structures. To validate the physical manifestation of these hallucinations, synthetic genes for 129 hallucinated proteins were expressed and purified. Among these, 27 proteins exhibited circular dichroism spectra consistent with the target structures, and the resolved three-dimensional structures of three selected proteins matched the hallucinated models, underlining the potential of the method in de novo protein design.
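The Monte Carlo refinement step can be sketched as follows. In the actual method the objective is derived from the output of the structure prediction network; here it is replaced by `toy_score`, a hypothetical stand-in that simply rewards an alternating hydrophobic/polar pattern, so the example only illustrates the sampling loop (single-residue mutations accepted with the Metropolis criterion) and not the published objective. All names, the temperature and the number of steps are assumptions.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVWY")

def toy_score(seq):
    """Hypothetical stand-in for a network-derived objective (higher is better):
    counts positions that follow an alternating hydrophobic/polar pattern."""
    return sum((aa in HYDROPHOBIC) == (i % 2 == 0) for i, aa in enumerate(seq))

def mc_optimize(seq, steps=2000, temperature=0.5, seed=0):
    """Metropolis Monte Carlo in sequence space with single-residue mutations."""
    rng = random.Random(seed)
    current = list(seq)
    cur_score = toy_score(current)
    best, best_score = current[:], cur_score
    for _ in range(steps):
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice(ALPHABET)
        new_score = toy_score(current)
        accept = (new_score >= cur_score or
                  rng.random() < math.exp((new_score - cur_score) / temperature))
        if accept:
            cur_score = new_score
            if cur_score > best_score:
                best, best_score = current[:], cur_score
        else:
            current[pos] = old  # reject the move and restore the residue
    return "".join(best), best_score

start_rng = random.Random(1)
start = "".join(start_rng.choice(ALPHABET) for _ in range(30))  # random starting sequence
designed, designed_score = mc_optimize(start)
```

Starting from a random sequence, the loop gradually accumulates mutations that improve the objective while occasionally accepting unfavorable ones, which helps it escape local optima.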
Chroma introduced a generative approach to design peptides with customized structures and functions.170 It employs a diffusion process that incorporates conformational statistics of polymer ensembles (e.g. dihedral angles, bonds) and a neural architecture for molecular systems based on random graph neural networks. The model can be conditioned via external constraints (e.g. symmetries, substructures and natural language prompts) to generate proteins with specific properties, including inter-residue distances, distinct structural domains and semantic properties guided by classifiers.
A recent investigation explored the synergistic potential of integrating advanced deep learning methods with a Rosetta-based approach to enhance the accuracy and efficacy of designed protein sequences binding to specific target molecules.171 The success rate is defined by the Cα root mean squared deviations of the binder between structures generated with AlphaFold2 126 or RoseTTAFold 172 and Rosetta-designed structures. Large differences between them, i.e. deviations larger than 2.0 Å, indicate potential design pitfalls for protein binders. Complemented by confidence metrics from pairwise atomic environment predictions, this criterion separates successful binders from those that do not perform well. The results show that using AlphaFold2 or RoseTTAFold as evaluation filters in the protein design process increases the design success rate 10-fold as compared to Rosetta.
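The structural agreement criterion can be illustrated with a short sketch that superimposes two sets of Cα coordinates with the Kabsch algorithm and applies the 2.0 Å cutoff mentioned above. The function names are hypothetical and the coordinates are synthetic; in practice the two coordinate sets would come from the designed model and from the structure prediction network, and the published protocol combines this criterion with confidence metrics that are not reproduced here.

```python
import numpy as np

def kabsch_rmsd(p, q):
    """C-alpha RMSD (same units as input) after optimal rigid superposition of p onto q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    pc = p - p.mean(axis=0)
    qc = q - q.mean(axis=0)
    u, _, vt = np.linalg.svd(pc.T @ qc)
    # Correct for a possible reflection so that the result is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = pc @ rot.T - qc
    return float(np.sqrt((diff ** 2).sum() / len(p)))

def passes_filter(designed_ca, predicted_ca, cutoff=2.0):
    """True if the predicted binder stays within `cutoff` Angstrom of the design."""
    return kabsch_rmsd(designed_ca, predicted_ca) <= cutoff

# Synthetic data: a rigidly moved copy should pass, a strongly perturbed copy should fail.
rng = np.random.default_rng(0)
designed = rng.normal(scale=10.0, size=(30, 3))
angle = np.deg2rad(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
predicted_good = designed @ rot_z.T + np.array([5.0, -3.0, 2.0])
predicted_bad = predicted_good + rng.normal(scale=5.0, size=(30, 3))

good_ok = passes_filter(designed, predicted_good)
bad_ok = passes_filter(designed, predicted_bad)
```

The superposition step matters because the two models are generally expressed in different coordinate frames; without it, a perfectly reproduced design could be rejected merely for being rotated or shifted.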
Other strategies integrate RoseTTAFold 172 into denoising diffusion probabilistic models (DDPMs) to design novel proteins with specific structural or functional attributes.173 This effort gives rise to RFdiffusion,174 which incorporates RoseTTAFold as a denoising network within a generative diffusion model. Briefly, protein backbones are created from scratch by initializing frames of random residues and RFdiffusion is used to produce a refined and denoised prediction. Subsequently, sequences for these structures are generated employing the ProteinMPNN network.175 RFdiffusion predictions can be optimized by incorporating additional information (e.g. partial sequence and fold data) and enhanced through pre-trained weights and the application of loss functions.
Novel methods for cyclic peptide generation and design are rapidly emerging and might prove to be useful in the amyloidogenic polypeptide landscape. For instance, RINGER is a novel macrocycle conformer generator, a diffusion-based transformer model tailored to generate conformers of peptide macrocycles with specific sequences.176 AlphaFold has recently been modified to predict the structure of macrocyclic peptides (AfCycDesign), which have then been experimentally validated.177 On the coarse-grained side, CycloPep emerges as a powerful tool to generate cyclic peptides compatible with the MA(R/S)TINI force field.178

A powerful trio: simulations, experiments and machine learning
The integration of computer simulations and experiments into machine learning powered engines enables the design, optimization and validation of custom protein-binding agents in an informed, fast and robust way. Taken together, these techniques have the necessary ingredients to generate novel, amyloid-specific and effective cyclic peptide binders, and hence make the next substantial step in the design of cyclic peptides as therapeutic agents or biomarkers against neurodegenerative diseases.
Following the recipe introduced throughout this paper, there are at least four ingredients required for the successful de novo design of peptides binding amyloidogenic targets (Fig. 3). First, the target scaffold and, in particular, quantitative distributions of conformations of this scaffold are necessary elements.127 Available three-dimensional high resolution structures are excellent candidates; in their absence, however, deep learning based methods such as AlphaFold2,126 RoseTTAFold 122 or Chroma 170 can accurately predict 3D models of protein structures even under user-specified environmental conditions.179 For IDPs, structure prediction is more challenging because of their native disorder, characterised by a rugged free energy landscape.127 Fortunately, existing or predicted structures can be investigated to obtain quantitative conformational distributions using molecular dynamics simulations at full atomistic resolution (if the system size allows) or at coarse-grained level (when dealing with bigger targets or aggregates).2 For the latter, different methods can be employed to reinstate atomistic detail.180,181 These approaches enable the extraction of the statistically relevant states of the target, to be used in subsequent steps.
Second, the quantitative characterization of the conformations populated by the cyclic peptides in the target-bound versus target-unbound states is a factor to be accounted for. The size of the peptides (below 20 residues) and their cyclic nature often limit their structure generation or even the complex prediction via deep learning approaches such as AlphaFold-Multimer 182 or via experimental techniques.131 Assuming that the initial peptide-protein complex is known, e.g. from crystal structures 183 or from de novo design,184 one can isolate the peptide from the complex and explore its conformational space in the unbound state via (enhanced sampling) molecular dynamics simulations at full atomistic resolution.131 For a cyclic peptide with unknown bound and unbound conformations, a convenient approach to obtain statistically meaningful conformations in solution is to generate its structure by building its residues in an excluded volume-obeying manner,185 and sampling its conformational space via Monte Carlo simulations and/or relaxing it using (enhanced sampling) molecular dynamics simulations. Nevertheless, the latter contains no information on the conformations sampled by the peptide when bound to the target, which may represent a bottleneck when trying to dock it to the target.
Third, a key aspect is the structure of the complex, which aids in understanding what type of interactions drive the assembly and which residues contribute the most to peptide binding and complex stability. Experimentally, a series of crystal structures of cyclic peptide-protein complexes have been resolved,183 but none in complex with amyloidogenic targets. The thermodynamics and kinetics of peptide binding can be tested using methods such as surface plasmon resonance or isothermal titration calorimetry, but neither provides specific information on the binding epitope. Computationally, if the representative 3D structures are known, the peptide can be rationally designed and/or docked onto the target, and enhanced sampling or deep learning techniques can be employed to extract its binding free energy.186 Alternatively, in the absence of known structures and/or for unstably bound complexes, long molecular dynamics simulations could potentially reveal new binding sites. This approach may be efficient if the amyloidogenic target has a well defined secondary structure, as is the case for PrPC, or has druggable pockets. However, for polypeptides with a high degree of plasticity this is a resource intensive and potentially ineffective strategy, which would only slow down peptide design. Machine learning can facilitate the design of peptides and, when corroborated by simulations and/or experiments, can aid in the estimation of binding affinities 187 and improve the peptide sequence for optimal binding to the target.171,188 Hence, if combined in an effective manner, computer simulations and machine learning can considerably increase peptide design and optimization efficiency, and can therefore speed up drug development.
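When comparing computed binding free energies with measured affinities, it helps to place both on the same scale. A minimal sketch, assuming a simple two-state binding equilibrium with a 1 M standard state, converts between a dissociation constant and the standard binding free energy via ΔG° = RT ln(Kd/c°). The function names are hypothetical and the example values are illustrative, not data from the studies cited above.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)

def kd_to_dg(kd_molar, temperature=298.15, standard_conc=1.0):
    """Standard binding free energy (kcal/mol) from a dissociation constant (M)."""
    return R_KCAL * temperature * math.log(kd_molar / standard_conc)

def dg_to_kd(dg_kcal, temperature=298.15, standard_conc=1.0):
    """Dissociation constant (M) from a standard binding free energy (kcal/mol)."""
    return standard_conc * math.exp(dg_kcal / (R_KCAL * temperature))

dg_nanomolar = kd_to_dg(1e-9)    # a 1 nM binder
dg_micromolar = kd_to_dg(1e-6)   # a 1 uM binder
```

At room temperature a 1 nM binder corresponds to roughly -12.3 kcal/mol and a 1 µM binder to roughly -8.2 kcal/mol, so each ten-fold change in affinity shifts the free energy by about 1.4 kcal/mol. This sets the accuracy that a computational estimate must reach to rank candidate peptides reliably.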
The fourth ingredient prior to clinical advancement is the experimental in vitro and in vivo validation. Given the complementarity of computational and experimental work, an attractive approach would be to integrate the trio, i.e. simulations, machine learning and experiments, into a dynamic and iterative engine. For instance, molecular dynamics simulations and deep learning could first be used to predict and optimize protein and peptide conformations, stability and binding affinities, aiding in the selection of lead candidates prior to experimental validation. Then results from the trio can be incorporated into feedback loops 37,189,190 that would allow the design of novel and improved peptide sequences and the prediction of cyclic peptide bioactivity, target selectivity and off-target effects, thus aiding in the faster identification of potent and safe candidates. Hence, the integration of such methodologies can aid the design and optimization of novel experiments and computational work. Furthermore, such an approach can significantly reduce the number of experiments that are required for validation and can increase the homogeneity across the experimental data sets (e.g. environmental conditions).39
Fig. 3 Schematic overview of the proposed de novo peptide design strategy against amyloidogenic targets. The first steps involve the preparation of the structures and the identification of quantitative conformational distributions of the target (top left panel) and the binder (bottom left panel). After preparing the two structures, the peptide can be docked onto the monomeric or multimeric target to determine the structure of the complex. Alternatively, the complex can be rationally designed starting from high resolution structures (middle panel). Next, the binding of the peptides is computationally and experimentally tested. Importantly, the extracted data (e.g. binding free energies, kinetics) can be incorporated in feedback loops powered by machine learning engines (e.g. active learning cycle) to improve the peptide sequences and/or properties. After several cycles, the best binders are advanced into pre-clinical validation. Created with BioRender.com.
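The feedback loop described above can be sketched as a simple active learning cycle. Everything in this example is a hypothetical placeholder: `mock_assay` stands in for an experiment or a free energy calculation, the surrogate model is a plain ridge regression on one-hot encoded sequences, and the candidate pool consists of random 8-residue sequences. The sketch only shows the flow of information: measure a few candidates, train a model, let the model choose the next batch, and repeat.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(seq):
    """Flattened one-hot encoding of a peptide sequence."""
    x = np.zeros((len(seq), len(ALPHABET)))
    for i, aa in enumerate(seq):
        x[i, ALPHABET.index(aa)] = 1.0
    return x.ravel()

def mock_assay(seq):
    """Hypothetical stand-in for a measurement or simulation (higher is better)."""
    return sum(1.0 for aa in seq if aa in "FWY") - 0.5 * seq.count("P")

def fit_ridge(features, targets, lam=1.0):
    """Closed-form ridge regression weights."""
    a = features.T @ features + lam * np.eye(features.shape[1])
    return np.linalg.solve(a, features.T @ targets)

def active_learning(pool, n_init=10, batch=5, rounds=4, seed=0):
    """Return {pool index: measured value} after iterative model-guided selection."""
    rng = np.random.default_rng(seed)
    features = np.array([one_hot(s) for s in pool])
    first = rng.choice(len(pool), size=n_init, replace=False)
    measured = {int(i): mock_assay(pool[int(i)]) for i in first}
    for _ in range(rounds):
        idx = list(measured)
        weights = fit_ridge(features[idx], np.array([measured[i] for i in idx]))
        predicted = features @ weights
        predicted[idx] = -np.inf          # never re-select measured candidates
        for i in np.argsort(predicted)[::-1][:batch]:
            measured[int(i)] = mock_assay(pool[int(i)])   # "run the experiment"
    return measured

pool_rng = np.random.default_rng(1)
pool = ["".join(pool_rng.choice(list(ALPHABET), size=8)) for _ in range(200)]
results = active_learning(pool)
best_index = max(results, key=results.get)
```

Only 30 of the 200 candidates are evaluated in this toy run, which mirrors the intended benefit of reducing the number of experiments. A practical engine would use a richer model, account for prediction uncertainty when choosing the next batch, and feed in heterogeneous data such as binding free energies, kinetics and toxicity.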

Perspectives and outlook
In the last 20 years more than 16 cyclic peptide based therapeutics have been FDA- and EMA-approved, mainly as antibiotics, anticancer therapeutics, antifungals and immunomodulators.80,191 Despite extensive research, no cyclic peptide-based drug for neurodegenerative diseases has passed clinical trials. The challenges arise from the intrinsic disorder of the targets, which lack traditional long-lived druggable pockets,90 the limited understanding of the associated mechanisms and the often ineffective integration and active feedback between disciplines. Here, we reviewed the strategies for cyclic peptide design against amyloid-forming targets from a computational perspective and emphasized the potential of the interconnection between computer simulations, experiments and machine learning in anti-amyloid cyclic peptide design for therapeutic, imaging and diagnostic applications. As such, we proposed a recipe which can function like a digital twin, i.e. creating scenarios relying on available information to improve performance and prevent design flaws, allowing for rapid analysis and accurate predictions.192 Essentially, in cyclic peptide design the digital twin would rely on information from computational and experimental findings to simulate the effect of a cyclic peptide-based drug on an amyloidogenic target, while enhancing the design and optimization of future peptide-based drugs with better targeting abilities, reduced risk and lower cost. Clearly, the integration is not effortless and requires the efficient incorporation of extensive data from both experiments and simulations (e.g. binding constants, toxicity assays, morphological effects etc.) to be constantly exchanged between the physical and virtual machine. Importantly, while simulations essentially act as digital twins by themselves, the incorporation of homogeneous experimental data via machine learning powered engines can improve predictions, making the next substantial step in (peptide-based) drug design.
The concepts and proposed strategies extend beyond drug design for therapeutic applications and hold the potential to aid in adjacent fields such as (bio)material design or controlled drug delivery.193 In essence, it all boils down to gathering and smartly processing information from diverse sources to create a digital correspondent of a material that captures its composition, structure, responsiveness to external stimuli etc. 194 and to generate design rules for programmable and adaptable materials.195

Fig. 2 AlphaFold2 predictions for (a) amyloid-β42, (b) α-synuclein, (c) hIAPP, (d) PrPC and (e) tau. The structural elements are color-coded according to the confidence level of their AlphaFold2 prediction, red to blue for low to high confidence intervals, respectively.
rapid and consistent advancement in amyloid-related drug discovery. An example of such a publicly available database is the Molecular Dynamics Data Bank, the European Repository for Biosimulation Data.

Table 1 Computational studies of amyloidogenic polypeptides in complex with different agents
This journal is © The Royal Society of Chemistry 2024