Lisa J. Lapidus*
Department of Physics and Astronomy and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA. E-mail: lapidus@msu.edu
First published on 9th October 2012
Much work in recent years has been devoted to understanding the complex process of protein aggregation. This review looks at the earliest stages of aggregation, long before the formation of fibrils that are the hallmark of many aggregation-based diseases, and proposes that the first steps are controlled by the reconfiguration dynamics of the monomer. When reconfiguration is much faster or much slower than bimolecular diffusion, then aggregation is slow, but when they are similar, aggregation is fast. The experimental evidence for this model is reviewed and the prospects for small molecule aggregation inhibitors to prevent disease are discussed.
Lisa Lapidus obtained her PhD in atomic physics from Harvard University in 1998, then made the dramatic switch to biophysics when she joined the lab of William Eaton at the NIH. She also trained with Steven Chu at Stanford University before joining the faculty at Michigan State University in 2004. In addition to investigating intramolecular diffusion in unfolded proteins, her lab develops new techniques to study protein folding using optical spectroscopy and microfluidics.
Furthermore, it is increasingly understood that aggregation must be considered within the context of a cell.7 Certainly the crowded conditions in the cytoplasm will affect the likelihood that two proteins will interact, leading to aggregation, and may also affect the stability and folding pathways of some proteins.8 But a cell also contains a set of housekeeping genes to prevent, reverse or divert aggregation.9 A recent study shows that one function of chaperones, besides preventing and reversing misfolding, is to bind to small, toxic oligomers to promote assembly to larger, non-toxic oligomers, thus preserving the cell at the cost of the protein.10 The balance of protein production, chaperone activation and proteolysis has been termed proteostasis, and it appears that misfolding and aggregation of one protein can induce misfolding and aggregation of other proteins by putting the proteostasis network out of balance.11–13 A new model of this network, FoldEco, shows that a cell is willing to expend large amounts of energy and discard a lot of misfolded protein to keep this balance.14 The basic implication of the proteostasis hypothesis is that an imbalance of this network leads to diseases such as Alzheimer's and Parkinson's. The fact that many of these aggregation-based diseases typically occur late in life, together with the fact that aggregation seems to be a generic process for all proteins, suggests that aging causes proteostasis to break down, leading to disease of the most aggregation-prone proteins. There is evolutionary pressure on protein sequences to avoid aggregation and for the network to rescue misfolded and aggregated proteins, but only over the reproductive lifetime of an organism, far short of the actual lifetime of modern humans.15,16
Nevertheless, the heart of this entire problem of non-functional aggregation is the physical process in which two or more proteins clump together in a non-specific and uncontrolled way, and a basic question is why some sequences aggregate and others do not. Several groups have attempted to rationalize the relationship between primary sequence and aggregation propensity,17 producing, for example, the Tango18 and the Zyggregator19 algorithms, which calculate a number of properties of the sequence, including hydrophobicity, charge and secondary structure propensity, to create an aggregation metric. Generally, sequences that have high hydrophobicity, low net charge and high β-sheet propensity are more likely to aggregate. While these algorithms are successful at predicting aggregation-prone sequences or regions of sequences, they, like protein structure prediction algorithms, do not describe the pathway of protein assembly in any detail, such as when or how monomers adopt structures that are likely to stick to other monomers.
Aggregation is often thought to be a process by which proteins adopt specific, non-native conformations that allow bimolecular association that is difficult to reverse (see ref. 20 for an excellent review of the field). Most analyses of aggregation kinetics indicate that the critical nucleus for the first step of aggregation is a monomer, though there may be a second critical nucleus for initiating fibril formation. Some proteins appear to aggregate from native-like conformations,21,22 but most of the proteins and peptides associated with aggregation-based disease are either intrinsically disordered or have intrinsically disordered regions (e.g. Aβ, α-synuclein, IAPP and prion). A number of studies have found structured intermediates as the aggregation precursor, but the difficulty in characterizing short-lived (less than a few milliseconds) intermediates suggests that some of these precursors could be subsets of the unfolded state under folding conditions.23–25 Furthermore, the dynamic stability of these states is usually unknown, so it is unclear whether there really is a single aggregation-competent precursor for most proteins. If aggregation results from an ensemble of kinetically accessible conformations, then the reconfiguration time between these conformations must affect the rate of aggregation and perhaps the morphology of the resulting aggregates as well.
There are numerous papers on the structure and kinetics of the later, slower phases of aggregation as well as discussion as to which phases are toxic in aggregation-based diseases. This review will focus on the earliest steps in aggregation, bimolecular association and the formation of low molecular weight oligomers, because regardless of how aggregation progresses and which step is toxic, all aggregation must start from these first steps and prevention of bimolecular association will certainly prevent subsequent disease. In particular I present the unusual hypothesis that aggregation is kinetically controlled by reconfiguration of the protein monomers.
The simplest models of aggregation based on reconfiguration are shown in Fig. 1. The unfolded protein ensemble under aggregating conditions consists of M and M*, where M* denotes aggregation-competent conformations (i.e. they have solvent-exposed hydrophobic residues). The diffusive reconfiguration rates between these two states are approximately equal, k1 ≈ k−1, because reconfiguration is primarily driven by diffusion and there are no significant energetic barriers between the two states. The reconfiguration rate can be estimated by assuming it is the rate to diffuse one point on the chain across the chain diameter, k1 ∼ 4D/(2RG)², where D is the intramolecular diffusion coefficient and RG is the radius of gyration of the chain; it has been observed to range from ∼10⁴ to 10⁷ s⁻¹ (see below for details of experimental observations). Since there is no stable native structure, this rate will primarily be determined by the total hydrophobicity and hydrophobic pattern in the sequence. Two M* monomers may come into close contact at the bimolecular diffusion-controlled rate, kbi. While both monomers are in the encounter complex, two things may happen: they may make stabilizing bimolecular interactions that lead to an oligomeric state O, or one or both monomers may reconfigure to M, which makes stabilizing bimolecular interactions difficult, and the complex comes apart. The rate of forming O is slow compared to the other rates and may depend on many factors such as large-scale structural rearrangement within the complex (scheme i) and/or the addition of another monomer (scheme ii). This scheme produces three kinetic regimes. When k1 ≫ kbi, reconfiguration is so fast that the chain does not reside in M* long enough for the encounter complex to make stabilizing interactions. When k1 ≪ kbi, two M* are very unlikely to come together, so the encounter complex is rarely made. But when these rates are similar, the encounter complex forms fairly often and proceeds to the oligomeric state without coming apart.
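The estimate of k1 and the three regimes described above can be sketched in a few lines of Python. This is only an illustration: the function names, the unit choices (D in cm² s⁻¹, RG in Å), the example inputs and the factor-of-ten `window` separating the regimes are assumptions made for this sketch, not quantities defined by the model itself.

```python
def reconfiguration_rate(d_cm2_per_s, rg_angstrom):
    """Estimate k1 ~ 4D/(2*RG)^2 in s^-1.

    d_cm2_per_s : intramolecular diffusion coefficient D, in cm^2/s
    rg_angstrom : radius of gyration RG, in angstrom
    """
    rg_cm = rg_angstrom * 1e-8  # 1 angstrom = 1e-8 cm
    return 4.0 * d_cm2_per_s / (2.0 * rg_cm) ** 2


def kinetic_regime(k1, kbi, window=10.0):
    """Label the regime from the ratio k1/kbi.

    `window` is an assumed cutoff: ratios within a factor of `window`
    of 1 are labelled "middle" (the aggregation-prone regime).
    """
    ratio = k1 / kbi
    if ratio > window:
        return "fast"
    if ratio < 1.0 / window:
        return "slow"
    return "middle"


if __name__ == "__main__":
    # Hypothetical chain: D = 1e-6 cm^2/s and RG = 20 angstrom.
    k1 = reconfiguration_rate(1e-6, 20.0)
    print(f"k1 ~ {k1:.2e} s^-1, regime: {kinetic_regime(k1, kbi=1e5)}")
```

The hard cutoff is a simplification; the model itself describes a gradual change in aggregation rate as k1 moves away from kbi.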
Fig. 1 Kinetic models (i–iii) describing the early phases of aggregation. For all schemes, kbi = 1 × 10⁵ s⁻¹, kO = 100 s⁻¹ and k1 as marked on the plots. (a) Solution to scheme (i), in which O forms by conformational change of [M*M*] and k−1 = k1. (b) Solution to scheme (ii), in which O forms by the addition of a third monomer, M*, to [M*M*] and k−1 = 0.5k1. (c) Solution to schemes (ii) and (iii) for knuc = 1 s⁻¹ and kf = 100 s⁻¹.
There is some evidence to support this model from other groups in the field. Early simulations of fibril formation showed that dimers formed in a two-step process whereby hydrophobic contacts form rapidly and then backbone hydrogen bonds zip up the structure through a longer relaxation process, thereby kinetically trapping some conformations.26 Experiments on mutants of superoxide dismutase 1 (SOD1) also support a two-step aggregation model, albeit with preformed amyloid.27 An encounter complex as described above has been observed in simulations of Aβ42, which find that the second, accommodation, step reduces the free energy of the dimer.28,29 Analysis of polyglutamine aggregation also supports the formation of disordered dimers over homogeneous nucleation of β-sheet fibrils from monomers.30,31 Measurements of polyQ with different residues interrupting the sequence also show that the exact monomer conformational ensemble affects the aggregation kinetics.31 Even the elongation of Aβ16–22 fibrils shows a two-step “dock and lock” mechanism in which a disordered monomer joins the fibril before adopting the cross-β conformation.32
Fig. 1a shows the solutions of kinetic scheme (i) using kbi = 1 × 10⁵ s⁻¹, a reasonable rate for a solution of ∼100 μM proteins of average molecular weight in water, and a range of k1. Qualitatively, the rate of aggregation is slower for very low and very high k1, but the range over which aggregation is most rapid is not quite k1 = kbi. However, if the formation of O requires the addition of another M*, as shown in scheme (ii), then the solutions are in better quantitative agreement with observation and aggregation is rapid only when k1 and kbi are within an order of magnitude of each other. The addition of more monomers to create O raises the value of k1 at which aggregation propensity is maximal. Finally, if the rate k−1 is slightly lower than k1 (k−1 = 0.5k1) in scheme (ii), then maximal aggregation occurs for k1 = 10⁵ s⁻¹ (Fig. 1b). Increasing kbi through increased concentration (such as in a cell) also raises the k1 at which aggregation propensity is maximal. The rate of forming O is assumed to be slower than reconfiguration, but an increased kO does not affect the overall trend of maximal oligomerization occurring when k1 ∼ kbi. It is also possible that reconfiguration within the encounter complex is slower than k−1 because bimolecular association with another monomer presents a drag on reconfiguration. Solving scheme (ii) with an escape rate from [M*M*] of 0.1k1 increases the oligomer formation rate but does not change the k1 at which it is maximal.
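The concentration dependence of kbi can be made concrete with a short sketch. It assumes that kbi is a pseudo-first-order rate that scales linearly with protein concentration, and it uses a second-order constant of 10⁹ M⁻¹ s⁻¹, which is simply the value implied by kbi = 10⁵ s⁻¹ at ∼100 μM; neither the function name nor the other concentrations come from the text.

```python
def pseudo_first_order_kbi(conc_molar, k_second_order=1e9):
    """Encounter rate kbi (s^-1) at a given protein concentration (M).

    k_second_order (M^-1 s^-1) defaults to the value implied by
    kbi = 1e5 s^-1 at 100 uM; it is inferred here, not measured.
    """
    return k_second_order * conc_molar


if __name__ == "__main__":
    # Hypothetical concentrations, in mol/L.
    for conc in (10e-6, 100e-6, 1e-3):
        kbi = pseudo_first_order_kbi(conc)
        print(f"{conc * 1e6:7.0f} uM -> kbi ~ {kbi:.1e} s^-1")
```

Under this assumption a tenfold increase in concentration moves the most aggregation-prone k1 up by roughly the same factor.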
Once in O, diffusive reconfiguration is too slow to be relevant to subsequent structural changes, which usually lead to a fibrillar state. The simplest model of fibril nucleation and propagation is shown in scheme (iii) and Fig. 1c; for illustration it uses a slow nucleation rate (knuc), in which the oligomer undergoes a conformational change to a nascent fibril, and a more rapid fibrillation rate (kf), in which oligomers add to the end of a growing fibril. Alternatively, the fibril could grow by adding monomeric M* instead, which results in slightly more gradual fibril formation but little change in the rates. However, the real process could easily be more complicated. It would depend on protein sequence, solution conditions and other factors that have been shown to affect morphology. It is likely that the sequence effects that determine the monomeric reconfiguration rate also affect rates of nucleation or fibrillation, leading to the much larger differences in lag times and rise times between aggregation-prone and well-behaved proteins that are typically observed. For example, α-synuclein will fibrillize in several days while an IDP such as aaVLEA has never been observed to form fibrils at all, yet their reconfiguration rates differ only 10-fold.
At first glance this model seems unlikely to be realistic because reconfiguration times for proteins are conventionally thought to be 10–100 ns, much faster than bimolecular diffusion times.33 This estimate is based on measurements of intramolecular diffusion in disordered peptides,1,34 but more recent studies, outlined below, have observed a much larger range. While the physical basis for the sequence dependence of reconfiguration is still unknown, it could be that both the overall hydrophobicity and the hydrophobic pattern of the sequence are the determining factors. For all of these sequences, the measurements are made on a broad ensemble of conformations in water that should be determined, at least in part, by hydrophobic interactions between different parts of the chain. Thus a sequence that can make large hydrophobic clusters between distant points on the chain may be less diffusive than one that has hydrophobic residues distributed evenly along the sequence, which would in turn be less diffusive than a sequence with very few hydrophobes.
Fig. 2 Determination of the rate of contact formation between the probe, tryptophan (W), and the quencher, cysteine (C), within an unfolded protein. Pulsed optical excitation leads to population of the lowest excited triplet state of tryptophan. Tryptophan contacts cysteine in a diffusion-limited process with rate kD+, and then either diffuses away or is quenched by the cysteine. The observed rate of Trp triplet decay is given by
$$k_{obs} = \frac{k_{D+}\,q}{k_{D-} + q} \qquad (1)$$
where kD+ is the rate of diffusion of the two ends towards each other, kD− is the rate of diffusion away and q is the rate of quenching. If q ≫ kD−, the observed rate reduces to kobs ≈ kD+. More generally, eqn (1) can be rearranged to give
$$\frac{1}{k_{obs}} = \frac{k_{D-}}{q\,k_{D+}} + \frac{1}{k_{D+}} = \frac{1}{k_R(T)} + \frac{1}{k_{D+}(T,\eta)}$$
where kR is the reaction-limited rate and kD+ is the diffusion-limited rate (ref. 1), T is the temperature and η is the viscosity of the solvent. The reaction-limited and diffusion-limited rates are given by Szabo, Schulten and Schulten theory, which describes the dynamics of unstructured peptides as diffusion on a one-dimensional potential of mean force that is related to the distribution P(r) of Trp-Cys distances r (ref. 1),
$$k_R = \int_a^{l_c} q(r)\,P(r)\,dr, \qquad \frac{1}{k_{D+}} = \frac{1}{k_R^2 D}\int_a^{l_c}\frac{dr}{P(r)}\left[\int_r^{l_c}\left(q(x) - k_R\right)P(x)\,dx\right]^2$$
where q(r) is the rate of quenching at r (which has been determined experimentally by Lapidus et al., ref. 4), a is the point of closest approach (typically 4 Å), lc is the contour length of the chain between Trp and Cys (and, hence, their maximum separation), and D is the effective intramolecular diffusion coefficient. P(r) could be given by a simple polymer model such as a wormlike chain, by an empirical distribution based on experimental measurements or by a molecular dynamics simulation. Generally, both diffusion-limited and reaction-limited rates are inversely proportional to the average chain volume and the diffusion-limited rate is directly proportional to the diffusion coefficient (ref. 6).
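The relations behind Fig. 2 can be explored numerically. The sketch below assumes the usual two-step result kobs = kD+·q/(kD− + q), which splits into 1/kobs = 1/kR + 1/kD+ with kR = q·kD+/kD−, and it evaluates the Szabo, Schulten and Schulten integrals in the form written in the code comments. The Gaussian-chain P(r), the exponential form of q(r) and every numerical value except a = 4 Å are hypothetical choices made only for illustration.

```python
import math


def k_obs_two_step(k_d_plus, k_d_minus, q):
    """Observed Trp triplet decay rate for the two-step scheme."""
    return k_d_plus * q / (k_d_minus + q)


def split_rates(k_d_plus, k_d_minus, q):
    """Return (k_R, k_D+) such that 1/k_obs = 1/k_R + 1/k_D+."""
    return q * k_d_plus / k_d_minus, k_d_plus


def gaussian_chain(mean_sq_r):
    """Unnormalised Gaussian-chain distance distribution (assumed model)."""
    return lambda r: r * r * math.exp(-1.5 * r * r / mean_sq_r)


def exponential_quench(q0, beta, a):
    """Assumed distance-dependent quenching rate q(r) = q0*exp(-beta*(r-a))."""
    return lambda r: q0 * math.exp(-beta * (r - a))


def sss_rates(p, q_of_r, a, l_c, d_coeff, n=2000):
    """Trapezoid-rule evaluation of
        k_R    = int_a^lc q(r) P(r) dr
        1/k_D+ = 1/(k_R^2 D) int_a^lc dr/P(r) [int_r^lc (q - k_R) P dx]^2
    Distances in angstrom, D in angstrom^2/s, rates in s^-1.
    """
    h = (l_c - a) / n
    r = [a + i * h for i in range(n + 1)]
    w = [p(x) for x in r]
    norm = h * (sum(w) - 0.5 * (w[0] + w[-1]))
    w = [v / norm for v in w]  # P(r) normalised on [a, l_c]
    qp = [q_of_r(x) * wi for x, wi in zip(r, w)]
    k_r = h * (sum(qp) - 0.5 * (qp[0] + qp[-1]))
    # Inner integral accumulated from r = l_c back towards r = a.
    f = [qpi - k_r * wi for qpi, wi in zip(qp, w)]
    inner = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        inner[i] = inner[i + 1] + 0.5 * h * (f[i] + f[i + 1])
    g = [inner[i] ** 2 / w[i] for i in range(n + 1)]
    outer = h * (sum(g) - 0.5 * (g[0] + g[-1]))
    return k_r, k_r ** 2 * d_coeff / outer


if __name__ == "__main__":
    p = gaussian_chain(25.0 ** 2)             # hypothetical <r^2> = (25 A)^2
    q = exponential_quench(4e9, 4.0, 4.0)     # hypothetical q0 and beta
    k_r, k_d = sss_rates(p, q, a=4.0, l_c=60.0, d_coeff=1e9)  # D = 1e-7 cm^2/s
    print(f"k_R ~ {k_r:.2e} s^-1, k_D+ ~ {k_d:.2e} s^-1")
```

Because D enters only as a prefactor, doubling `d_coeff` doubles the computed kD+ while leaving kR unchanged, which is the proportionality noted in the caption.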
Fig. 3 Measured diffusion coefficients. (a) Diffusion coefficients measured for a variety of sequences in water at 20 °C. The dark blue bars belong to the fast reconfiguration regime, red bars to the middle regime and cyan bars to the slow regime. (b) Diffusion coefficients of α-synuclein measured in various solution conditions, with the mutation A30P (red bar) and with the aggregation inhibitor, curcumin (cyan bar). The numbers in parentheses indicate the position within the sequence of Trp (39 or 94) and Cys (69).
Polyglutamine (polyQ) peptides longer than ∼30 residues are known to aggregate, and people with more than 30 Gln repeats in the larger huntingtin protein are prone to Huntington's disease, with the age of onset inversely related to the total number of repeats. Measurements of shorter peptides showed that this sequence is very stiff, based on the turnover in the reaction-limited rate at ∼10 residues,50 while measurements and simulation of longer peptides revealed compact structures.31,51–54 Fitting the short peptide measurements to a wormlike chain model revealed an extremely high persistence length (13 Å), but later analysis with a coarse-grained protein model showed that at long lengths the intrinsic stiffness is counterbalanced by strong attractive interactions,55 in agreement with FRET and FCS measurements of the longer lengths. Thus, unlike other disordered peptides, the diffusion coefficient of polyQ decreases with length. An extrapolation of the measured diffusion coefficient at short lengths to 40–60 residues, lengths known to promote Huntington's disease, gives D ∼ 10⁻⁷ cm² s⁻¹. The radius of gyration of a 30 residue polyQ peptide is estimated to be 25 Å (ref. 56), yielding a reconfiguration rate of 1.6 × 10⁶ s⁻¹.
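For reference, the polyQ rate follows directly from the estimate k1 ∼ 4D/(2RG)² with the values quoted above; the only step added here is the unit conversion 1 Å = 10⁻⁸ cm.

```latex
k_1 \sim \frac{4D}{(2R_G)^2}
    = \frac{4 \times 10^{-7}\ \mathrm{cm^2\,s^{-1}}}{\left(2 \times 25 \times 10^{-8}\ \mathrm{cm}\right)^2}
    = \frac{4 \times 10^{-7}\ \mathrm{cm^2\,s^{-1}}}{2.5 \times 10^{-13}\ \mathrm{cm^2}}
    = 1.6 \times 10^{6}\ \mathrm{s^{-1}}
```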
A point mutation of protein L, F22A, increases the diffusion coefficient by almost two orders of magnitude, and the mutant is anecdotally known to aggregate in water.39 Thus it appears that intramolecular diffusion is exquisitely sensitive to sequence. Using a polymeric model which accounts for preferential intramolecular interactions, we find that phenylalanine 22 in protein L preferentially makes contacts that are non-local in sequence, making the chain less diffusive, while alanine 22 preferentially makes more local contacts, keeping it more diffusive.38 Surprisingly, MD simulations find that RG for the F22A mutant is only 5% larger than that of the wild type,39 leading to a reconfiguration rate of 1 × 10⁶ s⁻¹.
Following the model laid out here, an inhibitor could prevent the first stage of aggregation by shifting the reconfiguration rate to either the fast or the slow regime. In shifting to the fast regime, the protein would become more like a typical IDP with a large volume and rapid intramolecular diffusion. Alternatively, an inhibitor could slow down diffusion to the rate of well-behaved foldable proteins. Such a molecule would nucleate hydrophobic clustering by distant parts of the chain, making the chain more compact and less diffusive. However, if the protein is an IDP, this slower chain is never going to fold and the chance of making stable bimolecular associations over long periods of time (seconds to minutes) is still quite high. Thus whether such a molecule really prevents aggregation in a cell depends in detail on how the chaperone network treats such a chain. Since it is dynamically similar to an unfolded or misfolded foldable protein, it could be tagged for proteolysis, but if chaperones were to attempt to undo the “misfolded” conformation, aggregation would probably still occur because the chain will never fold. Furthermore, repeatedly engaging chaperones might disrupt proteostasis more than the protein without the inhibitor. Thus understanding the full cellular response is necessary when designing a therapeutic.
To further test this model, my lab has begun to investigate the effect of small molecule aggregation inhibitors on intramolecular diffusion in α-synuclein. As a first step we chose curcumin, a naturally occurring polyphenol in the spice turmeric, which has been shown to inhibit aggregation of both α-synuclein and the Alzheimer's peptide. Using optical changes in both the curcumin and the Trp introduced by mutation into α-synuclein, we found that curcumin bound strongly to the monomeric protein. Both fibrillization and oligomerization were inhibited and intramolecular diffusion increased, particularly at high temperatures where aggregation was more likely (see Fig. 3b). Thus it appears that curcumin bound to α-synuclein changes the conformational ensemble by making the chain less compact and more diffusive. With the reconfiguration rate increased at high temperatures, aggregation was unlikely because k1 ≫ kbi.72
This journal is © The Royal Society of Chemistry 2013