R. Frederick Ludlow and Sijbren Otto*
Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, UK CB2 1EW. E-mail: so230@cam.ac.uk; Fax: +44 (0)1223 336017; Tel: +44 (0)1223 336509
First published on 3rd September 2007
The study of complex mixtures of interacting synthetic molecules has historically not received much attention from chemists, even though research into complexity is well established in the neighbouring fields. However, with the huge recent interest in systems biology and the availability of modern analytical techniques this situation is likely to change. In this tutorial review we discuss some of the incentives for developing systems chemistry and we highlight the pioneering work in which molecular networks are making a splash. A distinction is made between networks under thermodynamic and kinetic control. The former include dynamic combinatorial libraries while the latter involve pseudo-dynamic combinatorial libraries, oscillating reactions and networks of autocatalytic and replicating compounds. These studies provide fundamental insights into the organisational principles of molecular networks and how these give rise to emergent properties such as amplification and feedback loops, and may eventually shed light on the origin of life. The knowledge obtained from the study of molecular networks should ultimately enable us to engineer new systems with properties and functions unlike any conventional materials.
R. Frederick Ludlow: Fred Ludlow was born in England in 1982. He received his MChem from the University of Oxford in 2004, spending the final year of the course working with Paul Beer on anion templated rotaxanes. He is currently a third year PhD student at the University of Cambridge working in the field of supramolecular chemistry in Sijbren Otto's group. His research interests include molecular recognition in aqueous solution and the use of dynamic combinatorial chemistry to extract information about host–guest systems.
Sijbren Otto: Sijbren Otto received his MSc (1994) and PhD (1998) degrees cum laude from the University of Groningen in the Netherlands, where he worked on physical organic chemistry in aqueous solutions in the group of Jan Engberts. In 1998, he moved to the United States for a year as a postdoctoral researcher with Steve Regen (Lehigh University, Bethlehem, Pennsylvania) investigating synthetic systems mediating ion transport through lipid bilayers. In 1999, he received a Marie Curie Fellowship and moved to the University of Cambridge where he worked for two years with Jeremy Sanders on dynamic combinatorial chemistry. He started his independent research career in 2001 as a Royal Society University Research Fellow at the same university. His current research interests involve molecular recognition at biomembranes and dynamic combinatorial/systems chemistry in water.
We believe that the time has come for chemists to firmly embrace complexity and we make a case for systems chemistry3 as a new discipline that looks at complex mixtures of interacting molecules. Complex mixtures can give rise to interesting and desirable emergent properties—properties that result from the interactions between components and cannot be attributed to any of these components acting in isolation. Complementary to this, a complex mixture contains information about all its constituents and the study of the complete mixture should in principle allow us to obtain properties of interest of all these molecules simultaneously, provided we can find a way of deconvoluting the results of such investigations. Furthermore, there is considerable interest in uncovering the organisational principles behind complex networks in order to understand their workings and eventually be able to modify and engineer networks.
We will discuss some selected examples of pioneering experimental work on systems chemistry covering a wide range of subjects, including dynamic combinatorial libraries, oscillating reactions, replicating networks and self-assembling systems. First though, we provide a concise summary of the state of the art in systems biology, which is a field to which systems chemistry is intimately related and which currently struggles with issues that may perhaps be more conveniently addressed by a more chemical approach.
The term systems chemistry is potentially very broad. For example, the fields of heterogeneous catalysis as well as atmospheric and environmental chemistry deal with large systems of interacting chemical species. We will restrict this review to cover work on synthetic systems in solution, and apologise to anyone whose favourite subject is glossed over or ignored altogether. Nevertheless, we hope our paper will spark interest in what is, for many chemists, an unconventional subject.
While the ultimate aim of systems biology is being able to predict, repair, control and eventually design a biological system, most of the current work is more down to earth and focussed on improving the understanding at systems level. At present, the main challenges are to drill down from the global picture of a network to identify the basic network motifs7 and to determine the way these are interlinked.8,9 Possible approaches include tinkering with existing networks to identify some of the organisational principles or even engineering new functional networks in living organisms.10 However, biological organisms are relatively fragile: too much tinkering will result in death, limiting the use of this top-down approach. This calls for an alternative bottom-up approach; an area where systems chemistry may provide new fundamental insights.
This behaviour has been successfully exploited for the discovery of new compounds that are good at molecular recognition (synthetic receptors for small molecule guests, or ligands for biomacromolecules). If a template molecule is added to a DCL, those oligomers which can form favourable interactions with it are stabilised, and so the equilibrium shifts to favour those library members which bind the template. Under the right experimental conditions, the strongly binding oligomers are amplified at the expense of the weaker binders.
The development of this field has been driven by the need for improved methods for developing synthetic receptors and ligands for biomolecules. The approach hinges on the intuitive hypothesis that there should be a correlation between binding affinity and amplification. However, subsequent theoretical work revealed that dynamic libraries can show some not immediately intuitive deviations from this behaviour, reflecting the fact that the product distributions in DCLs are dictated by the interplay of binding equilibria and mass balance equations involving all the species in the network. Theoretical and experimental studies by Severin et al. revealed two situations where the correlation between host–guest binding affinity and amplification can break down, or even reverse.14,15
For the first case, consider a library containing a single building block, A, which can form a dimer or trimer. We shall assume that both dimer and trimer bind to the template, T, with the trimer having a higher affinity. At low relative template concentration, the trimer will be amplified as expected, but if the template concentration is high, this will not necessarily be the case. As shown in Fig. 1a, forming two trimer–guest complexes requires the disruption of three dimer–guest complexes, so the trimer must be a significantly stronger binder for it to be amplified. If the dimer is only slightly weaker, it may still be preferentially amplified.
Fig. 1 Equilibria representing the competition between oligomers in a DCL.
The second case involves competition between homo-oligomers and hetero-oligomers. Consider a library containing three building blocks, A, B and C, which form all possible trimers. The A3 homotrimer and ABC heterotrimer can both bind to a template, T, which is present in excess. In this case, the relevant equilibrium is that shown in Fig. 1b: in order for one A3·T complex to form, three ABC·T complexes must be disrupted. Therefore the homotrimer, A3, must have a much higher affinity for the template than ABC in order for the amplification to reflect their relative association constants.
These examples serve to highlight the danger of thinking in terms of individual molecules, when it is the free energy of the entire system that is important. In both cases, the problem arises because many weak host–guest complexes need to be disrupted to form one strong host–guest complex. One way to negate this is to lower the template concentration so that fewer of the weak host–guest complexes form in the first place.
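The dimer/trimer case can be checked numerically. The following Python sketch is illustrative only: the model (one building block A forming a dimer and a trimer, each binding one template), the function names `solve_library` and `amplification`, and every equilibrium constant are assumptions chosen for illustration, not values taken from refs. 14 and 15. The trimer is given a three-fold higher template affinity than the dimer.

```python
def solve_library(a_total, t_total, k2, k3, kb2, kb3):
    """Equilibrium totals (free + template-bound) of dimer and trimer, in M.

    Hypothetical model: 2A <=> A2 (k2), 3A <=> A3 (k3),
    A2 + T <=> A2.T (kb2), A3 + T <=> A3.T (kb3).
    """
    def totals(a_free):
        dimer_free = k2 * a_free ** 2
        trimer_free = k3 * a_free ** 3
        # Template mass balance, solved in closed form for free template.
        t_free = t_total / (1.0 + kb2 * dimer_free + kb3 * trimer_free)
        return (dimer_free * (1.0 + kb2 * t_free),
                trimer_free * (1.0 + kb3 * t_free))

    # The building-block mass balance rises monotonically with the free
    # monomer concentration, so bisection on that variable is sufficient.
    low, high = 0.0, a_total
    for _ in range(200):
        mid = 0.5 * (low + high)
        dimer, trimer = totals(mid)
        if mid + 2.0 * dimer + 3.0 * trimer > a_total:
            high = mid
        else:
            low = mid
    return totals(0.5 * (low + high))


def amplification(t_total, a_total=0.01, k2=2.5e7, k3=1.7e12,
                  kb2=1.0e4, kb3=3.0e4):
    """Amplification factors (dimer, trimer) relative to the untemplated library."""
    d0, tr0 = solve_library(a_total, 0.0, k2, k3, kb2, kb3)
    d, tr = solve_library(a_total, t_total, k2, k3, kb2, kb3)
    return d / d0, tr / tr0


for t_total in (1.0e-3, 1.0e-2):  # 1 mM and 10 mM template, 10 mM building block
    af_dimer, af_trimer = amplification(t_total)
    print(f"[T] = {t_total * 1e3:4.0f} mM  AF(dimer) = {af_dimer:.2f}  "
          f"AF(trimer) = {af_trimer:.2f}")
```

With these assumed constants the stronger-binding trimer is amplified at the lower template concentration, whereas at the higher concentration the weaker-binding dimer is amplified at the trimer's expense.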
These principles apply not only to the relatively simple systems of Fig. 1 but also to much larger DCLs. Fig. 2 shows two examples of the correlation between binding affinity and amplification in a simulated 322-component library at two different concentrations of template.16
Fig. 2 The relationship between amplification (AF; the ratio between the concentration of a library member in the presence of a template compared to that in the absence of the template) and free energy of binding for all binders in two simulated DCLs that differ only in the concentration of the template T: (a) [T] = 10 mM; (b) [T] = 1 mM.
Another solution is to ensure there is a reservoir of building blocks, only a fraction of which can exist in active oligomers. Severin et al. have reported a library with an unusual network topology that achieves this.17 The library is based on metal–ligand exchange, with two different self-sorting ligands. One ligand forms active receptors, while the other forms a library of non-receptor complexes which acts as a metal ion reservoir. A better correlation is observed between binding constant and amplification factor than for the system containing only the active ligand.
As we have seen, a DCL is a complicated and sometimes non-intuitive system. However, it is not as complex as systems such as stock markets, fluid flows or cellular automata, because the final state is not dependent on the history of the system. For a given set of building block and template concentrations, the same equilibrium will be reached whatever the starting point, and this equilibrium point can be exactly calculated from quantitative knowledge of all the individual interactions within the system. It is this relative simplicity which allows us to be confident that, given careful experiment design, the amplified species in a templated library is likely to be a strong binder. In most instances, studying the behaviour of the library does not give us any information that we could not have obtained by studying its parts in isolation; however, a DCL is an efficient short cut to this information. Two examples will be discussed which demonstrate the wealth of information present in a DCL and how it can be accessed using only relatively simple analytical techniques.
Recently, we have shown that quantitative information about library members' affinities for a guest can be determined directly from the library distribution.18 This enables host–guest interactions to be quantified without the need for isolation and purification of individual library members. To demonstrate the potential of this method, we simulated the composition of a 31-component DCL based on fixed host–guest binding constants at a number of different template concentrations to serve as an “experimental” data set. The library distribution was modelled computationally at each template concentration using a set of trial values for the template-binding affinities of each oligomer and the error between simulated and “experimental” concentrations was determined. The trial values were then varied so as to minimize the error using a standard algorithm. Good fits could be obtained for the majority of the compounds in the mixture, with a particularly good agreement for the more strongly binding oligomers (Fig. 3).
Fig. 3 Comparison of the “experimental” and fitted values of the host–guest binding energies in a simulated 31-component dynamic combinatorial library.
We have also tested the method on a truly experimental system and obtained good agreement between fitted binding constants and those obtained by microcalorimetry.
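The fitting loop described above (simulate a data set, model the library with trial affinities, minimise the error) can be sketched on a toy system. Everything in the following Python example is an assumption made for illustration: a two-member library (dimer and trimer of one building block) stands in for the 31-component DCL, the constants are invented, and a coarse grid search replaces the "standard algorithm", which the text does not name.

```python
def library_totals(t_total, kb2, kb3, a_total=0.01, k2=2.5e7, k3=1.7e12):
    """Total dimer and trimer (free + template-bound) at equilibrium, in M."""
    def totals(a_free):
        dimer_free = k2 * a_free ** 2
        trimer_free = k3 * a_free ** 3
        t_free = t_total / (1.0 + kb2 * dimer_free + kb3 * trimer_free)
        return (dimer_free * (1.0 + kb2 * t_free),
                trimer_free * (1.0 + kb3 * t_free))

    low, high = 0.0, a_total  # bisection on the free monomer concentration
    for _ in range(200):
        mid = 0.5 * (low + high)
        dimer, trimer = totals(mid)
        if mid + 2.0 * dimer + 3.0 * trimer > a_total:
            high = mid
        else:
            low = mid
    return totals(0.5 * (low + high))


TEMPLATE_CONCS = (0.5e-3, 1.0e-3, 2.0e-3, 5.0e-3, 1.0e-2)  # M, hypothetical
TRUE_LOG_KB = (4.0, 4.5)  # hypothetical "true" log10 binding constants

# Noise-free simulated "experimental" library compositions.
observed = [library_totals(t, 10.0 ** TRUE_LOG_KB[0], 10.0 ** TRUE_LOG_KB[1])
            for t in TEMPLATE_CONCS]


def misfit(log_kb2, log_kb3):
    """Sum of squared differences between modelled and observed totals."""
    error = 0.0
    for t, (d_obs, tr_obs) in zip(TEMPLATE_CONCS, observed):
        d, tr = library_totals(t, 10.0 ** log_kb2, 10.0 ** log_kb3)
        error += (d - d_obs) ** 2 + (tr - tr_obs) ** 2
    return error


# Trial values: log10(K) from 3.0 to 5.0 in steps of 0.25 for each host.
grid = [3.0 + 0.25 * i for i in range(9)]
best_error, best_log_kb2, best_log_kb3 = min(
    (misfit(x, y), x, y) for x in grid for y in grid)
print(f"fitted log10 K: dimer {best_log_kb2}, trimer {best_log_kb3}")
```

Because the data are noise-free and the grid happens to contain the true values, the search recovers them exactly; real data would call for a continuous optimiser and an error estimate for each fitted constant.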
The majority of work on DCLs has focused on methods for the identification and characterisation of good binders, either to form one part of a host–guest system or as catalysts.13 Severin and co-workers have described a different application of DCLs in which the library's guest-induced adaptation is used to determine the identity of an unknown guest. Any molecule that interacts with a DCL will cause a perturbation to its composition that is characteristic of the particular molecule. Thus, it should be possible to work backwards from the adaptation of a library to the identity of the guest. As a proof of principle study, a DCL based on two metal ions and three coordinating dye compounds (Fig. 4) was used.19
Fig. 4 Generation of a DCL of metal–dye complexes by mixing three dyes with CuCl2 and NiCl2 in buffered aqueous solution.
Addition of a dipeptide guest caused the library to re-equilibrate, resulting in a change to the UV/vis spectrum. Initially, DCLs were prepared with one of the six dipeptides, Val–Phe, Gly–Ala, His–Ala, Ala–His, Phe–Pro and Pro–Gly, as guests. The UV/vis spectra of these solutions confirmed that this method could distinguish between the six dipeptides. To test the sensitivity of this method, a further experiment was carried out using the five structurally similar dipeptides, Gly–Ala, Val–Phe, Ala–Phe, Phe–Ala and D-Phe–Ala. In this case, the differences between the UV/vis spectra of the libraries were much smaller, and so linear discriminant analysis (LDA)20 was used to classify the compounds. Fifteen spectra were recorded for each peptide, at slightly varying peptide concentration. Eight wavelengths were then selected from the spectra, and the absorption at these wavelengths formed the input.
Using the entire data set as a training set generated Fig. 5. A clear separation can be seen between the different peptides. In another experiment, 50% of the observations were randomly selected and used as the training set; 97% of the remaining observations were then correctly classified. This is particularly impressive given the similarity of the peptides.
Fig. 5 Linear discriminant analysis score plot showing clear separation of a series of closely related dipeptide analytes.
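To make the classification step concrete, the following Python sketch implements a two-class Fisher linear discriminant on two absorbance values per spectrum. It is a toy under stated assumptions: the study used eight wavelengths and five peptides, whereas the class means, the offsets (chosen to mimic correlated variation from slightly different concentrations) and all function names here are invented placeholders, not measured data.

```python
def fisher_lda(class0, class1):
    """Two-class Fisher discriminant for points with two features.

    Returns (w, threshold); a point x is assigned to class 1 when
    w[0] * x[0] + w[1] * x[1] > threshold.
    """
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    m0, m1 = mean(class0), mean(class1)
    # Pooled within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx = sxy = syy = 0.0
    for points, m in ((class0, m0), (class1, m1)):
        for x, y in points:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    det = sxx * syy - sxy * sxy
    # w = Sw^-1 (m1 - m0), using the closed-form inverse of a 2x2 matrix.
    gx, gy = m1[0] - m0[0], m1[1] - m0[1]
    w = ((syy * gx - sxy * gy) / det, (sxx * gy - sxy * gx) / det)
    threshold = 0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
    return w, threshold


def predict(model, point):
    w, threshold = model
    return 1 if w[0] * point[0] + w[1] * point[1] > threshold else 0


# Synthetic absorbances at two hypothetical wavelengths (not measured data).
CLASS_MEANS = {0: (0.500, 0.300), 1: (0.505, 0.295)}
TRAIN_OFFSETS = [(-0.010, -0.009), (0.010, 0.009), (-0.005, -0.006),
                 (0.005, 0.006), (0.000, 0.001), (0.000, -0.001)]
TEST_OFFSETS = [(-0.008, -0.008), (0.008, 0.008),
                (0.003, 0.002), (-0.003, -0.002)]


def spectra(label, offsets):
    mx, my = CLASS_MEANS[label]
    return [(mx + dx, my + dy) for dx, dy in offsets]


model = fisher_lda(spectra(0, TRAIN_OFFSETS), spectra(1, TRAIN_OFFSETS))
test_set = [(p, label) for label in (0, 1) for p in spectra(label, TEST_OFFSETS)]
accuracy = sum(predict(model, p) == label for p, label in test_set) / len(test_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this synthetic set the two classes overlap at either wavelength taken alone, but the discriminant direction separates them, which is the reason for applying LDA when spectral differences are small.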
In a separate study, a sensor was developed for distinguishing Gly–Gly–His from either His–Gly–Gly or from Gly–His–Gly.21 The experimental simplicity of the system allowed a large number of libraries to be set up and compared for their ability to distinguish between isomers. Using this approach, optimised DCL sensors were discovered for a variety of sensing applications in addition to sequence differentiation, including concentration quantification and identifying the proportions of components in a mixture.
Fig. 6 Carbonic anhydrase selectively protects those library members that bind to it from being hydrolysed.
In a further publication, Gleason and Kazlauskas et al. described a pseudo-dynamic combinatorial library (pDCL).24 Again, the system contained separated screening and hydrolysis chambers, but this time a synthesis chamber was added in which the peptides could be regenerated by reaction of the amine hydrolysis product with solid-supported active esters (Fig. 7). Four different activated esters 2a–2d were combined with two amines to produce eight potential CA inhibitors. The concentration of these peptides was monitored over a number of days with periodic addition of fresh activated ester. Again a large amplification in the selectivity was observed—the final concentration ratio of the two best inhibitors was greater than 100 ∶ 1, despite an affinity ratio of only 2.2 ∶ 1.
Fig. 7 The experimental setup of the pseudo-dynamic combinatorial library developed by Gleason and Kazlauskas et al.
This pDCL bears an intriguing resemblance to the model for pre-biotic peptide synthesis and degradation proposed by Wächtershäuser and co-workers.25 Under conditions similar to those found around volcanic vents (CO and colloidal transition-metal sulfides), peptide synthesis and degradation were found to occur simultaneously. The authors speculate that the resulting dynamic chemical libraries may well become self-selecting if the constituents are differentially stabilised by binding as ligands to the transition-metal centres that are involved in their production. This may give rise to positive feedback loops that could well have played a role in the emergence of life.
Self-replicators have been developed by a number of groups, based on a variety of chemistries, including RNA, peptides and purely synthetic compounds.28,29 Work carried out by Ghadiri and co-workers in the late 1990s on self-replicating peptides provided two examples of how complex behaviour such as symbiotic cooperation30 and dynamic error correction31 can emerge from networks of interacting self-replicators.
More recently, von Kiedrowski et al. reported on a system first described by Wang and Sutherland32 for which they established the presence of several simultaneous autocatalytic and cross-catalytic pathways,33 while Philp and Kassianidis have described a reciprocal replicating system in which each of two molecules catalyses the other's formation but not its own.34
Larger systems have also been studied; Ghadiri and co-workers have reported the behaviour of a network arising from a series of nine self-replicating, coiled-coil forming peptides.35 The basic reaction underpinning this system is shown in Fig. 8. The reaction between an electrophilic (E) and a nucleophilic (N) peptide fragment can be accelerated by a full-length template peptide (T) via the quaternary complex [ENTT]. The efficiency of this templated reaction depends on the stability of the quaternary complex, which can be estimated from the structure of the peptides. This allowed Ghadiri to construct a graph of the reactions in which the nodes represent the templates and the edges the predicted catalytic pathways (Fig. 9).
Fig. 8 Schematic mechanism of templated peptide formation via the quaternary coiled-coil complex [Ej·N·Ti·Ti].
Fig. 9 The predicted network of auto and cross-catalytic reactions in Ghadiri's peptide replicator system.
When a subsection of this graph containing nine peptides was selected and investigated experimentally, some of the predicted reactions were not observed. The authors went on to demonstrate that all the “missing” pathways were indeed active when studied in isolation, but were suppressed in the larger system due to competition with more favourable reactions. This system is simple by biological standards, containing only 10 reactants and 9 products, yet it is still capable of complex dynamic behaviour, arising from the interaction of several sub-systems. The peptide networks can be exploited in the design of molecular Boolean logic gates.36 By varying the input concentrations (templates and fragments), and monitoring the formation of a particular product, various subsections of the network could be shown to express OR, NOR and NOTIF logic, which constitute basic elements of molecular computing.
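For readers less familiar with these gates, the short Python sketch below tabulates the three functions. It assumes that each input is the presence or absence of a network component and that the output is formation of the monitored product; NOTIF is read here as "input 1, but not if input 2 is present", which is an assumption about the convention. How specific peptides map onto the inputs is not given in this review, so none is implied.

```python
from itertools import product

# Each gate maps the presence (True) or absence (False) of two inputs
# to formation (True) or not (False) of the monitored product.
GATES = {
    "OR": lambda in1, in2: in1 or in2,
    "NOR": lambda in1, in2: not (in1 or in2),
    # Assumption: NOTIF means "in1, but not if in2 is present".
    "NOTIF": lambda in1, in2: in1 and not in2,
}


def truth_table(name):
    """Rows (in1, in2, out) for the named gate, in binary counting order."""
    gate = GATES[name]
    return [(a, b, gate(a, b)) for a, b in product((False, True), repeat=2)]


for name in GATES:
    print(name)
    for in1, in2, out in truth_table(name):
        print(f"  in1={int(in1)} in2={int(in2)} -> out={int(out)}")
```

In chemical terms, the NOR and NOTIF rows in which an added input switches the product off correspond to the kind of suppression described above, where a pathway that is active in isolation is outcompeted in the larger network.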
One of the advantages of molecular computing is the potential for parallelisation, particularly when applied to large combinatorial search problems. For example, DNA computers have been used to solve the travelling salesman problem,37 a task that involves generating and testing many candidate solutions in order to determine the optimum. Whilst an electronic computer must work through these sequentially, the enormous number of molecules in a DNA computer allows many solutions to be searched simultaneously. Work on these systems is described in more detail in a recent review by Ezziane.38
Fig. 10 Turing patterns (a–c) from simulations of cellular behaviour arising through coupling of an intracellular autocatalytic reaction to differential trans-membrane signal transduction rates of activator and inhibitor messengers. (d) Emergence of pattern a over time from a homogeneous starting state. Reproduced with permission of J. Theor. Biol.40
The best-known oscillating process is the Belousov–Zhabotinsky reaction. It was discovered in 1950 by Belousov, who experienced great difficulty in getting his results published. In fact, only after Zhabotinsky got involved a decade later did the first reports of this reaction appear in the Russian literature. It took another decade before the oscillatory behaviour was understood mechanistically41,42 and another before oscillating reactions could be systematically designed.43 For the Belousov–Zhabotinsky reaction, the oscillations result from the complex interplay between 18 different transformations, many of which feature the same species as reactants or products. The overall reaction is the Ce(IV) catalysed oxidation of citric acid by BrO3– to give bromide, CO2 and water. A key feature in the mechanism of this and many other oscillators is the autocatalytic production of an intermediate, in this case HBrO2. For the complete mechanism, the reader is referred to ref. 42.
In order to sustain oscillations, a continuous conversion of starting material into product must occur, requiring an open system into which starting material is fed and from which the product is removed. This is the situation in flow reactors and oscillating behaviour is now well recognised by chemical engineers.
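The link between autocatalysis, an open system and sustained oscillation can be illustrated with a much simpler model than the full Belousov–Zhabotinsky mechanism. The Python sketch below integrates the Brusselator, a textbook two-variable scheme that is swapped in here purely for illustration and is not the mechanism of ref. 42. The feed concentrations `a` and `b` are held constant, which stands in for the continuous supply of starting material; the parameter values, step size and function names are assumptions.

```python
def brusselator_rates(x, y, a=1.0, b=3.0):
    """Rates for the Brusselator with feeds a and b held constant (open system).

    dx/dt = a + x^2 y - (b + 1) x,  dy/dt = b x - x^2 y.
    The x^2 y term is the autocatalytic step.
    """
    autocatalysis = x * x * y
    return a + autocatalysis - (b + 1.0) * x, b * x - autocatalysis


def simulate(x0=1.0, y0=1.0, dt=0.01, steps=6000):
    """Integrate with the classical fourth-order Runge-Kutta method; return x(t)."""
    x, y = x0, y0
    xs = [x]
    for _ in range(steps):
        k1 = brusselator_rates(x, y)
        k2 = brusselator_rates(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = brusselator_rates(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = brusselator_rates(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        xs.append(x)
    return xs


def upward_crossings(values, level):
    """Count how often the series passes through `level` from below."""
    return sum(1 for p, q in zip(values, values[1:]) if p < level <= q)


xs = simulate()
# The steady state is x = a = 1; repeated crossings indicate oscillation.
print("upward crossings of the steady state:", upward_crossings(xs, 1.0))
print("of which in the second half of the run:", upward_crossings(xs[3000:], 1.0))
```

For this model the steady state is unstable when b exceeds 1 + a², so the concentration of the autocatalytic intermediate keeps cycling for as long as the feeds are maintained, rather than relaxing to a fixed value.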
For more detail on the subject of oscillating reactions and pattern formation the reader is referred to three excellent recent reviews.39,44,45
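$$\mathrm{S_2O_3^{2-} + 2\,ClO_2^{-} + 3\,H_2O \rightarrow 2\,SO_4^{2-} + 2\,H_3O^{+} + 2\,Cl^{-}}$$
At the same time H3O+ is consumed through:
$$\mathrm{4\,S_2O_3^{2-} + ClO_2^{-} + 4\,H_3O^{+} \rightarrow 2\,S_4O_6^{2-} + 6\,H_2O + Cl^{-}}$$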
A series of elegant experiments was carried out with this system, including one using a microfluidic device that mimics the human circulatory system, in which large inlet (artery) and outlet (vein) channels are connected by a set of smaller capillaries. When one of the capillaries was punctured, gelation (clotting) occurred and propagated within the damaged capillary without affecting any of the other channels. What triggered this spontaneous initiation is not fully understood. Intriguingly, the authors found that the extent to which gelation propagated through the system was dependent on the nature of the connection between the capillaries and the inlet and outlet channels. While no propagation took place when the connections mimicked those found in nature, other connections were found to give catastrophic propagation of clotting into ‘veins’ and ‘arteries’. In a later study, the authors found a good correspondence between the spatiotemporal dynamics of clotting initiation in their chemical model system and in experiments with human blood plasma.47
More elaborate control over a chemical network can be exerted through establishing interactions between molecules (in network terminology: through creating the edges between nodes). The selectivity of these interactions is important if edges need to be created between specific nodes only. Isaacs and co-workers have addressed this issue by mixing a series of molecules well known to recognise themselves or pair up with a complementary partner. The authors observed a strong preference for thermodynamically controlled self-sorting; i.e. the various interacting pairs formed the expected complexes, essentially ignoring the other molecules present in the solution.51 In a more recent paper, Isaacs et al. investigated self-sorting in a small 4-component system containing two cucurbituril hosts and two guests, each with multiple binding sites. Different host–guest pairings were observed under kinetic and under thermodynamic control.52
Systems chemistry may contribute to developing an improved understanding of the organisational principles of biological networks and how these are related to function. Model systems that reflect the behaviour of the real biological network may be used to make predictions of their behaviour and may lead to the discovery of new ways of manipulating and controlling biological systems. It complements activities aimed at assembling unnatural systems from interchangeable biological elements taken from existing organisms.53
Unravelling the origin of life will, in all likelihood, involve a systems chemistry approach.27 The work on the development of replicators and self-assembling membranes can be seen as the first steps in this direction.
Molecular computing, particularly DNA computing, has the potential to out-perform silicon based computers for several combinatorial search problems. In addition, calculations in chemico could be advantageous in the sense that it is not necessary to lay every circuit down on a device: computation may be performed in a self-regulating communicating solution of molecules.
One of the unique capabilities of chemists is their ability to design and create new molecules. Extending this creativity from isolated molecules to molecular networks is bound to give rise to many new molecular systems with unique and exciting properties.
Footnote
† While combinatorial chemistry has traditionally included making compound libraries as mixtures, there has been a shift towards high-throughput parallel screening of pure compounds because mixtures of molecules frequently showed ‘false positives’ i.e. activity that arises from a combination of different compounds and that disappears upon deconvolution. Although perhaps undesirable in a drug discovery process, such behaviour provides clear evidence of the added value of complex mixtures that remains largely unexplored.
This journal is © The Royal Society of Chemistry 2008