Yolanda Schaerli (a,b) and Mark Isalan (a,b)
(a) EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain. E-mail: yolanda.schaerli@crg.eu, isalan@crg.es
(b) Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
First published on 23rd January 2013
The promise of wide-ranging biotechnology applications inspires synthetic biologists to design novel genetic circuits. However, building such circuits rationally is still not straightforward and often involves painstaking trial-and-error. Mimicking the process of natural selection can help us to bridge the gap between our incomplete understanding of nature's design rules and our desire to build functional networks. By adopting the powerful method of directed evolution, which is usually applied to protein engineering, functional networks can be obtained through screening or selecting from randomised combinatorial libraries. This review first highlights the practical options to introduce combinatorial diversity into gene circuits and then examines strategies for identifying the potentially rare library members with desired functions, either by screening or selection.
Whereas early synthetic genetic circuit engineering studies began with simple devices like a negative feedback loop12 and a toggle switch,13 nowadays the community focuses on circuits displaying more complex behaviours. Recent examples include oscillators comprising thousands of synchronised bacterial colonies,14 layered logic gates in bacteria,15 or mammalian cells performing programmable half-subtractor and half-adder calculations.16 Despite this progress, and the growing interest of the scientific community, it is well-known that synthetic biology faces serious technical difficulties.17–20 For example, an article entitled “Five hard truths for synthetic biology”17 outlined several problems in the field: many parts are not well characterised, are incompatible with the host cell and do not work as predicted when assembled into circuits. Therefore, building a synthetic network is usually still a challenging, slow and difficult process, involving “tweaking” and “debugging” the initial design to obtain a working device. The devil is often in the detail of small context-dependent effects, thus limiting our ability to build more complicated networks easily.
Although rationally designing and building any given device remains the goal, we have to admit that our current knowledge and understanding of how biology works is frequently insufficient. Therefore, synthetic biologists have started to apply the powerful method of directed evolution to synthetic networks.21–25 Combinatorial libraries of network variants are produced and then the variants with the desired properties are found by screening or competitive selection. Typically, the identified variants are subjected to one or more rounds of diversification and selection or screening.
In this review, after briefly touching upon the origins of directed evolution, we describe how combinatorial diversity can be introduced into synthetic transcriptional networks. We then focus on selection and screening systems to identify the functional devices in a library. As most of the reviewed work has been carried out in prokaryotes, the emphasis is on bacterial systems but analogous techniques could be applied to eukaryotic cells.
One field of biological engineering which is now relatively mature, and where new functional constructs are routinely made, is protein engineering. New proteins are constructed, often using structural information and an element of rational design,27,28 but also through screening or selecting from large randomised combinatorial libraries. In a screening assay, a specific output of the individual library members (e.g. their fluorescence) is measured and the best variants are taken to the next round of randomisation or screening. By contrast, in a selection, the desired behaviour of a variant is linked to a competitive survival advantage, so that only positive clones should ultimately survive the selection procedure. Many highly efficient selection systems (e.g. phage display29) as well as screening systems (e.g. based on flow cytometry30) have been successfully applied to protein engineering.31,32 It is therefore manifest that generating diversity and selecting or screening for the desired variants could also be a powerful tool for the engineering of synthetic networks.
Another consideration should be the size of the library that can be screened or selected. For instance, when engineering synthetic transcription networks, it would be wasteful to randomise each transcription factor residue; most mutants would be non-functional or similarly functional when compared with the original network, and the library size would quickly become too big to screen. Rather, it would be smarter to mutate around the transcription factor DNA-binding interface, either mutating the key amino acid residues making DNA contacts, or the corresponding DNA bases in the promoter region. Thus, targeted mutations can provide functional diversity in relatively small, easy-to-handle libraries. The options for introducing diversity are numerous (Fig. 1) and include: network connectivities, promoters, ribosomal binding sites, codon variations, intergenic regions, protein parts, degradation tags and others. Here we discuss the different possibilities:
Fig. 1 Where to introduce combinatorial diversity in bacterial gene network libraries. An operon containing two genes is shown as an example. The options of where to introduce variations are numerous: in the network connectivities (not shown), promoters, ribosomal binding sites, (start) codons, genes, intergenic regions, degradation tags, and combinations thereof.
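The library-size argument made above can be illustrated with a short calculation. The following is a minimal sketch, assuming that every randomised position is encoded by an NNK degenerate codon (32 codons) and that positions vary independently; the residue counts (6 DNA-contacting residues, a 200-residue transcription factor) are hypothetical numbers chosen only for illustration and are not taken from any cited study.

```python
from math import prod

# Number of distinct codons in an NNK degenerate codon
# (N = A/C/G/T, K = G/T). Assumption: NNK is the randomisation scheme.
NNK_CODONS = 4 * 4 * 2  # 32

def library_size(options_per_position):
    """Return the number of distinct variants in a combinatorial library.

    options_per_position: iterable giving the number of alternatives
    allowed at each randomised position (assumed independent).
    """
    return prod(options_per_position)

# Hypothetical comparison (illustrative numbers only):
# randomise 6 DNA-contacting residues versus all 200 residues.
targeted = library_size([NNK_CODONS] * 6)
full = library_size([NNK_CODONS] * 200)

print(f"targeted library:   {float(targeted):.2e} DNA variants")
print(f"full randomisation: {float(full):.2e} DNA variants")
```

Under these assumptions the targeted library is already close to the practical limit of many screens, whereas full randomisation is far beyond any experimental method, which is why diversity is usually restricted to a few well-chosen positions.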
While it is conceivable that this approach could be used to identify more new circuits, a different procedure predominates in the literature: for a given desired output a possible network topology is rationally designed and suitable parts are chosen and assembled. This process is potentially guided by a model.36–41 Directed evolution is then used only if the built circuit is non-functional or needs improvement.15,42,43
The group of Elowitz built a combinatorial library of random promoter architectures containing up to three transcription factor binding sites, which could be placed in the distal, core or proximal regions of the promoter.44 A subset of the library was characterised and promoter strengths were observed that varied over five orders of magnitude. From this analysis, empirical rules were derived for bacterial promoter design, for example repression is strongest when the repressor binding site is located in the core part of the promoter and is weakest in the distal part.
Alternatively, the architecture of the promoter is kept constant and diversity is introduced by randomising all bases,45 or only a subset of the promoter while leaving key motifs unchanged.15,38,46–49 The latter strategy has the advantage that the promoter function is retained in most library members and that the library size can be kept small.
After the generation of the promoter library, two different approaches have been pursued. In the first, the diversity is directly incorporated into the synthetic circuit and a screening or selection is performed to obtain the final working device.15 In the second, members of the library are characterised in the context of a basic device, to obtain a collection of promoters covering a wide range of strengths. The promoter matching the required strength is then used to build the intended network.38,45,48,49 The second approach requires a good idea of what promoter characteristics will render the device functional, e.g. from detailed in silico modelling. The advantage is that the collection of well-characterised promoters can be re-used for building different devices. While promoter engineering alters expression at the transcriptional level, it is also possible to tune post-transcriptional processes, as described in the following sections.
In another example, mutations were simultaneously targeted to the RBSs of two transcription factors, thus allowing a search for the right balance of their expression levels.52 Similarly, RBS libraries were applied in the construction of logic gates,15,43,53,54 orthogonal transcription–translation networks,47,55,56 a rewritable digital data storage device42,57 and bistable switches.57,58
Not only can the start codon be exchanged, but also the other codons. While it has long been known that codon usage alters gene expression levels, a recent study quantified this effect and found a 250-fold variation in GFP expression levels in E. coli, using different synonymous codons.59 Design parameters can be obtained to control synthetic gene expression in E. coli,60 and this could provide a source of variation for combinatorial network libraries. Various mechanisms may allow variation, including the use of rare codons or altering RNA secondary structure, via the presence or absence of hairpins.
A frequent goal of the directed evolution of protein parts for synthetic biology is to achieve orthogonality, i.e. the parts should only interact with their defined molecular partner, but not with other cell components.63 For example, Zhan et al. evolved new variants of the lacI repressor that recognise different DNA sequences from the wild-type protein and no longer bind to the natural lacI operator (lacO).64 Collins et al. engineered a version of the LuxR transcription factor that activates transcription after binding to a different signalling molecule from its parent protein and that no longer recognises the natural quorum sensing signal.65 Similarly, orthogonal transcription factors,66–69 ribosomes,70 polymerases,49,71 receptor–ligand pairs72 and chaperones15 have been created to expand the repertoire of available protein parts for synthetic biology. In contrast to standard protein engineering (e.g. the improvement in the catalytic efficiency for a new substrate), directed evolution of an orthogonal part not only requires a positive selection or screen for improved binding or activity, but also a negative one in order to exclude undesired cross-talk.
Although the randomisation categories described above cover many possibilities, in principle, many other positions and processes could encode useful functional diversity. The main aim of any library design is to cover a useful range of variations (e.g. ‘low’, ‘middle’ or ‘high’ activity) with as few variants as possible, so as not to increase library size beyond practical limits. Having designed libraries, the next stage is to search for required outputs under particular conditions.
Fig. 2 Example of a screening protocol. Screening for an AND gate, as performed by the Voigt group,54 requires a positive screen in the presence of the two inputs and three negative screens in the presence of one input or in the absence of any input.
Such systems generate their own set of practical considerations. Often it might not be clear where to set the fluorescence threshold between positive and negative clones. Cells in the OFF state might already have some fluorescence, but an even higher signal in the ON state. Depending on the application, different thresholds can be acceptable. If so, it might be advisable to measure the ON/OFF ratios for all the members, thus avoiding discarding functional variants. Moreover, there can be a difference in responses at the levels of cell populations and individual cells. Depending on the application, it can sometimes be useful to retain even clones where only a small percentage of cells in the ‘clonal’ population display the desired behaviour when induced. To understand such systems, flow cytometry measurements can be essential.
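The idea of ranking clones by ON/OFF ratio, rather than applying a single fluorescence threshold, can be sketched for the AND gate screen of Fig. 2. This is only an illustration: the function names, the tenfold cut-off and the fluorescence values are hypothetical, and the ratio is taken against the leakiest of the three OFF conditions, which is one possible convention among several.

```python
def on_off_ratio(on_signal, off_signals, floor=1.0):
    """Ratio of the ON signal to the highest (leakiest) OFF signal.

    floor avoids division by zero for clones with no measurable
    background (assumption: signals are in arbitrary units >= 0).
    """
    return on_signal / max(max(off_signals), floor)

def and_gate_ratio(readings):
    """readings maps (input_a, input_b) -> fluorescence (a.u.)."""
    on = readings[(1, 1)]
    offs = [readings[(0, 0)], readings[(1, 0)], readings[(0, 1)]]
    return on_off_ratio(on, offs)

def is_and_gate(readings, min_ratio=10.0):
    """True if the clone passes the (hypothetical) ratio cut-off."""
    return and_gate_ratio(readings) >= min_ratio

# Made-up plate-reader values for two library members.
library = {
    "clone_1": {(0, 0): 50, (1, 0): 80, (0, 1): 60, (1, 1): 2400},
    "clone_2": {(0, 0): 40, (1, 0): 900, (0, 1): 55, (1, 1): 2100},
}

# Rank all members by ratio instead of discarding by absolute threshold.
ranked = sorted(library, key=lambda c: and_gate_ratio(library[c]), reverse=True)
for clone in ranked:
    print(clone, round(and_gate_ratio(library[clone]), 1), is_and_gate(library[clone]))
```

In this made-up data set both clones are bright in the ON state, but only the first is a usable AND gate because the second responds strongly to one input alone. Keeping the ratios for all members allows the cut-off to be adjusted later to suit the application.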
As the functions of synthetic circuits become more complicated, the screening systems must also become more sophisticated. Lou and colleagues built a device in bacteria that switches from green to red after a first UV exposure and back to green after a second UV exposure (push-on push-off switch).52 To screen for this behaviour many steps were necessary: after transformation, green colonies were picked and transferred to two agar plates, only one of which was irradiated with UV. Colonies that switched to red after exposure to UV, but not in the absence of the signal, were chosen. Subsequently the same procedure was repeated, but this time red cells switching to green after UV irradiation were selected.
If cells are grown in liquid culture the fluorescence can be measured with a fluorescence spectrometer or a flow cytometer. For cells growing on plates, a fluorescence microscope or a UV transilluminator can be used, the latter only if the excitation range of the fluorescent protein falls in the UV range (e.g. GFPuv78). Liquid cultures are usually handled in a 96-well format and standard agar plates fit about 200 colonies, meaning that the throughput of plate-based fluorescence screens is commonly 100–2000 library members.
Flow cytometry can be applied in two ways: as a fluorescence measuring tool or as a fluorescence activated cell sorting (FACS) tool. The advantage of using a flow cytometer instead of a fluorescence spectrometer for measurement is that the fluorescence of the individual cells is recorded. This allows one to determine the distribution of the cells within a population, rather than only measuring an average. For example a population might display an unexpected bimodal distribution due to the variable metabolic burden that is imposed on the host by the synthetic device.79 In other networks, it is only a part of the population that actually responds to the signal.16,52
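The difference between a population average and a single-cell distribution can be made concrete with a small example. The sketch below uses two made-up populations with identical mean fluorescence; the gate value and all numbers are hypothetical and serve only to show why a bulk measurement can hide a responding subpopulation.

```python
from statistics import mean

def responding_fraction(cell_fluorescence, gate):
    """Fraction of cells whose fluorescence exceeds the gate."""
    above = sum(1 for f in cell_fluorescence if f > gate)
    return above / len(cell_fluorescence)

# Two illustrative populations (arbitrary units) with the same mean:
# every cell moderately bright, or a bimodal mix where 20% of the
# cells respond strongly.
uniform = [500] * 10
bimodal = [100] * 8 + [2100] * 2

GATE = 1000  # hypothetical gate separating responders from non-responders

for name, cells in (("uniform", uniform), ("bimodal", bimodal)):
    print(name, "mean:", mean(cells), "responding:", responding_fraction(cells, GATE))
```

A fluorescence spectrometer would report the same value for both populations, whereas the per-cell data reveal that only a fraction of the second population has switched.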
A FACS machine can screen about 10⁷ cells per hour80 and thus enables access to much bigger libraries than other screening methods based on fluorescence. Rather surprisingly, with the exception of a few examples,46,47 this technique has hardly been applied so far to synthetic genetic circuits. The common occurrence of heterogeneous cell populations for one synthetic network, as mentioned above, might be one reason, as these will cause many false positives or negatives in a screening based on single cells. Nevertheless, the advantage of the superior throughput of this method probably outweighs the effort of eliminating errors post-sorting. We therefore expect that FACS will be used more often for the screening of synthetic transcriptional circuits in the future.
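To get a feeling for what a sorting rate of roughly 10^7 cells per hour means in practice, one can estimate how many cells must be analysed to sample a given fraction of a library. The following sketch assumes that all variants are equally represented and that cells are sampled at random, so that the expected fraction of variants seen at least once is 1 − exp(−N/L) for N cells and L variants; real libraries are usually biased, so these figures are best read as lower bounds. The function names are illustrative.

```python
import math

FACS_CELLS_PER_HOUR = 1e7  # approximate rate quoted in the text

def cells_needed(library_size, coverage=0.95):
    """Cells to analyse so that `coverage` of the variants are expected
    to be sampled at least once (assumes uniform, random sampling)."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be between 0 and 1 (exclusive)")
    return -library_size * math.log(1.0 - coverage)

def sorting_hours(library_size, coverage=0.95, rate=FACS_CELLS_PER_HOUR):
    """Instrument time needed at the given sorting rate."""
    return cells_needed(library_size, coverage) / rate

for size in (1e4, 1e6, 1e8):
    print(f"library of {size:.0e}: {cells_needed(size):.2e} cells, "
          f"{sorting_hours(size):.2f} h")
```

Under these assumptions about threefold oversampling is needed for 95% coverage, so a library of 10^6 members can be covered in well under an hour, whereas the same library is out of reach for plate-based screens.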
Although fluorescent proteins are by far the most commonly used markers for screenings, they are not the only option. Enzymes producing a colorimetric or a luminescent readout are also feasible. For example, in the above mentioned evolution of orthogonal lacI repressors, blue/white colony screening employed β-galactosidase expression and X-gal staining.64
Initial selections of this type used two independent genes on different plasmids,65 on the same plasmid81 or as a genetic fusion70 for the two selection conditions. The disadvantage of this strategy is that any mutation disabling the function of the killer gene results in false positives. Since false positives can quickly outgrow the rare positive library members, especially if the selections are performed in liquid culture, it is important to plate out cells on Petri dishes during selections. Moreover, it is necessary to remove and replace the selection marker plasmid after each round of selection because otherwise this will accrue false positive mutations under the strong selection pressure.70,81
To make selections more robust and to simplify processes, systems have been developed where one gene can function as a selection marker for both positive and negative selection. To achieve this, the selection marker gene can either enable cell survival or induce cell death under defined conditions.53,55,56,76,77 Because any mutation that eliminates the function of the protein will most likely affect both the ON and OFF selections, the chances of the emergence of false positives are much lower. This enables one to perform selections in liquid cultures, further increasing throughput and speeding up the process.
TetA, a tetracycline/H+ antiporter, is just such a dual selector.55,56,76,77 Its expression confers resistance to tetracycline and can therefore be used for positive selections in the ON state. As a membrane-bound protein, its overexpression also makes the host bacteria more susceptible to toxic metal salts, including NiCl2. Therefore, cells with TetA expression will not grow well in the presence of NiCl2, allowing cells in the OFF state to be selected (negative selection) (Fig. 3A). This strategy was successfully applied in synthetic circuits based on riboswitches.55,56,77 Additionally, fusing TetA to GFPuv allows a quantitative readout without further subcloning.55,56
Fig. 3 Dual selection protocols. (A) TetA confers resistance to tetracycline (ON selection) and makes the host bacteria more susceptible to toxic NiCl2 (OFF selection).55,56,76,77 (B) HsvTK can rescue the thymidine deficiency of a thymidine kinase deficient strain in the presence of a thymidylate synthase (ThyA) inhibitor (5FdU) (ON selection). HsvTK can also make cells sensitive to synthetic dP nucleosides; these become toxic upon phosphorylation by hsvTK (OFF selection).53 Grey crosses indicate cells where the conditions reduce viability.
The Herpes simplex virus thymidine kinase (hsvTK) is an alternative dual selector with demonstrated use in E. coli.53 In the presence of a thymidylate synthase inhibitor (5FdU), a thymidine kinase-deficient strain does not grow, due to the lack of thymidine. In ON selections hsvTK expression can rescue this deficiency. The OFF selection is performed in the presence of synthetic nucleosides (e.g. deoxyribosyl-dihydropyrimido[4,5-c][1,2]oxazin-7-one: dP) that only become toxic upon phosphorylation by hsvTK (Fig. 3B). As with TetA, hsvTK can be fused C-terminally to GFP without losing its function (unpublished data).
Although impressive enrichment factors (1300–33000 times per ON/OFF cycle) have been demonstrated for these two dual selectors in model selections,53,56 both systems are still waiting to be applied more widely. One reason might be that it is difficult to match the rather limited dynamic range of these selection systems to the functional range of the synthetic circuits. For example, while it is possible to select circuits that are ON among those that are completely OFF, it is more difficult to select against slightly leaky devices: low expression levels of TetA are already enough to confer tetracycline resistance. The use of less active TetA mutants82 might alleviate this problem. The upper expression limit of the dual selectors is also constrained as they contribute to the metabolic load imposed on the host cell, even in the absence of the negative selection conditions. Moreover, TetA overexpression is known to be detrimental to cell growth.84 Another concern is whether the use of mutagenic nucleosides in the hsvTK OFF selection will introduce undesired mutations. Therefore, while the throughput of selection systems is generally higher than that of their screening counterparts, they are often less flexible. However, it is still early days and time will show whether the dual selectors discussed here are robust enough to be adopted by the community.
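The practical meaning of the quoted enrichment factors can be illustrated by estimating how many ON/OFF cycles would be needed to bring a rare functional variant to dominance. This is a rough sketch under two simplifying assumptions that are not taken from the cited studies: the enrichment factor is treated as a fold-change in the odds of positives to negatives, and it is assumed to stay constant from cycle to cycle. The starting frequency of one in a million is hypothetical.

```python
def frequency_after_rounds(f0, enrichment, rounds):
    """Frequency of positive clones after a number of ON/OFF cycles.

    Assumption: each cycle multiplies the odds positives:negatives
    by the same enrichment factor.
    """
    odds = f0 / (1.0 - f0) * enrichment ** rounds
    return odds / (1.0 + odds)

def rounds_to_reach(f0, enrichment, target=0.5):
    """Smallest number of cycles after which positives reach `target`."""
    if not (0.0 < f0 < 1.0 and 0.0 < target < 1.0):
        raise ValueError("f0 and target must lie strictly between 0 and 1")
    if enrichment <= 1.0:
        raise ValueError("enrichment factor must be greater than 1")
    rounds = 0
    while frequency_after_rounds(f0, enrichment, rounds) < target:
        rounds += 1
    return rounds

# Hypothetical starting point: one functional clone per million.
for factor in (1300, 33000):
    n = rounds_to_reach(1e-6, factor)
    print(f"enrichment {factor}: {n} cycles, "
          f"final frequency {frequency_after_rounds(1e-6, factor, n):.3f}")
```

In practice the enrichment per cycle tends to fall as the population becomes dominated by leaky or escape variants, so such estimates are optimistic; they nevertheless show why a few cycles of a dual selection could, in principle, recover very rare library members.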
For devices intended for targeted applications, the selection can be tailored to be more specific for the purpose of the device. An elegant example was demonstrated for the above-mentioned bacteria that can invade cancer cells upon a signal:51 the bacteria carrying the device library were incubated with the cultured cancer cells, followed by the addition of an antibiotic. Bacteria unable to invade the mammalian cells were killed, but bacteria inside the cancer cells were protected from the antibiotic effect. Internalised bacteria were then released by mammalian cell lysis and grown on plates. Positive clones were subsequently screened for loss of invasiveness in the absence of the signal.
After any process requiring several rounds of selection or screening, the resulting circuits always have to be analysed carefully. Controls should ensure a good understanding of the function of the individual circuit components. This analysis should also uncover the cases where the observed behaviour is caused by (unexpected) interactions of the synthetic network and the host cells.79,85
The work cited in this section has been mostly carried out in bacterial cells. However, synthetic biology in eukaryotes is fast catching up86 and we expect to see the development of analogous screening systems in the near future.
Fig. 4 Automated screening of spatial patterning. The schematic shows a thought-experiment on how one might screen a combinatorial library scaffold for a Gierer–Meinhardt system.83 Thousands of randomised candidates might have to be tested to find the correct behaviour. The library would comprise variants of an activator (U; red) and an inhibitor (V; blue) which would communicate local, non-linear self-activation and long range inhibition signals to other cells. By plating library members on dishes or multiwell plates (here, one per plate) thousands of randomised parameter sets might be screened for potential patterning behaviour. Reproduced from ref. 34.
Another emerging trend is the engineering of multicellular traits in cell consortia where single cells carry out different simple tasks.94 This makes particular sense because our ability to increase the size and complexity of synthetic networks in one host cell has begun to stagnate.17 Two recent studies95,96 applied this approach in a rather elegant way: cells, each carrying out a simple function, were combined in multiple ways so that the whole consortium carried out far more complicated distributed computational tasks than the individual cells. Perhaps this concept of distributed networks truly represents the future of synthetic biology, and individual robust functions could be made relatively easily using the screening or selection of combinatorial libraries. Ultimately, the power of directed evolution has yet to be fully harnessed and should provide us with an efficient new generation of engineering tools in the years to come.
This journal is © The Royal Society of Chemistry 2013