Hyun Uk Kim (a, b), Tae Yong Kim (a, b) and Sang Yup Lee* (a, b, c)
(a) Department of Chemical and Biomolecular Engineering (BK21 Program), Metabolic and Biomolecular Engineering National Research Laboratory, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Republic of Korea. E-mail: leesy@kaist.ac.kr; Fax: +82 42 869 3910; Tel: +82 42 869 3930
(b) Center for Systems and Synthetic Biotechnology, Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Republic of Korea
(c) Department of Bio and Brain Engineering, BioProcess Engineering Research Center and Bioinformatics Research Center, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Republic of Korea
First published on 16th October 2007
Recent advances in metabolic flux analysis including genome-scale constraints-based flux analysis and its applications in metabolic engineering are reviewed. Various computational aspects of constraints-based flux analysis including genome-scale stoichiometric models, additional constraints used for the improved accuracy, and several algorithms for identifying the target genes to be manipulated are described. Also, some of the successful applications of metabolic flux analysis in metabolic engineering are reviewed. Finally, we discuss the limitations that need to be overcome to make the results of genome-scale flux analysis more realistically represent the real cell metabolism.
Hyun Uk Kim received his BS in biotechnology from Yonsei University, and is a graduate student at the Metabolic and Biomolecular Engineering Laboratory at Korea Advanced Institute of Science and Technology (KAIST). His current research is concerned with metabolic flux analysis, development of genome-scale stoichiometric models and integration of biological information for rational design of metabolic engineering experiments and identification of drug targets in silico.
Tae Yong Kim obtained his BS in chemical engineering from Korea University. He received his MS in chemical and biomolecular engineering from KAIST in 2005, during which he conducted research on in silico metabolic engineering of microorganisms. He is currently a PhD candidate working on modeling genome-scale metabolic networks and their applications at KAIST.
Sang Yup Lee is a Distinguished Professor and LG Chem Chair Professor at the Department of Chemical and Biomolecular Engineering at KAIST. He is Director of the Center for Systems and Synthetic Biotechnology, Director of the BioProcess Engineering Research Center, Director of the Bioinformatics Research Center, and Co-Director of the Institute for the BioCentury. He also serves as senior editor, associate editor, or editorial board member of 12 journals including Biotechnology and Bioengineering, Applied Microbiology and Biotechnology, and Biotechnology Journal. His research interests include metabolic engineering, systems biotechnology, synthetic biology, white biotechnology, and nanobiotechnology.
Metabolic engineering, which can be defined as the purposeful modification of metabolic and cellular networks by employing various experimental techniques to achieve desired goals, has emerged to fulfill this purpose.8–10 What distinguishes metabolic engineering from genetic engineering, and from old-fashioned strain improvement before the emergence of recombinant DNA techniques, is that it considers metabolic and other cellular networks as a whole to identify targets to be engineered. In this sense, metabolic flux is an essential concept in the practice of metabolic engineering. Although gene expression levels and the concentrations of proteins and metabolites in the cell can provide clues to the status of the metabolic network, they have inherent limitations in fully describing the cellular phenotype because they carry no information on the correlations among these cellular components. Metabolic fluxes represent the reaction rates in metabolic pathways, and serve to integrate these factors through a mathematical framework.9,11,12 Thus, metabolic fluxes can be considered as one way of representing the phenotype of the cell as the result of the interplay among various cell components; the observed metabolic flux profiles reflect the consequences of interconnected transcription, translation, and enzyme reactions incorporating complex regulations.
Metabolic flux analysis (MFA) is an analytical technique that quantifies intracellular metabolic fluxes and dissects the functional aspects of the metabolic network in greater detail. MFA is based on mass balances around intracellular metabolites under the pseudo-steady state assumption, and two methods have generally been used: 13C-based flux analysis and constraints-based flux analysis. The former utilizes an isotope-labeled carbon source and analyzes 13C enrichment patterns of metabolites with nuclear magnetic resonance (NMR) or gas chromatography–mass spectrometry (GC–MS). The outcome is used as input data for mathematical calculations that estimate the in vivo fluxes.11–15 Even though this allows relatively accurate estimation of intracellular fluxes, the difficulty of the experiments and of the subsequent calculations for a large metabolic network limits its widespread use.
Constraints-based flux analysis is a general term for optimization-based simulation techniques, and various algorithms are available for this (Table 1). First, a stoichiometric model is reconstructed based on genomic information and literature. It is then simulated by a linear optimization technique using an appropriate objective function (e.g. maximization of cell growth rate) and constraints that restrict the solution space to within the cell's capacity16 (Fig. 1). Importantly, it enables the effects of genetic and/or environmental perturbations on the cell to be systematically predicted and evaluated on a global scale, and suggests what parts of the system (e.g. genes) need to be modified for the directed improvement of the system. This insight cannot be gained intuitively by handling individual reactions, and thus this method enables more rational and efficient design of real metabolic engineering experiments. In addition to genome-scale stoichiometric modeling and its use in constraints-based flux analysis, there are two approaches that allow pathway analysis: elementary (flux) mode analysis17,18 and extreme pathway analysis.19 They allow identification of a minimal set of systemic pathways and of all possible steady-state flux distributions that the network can inherently achieve.20,21
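The linear optimization step described above can be illustrated with a minimal sketch. The three-metabolite network, its bounds, and the helper name `fba` are hypothetical and chosen only for illustration (this is not the network of Fig. 1); the solver call is SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (an assumption for illustration only).
# Rows: internal metabolites A, B, C.
# Columns: uptake (-> A), r1 (A -> 2 B), r2 (A -> C), r3 (C -> B),
#          growth (B -> biomass), export (C -> out).
S = np.array([
    [1, -1, -1,  0,  0,  0],   # A
    [0,  2,  0,  1, -1,  0],   # B
    [0,  0,  1, -1,  0, -1],   # C
], dtype=float)
GROWTH = 4  # column index of the growth reaction


def fba(S, bounds, objective_index):
    """Maximize one flux subject to S v = 0 and lower/upper bounds."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0  # linprog minimizes, so negate to maximize
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


# Wild type: substrate uptake limited to 10, all other fluxes irreversible.
wild_type_bounds = [(0, 10)] + [(0, 1000)] * 5
v_wt = fba(S, wild_type_bounds, GROWTH)

# Knock-out mutant: force the flux through r1 to zero.
ko_bounds = list(wild_type_bounds)
ko_bounds[1] = (0, 0)
v_ko = fba(S, ko_bounds, GROWTH)

print("wild-type growth flux:", v_wt[GROWTH])
print("r1 knock-out growth flux:", v_ko[GROWTH])
```

In this toy case the knock-out reroutes flux through the lower-yield branch r2/r3, so the predicted growth flux drops; the same pattern (fix a flux to zero, re-optimize) is how knock-outs are typically simulated in this framework.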
Fig. 1 Construction of stoichiometric matrix for a model metabolic network and constraints-based flux analysis. (A) A model metabolic network consists of 7 internal and 4 external metabolites with 11 reactions. From this network, mass balances for each metabolite can be set up as linear equations, and they can be converted into the form of matrix S on the right. Reactants and products in the reaction have negative and positive stoichiometric coefficients, respectively. In this example, internal reactions are denoted by r, and reactions that span the system boundary are denoted by R. The stoichiometric coefficient for metabolite B is 2 as the reaction r1 generates two molecules of B from one molecule of A. (B) Optimization by linear programming is formulated with an objective function (Z) of maximizing the biomass formation (growth) rate (RE) subject to mass balances and additional constraints. Each reaction flux, vi, is subject to lower and upper bound constraints, represented as αi and βi, respectively. Also, measured fluxes, RA, Rc and RD in this example, can be used as additional constraints. c is a vector that specifies which flux to optimize. Other objective functions such as maximization of product formation and minimization of byproduct formation can be used. Knock-out mutant strains can be similarly simulated by setting the flux of the reaction catalyzed by the knocked-out gene to zero. The mutant strain shown in this example has a deleted reaction r4 to increase the production rate of the metabolite Pc. Note that the measured fluxes change due to the deleted reaction, which is often observed in real examples. (C) Intracellular metabolic flux distributions calculated by constraints-based flux analysis are shown. Fluxes are represented in mmol (g dry cell weight h)−1.
Algorithm (a) | Input | Output | Objective | Ref. |
---|---|---|---|---|
FBA | Various constraints such as substrate uptake rate, metabolite excretion rate, maintenance energy and capacity limits | Metabolic fluxes | Predicts intracellular flux distribution with maximization of an objective function (e.g. biomass, metabolite production, etc.) using LP | 16 |
SR-FBA | Constraints for regulations, genes to reactions mapping, reaction enzyme state, and reaction predicates | Metabolic fluxes, gene expression status | Predicts gene expression and metabolic fluxes in a genome-scale integrated metabolic-regulatory model using MILP | 35 |
ROOM | Thresholds determining significance of the flux change and their relative and absolute ranges of tolerance | Metabolic fluxes of the knock-out mutant | Minimizes the number of significant flux changes in the knock-out mutant compared to the wild-type using MILP | 36 |
TMFA | Gibbs free energy change of a reaction | Metabolic fluxes free of thermodynamic infeasibilities | Predicts intracellular flux distribution with additional thermodynamic constraints using MILP | 42 |
OMNI | Experimentally determined flux distribution, number of reactions to be knocked-out | Bottleneck reactions to be removed in the model | Identifies a set of reactions in the model whose removal improves the agreement between the model predictions and experimental data using MILP | 45 |
MOMA | Metabolic flux profile of the wild-type | Metabolic fluxes of the knock-out mutant | Minimizes the Euclidean distance from a wild-type flux distribution under knock-out condition using QP | 49 |
OptKnock | Minimal growth rate, number of genes to be knocked-out | Gene knock-out targets for biochemical production | Predicts gene knock-out targets through bilevel optimization framework using MILP | 51 |
OptGene | Number of individuals forming a population, | Gene knock-out targets for biochemical production | Predicts gene knock-out targets using genetic algorithm and constraints-based flux analysis (e.g. FBA, MOMA, ROOM, etc.) | 48 |
OptReg | Minimal growth rate, regulation strength parameter, number of reactions to be modulated or knocked-out | Gene knock-out or up/down-regulation targets for biochemical production | Determines the activation/inhibition and elimination reaction set for biochemical production using MILP | 52 |

(a) The inputs shown for FBA are common to all algorithms, and thus are not repeated. Abbreviations of algorithms are: FBA, flux balance analysis; SR-FBA, steady-state regulatory flux balance analysis; ROOM, regulatory on/off minimization; TMFA, thermodynamic–metabolic flux analysis; OMNI, optimal metabolic network identification; MOMA, minimization of metabolic adjustment. Abbreviations of solvers are: LP, linear programming; QP, quadratic programming; MILP, mixed integer linear programming.
This Highlight focuses on the recent developments in constraints-based flux analysis of stoichiometric models and its applications in metabolic engineering of microorganisms. We describe the current status of genome-scale stoichiometric models, the strategies of incorporating additional constraints for more accurate flux simulation, gene targeting algorithms for metabolic pathway engineering, and their applications in actual metabolic engineering. Readers can refer to other excellent reviews9,11,12,14,16 for more general information on MFA, which is not repeated in this Highlight.
Fig. 2 Development of genome-scale stoichiometric models for various organisms, which are continually expanding. The models are available for bacteria including E. coli,24–26 H. influenzae,27,28 H. pylori,29,30 M. succiniciproducens,31,61 B. subtilis,62 G. sulfurreducens,63 L. plantarum,64 L. lactis,65 M. tuberculosis,66,67 N. meningitidis,68 S. aureus,69,70 and S. coelicolor,71 for archaea including M. barkeri,72 and for eukaryotes including S. cerevisiae,32,33 H. sapiens73 and M. musculus.74 ‘G’ stands for the number of genes incorporated in the model, ‘R’ for the number of reactions, and ‘M’ for the number of metabolites. For S. aureus, the stoichiometric model on the left side of the slash was developed by Becker and Palsson,69 while that on the right side was developed by Heinemann et al.70 For M. tuberculosis, the model information before the slash is from Beste et al.67 and that after the slash is from Jamshidi and Palsson.66
Fig. 3 Changing states of cellular physiology through environmental and genetic perturbations within biologically feasible solution space. (A) Under a given condition, cells are at their suboptimal state, and can reach their optimal state through adaptive evolution. (B) MOMA allows identification of the state with the minimal metabolic adjustment as a result of gene knock-outs leading to suboptimal state of the cell. (C) In the rightmost graph, the outermost boundary refers to the biological solution space beyond the current metabolic capacity, which is possible by altered regulation/metabolic engineering. (D) In the opposite direction, superior cells beyond metabolic solution space can approach the real cell by returning the regulatory circuits back to that of wild-type. Or, they can be interpreted as reducing the solution space by altered regulation/metabolic engineering and/or imposing thermodynamic constraints.
The second method is to reduce the solution space of metabolic fluxes by the addition of thermodynamic constraints37–42 (Fig. 3(D)). Conventional MFA relies only on the mass balance of metabolites and does not account for the thermodynamics of reactions; as a result, some reactions can carry flux in the solution even though they are not thermodynamically feasible. Based on the first and second laws of thermodynamics, the sum of Gibbs free energy changes around flux loops should be zero, and the Gibbs free energy change should be negative for a reaction to proceed. Consequently, mass balance constraints can be augmented with thermodynamic constraints such that the metabolic network no longer contains thermodynamically infeasible flux loops and reactions. In particular, thermodynamics-based metabolic flux analysis (TMFA), recently developed by Henry et al.,42 allows genome-scale calculation of intracellular fluxes by employing additional linear constraints that exclude fluxes violating the above two thermodynamic criteria.
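The two thermodynamic criteria above can be checked on a small example. The sketch below is not TMFA itself; it only tests a given flux vector against assumed reaction free energies. The three-reaction cycle, the metabolite potentials `mu`, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical closed cycle: r1 (A -> B), r2 (B -> C), r3 (C -> A).
# Rows: metabolites A, B, C. Columns: r1, r2, r3.
S = np.array([
    [-1,  0,  1],   # A
    [ 1, -1,  0],   # B
    [ 0,  1, -1],   # C
], dtype=float)


def reaction_dG(S, mu):
    """Free energy change of each reaction from metabolite potentials."""
    return S.T @ mu  # products minus reactants, per reaction


def is_thermo_feasible(v, dG, tol=1e-9):
    """Every reaction that carries flux must run in the direction of dG < 0."""
    active = np.abs(v) > tol
    return bool(np.all(v[active] * dG[active] < 0))


# Assumed (made-up) metabolite potentials, arbitrary energy units.
mu = np.array([-10.0, -25.0, -40.0])
dG = reaction_dG(S, mu)

loop = np.array([1.0, 1.0, 1.0])   # flux around the whole cycle
path = np.array([1.0, 1.0, 0.0])   # flux along A -> B -> C only

print("dG per reaction:", dG, "sum around the loop:", dG.sum())
print("loop feasible:", is_thermo_feasible(loop, dG))
print("path feasible:", is_thermo_feasible(path, dG))
```

Because the free energy changes around a closed loop sum to zero, they cannot all be negative, so the cycling flux is rejected for any choice of `mu`, while the linear path is accepted for the potentials assumed here. Note that `path` is a direction check only; it is not a steady-state flux distribution for this closed network.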
The third method is to constrain the solution space by employing experimentally determined metabolic flux data, which are typically obtained by 13C-based flux analysis.43–45 Although constraints-based flux analysis can predict the intracellular fluxes on a genome-scale, the fluxes calculated may not represent real metabolic fluxes because the model cannot perfectly describe the cellular metabolism. Furthermore, the optimal results obtained by using the objective function during the constraints-based flux analysis may be different from real cellular metabolism (i.e., the cellular metabolism may operate in a suboptimal mode).46,47 The 13C-based flux analysis calculates in vivo fluxes using an isotope-labeled substrate, such as 13C-labeled glucose, and its resulting fluxes are believed to be more reliable.11,12 Thus, the 13C-based metabolic fluxes can serve as realistic constraints to reduce the solution space of constraints-based genome-scale flux analysis. Herrgard et al.45 devised a method called optimal metabolic network identification (OMNI) that identifies a set of reaction changes to be made to the model based on the principle described above, so that predictions from the modified model and experimental data become consistent. Diagnosis of E. coli strains engineered for lactate overproduction with OMNI suggested several factors that limit their performance and strategies to overcome these limits. Thus, this approach may be useful for designing metabolic engineering strategies.
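As a rough illustration of how a measured flux narrows the solution space, the sketch below adds one measured value with a tolerance as bounds on a flux before optimizing. This is not OMNI; the toy network, the measured value of 4.0 ± 0.2 for r2, and all names are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only).
# Rows: A, B, C. Columns: uptake (-> A), r1 (A -> 2 B), r2 (A -> C),
#                         r3 (C -> B), growth (B -> biomass), export (C -> out).
S = np.array([
    [1, -1, -1,  0,  0,  0],
    [0,  2,  0,  1, -1,  0],
    [0,  0,  1, -1,  0, -1],
], dtype=float)
GROWTH = 4


def max_growth(bounds):
    """Maximal growth flux under steady state and the given bounds."""
    c = np.zeros(S.shape[1])
    c[GROWTH] = -1.0  # linprog minimizes
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[GROWTH]


base_bounds = [(0, 10)] + [(0, 1000)] * 5

# Assumed measurement: flux index -> (value, tolerance). Made-up numbers.
measured = {2: (4.0, 0.2)}
constrained_bounds = list(base_bounds)
for index, (value, tolerance) in measured.items():
    constrained_bounds[index] = (value - tolerance, value + tolerance)

growth_free = max_growth(base_bounds)
growth_constrained = max_growth(constrained_bounds)
print("growth without measured flux:", growth_free)
print("growth with measured r2 flux:", growth_constrained)
```

Forcing flux through the lower-yield branch lowers the attainable growth flux in this toy case, which mirrors the point that a cell observed to operate sub-optimally is better described once measured fluxes are imposed.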
The minimization of metabolic adjustment (MOMA)49 algorithm allows prediction of the suboptimal distribution of metabolic fluxes in knock-out mutants by minimizing the changes in the flux distribution of the mutant with respect to the wild type instead of maximizing the biomass formation in the mutant. This algorithm has successfully been applied to construct intensively engineered microorganisms for improved production of lycopene50 and L-valine.4
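The quadratic program behind MOMA can be sketched on a toy network as follows. The network, the hard-coded wild-type flux vector, and the function name `moma` are hypothetical; SciPy's general-purpose SLSQP solver is used here as a stand-in for a dedicated QP solver.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy network (illustration only).
# Rows: A, B, C. Columns: uptake (-> A), r1 (A -> 2 B), r2 (A -> C),
#                         r3 (C -> B), growth (B -> biomass), export (C -> out).
S = np.array([
    [1, -1, -1,  0,  0,  0],
    [0,  2,  0,  1, -1,  0],
    [0,  0,  1, -1,  0, -1],
], dtype=float)
upper = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0])

# Growth-maximizing wild-type fluxes of this toy network (hard-coded here).
v_wt = np.array([10.0, 10.0, 0.0, 0.0, 20.0, 0.0])


def moma(S, v_wt, upper, knockout):
    """Closest steady-state flux vector to v_wt with one reaction removed."""
    keep = [j for j in range(S.shape[1]) if j != knockout]
    S_k, w_k = S[:, keep], v_wt[keep]
    result = minimize(
        fun=lambda v: np.sum((v - w_k) ** 2),      # squared Euclidean distance
        x0=np.zeros(len(keep)),
        jac=lambda v: 2.0 * (v - w_k),
        bounds=[(0.0, upper[j]) for j in keep],
        constraints=[{"type": "eq",
                      "fun": lambda v: S_k @ v,   # steady state: S v = 0
                      "jac": lambda v: S_k}],
        method="SLSQP",
    )
    if not result.success:
        raise RuntimeError(result.message)
    v = np.zeros(S.shape[1])
    v[keep] = result.x          # knocked-out flux stays at zero
    return v


v_moma = moma(S, v_wt, upper, knockout=1)   # delete r1
print("MOMA fluxes after r1 knock-out:", np.round(v_moma, 3))
```

For this toy case the MOMA growth flux is lower than the value obtained by re-maximizing growth in the mutant (10 for the same network), which illustrates the suboptimal state that MOMA is meant to capture.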
OptKnock51 and OptGene48 allow identification of target genes to be knocked out, while the more recently developed OptReg52 allows prediction of the genes to be down- and up-regulated in addition to simple knock-outs. OptKnock identifies knock-out targets by formulating a bi-level linear optimization problem that is solved with mixed integer linear programming (MILP).51 It finds a set of gene deletions that maximizes the flux towards a desired product, while the internal flux distribution is still operated such that growth is optimized. Thus, the identified gene deletions will force the microorganism to produce the desired bioproduct in order to achieve maximum growth. Using this method, the gene knock-out targets for enhancing the production of various metabolites could be predicted; for example, a quadruple gene deletion mutant E. coli (pta–adhE–pfk–glk mutant) capable of high level production of lactate could be identified.53 OptGene48 improves on OptKnock by employing a genetic algorithm to reduce computational time.
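The bi-level idea (the outer problem chooses deletions to maximize product flux, while the inner problem lets the cell maximize growth) can be sketched by brute-force enumeration on a toy network. This is a simplified stand-in, not the MILP formulation of OptKnock; the network, the candidate list, the `min_growth` threshold, and all names are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only).
# Rows: metabolites A, B, X (a reduced cofactor).
# Columns: uptake (-> A), r1 (A -> 2 B + X), rox (X -> ),
#          product (B + X -> P out), growth (B -> biomass).
S = np.array([
    [1, -1,  0,  0,  0],   # A
    [0,  2,  0, -1, -1],   # B
    [0,  1, -1, -1,  0],   # X
], dtype=float)
UPTAKE, R1, ROX, PRODUCT, GROWTH = range(5)
BASE_BOUNDS = [(0, 10)] + [(0, 1000)] * 4


def solve(c, bounds):
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    return res.x if res.success else None


def evaluate_knockout(ko, min_growth=1.0):
    """Return (growth, guaranteed product flux) or None if growth is too low."""
    bounds = list(BASE_BOUNDS)
    bounds[ko] = (0, 0)
    c = np.zeros(S.shape[1])
    c[GROWTH] = -1.0                       # inner problem: maximize growth
    v = solve(c, bounds)
    if v is None or v[GROWTH] < min_growth:
        return None
    growth = v[GROWTH]
    bounds[GROWTH] = (growth - 1e-6, 1000)  # hold growth at its optimum
    c = np.zeros(S.shape[1])
    c[PRODUCT] = 1.0                       # worst case: minimize product flux
    v = solve(c, bounds)
    return None if v is None else (growth, v[PRODUCT])


candidates = {"r1": R1, "rox": ROX}
results = {name: evaluate_knockout(idx) for name, idx in candidates.items()}
viable = {name: r for name, r in results.items() if r is not None}
best_name = max(viable, key=lambda name: viable[name][1])
print("results:", results)
print("best single knock-out:", best_name)
```

In this toy case, deleting the cofactor-oxidizing reaction leaves product formation as the only sink for X, so the mutant can only grow while secreting product. Enumeration scales poorly with the number of simultaneous deletions, which is the reason MILP or heuristic searches are used on genome-scale models.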
Pharkya and Maranas introduced a simulation method called OptReg52 that allows examination of the effects of homologous gene amplification (up-regulation of genes) in addition to those of down-regulation and knock-out of genes. This can be considered as an improved version of OptKnock that employs additional constraints describing the magnitude of fluxes to be regulated. The proposed strategy was used to identify the genes to be engineered for enhancing ethanol production in E. coli. Pharkya and co-workers also developed a computational framework termed OptStrain for the overproduction of a wide range of biochemicals, including those novel to the host microorganism, by adding heterologous metabolic pathways and/or deleting pathways that hamper the production of targeted compounds.54 It was used to identify three heterologous reactions that need to be introduced into E. coli for the production of vanillin, followed by systematic gene knock-out studies to enhance the yield. The characteristics of these algorithms are comparatively summarized in Table 1.
Fong et al.53 reported the interesting observation that a strain engineered based on MFA predictions is at a suboptimal state and can undergo adaptive evolution to reach the state of the metabolically optimal in silico cell (Fig. 3(A)). Strong evidence was found that cells maximize their biomass formation as they undergo adaptive evolution.57,58 In other words, incorrect predictions of MFA may be partly due to incomplete adaptive evolution of the cell under the condition examined. Based on this concept, Fong et al.53 first constructed mutants overproducing lactic acid as predicted by OptKnock, and then conducted adaptive evolution experiments. It was confirmed that the mutants did actually evolve towards the maximization of growth rate and lactic acid secretion rate.
Additional gene manipulations associated with various types of regulation can be combined with the above in silico prediction methods to achieve more drastic metabolic engineering results beyond the normal limit (Fig. 3(C)).3,4 In one recent example, genome engineering to knock out negative regulations, engineering of new target genes identified by transcriptome profiling, and knocking out metabolic genes based on MOMA were all combined to develop an E. coli strain capable of enhanced L-valine production.4 First, an L-valine producing E. coli base strain was constructed by knocking out all known feedback inhibitions, removing attenuation controls, and amplifying activities of the direct L-valine biosynthetic enzymes. This base strain was further improved in a stepwise manner using new information deciphered from transcriptome profiling; a global regulator, leucine responsive protein (Lrp), and an L-valine exporter protein were identified. Then, MOMA was employed to identify the genes to be further knocked out. The resulting triple knock-out mutant possessing upgraded regulatory circuits and exporter system produced a remarkable yield of valine (0.378 g L-valine per g glucose). It should be noted that engineering targets for regulatory circuits, global regulator and exporter were selected based on the previous literature information accumulated by many years of research and/or new data generated by omics studies. Understanding the regulatory circuits and subsequent engineering based on the simulation will be one of the topics to which future research should be directed.
This journal is © The Royal Society of Chemistry 2008 |