Flux balance analysis accounting for metabolite dilution
 Tomer Benyamini^{1},
 Ori Folger^{1},
 Eytan Ruppin^{1, 2} and
 Tomer Shlomi^{3}
DOI: 10.1186/gb-2010-11-4-r43
© Benyamini et al.; licensee BioMed Central Ltd. 2010
Received: 9 December 2009
Accepted: 16 April 2010
Published: 16 April 2010
Abstract
Flux balance analysis is a common method for predicting steady-state flux distributions within metabolic networks, accounting for the growth demand for the synthesis of a predefined set of essential biomass precursors. Ignoring the growth demand for the synthesis of intermediate metabolites required for balancing their dilution leads flux balance analysis to false predictions in some cases. Here, we present metabolite dilution flux balance analysis, which addresses this problem, resulting in improved metabolic phenotype predictions.
Background
A practical approach to gaining biological understanding of complex metabolic networks requires the development of mathematical modeling, simulation, and analysis techniques. Traditional modeling techniques are based on mathematical approaches that require detailed and accurate information regarding reaction kinetics as well as enzyme and metabolite concentrations [1, 2]. The lack of sufficient data currently limits the applicability of such methods to small-scale systems. This hurdle is overcome through the use of constraint-based modeling (CBM), which analyzes the functionality of genome-scale metabolic networks by relying solely on simple physical-chemical constraints [3, 4]. Genome-scale CBM models have already been constructed for more than 50 organisms [5], including common model microorganisms [6, 7], industrially relevant microbes [8–11], various pathogens [12–15], and recently for human cellular metabolism [16]. Flux balance analysis (FBA) is a key computational approach within the CBM framework [17–19] and is frequently used to successfully predict various phenotypes of microorganisms, such as their growth rates, uptake rates, byproduct secretion, and knockout lethality (see [3, 5, 20] for reviews).
The uptake and secretion of a predefined set of metabolites from and to the environment are represented by exchange reactions in the stoichiometric matrix S [3]. A pseudo growth reaction is defined to simulate the utilization of metabolites during growth, consuming the most abundant biomass constituents based on experimentally determined concentrations (that is, the jth component of the biomass concentration vector denotes the steady-state concentration of metabolite j). The objective of FBA is to find a steady-state flux distribution, v, satisfying Equation 2 alongside additional enzymatic directionality and capacity constraints [3], together permitting a maximal growth rate μ. Because only linear constraints are involved, the space of feasible flux distributions described by FBA is convex (forming a high-dimensional polytope), in which optimal biomass-producing solutions can be searched for efficiently via linear programming (LP).
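To make the LP view of FBA concrete, the following minimal sketch solves a three-reaction toy network with SciPy. The network, the bounds and all variable names are hypothetical illustrations and are not taken from the E. coli model; the only ingredients shared with FBA as described above are the steady-state constraint S·v = 0, bounds on fluxes, and maximization of the flux through a growth reaction.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only).
# Metabolites (rows): A, B
# Reactions (columns):
#   v0: -> A      exchange reaction (nutrient uptake)
#   v1: A -> B    internal conversion
#   v2: B ->      pseudo growth reaction
S = np.array([
    [1.0, -1.0,  0.0],   # A
    [0.0,  1.0, -1.0],   # B
])

# Directionality and capacity constraints: all reactions irreversible,
# uptake capped at 10 flux units (an assumed value).
bounds = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0)]

# linprog minimizes, so maximize the growth flux by minimizing its negative.
c = np.array([0.0, 0.0, -1.0])

res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
growth = res.x[2]
print("optimal growth flux:", growth)  # limited by the uptake bound of 10
```

Because every unit of growth flux must be supplied through v0 and v1 at steady state, the optimum sits at the uptake bound.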
The employment of a pseudo growth reaction in FBA to represent the utilization of metabolites as part of growth poses two fundamental problems. First, the metabolite composition of cellular biomass varies significantly across different growth media, genetic backgrounds and growth rates [22–24]. Indeed, previous work by Pramanik and Keasling [22, 23] has shown that using the correct experimentally measured biomass composition of Escherichia coli under different growth media and growth rates significantly improves FBA flux predictions. However, as FBA is commonly applied to probe metabolic behavior under diverse genetic and environmental conditions for which no metabolite concentration data are available, it has become common practice to employ a constant biomass composition across all conditions [25]. Second, the growth reaction in various CBM models commonly accounts for no more than a few dozen metabolites, for which measured concentrations are available under a specific condition [23]. Ignoring the growth-associated dilution of the remaining metabolites (those not included in the biomass composition in use; required by Equation 1) may result in the prediction of biologically implausible flux distributions, leading to false predictions of gene essentiality and growth rates, as shown in the Results. This problem has been recently addressed by Kruse and Ebenhöh [26], who suggested a method that is based on network expansion to compute the set of producible metabolites under a given growth medium. This method, however, does not enable the prediction of feasible flux distributions that account for the growth-associated dilution of all intermediate metabolites. Another approach, recently suggested by Martelli et al. [27], predicts metabolic fluxes based on Von Neumann's model, which maximizes the growth rate in a metabolic network without assuming mass balance or utilizing prior knowledge of a biomass composition. However, similarly to FBA, flux distributions predicted by this method do not fully account for the growth-associated dilution of all intermediate metabolites.
In this paper we describe a variant of FBA, metabolite dilution flux balance analysis (MDFBA), which aims to predict metabolic flux distributions by accounting for the dilution of all intermediate metabolites that are synthesized under a given condition. As shown below, accounting for growth dilution of intermediate metabolites is especially important for metabolites that participate in catalytic cycles, many of them being metabolic cofactors. Since CBM assumes a steady-state flux distribution and does not predict the actual concentration of the intermediate metabolites, we consider a uniform minimal dilution rate for all intermediate metabolites produced via a nonzero flux through some reaction (assuming a uniform concentration for all intermediate metabolites, following [28]). Next, we describe the implementation of MDFBA as a mixed-integer linear programming (MILP) optimization problem and demonstrate its applicability in predicting metabolic phenotypes, outperforming the commonly used FBA method.
Results
MDFBA: accounting for growth-associated dilution of all intermediate metabolites
Our method, MDFBA, aims to predict a feasible flux distribution through a metabolic network under a given environmental and genetic condition, by maximizing the production rate of the biomass (that is, the flux through the biomass reaction) while satisfying a stoichiometric mass-balance constraint, accounting for the growth-associated dilution of all produced intermediate metabolites, and satisfying enzymatic directionality and capacity constraints embedded in the model (similarly to FBA). MDFBA is formulated as a MILP problem as defined in the Materials and methods.
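As an informal illustration of the kind of problem described here, and not a reproduction of the numbered equations in the Materials and methods, a dilution-aware problem can be sketched as follows. The sketch makes several simplifying assumptions of its own: all reactions are irreversible (v ≥ 0), the minimal dilution rate ε is not scaled by the growth rate, and a binary indicator y_i (a symbol introduced only for this sketch) marks whether metabolite i is produced at all. The vector d holds the dilution rates.

```latex
\begin{align*}
\max_{v,\,d,\,y} \quad & v_{\mathrm{growth}} \\
\text{subject to} \quad
  & S\,v - d = 0
      && \text{(mass balance including dilution)} \\
  & d_i \ge \varepsilon\, y_i
      && \text{(active metabolites are diluted at rate at least } \varepsilon\text{)} \\
  & v_j \le v_j^{\max}\, y_i
      && \text{for every reaction } j \text{ that produces metabolite } i \\
  & 0 \le v \le v^{\max}, \qquad d \ge 0, \qquad y_i \in \{0, 1\}
\end{align*}
```

The third constraint is what makes the problem mixed-integer: a nonzero producing flux forces y_i = 1, which in turn forces a strictly positive dilution demand for metabolite i that some net synthesis route must cover.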
Applying MDFBA to predict metabolic phenotypes in Escherichia coli
As a benchmark for the prediction performance of MDFBA, we applied it to the genome-scale metabolic network model of E. coli [6] to predict growth rates and gene essentiality under a diverse set of growth media and gene knockouts. The model of Feist et al. [6] accounts for 1,260 metabolic genes, 2,382 reactions and 1,668 metabolites.
Discussion
This study presents MDFBA, a variant of FBA for predicting metabolic flux distributions by accounting for growth-associated dilution of all metabolites in a context-dependent manner. The method predicts feasible flux distributions maximizing the production rate of a predefined biomass while accounting for the dilution of all intermediate metabolites, and most importantly, for all metabolic cofactors involved in the process. MDFBA was shown to successfully predict E. coli's gene essentiality under a variety of growth media and knockout strains, displaying a significant improvement upon the prediction performance of the commonly used FBA method.
MDFBA has two notable limitations, which may contribute to the relatively low improvement in growth rate prediction accuracy (compared to the marked advantage in predicting gene knockout lethality). First, MDFBA employs a uniform lower bound on the dilution rate of intermediate metabolites which, along with the absence of reactions outside the scope of the network model that degrade intermediate metabolites, implicitly reflects the assumption of a uniform concentration of all intermediate metabolites. A natural extension of MDFBA would be to consider different lower and upper bounds on concentrations of different metabolites, based on concentration statistics gathered via metabolomic measurements across a variety of conditions (for example, [24]). Notably though, changing the lower bound employed here to a range of possible values and incorporating an upper bound on dilution rates across all metabolites did not improve the prediction performance (data not shown). Second, MDFBA, similarly to FBA, is based on the assumption that microbial species aim to maximize their growth rate and hence search for feasible flux distributions that maximize biomass synthesis rate. However, previous studies have questioned this hypothesis, suggesting alternative possible optimization criteria. Future studies should investigate the potential usage of such optimization criteria with MDFBA [35]. More generally, CBM methods that do not rely on optimization may also benefit from variants that account for metabolite dilution during growth.
A marked disadvantage of MDFBA is its dependence on MILP, which is computationally more demanding than the LP utilized by FBA. To improve the runtime of MDFBA, the number of integer variables in the MDFBA formulation may be reduced by employing a previous method to identify the metabolic 'scope' of the medium nutrients. Specifically, Handorf et al. [36] investigated the capacity to produce metabolites from available medium nutrients by applying FBA and a network expansion algorithm, resulting in a production scope for each set of medium metabolites. A potential improvement in runtime may be achieved by calculating the scope of the input growth medium and assigning integer variables only for metabolites in that derived scope, as all the other metabolites will never be able to satisfy their dilution demand. Speeding up the runtime may be of importance when applying MDFBA to larger networks, such as the recently published human model [16], or when probing the network under multiple knockout configurations [37, 38].
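The scope idea can be illustrated with a short network expansion routine. This is a generic sketch of forward expansion from a seed set, not the implementation of Handorf et al. [36], and the three-reaction network below is hypothetical; only metabolites that end up in the returned set would need an integer variable.

```python
def scope(reactions, seed):
    """Return all metabolites reachable from `seed` by network expansion.

    `reactions` is an iterable of (substrates, products) pairs. A reaction
    fires once all of its substrates are available, and its products are
    then added to the available set. Iteration stops at a fixed point.
    """
    available = set(seed)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available


# Hypothetical toy network with a cofactor pair (C, Cr):
toy_reactions = [
    (("A", "C"), ("B", "Cr")),   # cofactor-dependent step
    (("Cr",), ("C",)),           # cofactor recycling
    (("X",), ("C",)),            # cofactor biosynthesis from precursor X
]

print(sorted(scope(toy_reactions, {"A"})))        # ['A']
print(sorted(scope(toy_reactions, {"A", "X"})))   # ['A', 'B', 'C', 'Cr', 'X']
```

Without the precursor X in the medium the cofactor pair never enters the scope, so neither C nor Cr would need an integer variable in that condition.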
An interesting comparison can be made between MDFBA and a method developed by Price et al. [39] for eliminating futile cycles via the identification of type III extreme pathways (that is, a unique set of convex basis vectors of the flux distribution solution space that do not include exchange reactions). While the extreme pathways method enables the elimination of thermodynamically impossible loops, MDFBA removes infeasible solutions due to dilution demands. Notably, the latter method also implicitly eliminates type III extreme pathways since these pathways do not satisfy dilution demands of the participating metabolites. Additionally, MDFBA eliminates solutions that do not involve type III extreme pathways as demonstrated in Figure 1b: when metabolite X is absent from the growth medium, the cycle involving reactions v_{4} and v_{8} cannot be activated based on MDFBA, since the dilution of cofactor C cannot be satisfied, although this cycle is not part of a type III extreme pathway.
Another appealing application of MDFBA could be the identification of missing reactions in the model by comparing predicted phenotypes with measured ones, in line with previous works using FBA for this purpose [40]. For example, suppose that in Figure 1a the biosynthetic pathway for metabolite C, through reactions v_{6} and v_{7}, was not included in the model. In this case, MDFBA would predict metabolic flow through reactions v_{2} and v_{3}, such that the enzymes catalyzing these reactions are essential, contrary to experimental essentiality data. A method similar to that used by Reed et al. [40], applied with MDFBA, could then infer the missing reactions, v_{6} and v_{7}. Employing FBA for this purpose would not work since FBA predicts v_{2} and v_{3} to be nonessential, as the activity of reactions v_{4} and v_{8} does not depend on the presence of reactions v_{6} and v_{7}.
While this work applied MDFBA to predict metabolic phenotypes in E. coli, for which a comprehensive and accurate metabolic network model exists, the method can also be applied to any one of a growing number of reconstructed network models [20]. Importantly, the application of MDFBA to other network models is straightforward and requires no model-specific data curation. To facilitate simple usage of MDFBA, we provide an implementation of the method in the supplemental website [41]. A particularly interesting potential application of MDFBA would be for modeling malignant proliferating cells in human cancer, potentially revealing the activity of biosynthetic pathways for various cofactors required to balance their growth-associated dilution. The latter may utilize the recently published models of human cellular metabolism [16, 42]. Overall, we expect that future use of MDFBA will promote improved metabolic phenotypic predictions across a variety of organisms, growth conditions and genetic alterations.
Materials and methods
Metabolite dilution flux balance analysis
In the MDFBA formulation, a mass-balance constraint accounting for the dilution of all active metabolites is given in Equation 3. Equation 4 assigns a dilution rate above a predefined positive threshold (denoted by ε) to active metabolites, that is, those produced at some nonzero rate in the flux distribution v. In our application of the method to E. coli we set ε = 10^{-4} μmol/mg, which represents a common concentration of intermediate metabolites [6]. Notably, the model's predictions were robust to different choices of ε (data not shown). Enzyme directionality and capacity constraints are formulated in Equation 5 by imposing lower and upper bounds on flux values.
A simplified formulation assuming a constant growth rate of μ = 1 in Equation 8 (for calculating the dilution rate of intermediate metabolites) gave qualitatively similar results to the above linear formulation (data not shown). The commercial solver CPLEX running on 64-bit Linux machines was used for solving LP and MILP problems within a few dozen seconds per problem.
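The effect of dilution constraints on a cofactor cycle can be reproduced on a toy network with an open-source solver. The sketch below is an assumption-laden illustration, not the implementation provided with this work: the six-reaction network is hypothetical, all reactions are taken to be irreversible, the dilution threshold is fixed (corresponding to the simplified μ = 1 variant) at an arbitrary value, one binary variable is used per metabolite, and SciPy's `milp` (HiGHS) stands in for CPLEX.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, linprog, milp

# Hypothetical toy network. Metabolites (rows): A, B, C, Cr, X
#   r0: -> A             nutrient uptake
#   r1: A + C -> B + Cr  cofactor-dependent step
#   r2: Cr -> C          cofactor recycling
#   r3: B ->             growth (biomass) reaction
#   r4: -> X             uptake of cofactor precursor X
#   r5: X -> C           cofactor biosynthesis
S = np.array([
    [1, -1,  0,  0, 0,  0],   # A
    [0,  1,  0, -1, 0,  0],   # B
    [0, -1,  1,  0, 0,  1],   # C
    [0,  1, -1,  0, 0,  0],   # Cr
    [0,  0,  0,  0, 1, -1],   # X
], dtype=float)
n_met, n_rxn = S.shape
GROWTH = 3
EPS = 0.01  # assumed uniform minimal dilution rate (mu fixed to 1)


def flux_upper_bounds(x_uptake):
    ub = np.full(n_rxn, 10.0)   # assumed capacity for every reaction
    ub[4] = x_uptake            # 0.0 means X is absent from the medium
    return ub


def fba(x_uptake):
    """Plain FBA: maximize growth subject to S v = 0 and flux bounds."""
    ub = flux_upper_bounds(x_uptake)
    c = np.zeros(n_rxn)
    c[GROWTH] = -1.0
    res = linprog(c, A_eq=S, b_eq=np.zeros(n_met),
                  bounds=[(0.0, u) for u in ub], method="highs")
    return res.x[GROWTH]


def md_fba(x_uptake):
    """Dilution-aware variant over variables [v (fluxes), d (dilution), y (binary)]."""
    ub = flux_upper_bounds(x_uptake)
    n = n_rxn + 2 * n_met
    eye = np.eye(n_met)
    # Mass balance with dilution: S v - d = 0
    balance = LinearConstraint(
        np.hstack([S, -eye, np.zeros((n_met, n_met))]), 0.0, 0.0)
    # Active metabolites are diluted at rate >= EPS: d - EPS * y >= 0
    dilution = LinearConstraint(
        np.hstack([np.zeros((n_met, n_rxn)), eye, -EPS * eye]), 0.0, np.inf)
    # A producing flux marks its product as active: v_j - ub_j * y_i <= 0
    rows = []
    for i, j in zip(*np.nonzero(S > 0)):
        row = np.zeros(n)
        row[j] = 1.0
        row[n_rxn + n_met + i] = -ub[j]
        rows.append(row)
    coupling = LinearConstraint(np.array(rows), -np.inf, 0.0)

    c = np.zeros(n)
    c[GROWTH] = -1.0
    upper = np.concatenate([ub, np.full(n_met, 10.0), np.ones(n_met)])
    integrality = np.concatenate([np.zeros(n_rxn + n_met), np.ones(n_met)])
    res = milp(c, constraints=[balance, dilution, coupling],
               integrality=integrality, bounds=Bounds(np.zeros(n), upper))
    return res.x[GROWTH]


for x_uptake in (0.0, 1.0):
    print("X uptake bound:", x_uptake,
          "FBA growth:", round(fba(x_uptake), 3),
          "dilution-aware growth:", round(md_fba(x_uptake), 3))
```

In this toy network plain FBA reports the same growth flux whether or not X is available, because the C/Cr cycle balances without any net cofactor synthesis. With dilution constraints, growth drops to zero when X is absent and is slightly below the FBA value when X is present, since a small part of the flux goes to covering dilution.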
Abbreviations
AUC: area under curve
CBM: constraint-based modeling
FBA: flux balance analysis
LP: linear programming
MDFBA: metabolite dilution flux balance analysis
MILP: mixed-integer linear programming
OD: optical density
q8: ubiquinone-8 (oxidized)
q8h2: ubiquinone-8 (reduced)
ROC: receiver operating characteristic
Declarations
Acknowledgements
We are grateful to Hadas Zur, Naama Tepper and Yoav Teboulle for their fruitful comments. This study was supported in part by a fellowship from the Edmond J Safra Bioinformatics program at Tel-Aviv University. ER's and TS's research is supported by grants from the Israel Science Foundation.
Authors’ Affiliations
References
1. Fell DA: Understanding the Control of Metabolism. 1996, London: Portland Press
2. Domach MM, Leung SK, Cahn RE, Cocks GG, Shuler ML: Computer model for glucose-limited growth of a single cell of Escherichia coli B/r-A. Biotechnol Bioeng. 2000, 67: 827-840.
3. Price ND, Reed JL, Palsson BO: Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat Rev Microbiol. 2004, 2: 886-897.
4. Stelling J, Klamt S, Bettenbrock K, Schuster S, Gilles E: Metabolic network structure determines key aspects of functionality and regulation. Nature. 2002, 420: 190-193.
5. Oberhardt MA, Palsson BO, Papin JA: Applications of genome-scale metabolic reconstructions. Mol Syst Biol. 2009, 5: 320
6. Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BO: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007, 3: 121
7. Mo M, Palsson B, Herrgard M: Connecting extracellular metabolomic measurements to intracellular flux states in yeast. BMC Syst Biol. 2009, 3: 37
8. Durot M, Le Fevre F, de Berardinis V, Kreimeyer A, Vallenet D, Combe C, Smidtas S, Salanoubat M, Weissenbach J, Schachter V: Iterative reconstruction of a global metabolic model of Acinetobacter baylyi ADP1 using high-throughput growth phenotype and gene essentiality data. BMC Syst Biol. 2008, 2: 85
9. Senger RS, Papoutsakis ET: Genome-scale model for Clostridium acetobutylicum: Part I. Metabolic network resolution and analysis. Biotechnol Bioeng. 2008, 101: 1036-1052.
10. Izallalen M, Mahadevan R, Burgard A, Postier B, Didonato R, Sun J, Schilling CH, Lovley DR: Geobacter sulfurreducens strain engineered for increased rates of respiration. Metab Eng. 2008, 10: 267-275.
11. Mahadevan R, Bond DR, Butler JE, Esteve-Nunez A, Coppi MV, Palsson BO, Schilling CH, Lovley DR: Characterization of metabolism in the Fe(III)-reducing organism Geobacter sulfurreducens by constraint-based modeling. Appl Environ Microbiol. 2006, 72: 1558-1568.
12. Kjeld Raunkaer K, Jens N: In silico genome-scale reconstruction and validation of the Corynebacterium glutamicum metabolic network. Biotechnol Bioeng. 2009, 102: 583-597.
13. Jamshidi N, Palsson B: Investigating the metabolic capabilities of Mycobacterium tuberculosis H37Rv using the in silico strain iNJ661 and proposing alternative drug targets. BMC Syst Biol. 2007, 1: 26
14. Schilling C, Covert M, Famili I, Church G, Edwards J, Palsson B: Genome-scale metabolic model of Helicobacter pylori 26695. J Bacteriol. 2002, 184: 4582-4593.
15. Becker S, Palsson B: Genome-scale reconstruction of the metabolic network in Staphylococcus aureus N315: an initial draft to the two-dimensional annotation. BMC Microbiol. 2005, 5: 8
16. Duarte NC, Becker SA, Jamshidi N, Thiele I, Mo ML, Vo TD, Srivas R, Palsson BO: Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proc Natl Acad Sci USA. 2007, 104: 1777-1782.
17. Varma A, Palsson B: Metabolic capabilities of Escherichia coli. II. Optimal growth patterns. J Theor Biol. 1993, 165: 503-522.
18. Varma A, Palsson B: Stoichiometric flux balance models quantitatively predict growth and metabolic byproduct secretion in wild-type Escherichia coli W3110. Appl Environ Microbiol. 1994, 60: 3724-3731.
19. Kauffman K, Prakash P, Edwards J: Advances in flux balance analysis. Curr Opin Biotechnol. 2003, 14: 491-496.
20. Feist AM, Herrgard MJ, Thiele I, Reed JL, Palsson BO: Reconstruction of biochemical networks in microorganisms. Nat Rev Microbiol. 2009, 7: 129-143.
21. Visser D, Schmid JW, Mauch K, Reuss M, Heijnen JJ: Optimal redesign of primary metabolism in Escherichia coli using lin-log kinetics. Metab Eng. 2004, 6: 378-390.
22. Pramanik J, Keasling JD: Stoichiometric model of Escherichia coli metabolism: incorporation of growth-rate dependent biomass composition and mechanistic energy requirements. Biotechnol Bioeng. 1997, 56: 398-421.
23. Pramanik J, Keasling JD: Effect of Escherichia coli biomass composition on central metabolic fluxes predicted by a stoichiometric model. Biotechnol Bioeng. 1998, 60: 230-238.
24. Bennett BD, Kimball EH, Gao M, Osterhout R, Van Dien SJ, Rabinowitz JD: Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli. Nat Chem Biol. 2009, 5: 593-599.
25. Feist AM, Palsson BO: The growing scope of applications of genome-scale metabolic reconstructions using Escherichia coli. Nat Biotechnol. 2008, 26: 659-667.
26. Kruse K, Ebenhöh O: Comparing flux balance analysis to network expansion: producibility, sustainability and the scope of compounds. Genome Informatics. 2008, 20: 91-101.
27. Martelli C, De Martino A, Marinaric E, Marsili M, Castilloe I: Identifying essential genes in Escherichia coli from a metabolic optimization principle. Proc Natl Acad Sci USA. 2009, 106: 2607-2611.
28. Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BO: Integrating high-throughput and computational data elucidates bacterial networks. Nature. 2004, 429: 92-96.
29. Glasner Jea: ASAP, a systematic annotation package for community analysis of genomes. Nucleic Acids Res. 2003, 31: 147-151.
30. Wu G, Williams HD, Zamanian M, Gibson F, Poole RK: Isolation and characterization of Escherichia coli mutants affected in aerobic respiration: the cloning and nucleotide sequence of ubiG: Identification of an S-adenosylmethionine-binding motif in protein, RNA, and small-molecule methyltransferases. J Gen Microbiol. 1992, 138: 2101-2112.
31. Hsu AY, Poon WW, Shepherd JA, Myles DC, Clarke CF: Complementation of coq3 mutant yeast by mitochondrial targeting of the Escherichia coli UbiG polypeptide: evidence that UbiG catalyzes both O-methylation steps in ubiquinone biosynthesis. Biochemistry. 1996, 35: 9797-9806.
32. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H: Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol. 2006, 2: 2006.0008
33. Joyce A, Reed J, White A, Edwards R, Osterman A, Baba T, Mori H, Lesely S, Palsson B, Agarwalla S: Experimental and computational assessment of conditionally essential genes in Escherichia coli. J Bacteriol. 2006, 188: 8259-8271.
34. Bradley AP: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition. 1997, 30: 1145-1159.
35. Schuetz R, Kuepfer L, Sauer U: Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Mol Syst Biol. 2007, 3: 119
36. Handorf T, Ebenhoh O, Heinrich R: Expanding metabolic networks: scopes of compounds, robustness, and evolution. J Mol Evol. 2005, 61: 498-512.
37. Burgard A, Maranas C: Optimization-based framework for inferring and testing hypothesized metabolic objective functions. Biotechnol Bioeng. 2003, 82: 670-677.
38. Deutscher D, Meilijson I, Kupiec M, Ruppin E: Multiple knockout analysis of genetic robustness in the yeast metabolic network. Nat Genet. 2006, 38: 993-998.
39. Price ND, Famili I, Beard DA, Palsson B: Extreme pathways and Kirchhoff's second law. Biophys J. 2002, 83: 2879-2882.
40. Reed JL, Patel TR, Chen KH, Joyce AR, Applebee MK, Herring CD, Bui OT, Knight EM, Fong SS, Palsson BO: Systems approach to refining genome annotation. Proc Natl Acad Sci USA. 2006, 103: 17480-17484.
41. MDFBA supplemental material. [http://www.cs.technion.ac.il/~tomersh/tools]
42. Ma H, Sorokin A, Mazein A, Selkov A, Selkov E, Demin O, Goryanin I: The Edinburgh human metabolic network reconstruction and its functional analysis. Mol Syst Biol. 2007, 3: 135
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.