-->

Flux Balance Analysis

To Top

Constraint-Based Modeling

Model Description

Flux balance analysis is a mathematical approach used to understand the flow of metabolites through a system to understand biochemical networks. A numerical matrix is formed using the stoichiometric coefficients of each reaction from a genome scale metabolic model (GEM) containing all known metabolic reactions in an organism. The particular GEM we used is the well curated iML1515, representative of the E.coli K-12 MG1655 strain [1]. Stoichiometries and bounds for reactions are used to impose constraints and create a solution space. From this solution space, an optimization function is applied to identify the specific flux distribution across all reactions that maximizes our objective (e.g., L-cysteine export) while satisfying the imposed constraints.

Embedded Image

Figure 1. Flux balance analysis workflow to determine the optimal solution from a solution space and imposed constraints. The figure was reproduced from Orth et. al, 2010 [2]

Constraint-based modeling effectively characterizes the L-cysteine overproduction plasmid. This circuitry is involved in redirecting metabolic flux through existing reactions in the Cysteinator. The L-cysteine production pathways do not work in isolation and using a base GEM helps account for other competing pathways and accurate reallocation of resources without introducing biophysical equations that often require difficult-to-measure kinetic parameters.

A key assumption underlying the model is that the system operates under steady-state conditions. The optimal solution is calculated with the underlying assumption that while metabolite concentrations may fluctuate transiently, they quickly reach a state where production and consumption are balanced. There is no net accumulation or depletion within the system. However, our engineered system is inherently time dependent, as it relies on the gradual accumulation of L-cysteine to trigger the kill switch. Therefore, FBA is mainly used to assess how our mutated enzymes affect overall L-cysteine production and to determine optimal medium conditions. Specific time-dependent behavior is built into future applications of the model.

FBA primarily relies on stoichiometric coefficients, and it often predicts unrealistically high fluxes with a large metabolic solution space. Reducing the size of this solution space generally improves the accuracy of the predicted optimal solution. The solution space can be narrowed by imposing additional fluxes in the form of omics data. To add an additional layer of complexity to the model, enzyme constraints were introduced. Enzyme constraints ensure that fluxes through pathways are capped by enzyme availability and the catalytic efficiency of the enzymes, to avoid arbitrarily high flux predictions. Incorporating enzyme data, as opposed to other omics data, best fits our device design since L-cysteine overproduction is achieved through mutating enzymes in the L-cysteine biosynthesis pathways. Enzyme constraints were added adapting the ECMpy workflow [3]. This specific workflow was used because it adds one overall total enzyme constraint in addition to stoichiometric constraints without altering the GEM. The disadvantages of other common enzyme constraint workflows like GECKO (genome-scale model to account for enzyme constraints, using kinetics and omics) and MOMENT (metabolic modeling with enzyme kinetics), are that they alter the stoichiometric matrix and add in pseudo-reactions/metabolites which significantly increases the model size. Furthermore, ECMpy generates increased accuracy in predictions compared to GECKO, MOMENT, and the original iML1515 model [3].

Methods

The first step in creating the model involved altering the baseline GEM and media conditions to reflect our system. The iML1515 model used is the most complete reconstruction of E. coli K-12 MG1655 to date. It includes 1,515 open reading frames, 2,719 metabolic reactions, and 1,192 metabolites [1]. E. coli K-12 BW25113 serves as the final chassis for implementing the combined genetic circuit. Because the model is designed to predict the behavior of the engineered system in a bioreactor, this specific strain is used as the basis for the simulation. The two K-12 derivatives possess genetic differences with a few gene deletions and allele variations [4]. However, genetic differences are not involved in the L-cysteine production pathways and biomass growth. Thus, we assume negligible differences between the two K-12 derivatives and use iML1515 to model our strain.

The ECMpy workflow identified errors in iML1515 with Gene-Protein-Reaction (GPR) relationships, reaction direction, and others based on the Encyclopedia of Escherichia coli genes and metabolism database [5]. The first step in altering the GEM involved incorporating these changes to reflect EcoCyc.

Embedded Image

Figure 2. The metabolic pathways involved in L-cysteine production and export in E.coli. Dashed lines represent feedback inhibition. The engineered circuits specifically target SerA, CysE, and EamB shown in the pathways. Thiosulfate assimilation pathways are also used to produce L-cysteine. The figure is reproduced from Li et. al, 2023 [6]

To incorporate enzyme constraints, all reversible reactions in iML1515 were split into forward and reverse reactions to assign corresponding forward and reverse Kcat values. Similarly, reactions catalyzed by multiple isoenzymes were split into independent reactions, because they have different associated Kcat values. Molecular weights were calculated using protein subunit composition from EcoCyc [5]. The protein fraction used in the enzyme constraint was set to 0.56 based on literature [7]. E.coli protein abundance data were obtained from the protein abundance database (PAXdb), and Kcat values were obtained from the BRENDA database [8,9]. In addition to the data obtained from literature/databases, certain values were modified to reflect the effects of engineering SerA, CysE, and EamB, the genes involved in L-cysteine production. The changes made to Kcat values reflect fold increases in mutant enzyme activity based on literature review to relax the constraint. It is important to note that SerA is a promiscuous enzyme involved in catalyzing several reactions. Of these reactions, PGCD is mainly involved in L-cysteine production. The modification made to SerA is to remove feedback inhibition by L-serine and glycine to increase the rate of the PGCD reaction. The other reactions catalyzed by SerA, AHGDx and ARHGDx, are inhibited by other compounds apart from L-cysteine and glycine. Thus, we assume the Kcat values for those reactions are unaltered. Gene abundance values were similarly increased based on the modified promoters and copy number. The specific modifications and justifications are detailed in the table below.

Parameter Gene/Enzyme/Reaction Original Value Modified Value Justification/Reference
Kcat_forward PGCD 20 1/s 2000 1/s [10]
Kcat_reverse SERAT 15.79 1/s 42.15 1/s [11]
Kcat_forward SERAT 38 1/s 101.46 1/s [11]
Kcat_forward SLCYSS None 24 1/s [12]
Gene Abundance SerA/b2913 626 ppm 5643000 ppm [13]
Gene Abundance CysE/b3607 66.4 ppm 20632.5 ppm [13]
Gene Subunit CysM/b2421 None 2 [5]

Table 1. Modifications made to the base iML1515 model to reflect the genetic modifications made in the L-cysteine overproduction insert

The original ECMpy workflow disregards Kcat values for transport reactions across the cell membrane because databases like BRENDA contain very little information on transporter proteins and machine learning-based approaches such as UniKP also have limited predictive capability for transport reactions. As a result, current enzyme-constrained models for E. coli include kinetic data only for a subset of reactions, and transporter proteins are often not well represented. Thus, such reactions are assumed to be unconstrained in the model. As a result, we are unable to model the effects of modifying the EamB protein involved in L-cysteine export. It is assumed that all L-cysteine produced in the cell is exported out without any further constraints.

Figure 2 illustrates the specific L-cysteine biosynthesis and export reactions that are most relevant to the model. After performing flux variance analysis, the O-acetyl-L-serine sulfhydrylase and S-sulfo-L-cysteine sulfite lyase reaction pathways involved in thiosulfate assimilation and conversion to L-cysteine were found to be missing from the iML1515 GEM. These reactions are present in E.coli K-12 MG1655 as shown in Figure 2. Thus, gap filling methods were used to update the model to incorporate these key reactions and associated metabolites for L-cysteine production.

The last step to prepare the model before performing FBA was updating medium conditions through altering uptake reaction bounds. The upper bounds for uptake reactions of metabolites for components of the SM1 + Luria-Bertani broth (LB) were altered. LB mainly provides amino acids and trace metals while SM1 provides the carbon source and other key components of growth. The amino acid uptake rates were based on previous literature for the BW25113 strain [14]. The SM1 components are less characterized in literature. Thus, the upper bounds were limited to an approximation based on the initial concentration of the medium component used in wet lab and its corresponding molecular weight. In addition to SM1 and LB, thiosulfate is a key component of the medium since it can directly be assimilated into L-cysteine production pathways. Thus, thiosulfate is also added as a medium component to observe its effects on L-cysteine production. The specific values of the SM1 medium components are listed in Table 2.

Because the model was ultimately optimized for L-cysteine export, any L-serine or L-cysteine added to the medium conditions was instantly converted to L-cysteine for export. L-serine and L-cysteine uptake reactions were blocked in simulations to ensure there was flux through the L-cysteine production pathways shown in Figure 2 to accurately model the system.

Medium Component Associated Uptake Reaction Upper Bound
Glucose EX_glc__D_e_reverse 55.5074491
Citrate EX_cit_e_reverse 5.288207298
Ammonium Ion EX_nh4_e_reverse 554.3237251
Phosphate EX_pi_e_reverse 157.9446141
Magnesium EX_mg2_e_reverse 12.34060058
Sulfate EX_so4_e_reverse 5.746408495
Thiosulfate EX_tsul_e_reverse 44.5950767

Table 2. Upper bounds placed on uptake reactions for metabolites from the SM1 medium

The purpose of these modifications was to accurately predict flux distributions, growth rates, and the overall effect on cell metabolism. After altering the base GEM, the enzyme constrained GEM model was created using methods from the ECMpy package. All FBA optimizations were performed using the COBRApy package [15].

The optimization function was initially set to be L-cysteine export. However, only optimizing for L-cysteine export leads to solutions where biomass growth is 0. This does not accurately reflect the growth of bacteria in culture. Thus, lexicographic optimization was performed. The model was first optimized for biomass growth and then the model was constrained to require a percentage of the growth. We initially chose to constrain growth to 30% of the optimized growth as a conservative approximation. L-cysteine export was again optimized after requiring growth. We expect to constrain biomass growth to experimental data from growth curves to replace using a conservative approximation and increase accuracy of the model.

All code and supplemental files can be found under the FBA folder in the team’s Github linked on the overview page

Results

Effects of Modifications on L-cysteine production

The first aim of the FBA model was to understand how the modified enzymes and promoters in the designed circuits affect overall L-cysteine production/export and biomass growth. We first ran an initial lexicographic optimization without any of the modifications detailed in Table 1. The fluxes through biomass and L-cysteine export reactions were noted. The same optimization was run after modifying the model. We expected biomass growth and L-cysteine export both to increase. The results from Table 3 compare the different simulations and validate our design and its ability to overproduce L-cysteine with a near two-fold increase.

Simulation Biomass/Growth Rate (h^-1) L-cysteine Export Flux (mmol/gDW*h)
Unmodified 0.117 7.263
Modified 0.201 14.03
Fold Change 1.71 1.93

Table 3. Effects of modifications to the GEM on overall growth rate and L-cysteine export

Embedded Image

Figure 3. Top 15 reactions by flux as predicted by the optimal solution from FBA

While performing optimization, the model predicts how fluxes are carried through metabolic networks/pathways. The top 15 reactions carrying the highest flux are illustrated in Figure 3. It is noted that most of these reactions involve using glucose and acetate for biomass growth. Upon further analysis of the fluxes through all pathways, the L-cysteine production pathways depicted in Figure 2 all carried flux. Thus, we can expect the modifications made in genetic circuits used in wet lab to similarly increase L-cysteine production experimentally.

Media Conditions

Another mechanism for adding constraints before performing FBA is through media conditions and limiting the metabolites available for uptake. The initial simulations were run using SM1, LB, and thiosulfate. Because adding thiosulfate is an additional cost to our device design and function, the model was used to understand how thiosulfate and carbon sources like glucose affect biomass growth and L-cysteine production.

Embedded Image

Figure 4 A-B. Heatmap on the effects of glucose and thiosulfate uptake on biomass growth and L-cysteine export

From Figure 4B, it is apparent that after thiosulfate uptake reaches around 8 mmol/gDW*h, L-cysteine export does not increase significantly and plateaus. These results are critical for understanding how much thiosulfate to add to the medium to optimize L-cysteine production without added costs, especially for making the device cost-effective for industrial application. Biomass growth similarly does not continue to increase linearly as glucose uptake increases. It is important to note that because FBA predicts fluxes assuming steady state conditions, specific optimal concentrations cannot be determined from this model. However, the results still provide strong insights to consider when preparing the device for industrial application.

Predicting Flux Under Varying Conditions

Furthermore, these conditions are likely to change if our device were to be implemented in a bioreactor. Dark fermentation bioreactors contain wastewater and sludge for biohydrogen production. Through modifying and testing various media conditions, we can predict how our device will behave under these different conditions and use the results further down the line for industrial use. Wastewater used in bioreactors commonly comes in three forms—real wastewater (i.e. from factories), synthetic wastewater, and solid waste. Sugar factory wastewater (i.e. from beet sugar) and synthetic wastewater contain several of the same components like sucrose and acetate [16]. Solid waste is typically composed of fruit/vegetable waste and office paper waste, which contain glucose, fructose, and xylose [17]. The original SM1 conditions had glucose as the main carbon source. To test different bioreactor conditions, two representative medium conditions were developed to represent real/synthetic wastewater and solid waste. The upper bounds for uptake rates were calculated similar to the SM1 medium based on the concentration of each medium component found in literature and the component’s molecular weight. Although we expect to add our device with the SM1 medium to wastewater, the media conditions tested excluded SM1 medium components to understand the effects of the wastewater without other contributing factors. The real/synthetic wastewater conditions incorporated sucrose, acetate, butyrate, lactate, and ammonium in addition to amino acids and trace metals. The solid waste conditions incorporated fructose and xylose in addition to amino acids and trace metals. The uptake bounds for amino acids and trace metals were kept the same from initial simulations.

Embedded Image

Figure 5 A-D. Heatmaps plotting biomass growth and L-cysteine export based on the uptake of various metabolites under varying conditions. A-B: Real/synthetic wastewater conditions. C-D: Solid waste conditions

While testing the real/synthetic wastewater conditions, the main carbohydrate added to the medium was sucrose. Thus, sucrose was initially expected to be used for growth in place of glucose. However, after running optimization and evaluating Figure 5A, it was apparent the model did not follow the initial conjecture. Biomass growth was independent of sucrose uptake. With further evaluation of the iML1515 model, it was found that the model did not contain downstream reactions for hydrolyzing sucrose 6-phosphate to use it as a carbon source. This is because the E.coli K-12 strain and its derivatives lack the csc operon, which is responsible for sucrose catabolism [18]. This operon is found in other strains like E.coli EC3132 and E.coli W. Although the specific strains tested in wet lab are E.coli K-12 derivatives, future directions for the project include testing the genetic circuits in various chassis. Thus, the same type of testing and analysis can be used to evaluate the effect of sucrose on the functionality of the device in different chassis.

Furthermore, biomass growth and L-cysteine export were still possible without sucrose or glucose present. From analyzing the reactions carrying flux in the model, acetate and threonine uptake reactions were found to carry high fluxes. These reactions were tested to see if they had an effect on biomass growth. From the analysis, it was seen that the model was able to utilize threonine and amino acids for growth. These results highlight the flexibility of our device to function under various and nonideal conditions, necessary for a device to be implemented in variable environments.

In addition to real/synthetic wastewater, solid waste was also tested. To determine the effects of different carbon sources like fructose and xylose, glucose was removed from the medium conditions although it is commonly present in solid waste. These simulations yielded slightly higher biomass growth and L-cysteine export flux of 0.222 h^-1 and 14.17 mmol/gDW*h, respectively, compared to initial simulations. With further analysis, it was observed that these results were mainly attributed to the addition of fructose to the medium. E.coli K-12 was able to naturally metabolize this carbon source. These results are shown in Figure 5C-D. Thus, we expect higher yields with the addition of glucose as well. Thiosulfate uptake similarly plateaus and does not contribute to increased L-cysteine export after reaching a certain uptake rate. This signifies that once an optimal concentration of thiosulfate is determined, adding more thiosulfate does not make the system more effective.

Summary

The FBA model, incorporating the iML1515 GEM and enzyme constraints, provided insights into the effects of genetic modifications on the metabolic network of E.coli K-12 BW23115 and its ability to produce and export L-cysteine. The results validated the effect of the genetic modifications made experimentally and predicted a 1.71-fold increase in biomass growth and a 1.93-fold increase in L-cysteine export. After testing media conditions, the model simulations showed that thiosulfate and glucose uptake plateaued after reaching certain rates, revealing that the cost-effective aspect of the device design can be optimized by identifying the ideal concentrations of these medium components. Testing various media conditions representative of dark fermentation bioreactor environments demonstrated the metabolic flexibility of the design to use various carbon sources like fructose and threonine for growth, while identifying limitations such as the K-12 strain’s inability to metabolize sucrose. These findings offer actionable insights for scaling the device and selecting the final industrial chassis.

Future Directions

Future directions for the project include evaluating different chassis apart from E.coli K-12 strains. By utilizing different base models for different bacterial strains and implementing the same modifications and testing various media conditions, the results can be used to evaluate the most effective chassis and medium conditions in terms of L-cysteine production and adaptability under varying conditions. The accuracy of the model could also be increased by including other forms of omics data like expression and thermodynamics data. Furthermore, sensitivity analysis can be performed on all modified parameters and upper bounds on medium components to understand how sensitive the results are to the input parameters and medium conditions.

References