Genome Scale Metabolic model
What is this model? It is a genome scale metabolic model for Pichia pastoris.
What does it do? It takes a network representation of the entire metabolism of the yeast to optimize metabolic fluxes towards an objective.
What does it tell us? It tells us how growing the yeast on different carbon sources affects recombinant protein production.
Abstract
Recombinant protein production in P. pastoris relies on two-phase production systems that direct metabolic fluxes first to biomass and then to product formation. These systems avoid a common limitation of strong promoters, where excessive induction impedes growth and affects the entire production process. Mathematical modeling plays a significant role in bioprocess design. Genome-scale metabolic modeling in particular is commonly used because it represents the entire metabolic network of the cell and avoids reductionist methodologies. Here we present a genome-scale metabolic model of P. pastoris and use it to simulate the effects of different carbon sources on the production of recombinant protein.
Introduction
The joint development of synthetic biology, metabolic engineering, and systems biology has produced multiple tools for bioengineering. Because the complexity of biological systems hampers the early development of cell factories, researchers have developed tools to represent the interconnected components of living organisms. Among them are genome-scale metabolic models (GEMs) (Bernstein et al., 2021; Bi et al., 2022; Blazeck & Alper, 2010; Capela et al., 2022; Cuevas et al., 2016). At the core of GEMs lies the systematic reconstruction of the metabolic network from whole-genome sequences. This process requires the annotation of genes encoding metabolic enzymes, the assignment of functions based on biochemical databases, and the assembly of metabolic reactions into an interconnected network (Capela et al., 2022; Fang et al., 2020).
The primary framework used for simulating GEMs is constraint-based modeling (CBM), which operates under a set of well-defined mathematical rules rather than requiring detailed kinetic parameters, a significant advantage given the scarcity of such parameters for most cellular reactions (Domenzain et al., 2022; Fang et al., 2020). CBM assumes a quasi-steady state, meaning that the model represents a moment in time at which metabolite concentrations remain constant. The reactions of the network can be represented as a stoichiometric matrix (S) of dimensions m × n, where m is the number of metabolites in the system and n is the number of reactions (Blazeck & Alper, 2010; Cuevas et al., 2016). The steady-state mass balance is supplemented with physiological constraints. Each reaction flux, $v_j$, is bounded by a minimum ($LB_j$) and a maximum ($UB_j$) value, reflecting the physical and thermodynamic limits of the reaction. These bounds define the solution space of feasible flux distributions.
To predict a single, biologically relevant state from this vast space of possibilities, an objective function is defined. Typically, this function represents a key cellular goal, most often biomass formation, which stands for the synthesis of all cellular components required for growth and proliferation. The optimization of this objective function subject to the mass balance and flux bounds is performed using linear programming through a method called Flux Balance Analysis (FBA) (Anand et al., 2020; Bi et al., 2022; Cuevas et al., 2016; Damiani et al., 2017; Gianchandani et al., 2010; Nagrath et al., 2010; Orth et al., 2010).
The formulation of FBA is as follows:
Each chemical species in the system has the following mass balance equation:

$$v_{\mathrm{acc},i} = v_{\mathrm{in},i} + v_{\mathrm{gen},i} - v_{\mathrm{cons},i} - v_{\mathrm{out},i}$$

where $i$ is the chemical species, $v$ represents the different fluxes, $v_{\mathrm{acc},i}$ is the term for accumulation, $v_{\mathrm{in},i}$ is the term for entry, $v_{\mathrm{gen},i}$ for generation, $v_{\mathrm{cons},i}$ for consumption, and $v_{\mathrm{out},i}$ for exit. Steady state means that there is no change over time:

$$v_{\mathrm{acc},i} = 0$$

subject to upper and lower bounds (UB and LB respectively) on each flux:

$$LB_j \le v_j \le UB_j$$

The network of reactions is represented as a matrix (S) of dimensions m × n, where m is the number of metabolites (with one mass balance equation for each of them) and n is the number of reactions. It is complemented by a vector of dimensions n × 1 with the value of each flux ($v$) (Blazeck & Alper, 2010; Cuevas et al., 2016; Orth et al., 2010). Under the steady-state assumption:

$$S \cdot v = 0$$

FBA takes this to formulate an optimization problem, traditionally a single-objective optimization problem that seeks to maximize or minimize an objective function:

$$\max \ \text{or} \ \min \ Z = v_x$$

where $x$ is the specific flux to optimize, usually the production of biomass or of a product for maximization, or the generation of byproducts and the consumption of expensive substrates for minimization (Chen et al., 2022; Gianchandani et al., 2010; Orth et al., 2010).
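The formulation above can be illustrated on a toy network. The sketch below is only an illustration, written in Python with SciPy rather than in the COBRA Toolbox used for the actual analysis. The two-metabolite network, the reaction names, and the bounds are hypothetical. Uptake is written here as a positive forward flux, which differs from the exchange-reaction sign convention of the model.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical, not iMT1026): 2 metabolites (A, B) x 4 reactions.
reactions = ["uptake", "conversion", "biomass", "byproduct"]
S = np.array([
    [1, -1,  0, -1],  # A: made by uptake, used by conversion and byproduct
    [0,  1, -1,  0],  # B: made by conversion, used by biomass
], dtype=float)

# Flux bounds LB_j <= v_j <= UB_j; uptake is capped at 10 (arbitrary units).
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

# Objective: maximize the biomass flux. linprog minimizes, so negate it.
c = np.zeros(len(reactions))
c[reactions.index("biomass")] = -1.0

# Steady state is imposed as the equality constraint S v = 0.
res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
fluxes = dict(zip(reactions, res.x))
print(res.message)
print({name: round(float(value), 6) for name, value in fluxes.items()})
```

In this toy case all carbon taken up is routed to biomass, so the optimal biomass flux equals the uptake limit of 10 and the byproduct flux is zero.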
GEMs have been used extensively to develop strategies to optimize P. pastoris. By simulating changes in intracellular metabolism induced by heterologous protein expression, FBA models have identified key reactions that divert precursors and reducing equivalents away from biomass synthesis toward protein production. This insight has led to the identification of gene targets for overexpression or deletion in order to redirect metabolic fluxes and enhance product yields (Chung et al., 2010; Morales et al., 2014; Torres et al., 2019). Studies have also compared the use of methanol, glycerol, glucose, and sorbitol by simulating cell growth and carbon conversion efficiencies under defined substrate uptake rates. These analyses have revealed that glycerol and sorbitol are particularly promising substrates for recombinant protein production due to their favorable growth yields, while methanol, despite its efficiency in ATP generation, poses a challenge because it produces unfavorable by-products such as formaldehyde (Chung et al., 2010; Tomàs-Gamisans et al., 2019). A common strategy with GEMs is to simulate the effects of overexpression, knockout, or downregulation of specific enzymatic reactions. These models predict how alterations in central carbon metabolism can lead to enhanced recombinant protein production. For example, manipulations aimed at boosting NADH regeneration or reducing by-product formation have been predicted to improve the overall energy balance of the cell, thereby allowing greater flux toward heterologous protein synthesis (Chung et al., 2010).
In this section we present a genome-scale metabolic model for P. pastoris and use FBA to simulate the effects of different carbon sources.
Methodology
1. The analysis was run using the COBRA Toolbox in MATLAB R2023a (Heirendt et al., 2019). We used a modified version of the P. pastoris model iMT1026v3 (Tomàs-Gamisans et al., 2018). Several modifications were applied to ensure the functionality and consistency of the metabolic model during flux balance analysis (FBA):
- Blocked reactions were removed.
- All exchange flux upper bounds were set to 1000 to allow metabolite exchange.
- Metabolites lacking annotated charge values were assigned a neutral charge (0).
- Dead-end metabolites were deleted according to MEMOTE test results.
- The model was confirmed to be auxotrophic for biotin and oxidized glutathione, consistent with previous studies (e.g., Genome-scale metabolic model analysis of Pichia pastoris for enhancing the production of S-adenosyl-L-methionine).
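Two of the modifications listed above, opening the exchange upper bounds and assigning a neutral charge to unannotated metabolites, can be sketched on a toy model. The dictionary layout, the helper names, and all entries below are hypothetical and do not reflect the COBRA Toolbox model structure. The sketch only shows the kind of bookkeeping involved.

```python
# Hypothetical toy model stored as plain dictionaries.
model = {
    "metabolites": {
        "glc": {"charge": 0},
        "pyr": {"charge": -1},
        "btn": {"charge": None},  # charge annotation missing
    },
    "reactions": {
        "Ex_glc": {"exchange": True, "lb": -10, "ub": 0},
        "Ex_btn": {"exchange": True, "lb": -4e-5, "ub": 0},
        "internal_1": {"exchange": False, "lb": 0, "ub": 500},
    },
}


def open_exchange_upper_bounds(model, ub=1000):
    """Set the upper bound of every exchange reaction to `ub`."""
    for rxn in model["reactions"].values():
        if rxn["exchange"]:
            rxn["ub"] = ub


def fill_missing_charges(model, default=0):
    """Assign `default` to metabolites without a charge; return their names."""
    changed = []
    for name, met in model["metabolites"].items():
        if met["charge"] is None:
            met["charge"] = default
            changed.append(name)
    return changed


open_exchange_upper_bounds(model)
neutralized = fill_missing_charges(model)
print("Metabolites set to neutral charge:", neutralized)
```

Internal reactions and existing charge annotations are left untouched, so the changes stay limited to the two cases named in the list.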
2. Pichia Strain Background
- The strain background is Pichia pastoris GS115 (his4−), which is auxotrophic for histidine.
- However, the current metabolic model behaves as prototrophic, not requiring external histidine supplementation for growth.
3. Model Processing and Reduction
The reduceModel function from the COBRA Toolbox was used to:
- Remove blocked and unbounded reactions.
- Simplify the network.
- Correct inconsistencies related to auxotrophies.
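A common way to identify blocked reactions is to compute the minimum and maximum flux of each reaction under the steady-state constraint and flag those whose range is zero. The sketch below illustrates that idea in Python with SciPy on a hypothetical toy network. It is not the implementation of reduceModel, and the reaction names and tolerance are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical): metabolite C is produced but never consumed.
reactions = ["uptake", "conversion", "biomass", "byproduct", "dead_end"]
S = np.array([
    [1, -1,  0, -1, -1],  # A
    [0,  1, -1,  0,  0],  # B
    [0,  0,  0,  0,  1],  # C: produced by dead_end, no consuming reaction
], dtype=float)
bounds = [(0, 10)] + [(0, 1000)] * 4


def flux_range(S, bounds, j):
    """Return (min, max) of flux j subject to S v = 0 and the flux bounds."""
    c = np.zeros(S.shape[1])
    c[j] = 1.0
    b_eq = np.zeros(S.shape[0])
    low = linprog(c, A_eq=S, b_eq=b_eq, bounds=bounds, method="highs")
    high = linprog(-c, A_eq=S, b_eq=b_eq, bounds=bounds, method="highs")
    return low.fun, -high.fun


tol = 1e-9  # assumed numerical tolerance for "zero flux"
blocked = [
    name for j, name in enumerate(reactions)
    if all(abs(x) < tol for x in flux_range(S, bounds, j))
]

# Drop the blocked columns from the stoichiometric matrix.
keep = [j for j, name in enumerate(reactions) if name not in blocked]
S_reduced = S[:, keep]
print("Blocked reactions:", blocked)
```

Because nothing consumes metabolite C, the steady-state balance forces the dead_end flux to zero, which is why it is the only reaction flagged here.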
4. Base Model and Gene Annotation
- The base model used was iMT1026v3, described in Fine-tuning the P. pastoris iMT1026 genome-scale metabolic model for improved prediction of growth on methanol or glycerol as sole carbon sources (Tomàs-Gamisans et al., 2018).
- This model successfully simulates growth on methanol as the sole carbon source.
- Genes from the single-chain construct were integrated into iMT1026v3, and corresponding biosynthetic reactions were manually added.
- Gene annotations were obtained using KAAS (KEGG Automatic Annotation Server) for the proteins CAH2445710.1, which correspond to gene IDs from the Genome assembly KP_7435-4, reported in OPENPichia: Licence-free Komagataella phaffii chassis strains and toolkit for protein expression (Claes et al., 2024).
5. Flux Balance Analysis (FBA) Configuration
The following FBA configuration was applied:
- Objective function: Ex_scFVLR (single-chain product export).
- Internal biomass flux: fixed at 0.1 h−1, with retry optimization.
- O2 uptake: allowed (OK).
- CO2: secretion enabled (OUT).
- Minerals: unconstrained (±1000).
- Ex_btn (biotin exchange): lower bound = –4×10−5.
- Carbon source (C): lower bound = –10.
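The core of this configuration, a fixed biomass flux, a bounded carbon uptake, and product export as the objective, can be sketched on a toy network. The example below uses Python with SciPy rather than the COBRA Toolbox. The reaction names Ex_C and Ex_P, the stoichiometric coefficients, and the omission of O2, CO2, minerals, and biotin are simplifying assumptions. Only the uptake bound of −10 and the biomass value of 0.1 are taken from the list above.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Exchange reactions are written as
# "metabolite -> (outside)", so a negative exchange flux means uptake.
reactions = ["Ex_C", "biomass", "synthesis", "Ex_P"]
S = np.array([
    [-1, -20, -4,  0],  # C: carbon source (coefficients are made up)
    [ 0,   0,  1, -1],  # P: product
], dtype=float)

config = {
    "objective": "Ex_P",            # product export
    "bounds": {
        "Ex_C": (-10, 1000),        # carbon source lower bound = -10
        "biomass": (0.1, 0.1),      # biomass flux fixed at 0.1
        "synthesis": (0, 1000),
        "Ex_P": (0, 1000),
    },
}

# linprog minimizes, so the objective coefficient is negated.
c = np.zeros(len(reactions))
c[reactions.index(config["objective"])] = -1.0
bounds = [config["bounds"][name] for name in reactions]

res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
fluxes = dict(zip(reactions, res.x))
print({name: round(float(value), 6) for name, value in fluxes.items()})
```

With these made-up coefficients the fixed biomass flux consumes 2 of the 10 available carbon units, and the remaining 8 support a product export flux of 2.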
6. Mapped Metabolic Pathways
Fluxes were mapped and analyzed in the following central metabolic subsystems:
- Glycolysis / Gluconeogenesis
- Oxidative Phosphorylation
- Pyruvate Metabolism
- Citric Acid Cycle (TCA Cycle)
- Pentose Phosphate Pathway (PPP)
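One way to summarize a flux distribution by subsystem is to average the absolute flux of the reactions assigned to each one. The sketch below assumes that summary statistic. The reaction identifiers, flux values, and subsystem assignments are hypothetical placeholders rather than results from the model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flux values and subsystem assignments.
fluxes = {"PGI": 4.0, "PFK": 5.0, "CS": 1.5, "ACONT": -1.5, "G6PDH": 0.8}
subsystem_of = {
    "PGI": "Glycolysis / Gluconeogenesis",
    "PFK": "Glycolysis / Gluconeogenesis",
    "CS": "Citric Acid Cycle (TCA Cycle)",
    "ACONT": "Citric Acid Cycle (TCA Cycle)",
    "G6PDH": "Pentose Phosphate Pathway (PPP)",
}


def mean_abs_flux_by_subsystem(fluxes, subsystem_of):
    """Group reactions by subsystem and average their absolute fluxes."""
    grouped = defaultdict(list)
    for rxn, value in fluxes.items():
        grouped[subsystem_of[rxn]].append(abs(value))
    return {name: mean(values) for name, values in grouped.items()}


summary = mean_abs_flux_by_subsystem(fluxes, subsystem_of)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Absolute values are used so that reversible reactions running in the reverse direction do not cancel forward fluxes within the same subsystem.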
Results and discussion
We first examined the internal subsystem fluxes predicted by the model (figure 1). Mean fluxes are generally similar across carbon sources, with the exception of fructose.
We then simulated the effect of different carbon sources on growth and yield. We fixed growth at 0.1 h−1 and calculated the yield of biomass and of our product, scFVLR (table 1). Because biomass was fixed, simulating chemostat conditions, we obtained the same biomass yield for all carbon sources. Product yield, however, varied. The highest yields were obtained with sorbitol (Ex_sbt_D) and mannitol (Ex_mnl), followed closely by glucose and fructose, while methanol gave the lowest yield. This is consistent with existing reports that describe glucose as energetically efficient and methanol as energy inefficient as a sole carbon source. However, methanol offers an important tool for controllable expression. Methanol-inducible promoters such as AOX1 are commonly used in production systems with P. pastoris. These promoters allow a two-phase process, with the first phase dedicated to growth and the second phase dedicated to production (Gündüz Ergün et al., 2019; Tomàs-Gamisans et al., 2018; Torres et al., 2019; Xu et al., 2018).
Table 1. Biomass (x) and product (p) yields per carbon source with fixed biomass.
| Carbon source (exchange reaction) | Objective rate (product export flux) | Yxs (biomass yield) | Yps (product yield) |
|---|---|---|---|
| Ex_glc_D | 0.680910122 | 0.014285714 | 0.097272875 |
| Ex_glyc | 0.351197913 | 0.014285714 | 0.05017113 |
| Ex_sbt_D | 0.731806659 | 0.014285714 | 0.104543808 |
| Ex_mnl | 0.73180665 | 0.014285714 | 0.104543807 |
| Ex_meoh | 0.011715122 | 0.014285714 | 0.001673589 |
| Ex_fru | 0.680909957 | 0.014285714 | 0.097272851 |
| Ex_xyl_D | 0.536239157 | 0.014285714 | 0.076605594 |
| Ex_xylt | 0.601426957 | 0.014285714 | 0.085918137 |
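The values in table 1 can be cross-checked and ranked with a few lines of Python. The sketch assumes that each yield was obtained by dividing a rate by a single substrate uptake value, a formula the text does not state. Under that assumption, the ratio of product yield to objective rate should be the same in every row.

```python
# Values copied from table 1: (objective rate, Yxs, Yps) per exchange reaction.
table = {
    "Ex_glc_D": (0.680910122, 0.014285714, 0.097272875),
    "Ex_glyc": (0.351197913, 0.014285714, 0.05017113),
    "Ex_sbt_D": (0.731806659, 0.014285714, 0.104543808),
    "Ex_mnl": (0.73180665, 0.014285714, 0.104543807),
    "Ex_meoh": (0.011715122, 0.014285714, 0.001673589),
    "Ex_fru": (0.680909957, 0.014285714, 0.097272851),
    "Ex_xyl_D": (0.536239157, 0.014285714, 0.076605594),
    "Ex_xylt": (0.601426957, 0.014285714, 0.085918137),
}

# Ratio of product yield to objective rate for each carbon source.
ratios = {src: yps / rate for src, (rate, _yxs, yps) in table.items()}

# Divisor implied by each row if yield = rate / divisor.
implied_divisor = {src: 1.0 / r for src, r in ratios.items()}

# Rank carbon sources from highest to lowest product yield.
ranking = sorted(table, key=lambda src: table[src][2], reverse=True)

for src in ranking:
    print(f"{src:10s} Yps={table[src][2]:.6f} implied divisor={implied_divisor[src]:.4f}")
```

Every row gives an implied divisor of approximately 7, so the tabulated product yields are internally consistent with the objective rates. The ranking places Ex_sbt_D and Ex_mnl first and Ex_meoh last.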
NADPH is commonly a limiting factor in the expression of recombinant proteins. We analyzed the fluxes through NADPH-related reactions on the same carbon sources (figure 2). The largest median NADPH flux was obtained with xylose, followed by glucose. Methanol had the second-lowest median flux through NADPH-related reactions, after glycerol.
To further compare the two substrates most commonly used in recombinant protein production, glucose and methanol, we carried out a two-phase simulation. In phase A, biomass is unconstrained and O2 uptake is swept alongside substrate uptake. In phase B, O2 uptake is unconstrained and biomass is fixed at each point of a grid of values. As can be seen in figure 3, with methanol as the carbon source biomass competes heavily with protein production. While glucose consistently produces more than methanol, the biggest difference occurs when biomass is not fixed.
During bioproduction, strong promoters risk redirecting excessive resources towards protein production. While this increases production per cell, overall protein production is a function of biomass (more cells mean more protein). The advantage of promoters such as AOX1 and HpFMD is that they allow controllable production. The first phase allows biomass to grow using glucose as the carbon source: not only is glucose energetically efficient, it also represses protein expression. The second phase induces expression, in the case of AOX1 with methanol. Although methanol is less energetically efficient than glucose, the cellular machinery can be directed towards protein production without affecting the viability of the process (De et al., 2021; Pan et al., 2022; Tomàs-Gamisans et al., 2018; Xu et al., 2018). In this project we proposed the use of the HpFMD promoter, which has two advantages over AOX1: it achieves stronger induction under methanol, and it has a de-repression mechanism that makes it active in the absence of glucose without needing induction. Based on the FBA results presented here, we believe that HpFMD could be used to produce proteins in a similar two-phase system, using a different carbon source for the second phase. As mentioned above, methanol is energy inefficient. While the expression activity of the de-repressed promoter is lower than when it is induced by methanol, it is possible that a more efficient carbon source could yield more protein overall, in addition to eliminating a toxic chemical from the production process (Tomàs-Gamisans et al., 2018; Vogl et al., 2016, 2020).
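The logic of the two-phase simulation can be sketched on a toy network: first maximize biomass with no constraint on growth, then fix biomass at several values and maximize product export at each one. The example below uses Python with SciPy. The network, the reaction names, the stoichiometric coefficients, and the grid fractions are hypothetical, and the O2 sweep of phase A is left out for brevity.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network; a negative exchange flux means uptake.
reactions = ["Ex_C", "biomass", "synthesis", "Ex_P"]
S = np.array([
    [-1, -20, -4,  0],  # C: carbon source (coefficients are made up)
    [ 0,   0,  1, -1],  # P: product
], dtype=float)


def optimize(objective, biomass_bounds):
    """Maximize `objective` subject to S v = 0 and the given biomass bounds."""
    c = np.zeros(len(reactions))
    c[reactions.index(objective)] = -1.0  # linprog minimizes
    bounds = [(-10, 1000), biomass_bounds, (0, 1000), (0, 1000)]
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
    if res.status != 0:
        raise RuntimeError(res.message)
    return dict(zip(reactions, res.x))


# Phase A: biomass is free and is itself the objective.
mu_max = optimize("biomass", (0, 1000))["biomass"]

# Phase B: biomass is fixed on a grid and product export is maximized.
fractions = [0.0, 0.25, 0.5, 0.75]  # assumed grid, as fractions of mu_max
grid = [f * mu_max for f in fractions]
product = [optimize("Ex_P", (mu, mu))["Ex_P"] for mu in grid]

for mu, p in zip(grid, product):
    print(f"biomass={mu:.3f}  product={p:.3f}")
```

In this toy case product export falls linearly as the fixed biomass flux rises, because both draw on the same limited carbon uptake. This is the kind of competition between growth and production that the two-phase comparison is meant to expose.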
Conclusion
Methanol is the least energetically efficient of the carbon sources tested, while glucose is among the most efficient. This opens the way for alternative promoters that still allow production systems with separate growth and production phases but do not rely on methanol as the carbon source. The HpFMD promoter presented as part of the project could be used here, since it shows strong de-repression activity. It could also be used in combinatorial designs where methanol and a non-repressing carbon source are used together. Our models reflect known trends in bioengineering, where expression control is necessary for optimal recombinant protein production.
Bibliography
Anand, S., Mukherjee, K., & Padmanabhan, P. (2020). An insight to flux-balance analysis for biochemical networks. Biotechnology and Genetic Engineering Reviews, 36(1), 32–55. https://doi.org/10.1080/02648725.2020.1847440
Bernstein, D. B., Sulheim, S., Almaas, E., & Segrè, D. (2021). Addressing uncertainty in genome-scale metabolic model reconstruction and analysis. Genome Biology, 22(1), 64. https://doi.org/10.1186/s13059-021-02289-z
Bi, X., Liu, Y., Li, J., Du, G., Lv, X., & Liu, L. (2022). Construction of Multiscale Genome-Scale Metabolic Models: Frameworks and Challenges. Biomolecules, 12(5), 721. https://doi.org/10.3390/biom12050721
Blazeck, J., & Alper, H. (2010). Systems metabolic engineering: Genome-scale models and beyond. Biotechnology Journal, 5(7), 647–659. https://doi.org/10.1002/biot.200900247
Capela, J., Lagoa, D., Rodrigues, R., Cunha, E., Cruz, F., Barbosa, A., Bastos, J., Lima, D., Ferreira, E. C., Rocha, M., & Dias, O. (2022). Merlin, an improved framework for the reconstruction of high-quality genome-scale metabolic models. Nucleic Acids Research, 50(11), 6052–6066. https://doi.org/10.1093/nar/gkac459
Chen, Y., Li, F., & Nielsen, J. (2022). Genome-scale modeling of yeast metabolism: Retrospectives and perspectives. FEMS Yeast Research, 22(1), foac003. https://doi.org/10.1093/femsyr/foac003
Chung, B. K., Selvarasu, S., Camattari, A., Ryu, J., Lee, H., Ahn, J., Lee, H., & Lee, D.-Y. (2010). Genome-scale metabolic reconstruction and in silico analysis of methylotrophic yeast Pichia pastoris for strain improvement. Microbial Cell Factories, 9(1), 50. https://doi.org/10.1186/1475-2859-9-50
Claes, K., Van Herpe, D., Vanluchene, R., Roels, C., Van Moer, B., Wyseure, E., Vandewalle, K., Eeckhaut, H., Yilmaz, S., Vanmarcke, S., Çıtak, E., Fijalkowska, D., Grootaert, H., Lonigro, C., Meuris, L., Michielsen, G., Naessens, J., van Schie, L., De Rycke, R., … Callewaert, N. (2024). OPENPichia: Licence-free Komagataella phaffii chassis strains and toolkit for protein expression. Nature Microbiology, 9(3), 864–876. https://doi.org/10.1038/s41564-023-01574-w
Cuevas, D. A., Edirisinghe, J., Henry, C. S., Overbeek, R., O’Connell, T. G., & Edwards, R. A. (2016). From DNA to FBA: How to Build Your Own Genome-Scale Metabolic Model. Frontiers in Microbiology, 7. https://doi.org/10.3389/fmicb.2016.00907
Damiani, C., Di Filippo, M., Pescini, D., Maspero, D., Colombo, R., & Mauri, G. (2017). popFBA: Tackling intratumour heterogeneity with Flux Balance Analysis. Bioinformatics, 33(14), i311–i318. https://doi.org/10.1093/bioinformatics/btx251
De, S., Mattanovich, D., Ferrer, P., & Gasser, B. (2021). Established tools and emerging trends for the production of recombinant proteins and metabolites in Pichia pastoris. Essays in Biochemistry, 65(2), 293–307. https://doi.org/10.1042/EBC20200138
Domenzain, I., Sánchez, B., Anton, M., Kerkhoven, E. J., Millán-Oropeza, A., Henry, C., Siewers, V., Morrissey, J. P., Sonnenschein, N., & Nielsen, J. (2022). Reconstruction of a catalogue of genome-scale metabolic models with enzymatic constraints using GECKO 2.0. Nature Communications, 13(1). https://doi.org/10.1038/s41467-022-31421-1
Fang, X., Lloyd, C. J., & Palsson, B. O. (2020). Reconstructing organisms in silico: Genome-scale models and their emerging applications. Nature Reviews Microbiology, 18(12). https://doi.org/10.1038/s41579-020-00440-4
Gianchandani, E. P., Chavali, A. K., & Papin, J. A. (2010). The application of flux balance analysis in systems biology. WIREs Systems Biology and Medicine, 2(3), 372–382. https://doi.org/10.1002/wsbm.60
Gu, C., Kim, G. B., Kim, W. J., Kim, H. U., & Lee, S. Y. (2019). Current status and applications of genome-scale metabolic models. Genome Biology, 20(1). https://doi.org/10.1186/s13059-019-1730-3
Gündüz Ergün, B., Hüccetoğulları, D., Öztürk, S., Çelik, E., & Çalık, P. (2019). Established and Upcoming Yeast Expression Systems. In B. Gasser & D. Mattanovich (Eds), Recombinant Protein Production in Yeast (pp. 1–74). Springer. https://doi.org/10.1007/978-1-4939-9024-5_1
Heirendt, L., Arreckx, S., Pfau, T., Mendoza, S. N., Richelle, A., Heinken, A., Haraldsdóttir, H. S., Wachowiak, J., Keating, S. M., Vlasov, V., Magnusdóttir, S., Ng, C. Y., Preciat, G., Žagare, A., Chan, S. H. J., Aurich, M. K., Clancy, C. M., Modamio, J., Sauls, J. T., … Fleming, R. M. T. (2019). Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v.3.0. Nature Protocols, 14(3). https://doi.org/10.1038/s41596-018-0098-2
Kishk, A., Pacheco, M. P., Heurtaux, T., Sinkkonen, L., Pang, J., Fritah, S., Niclou, S. P., & Sauter, T. (2022). Review of Current Human Genome-Scale Metabolic Models for Brain Cancer and Neurodegenerative Diseases. Cells, 11(16). https://doi.org/10.3390/cells11162486
Morales, Y., Tortajada, M., Picó, J., Vehí, J., & Llaneras, F. (2014). Validation of an FBA model for Pichia pastoris in chemostat cultures. BMC Systems Biology, 8(1), 142. https://doi.org/10.1186/s12918-014-0142-y
Nagrath, D., Avila-Elchiver, M., Berthiaume, F., Tilles, A. W., Messac, A., & Yarmush, M. L. (2010). Soft constraints-based multiobjective framework for flux balance analysis. Metabolic Engineering, 12(5). https://doi.org/10.1016/j.ymben.2010.05.003
Orth, J. D., Thiele, I., & Palsson, B. Ø. (2010). What is flux balance analysis? Nature Biotechnology, 28(3). https://doi.org/10.1038/nbt.1614
Pan, Y., Yang, J., Wu, J., Yang, L., & Fang, H. (2022). Current advances of Pichia pastoris as cell factories for production of recombinant proteins. Frontiers in Microbiology, 13. https://doi.org/10.3389/fmicb.2022.1059777
Tomàs-Gamisans, M., Ferrer, P., & Albiol, J. (2018). Fine-tuning the P. pastoris iMT1026 genome-scale metabolic model for improved prediction of growth on methanol or glycerol as sole carbon sources. Microbial Biotechnology, 11(1), 224–237. https://doi.org/10.1111/1751-7915.12871
Tomàs-Gamisans, M., Ødum, A. S. R., Workman, M., Ferrer, P., & Albiol, J. (2019). Glycerol metabolism of Pichia pastoris (Komagataella spp.) characterised by 13C-based metabolic flux analysis. New Biotechnology, 50, 52–59. https://doi.org/10.1016/j.nbt.2019.01.005
Torres, P., Saa, P. A., Albiol, J., Ferrer, P., & Agosin, E. (2019). Contextualized genome-scale model unveils high-order metabolic effects of the specific growth rate and oxygenation level in recombinant Pichia pastoris. Metabolic Engineering Communications, 9, e00103. https://doi.org/10.1016/j.mec.2019.e00103
Vogl, T., Fischer, J. E., Hyden, P., Wasmayer, R., Sturmberger, L., & Glieder, A. (2020). Orthologous promoters from related methylotrophic yeasts surpass expression of endogenous promoters of Pichia pastoris. AMB Express, 10, 38. https://doi.org/10.1186/s13568-020-00972-1
Vogl, T., Sturmberger, L., Kickenweiz, T., Wasmayer, R., Schmid, C., Hatzl, A.-M., Gerstmann, M. A., Pitzer, J., Wagner, M., Thallinger, G. G., Geier, M., & Glieder, A. (2016). A Toolbox of Diverse Promoters Related to Methanol Utilization: Functionally Verified Parts for Heterologous Pathway Expression in Pichia pastoris. ACS Synthetic Biology, 5(2), 172–186. https://doi.org/10.1021/acssynbio.5b00199
Xu, N., Zhu, J., Zhu, Q., Xing, Y., Cai, M., Jiang, T., Zhou, M., & Zhang, Y. (2018). Identification and characterization of novel promoters for recombinant protein production in yeast Pichia pastoris. Yeast, 35(5), 379–385. https://doi.org/10.1002/yea.3301