1 Overview

Our software aims to provide a low-barrier, flexible, and graphically interactive platform for metabolic modeling and analysis, built for synthetic biology research teams. The platform uses COBRApy [10] for core FBA (Flux Balance Analysis) [1][5] computation and offers visualization and a dedicated FBA knowledge-base Agent [6][7][8] through a Web UI that supports natural language operations.

Its main functions include:

Model Selection: Supports switching between metabolic models, either by selecting a pre-loaded model or by importing a new one.
Target Reaction and Weight Setting: Allows flexible specification of target reactions and their optimization weights in the selected model, which together define the optimization objective of the simulation. New reactions can also be imported.
Reaction Flux Bound Adjustment: Supports modifying the flux bounds of metabolic reactions to explore metabolic behavior under different environments or engineering modifications.
Gene Knockout Simulation: Simulates gene knockouts to analyze their impact on the metabolic network and on product synthesis.
FBA Computation and Visualization: After submission, the system calls COBRApy to perform FBA and automatically generates metabolic flux maps using Escher [3][9] for intuitive result visualization.
Intelligent Agent for Knowledge Q&A and Operation Assistance: The platform integrates an Agent (DeepSeek-R1:7B) that converts natural language instructions into the operations above. For example, a user can type "Please set Yeast9-GEM [2] as the target model," and the Agent will parse and execute the corresponding steps, significantly lowering the usage barrier. The Agent is also loaded with a relevant knowledge base and can answer questions about FBA concepts and applications.

By integrating metabolic modeling computation, visualization, and natural language interaction, our software lets experienced researchers conduct in-depth analysis and helps beginners explore metabolic network design in synthetic biology more easily.

Tips:

If you are unfamiliar with FBA, you can jump to the Model section to learn its detailed principles and functions.

Briefly, FBA is a constraint-based method that uses linear programming to predict the distribution of metabolic fluxes in a metabolic network. It combines a genome-scale metabolic model (GEM) with physicochemical constraints (such as mass conservation, energy balance, and reaction rate bounds) and optimizes a chosen objective, thereby simulating the metabolic state of an organism under specific conditions. FBA does not rely on detailed kinetic parameters and is widely used for predicting gene knockout effects, discovering biomarkers, optimizing biosynthetic pathways, and guiding metabolic engineering.
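In symbols, the optimization described above is usually written as the linear program below. The notation is the common textbook convention, not something specific to this platform: S is the stoichiometric matrix, v the vector of reaction fluxes, c the vector of objective weights, and v^min, v^max the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

The equality S v = 0 expresses steady-state mass balance for every metabolite. A gene knockout is typically modeled by setting both bounds of the affected reactions to zero.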

2 Highlights

2.1 Generality and Practicality

The core of synthetic biology research lies in the iterative "Design-Build-Test-Learn" (DBTL) cycle. High-quality models can provide accurate, actionable suggestions and accelerate each DBTL iteration. Our platform supports this cycle by providing metabolic network models for chassis strain design.

Our software supports predicting the theoretical yield of target products, identifying key gene targets, and analyzing metabolic networks [4].

Predicting Theoretical Yield of Target Products: Teams need to know the ceiling, that is, the maximum theoretical yield of their target product (e.g., lycopene, paclitaxel) under ideal conditions. FBA can quickly calculate this value, providing a key basis for judging project feasibility.
Identifying Key Gene Targets: Teams need to determine which genes to knock out or overexpress to maximize metabolic flux toward the target product while suppressing byproducts. Gene essentiality analysis and knockout-response methods such as ROOM (Regulatory On/Off Minimization) can systematically screen candidate gene-editing targets.
Understanding Metabolic Network Behavior: Teams need to understand how metabolic flux redistributes under specific conditions (e.g., anaerobic growth): which pathways are activated, and which are suppressed? Flux Variability Analysis (FVA), a companion method to FBA, gives a global view by reporting the range each flux can take.
Strong Extensibility of Platform Chassis Models: Supports multiple chassis cells, such as Yeast9-GEM, iJO1366, and e_coli_core. Also supports customized models and simplified models generated by other software (e.g., CarveMe). New reactions can be added freely, for example to introduce a heterologous metabolic pathway from the literature or to hypothesize a novel enzymatic reaction. New metabolites can be created to include hypothetical intermediates not yet in standard databases.
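As a rough illustration of the yield and gene-target items above, the sketch below post-processes flux values of the kind an FBA run returns. It is a minimal example, not the platform's code: the function names are hypothetical, the numbers are made up, and the flux dictionary is assumed to look like the output of `model.optimize().fluxes.to_dict()` in COBRApy.

```python
from __future__ import annotations

from typing import Mapping


def theoretical_yield(fluxes: Mapping[str, float],
                      product_rxn: str,
                      substrate_rxn: str) -> float:
    """Product flux per unit of substrate uptake (e.g. mol product / mol substrate).

    Uptake through an exchange reaction is conventionally negative,
    hence the abs() on the substrate flux.
    """
    uptake = abs(fluxes[substrate_rxn])
    if uptake == 0:
        raise ValueError(f"no flux through substrate reaction {substrate_rxn!r}")
    return fluxes[product_rxn] / uptake


def rank_knockouts(wild_type_flux: float,
                   knockout_flux: Mapping[str, float]) -> list[tuple[str, float]]:
    """Sort knockouts by target flux relative to the wild type, best first."""
    if wild_type_flux == 0:
        raise ValueError("wild-type target flux is zero; ratios are undefined")
    ratios = {gene: flux / wild_type_flux for gene, flux in knockout_flux.items()}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Illustrative numbers only, not results from a real model.
fluxes = {"EX_glc__D_e": -10.0, "EX_etoh_e": 15.0}
print(theoretical_yield(fluxes, "EX_etoh_e", "EX_glc__D_e"))  # 1.5
print(rank_knockouts(15.0, {"geneA": 18.0, "geneB": 0.0, "geneC": 15.0}))
```

In practice the per-knockout fluxes would come from repeated FBA runs, one per simulated knockout; the ranking step itself is independent of how they were obtained.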

2.2 Compatibility with Common Synthetic Biology Standards

SBML Support: The platform is built on COBRApy and natively supports metabolic model files in SBML format. This allows users to directly import standard models from databases like BiGG Models, ensuring reproducibility and shareability of results. [Click here to learn about import operations]
COBRApy Integration: COBRApy is an open-source Python toolkit for genome-scale metabolic network reconstruction and constraint-based modeling analysis. In short, it helps researchers simulate and predict intracellular metabolic processes computationally. Our software's core computation directly calls the COBRApy API, ensuring compatibility with mainstream tools in the metabolic modeling community.
Escher Visualization: Escher is an interactive, open-source web application for constructing, visualizing, and analyzing genome-scale metabolic models (GEMs). Its core function is to present complex metabolic networks and data (e.g., FBA results) in an intuitive, interactive graphical form. Our software generates metabolic flux maps via Escher's interface, and the results are compatible with the existing Escher JSON format, so they can be shared and reused in the community.
DeepSeek LLM Agent: DeepSeek-R1-Distill-Qwen-7B is a large language model trained on large text corpora that can follow instructions and generate contextually appropriate text. We designed a natural language interaction interface in which user instructions are parsed into structured operations, which then call the corresponding task functions. We also provide a unified API layer so that more tasks can be integrated in the future.
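The step from "structured operation" to "task function" can be sketched as a small dispatch table. This is a minimal sketch under stated assumptions: the language model is assumed to reply with a JSON object of the form `{"operation": ..., "args": {...}}`, and the operation names, the `state` dictionary, and the task functions are all hypothetical stand-ins for the platform's real ones.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory session state standing in for the platform's settings.
State = Dict[str, Any]


def set_model(state: State, model_id: str) -> str:
    state["model"] = model_id
    return f"model set to {model_id}"


def set_bounds(state: State, reaction: str, lower: float, upper: float) -> str:
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    state.setdefault("bounds", {})[reaction] = (lower, upper)
    return f"bounds of {reaction} set to [{lower}, {upper}]"


def knock_out(state: State, gene: str) -> str:
    state.setdefault("knockouts", []).append(gene)
    return f"gene {gene} knocked out"


# Only operations listed here can be triggered by the model's reply.
OPERATIONS: Dict[str, Callable[..., str]] = {
    "set_model": set_model,
    "set_bounds": set_bounds,
    "knock_out": knock_out,
}


def dispatch(state: State, llm_reply: str) -> str:
    """Parse an LLM reply as JSON and call the matching task function."""
    try:
        request = json.loads(llm_reply)
    except json.JSONDecodeError:
        return "error: reply was not valid JSON"
    if not isinstance(request, dict):
        return "error: reply was not a JSON object"
    handler = OPERATIONS.get(request.get("operation"))
    if handler is None:
        return f"error: unknown operation {request.get('operation')!r}"
    try:
        return handler(state, **request.get("args", {}))
    except (TypeError, ValueError) as exc:  # wrong or invalid arguments
        return f"error: {exc}"


state: State = {}
print(dispatch(state, '{"operation": "set_model", "args": {"model_id": "iJN746"}}'))
```

Restricting execution to an explicit table means a malformed or unexpected reply produces an error message instead of an unintended action.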

2.3 Maintainability and Extensibility

We emphasized code maintainability and extensibility during development:

Code Comments and Documentation: Core modules include detailed comments and function descriptions, accompanied by installation and user guides (README, Wiki tutorials) so that subsequent teams can get started quickly. [Click here to learn about the installation tutorial]
Clear Architecture: The software adopts a front-end and back-end separation architecture (Frontend Web UI - Backend Flask - Underlying COBRApy/Escher/LLM API), with clear module responsibilities, facilitating expansion.
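One way to keep the layers separate is to put request validation in a plain function and let the Flask route only translate HTTP to and from it. The sketch below assumes Flask 2.0 or later; the route `/api/fba`, the payload fields, and the function names are hypothetical and are not taken from the project's source.

```python
from typing import Any, Callable, Dict, Tuple

# The solver layer: takes a validated payload, returns reaction ID -> flux.
Solver = Callable[[Dict[str, Any]], Dict[str, float]]

REQUIRED_KEYS = ("model", "objective")  # hypothetical payload fields


def handle_fba_request(payload: Dict[str, Any],
                       solver: Solver) -> Tuple[Dict[str, Any], int]:
    """Validate a request body and delegate the computation to the solver layer."""
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        return {"error": f"missing fields: {', '.join(missing)}"}, 400
    return {"fluxes": solver(payload)}, 200


def create_app(solver: Solver):
    """Wire the handler into a Flask application."""
    from flask import Flask, jsonify, request  # third-party, imported lazily

    app = Flask(__name__)

    @app.post("/api/fba")  # hypothetical route
    def fba():
        body, status = handle_fba_request(request.get_json(silent=True) or {}, solver)
        return jsonify(body), status

    return app
```

Because `handle_fba_request` does not depend on Flask, it can be unit-tested with a fake solver, and the COBRApy-backed solver can be swapped in without touching the route.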

2.4 User-Friendly Design

Our software is designed with user-friendliness as a core principle:

Intuitive Interface: Few software tools currently offer both FBA visualization and graphical, interactive computation. Most rely on programming: visualization typically covers only the results, while the computation itself still requires coding. In the traditional workflow, metabolic network analysis requires researchers to be proficient in MATLAB (with COBRA Toolbox) or Python (with COBRApy). They must not only understand the biological logic but also master the syntax, library installation, and debugging of a programming language. Our software makes the whole process graphical and interactive, from setup through computation to results, greatly lowering the usage barrier. After deployment, all modeling operations are completed through the Web interface, with no command-line work required. Interactive metabolic flux maps are generated with Escher, so results can be read at a glance.
Natural Language Interaction: We integrate a large language model Agent that lets users complete complex operations through natural language, further lowering the usage barrier.
The Agent helps in two main ways:
1. Knowledge Q&A and Explanation:
  Researchers can ask questions directly in natural language. The Agent answers based on the built-in metabolic modeling knowledge base and the current model context.
2. Operation Assistance and Automation:
  This is the feature that most changes the workflow. Users do not need to write any code or search menus for functions. They simply state their analysis intent in natural language, and the Agent automatically converts it into backend operations.
[Click here to learn more details]

2.5 Validation through Wet Experiments

To assess the feasibility of using red algae as a carbon source, we ran FBA calculations with the software. Wet-lab experiments subsequently confirmed that red algae can indeed serve as a carbon source substrate and that good yields can be achieved. [Click here to learn more details]

3 Flowchart

The interactive FBA analysis platform offers users a comprehensive computational biology workflow accessible via a web-based interface. The system provides three main functions: FBA Flux Map calculation, Operation Assistance, and Question Query.

For FBA computation, users begin by selecting an appropriate genome-scale metabolic model from the model library as the analysis foundation. After choosing a model, users define the biological objective function for the simulation: common objectives include biomass maximization or optimization of specific metabolite production. Users then adjust network constraints, such as substrate uptake rates and oxygen availability, to reflect experimental conditions or physiological states. These constraints delineate the physicochemically feasible space in which the metabolic network operates. Gene knockout simulation enables systematic identification of essential genes or discovery of potential metabolic engineering targets. After calculation, all simulated flux distributions can be visualized using the integrated Escher Visualization module, facilitating interpretation of network states.

Throughout the analysis process, an LLM Agent and a structured knowledge base support users by providing instant queries about model components, reaction mechanisms, or gene functions. This offers essential contextual knowledge for biological interpretation of simulation results, along with real-time operational guidance to ensure a smooth workflow. Together, these features create an integrated analysis environment spanning model construction, simulation, visualization, and knowledge retrieval.

4 Architecture Diagram

Above is our project architecture diagram. The frontend consists of WebUI and LLM ChatBox, while the backend includes Flask, COBRApy, and Escher.

5 Software Demonstration

5.1 Web Interface and Operation Demonstration

In this tutorial, we will demonstrate how to use our software for FBA analysis. [Click here to learn about installation]

Step 0: Access the FBA Homepage

Visit our deployed FBA platform, and you will see the following page:

Step 1: Select a Model

We provide some commonly used models in advance, such as e_coli_core and Yeast9-GEM. You can also download models from BiGG or use local models and import them into our platform via the "Import" button.

You can also search for a model by entering its model ID on this page.

After selecting a model, click "Next." The selected model will be displayed at the top.
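For readers curious about what an import step could look like on the COBRApy side, the sketch below picks a reader function from the file extension. The reader functions named here exist in `cobra.io`; the helper names and the extension table are assumptions for illustration, not the platform's actual import code.

```python
from pathlib import Path

# File suffix -> name of the cobra.io function that reads that format.
READERS = {
    ".xml": "read_sbml_model",
    ".sbml": "read_sbml_model",
    ".json": "load_json_model",
    ".yml": "load_yaml_model",
    ".yaml": "load_yaml_model",
}


def reader_name(path: str) -> str:
    """Return the cobra.io reader to use for a model file, based on its suffix."""
    suffix = Path(path).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported model format: {suffix or 'no extension'}") from None


def load_model_file(path: str):
    """Load a metabolic model from disk with the matching COBRApy reader."""
    import cobra.io  # third-party, imported lazily

    return getattr(cobra.io, reader_name(path))(path)


print(reader_name("yeast-GEM.xml"))     # read_sbml_model
print(reader_name("e_coli_core.json"))  # load_json_model
```

Rejecting unknown extensions up front gives the user a clear message before any parsing is attempted.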

Step 2: Select Target Reaction

This step involves two tasks: searching for the target reaction and setting its weight. The objective may be a linear combination of multiple reactions.

If no reaction is selected, biomass maximization is used as the default objective.

If needed, you can add reactions that are not yet part of the model via "Add New...".

On the weight page, the sum of reaction weights must equal 100%.
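The 100% rule can be checked, and the percentages turned into objective coefficients, with a few lines of Python. This is a sketch of one plausible conversion (dividing each percentage by 100); whether the platform scales weights exactly this way is an assumption, and the function name is hypothetical.

```python
from typing import Dict, Mapping


def weights_to_coefficients(weights_percent: Mapping[str, float],
                            tolerance: float = 1e-6) -> Dict[str, float]:
    """Convert percentage weights into objective coefficients that sum to 1."""
    if not weights_percent:
        raise ValueError("at least one target reaction is required")
    total = sum(weights_percent.values())
    if abs(total - 100.0) > tolerance:
        raise ValueError(f"weights sum to {total}%, expected 100%")
    return {rxn: weight / 100.0 for rxn, weight in weights_percent.items()}


# Reaction IDs follow the BiGG e_coli_core model; the 70/30 split is arbitrary.
coefficients = weights_to_coefficients(
    {"BIOMASS_Ecoli_core_w_GAM": 70, "EX_etoh_e": 30}
)
print(coefficients)
```

With COBRApy, such a dictionary can be applied as a weighted objective by assigning `{model.reactions.get_by_id(rxn): coef for rxn, coef in coefficients.items()}` to `model.objective`.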

Step 3: Constrain the Environment

You can set the upper and lower bounds of reactions on the "Reaction Constraints" page.

Similarly, you can knock out genes on the "Gene Knockout" page.

Then, on the confirmation page, verify the operations performed.

You can always go back to a previous page and click "Clear" to reset its settings. On every page that allows multiple selections, the "Organize" button reorders the list so that selected items are shown on the first page, where they are easier to manage.

After verification, click "Submit," and the software backend will perform the computation.
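A backend computation of this kind can be sketched with COBRApy as follows. The COBRApy calls used here (`model.objective`, `Reaction.bounds`, `Gene.knock_out()`, `model.optimize()`, and the `with model:` context) are part of its public API; the function names and the way the settings are passed in are assumptions, not the platform's actual code.

```python
from typing import Dict, Iterable, Mapping, Tuple

Bounds = Mapping[str, Tuple[float, float]]


def validate_bounds(bounds: Bounds) -> None:
    """Reject bound pairs whose lower limit exceeds the upper limit."""
    for rxn_id, (lower, upper) in bounds.items():
        if lower > upper:
            raise ValueError(f"{rxn_id}: lower bound {lower} exceeds upper bound {upper}")


def run_fba(model,
            objective: Mapping[str, float],
            bounds: Bounds,
            knockouts: Iterable[str]) -> Dict[str, float]:
    """Apply the user's settings to a cobra.Model, run FBA, return the fluxes."""
    validate_bounds(bounds)
    with model:  # every change made inside this block is reverted on exit
        model.objective = {
            model.reactions.get_by_id(rxn_id): coef
            for rxn_id, coef in objective.items()
        }
        for rxn_id, limits in bounds.items():
            model.reactions.get_by_id(rxn_id).bounds = limits
        for gene_id in knockouts:
            model.genes.get_by_id(gene_id).knock_out()
        solution = model.optimize()
        if solution.status != "optimal":
            raise RuntimeError(f"FBA did not reach an optimum: {solution.status}")
        return solution.fluxes.to_dict()
```

Running inside `with model:` leaves the loaded model untouched, so the same model object can serve the next request without being reloaded.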

Once the computation completes successfully, the calculated fluxes are mapped onto a metabolic map to visualize the metabolic pathways.
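Escher accepts reaction data as a flat JSON object that maps reaction IDs to values, so the mapping step can be as small as the sketch below. The helper names and the optional noise threshold are assumptions for illustration; the input is assumed to be a reaction-to-flux dictionary such as `solution.fluxes.to_dict()` from COBRApy.

```python
import json
from typing import Dict, Mapping


def to_reaction_data(fluxes: Mapping[str, float],
                     drop_below: float = 0.0) -> Dict[str, float]:
    """Coerce fluxes to plain floats, optionally dropping values below a threshold."""
    return {
        rxn_id: float(value)
        for rxn_id, value in fluxes.items()
        if abs(value) >= drop_below
    }


def save_reaction_data(fluxes: Mapping[str, float], path: str,
                       drop_below: float = 0.0) -> None:
    """Write the reaction data as JSON that can be loaded into an Escher map."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(to_reaction_data(fluxes, drop_below), handle, indent=2, sort_keys=True)


# Illustrative values; PFK, PGI and FBP are reaction IDs in e_coli_core.
print(to_reaction_data({"PFK": 7.5, "PGI": 1e-12, "FBP": 0.0}, drop_below=1e-9))
```

The same dictionary can also be passed as `reaction_data` to `escher.Builder` when the map is rendered from Python. Dropping tiny values is optional: it hides solver noise, but reactions removed this way are drawn as having no data rather than zero flux.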

Additionally, we provide a dark mode for more comfortable nighttime use.

5.2 Agent System Introduction

Our Agent is based on DeepSeek-R1:7B and integrates a specialized knowledge base for FBA. The knowledge base was built from the practical experience of five synthetic biology researchers and covers common questions, frequent operations, and typical application scenarios. Based on their feedback, we systematically compiled common user queries and operational needs so that the Agent can understand and respond to user requests more accurately.

After deploying the Agent, you can ask questions or perform operations via natural language in the chat box.

Notably, the conversation history in this session is saved on the current page.

Below are real-time video demonstrations of Agent-user interaction:

Agent Intelligent Q&A:

What is the principle of FBA?
As shown in the video, after the question is entered, the Agent briefly outlines the FBA-related concepts and then gives a detailed point-by-point answer, keeping the response effective within limited space.
What files or data are needed to start FBA calculations?
As shown in the video, we followed the first question with this second one. The system retains the earlier history within the session, and a new topic does not disrupt normal responses.
Where can I find and download reliable, validated metabolic models?
The Agent lists available websites, the corresponding file types, and a comparison of their features.
What exactly do the "constraints" refer to? How should I understand and set these constraints (e.g., reaction rate upper and lower limits)?
As shown in the video, the Agent briefly explains the different kinds of constraints and describes why setting limits matters.

Agent Intelligent Operation Assistance:

Set the model to iJN746
The video demonstrates that, starting from a blank setting, inputting the command causes the Agent to set the model to iJN746. The operation result is visible in the global selection prompt and the final summary, consistent with manual mouse operations.
Set a specific objective function
Similar to the previous Q&A, Agent-assisted operations retain all interaction content in the current session, allowing users to clearly see which operations have been successfully executed.
Modify the upper and lower flux bounds of a reaction
The operation is similar to setting the objective function.
Knock out certain genes
The results are also visible in the final summary.
Generate a metabolic flux distribution map
After completing the necessary steps above, the Agent can directly call relevant functions to compute results. Users can view them on the Results page.

6 Expert Feedback

On Agent-related design questions, we consulted Professor Zhang Tong. [click here to learn more details]

For example, regarding "model selection," he suggested we try using open-source models like Qwen (e.g., 32B version) for local deployment to alleviate response delays caused by calling public APIs (e.g., DeepSeek) and improve system stability.

Regarding "mitigating hallucination issues," he pointed out that hallucinations are currently difficult to completely solve, but can be partially mitigated through methods like building fact bases, counterfactual validation, and feature alignment. He recommended establishing a dedicated ambiguity lexicon and attempting to use attention mechanisms to identify and handle semantic ambiguities.

Regarding "whether it's necessary to introduce knowledge or tools for constraints to avoid incorrect interpretations," Professor Zhang gave a positive answer: "This is necessary. Because there are many specialized or domain-specific knowledge points that large models have not learned. Especially for innovative tasks like yours, it's essential to perform secondary training or use an external knowledge base to supplement knowledge."

Regarding "long-text and memory mechanism design," he recommended "introducing long-text memory and adaptive forgetting mechanisms like LSTM (Long Short-Term Memory) to improve information coherence in multi-turn dialogues and avoid error accumulation and semantic interference. Reinforcement learning ideas can also be combined to optimize memory weight allocation."

Finally, regarding the project's innovation and feasibility, the professor spoke highly of the work and encouraged us to leverage interdisciplinary advantages, deeply integrate AI with domain knowledge, and demonstrate practical value such as shortening R&D cycles and reducing costs.

7 Installation

If Python is not installed, install it first, choosing the installation package that matches your operating system.

Install Python
Visit the download page of the official Python website, then download and run the installer.

The most important step in the Windows installer: check the "Add Python 3.x to PATH" checkbox.
Verify the installation
Whichever installation method you used, verify it afterwards.

Open the command-line tool (Windows: CMD/PowerShell; Mac/Linux: Terminal) and enter the following command:
```
# Windows users
python --version
# Mac/Linux users
python3 --version
```
If the Python version number is displayed (such as "Python 3.9.7"), it indicates a successful installation.
Configure and run
- Recommended: Four-Step Setup for basic FBA functionality
  Follow these steps to set up the environment and run the software:
  1. Create a virtual environment
```
python -m venv venv
```
  2. Activate the virtual environment
```
source venv/bin/activate    # Mac/Linux
.\venv\Scripts\activate     # Windows
```
  3. Install dependencies
```
pip install -r requirements.txt
```
  4. Run the software
```
python src/app.py
```
  Then visit http://127.0.0.1:8079/
- Optional: Agent functionality
  Prerequisites:
  
  Install Ollama: visit the website and download the appropriate version for your operating system.
  
  Verify the installation by running:
```
ollama --version
```
  Recommended Hardware:
  
  For optimal performance, we recommend an NVIDIA RTX 3060 (12 GB) GPU or better to deploy the DeepSeek-R1:7B model.
  
  Steps to Deploy:
  1. Generate the agent database:
```
python src/build_fba_knowledge.py
```
  2. Pull the DeepSeek-R1 model. Run the following command to download it:
```
ollama pull deepseek-r1:7b
```
  3. Run the Model
    Start the model with the command:
```
ollama run deepseek-r1:7b
```
  Once the model is running, you can activate the Agent feature in the web application.
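To check from Python that the locally served model responds, a request can be sent to Ollama's REST endpoint. The sketch below uses only the standard library and assumes Ollama is listening on its default address, `http://localhost:11434`; the helper names are hypothetical, and how the web application itself talks to Ollama is not shown here.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(prompt: str, model: str = "deepseek-r1:7b") -> bytes:
    """Encode a non-streaming generate request for Ollama."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")


def strip_reasoning(text: str) -> str:
    """Remove a <think>...</think> block, which DeepSeek-R1 models may emit."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def ask(prompt: str, timeout: float = 120.0) -> str:
    """Send one prompt to the local model and return its answer text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return strip_reasoning(body["response"])


# Requires a running Ollama server with the model pulled:
# print(ask("In one sentence, what is flux balance analysis?"))
```

If the call raises a connection error, the Ollama server is most likely not running or is listening on a different port.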

8 References

[1] Orth, J. D., Thiele, I., & Palsson, B. Ø. (2010). What is flux balance analysis? Nature Biotechnology, 28, 245-248. https://doi.org/10.1038/nbt.1614.

[2] Zhang, C., Sánchez, B. J., et al. (2024). Yeast9: a consensus genome-scale metabolic model for S. cerevisiae curated by the community. Molecular Systems Biology. https://doi.org/10.1038/s44320-024-00060-7.

[3] Rowe, E., King, Z. A., Palsson, B. O., & Ebrahim, A. (2018). Escher-FBA: a web application for interactive flux balance analysis. BMC Systems Biology, 12, 84. https://doi.org/10.1186/s12918-018-0607-5.

[4] von Kamp, A., Klamt, S., & others. (2023). Balancing biomass reaction stoichiometry and measured growth phenotypes improves model predictions in genome-scale metabolic models. Bioinformatics, 39(10), btad600. https://doi.org/10.1093/bioinformatics/btad600.

[5] Sahu, A., Reddy, S., & Bhattacharya, S. (2021). Advances in flux balance analysis by integrating machine learning-based methods. Computational and Structural Biotechnology Journal, 19, 4423-4434. https://doi.org/10.1016/j.csbj.2021.07.023.

[6] Ruofan Jin, Zaixi Zhang, Mengdi Wang, Le Cong, et al. (2025). STELLA: Self-Evolving LLM Agent for Biomedical Research. arXiv preprint arXiv:2507.02004v1. https://arxiv.org/abs/2507.02004v1.

[7] Zhucong Li, Bowei Zhang, Jin Xiao, Zhijian Zhou, Fenglei Cao, Jiaqing Liang, Yuan Qi. (2025). ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools. arXiv. https://arxiv.org/abs/2505.21569.

[8] Yijie Xia, Xiaohan Lin, Zicheng Ma, Jinyuan Hu, Yanheng Li, Zhaoxin Xie, Hao Li, Li Yang, Zhiqiang Zhao, Lijiang Yang, Zhenyu Chen, Yi Qin Gao. (2025). Large Language Models as AI Agents for Digital Atoms and Molecules: Catalyzing a New Era in Computational Biophysics. arXiv preprint arXiv:2505.00270. https://arxiv.org/abs/2505.00270.