Introduction
We warmly thank the iGEM community for its open platform and resources, and we’re deeply grateful to the teams before us whose shared knowledge made our journey possible. Standing on your shoulders, we were able to tackle challenges, learn fast, and grow as a team.
This page collects practical resources and lessons we want to pass on to future iGEMers—short, actionable materials that save time and reduce repeated mistakes. Our contributions include:
- Modeling Guidance
- Lab Guidance
- Human Practices Guidance
We hope these materials are useful starting points. Please adapt, improve, and share them back with the community—let’s keep the iGEM spirit alive and evolving.
Modeling
The cell, the fundamental unit of life, is an intricate entity whose properties and behaviors challenge the limits of physical and computational modeling. Every cell is a dynamic, adaptive system in which complex behavior emerges from a myriad of molecular interactions. Some aspects are remarkably robust to perturbations, while others are sensitive to minor disruptions. To understand how cells function, scientists have built virtual cell models that simulate, predict, and steer cell behavior [1]. We took the same approach in our project and built an AI virtual cell (AIVC).
Building an AI virtual cell rests on three pillars: a priori knowledge, static architecture, and dynamic states [2]. We followed these principles throughout construction, and the sections below walk through our process.
Defining the Static Architecture — What a Cell Is
- Morphological Profiling: Using segmentation masks, we extract quantitative shape descriptors (area, perimeter, eccentricity, solidity) per cell and frame.
- Appearance and Texture Analysis: Using masked pixels from microscopy images, we compute intensity statistics and GLCM-based texture features (contrast, energy, homogeneity).
- Deep Learning for Visual Fingerprinting: Image patches feed into a pre-trained ResNet-18 (with the final classifier removed) to yield a 128-dimensional feature vector per cell.
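To make the morphological profiling step concrete, here is a minimal sketch of how area and eccentricity could be computed from one cell's binary mask using only the standard library. The function name `shape_descriptors` and the nested-list mask format are assumptions for illustration; this is not the AIVC code, which may rely on an image-analysis library, and perimeter and solidity are left out for brevity.

```python
import math

def shape_descriptors(mask):
    """Area, centroid and eccentricity of the truthy pixels in a 2D mask (list of rows)."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask has no foreground pixels")
    cy = sum(r for r, _ in pixels) / area
    cx = sum(c for _, c in pixels) / area
    # Central second moments of the pixel coordinates.
    mu_rr = sum((r - cy) ** 2 for r, _ in pixels) / area
    mu_cc = sum((c - cx) ** 2 for _, c in pixels) / area
    mu_rc = sum((r - cy) * (c - cx) for r, c in pixels) / area
    # Eigenvalues of the 2x2 covariance matrix give the squared axis scales
    # of the ellipse with the same second moments as the region.
    half_trace = (mu_rr + mu_cc) / 2
    delta = math.sqrt(((mu_rr - mu_cc) / 2) ** 2 + mu_rc ** 2)
    lam_major, lam_minor = half_trace + delta, half_trace - delta
    if lam_major > 0:
        ecc = math.sqrt(max(0.0, 1 - lam_minor / lam_major))
    else:
        ecc = 0.0  # single pixel: no elongation defined, report 0
    return {"area": area, "centroid": (cy, cx), "eccentricity": ecc}

square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
bar = [[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]
print(shape_descriptors(square))
print(shape_descriptors(bar))
```

A compact region such as the square gives an eccentricity near 0, while an elongated one such as the bar approaches 1.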
Capturing the Dynamic States — What a Cell Does
- Trajectory Construction: We link per-frame cell centroids by ID to build time-series trajectories.
- Motion and Behavior Quantification: From trajectories we derive instantaneous velocity (dx, dy), speed, turning angle, and history-aware features.
- Modeling Population Dynamics: Using a KD-Tree of cell positions per frame, we compute local density (the number of neighbors within a defined radius) and collective alignment (the mean direction similarity with neighbors).
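The dynamic-state features above can be sketched as follows. This is an illustrative, standard-library version: the function names are assumptions, and the neighbor count uses a brute-force search in place of the KD-Tree named in the text, which gives the same counts but scales worse with cell number.

```python
import math

def motion_features(track):
    """Per-step dx, dy, speed and turning angle for one cell.

    track: list of (x, y) centroids for a single cell ID, ordered by frame.
    """
    feats = []
    prev_heading = None
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        speed = math.hypot(dx, dy)
        # A stationary step has no direction; carry the previous heading forward.
        heading = math.atan2(dy, dx) if speed > 0 else prev_heading
        if prev_heading is None or heading is None:
            turn = 0.0
        else:
            d = heading - prev_heading
            turn = math.atan2(math.sin(d), math.cos(d))  # wrap into [-pi, pi]
        feats.append({"dx": dx, "dy": dy, "speed": speed, "turning_angle": turn})
        prev_heading = heading
    return feats

def local_density(positions, radius):
    """Number of other cells within `radius` of each cell in one frame (brute force)."""
    counts = []
    for i, (xi, yi) in enumerate(positions):
        n = sum(
            1
            for j, (xj, yj) in enumerate(positions)
            if j != i and math.hypot(xi - xj, yi - yj) <= radius
        )
        counts.append(n)
    return counts

steps = motion_features([(0, 0), (1, 0), (1, 1)])
print(steps)
print(local_density([(0, 0), (1, 0), (5, 5)], radius=1.5))
```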
Integrating A Priori Knowledge — What We Know
- Machine Learning Framework: A Random Forest Regressor predicts future dx/dy from the full feature set; GroupKFold prevents leakage across a cell’s time points.
- Pharmacodynamic Model (Emax): A user-provided “dose” modulates a velocity factor (inhibition or stimulation of speed) and a persistence factor (directional memory versus randomness).
- Directed Motion & Stochasticity: Chemotaxis is simulated by adding a constant drift vector; small random noise acknowledges inherent biological variability.
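As a rough illustration of how a dose, a drift vector, and noise could be layered on top of a predicted displacement, the sketch below uses the standard sigmoidal Emax form. The parameter values (`emax_speed`, `ec50`, `hill`), the function names, and the way the terms are combined are all assumptions made for this example rather than the settings used in AIVC; the persistence factor is not modeled here.

```python
import random

def emax_effect(dose, emax, ec50, hill=1.0):
    """Sigmoidal Emax response: emax * dose^h / (ec50^h + dose^h)."""
    if dose <= 0:
        return 0.0
    return emax * dose ** hill / (ec50 ** hill + dose ** hill)

def apply_dose(pos, predicted_step, dose, rng,
               drift=(0.0, 0.0), noise_sd=0.0,
               emax_speed=-0.8, ec50=1.0, hill=1.0):
    """Advance one cell by a dose-scaled predicted step plus drift and Gaussian noise.

    predicted_step: baseline (dx, dy), e.g. the output of a regression model.
    emax_speed < 0 models inhibition of speed; > 0 models stimulation.
    """
    velocity_factor = 1.0 + emax_effect(dose, emax_speed, ec50, hill)
    dx = velocity_factor * predicted_step[0] + drift[0] + rng.gauss(0.0, noise_sd)
    dy = velocity_factor * predicted_step[1] + drift[1] + rng.gauss(0.0, noise_sd)
    return (pos[0] + dx, pos[1] + dy)

rng = random.Random(0)  # seeded so repeated runs give the same trajectory
pos = (0.0, 0.0)
for _ in range(5):
    pos = apply_dose(pos, (2.0, 0.0), dose=1.0, rng=rng,
                     drift=(0.1, 0.0), noise_sd=0.05)
print(pos)
```

At a dose equal to `ec50` the effect is half of `emax_speed`, so with the values above the predicted step is scaled by 0.6 before drift and noise are added.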
Scheme 1. Cell trajectory demonstrations (two views).
ATRA Synthetic Pathway Modeling
To extend the AIVC framework into applied biotechnology, we incorporated mathematical modeling from our ATRA biosynthesis project.
ATRA Metabolic Pathway Kinetics
For each enzymatic step in the ATRA synthesis pathway:
$$ v_i \;=\; \frac{V_{\max,i}\,S_i}{K_{m,i}+S_i} $$
Flux balance across the pathway is described by:
$$ \mathbf{S}\,\mathbf{v}=\mathbf{0} $$
and gene expression dynamics are modeled as:
$$ \frac{d[\mathrm{mRNA}_j]}{dt}=\alpha_j-\beta_j[\mathrm{mRNA}_j],\quad \frac{d[E_j]}{dt}=\gamma_j[\mathrm{mRNA}_j]-\delta_j[E_j] $$
This system provides quantitative insight into the steady-state balance of substrate flux and enzyme expression, helping predict rate-limiting steps in ATRA biosynthesis.
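The expression equations above have the closed-form steady state [mRNA] = α/β and [E] = γα/(βδ). The short sketch below integrates the two ODEs with a forward-Euler step and compares the result with that steady state; it also evaluates the Michaelis-Menten rate. All parameter values are arbitrary illustrations, not fitted values from our pathway.

```python
def mm_rate(vmax, km, s):
    """Michaelis-Menten rate v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)

def simulate_expression(alpha, beta, gamma, delta, t_end=200.0, dt=0.01):
    """Forward-Euler integration of the mRNA and enzyme ODEs from zero initial levels."""
    mrna, enzyme = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        d_mrna = alpha - beta * mrna              # transcription minus mRNA decay
        d_enzyme = gamma * mrna - delta * enzyme  # translation minus enzyme decay
        mrna += dt * d_mrna
        enzyme += dt * d_enzyme
    return mrna, enzyme

def steady_state(alpha, beta, gamma, delta):
    """Analytic steady state obtained by setting both derivatives to zero."""
    mrna_ss = alpha / beta
    return mrna_ss, gamma * mrna_ss / delta

params = dict(alpha=2.0, beta=0.5, gamma=1.0, delta=0.1)  # illustrative values only
print(simulate_expression(**params))
print(steady_state(**params))
print(mm_rate(vmax=10.0, km=2.0, s=2.0))
```

When S equals Km the rate is half of Vmax, which is a quick sanity check on any parameter set.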
Plasmid Optimization Framework
We constructed a multi-objective optimization model for plasmid design, targeting:
- GC content balance (40%–60%)
- Codon Adaptation Index (CAI) optimization
- Thermodynamic stability (minimization)
Optimization proceeds along a Pareto frontier, enabling trade-offs among expression efficiency, genetic stability, and structural robustness.
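A minimal sketch of the two building blocks involved: a GC-content calculation and a Pareto filter over candidate designs. The objective tuple (GC deviation from 50%, 1 − CAI, a structural-energy proxy, all to be minimized) and the candidate scores are hypothetical values chosen for illustration; our actual scoring functions are not reproduced here.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)

def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one.

    All objectives are treated as quantities to minimize.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Names of the candidates that no other candidate dominates.

    candidates: dict mapping a design name to its tuple of objective values.
    """
    return sorted(
        name
        for name, score in candidates.items()
        if not any(dominates(other, score)
                   for other_name, other in candidates.items()
                   if other_name != name)
    )

# Hypothetical designs: (|GC - 0.5|, 1 - CAI, structural-energy proxy)
designs = {
    "design_A": (0.02, 0.20, 0.5),
    "design_B": (0.05, 0.10, 0.4),
    "design_C": (0.06, 0.25, 0.6),
}
print(gc_content("ATGCGC"))
print(pareto_front(designs))
```

Here design_C is worse than design_A on every objective and drops out, while design_A and design_B trade GC balance against codon adaptation and both remain on the frontier.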
Metabolic Flux Balance Analysis (FBA)
The steady-state stoichiometric model and objective:
$$ \mathbf{S}\,\mathbf{v}=\mathbf{0}, \qquad \max \; \mathbf{c}^{\mathsf T}\mathbf{v} $$
permit prediction of pathway throughput and identification of key flux bottlenecks. Visualization of ATRA flux distribution maps further supports rational strain engineering and integration into AIVC simulations.
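As a worked toy case (hypothetical metabolites A and B, not our ATRA network), consider a linear chain with uptake of A at rate v_1, conversion of A to B at rate v_2, and export of B at rate v_3. With one row per metabolite, the steady-state constraint forces all three fluxes to be equal:

$$ \mathbf{S}=\begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}, \qquad \mathbf{S}\,\mathbf{v}=\mathbf{0} \;\Longrightarrow\; v_1=v_2=v_3 $$

Adding capacity bounds and choosing the export flux as the objective gives

$$ 0 \le v_i \le u_i, \quad \mathbf{c}=(0,0,1)^{\mathsf T}, \qquad \max\; \mathbf{c}^{\mathsf T}\mathbf{v} \;=\; \min(u_1,\,u_2,\,u_3) $$

so in this toy chain the step with the smallest upper bound is the bottleneck, and raising any other bound leaves the optimum unchanged.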
System Robustness and Sensitivity
Parameter sensitivity coefficients quantify robustness against perturbations, guiding optimization of promoter strength and ribosome binding sites.
$$ S_{y_i}^{(p_k)} \;=\; \frac{\partial \ln y_i}{\partial \ln p_k} $$
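The coefficient above can be estimated numerically with a central difference in log space. The sketch below applies it to a Michaelis-Menten rate, for which the analytic values are 1 for Vmax, −Km/(Km+S) for Km, and Km/(Km+S) for S. The helper name `log_sensitivity` and the parameter values are assumptions for illustration; the method requires positive parameters and a positive output.

```python
import math

def mm_rate(vmax, km, s):
    """Michaelis-Menten rate v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)

def log_sensitivity(f, params, name, rel_step=1e-4):
    """Central-difference estimate of d ln f / d ln p for the parameter `name`.

    f is called with keyword arguments taken from `params`.
    """
    up = dict(params)
    down = dict(params)
    up[name] = params[name] * (1 + rel_step)
    down[name] = params[name] * (1 - rel_step)
    d_ln_y = math.log(f(**up)) - math.log(f(**down))
    d_ln_p = math.log(up[name]) - math.log(down[name])
    return d_ln_y / d_ln_p

params = {"vmax": 10.0, "km": 2.0, "s": 2.0}  # illustrative values only
for name in params:
    print(name, round(log_sensitivity(mm_rate, params, name), 4))
```

A coefficient near 0 means the output is robust to that parameter, while a magnitude near 1 means a 1% change in the parameter shifts the output by about 1%.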
AIVC Demonstration
To Future iGEMers
AI Virtual Cell (AIVC) is our contribution to iGEM. It shows how computational modeling and synthetic biology reinforce each other. Take the code and datasets, adapt them to your system, and push predictive engineering forward—one virtual cell at a time.
GitHub Repository
Public source-code mirror of AIVC and notebooks:
Lab
BBa_25P4P9KL
In biological systems, retinoic acid is biosynthesized from β-carotene in two reaction steps. β-carotene is symmetrically cleaved by β-carotene 15,15′-oxygenase to generate retinaldehyde (also known as retinal), which is then oxidized by retinal dehydrogenase (RALDH) to form retinoic acid. To synthesize all-trans retinoic acid (ATRA) from β-carotene as a precursor, we integrated three ATRA synthesis genes, raldh, IIdR, and blh, into the pET-21a-trc plasmid by homologous recombination, yielding the plasmid pET-21a-trc-raldh-IIdR-blh.
pEcgRNA-N20: Modular CRISPR sgRNA Plasmid with ccdB Selection
We built a customizable CRISPR guide RNA plasmid (pEcgRNA-N20) based on the pTargetF/pEcgRNA system. The plasmid contains a toxic ccdB cassette flanked by BsaI (Type IIS) sites. Users design a pair of complementary 24-nt oligos encoding any 20-nt spacer, anneal them, and ligate into BsaI-linearized pEcgRNA. Successful clones lose ccdB (counterselection), ensuring only correct sgRNA inserts survive. This streamlines sgRNA construction: two short oligos and a one-pot Golden Gate reaction. We verified broad utility (E. coli K-12, B, W, and Nissle 1917) and simplified CRISPR workflows [3].
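The oligo-design step can be sketched as below: each 24-nt oligo is a 4-nt overhang followed by the 20-nt spacer (top strand) or its reverse complement (bottom strand). The overhang strings used in the example, "AAAA" and "CCCC", are placeholders only and are NOT the real pEcgRNA overhangs; take the actual 4-nt sequences from the plasmid map or from reference [3]. The function names are also assumptions for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def design_spacer_oligos(spacer, top_overhang, bottom_overhang):
    """Return (top, bottom) 24-nt oligos for a 20-nt spacer.

    top_overhang / bottom_overhang: the 4-nt 5' extensions that match the
    BsaI-cut vector ends. They must come from the real vector sequence.
    """
    spacer = spacer.upper()
    if len(spacer) != 20 or set(spacer) - set("ACGT"):
        raise ValueError("spacer must be exactly 20 nt of A/C/G/T")
    if len(top_overhang) != 4 or len(bottom_overhang) != 4:
        raise ValueError("overhangs must be 4 nt each")
    top = top_overhang.upper() + spacer
    bottom = bottom_overhang.upper() + reverse_complement(spacer)
    return top, bottom

# Placeholder overhangs for demonstration only (not the real vector overhangs).
top, bottom = design_spacer_oligos("GATTACAGATTACAGATTAC", "AAAA", "CCCC")
print(top)
print(bottom)
```

After annealing, the spacer regions pair with each other and the two 4-nt extensions remain single-stranded for ligation into the cut vector.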
pET21a-raldh-IIdR-blh: ATRA Biosynthesis Expression Plasmid
We engineered a new expression plasmid to produce all-trans retinoic acid (ATRA) in E. coli. This construct carries blh, raldh, and IIdR in tandem under a strong trc promoter with dual T7 terminators. Sequences were codon-optimized and assembled as a single operon in pET21a. Prior work shows that co-expressing Blh and Raldh yields measurable ATRA [4]; our plasmid encodes these enzymes plus IIdR in one cassette. Future teams can transform pET21a-raldh-IIdR-blh into β-carotene–producing strains and induce expression to generate ATRA. The trc promoter is intended to drive high expression, and the terminators to prevent read-through.
CRISPR–Cas9 Genome Integration Protocol for E. coli Nissle 1917
We developed a detailed workflow for scarless genome editing in Nissle 1917 using our CRISPR system:
- sgRNA plasmid assembly: Design a 20-nt guide; insert into pEcgRNA-N20 by annealing two 24-nt oligos and performing BsaI Golden Gate ligation. The ccdB cassette selects for correct ligations.
- Donor plasmid construction: Build a donor with ~500–1000 bp homology arms flanking the edit site, inserting any payload between arms if needed.
- One-pot transformation and editing: Co-transform Nissle 1917 with pEcCas (Cas9, λ-Red, sacB), the new pEcgRNA guide plasmid, and the donor. Induce to trigger a DSB and homologous repair.
- Plasmid curing: Counterselect pEcCas via sacB (sucrose) and force loss of the guide plasmid by continued cleavage. We optimized curing so edited cells are obtained in ~32 h instead of ~60 h [3].
Using this workflow, we achieved efficient, markerless genome integration in Nissle 1917. The full cycle (assemble guide, donor, transform, select, cure) completes in ~6–7 days [3].
Lactic Acid Chemotaxis Engineering
We engineered a lactate-chemotactic probiotic strain (EcN-SY) by inserting eTlpC, a hybrid of Helicobacter pylori TlpC and E. coli CFT073 domains, into the OmpT locus. CRISPR-assisted recombination ensured accurate genome integration, and PCR and sequencing verified construct fidelity.
Full Synthetic Pathway for ATRA Production
Our ATRA biosynthesis system is divided into:
Upstream module: IPP → GGPP → phytoene → lycopene → β-carotene (via crtE, crtB, crtI, crtY)
Downstream module: β-carotene → retinal → ATRA (via blh and raldh).
Two key plasmids — pET-21a-trc-crtEBIY and pET-21a-trc-raldh-IIdR-blh — were constructed to enable this two-stage biosynthesis. Testing confirmed expression and activity, while iterative learning cycles guided optimization through enzyme replacement and codon refinement.
Virtual Experiments · Demo Videos
A selection of virtual experiment demonstrations for education and public outreach.
To Future iGEMers (Lab)
Here we publish reusable assets for wet-lab & dry-lab: process animation, model visualizations, DOIs for citation, demo videos, and an online virtual experiment platform. Fork, adapt, and build faster.
Dry-lab Result Portal · Videos
Two short demos of the analysis/result portal.
Human Practices
Education
We promoted open collaboration by exchanging plasmids and E. coli Nissle 1917 strains with the OUC-Haide team and sharing experimental designs and ideas. Additionally, we co-created a “Synthetic Biology Handbook” with Jilin University and 33 global iGEM teams, integrating diverse expertise into one educational resource.
Integrating Cultural Creativity into Science Outreach
We designed a puzzle-shaped logo forming “SYPHU,” representing teamwork, curiosity, and the iGEM spirit. We shared these creative products at public events to engage the public.
Entrepreneurship
We embraced an entrepreneurial mindset throughout our project, from customer discovery and stakeholder interviews to market sizing, regulatory pathways, risk mapping, and go-to-market strategy. The team distilled these insights into an Enterprise Proposal to guide translational impact and partnerships.
To Future iGEMers (Human Practices)
Data ethics, inclusivity, and real-world adoption matter. Below are our reusable HP assets: clinical survey data, industry/market modeling, interactive platforms, and demo videos—ready for you to cite, fork, and extend.
HP Business Intelligence Cloud
Interactive dashboard for market & stakeholder analysis.
HP Questionnaire Platform
Multilingual, inclusive survey with ethics and consent module.
Streamlit Apps Repository
All our Streamlit apps (playgrounds, dashboards, and utilities) are organized in one GitLab repository. Clone it to reproduce or deploy modules quickly.
Quick actions
# Clone
git clone https://gitlab.igem.org/chenwenbin/streamlit_app.git
cd streamlit_app
# (Optional) create venv
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install & run
pip install -r requirements.txt
streamlit run app.py
References
[1] Bunne, C., et al. How to build the virtual cell with artificial intelligence: Priorities and opportunities. Cell, 2024, 187(25): 7045–7063.
[2] Qian, L.; Dong, Z.; Guo, T. Grow AI virtual cells: three data pillars and closed-loop learning. Cell Research, 2025, 1–3.
[3] Li, Q., et al. A modified pCas/pTargetF system for CRISPR-Cas9-assisted genome editing in Escherichia coli. Acta Biochimica et Biophysica Sinica, 2021, 53(5): 620–627.
[4] Han, M.; Lee, P.C. Microbial production of bioactive retinoic acid using metabolically engineered Escherichia coli. Microorganisms, 2021, 9(7): 1520.