Introduction

We warmly thank the iGEM community for its open platform and resources, and we’re deeply grateful to the teams before us whose shared knowledge made our journey possible. Standing on your shoulders, we were able to tackle challenges, learn fast, and grow as a team.

This page collects practical resources and lessons we want to pass on to future iGEMers—short, actionable materials that save time and reduce repeated mistakes. Our contributions include:

- Modeling Guidance
- Lab Guidance
- Human Practices Guidance

We hope these materials are useful starting points. Please adapt, improve, and share them back with the community—let’s keep the iGEM spirit alive and evolving.

Modeling

The cell, the fundamental unit of life, is an intricate entity whose properties and behaviors challenge the limits of physical and computational modeling. Every cell is a dynamic, adaptive system in which complex behavior emerges from a myriad of molecular interactions. Some aspects are remarkably robust to perturbations, while others are sensitive to minor disruptions. To understand a cell’s function, scientists have constructed virtual cell models to simulate, predict, and steer cell behavior [1]. We adopted this approach in our project by building an AI virtual cell (AIVC).

Constructing an AI virtual cell rests on three fundamental pillars: a priori knowledge, static architecture, and dynamic states [2]. We followed these principles throughout, and the sections below walk through our process.

Defining the Static Architecture — What a Cell Is

Morphological Profiling: Using segmentation masks, we extract quantitative shape descriptors (area, perimeter, eccentricity, solidity) per cell and frame.
Appearance and Texture Analysis: Using masked pixels from microscopy images, we compute intensity statistics and GLCM-based texture features (contrast, energy, homogeneity).
Deep Learning for Visual Fingerprinting: Image patches feed into a pre-trained ResNet-18 (with the final classifier removed) to yield a 128-dimensional feature vector per cell.
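
As a rough illustration of the morphological profiling step, the sketch below derives area, centroid, and eccentricity from a binary mask using second-order image moments. It is a minimal pure-Python stand-in, not our pipeline code: the function name `shape_descriptors` and the list-of-rows mask format are assumptions made for this example, and a library routine such as scikit-image's `regionprops` would normally be used for the full descriptor set (perimeter, solidity).

```python
import math


def shape_descriptors(mask):
    """Area, centroid and eccentricity of the foreground pixels of a binary mask.

    `mask` is a list of rows of 0/1 values (hypothetical input format).
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")

    cy = sum(r for r, _ in pixels) / area
    cx = sum(c for _, c in pixels) / area

    # Central second moments, i.e. the covariance of the pixel coordinates.
    myy = sum((r - cy) ** 2 for r, _ in pixels) / area
    mxx = sum((c - cx) ** 2 for _, c in pixels) / area
    mxy = sum((r - cy) * (c - cx) for r, c in pixels) / area

    # Eigenvalues of the 2x2 covariance matrix scale the major and minor axes.
    half_trace = (mxx + myy) / 2
    root = math.sqrt(((mxx - myy) / 2) ** 2 + mxy ** 2)
    lam_major, lam_minor = half_trace + root, half_trace - root

    # Eccentricity of the ellipse with the same second moments (0 = circular).
    ecc = math.sqrt(1 - lam_minor / lam_major) if lam_major > 0 else 0.0
    return {"area": area, "centroid": (cy, cx), "eccentricity": ecc}


# Usage: a compact square cell versus an elongated one.
square = [[1] * 4 for _ in range(4)]
bar = [[1] * 8 for _ in range(2)]
print(shape_descriptors(square))
print(shape_descriptors(bar))
```

A square mask gives an eccentricity of 0, while the elongated bar scores close to 1, which is the contrast this descriptor is meant to capture.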

Capturing the Dynamic States — What a Cell Does

Trajectory Construction: We link per-frame cell centroids by ID to build time-series trajectories.
Motion and Behavior Quantification: From trajectories we derive instantaneous velocity (dx, dy), speed, turning angle, and history-aware features.
Modeling Population Dynamics: Using a KD-Tree of cell positions per frame, we compute:
- Local Density: neighbors within a defined radius.
- Collective Alignment: mean direction similarity with neighbors.
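
The motion and neighborhood features above can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: the function names are hypothetical, trajectories are plain lists of `(x, y)` centroids, and a brute-force distance scan stands in for the KD-Tree query, which gives the same neighbor counts but scales worse on large frames.

```python
import math


def motion_features(track):
    """Per-step dx, dy, speed and turning angle for one cell.

    `track` is a list of (x, y) centroids at consecutive frames.
    """
    steps = []
    prev_heading = None
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        heading = math.atan2(dy, dx)
        if prev_heading is None:
            turn = 0.0  # no previous step to compare against
        else:
            # Wrap the change in heading into (-pi, pi].
            diff = heading - prev_heading
            turn = math.atan2(math.sin(diff), math.cos(diff))
        steps.append({"dx": dx, "dy": dy, "speed": math.hypot(dx, dy), "turn": turn})
        prev_heading = heading
    return steps


def local_density(positions, index, radius):
    """Number of other cells within `radius` of cell `index` (brute-force scan)."""
    x, y = positions[index]
    return sum(
        1
        for j, (u, v) in enumerate(positions)
        if j != index and math.hypot(u - x, v - y) <= radius
    )


def alignment(velocities, index, neighbor_ids):
    """Mean cosine similarity between a cell's velocity and its neighbors'."""
    vx, vy = velocities[index]
    norm = math.hypot(vx, vy)
    sims = []
    for j in neighbor_ids:
        ux, uy = velocities[j]
        other = math.hypot(ux, uy)
        if norm > 0 and other > 0:  # skip stationary cells, direction undefined
            sims.append((vx * ux + vy * uy) / (norm * other))
    return sum(sims) / len(sims) if sims else 0.0


# Usage: a cell that moves right, then turns left by 90 degrees.
for step in motion_features([(0, 0), (1, 0), (1, 1)]):
    print(step)
```

An alignment near 1 means a cell moves with its neighbors, near -1 against them, and near 0 indicates no shared direction.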

Integrating A Priori Knowledge — What We Know

Machine Learning Framework: A Random Forest Regressor predicts future dx/dy from the full feature set; GroupKFold prevents leakage across a cell’s time points.
Pharmacodynamic Model (E_max): A user-provided “dose” modulates:
- Velocity Factor (inhibition/stimulation of speed)
- Persistence Factor (directional memory/randomness)
Directed Motion & Stochasticity: Chemotaxis is simulated by adding a constant drift vector; small random noise acknowledges inherent biological variability.
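
To make the dose, drift, and noise terms concrete, here is a minimal sketch of one simulation step. It assumes a sigmoidal (Hill-type) E_max curve and treats the velocity factor as `1 - effect`, i.e. pure inhibition. The parameter names (`e_max`, `ec50`, `hill`, `noise_sd`) and all default values are illustrative assumptions rather than fitted values from our model, and the persistence factor is left out for brevity.

```python
import random


def emax_effect(dose, e_max=1.0, ec50=1.0, hill=1.0):
    """Sigmoidal E_max response: 0 at dose 0, approaching e_max at high dose."""
    if dose <= 0:
        return 0.0
    return e_max * dose ** hill / (ec50 ** hill + dose ** hill)


def simulate_step(pos, velocity, dose, drift=(0.0, 0.0), noise_sd=0.0,
                  rng=random, e_max=0.8, ec50=1.0):
    """Advance one cell by one frame under a dose-dependent slowdown.

    pos, velocity, drift: (x, y) tuples. All numeric defaults are
    illustrative assumptions, not measured parameters.
    """
    # Velocity factor: fraction of the predicted speed that survives the dose.
    factor = 1.0 - emax_effect(dose, e_max=e_max, ec50=ec50)
    # Constant drift models chemotaxis; Gaussian noise models variability.
    dx = factor * velocity[0] + drift[0] + rng.gauss(0.0, noise_sd)
    dy = factor * velocity[1] + drift[1] + rng.gauss(0.0, noise_sd)
    return (pos[0] + dx, pos[1] + dy)


# Usage: 20 frames of a cell drifting to the right at the half-maximal dose.
rng = random.Random(0)  # seeded so the run is reproducible
pos = (0.0, 0.0)
for _ in range(20):
    pos = simulate_step(pos, velocity=(1.0, 0.0), dose=1.0,
                        drift=(0.2, 0.0), noise_sd=0.1, rng=rng)
print(pos)
```

In a full run the `velocity` argument would come from the Random Forest prediction for that frame rather than a fixed vector.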

Scheme 1. Cell trajectory demonstrations (two views).

ATRA Synthetic Pathway Modeling

To extend the AIVC framework into applied biotechnology, we incorporated mathematical modeling from our ATRA biosynthesis project.

ATRA Metabolic Pathway Kinetics

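For each enzymatic step $i$ in the ATRA synthesis pathway, the reaction rate $v_i$ follows Michaelis-Menten kinetics with maximum rate $V_{\max,i}$, Michaelis constant $K_{m,i}$, and substrate concentration $S_i$: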

$$ v_i \;=\; \frac{V_{\max,i}\,S_i}{K_{m,i}+S_i} $$

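Flux balance across the pathway is described by the steady-state condition, where $\mathbf{S}$ is the stoichiometric matrix (distinct from the substrate concentration $S_i$) and $\mathbf{v}$ is the vector of reaction rates $v_i$: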

$$ \mathbf{S}\,\mathbf{v}=\mathbf{0} $$

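Gene expression dynamics are modeled with transcription rate $\alpha_j$, mRNA degradation rate $\beta_j$, translation rate $\gamma_j$, and enzyme degradation rate $\delta_j$: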

$$ \frac{d[\mathrm{mRNA}_j]}{dt}=\alpha_j-\beta_j[\mathrm{mRNA}_j],\quad \frac{d[E_j]}{dt}=\gamma_j[\mathrm{mRNA}_j]-\delta_j[E_j] $$

This system provides quantitative insight into the steady-state balance of substrate flux and enzyme expression, helping predict rate-limiting steps in ATRA biosynthesis.
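
The steady state of the expression equations follows from setting both derivatives to zero, which gives $[\mathrm{mRNA}_j]^* = \alpha_j/\beta_j$ and $[E_j]^* = \gamma_j\alpha_j/(\beta_j\delta_j)$. The sketch below checks this numerically with a simple forward-Euler integration. The function names and the parameter values in the usage lines are arbitrary choices for illustration, not measured rates.

```python
def steady_state(alpha, beta, gamma, delta):
    """Analytic steady state of the two-equation expression model."""
    mrna = alpha / beta
    enzyme = gamma * mrna / delta
    return mrna, enzyme


def integrate_expression(alpha, beta, gamma, delta, t_end,
                         dt=0.01, mrna0=0.0, enzyme0=0.0):
    """Forward-Euler integration of d[mRNA]/dt and d[E]/dt up to t_end."""
    mrna, enzyme = mrna0, enzyme0
    for _ in range(int(round(t_end / dt))):
        d_mrna = alpha - beta * mrna            # transcription minus decay
        d_enzyme = gamma * mrna - delta * enzyme  # translation minus decay
        mrna += dt * d_mrna
        enzyme += dt * d_enzyme
    return mrna, enzyme


# Usage with illustrative rates (arbitrary units).
params = dict(alpha=2.0, beta=0.5, gamma=1.0, delta=0.1)
print("analytic :", steady_state(**params))
print("simulated:", integrate_expression(t_end=200.0, **params))
```

Because the enzyme level scales with $\alpha_j\gamma_j/(\beta_j\delta_j)$, slow protein degradation (small $\delta_j$) both raises the plateau and lengthens the time needed to reach it.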

**Figure 2.** ATRA Anabolic Metabolic Pathway Diagram

Plasmid Optimization Framework

We constructed a multi-objective optimization model for plasmid design, targeting:

- GC content balance (40%–60%)
- Codon Adaptation Index (CAI) optimization
- Thermodynamic stability (minimization)

Optimization proceeds along a Pareto frontier, enabling trade-offs among expression efficiency, genetic stability, and structural robustness.
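
As a small illustration of the Pareto idea, the sketch below computes GC content and filters a set of candidate designs down to the non-dominated ones. It assumes every objective has been expressed as a quantity to minimize (for example GC deviation from 50%, `1 - CAI`, and a stability penalty). The candidate names and scores are invented for the example and are not results from our optimization.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and better on one.

    All objectives are minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Names of candidates that no other candidate dominates.

    `candidates` maps a design name to a tuple of objective values.
    """
    return [
        name
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    ]


# Usage: (|GC - 0.5|, 1 - CAI, stability penalty), all hypothetical numbers.
designs = {
    "design_A": (0.02, 0.10, 5.0),
    "design_B": (0.05, 0.05, 4.0),
    "design_C": (0.06, 0.12, 6.0),  # worse than design_A on every objective
}
print(gc_content("ATGGCGTACC"))
print(pareto_front(designs))
```

Designs on the front represent different trade-offs, so the final choice still needs a preference among the three objectives.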

**Figure 3.** Plasmid Optimization Using Multi-Objective Pareto Front

Metabolic Flux Balance Analysis (FBA)

The steady-state stoichiometric model and objective:

$$ \mathbf{S}\,\mathbf{v}=\mathbf{0}, \qquad \max \; \mathbf{c}^{\mathsf T}\mathbf{v} $$

permit prediction of pathway throughput and identification of key flux bottlenecks. Visualization of ATRA flux distribution maps further supports rational strain engineering and integration into AIVC simulations.
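
For intuition, the steady-state constraint can be worked through on a toy unbranched chain, where $\mathbf{S}\,\mathbf{v}=\mathbf{0}$ forces every flux to be equal and the maximal throughput is set by the tightest capacity bound. The sketch below covers only that special case. The three-reaction network and its bounds are hypothetical, and a branched network would need a linear-programming solver (for example `scipy.optimize.linprog` or COBRApy) rather than this shortcut.

```python
def mat_vec(S, v):
    """Product of a stoichiometric matrix (list of rows) and a flux vector."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if S v = 0 within tolerance, i.e. no metabolite accumulates."""
    return all(abs(x) <= tol for x in mat_vec(S, v))


def chain_optimum(upper_bounds):
    """Maximal-throughput fluxes of an unbranched chain with 1:1 stoichiometry.

    Every flux equals the smallest upper bound; also returns the index of
    the reaction that sets it (the bottleneck).
    """
    limit = min(upper_bounds)
    return [limit] * len(upper_bounds), upper_bounds.index(limit)


# Toy chain:  -> M1 -> M2 ->   (rows = metabolites, columns = reactions)
S = [
    [1, -1, 0],   # M1: made by reaction 0, consumed by reaction 1
    [0, 1, -1],   # M2: made by reaction 1, consumed by reaction 2
]
bounds = [10.0, 3.0, 8.0]  # hypothetical capacity of each reaction

v, bottleneck = chain_optimum(bounds)
print("fluxes:", v, "bottleneck reaction:", bottleneck)
print("steady state:", is_steady_state(S, v))
```

Raising the capacity of any step other than the bottleneck leaves the throughput unchanged, which is why identifying that step guides strain engineering.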

**Figure 4.** Plasmid map of the fused ATRA plasmid

**Figure 5.** Plasmid map of the fused β-carotene-synthesizing plasmid

System Robustness and Sensitivity

Parameter sensitivity coefficients quantify robustness against perturbations, guiding optimization of promoter strength and ribosome binding sites.

$$ S_{y_i}^{(p_k)} \;=\; \frac{\partial \ln y_i}{\partial \ln p_k} $$
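
A sensitivity coefficient of this form can be estimated numerically with a central finite difference in log space. The sketch below applies it to the Michaelis-Menten rate law, where the exact values are known: 1 for $V_{\max}$, $-K_m/(K_m+S)$ for $K_m$, and $K_m/(K_m+S)$ for $S$. The helper name `log_sensitivity` and the parameter values are assumptions for illustration, and the method requires strictly positive parameters and outputs.

```python
import math


def log_sensitivity(f, params, name, rel_step=1e-4):
    """Estimate d ln f / d ln p for parameter `name` by central difference.

    `f` is called with keyword arguments taken from `params`.
    """
    up = dict(params)
    down = dict(params)
    up[name] = params[name] * (1 + rel_step)
    down[name] = params[name] * (1 - rel_step)
    d_ln_y = math.log(f(**up)) - math.log(f(**down))
    d_ln_p = math.log(up[name]) - math.log(down[name])
    return d_ln_y / d_ln_p


def mm_rate(v_max, k_m, s):
    """Michaelis-Menten reaction rate."""
    return v_max * s / (k_m + s)


# Usage: sensitivities at S = K_m (illustrative values).
params = {"v_max": 10.0, "k_m": 2.0, "s": 2.0}
for name in params:
    print(name, round(log_sensitivity(mm_rate, params, name), 4))
```

A coefficient near zero marks a parameter the output is robust to, while a magnitude near one means a 1% change in the parameter shifts the output by about 1%.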

**Figure 6.** System Parameter Sensitivity Heatmap

AIVC Demonstration

To Future iGEMers

AI Virtual Cell (AIVC) is our contribution to iGEM. It shows how computational modeling and synthetic biology reinforce each other. Take the code and datasets, adapt them to your system, and push predictive engineering forward—one virtual cell at a time.

Code & Data DOI

10.5281/zenodo.17259374

Model — Full Gallery

Complete images & data browser (Streamlit).

Open Gallery Download from DOI

AIVC Simulation Playground

Try fun, parameterized AIVC runs in the browser.

Launch App

GitHub Repository

Public source code mirror of AIVC and notebooks:

View on GitHub Cite via DOI

Lab

BBa_25P4P9KL

In biological systems, retinoic acid is biosynthesized from β-carotene in two reaction steps. β-Carotene is symmetrically cleaved by β-carotene 15,15'-oxygenase to generate retinaldehyde (also known as retinal), which is then oxidized by retinal dehydrogenase (RALDH) to form retinoic acid. To synthesize all-trans retinoic acid (ATRA) from β-carotene as the precursor, we integrated three ATRA synthesis genes, raldh, IIdR, and blh, into the pET-21a-trc plasmid through homologous recombination, yielding the plasmid pET-21a-trc-raldh-IIdR-blh.

**Figure 2.** Diagram of the BBa_25P4P9KL construct (image label: flowchart of pEcgRNA-N20 plasmid construction)

pEcgRNA-N20: Modular CRISPR sgRNA Plasmid with ccdB Selection

We built a customizable CRISPR guide RNA plasmid (pEcgRNA-N20) based on the pTargetF/pEcgRNA system. The plasmid contains a toxic ccdB cassette flanked by BsaI (Type IIS) sites. Users design a pair of complementary 24-nt oligos encoding any 20-nt spacer, anneal them, and ligate into BsaI-linearized pEcgRNA. Successful clones lose ccdB (counterselection), ensuring only correct sgRNA inserts survive. This streamlines sgRNA construction: two short oligos and a one-pot Golden Gate reaction. We verified broad utility (E. coli K-12, B, W, and Nissle 1917) and simplified CRISPR workflows [3].
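
The oligo design step can be sketched as below: each 24-nt oligo is a 4-nt overhang followed by the 20-nt spacer (top strand) or its reverse complement (bottom strand). The 4-nt overhangs depend on the BsaI-cut vector and are deliberately left as arguments, so the values in the usage lines are placeholders, not the real pEcgRNA overhangs. Take those from the vector map. The function names are likewise hypothetical.

```python
BSAI_SITES = ("GGTCTC", "GAGACC")  # BsaI recognition site and its reverse complement


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_oligos(spacer, top_overhang, bottom_overhang):
    """Return (top, bottom) 24-nt oligos for a 20-nt spacer.

    The overhangs must match the sticky ends of the BsaI-linearized vector;
    they are inputs here because this sketch does not know the vector.
    """
    spacer = spacer.upper()
    top_overhang = top_overhang.upper()
    bottom_overhang = bottom_overhang.upper()

    if len(spacer) != 20 or set(spacer) - set("ACGT"):
        raise ValueError("spacer must be exactly 20 nt of A/C/G/T")
    if len(top_overhang) != 4 or len(bottom_overhang) != 4:
        raise ValueError("overhangs must be 4 nt each")

    top = top_overhang + spacer
    bottom = bottom_overhang + reverse_complement(spacer)

    # An internal BsaI site would be cut during the Golden Gate reaction.
    for oligo in (top, bottom):
        if any(site in oligo for site in BSAI_SITES):
            raise ValueError("oligo contains a BsaI site; choose another spacer")
    return top, bottom


# Usage with placeholder overhangs (replace with the vector's real ones).
top, bottom = design_oligos("GATTACAGATTACAGATTAC",
                            top_overhang="AAAA", bottom_overhang="CCCC")
print(top)
print(bottom)
```

Checking for internal BsaI sites up front avoids spacers that would be digested in the one-pot reaction.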

pET21a-raldh-IIdR-blh: ATRA Biosynthesis Expression Plasmid

We engineered a new expression plasmid to produce all-trans retinoic acid (ATRA) in E. coli. This construct carries blh, raldh, and IIdR in tandem under a strong trc promoter with dual T7 terminators. Sequences were codon-optimized and assembled as a single operon in pET21a. Prior work shows that co-expressing Blh and Raldh yields measurable ATRA [4]; our plasmid encodes these enzymes plus IIdR in one cassette. Future teams can transform pET21a-raldh-IIdR-blh into β-carotene–producing strains and induce expression to generate ATRA. The trc promoter drives high expression, and the terminators prevent read-through.

**Figure 3.** Schematic representation of recombinant plasmid pET21a-raldh-IIdR-blh.

**CRISPR–Cas9 Genome Integration Protocol for E. coli Nissle 1917**

We developed a detailed workflow for scarless genome editing in Nissle 1917 using our CRISPR system:

sgRNA plasmid assembly: Design a 20-nt guide; insert into pEcgRNA-N20 by annealing two 24-nt oligos and performing BsaI Golden Gate ligation. The ccdB cassette selects for correct ligations.
Donor plasmid construction: Build a donor with ~500–1000 bp homology arms flanking the edit site, inserting any payload between arms if needed.
One-pot transformation and editing: Co-transform Nissle 1917 with pEcCas (Cas9, λ-Red, sacB), the new pEcgRNA guide plasmid, and the donor. Induce to trigger a DSB and homologous repair.
Plasmid curing: Counterselect pEcCas via sacB (sucrose) and force loss of the guide plasmid by continued cleavage. We optimized curing so edited cells are obtained in ~32 h instead of ~60 h [3].

Using this workflow, we achieved efficient, markerless genome integration in Nissle 1917. The full cycle (assemble guide, donor, transform, select, cure) completes in ~6–7 days [3].

Lactic Acid Chemotaxis Engineering

We engineered a lactate-chemotactic probiotic strain (EcN-SY) by inserting eTlpC, a hybrid of Helicobacter pylori TlpC and E. coli CFT073 domains, into the ompT locus. CRISPR-assisted recombination ensured accurate genome integration, and PCR and sequencing verified construct fidelity.

Full Synthetic Pathway for ATRA Production

Our ATRA biosynthesis system is divided into two modules:
- Upstream module: IPP → GGPP → phytoene → lycopene → β-carotene (via crtE, crtB, crtI, crtY)
- Downstream module: β-carotene → retinal → ATRA (via blh and raldh)

Two key plasmids, pET-21a-trc-crtEBIY and pET-21a-trc-raldh-IIdR-blh, were constructed to enable this two-stage biosynthesis. Testing confirmed expression and activity, while iterative learning cycles guided optimization through enzyme replacement and codon refinement.

Virtual Experiments · Demo Videos

A selection of virtual experiment demonstrations for education and public outreach.

Clip 1

Clip 2

Clip 3

Clip 4

Clip 5

Clip 6

To Future iGEMers (Lab)

Here we publish reusable assets for wet-lab & dry-lab: process animation, model visualizations, DOIs for citation, demo videos, and an online virtual experiment platform. Fork, adapt, and build faster.

Wet-lab Resources · DOI

10.5281/zenodo.17274589

Model Visualization & Results · DOI

10.5281/zenodo.17259332

Dry-lab Result Portal · Videos

Two short demos of the analysis/result portal.

Open Video 1 Open Video 2

Virtual Experiment Platform

Run interactive experiments online (Streamlit).

Launch Platform

Human Practices

Education

We promoted open collaboration by exchanging plasmids and E. coli Nissle 1917 strains with the OUC-Haide team and sharing experimental designs and ideas. Additionally, we co-created a “Synthetic Biology Handbook” with Jilin University and 33 global iGEM teams, integrating diverse expertise into one educational resource.

Download the Handbook (PDF)

Integrating Cultural Creativity into Science Outreach

We designed a puzzle-shaped logo forming “SYPHU,” representing teamwork, curiosity, and the iGEM spirit. We shared these creative products at public events to engage the public.

Figure 4. SYPHU puzzle-shaped logo.

Entrepreneurship

We embrace an entrepreneurial mindset throughout our project—from customer discovery and stakeholder interviews to market sizing, regulatory pathways, risk mapping, and go-to-market strategy. The team distilled these insights into a polished Enterprise Proposal to guide translational impact and partnerships.

Problem–Solution Fit & Customer Segments

Regulatory & Ethics Pathway (pre-clinical → clinical)

Business Model & Pricing Hypotheses

Competitive Landscape & Differentiation

IP Strategy & Collaboration Roadmap

Operational Plan & Risk Mitigation

To Future iGEMers (Human Practices)

Data ethics, inclusivity, and real-world adoption matter. Below are our reusable HP assets: clinical survey data, industry/market modeling, interactive platforms, and demo videos—ready for you to cite, fork, and extend.

HP Clinical Data · DOI

10.5281/zenodo.17259406

HP Industry Model · DOI

10.5281/zenodo.17259310

HP Business Intelligence Cloud

Interactive dashboard for market & stakeholder analysis.

Open BI Platform

HP Questionnaire Platform

Multilingual, inclusive survey with ethics and consent module.

Open Questionnaire

Streamlit Apps Repository

All our Streamlit apps (playgrounds, dashboards, and utilities) are organized in one GitLab repository. Clone it to reproduce or deploy modules quickly.

GitLab · Streamlit apps collection

Quick actions

https://gitlab.igem.org/chenwenbin/streamlit_app.git

# Clone
git clone https://gitlab.igem.org/chenwenbin/streamlit_app.git
cd streamlit_app

# (Optional) create venv
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate

# Install & run
pip install -r requirements.txt
streamlit run app.py

References

[1] Bunne, C., et al. How to build the virtual cell with artificial intelligence: Priorities and opportunities. Cell, 2024, 187(25): 7045–7063.
[2] Qian, L.; Dong, Z.; Guo, T. Grow AI virtual cells: three data pillars and closed-loop learning. Cell Research, 2025, 1–3.
[3] Li, Q., et al. A modified pCas/pTargetF system for CRISPR-Cas9-assisted genome editing in Escherichia coli. Acta Biochimica et Biophysica Sinica, 2021, 53(5): 620–627.
[4] Han, M.; Lee, P.C. Microbial production of bioactive retinoic acid using metabolically engineered Escherichia coli. Microorganisms, 2021, 9(7): 1520.