Try EMS-Optimizer at https://2025.igem.wiki/software-tools/fudan/
Source code at https://gitlab.igem.org/2025/software-tools/fudan
Problem Statement
Directed evolution through EMS (ethyl methanesulfonate) mutagenesis is powerful but faces a key challenge: how to control mutation rates in specific regions. Researchers need to:
- Maximize mutations in target regions for library diversity
- Minimize mutations in critical functional domains to preserve protein function
Current limitation: To our knowledge, no existing tool rationally designs sequences with tunable EMS sensitivity while preserving the encoded amino-acid sequence.
Our Solution
EMS-Optimizer intelligently selects synonymous codons to modulate G/C content, thereby controlling EMS mutation susceptibility. Key innovations:
- Dual modes: Forward (G→A, C→T for EMS) and Reverse (A→G, T→C for suppressor screening)
- Risk-based algorithm: Scores codons by stop/missense/silent mutation potential (not just G/C count)
- CAI integration: Monitors translation efficiency impact using Codon Adaptation Index
- GFP specialization: Built-in analysis for 22+ critical fluorescent protein sites
- Real-time visualization: Interactive risk heatmaps with per-codon probability tooltips

Key Features
Intelligent Mutagenesis Design
At the core of our software is the Smart Codon Selection tool, which allows users to strategically target specific regions of a gene for mutagenesis. This feature operates in two distinct modes to suit different experimental goals. The Maximize mode is engineered to introduce high-risk codons into EMS-sensitive regions, thereby increasing the likelihood of desired mutations. Conversely, the Minimize mode selects low-risk codons for regions intended to remain stable. To quantify the potential impact of mutations, we have implemented a risk scoring system that heavily penalizes stop codons (×10) and missense mutations (×3) over silent mutations (×0). All codon selections are optimized for expression in Saccharomyces cerevisiae (yeast), ensuring that the resulting protein variants can be effectively produced for our project.
Independently of the Maximize/Minimize choice, the software supports two transition modes for modeling genetic changes. The default G:C → A:T Transition Mode simulates EMS mutagenesis, specifically G→A and C→T transitions, and suits standard mutagenesis screening experiments. For researchers studying genetic suppression or reversion events, the A:T → G:C Transition Mode models the corresponding reversion mutations (A→G, T→C), which helps identify suppressor mutations and supports reversion screening.
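As a rough illustration of how Maximize and Minimize could combine with the two transition modes, the Python sketch below swaps each codon for the synonymous codon with the highest or lowest risk score. It is not the project's TypeScript code. The names `risk`, `recode`, and `translate` are hypothetical, the scoring follows one reading of the weights described above (+1 per mutable base, +10 nonsense, +3 missense), and yeast codon preference is ignored for brevity. The input is assumed to be a valid CDS whose length is a multiple of 3.

```python
from itertools import product

# Standard genetic code, built in TCAG order (a common compact idiom).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Transition maps for the two modes described in the text.
TRANSITIONS = {"forward": {"G": "A", "C": "T"}, "reverse": {"A": "G", "T": "C"}}


def risk(codon, swap):
    """Assumed scoring: +1 per mutable base, +10 nonsense, +3 missense, +0 silent."""
    aa = CODON_TABLE[codon]
    total = 0
    for i, base in enumerate(codon):
        if base in swap:
            new_aa = CODON_TABLE[codon[:i] + swap[base] + codon[i + 1:]]
            total += 1 + (0 if new_aa == aa else 10 if new_aa == "*" else 3)
    return total


def recode(cds, goal="minimize", mode="forward"):
    """Replace each codon with its lowest-risk or highest-risk synonym."""
    swap = TRANSITIONS[mode]
    pick = min if goal == "minimize" else max
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3].upper()
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        out.append(pick(synonyms, key=lambda c: risk(c, swap)))
    return "".join(out)


def translate(cds):
    """Translate a CDS so callers can confirm the protein is unchanged."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
```

For example, `recode("AAGCAG", "minimize")` returns `"AAACAA"` in forward mode: both sequences encode Lys-Gln, but the recoded one carries fewer G/C bases for EMS to hit. A fuller version would also weigh codon usage when several synonyms tie on risk.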
Real-time Expression and Functional Analysis
To maintain a balance between mutagenesis and protein expression, our software includes Translation Efficiency Tracking. This feature calculates the Codon Adaptation Index (CAI) in real-time as sequence modifications are made. Visual cues—a green indicator (🟢) for an increase in CAI, red (🔴) for a decrease, and white (⚪) for neutral changes—provide immediate feedback, allowing users to make informed decisions that align their mutagenesis goals with optimal expression efficiency.
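The CAI tracking described above can be sketched in Python as follows. This uses the standard CAI definition (the geometric mean of each codon's relative adaptiveness) and is not the project's actual code. `TOY_USAGE` holds made-up placeholder counts rather than real S. cerevisiae data, and the function names and the tolerance `tol` are assumptions for illustration.

```python
import math

# Hypothetical codon counts grouped by amino acid. Placeholder numbers only;
# a real run would load a measured S. cerevisiae codon usage table.
TOY_USAGE = {
    "K": {"AAA": 40, "AAG": 60},
    "Q": {"CAA": 75, "CAG": 25},
}


def relative_adaptiveness(usage):
    """w = codon count / count of the most used synonymous codon."""
    weights = {}
    for counts in usage.values():
        best = max(counts.values())
        for codon, n in counts.items():
            weights[codon] = n / best
    return weights


def cai(cds, weights):
    """Geometric mean of w over the codons that have a weight."""
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


def cai_indicator(before, after, tol=1e-9):
    """Map a CAI change to the green/red/white cue described in the text."""
    if after > before + tol:
        return "🟢"
    if after < before - tol:
        return "🔴"
    return "⚪"
```

Averaging logarithms rather than multiplying weights directly keeps the geometric mean stable for long sequences. Codons with a zero count would need a small pseudo-weight before taking the logarithm, which this sketch leaves out.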

For projects focused on fluorescent protein engineering, the software offers a GFP-Specific Analysis module. This pre-configured tool is tailored for the analysis of Green Fluorescent Protein (GFP) and its derivatives. It automatically identifies and tracks over 22 critical sites, including the essential chromophore core (Thr65-Tyr66-Gly67), key catalytic residues (Arg96, Glu222), and common color variants such as Y66H (BFP), Y66W (CFP), and T203Y (YFP). The analysis is weighted to prioritize these functionally significant residues, and reference sequences for EGFP and a yeast-optimized yEGFP are included for easy comparison.
User-Friendly and Accessible
Our software is designed with the user in mind. As a web-based application, it requires no installation and is accessible from any modern web browser. It accommodates both DNA and protein sequences as input, with an automatic conversion feature for seamless workflow. The mutation rate is highly adjustable, with precision down to 1×10⁻¹⁰, giving users fine-grained control over their in silico experiments. Once an optimized sequence is generated, it can be copied to the clipboard with a single click. To cater to a global user base, the interface is available in both English and Chinese.
How It Works
Algorithm Overview

Risk Scoring Method
For each codon position (0, 1, 2):
- Identify mutable bases (G/C for forward, A/T for reverse)
- Simulate mutation (G→A, C→T, or A→G, T→C)
- Score outcome:
  - Nonsense (stop codon): +10 points
  - Missense (AA change): +3 points
  - Silent (synonymous): 0 points
  - Base presence: +1 point
- Sum scores across 3 positions
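The scoring steps above can be written out as a short Python sketch. It is an interpretation rather than the project's implementation: it assumes the +1 "base presence" point is added once for every mutable base, on top of the outcome score for that position. The function name `codon_risk` is hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Which bases mutate, and to what, in each mode.
TRANSITIONS = {
    "forward": {"G": "A", "C": "T"},  # EMS-type transitions
    "reverse": {"A": "G", "T": "C"},  # reversion / suppressor screening
}


def codon_risk(codon, mode="forward"):
    """Sum the risk contributions of the three codon positions."""
    codon = codon.upper()
    original_aa = CODON_TABLE[codon]
    score = 0
    for pos, base in enumerate(codon):
        mutated_base = TRANSITIONS[mode].get(base)
        if mutated_base is None:
            continue  # base is not mutable in this mode
        score += 1  # base presence (assumed to apply per mutable base)
        mutant = codon[:pos] + mutated_base + codon[pos + 1:]
        mutant_aa = CODON_TABLE[mutant]
        if mutant_aa == original_aa:
            continue  # silent: +0
        score += 10 if mutant_aa == "*" else 3  # nonsense or missense
    return score
```

Under these assumptions, `codon_risk("TGG")` is 22 in forward mode, because both G→A changes in the Trp codon create stop codons (TAG and TGA), while `codon_risk("AAA")` is 0 because the codon has no G or C.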
Probability Calculation
Per-codon probabilities: direct simulation of all 3 positions
Sequence-wide cumulative probability, where Pᵢ is the mutation probability of codon i:
P(≥1 mutation) = 1 - ∏ᵢ (1 - Pᵢ)
implemented with log-space arithmetic for numerical stability:
P(≥1 mutation) = -expm1(∑ᵢ log1p(-Pᵢ))
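A small Python sketch shows why the log-space form matters at very low mutation rates. The function names are hypothetical, and the per-codon probabilities are taken as given inputs. The naive product form is included only for comparison.

```python
import math


def p_at_least_one(per_codon_probs):
    """P(>=1 mutation) = 1 - prod(1 - p_i), evaluated in log space."""
    log_none = 0.0  # running log of P(no mutation anywhere)
    for p in per_codon_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        if p == 1.0:
            return 1.0  # log1p(-1) is undefined; a certain event dominates
        log_none += math.log1p(-p)
    return -math.expm1(log_none)


def p_at_least_one_naive(per_codon_probs):
    """Direct product form; loses precision when every p_i is tiny."""
    none = 1.0
    for p in per_codon_probs:
        none *= 1.0 - p
    return 1.0 - none
```

With three codons at p = 1e-18 each, the naive form returns exactly 0.0 because `1.0 - 1e-18` rounds to 1.0 in double precision, whereas the log-space form returns approximately 3e-18.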
Installation & Usage
Online Access
Live demo at https://2025.igem.wiki/software-tools/fudan/
Local Deployment
git clone https://gitlab.igem.org/2025/software-tools/fudan.git
cd fudan
pnpm install
pnpm run serve
Requirements: Node.js ≥18, modern browser (Chrome/Firefox/Safari/Edge)
Quick Start Guide
- Input sequence: Paste CDS (must be multiple of 3) or protein sequence
- Select mutation mode: Forward (EMS) or Reverse (suppressor)
- Adjust mutation rates (optional): Default 1.67×10⁻⁸% per site
- Choose optimization: Minimize (protective) or Maximize (mutagenic)
- Analyze results:
  - View color-coded risk heatmap (red=stop, orange=change, green=silent)
  - Check CAI changes in statistics panel
  - Hover codons for detailed probabilities
- Copy optimized sequences with one click
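The input rule in the first step (a CDS must be a multiple of 3, or the input is a protein sequence) could be checked along the lines of the Python sketch below. The names `classify_input` and `translate` are hypothetical, and the detection heuristic is an assumption: any input made only of A, C, G, and T is treated as DNA, so a protein composed solely of Ala, Cys, Gly, and Thr would be misclassified.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PROTEIN_LETTERS = set(AMINO)  # 20 amino acids plus "*" for stop


def classify_input(raw):
    """Return ("dna", seq) or ("protein", seq) after basic validation."""
    seq = "".join(raw.split()).upper()  # drop whitespace and line breaks
    if not seq:
        raise ValueError("empty sequence")
    if set(seq) <= set("ACGT"):
        if len(seq) % 3 != 0:
            raise ValueError("CDS length must be a multiple of 3")
        return "dna", seq
    if set(seq) <= PROTEIN_LETTERS:
        return "protein", seq
    raise ValueError("unrecognized characters in sequence")


def translate(cds):
    """Translate a validated CDS, keeping '*' for stop codons."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
```

For instance, `classify_input("atg aaa\nTAA")` yields `("dna", "ATGAAATAA")`, which `translate` turns into `"MK*"`.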
Development Process (DBTL cycles)
Cycle 1: Algorithm Validation
Design: Python CLI to test core optimization logic
Build: Implemented G/C-based codon selection
Test: Distributed to team members
Learn: ❌ Poor usability (requires Python installation), results not intuitive
Key insight: A powerful algorithm needs an accessible interface.
Cycle 2: Web Application
Design: Browser-based GUI with 3-step workflow (Input → Optimize → Export)
Build:
- Migrated Python logic to JavaScript/TypeScript
- Created Vue 3 single-page application
- Added side-by-side sequence comparison view
Test: Positive feedback on ease of use and visualization
Learn: ✅ Web delivery removes installation barriers and enables instant feedback
Cycle 3: Advanced Features (Current)
Design: Based on user feedback, added:
- Reverse mutation mode for suppressor screening
- CAI tracking for expression efficiency
- GFP-specific site analysis
- Bilingual support
Build: Implemented all features with full test coverage
Test: In progress with wet-lab validation
Learn: Integrated features increase utility without sacrificing simplicity
Future Development
- Expand organism support: add E. coli, mammalian cell codon tables
- Integrate structural data: incorporate AlphaFold predictions to weight solvent accessibility
- Experimental feedback: implement machine learning to tune parameters from observed mutation distributions