Introduction

SynbioMCP is an open-source framework that connects Large Language Models (LLMs) with specialized synthetic biology tools through the Model Context Protocol (MCP). Built by iGEM Team HKUST-GZ 2025, SynbioMCP transforms natural language instructions into actionable bioinformatics workflows, lowering barriers for researchers with limited computational backgrounds.

Instead of copying long DNA sequences into clunky web tools, or learning scripting just to run simple analyses, SynbioMCP allows teams to simply ask:

"Design qPCR primers for the GFP gene in gfp_gene.fasta."

"Optimize this CDS for E. coli expression and check restriction sites."

"Render this PDB structure."

With SynbioMCP, LLMs become research copilots that handle repetitive computational tasks, freeing scientists to focus on biological insight.

Motivation

Synthetic biology research increasingly relies on computational tools—yet common challenges persist:

  • Dry-lab bottlenecks: Many tasks (sequence analysis, primer design, visualizations) are repetitive and time-consuming.
  • Knowledge gaps: Wet-lab members often struggle with software requiring advanced coding or command-line skills.
  • Workflow friction: Moving files across disconnected tools causes inefficiency and errors.

Our vision is to democratize computational biology. SynbioMCP integrates AI reasoning, file management, and bioinformatics tools into one system—so researchers can access powerful analysis without deep programming expertise.

What is MCP?

The Model Context Protocol (MCP), introduced by Anthropic and since adopted by multiple vendors, is an open standard that lets LLMs call external tools safely and reproducibly. It defines a host–server model:

  • MCP Server: Provides specialized tools (e.g., DNA analysis, codon optimization).
  • MCP Host: Manages interaction with LLMs, files, and results.

By adopting MCP, SynbioMCP ensures interoperability: it works not only with Google Gemini (the model we're currently adapting) but with any MCP-compatible LLM (Claude, GPT, or even local models).
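To make the host–server exchange concrete, the sketch below mimics a single tool call using only the Python standard library. MCP messages are JSON-RPC 2.0, and `tools/call` is the method used to invoke a tool. The `gc_content` tool, its `sequence` argument, and the `handle_request` dispatcher are hypothetical stand-ins, not SynbioMCP's actual code; a real server would rely on an MCP SDK instead of hand-rolled dispatch.

```python
import json

def gc_content(sequence: str) -> float:
    """Return the GC percentage of a DNA sequence (hypothetical example tool)."""
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

# Hypothetical tool table; a real MCP server would build this through its SDK.
TOOLS = {"gc_content": gc_content}

def handle_request(raw: str) -> str:
    """Answer a JSON-RPC 2.0 request roughly the way an MCP tool server would."""
    request = json.loads(raw)
    params = request.get("params", {})
    if request.get("method") != "tools/call":
        # -32601 is the JSON-RPC "method not found" code.
        body = {"error": {"code": -32601, "message": "Method not found"}}
    elif params.get("name") not in TOOLS:
        # -32602 is the JSON-RPC "invalid params" code.
        body = {"error": {"code": -32602, "message": "Unknown tool"}}
    else:
        value = TOOLS[params["name"]](**params.get("arguments", {}))
        body = {"result": {"content": [{"type": "text", "text": f"{value:.1f}"}],
                           "isError": False}}
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), **body})

# The host would send a message like this when the LLM decides to use the tool.
request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "gc_content", "arguments": {"sequence": "ATGCGC"}},
})
response = json.loads(handle_request(request))
print(response["result"]["content"][0]["text"])
```

Because the request names the tool and passes structured arguments, any MCP-compatible model can trigger the same call without tool-specific glue code.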

Features

SynbioMCP features a clean, intuitive interface with support for both light and dark themes:

Light Theme - Clean and focused interface

Dark Theme - Comfortable for extended use

SynbioMCP already supports a rich set of tools:

DNA & RNA Analysis

  • Sequence statistics (length, GC%, codon usage)
  • ORF detection and six-frame translation
  • Codon optimization for multiple organisms
  • Protocol-aware primer design (PCR, qPCR, colony PCR, etc.)
  • Primer Tm analysis with multiple algorithms
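As an illustration of what two of these analyses involve, here is a minimal standard-library sketch of six-frame translation and a rough primer Tm estimate. It assumes the standard genetic code (NCBI translation table 1) and the Wallace rule, which is only one of several Tm formulas and is suited to short oligos; the function names are hypothetical and this is not SynbioMCP's implementation.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq: str) -> str:
    """Translate complete codons; codons not in the table (e.g. with N) become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))

def six_frame_translation(seq: str) -> dict[str, str]:
    """Translate three forward frames (+1..+3) and three reverse frames (-1..-3)."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames

def wallace_tm(primer: str) -> int:
    """Wallace rule: Tm = 2*(A+T) + 4*(G+C), in degrees Celsius."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))

frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
print(wallace_tm("ATGGCCTAAGGATCC"))
```

An ORF finder can be layered on top of this by searching each translated frame for stretches that start at `M` and end at `*`.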

Protein Analysis

  • Molecular weight, isoelectric point, hydropathy plots
  • Amino acid composition
  • AlphaFold structure query (under development)
  • 3D structure visualization
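The first two items can be sketched in a few lines. The example below computes amino acid composition and an average molecular weight by summing standard average residue masses and adding one water for the free termini. It is a simplified illustration with hypothetical function names: it ignores modifications and disulfide bonds and does not reflect how SynbioMCP computes these values.

```python
from collections import Counter

# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # added once for the N-terminal H and C-terminal OH

def validate(protein: str) -> str:
    """Upper-case the sequence and reject empty input or non-standard residues."""
    seq = protein.upper()
    unknown = set(seq) - RESIDUE_MASS.keys()
    if not seq or unknown:
        raise ValueError(f"empty sequence or unsupported residues: {sorted(unknown)}")
    return seq

def composition(protein: str) -> dict[str, float]:
    """Return the percentage of each amino acid present in the sequence."""
    seq = validate(protein)
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in sorted(counts.items())}

def molecular_weight(protein: str) -> float:
    """Return the average molecular weight in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in validate(protein)) + WATER

# Toy peptide for illustration only.
print(composition("GAGA"))
print(round(molecular_weight("GA"), 2))
```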

Synthetic Biology Tools

  • BioBrick Registry search (beta)
  • Pathway modeling & analysis (under development)

Data & File Management

  • Upload FASTA/GenBank/PDB files once; reuse via references
  • Preview sequences & metadata without context overflow
  • Persistent, sandboxed storage for project reproducibility
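The upload-once, reference-later idea can be sketched as follows. `FileStore`, its methods, and the 12-character reference format are all hypothetical choices made for this example, not SynbioMCP's actual file manager. The point is that the LLM receives only a short reference and a truncated preview, never the full sequence.

```python
import hashlib
import tempfile
from pathlib import Path

class FileStore:
    """Hypothetical store: save a file once, then pass around a short reference."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def upload(self, name: str, data: bytes) -> str:
        # A content hash as the reference makes repeated uploads idempotent.
        ref = hashlib.sha256(data).hexdigest()[:12]
        # Path(name).name drops directory parts so files stay inside the root.
        (self.root / f"{ref}_{Path(name).name}").write_bytes(data)
        return ref

    def _path(self, ref: str) -> Path:
        matches = list(self.root.glob(f"{ref}_*"))
        if not matches:
            raise KeyError(ref)
        return matches[0]

    def preview(self, ref: str, max_chars: int = 60) -> dict:
        """Return metadata and only the first max_chars characters of the file."""
        text = self._path(ref).read_text()
        return {"ref": ref, "size": len(text), "head": text[:max_chars],
                "truncated": len(text) > max_chars}

with tempfile.TemporaryDirectory() as tmp:
    store = FileStore(Path(tmp) / "uploads")
    ref = store.upload("gfp_gene.fasta", b">toy\n" + b"ATGC" * 200)
    print(store.preview(ref, max_chars=20))
```

A tool that needs the full sequence would resolve the reference on the server side, so long sequences never have to pass through the model's context window.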

Workflow Demo: Basic Sequence Analysis

Demonstration of basic sequence analysis

Workflow Demo: 3D Visualization

Live demonstration of AI-powered 3D protein structure visualization

Quick Start

Prerequisites

Installation (5 minutes)

```bash
# 1. Clone the repository
git clone https://gitlab.igem.org/2025/software-tools/hkust-gz.git
cd hkust-gz

# 2. Set up Python environment
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure API key (DON'T MISS THIS)
cp .env.example .env
# Edit .env and paste your GEMINI_API_KEY (!IMPORTANT)

# 5. Launch!
python start_full_system.py
```

Open your browser to http://localhost:3000 and start chatting!

Try It Out

We've included example biological data in the examples/ folder:

  • gfp_gene.fasta - Green Fluorescent Protein coding sequence
  • insulin_protein.fasta - Human insulin protein sequence
  • example_peptide.pdb - Small protein structure file

Upload any file in the chat interface and try:

  • "Analyze gfp_gene.fasta and calculate its GC content"
  • "Design qPCR primers for the GFP gene"
  • "What's the molecular weight of insulin?"
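For readers curious what the first prompt amounts to behind the scenes, the sketch below parses FASTA text and reports length and GC content with the standard library. The function names are hypothetical and the record is a made-up toy sequence, not the contents of gfp_gene.fasta; SynbioMCP's own tool may work differently.

```python
from io import StringIO
from typing import Iterator, TextIO

def read_fasta(handle: TextIO) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)

def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in an upper-case sequence."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

# Toy record for illustration only.
demo = StringIO(">toy_record example\nATGAAACCCGGG\nTTTTAA\n")
for name, seq in read_fasta(demo):
    print(name, len(seq), f"{gc_percent(seq):.1f}%")
```

To run it on a real file, replace the `StringIO` object with `open("examples/gfp_gene.fasta")`.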

Implementation

  • Language Model: Google Gemini 2.5. Our roadmap includes fine-tuning and retrieval-augmented generation (RAG) to build a synbio-specialized assistant.
  • Frameworks: Built with FastMCP, an MCP SDK from the open-source community, for extensible development.
  • Visualization: 3Dmol.js integration for 3D molecular graphics.
  • Architecture: Host–Server design for separation of concerns, modularity, and security.

SynbioMCP system architecture: three-layer design with Web Chat UI, Host Server (Gemini + file system), and MCP Tool Server (14+ bioinformatics tools) communicating via the Model Context Protocol

This screen recording showcases our Proof of Concept stage, demonstrating early AI-powered visualization via MCP integration.

User Research — Surveys & Interviews

To validate usability beyond our team, we conducted surveys and interviews with wet-lab researchers and collaborators:

Interview with Dr. Pan - Discussion

Interview with Dr. Pan - Presentation

Photo from one of the offline experience sessions (the participant has consented to his face being shown publicly).

Findings:

  1. File Handling: Users strongly preferred file upload/preview over copy-paste. → We prioritized the file manager in the host UI.
  2. Primer Design Confidence: Users wanted clear protocol presets (e.g., qPCR vs. colony PCR).
  3. Visualization Needs: Users requested one-click, publication-ready images. → We integrated visualization and image export features.
  4. Replayability: Users asked to rerun workflows with modified parameters. → We implemented run-logs and parameter echo.
  5. Future features: Users proposed additional use cases such as image-based bacterial counting from microscope captures and multi-language interface support. Many of these ideas are not yet implemented but have been documented in our roadmap.
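The replayability idea in finding 4 (run-logs and parameter echo) can be illustrated with a small sketch. Everything here is hypothetical: `logged_run`, `rerun`, the in-memory `RUN_LOG`, and the `amplicon_window` stand-in tool are assumptions made for the example, not SynbioMCP's logging code.

```python
import json
import time
from typing import Any, Callable

RUN_LOG: list[dict[str, Any]] = []  # in memory here; a JSONL file would persist it

def logged_run(tool: Callable[..., Any], **params: Any) -> dict[str, Any]:
    """Run a tool, record its parameters and result, and echo both back."""
    entry = {"run_id": len(RUN_LOG) + 1, "tool": tool.__name__,
             "params": params, "timestamp": time.time(), "result": tool(**params)}
    RUN_LOG.append(entry)
    return entry

def rerun(run_id: int, tools: dict[str, Callable[..., Any]], **overrides: Any) -> dict[str, Any]:
    """Replay an earlier run, replacing only the parameters given in overrides."""
    original = next(e for e in RUN_LOG if e["run_id"] == run_id)
    return logged_run(tools[original["tool"]], **{**original["params"], **overrides})

def amplicon_window(length: int, min_size: int = 80, max_size: int = 150) -> dict[str, int]:
    """Hypothetical stand-in tool: clamp an amplicon size range to the template length."""
    return {"min": min(min_size, length), "max": min(max_size, length)}

first = logged_run(amplicon_window, length=720, max_size=150)
second = rerun(first["run_id"], {"amplicon_window": amplicon_window}, max_size=200)
print(json.dumps(second["params"]), json.dumps(second["result"]))
```

Echoing the exact parameters with each result lets a user see what was run and change a single value without restating the whole request.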

This feedback guided our design, ensuring SynbioMCP is not just powerful but genuinely usable by iGEM teams and researchers.

iGEM Applications

SynbioMCP is tailored for iGEM projects:

  • Design Stage: Analyze and optimize parts, search BioBricks, design cloning primers.
  • Lab Stage: Plan PCR protocols, troubleshoot Tm/GC issues, analyze sequencing results.
  • Wiki Stage: Generate visualizations, pathway diagrams, and models for documentation.
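One part-analysis check mentioned in the introduction is scanning a sequence for restriction sites. The sketch below looks for the recognition sites that BioBrick RFC 10 parts are generally expected to avoid. The enzyme list is an assumption for this example, and `find_sites` is a hypothetical helper, not a SynbioMCP function.

```python
import re

# Recognition sequences; the choice of enzymes is an assumption for this sketch.
RFC10_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}

def find_sites(seq: str, sites: dict[str, str] = RFC10_SITES) -> dict[str, list[int]]:
    """Return 1-based start positions of each recognition site found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        # The lookahead finds overlapping matches. These sites are palindromic,
        # so scanning the forward strand also covers the reverse strand.
        positions = [m.start() + 1 for m in re.finditer(f"(?={site})", seq)]
        if positions:
            hits[enzyme] = positions
    return hits

# Toy sequence containing one EcoRI site and one PstI site.
print(find_sites("ATGGAATTCAAACTGCAGTAA"))
```

An empty result means none of the listed sites were found; any hit points to a position that would need a silent mutation before the part is used in assembly.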

Advantages and Novelties

Comparison with Existing Tools

Most existing synthetic biology software provides valuable single-purpose functionality but often requires technical setup or lacks integration. SynbioMCP is distinct because it leverages the Model Context Protocol (MCP) to unify tools under a natural language interface, making it extensible and collaborative.

| Feature | Benchling / SnapGene | Primer3 / Web Tools | 🧬 SynbioMCP |
|---|---|---|---|
| Primer Design | Yes, via GUI | Yes, CLI/web only | Yes: protocol-aware presets (PCR, qPCR, colony PCR) |
| Codon Optimization | Limited (organism set fixed) | No | Yes: multiple host organisms supported |
| Protein Visualization | Manual PyMOL export/import | No | Yes: integrated 3D visualization, auto-load, one-click images |
| Ease of Use | Requires training / license | Requires coding or manual input | Natural language queries, no coding needed |
| Extensibility | Closed system | Tool-specific only | Plug-and-play via MCP: add modules in Python |

Comparison of SynbioMCP with widely used alternatives. SynbioMCP's distinguishing advantage is that it combines natural language, extensibility, and multi-tool integration.

Note: Our goal is not to benchmark against professional single-task tools like Benchling or Primer3. Instead, SynbioMCP envisions an architecture that can combine those professional-grade abilities into one flexible, AI-powered environment.

Comparison with General-Purpose AI Tools

General-purpose AI assistants (like ChatGPT, Claude, or Gemini chat) are powerful for open-ended tasks but are not specifically optimized for laboratory workflows. SynbioMCP builds on these models and introduces a domain-specific architecture that makes them practically useful for synthetic biology.

| Feature | General AI (ChatGPT / Claude / Gemini) | 🧬 SynbioMCP |
|---|---|---|
| Handling Biological Files | Limited: sequences must be pasted or encoded as text; some assistants can handle files in a virtual environment, but the LLM has to do everything from scratch | Built-in file manager with persistent FASTA/GenBank/PDB storage and preview |
| Tool Integration | Indirect (requires plugins or manual API calls) | Direct MCP integration with 14+ bioinformatics tools |
| Reproducibility | Conversations may be ephemeral; workflows not logged | Run-logs and parameter echo ensure reproducibility |
| Visualization | Static diagrams or ASCII art at best | Integrated visualization for high-quality 3D structures |
| User Flow | General-purpose prompts, no lab presets | Protocol-aware presets (qPCR, colony PCR, codon optimization for hosts) |

Comparison of SynbioMCP with general AI assistants. SynbioMCP does not replace the underlying LLM but tailors its use to real lab workflows through domain-specific architecture.

Note: We are not claiming to outperform the raw reasoning ability of general-purpose models (e.g., Gemini, Claude, GPT). Instead, SynbioMCP provides a specialized user flow and architecture that makes these models practically useful for synthetic biology tasks. Our roadmap includes building a synbio-specialized LLM through fine-tuning and RAG enhancement — extending this tailored experience even further.

Future Directions

  • SynBio-specialized LLM for synthetic biology expertise.
  • Expanded tools: CRISPR design, metabolic pathway simulation, RNA folding.
  • Collaboration features: Shared workspaces for multiple team members.
  • Enhanced UI: Drag-and-drop workflows, richer visualization canvases.

Open Source & Reuse

  • License: Released under Apache-2.0, an OSI-approved open-source license for software.
  • Repository: See our GitLab repository at https://gitlab.igem.org/2025/software-tools/hkust-gz
  • Extensibility: Any team can add a new analysis tool by writing one Python module and registering it according to the guidelines in the README file.
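The general shape of "write one module and register it" can be sketched with a plain decorator. This is a standard-library illustration of the pattern only: `register_tool`, `REGISTRY`, and `base_counts` are hypothetical names, and the project's README remains the authority on the actual registration steps. FastMCP offers its own decorator-based registration, which a real module would use instead.

```python
import inspect
from collections import Counter
from typing import Any, Callable

REGISTRY: dict[str, dict[str, Any]] = {}

def register_tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record a function with a description built from its docstring and signature."""
    signature = inspect.signature(func)
    REGISTRY[func.__name__] = {
        "callable": func,
        "description": inspect.getdoc(func) or "",
        # Store each parameter's annotation by name so a host can describe the tool.
        "parameters": {name: getattr(p.annotation, "__name__", str(p.annotation))
                       for name, p in signature.parameters.items()},
    }
    return func

@register_tool
def base_counts(sequence: str) -> dict:
    """Count each base in a DNA sequence."""
    return dict(Counter(sequence.upper()))

tool = REGISTRY["base_counts"]
print(tool["description"], tool["parameters"])
print(tool["callable"]("aacg"))
```

The type hints and docstring double as the tool's machine-readable description, which is why a single well-annotated function is enough for an LLM to discover and call it.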

We encourage future iGEM teams to extend SynbioMCP with their own tools, protocols, or databases.

Summary

SynbioMCP demonstrates how AI agents can transform synthetic biology research:

  • Lowers the barrier for dry-lab tasks.
  • Bridges gaps between wet-lab and dry-lab members.
  • Ensures reproducibility with file-based, tool-driven workflows.
  • Provides a modular, open-source platform other iGEM teams can build upon.

By combining AI reasoning with validated bioinformatics tools, SynbioMCP reimagines how synthetic biology research can be conducted—efficiently, collaboratively, and accessibly.