Engineering banner

Modeling & Analysis

"Model-driven design for TRACER: from molecules to populations."

0:Overview

Under iGEM's judging criteria, a strong model is not merely a mathematical or computational showcase; more importantly, it must explain biological processes, guide design optimization, and reduce trial-and-error experimentation. Accordingly, we built a multi-scale, multi-tier modeling framework that spans the entire drug-design and validation pipeline—from molecules to populations.
This ensures that our proposed TRACER anticancer drug delivery system is rational, feasible, and predictive at every level.

Statistical level modeling: Use clinical datasets and transcriptomic analyses to pinpoint pain points in drug development and to identify molecular targets that are highly expressed in cancers.

Molecular level modeling: Apply structure prediction and molecular dynamics to verify that our designs can be specifically activated and to genuinely enhance transmembrane propensity.

Cellular level modeling: Use ordinary differential equation (ODE) kinetics to forecast production,and we have also planned a detailed Physiologically-Based Pharmacokinetic (PBPK) simulation to predict the drug's clinical performance. Although the critical MTT assay has been postponed until after the WikiFreeze due to time constraints, our modeling protocol is established, provide in-silico surrogates for clinical trials the wet lab cannot complete within time constraints.

Population level modeling: Employ agent-based simulations in NetLogo to explore how engineered bacterial population behaviors under realistic culture conditions affect production outcomes and to elucidate the timing dynamics of population lysis, yielding actionable guidance for wet-lab experiments.

This architecture enables end-to-end coverage—design → validation → optimization → prediction—and surfaces dynamic processes that are hard to capture experimentally in one pass, thereby saving time and cost for the wet lab.

Figure 2.4 left

Objective: Answer "why this design is warranted."

Methods: Curate approved/clinical anticancer peptide drugs and integrate transcriptomic data to identify significantly dysregulated targets in colorectal cancer.

Conclusion: High expression of matrix metalloproteinase-9 (MMP9) is a usable, target-specific activation cue → directly informed the design of TRACER.

Objective: Determine whether the design holds at atomic resolution.

Methods: Use AlphaFold and GROMACS to simulate two states (shielded/activated) of TRACER-IL24, and introduce the Transmembrane Readiness Framework (TRF) for assessing transmembrane propensity.

Conclusion: Simulations show that cleavage by MMP9 exposes the positively charged/hydrophobic face of BRII, enhances membrane-translocation propensity, and maintains IL-24 stability → the structural design is validated.

Objective: Assess whether the engineered bacterial circuit operates stably at single-cell and population scales.

Methods: Build an ODE model to predict time courses of AHL signaling, lytic proteins, and cell counts.

Conclusion: Depending on strain parameters, two regimes emerge—sustained oscillations vs. damped convergence—yielding practical "sampling cadence and interpretation thresholds" that plug directly into wet-lab workflows.

Objective: Evaluate whether engineered microbial population behavior affects efficacy under realistic manufacturing conditions.

Methods: Use an agent-based model (ABM) in NetLogo, varying initial inoculum to track population growth, signal diffusion, and lysis periodicity.

Conclusion: Higher inoculum leads to earlier lysis with more regular cycles → provides a theoretical basis for tuning cultivation duration to achieve optimal yield.


Significance of the modeling effort

Deep integration with experiments: Each modeling layer maps directly to key wet-lab questions, making the model not a showcase but a prerequisite for experimental design.

Methodological innovation: We introduce the Transmembrane Readiness Framework (TRF) and unify multi-scale modeling (molecular--cellular--tissue--population) into a coherent architecture.

Reproducibility and extensibility: All simulations come with parameter tables, a code repository, and a README to enable future replication,all our engineering documents have been uploaded to the iGEM officail website server,click here to download, the TRF assessment paradigm generalizes to other peptide-drug designs beyond TRACER.

Ultimately, our modeling did more than substantiate the rationale for TRACER; it also delivered actionable operating standards for the wet lab. In parallel, it aligns with the ethos of the iGEM community—elucidating biological phenomena, forecasting experimental outcomes, steering design optimization, and contributing a transferable methodology for others.

1. Statistics on Existing Anticancer Drug Categories

TL;DR: Anchored in the industry context, we reviewed current classes of anticancer agents to distill clinical improvement needs, then tested—via structural modeling and molecular simulation—whether TRACER meets these criteria:

  • Substantially improve target specificity while reducing toxicity.
  • Facilitate peptide drugs reaching their targets and increase bioavailability.
  • Tailor to colorectal cancer and enable noninvasive delivery.

We compiled anticancer peptide drugs that are approved or in clinical development from public databases.

Data sources:

  1. ClinicalTrials.gov: The world's largest registry of clinical trials, covering the vast majority of publicly developed drug studies.
  2. THPdb: A curated database of peptide therapeutics that are approved or in clinical phases.

Procedure:

We retrieved records using the keywords "anticancer" AND "peptide drug," removed duplicates, and organized the resulting entries by drug class.

Figure 2.4 left

Figure 1.1. Proportions of mainstream classes of peptide-based anticancer therapeutics

From the figure, antibody modalities account for the largest share (red, blue, and purple segments), indicating that industry practice predominantly relies on antibodies as the principal therapeutic form. Although these peptide-related agents offer high target specificity, they typically entail very high R&D costs, narrow applicability, and limited tissue penetration. For example, trastuzumab is a monoclonal antibody that binds HER2 and suppresses tumor growth via pathway blockade; in colorectal cancer (CRC), only ~5% of patients are eligible. Other peptide-related modalities—such as colony-stimulating factors and antibody–drug conjugates (ADCs)—carry notable adverse-event profiles and potential misuse risks. Consequently, a delivery technology that helps peptide drugs reach targets efficiently while maintaining low toxicity during transit is crucial for unlocking their anticancer potential.

In response, our project aims to develop a delivery vehicle with high specificity, low toxicity, and high bioavailability to guide existing peptide drugs to intracellular targets.

  1. markedly improve target specificity while reducing toxicity.
  2. help peptide drugs reach their targets and increase bioavailability.
  3. tailor to colorectal cancer and enable noninvasive delivery.

2. Target Identification and Bioinformatics Analysis

TL;DR: We systematically analyzed transcriptomes from colorectal cancer and normal tissues, identifying significant up-regulation of the matrix metalloproteinase (MMP) family and related pathways, with MMP9 showing more favorable expression characteristics than MMP2. These findings directly informed TRACER's design: we use high MMP9 expression as the activation trigger so that TRACER peptides are activated only within the tumor microenvironment, reducing toxicity and enhancing selectivity.

Beyond statistical summaries from clinical drug databases, we leveraged GEO transcriptomic datasets to perform differential expression and pathway enrichment between colorectal cancer and normal tissues. Results showed:

  1. Significant up-regulation of MMP genes (MMP1, MMP3, MMP7, MMP9; p < 0.0001), implicating key roles in ECM remodeling and invasion;
  2. Prominent enrichment of ECM-receptor interaction, cell adhesion molecules (CAMs), and focal adhesion pathways—highly concordant with MMP function.

Grounded in the literature, we executed the following workflow:

  1. Data acquisition and cohort stratification: We obtained transcriptomes for colorectal cancer and normal intestinal tissues (training set GSE10950). Clustering analyses confirmed clear separation of tumor vs. normal expression profiles, providing phenotypic support for downstream analyses.
  2. Differentially expressed genes (DEGs) and necroptosis-related DEGs (NRDEGs): Using statistical criteria, we identified DEGs between tumor and normal tissues, with specific focus on MMP2 and MMP9. Volcano and violin plots provided complementary views to highlight markedly overexpressed proto-oncogenic candidates.
  3. Gene set enrichment analysis (GSEA): Moving beyond single genes, we profiled pathway-level enrichments for ECM-receptor interaction, CAMs, and focal adhesion in tumor tissues. We further evaluated upstream regulation—e.g., TGF-β and VEGF—on MMP expression, assembling a pro-oncogenic signaling network.
  4. Protein-protein interaction (PPI) network and core-module mining: We built a PPI network from DEGs to pinpoint modules coordinating with MMPs upstream and downstream, revealing putative functional partners and core gene clusters implicated in tumorigenesis.
  5. Validation cohort analysis: We validated core DEGs and pathway signatures in an independent dataset ("Transcriptomic and Cellular Content Analysis of Colorectal Cancer by Combining Multiple Independent Cohorts"). Diagnostic performance of shortlisted genes was assessed to confirm robustness and generalizability.

Objective:

The GEO repository is the world's largest public gene-expression database. Leveraging published transcriptomes avoids redundant experiments, reduces costs, and ensures reproducibility. We selected GSE10950 as the primary training cohort (owing to its larger sample size) and used the supplementary data from "Transcriptomic and Cellular Content Analysis of Colorectal Cancer by Combining Multiple Independent Cohorts" as an independent validation cohort to test robustness and mitigate single-dataset bias.

Procedure:

We retrieved GSE10950 (tumor vs. normal colorectal tissue expression profiles) from GEO as the training set and downloaded the supplementary files of the study above as the validation set. Data preprocessing and curation were conducted with UltraEdit and Excel.

Objective:

  • Identify DEGs that differ significantly between colorectal cancer and normal tissues; such genes can serve as molecular markers of disease phenotypes and may drive tumorigenesis.
  • Focus on the matrix metalloproteinase (MMP) family to narrow from genome-wide signals to genes directly tied to our core biology (ECM remodeling, aberrant adhesion), thereby filtering noise and sharpening the target space.
  • Utilize visualization tools in Sangerbox (http://sangerbox.com/)—heatmaps, volcano plots, violin plots—to depict DEG counts and expression patterns, laying the groundwork for downstream GSEA and PPI network modeling.

Procedure:

We uploaded tumor and normal CRC datasets to Sangerbox and applied default criteria (FDR/adjusted p < 0.05 with |log2FC| thresholds) to call significant DEGs and annotate NRDEGs (necroptosis-related DEGs). Particular attention was given to MMP2 and MMP9 and other MMP family members. Using volcano plots, heatmaps, violin plots, and ROC curves, we characterized their tumor-biased overexpression and nominated putative pro-oncogenic candidates.

Figure 2.4 left

Figure 2.1. Differential expression heatmap

Caption: Blue = Tumor; Red = Normal.

Sample grouping Left/top annotation bars denote sample classes (Tumor vs. Normal) Hierarchical clustering shows clear separation
Gene expression The heatmap color scale encodes expression levels: blue = low, red = high A clear "tumor vs. normal" differential pattern is observed

Table 2.1. Notes on the differential-expression heatmap

Sample stratification is clear, with a marked tumor--normal separation. In the dendrogram, tumor (blue bar) and normal (red bar) samples are fully segregated, indicating a substantial divergence in their gene-expression programs.

Biological implication: Colorectal cancer exhibits expression signatures that are distinctly different from normal intestinal tissue, providing phenotypic support for downstream differential analyses.

The figure also reveals "tumor high-/low-expression gene clusters." The heatmap shows four expression modules (upper-left red, upper-right blue, lower-left blue, lower-right red): the upper-left represents a tumor high-expression cluster (red blocks enriched), whose genes may be pro-oncogenic (e.g., the matrix metalloproteinase (MMP) family and cell-proliferation-related genes). The lower-left represents a tumor low-expression cluster (blue blocks enriched) → these genes may be tumor suppressors or normal tissue-specific genes.

Figure 2.4 left

Figure 2.2. Volcano plot of differential expression

Axis/Feature Description Observation
X-axis
(−log10 (p-value))
Statistical significance of differential expression: larger values (further right) indicate stronger significance (smaller p-values) Range 0–26, with most genes at p < 0.05
Y-axis
(log2 fold change)
Magnitude of expression change: larger values (higher up) denote greater tumor-vs-normal differences Range −8 to 8, with red (upregulated) points clustering in the upper region
Color
(direction of change)
Red = tumor high expression (up-regulated); green = tumor low expression (down-regulated). Red points outnumber green ones Many matrix metalloproteinase (MMP) family genes appear in red

Table 2.2. Notes on the volcano plot

Key findings:

  • In the volcano plot, MMP-9 appears as a significant red point, indicating marked up-regulation in tumors with strong statistical support (high −log10(p-value)).
  • By contrast, MMP-2 appears in black, not passing the preset significance thresholds (e.g., combined FDR/adjusted p and |log2FC| criteria), indicating that MMP-2 does not reach statistical significance in this training cohort.
Figure 2.4 left

Figure 2.3. Violin plots for MMP9 vs. MMP2 enrichment/expression

Figure 2.4 left Figure 2.4 right

Figure 2.4. Box plots for MMP9 vs. MMP2 enrichment/expression (left: MMP9; right: MMP2)

Plot Type Feature Description Observation
Violin shape Distribution density of gene expression: the wider the violin, the greater the sample density at that expression level Here, the tumor group (blue) shows generally wider violins, indicating a larger fraction of samples at higher expression ranges and greater heterogeneity in tumors
Box plot Statistical features of expression: the box denotes the interquartile range (IQR, 25%–75%), the horizontal line inside marks the median, and whiskers span 1.5×IQR (points beyond are outliers) In this figure, tumor box plots sit higher than normal

Table 2.3. Notes on violin and box plots

Figure 2.5 left Figure 2.5 right

Figure 2.5. Gene enrichment analysis plot for matrix metalloproteinase 9 (MMP9) vs matrix metalloproteinase 2 (MMP2) (left: MMP9; right: MMP2)

The AUC of matrix metalloproteinase 9 (MMP9) is around 0.82, indicating potential as a CRC molecular biomarker and as the targeting moiety for the TRACER-sensitive peptide.

Objective:

Unlike single-gene differential analysis, GSEA evaluates predefined gene sets (e.g., pathways, biological processes) to assess overall expression trends in tumor vs. normal groups and determine whether pathways are activated or suppressed.

The core goal here is to verify whether pathways associated with high expression of matrix metalloproteinases (MMPs)—such as ECM remodeling and aberrant cell adhesion—are significantly enriched in colorectal cancer. Results show these pathways are markedly activated in tumor samples, supporting the hypothesis that the matrix metalloproteinase (MMP) family participates in ECM remodeling, adhesion abnormalities, and their upstream regulatory signals.

We also observed enrichment of regulatory pathways such as TGF-β and VEGF, providing clues for subsequent mechanistic analyses (e.g., links between ECM remodeling and signaling pathways).

Procedure:

Using the Sangerbox web platform, we performed GSEA on the uploaded differentially expressed genes. We selected the c2 (curated pathway) collection from MSigDB, ran the analysis, and obtained enriched pathways and biological processes. We focused on core MMP-related pathways—ECM-receptor interaction, cell adhesion molecules (CAMs), and focal adhesion—and concurrently evaluated enrichment of upstream signaling pathways such as TGF-β and VEGF.

Figure 2.6

Figure 2.6. KEGG pathway enrichment bubble chart

Caption: Pathway names—e.g., Human T-cell leukemia virus 1 infection, Axon guidance—are on the y-axis (enriched KEGG pathways).

  • Metrics: X-axis GeneRatio = the proportion of differentially expressed genes (DEGs) in a pathway relative to all DEGs (larger values indicate a stronger association between the pathway and the DEGs).
  • Bubble size (Count): Number of DEGs contained in the pathway (larger bubbles indicate broader pathway impact).
  • Color (−log10(p-value)): Statistical significance of pathway enrichment (darker colors indicate smaller p-values and more significant enrichment).

Below is the analysis of specific KEGG pathways:

ECM Receptor Interaction Cell Adhesion Molecules Focal Adhesion VEGF Signaling Pathways in Cancer Wnt Signaling TGF-beta Signaling

Figure 2.7. Detailed signaling pathway analysis

Pathway name ES (enrichment score) NP (nominal P value) Interpretation Mechanistic link
ECM RECEPTOR INTERACTION ES = 0.4557 NP = 0.0978 Pathway enriched in tumors (positive ES, right-shifted curve), near significance Matrix metalloproteinase (MMP) activity degrades ECM components (e.g., collagen, fibronectin); as this pathway underpins cell--ECM adhesion, MMP function directly impacts its activity
CELL ADHESION MOLECULES ES = 0.4368 NP = 0.0432 Pathway significantly enriched in tumors (positive ES, p < 0.05) Focal adhesions are key for cell--ECM linkage; ECM degradation by matrix metalloproteinases (MMPs) disrupts focal adhesions, and pathway activation can in turn regulate MMP expression
FOCAL ADHESION ES = 0.4168 NP = 0.0039 Pathway significantly enriched in tumors (positive ES, p < 0.01) Cell adhesion molecules (CAMs; e.g., integrins) act via the ECM; ECM degradation by matrix metalloproteinases (MMPs) alters CAM function, with the pathway and MMPs co-regulating cell migration
VEGF SIGNALING PATHWAY ES = 0.2971 NP = 0.1381 Pathway enriched in tumors (positive ES, right-shifted curve), near significance The VEGF pathway promotes angiogenesis; matrix metalloproteinases (MMPs) degrade vascular basement membrane (facilitating neovascularization), and VEGF can upregulate MMP gene expression
PATHWAYS IN CANCER ES = −0.2050 NP = 0.3496 Pathway enriched in normals (negative ES, left-shifted curve), not significant "Pathways in cancer" provides an overview; high matrix metalloproteinase (MMP) expression is one hallmark of pathway activation, contributing to ECM degradation and angiogenesis
WNT SIGNALING PATHWAY ES = 0.2393 NP = 0.2373 Pathway enriched in tumors (positive ES, right-shifted curve), not significant The TGF-β pathway regulates the balance of ECM synthesis and degradation; matrix metalloproteinases (MMPs) act as downstream effectors (e.g., TGF-β upregulating MMP2)
TGF-β SIGNALING PATHWAY ES = 0.3194 NP = 0.1357 Pathway enriched on the tumor side but not strictly significant (p ≈ 0.1; suggests association with tumor activation, pending validation) Activation of the Wnt pathway promotes tumor cell proliferation and migration; matrix metalloproteinases (MMPs) degrade the ECM to aid invasion, and the pathway can directly regulate MMP gene expression (e.g., Wnt upregulating MMP7)

Table 2.4. Notes on specific signaling pathways

Conclusion:

  • ECM-receptor interaction, cell adhesion molecules (CAMs), and focal adhesion pathways are significantly enriched in tumors, representing the biological context of MMP functions.
  • TGF-β and VEGF pathways show trend-level enrichment on the tumor side and can directly upregulate MMP gene expression (e.g., TGF-β promoting transcription of MMP2 and MMP9 via SMAD signaling).

Value: Extending from "single genes" to "pathway networks" reveals the upstream regulatory routes underlying high expression of MMPs, corroborating that MMP levels are markedly elevated in colorectal cancer tissues compared with normal tissues.

Objective:

Proteins form functional networks through their interactions, and a PPI network can reveal cooperative relationships among differentially expressed genes. The network constructed in this study shows that matrix metalloproteinase 9 (MMP9), MMP1, MMP3, MMP7, MMP10, MMP2 and other matrix metalloproteinase (MMP) family genes are highly interconnected with collagen family members (e.g., COL1A1, COL4A1, COL6A3) and integrins (the ITGA series), suggesting cooperative roles in ECM remodeling and aberrant cell adhesion.

By screening hub genes (e.g., matrix metalloproteinase 9 (MMP9), COL1A1, ITGA5), key nodes in the network can be pinpointed; dysfunction of these genes may be decisive for stromal remodeling, tumor invasion, and metastasis in colorectal cancer, providing a prioritized list for subsequent experimental validation and target studies.

Module analysis (e.g., modules enriched with multiple collagens and integrins) helps focus on the core biological unit of ECM structural remodeling and extracellular matrix signaling, implying that these genes may act in concert to regulate high matrix metalloproteinase (MMP) expression and ECM abnormalities, thereby guiding deeper mechanistic investigation.

Procedure:

Construct a PPI network of differentially expressed genes—focusing on the matrix metalloproteinase (MMP) family and their upstream/downstream ECM-related genes—using the STRING database.

Figure 2.4 left

Figure 2.8. PPI analysis graph

Legend: Green = gene neighborhood/co-occurrence; Red = gene fusion; Blue = co-expression; Purple = experimental evidence; Light blue = curated database associations; Yellow = text mining; Black = co-expression or mutual regulation; Pink = protein homology.

Nodes with more edge colors and denser connections indicate proteins playing more central roles in the network.

The PPI network shows that the matrix metalloproteinase (MMP) family (e.g., MMP9, MMP1, MMP3) forms a highly interactive network with collagen family members (COL1A1, COL6A3, COL4A1) and integrins (ITGA series), suggesting cooperative regulation in extracellular matrix (ECM) remodeling, cell adhesion, and tumor invasion/metastasis.

Among them, MMP9, COL1A1, and ITGA5 may serve as potential hub genes, offering candidate directions for further mechanistic studies and targeted interventions. However, this is not the main focus of our project, so we did not pursue it further.

Bioinformatics findings may be influenced by dataset-specific factors; consistency in an independent validation cohort can markedly improve the reliability of conclusions, demonstrating that the matrix metalloproteinase 9 (MMP9) gene is significantly differentially expressed in colorectal cancer, rather than being an artifact of a single dataset.

We used the independent validation dataset "Transcriptomic and Cellular Content Analysis of Colorectal Cancer by Combining Multiple Independent Cohorts" to externally validate the core differentially expressed genes and pathways identified in the training set.

In this validation cohort, a Venn diagram was used to intersect differentially expressed genes (DEGs) from the training set with those from the validation set; matrix metalloproteinase (MMP) family genes—specifically MMP9—consistently appeared in the intersection, suggesting concordant expression patterns across cohorts and potential biological significance. A volcano plot displayed the overall distribution of gene expression between tumor and normal tissues, visually showing the numbers and significance of up- and down-regulated genes, thereby aiding assessment of genome-wide differential trends within the validation set.

This "discovery in the training set—replication in the validation set" strategy, via multidimensional visualization with Venn and volcano plots, not only intuitively presents the expression and significance of core genes in the validation cohort, but also objectively evaluates their diagnostic potential and clinical applicability, further strengthening the robustness and generalizability of the conclusions.

Figure 2.4 left

Figure 2.9. Volcano plot of the validation cohort

Figure 2.4 left

Figure 2.10. Venn diagram of differentially expressed genes in the training vs. validation cohorts

Caption: The matrix metalloproteinase 9 (MMP9) gene appears in the intersection of the two sets.

These results confirm that MMP9 is a core gene markedly activated in colorectal cancer and reveal its associated pathways and immune microenvironment features. We will translate these findings into modeling parameters: using MMP9 expression level as the activation probability and the ECM degradation rate as an input to the tissue model, thereby making subsequent structural modeling and pharmacokinetic simulations more biologically realistic.

3. Structure Modeling of TRACER: Shielded vs Activated

TL;DR: To validate the rationality of the TRACER-IL24 design at the atomic level, we used AlphaFold and molecular dynamics to model its two states before and after cleavage by MMP9---activated vs. shielded. The results show: (1) in the shielded state, the negatively charged peptide segment effectively masks the transmembrane functional face of BRII; (2) in the activated state (post-cleavage), BRII's functional region becomes exposed, markedly enhancing its transmembrane propensity; and (3) throughout, the therapeutic protein IL-24 remains structurally stable with function unaffected. These findings structurally demonstrate that our "proteolytic activation" switch is feasible and efficient.

Through differential transcriptomic analysis and pathway enrichment, we confirmed that MMP9 is highly expressed in colorectal cancer and is closely associated with ECM remodeling, cell adhesion, and pro-inflammatory immune responses. This indicates a molecular milieu in tumors distinct from normal tissues:

High MMP expression → provides a potential activation signal;

A pro-inflammatory immune microenvironment → further broadens the therapeutic intervention window.

Accordingly, based on preliminary literature review by the biology team, we devised TRACER---a multimodule fusion architecture tailored to the tumor environment---comprising a custom negatively charged shielding segment, a protease-sensitive peptide, and BRII. We further fused IL-24 to TRACER to create a bona fide peptide therapeutic. IL-24 is a known anticancer factor that triggers endoplasmic reticulum stress within cancer cells and induces tumor cell apoptosis.

Figure 2.4 left

Figure 3.1. TRACER genetic circuit schematic

Next, we will address three key questions via structural modeling to validate the structural soundness of TRACER-IL24 and to predict its functions:

(1) In the inactive state, is the TRACER-IL24 structure reasonable (can the negatively charged shielding peptide effectively reduce the positive charge on the BRII surface)?

(2) After cleavage of the MMP-sensitive peptide (activated state), is BRII sufficiently exposed and competent for delivery? Do BRII's cationic and hydrophobic residues reappear at the surface?

(3) In the fused construct, is IL-24 stable and epitope-exposed? Is its fold preserved? Are the key receptor-binding regions unobstructed by BRII or the negatively charged segment?

We constructed two TRACER-IL24 structural systems to test the design goal that, upon removing the shielding segment, BRII becomes fully exposed and adopts a delivery-competent conformation:

System A (shielded/inactive): Total_TRACER-IL24

Sequence: EEEGEEGEG-PLG↓LAG- RAGLQFPVGRLLRRLLR - GSGSGS (flexible linker) - IL-24 (N-terminal signal peptide removed)

Meaning: No cleavage by MMP; in theory, the negatively charged segment masks BRII's cationic/hydrophobic face.

System B (activated): Cut_TRACER-IL24

Sequence: LAG- RAGLQFPVGRLLRRLLR - GSGSGS (flexible linker) - IL-24 (N-terminal signal peptide removed)

Meaning: Mimics post-MMP cleavage; the negative segment + sensitive peptide are removed, exposing the BRII core and, in theory, yielding a membrane-translocating/uptake-competent conformation.

The structures predicted by the mature open-source online platform alphafoldserver.com using the AlphaFold3 model are:

ECM Receptor Interaction Cell Adhesion Molecules

Figure 3.2. Structures of Total_TRACER-IL24 (left) vs Cut_TRACER-IL24 (right)

Caption: Two conformations predicted by AlphaFold. Left: uncleaved full-length TRACER-IL24 (negative segment + sensitive peptide + BRII + IL-24). Right: post-proteolysis activated state (BRII + IL-24). Blue indicates regions of high predicted confidence; orange denotes lower-confidence flexible segments.

In both models, the core fold of IL-24 remains stable with high confidence;

The negatively charged segment and the sensitive-peptide region appear as low-confidence, coil-like structures, suggesting intrinsic disorder consistent with the design expectation of being cleavable/removable;

Conformationally, the BRII region in state B is more open with greater surface exposure, making membrane interaction more likely.

Conclusion: Both states exhibit generally good predictive confidence, providing a foundation for subsequent comparisons.

ECM Receptor Interaction Cell Adhesion Molecules

Figure 3.3. Total_TRACER-IL24 (left) vs Cut_TRACER-IL24 (right) structure

Caption: Per-residue error plots predicted by AlphaFold. Greener colors indicate lower error and higher confidence.

The IL-24 region shows uniformly low error, indicating a highly reliable fold.

The negative segment/sensitive-peptide region has higher error, consistent with expected flexibility and disorder.

The IL-24 portion is highly consistent between states, indicating that removing the shielding peptide does not compromise IL-24 structural reliability.

Conclusion: AlphaFold predictions support our hypotheses for both systems: IL-24 remains stably folded, and BRII's conformation changes markedly upon shielding-peptide removal.

After obtaining the initial AlphaFold models, we used UCSF ChimeraX with ModRefiner + MolProbity for energy optimization and geometric assessment of TRACER-IL24 to determine whether further refinement was needed.

Figure 2.4 left

Figure 3.4. MolProbity quality assessment of the TRACER-IL24 model

Caption: Clashscore = 0 (no serious atomic clashes, 100th percentile); Ramachandran favored = 100%; rotamer favored = 100%; MolProbity score = 0.50 (100th percentile). The only notable deviation is the Rama distribution Z-score (−4.68 ± 1.61), arising from the N-terminal negatively charged segment and BRII; this is not a folding error, but indicates intrinsic disorder/flexibility in these regions.

The AlphaFold3 structural modeling shows excellent overall geometry. The core IL-24 region is geometrically sound with no atomic clashes. Peptide flexibility also matches the design.

Conclusion: The system's energetic and geometric metrics are already optimal and highly reliable, suitable as a starting point for subsequent analyses; additional rounds of refinement are unnecessary.

Surface charge

ECM Receptor Interaction Cell Adhesion Molecules

Figure 3.5. Surface electrostatics comparison (Total vs Cut)

Caption: Electrostatic potential distribution of the BRII region in the Total and Cut conformations. Blue indicates positive charge; red indicates negative charge. In the Total state, BRII's positively charged surface is masked by the negatively charged segment, whereas in the Cut state continuous positive patches appear.

Analysis: After removing the shield in the Cut state, BRII's cationic face is largely unchanged from pre-cleavage and remains essentially stable.

Figure 2.4 left Figure 2.4 left

Figure 3.6. SASA comparison of positively charged residues in TRACER (Total vs Cut)

Caption: Per-residue SASA in BRII, with a bar chart comparing the Total and Cut states (blue = Total; orange = Cut). The lower plot shows ΔSASA (Cut —— Total).

Observation: Residue SASA increases markedly in the Cut state (ΔSASA > 0), indicating that the shielding segment masks these cationic residues in the Total state, whereas the Cut state exposes them, favoring membrane interaction.

Figure 2.4 left Figure 2.4 left

Figure 3.7. SASA comparison of hydrophobic residues in TRACER (Total vs Cut)

Caption: Solvent-accessible surface area (SASA) of hydrophobic residues (Leu/Val/Phe) in the BRII region, comparing the Total and Cut states.

Analysis: In the Cut state, hydrophobic surface area increases, revealing more potential membrane-inserting surfaces---consistent with the activated-state transmembrane delivery profile.

Figure 2.4 left Figure 2.4 left

Figure 3.8. IL-24 epitope exposure comparison (Total vs Cut)

Caption: Comparison of SASA for IL-24 epitope residues; bar charts show exposure of key residues in the Total vs. Cut states, with the right panel displaying ΔSASA (Cut -- Total).

Analysis: The active sites of IL-24 are epitopes at res68--88 and 138--148. Exposure varies in both states without a consistent trend, suggesting that the TRACER fusion architecture may influence IL-24's functional domain. Therefore, we urge that before final production, wet-lab experiments be conducted to reassess whether IL-24 can exert its pharmacological activity.

Figure 2.4 left

Figure 3.9. Short-timescale MD simulations (RMSD and RMSF)

Caption: Left: 0--1 ns RMSD curves; right: per-residue RMSF. Blue = Total; orange = Cut.

Analysis: In the Total state, RMSD fluctuates around 0.8--1.0 nm, whereas the Cut state stays below 0.1 nm. With the flexible segment removed, the Cut state shows better RMSD convergence and lower fluctuations; RMSF indicates pronounced motion in the Total state but overall stability in the Cut state, suggesting the activated structure is dynamically more reasonable and functionally competent.

Metric Definition Expected change (B vs A) Role
Continuous cationic surface Area/connectivity of positive surface patches on TRACER Favors adsorption to negatively charged membrane headgroups
SASA of cationic residues Lys/Arg side-chain SASA in BRII (ΔSASA>0) Indicates previously shielded positives are exposed
Hydrophobic exposure area Surface area of BRII Leu/Val/Phe Promotes hydrophobic partitioning/insertion
IL-24 epitope SASA SASA at receptor-binding key sites No decrease Fusion does not occlude function
Structural geometric quality MolProbity / Ramachandran Pass/improve Usable for comparisons after refinement

Table 3.1. Comparative metrics

The metrics above align with the expected trends, leading to the following final conclusions:

(1) The Total-state conformation is reasonable.

(2) Relative to the Total state, the Cut state shows significant increases in cationic-surface exposure, cationic-residue SASA, and hydrophobic surface area, demonstrating that removal of the shielding peptide structurally activates the cell-penetrating peptide's delivery function.

(3) IL-24's structure and functional epitopes remain stable, indicating that fusion to TRACER does not impair IL-24 activity.

Having established the structural soundness of TRACER-IL24, we next aim to analyze its cleavage process & the interactions of the post-cleavage Cut state with cancer cell membranes---the goals of the following two chapters.

4. Docking of MMP9 and TRACER-IL24 and molecular dynamics validation

TL;DR: Bioinformatics indicates that MMP9 is highly expressed in CRC. We therefore tested whether TRACER's PLGLAG sensitive peptide can be cleaved by MMP9: first, molecular docking to obtain a correct pose in the active pocket; then GROMACS MD on the complex to assess whether the catalytic geometry remains accessible and interfacial interactions are stable. Results: the catalytic distance stays within a cleavable range and the complex is stable, supporting a mechanistic loop of "MMP9-mediated shield removal → BRII activation."

If TRACER's PLGLAG is cleaved by MMP9, the negatively charged shielding segment is removed, BRII becomes exposed, and a delivery-competent conformation is obtained. To simulate and analyze this "cleavage" process, we performed joint validation using molecular docking and short-timescale MD (GROMACS).

We first performed molecular docking:

Receptor: human MMP9 crystal structure (PDB: 5I12; waters and hetero ligands removed, hydrogens added, Zn²⁺ and key coordinating residues retained).

Ligand: TRACER-IL24 (containing the PLGLAG sensitive peptide with a flexible linker).

Grid: centered on the active-site pocket, covering the site and the vicinity of Zn²⁺.

Figure 2.4 left

Figure 4.1. Docked conformation of matrix metalloproteinase 9 (MMP9) and TRACER-IL24

Caption: The protein backbone (cyan) is matrix metalloproteinase 9 (MMP9); the protein backbone (green) is TRACER-IL24. The active-site Zn²⁺ (grey sphere) is clearly identified. TRACER-IL24's MMP9 recognition site (the sensitive peptide segment PLGPAG shown as yellow/green sticks) inserts stably into the catalytic pocket, forming a single hydrogen bond (yellow dashed line; average distance ~2.5 Å) and engaging key MMP9 residues (e.g., Lys258, Leu213).

The docking pose shows the scissile-site peptide extending into the MMP9 catalytic center; beyond metal coordination, the recognition segment forms stable hydrogen-bonding and hydrophobic contacts with surrounding MMP9 residues, indicating a dynamically reasonable conformation.

Beyond docking, we used MD (GROMACS) to examine the atomistic interaction features and stability of TRACER-IL24 with matrix metalloproteinase 9 (MMP9).

Given the Zn²⁺ ion in the MMP9 active center, proper Zn force-field treatment is critical. We referred to Chen et al., 2021 and corresponded via email with the paper's corresponding author, Prof. Qi Wang, to confirm modeling schemes for Zn²⁺. This ensured rigorous and sound system construction.

The docked complex served as the starting conformation; system setup and equilibration in GROMACS were as follows:

System build: Complex placed in a water box; Na⁺/Cl⁻ added to 0.1 M for neutrality.

Zn²⁺ modeling: Catalytic-site Zn²⁺ treated with an ionized model; tetracoordinate Zn²⁺ parameterized with the ZAFF force field.

Equilibration: Energy minimization followed by NVT and NPT equilibration.

Production: 100 ns MD run, focusing on stability of the peptide near the scissile site.

Detailed parameters and intermediate files are provided in the appendix. We proceed directly to results. Over the full simulation:

(1) Number of interfacial hydrogen bonds
Figure 2.4 left

Figure 4.2. Time evolution of hydrogen-bond counts in the complex

Caption: Purple denotes the raw trajectory (instantaneous hydrogen-bond count), and green denotes the 100 ps moving average. The hydrogen-bond count increases steadily after 2 ns, averaging above 10.

For small systems, ≥3 hydrogen bonds suffice to indicate a stable complex; in our system the count remains above 10 after 10 ns, indicating a very strong interfacial binding.

The upward trend indicates that after PLGLAG inserts into the catalytic groove, progressively more stable interactions are established rather than random fluctuations.

(2) Interfacial buried surface area (BSA)
Figure 2.4 left

Figure 4.3. Time evolution of the complex's buried surface area (BSA) during simulation

Caption: BSA increases over the first 10 ns and then plateaus. The blue dashed line marks the mean BSA at ~24--25 nm² (i.e., 2400--2500 Ų).

0--2 ns: BSA ~18--22 nm², indicating the interface has not fully tightened.

2--10 ns: BSA rises steadily and stabilizes at ~25 nm² (≈2500 Ų). For typical protein--peptide complexes, BSA ≥600--1000 Ų is considered an effective interface.

Literature thresholds are generally 600--1000 Ų; our system reaches ~2500 Ų, far above the minimum, indicating a very large binding interface.

The trend---from gradual fitting to stable burial---indicates interfacial optimization during dynamics and eventual convergence to a steady state, consistent with typical kinetic behavior.

(3) Number of interfacial contacts
Figure 2.4 left

Figure 4.4. Number of atomic contacts during the simulation

Caption: Contacts within <0.6 nm remain at 4000--6000; tight contacts within <0.35 nm stay around ~350.

Small peptide--protein complexes are typically stable with ≥10 tight contacts; our system maintains hundreds, indicating very robust binding.

The contact count increases and then plateaus rather than declining or dissociating, showing that the PLGLAG segment remains engaged and becomes progressively tighter over time.

The stable contact plateau corroborates the rising trend in BSA.

(4) Complex RMSD
Figure 2.4 left

Figure 4.5. Ligand RMSD over time

Caption: The RMSD averages ~1.5--1.6 nm with convergent fluctuations and no notable drift. From 0--2 ns, RMSD rises from 0 to ~1.5 nm, indicating rapid search for an optimal binding pose. From 2--10 ns, a plateau persists at 1.5--1.6 nm with no continued increase or drift.

In small-molecule systems, a 0.2--0.3 nm plateau is typically deemed stable; our system is larger, hence the higher RMSD values. The key point is trend convergence with no signs of dissociation.

The IL-24 moiety is sizable, elevating RMSD relative to small complexes, yet the PLGLAG segment remains stably seated in the catalytic groove---central to validating the mechanism.

Thus, the complex can be judged converged and stable after ~2 ns.

(5) Distance between the active ion and the scissile site
Figure 2.4 left

Figure 4.6. Time evolution of the Zn²⁺--carbonyl oxygen geometric distance

Caption: The plot shows, over a 10-ns MD simulation, the time course of the key Zn²⁺--carbonyl-oxygen distance at the PLGLAG scissile site between matrix metalloproteinase-9 (MMP9) and the TRACER sequence. The y-axis is the minimum distance (nm) between Zn²⁺ and the substrate carbonyl oxygen; the x-axis is time (ps).

Early phase (0--1000 ps): the Zn²⁺--carbonyl-O distance remains ~0.2 nm, indicating appropriate proximity to the active site. 2000--7000 ps: the distance repeatedly rises to 2--4 nm but then returns below 1 nm, consistent with dynamic binding--unbinding. 8000--10000 ps: the distance frequently resides near ~0.2 nm, showing that the substrate carbonyl oxygen repeatedly enters the Zn²⁺ coordination range.

Prior work indicates that, for matrix metalloproteinase-9 (MMP9) cleavage of PLGLAG, an effective catalytic distance requires Zn²⁺--carbonyl-O < 0.2 nm (2 Å) to enter the catalytic window [12]. Despite the limited 10-ns span, our trajectory shows repeated entries into this range, implying substantial dynamic accessibility of the sensitive peptide near the active site; longer simulations would further substantiate binding stability and catalytic relevance.

In sum, throughout the MD simulation the PLGLAG sequence remains tightly associated with the MMP9 catalytic groove, with no evident dissociation.

After docking and MD, we compiled key geometric parameters, stability metrics, and interfacial interactions, comparing them with common ranges in the literature. The results are summarized below.

Metric Reference acceptable range Our result Notes
Zn²⁺--peptide carbonyl-O distance < 2.0 Å Within acceptable range Most frames fall in a catalytically accessible window
Complex RMSD Stable plateau < 0.30--0.35 nm 1.5nm Complex remains stable over the simulation
Interfacial H-bonds ≥ 3 >10 on average Indicates robust interfacial binding
Hydrophobic contacts (<0.35 nm) ≥ 10 ~350 Hydrophobic packing reinforces stability
Buried surface area (BSA) ≥ 600 Ų 2400--2500 Ų (avg.) Sufficiently large, evidencing a substantive interface

Table 4.1. Comparative metrics

It should be noted that literature "stable-binding ranges" largely come from small-molecule--protein or short-peptide--protein systems; our complex is larger, so absolute H-bond counts, contacts, BSA, and RMSD are naturally higher. The evaluation hinges not on matching small-system values but on:
whether these metrics converge and remain stable;
whether they exceed the "minimum stability thresholds."

Our results show:
the numbers of H-bonds and contacts increase and then stabilize, indicating interfacial optimization;
BSA rises and then plateaus, indicating a formed and maintained buried interface;
RMSD rises rapidly early then settles to a plateau, indicating structural adjustment followed by stability;
and the Zn²⁺--O(Gly C=O) distance (a key catalytic geometry) repeatedly enters ~2 Å, demonstrating that the PLGLAG segment indeed samples the matrix metalloproteinase-9 (MMP9) catalytic conformation with sustained geometric accessibility---directly supporting the feasibility of cleavage.

Therefore, beyond reasonable absolute values, the time-evolution trends fit the physical picture of a substrate progressively entering and stably engaging the enzyme's catalytic groove, further validating the molecular-level feasibility of "MMP9-mediated shield cleavage."

Docking and MD collectively predict that the highly expressed matrix metalloproteinase-9 (MMP9) identified by bioinformatics can indeed recognize and stably bind PLGLAG, validating the molecular basis of TRACER-IL24's "proteolytic activation" mechanism and providing theoretical support for subsequent experiments.

5. Interactions between activated TRACER-IL24 and cancer cell membranes

TL;DR: To assess transmembrane competence of activated TRACER-IL24, we found that single metrics are insufficient for large biologics. We therefore introduce a Transmembrane Readiness Framework (TRF) that integrates geometric, thermodynamic, and kinetic layers. MD simulations analyzed with TRF show that the MMP-9--cleaved Cut state significantly outperforms the Total state across insertion depth, energy barriers, and orientation, mechanistically validating proteolytic activation as both effective and necessary for enabling membrane entry and delivery.

Given the size/charge complexity of TRACER-IL24 and the IL-24 moiety, we ask:
Q1 Can the fusion still engage cancer membranes effectively?
Q2 Does proteolysis significantly enhance transmembrane potential, and how can we compare states without a universal "large-molecule vs membrane" metric?

We predicted the Cut_TRACER-IL24 structure with AlphaFold2 and simulated its interaction on a HeLa membrane model. Composition followed Botet-Carreras (2021) [9] and Li (2011) [10] to match leaflet asymmetry (outer rich in PSM/CHOL; inner rich in PE/PS). CHARMM-GUI generated lipid counts and hydration.

Leaflet POPC POPE POPS PSM CHOL
Outer 57.1% 0.0% 0.0% 14.3% 28.6%
Inner 16.7% 33.3% 16.7% 0.0% 33.3%

Table 5.1. Cancer cell membrane composition

Note: The outer leaflet is enriched in PSM/CHOL and the inner leaflet in PE/PS, consistent with reported asymmetry for HeLa membranes. In practice, CHARMM-GUI converts percentages into per-leaflet lipid counts (we used default hexagonal packing and duplicate-water removal).

Methods: We adopted the membrane-penetration workflow of Brožek et al., 2020 (J. Phys. Chem. B) [11] and confirmed with the corresponding author, Prof. Vácha, via email the parameters and specific GROMACS input settings used in their "PLGLAG sensitive peptide--membrane interaction" work. We obtained their workflow and parameter repository to ensure reproducibility.

We likewise compiled and uploaded all parameter and intermediate files for this part of the work as attachments for other researchers to reference.

Existing literature (including Brožek's work) often uses single metrics (e.g., free-energy profiles or insertion depth) to describe peptide translocation, but under our problem setting such metrics are insufficient to fully answer whether cleavage significantly enhances transmembrane potential. The reasons are twofold:

Translocation is driven by multiple coupled factors: geometric fit, hydrophobic exposure, charge distribution, barrier height, and insertion stability are all necessary.

There is no unified criterion: differences in systems (membrane composition, peptide charge, simulation methods) make absolute thresholds hard to compare. We therefore need a framework that compares Total vs. Cut under relative conditions.

Accordingly, we propose the Transmembrane Readiness Framework (TRF).

Figure 2.4 left

Figure 5.1. Transmembrane Readiness Framework

Layer A -- Geometry & interactions (structural necessary conditions)

Insertion depth z: distance from the COM to membrane headgroups; a direct geometric metric (commonly used) to judge whether the peptide truly inserts into the membrane.

Orientation & tilt: transmembrane peptides must adopt a suitable tilt angle to enter the membrane (a standard analysis in cell-penetrating peptide literature).

Contact density: number of contacts with lipid tails reflects hydrophobic driving force.

Charge patchiness: APBS electrostatic surfaces report uniformity; large contiguous charge patches can be "locked" by polar headgroups, hindering penetration.

Layer B -- Thermodynamics (energetic necessary conditions)

Free-energy barrier ΔG: ideally via PMF from umbrella sampling; here we use SMD curves as a ranking proxy---peak force--displacement and cumulative work reflect relative barrier heights.

Layer C -- Kinetics (sustainability conditions)

Orientation autocorrelation τ_orient: indicates whether molecular orientation becomes easily locked at the membrane interface.

Rationale: Layer-A metrics follow established peptide--membrane MD (depth/tilt); Layer-B barrier analysis aligns with classical free-energy methods; Layer-C orientation stability is a common "functional" metric in drug delivery. Thus, TRF systematizes and extends existing analyses rather than inventing them de novo.

To evaluate the role of matrix metalloproteinase (MMP) cleavage, we built two systems:

Total_TRACER-IL24 (uncleaved): full fusion protein.

Cut_TRACER-IL24 (post-cleavage): shielding segment removed; BRII + IL-24 exposed.

Hereafter "Cut" and "Total." Both systems ran EM, NPT, and SMD workflows under identical parameters. We then evaluated results within the TRF framework:

Layer A: insertion depth, SASA, electrostatic patchiness, tilt distribution.

Layer B: SMD curves (relative barrier ranking).

Layer C: temporal trends (whether orientation remains persistently diversified).

This allows us to answer whether the cleaved TRACER-IL24 shows stronger transmembrane readiness across geometry, electrostatics, thermodynamics, and kinetics.

Results below follow this framework.

Layer A -- Geometry & interactions
(1) RMSD and Rg (overall stability)
ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.2. Backbone RMSD vs time (total left, cut right)

Caption: Backbone-on-backbone alignment. RMSD fluctuates slightly within 0.020--0.025 nm and plateaus in both systems, with no drift. X-axis: 0--10 ns.

Both total and cut fluctuate within 0.02--0.025 nm, indicating overall system stability.

Cut shows a slightly higher RMSD yet remains within a stable range, indicating no collapse post-cleavage and suitability for subsequent membrane-interaction analyses.

ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.3. Radius of gyration vs time (total left, cut right)

Caption: Average Rg for total ≈ 1.94 nm; for cut ≈ 1.77 nm. X-axis: 0--10 ns.

Cut is more compact while total is looser, suggesting the cut conformation is more favorable for approaching/inserting into the membrane (reducing steric hindrance near the interface).

(2) Minimum distance between BRII and membrane headgroup atoms
ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.4. Minimum distance between peptide COM and membrane (total left, cut right)

Caption: Total remains ~2.0--2.2 nm throughout; cut reaches ~1.6 nm.

Cut approaches polar headgroups more readily, forming a tighter initial adsorption interface; total stays farther away, hindering further insertion.

(3) Insertion depth
ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.5. Insertion depth (COM_Z -- headgroup_Z) vs time (total left, cut right)

Caption: Defined as the peptide COM Z relative to the same-side headgroup plane. Smaller values indicate closer/deeper insertion. Total plateaus at ~6.3 nm; cut stabilizes at ~5.7 nm.

Cut inserts markedly deeper, stably reaching the interfacial zone; total remains mostly at the surface.

Note: Y-axis meanings are consistent; X-axis windows differ (some 0--1 ns, others 0--10 ns). Comparisons use plateau/mean over matching windows.

(4) Tilt-angle distribution and time evolution
ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.6. Tilt-angle distribution

Caption: Total peaks at ~99° (quasi "lying flat"); cut peaks at ~155° (more "upright/slanted insertion").

Cut adopts a markedly more insertion-friendly orientation toward the hydrophobic core; total tends to spread on the surface.

ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.7. Tilt angle vs time

Caption: Total fluctuates mildly around ~100°; cut maintains ≥150° for extended periods---an insertion-friendly orientation.

Cut sustains the "right posture," underpinning thermodynamic advancement (Layer B) and kinetic persistence (Layer C).

(B) Thermodynamics layer

Background: Z-directed SMD (pull/push) qualitatively compares "effective barrier/force profiles." Workflow/parameters follow Brožek; as nonequilibrium sampling, SMD is used for relative ranking/shape---not as a strict PMF surrogate.

ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.8. SMD force--displacement (total left, cut right)

Caption:
Total: steep, continuously rising resistance from the outset, lacking "steps/plateaus," indicative of high barriers;
Cut: composed of stepwise linear segments with lower overall slope---requiring notably less force over comparable displacements.

Comparing slopes/plateaus, cut exhibits markedly lower effective barriers; total struggles to progress into the hydrophobic region.

ECM Receptor Interaction Cell Adhesion Molecules ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.9. SMD force vs time & displacement vs time

Caption: Total: force spikes to a maximum within ~0.1--0.2 ns, after which displacement growth is hindered ("stalled");
Cut: force rises quasi-linearly while displacement continues to increase, showing jump-like advances over local barriers.

TRF-B interpretation: Under equal pulling schedules, cut advances continuously---consistent with lower/segmented barriers; total is blocked early by large barriers.

(C) Kinetics layer
ECM Receptor Interaction Cell Adhesion Molecules

Figure 5.10. Orientation autocorrelation (0--1 ns, second-order Legendre polynomial)

Caption: Exponential fits over 0--1 ns yield orientation correlation time τ. Cut τ ≈ 0.12 ns; total τ ≈ 0.49 ns.

Cut reorients faster (smaller τ), actively searching for "insertable windows" near the membrane; total is more "sticky," behaving like a surface-trapped adsorbate.

TRF layer Metric total (uncleaved) cut (cleaved/activated) Conclusion
A Geometry/Interactions RMSD (stability) stable (0.02--0.025 nm) stable Comparable
Rg 1.94 nm 1.77 nm Cut is more compact (favors membrane approach)
Min COM--membrane distance ~2.0--2.2 nm ~1.6 nm Cut approaches headgroups more closely
Insertion depth (smaller = deeper) ~6.3 nm ~5.7 nm Cut inserts deeper
Tilt distribution/over time ~99° ~155° Cut orientation favors translocation
B Thermodynamics SMD force--displacement slope/plateaus Steep, no plateau Lower slope, clear stepwise plateaus Smaller effective barrier for cut
SMD force--time / displacement--time Early saturation, limited displacement Continuous advance, near-linear displacement Cut more readily overcomes barriers
C Kinetics Orientation correlation time τ ~0.49 ns ~0.12 ns Faster reorientation/greater adaptability for cut

Table 5.1. Metrics in the TRF analysis framework

The simulations clearly show that the cleaved (cut) state dominates across Layers A/B/C---geometrically approaching and penetrating the interfacial zone more deeply; thermodynamically facing lower pulling resistance and effective barriers; kinetically reorienting more readily and maintaining an "insertion-friendly" pose.

This indicates that negative charges effectively shield BRII's cationic surface. Consistent with our design rationale, matrix metalloproteinase (MMP)--mediated cleavage is both the "activation switch" and a prerequisite for membrane entry. The workflow and metrics follow common practices in translocation simulations (SMD/distance/orientation, etc.).

Going forward, the same workflow and TRF can assess TRACER peptides with other cargos---e.g., small molecules, which may further enhance transmembrane potential. This suggests new experimental directions and that our ACPP platform could deliver not only proteins but also small-molecule therapeutics.

6. Quorum-triggered synchronized lysis (SLC) dynamics: predictions and wet-lab guidance

TL;DR: To support wet-lab experiments on quorum-triggered synchronized lysis, we built an ODE model to predict four observables over time: cell number (N), AHL signal, lysis protein (L), and LuxI, and we propose actionable criteria for assessing agreement with the model. We use two parameter sets:

  • Set A (E. coli-like): predicts sustained/weakly damped oscillations;
  • Set B (S. Typhimurium-like): predicts pronounced damping with oscillations dying out.

These predictions guide sampling cadence, loading volume, and interpretation thresholds; "data backfill slots" are reserved so the wet-lab team can substitute measured curves and recompute goodness-of-fit.

After validating TRACER's feasibility at the molecular level, we face a practical engineering question: how to produce this therapeutic efficiently and robustly? Our strategy employs a quorum-sensing triggered synchronized lysis circuit (SLC) in engineered bacteria. This is a complex dynamical system whose behavior (e.g., lysis period and efficiency) depends strongly on strain traits and environment.

Accordingly, the goal here shifts from "validation" to "prediction and guidance." We construct an ordinary differential equation (ODE) model to address key questions and provide a clear "operating manual" for the wet lab:

  • How will the system behave? Under different strain backgrounds, will synchronized lysis be sustained or damp quickly?
  • How should experiments be run? What sampling frequency captures the critical dynamics?
  • How should results be interpreted? How do we decide from data whether the circuit behaves as intended?

To capture the core dynamics of SLC, we formulate an ODE system with four key variables: cell density (N), AHL concentration (H), lysis protein concentration (L), and LuxI synthase concentration (I). Based on literature review, the relevant processes can be described by ODEs as follows:

System equations:

$$\frac{dN}{dt} = \mu_G N(N_0 - N) - \gamma_N N$$
$$\frac{dH}{dt} = bNI - \frac{\mu H}{1 + N/N_0}$$
$$\frac{dL}{dt} = C_I P_{lux} - \gamma_L L - \mu_G L$$
$$\frac{dI}{dt} = C_I P_{lux} - \gamma_I I - \mu_G I - \gamma_C I$$
$$P_{lux} = \alpha 0 + \alpha H(H/H0)41 + (H/H0)4, \gamma N P_{lux}$$
$$= \alpha_0 + \frac{\alpha_H (H/H_0)^4}{1 + (H/H_0)^4}, \quad \gamma_N \frac{kL^n}{L_0^n + L^n}$$

Meaning:

  • N: Cell number (approximated by OD or CFU).
  • H: Concentration of AHL.
  • L: Lysis protein (readouts via SDS-PAGE/immunoassay or fluorescent tags).
  • I: LuxI synthase.
  • αH: Promoter activity activated by AHL (Hill model with coefficient 4 to capture cooperativity).
  • μL: Cell death rate mediated by the lysis protein (Hill model).

Because circuit behavior may vary across chassis strains, we avoid a single parameter set and instead use two literature-informed representative scenarios to probe behavioral bounds.

We identified two parameter sets from the literature:

Set A (literature E. coli-like) [16]

Set A (literature E. coli-like) models efficient AHL synthesis and clearance. It is expected to yield sustained/stable oscillations—an ideal, periodically "harvestable" production mode.

Parameter Value Parameter Value
μG0.2N020
k10L01
n2b15
μ12CL0.5
CI1α01
αH20H04.5
γL2γI2
γC12

Expected phenomenon: quorum-synchronized lysis with oscillations persisting over multiple cycles.

Parameter set B (S. Typhimurium-like in the literature) [17]

Parameter set B (S. Typhimurium-like in the literature) models a weaker AHL signaling system. It is expected to yield damped oscillations—i.e., oscillations that decay to a steady state after several cycles—representing a plausible but less efficient mode.

Parameter Value Parameter Value
μG0.2N020
k4L01
n4b15
μ1CL0.5
CI1α00.4
αH8H02
γL2γI1
γC12

Expected behavior: With weaker synthesis/activation and a steeper lysis threshold (n↑, k↓), oscillations are strongly damped and approach steady state.

(1) Population rhythms of cells and AHL
Figure 2.4 left

Figure 6.1-A (Set A): N shows clear periodic rise-fall; each surge of H increases promoter activity, then L rises and the death rate increases, triggering synchronized lysis; after N drops, H is cleared and the next cycle begins.

Interpretation: A classic self-excited oscillator; OD/CFU should show >2 cycles of drop peaks, phase-led by AHL peaks in LC-MS (AHL peak precedes lysis peak).

Figure 2.4 left

Figure 6.1-B (Set B): After the first one or two peaks, amplitudes successively shrink and converge to steady state.

Interpretation: Damped oscillations, suggesting weaker AHL induction or a higher lysis threshold in this chassis.

These simulations translate into three practical guidelines:

  • Set expectations: Damped oscillations need not mean failure; they may reflect chassis traits (à la Set B), guiding host choice/engineering.
  • Optimize sampling: With a ~3-4 h period, measure OD600 every 15-20 min; densify sampling near peaks/valleys for AHL and protein to capture dynamics.
  • Define quantitative calls:
    • If over 3 cycles ΔN/N_peak > 30% and amplitude decay < 20% → "sustained oscillation" (Set A-like).
    • If amplitude decays > 50% within 3 cycles → "damped" (Set B-like).
(2) Phase relation of L and LuxI/II
Figure 2.4 left

Figure 6.2-A (Set A): I rises before L (driven by promoter activity), L peaks ~half a cycle later; each L peak coincides with increased death rate and a sharp N drop.

Interpretation: Clear phase ordering.

Figure 2.4 left

Figure 6.2-B (Set B): The peak value decreases gradually and the peak-to-peak interval slightly lengthens.

Unlike the continuous oscillation in Figure A, the oscillation in this scenario exhibits significant damping attenuation, with the cracking peak decreasing gradually and eventually stabilizing. This indicates that the strength of the positive feedback loop driving cracking is insufficient to maintain sustained synchronous behavior.

According to the analysis of model parameters, the two main inhibitory factors that may cause this phenomenon are:

  1. The degradation of key proteins is too fast: the ClpXP system responsible for degrading LuxI proteins may be too efficient (equivalent to a "brake" that is too heavy), resulting in AHL signaling molecules being unable to effectively accumulate to the threshold that triggers cleavage.
  2. The synthesis rate of key proteins is too low: The basal synthesis rate of LuxI or lytic proteins (equivalent to the upper limit of the "throttle") itself is low, and even with strong activation of AHL signals, the system's response ability is limited and insufficient to support the next round of strong lysis.

This simulation result warns us that when selecting or designing host strains, we must pay attention to their endogenous protein degradation and synthesis efficiency, ensuring that the positive feedback loop can be effectively triggered and amplified, otherwise the expected periodic lysis effect may not be achieved.

In Sum,
Parameter set Ref. Traits Simulation Experimental implication
Set 1 (E. coli) Nat. Commun. 2020 High AHL synthesis/clearance; strong LuxI Stable oscillations of N/H/L/I (~3-4 h period) Predicts periodic autolysis; offers quantitative guidance
Set 2 (S. Typhimurium) Nat. Microbiol. 2017 Low synthesis/clearance; weak LuxI Damped oscillations toward steady state Oscillations may vanish in some chassis; choose hosts carefully

Table 6.1. Two chassis, parameterized behaviors, and guidance

Suggested workflow: predicted period T_pred ≈ 120-180 min (placeholder). Measure OD600 every 10-15 min and photo-record clarity.

Record sheet (to backfill):

Period T (experiment): ________ min; amplitude (cycles 1/2): ________ / ________;

TBP = to be

Item Symbol Value
Average PeriodT_measurePending
Amplitude of cycleAmp_measurePending
DampingYes/NoPending

Parameter set A (E. coli-like): predicts sustained synchronized-lysis oscillations;

Parameter set B (S. Typhimurium-like): predicts rapidly damped oscillations approaching steady state;

Different chassis exhibit markedly different dynamics in the synchronized lysis circuit (SLC). The model suggests that sustained oscillations are maintained only with specific hosts (e.g., E. coli-like parameters as in set A). Therefore, achieving the desired oscillatory behavior requires targeted screening and optimization of the host strain.

7. Pharmacokinetics

Due to time constraints, we were unable to obtain the results of the MTT experiment. However, we will conduct the experiment and perform pharmacokinetic modeling after Wiki Freeze. We will use PH-Sim to simulate different dosing regimens for individuals in East Asia and analyze the concentration-time spectra of TRACER-IL24 in plasma, distal colorectum and kidneys. We will calculate pharmacokinetic parameters, including AUC, t_max and c_max, to characterize in vivo PK features and provide data support and theoretical basis for future animal studies and potential therapeutic applications.

MTT results analysis

Process cell viability according to the formula below:

$$Cell\ Viability(\%) = \frac{OD_{experiment} - OD_{blank}}{OD_{control} - OD_{blank}} \times 100\%$$

By fitting the dose-response curve to the raw data:

$$Cell\ Viability = Bottom + \frac{Top - Bottom}{1 + (\frac{Concentration}{EC_{50}})^{HillSlope}}$$

The fitting result yields the effect concentration parameter.

Convert the in vitro concentration effect relationship into effective drug dose mg/kg through in vivo pharmacokinetic PK:

$$Dose = \frac{C \times V_d}{F \times Weight}$$
  • C: Target plasma concentration mg/L
  • Vd: Distribution volume, drug dose/in vivo drug concentration=1.12L
  • F: bioavailability, Value ≈ 0.2
  • Weight: weight, modeled as East Asian, Default Weight = 60kg

Obtain the effective concentration parameter effective drug dose from above and conduct pharmacokinetic modeling.

1. Individual & Compound Create
Parameters Type/Value Source
SpeciesHumanDefault
PopulationEast AsianDefault
GenderMaleDefault
Age30.00 yearsDefault
Weight60.03 kgDefault
Height169.96 cmDefault
BMI20.78 kg/m²Default
Lipophilicity-2.50 Log UnitsEstimation
Fraction Unbound0.25%Estimation
Molecular Weight23.80kDaDatabase
Solubility at pH 7.000.21 mg/LEstimation
Advanced Compound Parameters/Default
2. Administration Protocol Create

According to the calculated effective drug dose, four rounds of administration were set, and the administration method was set as Intravenous Infusion to simulate an intelligent microneedle delivery device. The infusion time for each round was set to 60.00 minutes, and the dose matched the effective drug dose.

3. Simulation Run

Curve Selection: 1. Distal Colon → Interstitial Concentration 2. Kidney → Interstitial Concentration 3. Peripheral Venous Blood → Plasma Concentration. Used to evaluate whether the drug efficacy target concentration can be reached, while assessing plasma and kidney exposure, and determining the efficacy vs. toxicity balance.

8. NetLogo Organizational Layer Simulation

To explore the effects of different inoculation amounts on the population dynamics and lysis process of engineering bacteria, we used the Agent-Based Modeling (ABM) approach and simulated with NetLogo.

Figure 2.4 left

Figure 8.1. NetLogo simulation workflow

Initialization: Generate Num bacterial agents on a 2D grid as the initial inoculum (2000-10000 gradient). Assign each agent base attributes—lifespan, division cycle, and maximum health—with randomized variation to capture heterogeneity.

Nutrient-driven growth: At each timestep, bacteria consume nutrients on their current patch and decide to divide based on accumulated energy; biomass increase triggers the quorum-sensing module.

Quorum sensing: Viable cells secrete AHL, LuxR, and the AHL-LuxR complex at specified rates; molecules diffuse to neighboring patches and undergo degradation.

Lysis module: When local AHL-LuxR concentration remains above threshold, cells enter lysis state and synthesize lysis protein. Once this exceeds its threshold, the cell dies and instantly releases intracellular lysis protein as a burst, accelerating lysis in neighboring cells.

Outputs: At each step, record total cell count, biomass, and concentrations of signaling molecules and lysis protein to analyze synchronous lysis patterns.

(1) Relationship between inoculum size and lysis
Figure 2.4 left

Figure 8.2: The effect of different initial inoculation amounts on biomass accumulation and fragmentation cycles

[Caption]: Simulate setting Num=2000, 5000, 8000, 10000, 12000, 15000, 20000. The vertical axis represents biomass, and the horizontal axis represents time step (tick). As the vaccination amount increases, the lysis cycle is significantly shortened and the synchronization of the population is enhanced.

Figure 2.4 left

Figure 8.3: Inoculation volume fragmentation cycle

As the inoculation amount Num increases (2000 → 10000), the first lysis cycle is significantly shortened.

Figure 2.4 left

Figure 8.4: When the inoculation amount is Num=20000, the growth quantity of engineering bacteria.

[Caption]: When the inoculation amount is too high, too many lytic proteins are produced in the lysed area, and the rate of inactivation of lytic proteins is not sufficient to rapidly reduce the effective amount of lytic proteins, resulting in a low bacterial count after the first lysis and the system losing the observable second cycle.

The results indicate that the initial inoculation amount determines the accumulation speed of quorum sensing signals, which in turn affects the lysis cycle. In the future, the wet experimental section will introduce inoculation volume for validation.

(2) Visualization of spatiotemporal distribution:
Figure 2.4 left

Figure 8.5: Spatiotemporal visualization from NetLogo simulations

Caption: Each panel shows the lysis process under different values of Num. a.Num=2000; b.Num=2000; c.Num=2000; d.Num=2000. The upper-left subfigure plots total biomass and product over time; the upper-right shows spatial distributions (red = high lysis-protein concentration, green = viable cells). Results indicate that at higher inoculum levels, lysis events are more clustered and exhibit more stable periodicity.

  • Spatial clustering of signaling molecules and lysis protein is visible in local patches, forming "hotspot" regions.
  • Lysis events erupt locally and then spread to neighboring areas, driving synchronized lysis across the population.
  • At higher inoculum, this spatial clustering effect is amplified, accelerating the onset of synchronized lysis.

The NetLogo simulations reveal, at the tissue scale, a quantitative relationship between inoculum size and lysis period:

  • A higher initial inoculum accelerates quorum-signal build-up and shortens the lysis period.
  • When the initial inoculum is too low, lysis initiation is delayed.

These findings provide a theoretical basis for large-scale industrial culture of engineered bacteria: by tuning the initial inoculum, the lysis rhythm can be precisely controlled to manage cultivation cycles and achieve optimal yield.

9. Discussion and Outlook on Current Work 1: A model-driven closed loop, its limitations, and future iterations

Our current work has built an unprecedented multi-scale modeling framework that not only provides multidimensional validation of the TRACER-IL24 delivery system but also forms a complete logical closed loop from "why do it (Why)" to "can it work (How)" and then to "how to use it (What's next)." The excellence of this framework lies in the fact that each layer of the model takes the conclusions of the previous layer as input. However, we must also soberly recognize that every model is a simplification of complex biological reality, with underlying assumptions and limitations that must be faced.

Our modeling story begins with the most macro-level clinical need and advances layer by layer, ultimately returning to the question of clinical application. The core strength of the work lies in deeply realizing the synergy between model and experiment—the model is no longer mere "validation," but rather a "guidance system" that runs throughout.

Although our modeling framework is logically coherent, each layer is built on certain idealized assumptions. These constitute the limitations of the current work and are precisely the starting points for future optimization.

Molecular-level limitations—challenges of static predictions and simplified environments:

Inherent uncertainty of structure prediction: AlphaFold provides a "static" most probable conformation. For the highly flexible peptides in TRACER, prediction confidence is inherently lower. The real activation process is dynamic, whereas our simulations can capture only two snapshots—"before activation" and "after activation."

Limitations of simulation duration: Our MD simulations are at the nanosecond (ns) scale, sufficient to verify stability and initial trends. However, observing a complete catalytic event or membrane traversal requires microseconds (μs) or longer, which is beyond our current computational resources.

Over-simplified environment: Our transmembrane simulations use an idealized pure lipid bilayer. Real cancer cell membranes are covered by a glycocalyx and embed numerous proteins, which can greatly affect TRACER's behavior.

System-level limitations—universality of parameters and neglect of individual variability:

Parameter dependence of ODE and ABM models: The ODE and NetLogo models of the population lysis circuit rely heavily on kinetic parameters from the literature (e.g., protein synthesis/degradation rates). These parameters come from specific strains and experimental conditions and may differ from the true parameters of our engineered strains. Before calibration with wet-lab data, the model's predictions serve more as qualitative trend guidance than precise quantitative forecasts.

"Average person" assumption in the PBPK model: Our pharmacokinetic model is based on a standard physiological database for an "East Asian healthy male." This overlooks large inter-patient variability (e.g., age, sex, body weight, hepatic/renal function) and, more importantly, does not account for the impact of disease states on physiological parameters (for example, tumors themselves may alter local blood flow and tissue permeability).

Clinical translation-level limitations—bridging the gap from organ concentration to tumor targeting:

Core challenge: uncertainty of target-site concentration: The greatest limitation of the PBPK model is that it predicts the average interstitial concentration of the drug in the "distal colorectum" organ region rather than the actual concentration within the tumor microenvironment. Tumor tissue has unique physiological features (e.g., high interstitial pressure, abnormal vasculature) that make permeation from blood into the tumor core much more difficult than in normal tissue. Therefore, while our current conclusion that "the target-site concentration may be insufficient" is an important warning, its precision still needs improvement.

Facing the above limitations is key to driving the project toward maturity. Future work will focus on calibrating models with experimental data and using more refined models to answer more complex questions.

Molecular level: toward more realistic dynamic simulations

Future plan: We will perform long-timescale simulations using enhanced sampling methods (such as umbrella sampling) to compute the transmembrane potential of mean force (PMF), thereby elevating the assessment of transmembrane potential from the qualitative "TRF framework" to a quantitative comparison of energy barriers. At the same time, we will verify the actual biological activity of IL-24 through wet-lab experiments and use the results to guide optimization of the linker design.

Production level: building an accurate "digital twin"

Future plan: The current parameter dependence of the models precisely underscores the importance of data back-filling. We will use wet-lab measurements such as growth curves and AHL concentrations to perform parameter fitting and calibration for the ODE and NetLogo models, turning them into "digital twin" models that can accurately predict the behavior of our specific engineered strains, thereby enabling precise control of the production process.

Clinical application level: building an integrated PBPK model for multiple administration routes

Core taskbridging the "tumor target" gap: Addressing the greatest limitation of the PBPK model, our future core effort is to establish an integrated multi-route PBPK model combining microneedles (systemic) and oral engineered bacteria (local), and on this basis introduce an independent tumor physiological compartment. This model will simulate how the drug overcomes physiological barriers of tumor tissue and predict whether the drug concentration in the tumor core can reach the EC90 level under the synergy of the two administration routes. This will be the most critical step in moving from "proof of concept" to "preclinical research."

10. Discussion and Outlook on Current Work 2: Centering the TRF paradigm to build a complete closed loop from molecular design to clinical prediction

Our current work has built an unprecedented multi-scale modeling framework; while its logical coherence is an advantage, its true methodological core and innovative highlight lies in the Transmembrane Readiness Framework (TRF) we proposed to solve key scientific questions. The entire modeling system can be viewed as a complete narrative constructed around the "input side" and "output side" of TRF.

In our project, the most critical design hypothesis is: "Can cleavage by MMP9 (matrix metalloproteinase MMP9) effectively 'activate' the transmembrane ability of the BRⅡ peptide?" This is a typical complex question that is difficult to answer with a single metric. The existing literature often uses single indicators in isolation—such as insertion depth, energy profiles, or tilt angle—lacking a "gold standard" for comprehensive evaluation.

Innovativeness and necessity of TRF:

To fill this gap, we proposed TRF. Its innovation lies in deconstructing the transmembrane process into three logically progressive layers for the first time: A—geometry and interactions (structural premise), B—thermo-dynamic (energy feasibility), and C—kinetics (process sustainability). This three-layer, eight-metric framework turns the vague problem of "transmembrane potential" into a set of specific, quantifiable, and comparable indicators. It is no longer a "yes/no" judgment but a scoring of "readiness." In our work, TRF clearly "adjudicated" that the post-cleavage Cut state is significantly superior to the Total state across all three layers. This conclusion provides the firmest microscopic theoretical support for the feasibility of the entire TRACER design.

TRF as the engine driving the project forward:

TRF is not merely a static evaluation tool; it in fact drives the logic of our preceding and subsequent work. Bioinformatics analysis (Chapter 2) and structural modeling (Chapter 3) provided validated input models for TRF (i.e., the Total and Cut conformations); and TRF's conclusion—"the activated state can indeed traverse the membrane"—directly forms the fundamental premise for the subsequent pharmacokinetic (PBPK) model (Chapter 7). Without the strong evidence from TRF, the assumptions in PBPK simulations that the drug can enter cells and exert effects would be castles in the air. Therefore, TRF is the bridge and engine that carries us from molecular design to clinical application prediction.

Although the TRF framework itself is systematic, the precision (clarity) of its conclusions is inevitably constrained by the "resolution" of the simulation data we feed into it. This constitutes a major limitation of the current work.

Qualitative thermodynamic evaluation: In TRF's B-layer (thermodynamics), we used steered MD (SMD) to qualitatively compare energetic resistance to transmembrane passage. This is computationally efficient but relatively coarse; it can only provide a ranking of "relative barrier heights," not precise free-energy barrier values (ΔG). This leaves our conclusion at the qualitative level of "Cut is easier than Total."

Idealization of the simulation environment: The entire TRF analysis was conducted in a simplified pure lipid bilayer without other membrane proteins. This ignores complex structures on real cancer cell membranes, such as the glycocalyx and receptors. It means our current "high TRF score" represents potential under ideal conditions, while behavior on crowded, complex real cell surfaces remains unknown.

Acknowledging these limitations provides a clear path for the future development and application of TRF. Our goal is to upgrade it from a project-specific solution to a more precise and general platform-style design tool.

Core upgrades: achieving quantitative and high-resolution TRF

Upgrade the B-layer—thermodynamic evaluation: Our top priority is to use enhanced sampling methods with higher computational cost but greater accuracy, such as umbrella sampling, to compute the potential of mean force (PMF) for peptide translocation. This will upgrade TRF from a qualitative comparison tool to a precise analytical framework that provides quantitative barrier values (units: kJ/mol). We will then not only know which is better but also "by how much," and we can directly compare with other thermodynamic data such as molecular binding energies.

Building more realistic simulation environments: In future simulations, we will construct cell membrane models closer to physiological reality—for example, introducing varying proportions of sphingolipids and cholesterol, and even embedding representative glycoproteins on the membrane surface—to test whether TRF evaluations remain robust under more complex conditions.

Application expansion: Turning TRF into a general-purpose design platform for the iGEM community

Evaluating other payloads: TRACER is a delivery platform. In the future, we can use a standardized TRF workflow to rapidly assess TRACER's potential to deliver other types of drugs. By simply swapping the payload, rerunning simulations, and applying TRF analysis, we can efficiently predict the transmembrane efficiency of new fusion proteins and greatly accelerate preliminary drug screening.

Guiding rational design and evolution of TRACER peptides: The most exciting application prospect of TRF is to use it as a computer-aided design (CAD) tool. We can perform in silico multi-site mutations on the BRⅡ peptide segment to generate dozens of virtual candidate designs, then rapidly score and rank them with TRF, and submit only the top-scoring designs for wet-lab validation. This will fundamentally change traditional trial-and-error-dependent peptide optimization and enable truly rational design.

In summary, the TRF paradigm is the core innovation of our work. It has now successfully validated the rationality of our design; in the future, with continuous quantitative upgrades and expansion of application scenarios, it has the potential to become a powerful and efficient computational simulation platform serving a broader scope of peptide drug design—perfectly embodying the iGEM project's core value of "contributing to the community."

References

  1. [1] Yu J, He Z, He X, Luo Z, Lian L, Wu B, Lan P, Chen H. Comprehensive analysis of the expression and prognosis for MMPs in human colorectal cancer. Front Oncol. 2021;11:771099. doi: 10.3389/fonc.2021.771099.
  2. [2] Gao P, He M, Zhang C, Geng C. Integrated analysis of gene expression signatures associated with colon cancer from three datasets. Gene. 2018;654:95–102. doi: 10.1016/j.gene.2018.02.007.
  3. [3] Nunes L, Li F, Wu M, et al. Prognostic genome and transcriptome signatures in colorectal cancers. Nature. 2024;633:137–146. doi: 10.1038/s41586-024-07769-3.
  4. [4] Zhang B, Wang J, Wang X, et al. Proteogenomic characterization of human colon and rectal cancer. Nature. 2014;513(7518):382–387. doi: 10.1038/nature13438.
  5. [5] Kothalawala WJ, Győrffy B. Transcriptomic and cellular content analysis of colorectal cancer by combining multiple independent cohorts. Clin Transl Gastroenterol. 2023;14(2):e00517. doi: 10.14309/ctg.0000000000000517.
  6. [6] Pham PN, Zahradník J, Kolářová L, Schneider B, Fuertes G. Regulation of IL-24/IL-20R2 complex formation using photocaged tyrosines and UV light. Front Mol Biosci. 2023;10:1214235. doi: 10.3389/fmolb.2023.1214235.
  7. [7] Singh A, Copeland MM, Kundrotas PJ, Vakser IA. GRAMM web server for protein docking. Methods Mol Biol. 2024;2714:101–112.
  8. [8] Chen B, Kang Z, Yao C, Zhang B, Liu Y, Wang Q. De-shielding of activatable cell-penetrating peptides: recognizing and releasing in activation process. Res Chem Intermed. 2021;47(1):117–130. doi: 10.1007/s11164-020-04339-w.
  9. [9] Botet-Carreras A, Montero MT, Sot J, Domènech Ò, Borrell JH. Engineering and development of model lipid membranes mimicking the HeLa cell membrane. Colloids Surf A Physicochem Eng Asp. 2021;630:127663. doi: 10.1016/j.colsurfa.2021.127663.
  10. [10] Li X, Yuan YJ. Lipidomic analysis of apoptotic HeLa cells induced by paclitaxel. OMICS. 2011;15(10):655–664. doi: 10.1089/omi.2011.0027.
  11. [11] Brožek R, Kabelka I, Vácha R. Effect of helical kink on peptide translocation across phospholipid membranes. J Phys Chem B. 2020;124(28):5940–5947. doi: 10.1021/acs.jpcb.0c03317.
  12. [12] Zheng Y, Sheng J, Ji X-F, Zheng L-H, Sun M. Structure analysis of zinc metalloproteases. Chinese Journal of Biochemistry and Molecular Biology. 2013;29(8):719–726.
  13. [13] Miano A, Liao MJ, Hasty J. Inducible cell-to-cell signaling for tunable dynamics in microbial communities. Nat Commun. 2020;11(1):1193. doi: 10.1038/s41467-020-15056-8.
  14. [14] Prindle A, Selimkhanov J, Li H, Razinkov I, Tsimring LS, Hasty J. Synchronized cycles of bacterial lysis for in vivo delivery. Nature. 2016;536(7614):81–85. doi: 10.1038/nature18930.
  15. [15] Danino T, Mondragón-Palomino O, Tsimring LS, Hasty J. A synchronized quorum of genetic clocks. Nature. 2010;463(7279):326–330. doi: 10.1038/nature08753.
  16. [16] Leuzzi A, Grossi M, Di Martino ML, Pasqua M, Michelì G, Colonna B, Prosseda G. Role of the SRRz/Rz1 lambdoid lysis cassette in the pathoadaptive evolution of Shigella. Int J Med Microbiol. 2017;307(4-5):268–275. doi: 10.1016/j.ijmm.2017.03.002.
  17. [17] Scott SR, Din MO, Bittihn P, Xiong L, Tsimring LS, Hasty J. A stabilized microbial ecosystem of self-limiting bacteria using synthetic quorum-regulated lysis. Nat Microbiol. 2017;2:17083. doi: 10.1038/nmicrobiol.2017.83.