
Overview

The research project titled "Detection of Pseudomonas aeruginosa in Water Samples Using Engineered Bacteria Based on a Dual-Plasmid System and Enzyme Substrate Method" focuses on achieving "rapid" and "efficient" detection performance. This means the system needs to produce both yellow substances and blue fluorescent substances earlier, faster, and in larger quantities. Therefore, the performance of each biological component that constitutes the detection pathway determines the overall detection efficiency.
To enable the engineered bacterial detection system to promptly capture the specific signaling molecules (PQS and PYO) from Pseudomonas aeruginosa, and effectively activate and operate the downstream detection pathway, it is necessary to select high-quality genes. These genes should express proteins that can bind/complex with PQS/PYO, and their interaction with these signaling molecules should effectively contribute to the corresponding promoters upstream of the reporter protein genes. For this purpose, three candidate genes—brlR, soxR, and mexL—are available for selection.

Validation via molecular docking modeling showed that: .
Abstract
This study integrates computational simulations with experimental validation to identify the optimal genetic components for a Pseudomonas aeruginosa biosensor. Molecular docking was employed to quantitatively evaluate the binding affinities of the transcription factors SoxR, BrlR, and MexL with the signaling molecule PYO. AlphaFold3 was used to predict the complex structures of these factors with their corresponding promoter DNAs. A multi-criteria decision analysis matrix was constructed to objectively assess performance. The MexL system achieved the highest comprehensive score and was recommended as the primary candidate. Wet-lab GUS enzyme activity assays confirmed this prediction, showing that the MexL system exhibited higher enzyme activity than the other groups. This research establishes a complete "computational prediction–quantitative evaluation–experimental validation" rational design paradigm, providing a systematic approach for synthetic biology component engineering.1. Research Background
The core of synthetic biology lies in the predictable, engineering-based design and reconstruction of biological systems. This study focuses on a typical design challenge: prospectively screening the most sensitive components responsive to the Pseudomonas aeruginosa signature signal molecule PYO from multiple candidate transcription factor-promoter systems. Traditional trial-and-error methods suffer from inefficiency and lack of mechanistic insight, prompting the establishment of a "computationally guided rational design paradigm" that focuses on two decisive molecular events in the signaling pathway: 1. Ligand recognition: Determines sensor sensitivity to the target signal—the specific binding efficiency between the transcription factor and PYO. 2. Transcriptional activation: Determines sensor signal output strength—the ability of the ligand-bound transcription factor to form a complex with the promoter and drive downstream gene expression. By simulating these events at atomic and molecular levels and integrating multi-criteria decision analysis, this study achieves a qualitative-to-quantitative evaluation transition, providing high-confidence priorities for experimental validation.2. Thermodynamic Quantitative Analysis of Ligand Binding Affinity
Specific binding between ligand and receptor is the foundation of biosensor signal perception, a process that strictly follows thermodynamic principles. This study used AutoDock4 to calculate the average binding free energy (ΔG) of candidate transcription factors with a dual-PYO molecular system through secondary docking. ΔG serves as a direct thermodynamic indicator of binding spontaneity and strength—more negative values indicate stronger binding capacity. Figure 1 shows the results of dual-PYO docking for the three transcription factors.





3. Binding Free Energy Calculation and Dissociation Constant Estimation
Table 1 — Average binding free energy (ΔG) and rankingRANK | Candidate Transcription Factor | ΔG(kcal/mol) |
---|---|---|
1 | BrlR | -7.26 |
2 | MexL | -7.1 |
3 | SoxR | -5.79 |
RANK | Candidate Transcription Factor | |
---|---|---|
1 | BrlR | 4.5 |
2 | MexL | 6.2 |
3 | SoxR | 55.2 |


4. Preliminary Thermodynamic Analysis and Conclusions
BrlR and MexL demonstrate significantly higher binding affinity for PYO than SoxR. Both exhibit Kd values in the low micromolar range, indicating efficient receptor occupancy and signal triggering at low environmental PYO concentrations, suggesting high detection sensitivity. SoxR's Kd value (55.2 μM) is nearly an order of magnitude higher, indicating weaker signal perception capability and prompting its preliminary exclusion based on the "sensitivity" criterion. Further structural analysis is needed to evaluate the comprehensive performance of each system.5. Structural Biology Study of the Transcriptional Activation Complex
Ligand binding represents only the first step in signal transduction. The activated transcription factor must form a stable complex with the promoter DNA to drive downstream gene transcription. This study used AlphaFold3 to predict the complex structures of the three transcription factors with their corresponding promoters in the unbound PYO state, revealing conformational differences and their impact on transcriptional activation efficiency.
6. Preliminary Conformation–Function Relationship Analysis
The structural characteristics of each system suggest different functional mechanisms: SoxR's "open" conformation and numerous potential DNA interaction sites indicate strong DNA binding stability but potentially slower binding kinetics. BrlR's "closed" conformation suggests DNA binding strictly depends on PYO-induced conformational changes, with docking results showing no exposed DNA binding sites under normal conditions. MexL, despite having fewer DNA binding sites, appears to bind and dissociate from DNA more easily, suggesting faster DNA dynamic response capabilities. Based on this analysis, the BrlR–PBrlR system was excluded from further consideration, while the final selection between SoxR–PmexG and MexL–PphzA1 required a more comprehensive evaluation.7. Construction of a Rational Screening Evaluation Matrix
Table 3 — Evaluation criteria and weight distributionDecision criteria | Weight | Standard description |
---|---|---|
Ligand-binding affinity | 30% | Quantitative assessment based on binding free energy |
DNA binding conformation | 30% | Basal transcription initiation efficiency |
Prediction of interaction strength | 20% | Assessment of complex stability |
Conformational change necessity | 20% | Activation energy barrier between inactive and active states |
System | Ligand-receptor binding affinity (30%) | DNA binding conformation (30%) | Predict the strength of interaction (20%) | Conformational change necessity (20%) | Weighted total score |
---|---|---|---|---|---|
SoxR-PmexG | |||||
Goal | 2 | 4 | 5 | 2 | |
Weighted score | 0.6 | 1.2 | 0.1 | 0.4 | 2.3 |
Basis for scoring | K_d=55.2μM insensitivity | Open clamp, Normally open conformation | Have the most interacting sites | May require minor conformation changes | |
BrlR-PBrlR | |||||
Goal | 5 | 1 | 1 | 1 | |
Weighted score | 1.5 | 0.3 | 0.2 | 0.2 | 2.2 |
Basis for scoring | K_d=4.5μM the highest sensitivity | Closed conformation, ligand-unbound | No effective interaction | May require a large conformational change to open | |
MexL-PphzA1 | |||||
Goal | 4 | 3 | 2 | 1 | |
Weighted score | 1.2 | 0.9 | 0.4 | 0.2 | 2.7 |
Basis for scoring | K_d=6.2μM high sensitivity | Pre-binding region,Pre-activation | Partially well-defined binding mode | May require significant conformational changes |
8. Experimental Validation and Prediction Consistency Analysis

Application of Curve Fitting Software Based on High-Dimensional Search and Azimuth Statistics
Summary
Dynamical models based on ordinary differential equations (ODEs) are widely used in physics, chemistry, and biology. By interpreting the intrinsic properties of a research object through a model and visualizing the output curves of state variables, we can better fit these curves to real experimental data (such as wet-lab measurement curves). Adjusting parameters according to the model's characteristics can improve fitness, The parameters of these models are usually estimated based on experimental data. [1] however, automated parameter search methods remain to be fully explored.
The majority of models involve transcendental functions and nonlinear systems, which means strategies to inversely deduce parameters from state variables are often impractical. Methods that avoid direct parameter inversion—such as neural network training and Bayesian search—rely heavily on large datasets for training, [2] which is time-consuming and not well-suited for quickly developing a convenient curve-fitting and prediction software.
Therefore, the modeling team at YAU-China has developed a reinforcement learning search method based on high-dimensional space concepts and curve similarity quantification and designed a fitting software accordingly. Testing on various curves demonstrates its broad applicability, suggesting potential influence on future research.
Software Overview
1. Software Name: CFS_BHDS_AS
Full Name: Curve Fitting Software Based on High-Dimensional Search and Azimuth Statistics
2. Developer: Shengxuan BIAN
Email: shixiu_yakuchi0324@qq.com
Download Link: https://gitlab.igem.org/2025/software-tools/yau-china/-/tree/main/CFS_BHDS_AS
Password: SoxR
3. Functions:
Fit experimental data curves and predict data curves, providing optimal parameters for the best fit
Provide a dynamic plotting window for real-time observation of fitting progress
Offer visualizations including predicted curves and parameter variation analysis
4. Software Dependencies: R-4.5.0,RStudio (2025.05.0+496)
5. Hardware Dependencies: Intel(R) Core(TM) i7-8850H CPU @ 2.60 GHz (2.59 GHz)
Methodology
We assume that for any ODE model, the parameters corresponding to the current predicted curve α form a prediction coordinate A, where each dimension represents the value of one parameter. Atom the perspective of the ODE model, the experimental data curve β has its own unique parameter combination; at least one target coordinate B contains these parameter values as its dimensions. The Euclidean distance AB between A and B represents the degree of dissimilarity between the predicted curve α and the experimental curve β.
The prediction coordinate A can be moved under the guidance of a spatial vector a⃗ to approach B. If the approach efficiency decreases, a spatial vector a⃗ that is perpendicular to the current trajectory direction and that further reduces the Euclidean distance AB is sought to guide the next movement of the prediction coordinate. Using the R function base::outer(x, y, "-") , we compute a distance matrix between points of the predicted curve α and experimental curve β after applying the same weighting, then calculate the mean squared error (MSE) as the dissimilarity metric.
As for the dynamic adjustment strategy of vector a⃗ , we maintain a reinforcement learning weight environment to record the historical performance of each parameter direction, based on which new movement vectors are probabilistically generated; when optimization efficiency declines, orthogonal constraints are applied to generate new vectors perpendicular to the current direction so as to sustain search momentum; meanwhile, a blacklist–whitelist mechanism is incorporated to avoid repeated and ineffective searches, and local fine-tuning is performed on directions that demonstrate superior performance.
This transforms the problem of "how to adjust parameters so the predicted curve increasingly resembles the experimental curve" into "how to reduce the distance between two points in high-dimensional space."
Research Task
We studied the project described at https://2025.igem.wiki/yau-china/: "An engineered bacterial system based on a dual-plasmid architecture and enzyme-substrate method for detecting Pseudomonas aeruginosa in aquatic environments."
The project requires visible color development between 6 and 12 hours after detection begins. We assumed a target curve shape for the total concentration of the reporter gene gus expression enzyme in the detection system, then used CFS_BHDS_AS to fit the curve, verify dimensional information, and obtain optimal prediction parameters. The parameters corresponding to the "ideal protein" were compared with authoritative database information for BrlR, SoxR, and MexL to identify the most suitable protein.
ODE Model Design
Research Process
1. Hypothesis and Fitting Exploration
1. Growth Kinetics of Pseudomonas aeruginosa and Engineered Bacteria
\[ \frac{dN_{\text{PA}}}{dt}= \mu_{\text{PA}}N_{\text{PA}}\!\left(1-\frac{N_{\text{PA}}}{K_{\text{PA}}}\right) \frac{N}{K_{N,\text{PA}}}-\delta_{\text{PA}}N_{\text{PA}} \] \[ \frac{dN_{\text{EC}}}{dt}= \mu_{\text{EC}}N_{\text{EC}}\!\left(1-\frac{N_{\text{EC}}}{K_{\text{EC}}}\right) \frac{N}{K_{N,\text{EC}}}-\delta_{\text{EC}}N_{\text{EC}} \]Based on the logistic growth framework, intraspecific competition is introduced through the term (1 - N / K), where K_PA and K_EC represent the environmental carrying capacities of Pseudomonas aeruginosa (PA) and the engineered bacteria (EC), respectively, thereby simulating the reality that populations cannot grow indefinitely under resource-limited conditions; simultaneously, the term (N / K_N) directly links the growth rate to the nutrient concentration N (where K_N is the half-saturation constant for nutrient uptake, analogous to the Michaelis constant), ensuring that even if the population density has not reached its upper limit, growth will cease if nutrients are depleted (N→0); in terms of population dynamics, growth is represented by mu * … and natural decay or dilution by - delta * N, reflecting that population dynamics result from the net effect of the two processes of reproduction and death; regarding interspecific relationships, the growth equations for PA and EC have parallel forms but different parameters (such as mu, K, and delta), which not only accounts for inherent differences in the growth characteristics of the two bacterial species but also avoids explicitly modeling direct competition between them (e.g., through competition coefficients), instead implementing indirect competition via a shared nutrient pool N; in short, the core of this model is to simulate the process by which the two bacterial species each undergo logistic growth and continuous decay under conditions of limited nutrients.
2. Synthesis and Degradation of PQS and PYO
\[ \frac{dI_{\text{PQS}}}{dt}= s_{\text{PQS}}\frac{N_{\text{PA}}^{n}}{K_{\text{PA},\text{PQS}}+N_{\text{PA}}^{n}} -d_{\text{PQS}}I_{\text{PQS}} \] \[ \frac{dI_{\text{PYO}}}{dt}= s_{\text{PYO}}\frac{N_{\text{PA}}^{m}}{K_{\text{PA},\text{PYO}}+N_{\text{PA}}^{m}} -d_{\text{PYO}}I_{\text{PYO}} \]At the level of signal molecule synthesis, Hill functions are employed to link the synthesis rate to the pathogen population density N_PA: when N_PA is low, the synthesis rate is slow, and when N_PA is high, the synthesis rate approaches the maximum values s_PQS or s_PYO, thereby simulating the core feature of quorum sensing whereby signal molecules are produced in large quantities only after the population reaches a specific threshold density; by setting Hill coefficients n and m both greater than 1, cooperative effects are introduced such that signal production does not exhibit a simple linear relationship with population density but instead displays a switch-like response, whereby signal output rises sharply once the critical density is exceeded. The term - d * I is adopted to assume that signal molecules degrade or are diluted at a constant rate d; meanwhile, given that PQS and PYO are distinct signaling pathways unique to Pseudomonas aeruginosa, they are modeled separately, laying the foundation for the subsequent construction of a dual-channel detection logic. In brief, the core of this component is that pathogen density controls the production of signal molecules through cooperative effects, while the signal molecules themselves undergo degradation over time.
3. Dual-Plasmid Replication Kinetics
\[ \frac{dP_{\text{MiniCTX}}}{dt}= r_{\text{MiniCTX}}P_{\text{MiniCTX}}\!\left(1-\frac{P_{\text{MiniCTX}}}{P_{\max,\text{MiniCTX}}}\right) \] \[ \frac{dP_{\text{pBlue}}}{dt}= r_{\text{pBlue}}P_{\text{pBlue}}\!\left(1-\frac{P_{\text{pBlue}}}{P_{\max,\text{pBlue}}}\right) \]The model treats the number of plasmids as a dynamically changing variable rather than a fixed constant; in modeling the plasmid replication process, the key design is the adoption of a logistic growth model: intracellular resources such as dNTPs, replication enzymes, and ATP are limited, and plasmid replication consumes these resources; as the number of plasmids increases, resource competition intensifies, which in turn leads to a decline in the replication rate, where P_max represents the maximum number of copies of a particular plasmid that a cell can sustain while maintaining normal physiological activities, reflecting the homeostatic balance within the cell; for the dual-plasmid system, the model assigns independent replication parameters (r, P_max) to each of the two plasmids, a design that reflects their mutual non-interference and effectively enhances the modularity and predictability of the genetic circuit design. In summary: the operation of the genetic circuit depends on the quantity of gene templates (plasmids), and the quantity of templates itself is constrained by the internal resource conditions of the cell.
4. Regulatory Protein Expression
\[ \frac{dX_{\text{PqsR}}}{dt}= k_{\text{PqsR}}P_{\text{pBlue}}\!\left(1-\frac{X_{\text{PqsR}}}{X_{\text{sat},\text{PqsR}}}\right) -(\mu_{\text{EC}}+d_{\text{PqsR}})X_{\text{PqsR}} \] \[ \frac{dX_{\text{SoxR}}}{dt}= k_{\text{SoxR}}P_{\text{pBlue}}\!\left(1-\frac{X_{\text{SoxR}}}{X_{\text{sat},\text{SoxR}}}\right) -(\mu_{\text{EC}}+d_{\text{SoxR}})X_{\text{SoxR}} \]The regulatory proteins PqsR and SoxR are expressed in a constitutive manner. The expression rate is defined by the term k_* * P_pBlue, which explicitly accounts for the gene dosage effect—i.e., a higher copy number of the pBlue plasmid results in a faster rate of regulatory protein production. For protein concentration control, the negative feedback term (1 − X / X_sat) prevents unlimited protein accumulation, stabilizing the concentration near the set value X_sat and thereby avoiding resource waste and potential toxicity. In the protein degradation mechanism, the loss term (mu_EC + d_X) again underscores the decisive influence of growth dilution on intracellular protein concentration; even with high expression intensity, rapid growth of the engineered strain may hinder protein accumulation due to “dilution.” In summary, by continuously producing and maintaining regulatory proteins at a certain level, the system achieves constant readiness for environmental signal detection, with the protein concentration jointly regulated by the plasmid template dosage and the physiological growth state of the host cell.
5. Reporter Gene Expression
\[ \text{Activation}_{P}= \frac{(X_{\text{PqsR}}\,I_{\text{PQS}})^{h}}{K_{\text{pqsA}}^{h}+(X_{\text{PqsR}}\,I_{\text{PQS}})^{h}} \] \[ \text{Activation}_{M}= \frac{(X_{\text{SoxR}}\,I_{\text{PYO}})^{g}}{K_{\text{mexG}}^{g}+(X_{\text{SoxR}}\,I_{\text{PYO}})^{g}} \]In terms of logical implementation, the multiplication logic represented by (X_PqsR × I_PQS) ensures that a significant output is generated only when both the receptor protein X and the signaling molecule I are present simultaneously. If either is absent, the activation level is zero, thereby effectively guaranteeing detection specificity. This prevents both non-specific activation in the presence of signal but absence of receptor and leaky expression in the presence of receptor but absence of signal. Regarding response characteristics, the Hill function (^h / (K^h + ...^h)) transforms the continuous product value into a switch-like response. A high Hill coefficient (h, g > 1) ensures that the activation level remains near 0 (off state) when the complex concentration is below the threshold K, and rapidly jumps to 1 (on state) once the threshold is exceeded. In summary, through the combination of multiplicative operation and cooperative activation, the reporter gene is activated only when both inputs simultaneously exceed their respective thresholds.
\[ \frac{dX_{\text{lacZ}}}{dt}= k_{\text{lacZ}}\,\text{Activation}_{P}\,P_{\text{MiniCTX}}\!\left(1-\frac{X_{\text{lacZ}}}{X_{\text{sat},\text{lacZ}}}\right) -(\mu_{\text{EC}}+d_{\text{lacZ}})X_{\text{lacZ}} \] \[ \frac{dX_{\text{gus}}}{dt}= k_{\text{gus}}\,\text{Activation}_{M}\,P_{\text{MiniCTX}}\!\left(1-\frac{X_{\text{gus}}}{X_{\text{sat},\text{gus}}}\right) -(\mu_{\text{EC}}+d_{\text{gus}})X_{\text{gus}} \]In the regulatory logic governing reporter protein expression, the expression rate is controlled by the activation level activation. In terms of template design, the reporter genes (X_lacZ, X_gus) are encoded on the Mini_CTX plasmid (P_Mini_CTX), whereas the detector proteins (X_PqsR, X_SoxR) are encoded on the pBlue plasmid (P_pBlue). This physical separation of templates is a key aspect of the modular design, avoiding the problem of excessive size for a single plasmid while enabling independent adjustment of copy numbers for different modules. In summary, the expression of the reporter gene is the output of the genetic circuit’s logic operations, modulated by the copy number of its respective plasmid, and ultimately produces a detectable signal (enzyme activity).
6. Substrate Metabolism Pathways
\[ \frac{d\text{oNpG}_{\text{int}}}{dt}= k_{\text{uptake},\text{oNpG}}\,\text{oNpG}_{\text{ext}}\!\left(1-\frac{\text{oNpG}_{\text{int}}}{\text{oNpG}_{\text{sat}}}\right) -k_{\text{cat},\text{lacZ}}X_{\text{lacZ}}\,\text{oNpG}_{\text{int}} \] \[ \frac{d\text{MuG}_{\text{int}}}{dt}= k_{\text{uptake},\text{MuG}}\,\text{MuG}_{\text{ext}}\!\left(1-\frac{\text{MuG}_{\text{int}}}{\text{MuG}_{\text{sat}}}\right) -k_{\text{cat},\text{gus}}X_{\text{gus}}\,\text{MuG}_{\text{int}} \]used to capture saturation behavior—transport rate gradually decreases as the intracellular substrate concentration rises, thereby simulating both the efficiency limitations of the transporter protein and potential feedback inhibition mechanisms. In the intracellular substrate consumption step, the term k_cat_* * X_* * S_int represents the process by which the reporter enzyme converts the intracellular substrate into product. From the perspective of overall dynamic equilibrium, the equation describes the temporal evolution of the intracellular substrate concentration, whose final steady-state value is jointly determined by the rates of substrate uptake and consumption.
\[ \frac{dY_{\text{lacZ}}}{dt}= k_{\text{cat},\text{lacZ}}X_{\text{lacZ}}\,\text{oNpG}_{\text{int}} \] \[ \frac{dY_{\text{gus}}}{dt}= k_{\text{cat},\text{gus}}X_{\text{gus}}\,\text{MuG}_{\text{int}} \]In terms of the relationship between product and substrate, the model is based on the explicit assumption that one molecule of substrate yields one molecule of colored product, thereby directly equating the accumulated amount of product Y with the intensity of the chromogenic signal. Regarding the driving factors of product formation, the production rate is proportional to the amount of functional reporter enzyme X_*, directly linking the output of the upstream genetic circuit (reporter protein concentration) to the final observable signal (product concentration). It is also proportional to the intracellular substrate concentration S_int, reflecting a first-order kinetic simplification that implicitly assumes S_int << K_m (the intracellular substrate concentration is much lower than the enzyme’s Michaelis constant). This assumption results in a linear dependence of the reaction rate on S_int, simplifying model analysis but potentially overestimating the reaction rate at high substrate concentrations. In summary, through the enzyme-catalyzed reaction, the output of the upstream genetic circuit (reporter enzyme concentration) is linearly translated into the final measurable signal (colored product concentration), with the overall pathway rate jointly limited by substrate uptake efficiency and enzyme catalytic efficiency.
7. Nutrient Kinetics
\[ \frac{dN}{dt}= -\Bigl[ \mu_{\text{PA}}N_{\text{PA}}+\mu_{\text{EC}}N_{\text{EC}} +\eta_{P}\!\bigl(\max\!\bigl(0,\tfrac{dP_{\text{MiniCTX}}}{dt}\bigr) +\max\!\bigl(0,\tfrac{dP_{\text{pBlue}}}{dt}\bigr)\bigr) +\eta_{X}\!\bigl(\max\!\bigl(0,\tfrac{dX_{\text{PqsR}}}{dt}\bigr) +\max\!\bigl(0,\tfrac{dX_{\text{lacZ}}}{dt}\bigr) +\max\!\bigl(0,\tfrac{dX_{\text{gus}}}{dt}\bigr) +\max\!\bigl(0,\tfrac{dX_{\text{SoxR}}}{dt}\bigr)\bigr) \Bigr] \]Nutrient N is abstracted as a unified, consumable resource pool (e.g., carbon source, nitrogen source, ATP, etc.). Nutrient consumption is precisely partitioned among four major processes: biomass growth (–mu_PA * N_PA – mu_EC * N_EC) represents the most direct and dominant consume pathway, where faster population growth accelerates nutrient depletion; macromolecular synthesis (–η_P × dP – η_X × dX) is a key and sophisticated design explicitly modeling the operating costs of the genetic circuit, including nutrient consumption for plasmid replication (η_P × dP) and protein expression (η_X × dX). The pmax(0, ...) function ensures that nutrients are consumed only during net synthesis, with degradation not returning nutrients to the pool (a simplifying assumption). Through competition for the shared resource N, population growth, signal production, gene expression, and metabolic activity are mutually coupled. For instance, excessive genetic circuit activity can deplete nutrients, thereby limiting bacterial growth; conversely, rapid bacterial proliferation can exhaust nutrients and compromise circuit performance. This introduces a survival–function trade-off, enabling the simulation of more realistic biological behaviors. In summary, via a shared, consumable nutrient pool, all energy-consuming processes within the system are tightly interconnected, effectively mimicking the authentic intracellular strategies of resource competition and allocation.
8. Color Development Efficiency Sensitivity Curves
\[ \frac{d\text{Active}\_Y_{\text{lacZ}}}{dt}= \frac{k_{\text{cat},\text{lacZ}}X_{\text{lacZ}}\,\text{oNpG}_{\text{int}}}{(N_{\text{EC}}+10^{-10})(N_{\text{PA}}+10^{-10})} \] \[ \frac{d\text{Active}\_Y_{\text{gus}}}{dt}= \frac{k_{\text{cat},\text{gus}}X_{\text{gus}}\,\text{MuG}_{\text{int}}}{(N_{\text{EC}}+10^{-10})(N_{\text{PA}}+10^{-10})} \]The numerator represents the production rate of product Y (i.e., the absolute signal intensity), while the denominator is the product of the quantities of the two bacterial populations. The resulting ratio denotes the “chromogenic rate contributed per unit of engineered bacteria and per unit of pathogen.” In terms of design intent, the primary objective is to evaluate system cooperativity—this metric quantifies the detection efficiency when both bacterial populations coexist, avoiding misjudgments where a high pathogen count (N_PA) yields a strong absolute signal but reflects low actual detection efficiency. The aim is to identify the ideal scenario in which the engineered bacteria can still achieve efficient detection even when the pathogen population is not large. To prevent computational failure caused by zero population sizes at the initial stage of the simulation, a numerical technique of (N + 1e-10) is adopted to avoid division-by-zero errors.
\[ \frac{d\text{Active}\_Y_{\text{LG}}}{dt}= \sqrt{\max\!\bigl(0,\tfrac{d\text{Active}\_Y_{\text{lacZ}}}{dt}\bigr) \cdot\max\!\bigl(0,\tfrac{d\text{Active}\_Y_{\text{gus}}}{dt}\bigr)} \]A geometric mean is employed, such that the overall performance is high only when both channels achieve high individual performance; if either channel’s performance approaches zero, the overall performance will also tend toward zero. This design directly serves the objective of the “dual-plasmid, dual-channel” system, compelling the optimization algorithm to search for a parameter space that can simultaneously and efficiently activate both the lacZ and gus reporter systems. This, in turn, enhances the system’s reliability and anti-interference capability while avoiding potential false-positive issues that may occur in a single-channel configuration. The pmax(0, ...) function is used to ensure the value inside the square root remains non-negative; however, this “hard” treatment may mask non-physical negative values generated by the model under certain parameter combinations. In summary, an efficiency metric that emphasizes dual-channel cooperative operation is defined to guide system parameter optimization, ensuring the final engineered bacterial detection system is both highly efficient and reliable.
9. Parameter List
(Please refer to Appendix 1 for the parameter list.)
Based on general experience in molecular biology, for a clear visible observation of blue precipitation or color change in liquid culture media (with a short optical path length), the concentration of the Y_gus product typically needs to reach the order of 1–10 µM when visualized under an ultraviolet lamp. [3] We chose 5 µM as the threshold, simulated an ideal experimental curve, and used CFS_BHDS_AS for fitting and prediction.

2. Parameter Behavior and Fitting Analysis
Principal Component Analysis (PCA) can jointly synthesize multiple variables when they are all quantitative, so as to describe the set of individuals defined by these variables— which are the subject of descriptive research— in the best possible way.[4]For a tightly coupled ODE model, processing parameter variation information using principal component analysis (PCA) is an excellent choice.
Principal Component Analysis (PCA) of parameter changes during iterations revealed that all SoxR-related parameters correlated positively with PC1, indicating that increasing the "overall strength" of the SoxR pathway was necessary to achieve visible coloration.Variables close to the circle are well-represented; those near the center are poorly represented. Orthogonal radii and variables close to the circle are independent, while variables that are diametrically opposite on the circle show a negative correlation. [5]

Both SoxR expression rate k_SoxR and degradation rate d_SoxR increased, suggesting the protein needs both higher expression and faster turnover.

Discussion
Fitting results indicate that to achieve the desired "earlier rise and higher saturation" of the Y_gus curve, both expression and degradation parameters for the SoxR protein must increase.
Protein Name | Length | Molecular Weight |
---|---|---|
BrlR | 270 aa | 29 700 Da |
SoxR | 156 aa | 17 160 Da |
MexL | 212 aa | 23 320 Da |
Table 1: The corresponding protein lengths were obtained from NCBI searches, and the molecular weights were estimated using an average molecular weight of amino acids ≈ 110 Da (including water loss from peptide bonds).
Among BrlR, SoxR, and MexL, SoxR has the lowest molecular weight (17,160 Da), suggesting it is easier to express and degrade. Therefore, based on dry-lab fitting results, SoxR is the preferred choice.

To evaluate the real-world performance of the proteins BrlR, SoxR and MexL, the complete sensing modules—SoxR–PmexG, MexL–PphzA1 and BrlR–PbrlR—were introduced into engineered bacteria. Cells were challenged with identical concentrations of PYO, and the resulting fluorescence was quantified to assess sensitivity and reliability.
As shown in Figure 4D, the SoxR–PmexG circuit delivered the strongest fluorescence signal together with the lowest background, exhibiting the best overall sensitivity and robustness. In the absence of PYO, the MexL–PphzA1 circuit displayed a basal fluorescence that decreased upon PYO addition (dissolved in DMSO), a typical repression behaviour; however, the reduction was modest. The BrlR–PbrlR circuit produced the weakest reporter output, presumably owing to poor protein expression or low transcriptional activation efficiency.
ODE-based verification indicated that the regulatory protein SoxR possesses the smallest molecular weight among the candidates, endowing it with rapid expression and turnover—an attribute aligned with the requirement for high regulator recycling. After cross-checking against the NCBI database, these wet-lab results confirm that soxR is the optimal regulatory gene for experimental implementation. The superior fluorescence observed with the soxR-regulated pathway, relative to the mexL and brlR alternatives, provides strong empirical support for the model-guided experimental strategy.
References
[1] Dorešić D, Grein S, Hasenauer J. Efficient parameter estimation for ODE models of cellular processes using semi-quantitative data. Bioinformatics. 2024 Jun 28;40(Suppl 1):i558-i566. doi: 10.1093/bioinformatics/btae210. PMID: 38940161; PMCID: PMC11211815.
[2] Dastin-van Rijn EM, Widge AS. Failure modes and mitigations for Bayesian optimization of neuromodulation parameters. J Neural Eng. 2025 Jun 13;22(3):036038. doi: 10.1088/1741-2552/ade189. PMID: 40472872; PMCID: PMC12164532.
[3] Jain N, Kaur N. A Naked Eye and Fluorescence Turn-on Sensor for Smartphone Assisted Solid and Solution Phase Sensing of Highly Toxic Triphosgene. J Fluoresc. 2024 Mar;34(2):935-943. doi: 10.1007/s10895-023-03328-7. Epub 2023 Jul 11. PMID: 37432582.
[4] Ben Salem K, Ben Abdelaziz A. Principal Component Analysis (PCA). Tunis Med. 2021 Avril;99(4):383-389. English. PMID: 35244921; PMCID: PMC8734479.
[5] Ben Salem K, Ben Abdelaziz A. Principal Component Analysis (PCA). Tunis Med. 2021 Avril;99(4):383-389. English. PMID: 35244921; PMCID: PMC8734479.
State Variable Name | Function Description | Dimensional Info | Initial Value |
---|---|---|---|
P_Mini_CTX | Number of Mini-CTX plasmids | [P] | 2 |
P_pBlue | Number of pBlue plasmids | [P] | 1 |
X_PqsR | Concentration of PqsR regulatory protein | [C] | 0.5 |
X_lacZ | Concentration of lacZ reporter protein | [C] | 0.2 |
X_gus | Concentration of gus reporter protein | [C] | 0.1 |
X_SoxR | Concentration of SoxR regulatory protein | [C] | 0.1 |
I_PQS | Concentration of PQS signaling molecule | [C] | 0.1 |
I_PYO | Concentration of PYO signaling molecule | [C] | 0.05 |
N | Nutrient concentration | [N] | 2.5 |
oNpG_int | Intracellular oNpG substrate concentration | [S] | 0 |
MuG_int | Intracellular MuG substrate concentration | [S] | 0 |
Y_lacZ | Product concentration of lacZ pathway | [S] | 0 |
Y_gus | Product concentration of gus pathway | [S] | 0 |
N_PA | Pseudomonas aeruginosa concentration | [X] | 0.5 |
N_EC | Engineered bacteria concentration | [X] | 0.3 |
dActive_Y_lacZ | Color development efficiency of lacZ path | [S][T]⁻¹[X]⁻² | 0.1 |
dActive_Y_gus | Color development efficiency of gus path | [S][T]⁻¹[X]⁻² | 0.1 |
dActive_Y_LG | Combined color development efficiency | [S][T]⁻¹[X]⁻² | 0.1 |
Parameter Name | Function Description | Value | Units |
---|---|---|---|
mu_PA | Max growth rate of Pseudomonas aeruginosa | 0.3 | T⁻¹ |
K_PA | Carrying capacity of Pseudomonas aeruginosa | 0.6 | X |
delta_PA | Death rate of Pseudomonas aeruginosa | 0.005 | T⁻¹ |
K_N_PA | Nutrient half-saturation constant for PA | 0.1 | N |
mu_EC | Max growth rate of engineered bacteria | 0.15 | T⁻¹ |
K_EC | Carrying capacity of engineered bacteria | 0.5 | X |
delta_EC | Death rate of engineered bacteria | 0.02 | T⁻¹ |
K_N_EC | Nutrient half-saturation constant for EC | 0.1 | N |
s_PQS | Max synthesis rate of PQS | 0.1 | C·T⁻¹ |
K_PA_PQS | Half-saturation constant for PQS synthesis | 0.3 | X |
n | Hill coefficient for PQS synthesis | 2 | — |
d_PQS | Degradation rate of PQS | 0.04 | T⁻¹ |
s_PYO | Max synthesis rate of PYO | 0.15 | C·T⁻¹ |
K_PA_PYO | Half-saturation constant for PYO synthesis | 0.4 | X |
m | Hill coefficient for PYO synthesis | 2 | — |
d_PYO | Degradation rate of PYO | 0.03 | T⁻¹ |
r_Mini_CTX | Replication rate of Mini-CTX plasmid | 0.08 | T⁻¹ |
P_max_Mini_CTX | Max copy number of Mini-CTX plasmid | 20 | P |
r_pBlue | Replication rate of pBlue plasmid | 0.05 | T⁻¹ |
P_max_pBlue | Max copy number of pBlue plasmid | 10 | P |
k_PqsR | Expression rate constant of PqsR protein | 0.1 | P⁻¹·T⁻¹ |
X_sat_PqsR | Saturation concentration of PqsR protein | 30 | C |
d_PqsR | Degradation rate of PqsR protein | 0.05 | T⁻¹ |
h | Hill coefficient for PqsR-PQS complex | 3 | — |
K_pqsA | Activation constant of pqsA promoter | 0.5 | C |
k_lacZ | Expression rate constant of lacZ | 0.15 | P⁻¹·T⁻¹ |
X_sat_lacZ | Saturation concentration of lacZ | 60 | C |
d_lacZ | Degradation rate of lacZ | 0.02 | T⁻¹ |
k_SoxR | Expression rate constant of SoxR protein | 0.1 | P⁻¹·T⁻¹ |
X_sat_SoxR | Saturation concentration of SoxR protein | 20 | C |
d_SoxR | Degradation rate of SoxR protein | 0.05 | T⁻¹ |
g | Hill coefficient for SoxR-PYO complex | 2 | — |
K_mexG | Activation constant of mexG promoter | 0.4 | C |
k_gus | Expression rate constant of gus | 0.1 | P⁻¹·T⁻¹ |
X_sat_gus | Saturation concentration of gus | 50 | C |
d_gus | Degradation rate of gus | 0.01 | T⁻¹ |
k_uptake_oNpG | Uptake rate of oNpG substrate | 0.15 | T⁻¹ |
oNpG_ext | External oNpG substrate concentration | 10 | S |
oNpG_sat | Saturation concentration of internal oNpG | 10 | S |
k_cat_lacZ | Catalytic constant of lacZ enzyme | 0.2 | C⁻¹·T⁻¹ |
k_uptake_MuG | Uptake rate of MuG substrate | 0.08 | T⁻¹ |
MuG_ext | External MuG substrate concentration | 10 | S |
MuG_sat | Saturation concentration of internal MuG | 10 | S |
k_cat_gus | Catalytic constant of gus enzyme | 0.05 | C⁻¹·T⁻¹ |
eta_P | Nutrient cost coefficient for plasmid replication | 0.02 | — |
eta_X | Nutrient cost coefficient for protein expression | 0.005 | — |