This model focuses on constructing a dose-response curve between arsenic (NaAsO₂) concentration and fluorescence intensity of the engineered biosensor. Its key roles in the project are:
1. Quantify sensor performance: Establish a mathematical relationship between arsenic concentration (independent variable) and normalized fluorescence intensity (dependent variable, a.u./OD₆₀₀), and identify the arsenic concentration threshold that triggers a significant fluorescence surge (critical response point).
2. Define effective detection range: Clarify the concentration interval where the sensor can stably and accurately respond to arsenic, providing a quantitative basis for subsequent field application (e.g., matching with rice paddy arsenic safety standards).
Without this mathematical model, we cannot:
1. Quantify the correlation between arsenic concentration and fluorescence signal (e.g., how much fluorescence increases with 1 μM arsenic addition);
2. Determine the sensor's critical parameters (e.g., EC₅₀, effective detection range);
3. Evaluate whether the sensor meets practical needs (e.g., whether it can detect arsenic concentrations below the national safety limit for farmland soil).
This would directly hinder the assessment of the sensor's core performance and limit its translation from laboratory to field.
To ensure the model's biological rationality and fitting reliability, the following assumptions were established based on experimental logic and literature:
Assumption 1: Monotonic response of fluorescence to arsenic concentration - The fluorescence intensity (normalized by OD₆₀₀) shows a non-decreasing trend with increasing arsenic concentration.
Rationale: The sensor's core mechanism is "arsenic-induced release of ArsR repression → GFP expression upregulation"; a non-monotonic trend (e.g., fluorescence decline at high arsenic concentrations) would imply non-specific interference (e.g., cell toxicity), so the model would no longer reflect the true dose-response relationship.
Assumption 2: No significant interference from non-target factors - The fluorescence signal is primarily regulated by arsenic concentration, not by other variables (e.g., interfering ions, metabolites, or cell growth differences).
Rationale: If factors like Pb²⁺, Hg²⁺, or pH affect fluorescence, the model cannot distinguish "arsenic-induced signals" from "interference signals," leading to inaccurate parameter estimation.
Assumption 3: Complete coverage of response phases - The tested arsenic concentration range includes the three key phases of the sensor's response: baseline, steep increase, and saturation.
Rationale: A narrow concentration range (e.g., failing to reach saturation or lacking a steep segment) will result in incomplete fitting of the S-shaped curve, leading to biased estimates of EC₅₀ and Hill coefficient (n).
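Assumption 1 can be screened before any fitting. The following is a minimal Python sketch, not the team's R code: the function name `is_non_decreasing`, the `tolerance` parameter, and the example readings are all hypothetical and chosen only for illustration.

```python
from statistics import mean

def is_non_decreasing(conc_to_replicates, tolerance=0.0):
    """Return True if the mean normalized fluorescence never drops by more
    than `tolerance` between consecutive (sorted) arsenic concentrations."""
    concentrations = sorted(conc_to_replicates)
    means = [mean(conc_to_replicates[c]) for c in concentrations]
    return all(later >= earlier - tolerance
               for earlier, later in zip(means, means[1:]))

# Made-up illustrative values (a.u./OD600), NOT experimental data.
# Keys are concentrations in uM; values are three replicate readings.
example = {
    0: [100, 110, 105],
    10: [150, 160, 155],
    50: [400, 420, 410],
    100: [390, 400, 395],  # slight drop at the highest concentration
}

strict = is_non_decreasing(example)                  # no drop allowed
lenient = is_non_decreasing(example, tolerance=20)   # allow replicate-level noise
```

A non-zero tolerance is one possible way to separate measurement noise from a real decline at high concentrations; a suitable value would have to come from the replicate spread.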
The model is derived from the Hill equation, a classic mathematical tool for describing ligand-receptor binding and dose-response relationships in biological systems. It is suitable for our sensor because the "ArsR-arsenic binding → Pars promoter activation → GFP expression" process follows a cooperative response (consistent with the Hill equation's core assumption of synergistic effects).
We used a mathematically equivalent form of the Hill equation for non-linear fitting, which balances biological interpretability and computational stability:
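The equation itself does not appear in this text. A four-parameter Hill form consistent with the parameters defined in this section (a as baseline, c as EC₅₀, n as Hill coefficient), under the assumption that b is the upper asymptote rather than the amplitude above baseline, would be:

```latex
y = a + \frac{(b - a)\,x^{n}}{c^{n} + x^{n}}
```

Under this assumed form, y = a at x = 0, y = (a + b)/2 at x = c, and y approaches b as x grows large.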
Linearized form (for code implementation, avoiding numerical instability caused by division):
| Variable | Symbol | Unit | Practical Meaning | Value Source |
|---|---|---|---|---|
| Independent variable | x | μM | NaAsO₂ concentration | Experimental selection |
| Dependent variable | y | a.u./OD₆₀₀ | Fluorescence intensity (normalized by cell density) | Experimental data |
| Parameter | a | a.u./OD₆₀₀ | Baseline response | Model fitting |
| Parameter | b | a.u./OD₆₀₀ | Maximum response | Model fitting |
| Parameter | c | μM | EC₅₀, half-maximal effective concentration | Model fitting |
| Parameter | n | dimensionless | Hill coefficient (exponent) | Model fitting |
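To make the roles of a, b, c, and n concrete, here is a small Python sketch that evaluates a four-parameter Hill curve. It assumes that b is the upper asymptote and that the curve has the form y = a + (b − a)·xⁿ/(cⁿ + xⁿ); the function name `hill` and the parameter values are hypothetical, not fitted results. The ratio (c/x)ⁿ is computed in log space, which is one common way to keep the evaluation numerically stable.

```python
import math

def hill(x, a, b, c, n):
    """Four-parameter Hill response.

    a: baseline, b: upper asymptote (assumed), c: EC50, n: Hill coefficient.
    """
    if x <= 0:
        return a  # no inducer: baseline response
    # (c / x) ** n computed in log space to avoid overflow in x**n or c**n
    exponent = n * (math.log(c) - math.log(x))
    if exponent > 700:
        # exp() would overflow; the response is indistinguishable from baseline
        return a
    return a + (b - a) / (1.0 + math.exp(exponent))

# Hypothetical parameters for illustration only, not fitted values.
params = dict(a=100.0, b=1000.0, c=50.0, n=2.0)

y_zero = hill(0, **params)     # baseline
y_ec50 = hill(50, **params)    # halfway between a and b
y_high = hill(1e9, **params)   # approaches b
```

With these values the curve sits at the baseline for x = 0, at the midpoint (a + b)/2 for x = c, and close to b at very high concentrations.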
Data input: Experimental data of "arsenic concentration (x) → normalized fluorescence (y)" (3 biological replicates per concentration).
Algorithm: Non-linear least squares method (minimizes the sum of squared residuals between predicted and experimental y-values).
Reliability evaluation: pseudo-R² of the fit, together with the trend of the fitted curve.
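The least-squares objective and a pseudo-R² can be written in a few lines. This Python sketch assumes pseudo-R² is defined as 1 − SS_res/SS_tot, which is one common definition; the text does not state which definition was used, and the function names and numbers below are hypothetical.

```python
from statistics import mean

def sum_squared_residuals(observed, predicted):
    """Objective minimized by non-linear least squares."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def pseudo_r_squared(observed, predicted):
    """1 - SS_res / SS_tot (assumed definition of pseudo-R^2)."""
    ss_res = sum_squared_residuals(observed, predicted)
    y_bar = mean(observed)
    ss_tot = sum((o - y_bar) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up values (a.u./OD600) for illustration, NOT experimental data.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

ssr = sum_squared_residuals(observed, predicted)
r2 = pseudo_r_squared(observed, predicted)
```

For non-linear models this quantity is only a descriptive summary, so it is best read alongside the curve shape rather than as a standalone test.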
Software: R version 4.4.2 (open-source, widely used in biological data analysis)
Core packages: drc (for dose-response curve fitting), ggplot2 (for result visualization)
Core code: See Figure 1
Multiple rounds of fitting were conducted for different sensor variants (DH5αV3, PJ101). The pseudo-R² and curve trends indicated that some parameter combinations met the reliability criteria, while others required experimental adjustment. Selected results are shown in Figures 2.1-2.3.
| Curve ID | Tested Concentration Range | Fitting Trend & Interpretation |
|---|---|---|
| Figure 2.1 (DH5αV3) | 0~800 μM | Fluorescence reached saturation at ~150 μM; concentrations >150 μM provided no new information. Suggestion: Narrow the range to 0~150 μM for subsequent experiments. |
| Figure 2.2 (DH5αV3) | 0~200 μM | Fluorescence showed a linear trend (no S-shape). Possible cause: a sensor construction failure (e.g., inefficient ArsR-Pars binding) rather than a concentration-range issue. |
| Figure 2.3 (PJ101) | 0~150 μM | Typical S-shaped curve: baseline (0~10 μM), steep increase (10~100 μM), saturation (>100 μM). Conclusion: 0~150 μM is the effective detection range for the PJ101 variant. |
All variants showed high baseline fluorescence at 0 μM NaAsO₂ (no arsenic). This indicates residual leaky expression of the sensor's GFP reporter, which may reduce detection specificity (e.g., false positives in low-arsenic samples). Follow-up optimization should focus on reducing this leakage (consistent with the project's "AND-gate circuit" design goal).
The model currently only provides a preliminary evaluation of the sensor's dose-response relationship. Key limitations include:
Limitation 1: Lack of detection limit (LOD) and quantification limit (LOQ) - LOD (minimum detectable arsenic concentration) and LOQ (minimum quantifiable concentration) are critical for practical application but were not calculated.
Limitation 2: No interference resistance assessment - Actual rice paddy soil/water contains interfering substances (e.g., heavy metals Pb²⁺, Hg²⁺, Cu²⁺; anions Cl⁻, SO₄²⁻), but the model does not verify whether these affect fluorescence signals.
Limitation 3: Ignoring environmental factor effects - Paddy fields have variable pH (6.2~6.8) and temperature (20~30°C), but the model does not test how these factors influence baseline fluorescence or EC₅₀.
Limitation 4: Incomplete signal quality evaluation - Metrics like signal-to-noise ratio (S/N) and signal stability (e.g., fluorescence drift over time) were not included, which are essential for on-site detection.
To address the above limitations and improve the model's practical value, we propose the following optimizations:
Optimization 1: Calculate LOD and LOQ - Use the 3σ (LOD) and 10σ (LOQ) method (σ = standard deviation of baseline fluorescence) after re-fitting with a narrower concentration range (0~50 μM).
Optimization 2: Add interference resistance testing - Include gradient concentrations of interfering ions in experiments, and modify the model to include an "interference term" to quantify their impact.
Optimization 3: Incorporate environmental factors - Design orthogonal experiments (pH × temperature) to collect data, and use a multi-variable Hill model to analyze how these factors affect key parameters (a, c, n).
Optimization 4: Evaluate signal quality - Add S/N calculation (S/N = (signal fluorescence - baseline fluorescence)/baseline fluorescence standard deviation) and long-term stability testing (fluorescence measurement every 2 hours for 12 hours) to the model output.
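The calculations in Optimizations 1 and 4 can be sketched as follows. This is a Python illustration under stated assumptions, not part of the current model: σ is taken as the sample standard deviation of blank (0 μM) replicates, the LOD and LOQ signal thresholds are taken as blank mean + 3σ and blank mean + 10σ, and they are converted to concentrations by inverting a Hill curve of the assumed form y = a + (b − a)·xⁿ/(cⁿ + xⁿ). All function names and numbers are hypothetical.

```python
from statistics import mean, stdev

def detection_thresholds(blank_readings):
    """Signal thresholds from blank replicates:
    mean + 3*sigma (LOD) and mean + 10*sigma (LOQ)."""
    mu, sigma = mean(blank_readings), stdev(blank_readings)
    return mu + 3 * sigma, mu + 10 * sigma

def concentration_at(y, a, b, c, n):
    """Invert the assumed Hill curve for x; valid only for a < y < b."""
    if not a < y < b:
        raise ValueError("signal must lie strictly between baseline and maximum")
    return c * ((y - a) / (b - y)) ** (1.0 / n)

def signal_to_noise(signal, blank_readings):
    """S/N = (signal - baseline mean) / baseline standard deviation."""
    return (signal - mean(blank_readings)) / stdev(blank_readings)

# Made-up blank readings and Hill parameters, NOT experimental or fitted values.
blank = [98.0, 100.0, 102.0]
a, b, c, n = 100.0, 1000.0, 50.0, 2.0

lod_signal, loq_signal = detection_thresholds(blank)
lod_conc = concentration_at(lod_signal, a, b, c, n)
loq_conc = concentration_at(loq_signal, a, b, c, n)
snr = signal_to_noise(150.0, blank)
```

Because the inversion depends on the fitted a, b, c, and n, the resulting LOD and LOQ concentrations are only as reliable as the fit in the low-concentration region, which is why re-fitting over a narrower range is proposed first.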