From AI-Driven Design to Biosensor-Based Characterization: Systems Modeling of DTE Enzymes Enabled by Pepper-HBC Fluorescent RNA
Content Overview
Because traditional directed evolution cannot rapidly meet the demands of DTE enzyme optimization, the dry lab team used artificial intelligence for de novo design of D-tagatose-3-epimerase (DTE), aiming to discover novel enzyme variants with potential advantages. By integrating LigandMPNN sequence redesign, AlphaFold3 structure prediction, molecular docking, and molecular dynamics simulations, we achieved rapid iteration and functional optimization at the computational level, and then validated the catalytic activity of the designed variants in wet-lab experiments. To our knowledge, this work is the first implementation of AI-driven global sequence redesign for a DTE enzyme, and it yielded engineered enzymes with novel structural features and functional potential. Additionally, based on the screening and characterization of DTE, we built a systematic model of the Pepper-HBC fluorescent RNA biosensor, comprising the Catalytic Circuit Model, the Characterization Circuit Model, the Population Dynamics Model, and the Half-life System Integration Model, to facilitate testing of the entire system.
Models
- The Catalytic Circuit Model
- The Characterization Circuit Model
- The Population Dynamics Model
- The Half-life System Integration Model
1. Design
Objective: To carry out a de novo redesign of the DTE sequence while
preserving its catalytic activity.
1.1 Definition of Key Functional Regions
We first analyzed the structure of the wild-type DTE (PDB: 2OU4) to
delineate regions that must remain unchanged, establishing these as
structural constraints for our design strategy (Table 1).
D-Tagatose-3-epimerase (DTE) is the key enzyme that catalyzes the C3
epimerization of various keto-sugars and is widely exploited for
rare-sugar biosynthesis. In this study, the DTE from Pseudomonas
cichorii (PDB: 2OU4) was chosen as the design template. The enzyme
adopts a canonical metal-dependent (β/α)₈-barrel fold; its active site
contains a Mn²⁺ ion coordinated by four strictly conserved residues
(Glu152, Asp185, His211, Glu246) that mediate substrate
deprotonation/reprotonation. In addition, the native 2OU4 structure
forms a stable homodimer; the catalytic pocket is located at the dimer
interface and displays an open, promiscuous substrate-binding character,
enabling efficient C3 epimerization of both D-tagatose and D-fructose.
DTE_2OU4 sequence:
MNKVGMFYTYWSTEWMVDFPATAKRIAGLGFDLMEISLGEFHNLSDAKKRELKAVADDLGLTVMCCIGLKSEYDFASPDKSVRDAGTEYVKRLLDDCHLLGAPVFAGL
TFCAWPQSPPLDMKDKRPYVDRAIESVRRVIKVAEDYGIIYALEVVNRFEQWLCNDAKEAIAFADAVDSPACKVQLDTFHMNIEETSFRDAILACKGKMGHFHLGEAN
RLPPGEGRLPWDEIFGALKEIGYDGTIVMEPFMRKGGSVSRAVGVWRDMSNGATDEEMDERARRSLQFVRDKLA
DOI:10.1016/j.jmb.2007.09.033;10.1002/cbic.201402620
Table 1. Key Functional Regions and Design Strategy for DTE
| Region Category | Residue ID (chain A) | Mutability |
|---|---|---|
| Catalytic Core | A152/A185/A211/A246 | Fixed |
| Dimer Interface | A103-263 (key residues) | Fixed |
| Hydrophobic Core | A30/50/100/120/150/180/200/230/250/280 | Mutable |
| Surface Charge | A10/20/60/90/130/170/220/260/290 | Mutable |
| Remaining 90% | All remaining residues | Mutable |
1.2 Sequence Redesign (LigandMPNN)
Using 2OU4 as the template, we performed structure-based sequence
prediction with LigandMPNN. The Mn²⁺-coordinating residues
A152/A185/A211/A246 and key dimer-interface positions were declared
immutable; the remaining ~90 % of the residues were designated
redesignable. Residue-specific biases were applied to reinforce
Val/Ile/Leu in the hydrophobic core and Asp/Glu/Lys/Arg on
surface-charge patches, while all free cysteines were globally excluded
to avoid oxidation risk. Symmetry constraints enforced synchronous
optimization of both chains, and a ligand-aware side-chain packing
algorithm prevented steric clashes with the metal ion or substrate.
After iterative refinement within the Renew framework, 32×2 sequences
were sampled and rapidly filtered with Expasy ProtParam, yielding the
most stable variants DTE-1 and DTE-2.
2. Build
Objective: To computationally predict and evaluate the designed proteins
before wet-lab experiments.
2.1 Physicochemical Property Prediction (Expasy ProtParam)
We used the Expasy ProtParam web server to rapidly estimate basic
theoretical parameters (isoelectric point, molecular weight,
instability index, aliphatic index, and grand average of hydropathicity,
GRAVY) for every newly generated sequence. This allowed us to discard
variants whose predicted instability index was excessively high
(Table 2). Relative to wild-type, DTE-1 and DTE-2 show a drop in
instability index of about 15-16 points (roughly 30 %, from 50.98 to
35.51 and 34.94) and a 27-31 % increase in aliphatic index, while pI and
molecular weight remain virtually unchanged. These metrics suggest that
the variants may possess improved in vitro stability and
thermostability, providing a promising scaffold for subsequent
functional characterization.
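The aliphatic index and GRAVY follow simple closed-form definitions, so they can be recomputed locally as a sanity check on web-server output. The sketch below is a minimal pure-Python version, assuming the standard Kyte-Doolittle hydropathy scale and the usual aliphatic-index coefficients (2.9 for Val, 3.9 for Ile and Leu). The function names and the example peptide are illustrative and not part of the original workflow; the instability index is left out because it needs a dipeptide weight table.

```python
# Kyte-Doolittle hydropathy values, used for GRAVY.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical short peptide, used only to show the call pattern.
example = "MKVLAAGIVE"
print(f"GRAVY = {gravy(example):.3f}")
print(f"Aliphatic index = {aliphatic_index(example):.2f}")
```

Running the same functions on the full wild-type and redesigned sequences should land close to the ProtParam values, which makes it easy to screen many sampled sequences offline before submitting the shortlist to the server.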
Table 2. Predicted physicochemical parameters of the designed enzymes
| Protein id | Theoretical pI | Molecular weight (Da) | Instability index | Aliphatic index | Grand average of hydropathicity (GRAVY) |
|---|---|---|---|---|---|
| 2OU4 | 5.21 | 65081.53 | 50.98 | 79.55 | -0.234 |
| DTE-1 | 5.21 | 64447.74 | 35.51 | 101.35 | -0.079 |
| DTE-2 | 5.16 | 64448.91 | 34.94 | 104.35 | -0.041 |
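The shifts relative to wild-type can be read directly off Table 2. As a small worked check, the sketch below computes the absolute and the relative change for each variant; the dictionary simply transcribes the Table 2 values, and the helper name is hypothetical.

```python
# Values transcribed from Table 2.
TABLE2 = {
    "2OU4": {"instability": 50.98, "aliphatic": 79.55},
    "DTE-1": {"instability": 35.51, "aliphatic": 101.35},
    "DTE-2": {"instability": 34.94, "aliphatic": 104.35},
}


def change_vs_wt(metric: str, variant: str, wt: str = "2OU4"):
    """Return (absolute change, relative change in percent) versus wild-type."""
    ref = TABLE2[wt][metric]
    val = TABLE2[variant][metric]
    return val - ref, 100.0 * (val - ref) / ref


for variant in ("DTE-1", "DTE-2"):
    for metric in ("instability", "aliphatic"):
        absolute, relative = change_vs_wt(metric, variant)
        print(f"{variant} {metric}: {absolute:+.2f} ({relative:+.1f} %)")
```

Keeping the absolute and relative figures side by side avoids mixing index points with percentages when comparing variants.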
2.2 Three-Dimensional Structure Prediction (AlphaFold3)
AlphaFold3 was used to verify whether the redesigned sequences fold
correctly and to assess the spatial orientation of the catalytic
residues. After LigandMPNN sequence generation and initial filtering,
the top variants DTE-1 and DTE-2 were submitted to AlphaFold3 for rapid
structure prediction. The resulting models were inspected in PyMol and
superimposed on the 2OU4 template, focusing on the active-site pocket.
Structural analysis:
Confidence: AlphaFold3 produced high-confidence models, with ipTM and
pTM scores of about 0.90 for both variants (Table 3).
Global superposition: RMSD between the predicted structures and 2OU4 is
< 0.56 Å, confirming that the redesigned sequences preserve the native
backbone fold (Fig. 1-A1, A2).
2OU4 original pocket:
The native pocket adopts a classic "bowl-like" open conformation (volume
805 ų). The hydrophobic framework accommodates D-fructose, but the
C4--C6 alkyl chain remains solvent-exposed, giving a hydrophobic contact
ratio of only ~58 %. This loose recognition facilitates substrate
entry/exit and reflects the enzyme's promiscuous catalytic profile (Fig.
1-B1).
DTE-1:
While retaining the overall opening and scaffold, DTE-1 introduces
subtle side-wall adjustments. Val152 swings inward, shrinking the pocket
volume to 501 ų and increasing the hydrophobic contact ratio to 65 %.
The solvent channel is preserved, representing a refined version of the
original open pocket (Fig. 1-B2).
DTE-2:
Key mutations Leu183→Phe and Val181→Tyr install aromatic side chains at
the pocket entrance. Together with the pre-existing Phe208, they form a
closed "hydrophobic lid". Pocket volume drops to 448 ų, the entrance
diameter narrows to 4.6 Å, and the hydrophobic burial fraction jumps to
82 %. The binding site is thereby transformed from an open "bowl" into a
fully enclosed "box" (Fig. 1-B3).
Table 3. Confidence of the predicted structures for the redesigned enzymes
| Protein id | ipTM | pTM | ranking_score | RMSD (Å) |
|---|---|---|---|---|
| 2OU4 | 0.94 | 0.96 | 0.94 | 0.000 |
| DTE-1 | 0.90 | 0.92 | 0.91 | 0.546 |
| DTE-2 | 0.89 | 0.92 | 0.90 | 0.555 |
3. Test
Objective: To computationally evaluate the binding affinity and
catalytic potential of the designed enzymes toward their substrate.
3.1 Molecular Docking (Docking Tools)
We employed the CB-Dock2 web server to dock D-fructose into the
structures of DTE-1 and DTE-2. For the reference enzyme 2OU4, CB-Dock2
blind docking first returned five candidate pockets (CurPocket C1-C5);
C1 exhibited the best Vina score (-5.9 kcal mol⁻¹) and was selected as
the primary binding site. A template-based refinement was then performed
using PDB 2QUM (100 % pocket concordance, ligand RMSD 0.23 Å), yielding
a final contact score of 64.1.
The same protocol was applied to the two redesigned enzymes. Again, five
pockets were sampled. For both DTE-1 and DTE-2 the optimal site was C3,
with Vina scores of -5.1 and -5.0 kcal mol⁻¹, respectively. Template
refinement with 2QUM (47 % pocket concordance) raised the contact scores
to 12.4 (DTE-1) and 54.7 (DTE-2). Relative to 2OU4, DTE-1 requires the
ligand to shift into the smaller C3 cavity, accompanied by a modest drop
in Vina energy. Despite lower pocket similarity, DTE-2 retains most of
the hydrogen bonds and hydrophobic clamps required for substrate
recognition.
In parallel, DiffDock was used to dock both D-tagatose and D-fructose
into the redesigned enzymes for rapid comparison. DTE-2 returned the
highest rank confidence for both sugars (1.34 for D-fructose and 0.97
for D-tagatose), whereas DTE-1 scored slightly higher for D-tagatose
(0.71) than for D-fructose (0.63) (Table 4).
Table 4. Summary of molecular docking results
| Protein id | CB-Dock2 Vina score (kcal/mol) | CB-Dock2 pocket volume (ų) | CB-Dock2 contact score | DiffDock D-fructose rank_confidence | DiffDock D-tagatose rank_confidence |
|---|---|---|---|---|---|
| 2OU4 | -5.9 | 805 | 64.1 | 0.83 | 0.76 |
| DTE-1 | -5.1 | 501 | 12.4 | 0.63 | 0.71 |
| DTE-2 | -5.0 | 448 | 54.7 | 1.34 | 0.97 |
3.2 Atomic-interaction analysis (PLIP)
From the analysis of atomic interactions in PLIP, for the wild-type
enzyme 2OU4, the substrate D-fructose directly coordinates with the
catalytic Mn²⁺ ion via its O2 and O3 atoms (coordination distances
approximately 2.07-2.53 Å, Tables 5-1 and 5-2), and forms a
pre-reaction catalytic conformation with key residues such as Glu152 and
Glu246. In the predicted conformation of DTE-1, D-fructose is anchored
within an elongated pocket formed by
Ile150-Val152-Gln182-Asp184-Gly207-Phe208 via six geometrically
favorable hydrogen bonds (Figure 2). This pocket is spatially separated
from the catalytic center (Tables 5-3, 5-6). The hydroxyl groups (e.g.,
O3) of D-fructose only form hydrogen bonds with the protein backbone or
side chains, without coordinating to Mn²⁺. Furthermore, its C3 atom is
positioned farther from Glu246, which is responsible for proton
abstraction (distance increased from 2.3 Å to 2.59 Å, Tables 5-4, 5-5).
This ultra-tight binding may significantly enhance the enzyme's ability
to capture the substrate (first fixing it in a secondary pocket before
releasing it to the catalytic center), though it may also imply a higher
energy barrier hindering proper substrate pairing. Compared to the
wild-type 2OU4, DTE-2 retains the complete tetrahedral Mn²⁺ coordination
core formed by Glu152, Asp185, His211, and Glu246. However, the
regularity of its coordination geometry may be somewhat compromised,
particularly with the Glu246-Mn²⁺ coordination distance increasing to
2.82 Å, which is longer than that observed in DTE-1 (2.59 Å) and in the
wild-type enzyme (2.33 Å). This may suggest suboptimal Mn²⁺ coordination in DTE-2
(Tables 5-7, 5-8). Both DTE-1 and DTE-2 exhibit a reduction in some
hydrogen bonds compared to 2OU4 (Tables 5-3, 5-6, 5-9). Nevertheless,
this streamlined yet potent hydrogen-bonding network, combined with
introduced aromatic residues contributing hydrophobic interactions, may
facilitate substrate binding in a conformation more closely resembling
the transition state. Although this temporarily sacrifices certain
quantitative binding metrics, it creates a more favorable geometric
prerequisite for achieving high catalytic efficiency.
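The comparison of Mn²⁺ coordination geometry can be made explicit by summarizing the contact distances per protein. The sketch below transcribes the chain A distances listed in Table 5 for each enzyme and reports the mean, the longest contact, and any contact above a cutoff. The 2.6 Å cutoff and the helper name are illustrative assumptions, not thresholds taken from the analysis above.

```python
from statistics import mean

# Mn2+ contact distances (Angstrom) transcribed from the chain A
# metal-coordination tables for each enzyme in Table 5.
MN_CONTACTS = {
    "2OU4": {"Glu152": 2.17, "Asp185": 2.07, "His211": 2.37, "Glu246": 2.33},
    "DTE-1": {"Glu152": 2.39, "Asp185": 2.11, "His211": 2.27, "Glu246": 2.59},
    "DTE-2": {"Glu152": 2.30, "Asp185": 2.36, "His211": 2.39, "Glu246": 2.82},
}


def summarize_site(contacts: dict, cutoff: float = 2.6) -> dict:
    """Mean distance, longest contact, and contacts longer than `cutoff`.

    The cutoff is a hypothetical screening threshold, not a PLIP setting.
    """
    longest = max(contacts, key=contacts.get)
    return {
        "mean": mean(contacts.values()),
        "longest": (longest, contacts[longest]),
        "over_cutoff": sorted(r for r, d in contacts.items() if d > cutoff),
    }


for name, contacts in MN_CONTACTS.items():
    summary = summarize_site(contacts)
    print(name, f"mean={summary['mean']:.3f}", summary["longest"],
          summary["over_cutoff"])
```

A per-site summary like this makes it straightforward to flag designs whose Glu246 contact stretches during later design rounds.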
Table 5. Protein Structure Analysis
Table 5-1. Metal coordination in 2OU4 (chain A)
| Index | Residue | Chain | Metal | Target | Distance (Å) | Location |
|---|---|---|---|---|---|---|
| 1 | GLU152 | A | 2290 | 1202 | 2.17 | protein.sidechain |
| 2 | ASP185 | A | 2290 | 1458 | 2.07 | protein.sidechain |
| 3 | HIS211 | A | 2290 | 1666 | 2.37 | protein.sidechain |
| 4 | GLU246 | A | 2290 | 1937 | 2.33 | protein.sidechain |
Table 5-2. Metal coordination in 2OU4 (chain D)
| Index | Residue | Chain | Metal | Target | Distance (Å) | Location |
|---|---|---|---|---|---|---|
| 1 | GLU152 | D | 4705 | 3617 | 2.21 | protein.sidechain |
| 2 | ASP185 | D | 4705 | 3873 | 2.10 | protein.sidechain |
| 3 | HIS211 | D | 4705 | 4081 | 2.48 | protein.sidechain |
| 4 | GLU246 | D | 4705 | 4353 | 2.53 | protein.sidechain |
Table 5-3. Hydrogen bonds between 2OU4 and D-fructose
| Index | Residue | Chain | D-A (Å) | H-A (Å) | Angle (°) | Protein donor | Description |
|---|---|---|---|---|---|---|---|
| 1 | LEU151 | D | 2.67 | 2.06 | 119.1 | NO | Backbone NH→O3 |
| 2 | VAL153 | D | 3.79 | 2.99 | 139.7 | YES | Backbone NH→O3 |
| 3 | VAL153 | D | 3.95 | 3.20 | 134.9 | NO | Backbone NH→O2 |
| 4 | GLN183 | D | 2.80 | 2.23 | 115.7 | YES | Backbone NH→O3 |
| 5 | LEU184 | D | 4.03 | 3.44 | 120.7 | NO | Backbone NH→O2 |
| 6 | ASP185 | D | 3.03 | 2.21 | 138.9 | YES | Backbone NH→O3 |
| 7 | GLY208 | D | 2.41 | 1.47 | 157.1 | YES | Strong H-bond |
| 8 | HIS209 | D | 3.02 | 2.26 | 132.6 | YES | Backbone NH→O3 |
Table 5-4. Metal coordination in DTE-1 (chain A)
| Index | Residue | Chain | Metal | Target | Distance (Å) | Location |
|---|---|---|---|---|---|---|
| 1 | GLU152 | A | 25 | O | 2.39 | protein.sidechain |
| 2 | ASP185 | A | 25 | O | 2.11 | protein.sidechain |
| 3 | HIS211 | A | 25 | N | 2.27 | protein.sidechain |
| 4 | GLU246 | A | 25 | O | 2.59 | protein.sidechain |
Table 5-5. Metal coordination in DTE-1 (chain B)
| Index | Residue | Chain | Metal | Target | Distance (Å) | Location |
|---|---|---|---|---|---|---|
| 1 | GLU151 | B | 26 | O | 2.31 | protein.sidechain |
| 2 | ASP184 | B | 26 | O | 2.16 | protein.sidechain |
| 3 | GLU210 | B | 26 | O | 2.03 | protein.sidechain |
| 4 | GLU245 | B | 26 | O | 2.26 | protein.sidechain |
Table 5-6. Hydrogen bonds between DTE-1 and D-fructose
| Index | Residue | Chain | D-A (Å) | H-A (Å) | Angle (°) | Protein donor | Description |
|---|---|---|---|---|---|---|---|
| 1 | ILE150 | B | 2.94 | 2.35 | 118.0 | NO | Backbone→Ligand |
| 2 | VAL152 | B | 3.73 | 2.81 | 155.5 | YES | Backbone NH→O3 |
| 3 | GLN182 | B | 2.96 | 2.25 | 128.0 | YES | Backbone NH→O3 |
| 4 | ASP184 | B | 3.31 | 2.44 | 146.6 | YES | Backbone NH→O3 |
| 5 | GLY207 | B | 2.57 | 1.62 | 160.8 | YES | Strong H-bond |
| 6 | PHE208 | B | 3.25 | 2.54 | 128.9 | YES | Backbone NH→O3 |
Table 5-7. Metal coordination in DTE-2 (chain A)
| Index | Residue | Chain | Metal | Target | Distance (Å) | Location |
|---|---|---|---|---|---|---|
| 1 | GLU152 | A | 25 | 1182 | 2.30 | protein.sidechain |
| 2 | ASP185 | A | 25 | 1433 | 2.36 | protein.sidechain |
| 3 | HIS211 | A | 25 | 1646 | 2.39 | protein.sidechain |
| 4 | GLU246 | A | 25 | 1921 | 2.82 | protein.sidechain |
Table 5-8. Metal coordination in DTE-2 (chain B)
| Index | Residue | Chain | Metal | Target | Distance (Å) | Location |
|---|---|---|---|---|---|---|
| 1 | GLU151 | B | 26 | O | 2.38 | protein.sidechain |
| 2 | ASP184 | B | 26 | O | 2.32 | protein.sidechain |
| 3 | GLU210 | B | 26 | O | 2.42 | protein.sidechain |
| 4 | GLU245 | B | 26 | O | 2.44 | protein.sidechain |
Table 5-9. Hydrogen bonds between DTE-2 and D-fructose
| Index | Residue | Chain | D-A (Å) | H-A (Å) | Angle (°) | Protein donor | Description |
|---|---|---|---|---|---|---|---|
| 1 | ILE150 | B | 2.89 | 2.39 | 121.3 | NO | Backbone→O3 |
| 2 | VAL152 | B | 2.89 | 2.89 | 161.6 | YES | Backbone NH→O3 |
| 3 | GLN182 | B | 2.98 | 2.41 | 116.7 | YES | Backbone NH→O2 |
| 4 | ASP184 | B | 3.32 | 2.47 | 144.6 | YES | Backbone NH→O3 |
| 5 | GLY207 | B | 2.44 | 1.48 | 163.1 | YES | Strong H-bond |
| 6 | PHE208 | B | 3.16 | 2.39 | 134.8 | YES | Backbone NH→O3 |
Figure 2. PLIP-generated schematic of enzyme–ligand interactions
3.3 Molecular Dynamics and Binding Free Energy
We evaluated the stability and binding free energy of the complexes by
running molecular dynamics (MD) simulations and MM/PBSA calculations on
the CB-Dock2 docking poses. Additionally, we docked the enzymes with the
substrate D-fructose and the product D-psicose (D-allulose) using
SwissDock to further assess binding. In the SwissDock results, the
binding energy of 2OU4 with the product D-psicose was slightly lower
(more favorable) than that with the substrate D-fructose, consistent
with reversible catalysis in 2OU4 for converting D-fructose to
D-psicose. In contrast, DTE-1 and DTE-2 appear to have shifted this
balance, as reflected by binding energies that favor the substrate
D-fructose over the product. Furthermore, MM/PBSA calculations also
showed that DTE-1 and DTE-2 bind better than 2OU4, even though the
calculated ΔG values were all greater than zero. This may be attributed
to the absence of Mn²⁺ topology parameters in the simulated system,
leading to only transient binding between the enzyme and the substrate
(Table 6).
Binding free energy decomposition analysis (Figure 3) revealed that in
2OU4, the binding energy primarily originates from the catalytic center
and surrounding residues (e.g., Glu158, Asp185, Glu192, etc.; Figure
3-1, A1-2). In contrast, the energy contribution hotspots in DTE-1 and
DTE-2 shift to residues forming the secondary pocket (e.g., Glu157,
Gln182, Arg216, etc.; Figure 3-2/3, B1-2, C1-2), indicating that the
enzyme variants may have optimized the ability of the secondary pocket
to capture ligands. Notably, Trp112/113 contributed the largest energy,
suggesting that this residue may play a critical role in guiding the
ligand into the hydrophobic pocket.
MD simulation results showed that the protein backbone RMSD of DTE-1,
DTE-2, and 2OU4 all reached stable plateaus in the later stages of the
simulation (Figure 4-A), indicating correct overall folding and
conformational stability. The RMSD of the catalytic center and protein
secondary structures also stabilized in the later stages (Figure 4-B, C,
D), demonstrating that the catalytic structures of the enzyme variants
remain functional. Further trajectory analysis indicated that the
substrate-binding region of DTE-2 exhibits higher rigidity, while DTE-1
shows moderate rigidity, which may confer advantages in structural
transitions (reduced RMSF, Figure 4-H). The radius of gyration (Rg) of
both DTE-1 and DTE-2 showed an initial increasing trend followed by
stabilization, suggesting possible structural expansion of the enzyme
variants during the simulation time, with DTE-1 appearing more loosely
packed (elevated Rg). The solvent accessible surface area (SASA) of
DTE-1 and DTE-2 increased and exhibited continuous fluctuations compared
to 2OU4 (Figure 4-G), indicating that under dynamic conditions, the
hydrophobic core of the enzyme variants may undergo continuous opening
and closing due to rigid structural transitions. This suggests that the
overall structures of the new enzymes may be expanding or drifting,
making them slightly less stable compared to 2OU4. However, the number
of hydrogen bonds between DTE-1/DTE-2 and the substrate remained
consistently high, validating the streamlined and potent
hydrogen-bonding network observed in the PLIP analysis.
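For reference, the radius of gyration used above as a compactness measure is the mass-weighted root-mean-square distance of the atoms from their center of mass. The sketch below is a minimal pure-Python version for a single frame; the function name and the toy coordinates are illustrative, and in practice a trajectory tool such as `gmx gyrate` would compute this per frame.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one frame.

    coords: sequence of (x, y, z) tuples; masses: optional sequence of
    per-atom masses (defaults to unit mass for every atom).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    # Center of mass, one component at a time.
    com = [sum(m * c[i] for m, c in zip(masses, coords)) / total_mass
           for i in range(3)]
    # Mass-weighted sum of squared distances from the center of mass.
    weighted_sq = sum(
        m * sum((c[i] - com[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Toy frame: two unit-mass atoms 2 units apart.
print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

A rising Rg at constant atom count means the mass is spreading away from the center, which is why it is read here as structural expansion.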
Together, these results indicate that the enzyme variants achieve higher
apparent binding affinity (lower MM/PBSA binding free energy) by
optimizing hydrophobic packing and a streamlined hydrogen-bonding
network. Kinetically, this may facilitate rapid substrate capture and
transfer from the secondary pocket to the catalytic active center.
The protein simulation trajectories were visualized using VMD, with 25
frames extracted every 200 frames from the first 5000 frames. The
results show that the ligand migrates between the two pockets of 2OU4
and the protein dimer interface (yellow isosurface in Figure 5-A),
likely due to instability caused by the lack of Mn²⁺ coordination in the
system, which also explains the positive free energy values.
Furthermore, different catalytic cores identified by PLIP were rendered
using VMD: the catalytic pocket of 2OU4 was colored pink (Glu152,
Asp182, etc.), which partially overlaps with some secondary structures,
while the distinct catalytic pockets of DTE-1 and DTE-2 were colored
yellow (Ile150, Val152, Gln182, etc.). The figures show an increased
probability of ligands being adsorbed into the secondary pockets in
DTE-1 and DTE-2 (Figure 5-B, C). Moreover, the ligand distribution in
DTE-2 (in both secondary and catalytic pockets) is more concentrated,
indirectly explaining its further reduced binding energy. However, in
terms of overall structural distortion, 2OU4 exhibits a more compact
structure, while DTE-1 and DTE-2 display more expanded conformations,
consistent with the radius of gyration results from the dynamics
simulations.
Table 6. Binding free energies calculated by SwissDock and MM/PBSA
| Protein id | SwissDock D-fructose (kcal/mol) | SwissDock D-psicose (kcal/mol) | gmx_MMPBSA ΔG (kcal/mol) |
|---|---|---|---|
| 2OU4 | -5.210 | -5.497 | 21.816 |
| DTE-1 | -6.070 | -5.749 | 17.797 |
| DTE-2 | -4.766 | -4.752 | 16.643 |
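Because all MM/PBSA values are positive, the variants are most usefully compared through the difference from wild-type rather than through the absolute numbers. The short sketch below computes that difference from the Table 6 values; the helper name is hypothetical.

```python
# MM/PBSA binding free energies (kcal/mol) transcribed from Table 6.
MMPBSA_DG = {"2OU4": 21.816, "DTE-1": 17.797, "DTE-2": 16.643}


def ddg_vs_wt(variant: str, wt: str = "2OU4") -> float:
    """Difference in MM/PBSA binding free energy relative to wild-type.

    Negative values mean the variant binds more favorably than wild-type
    within this (Mn2+-free) simulation setup.
    """
    return MMPBSA_DG[variant] - MMPBSA_DG[wt]


for variant in ("DTE-1", "DTE-2"):
    print(f"{variant}: ddG = {ddg_vs_wt(variant):+.3f} kcal/mol")
```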
3.4 Wet-Lab Validation Results
Based on the fluorescent biosensor system established through wet-lab
experiments, we obtained the activity profiles of different DTE enzymes
(Figure 6).
The results indicate that, compared to the Wild-Type (WT) group, the
newly designed enzyme variants DTE-1 and DTE-2 did not exhibit a
significant advantage in fluorescence intensity during the initial
reaction phase, and even showed lower activity in the mid-phase.
However, in the mid-to-late stages of the reaction, we detected a
substantially higher fluorescence intensity for DTE-1 in the system, and
DTE-2 also demonstrated favorable performance.
This observation aligns well with the overall trends noted in our
dry-lab molecular dynamics simulation trajectories. Specifically, the
flexible structure of 2OU4 (WT) likely allows for greater conformational
adaptability and overall stability, reflected in its steady increase in
fluorescence intensity. In contrast, DTE-1 and DTE-2 appear slightly
less stable. Nevertheless, it is noteworthy that the newly designed
enzyme variants, despite structural remodeling, still exhibit
significant catalytic potential, as evidenced by the high final
fluorescence intensity. This corroborates the transient lower binding
free energy observed in the MM/PBSA calculations.
This suggests that maintaining catalytic efficiency over extended
periods for these variants might require enhanced structural rigidity or
a reinforced hydrogen-bond network. Alternatively, substrate binding
within the hydrophobic pocket might influence the flexibility of the
secondary pocket structure in the variants, potentially facilitating
increased substrate access to the catalytic center.
We anticipate designing additional novel enzyme variants to explore the
diverse possibilities underlying these activity changes, and conducting
further characterization of different variants in wet-lab experiments to
solidify our conclusions. This iterative process benefits significantly
from the synergistic integration of dry and wet-lab experimental
approaches.
4. Learn
Objective: To characterize the enzyme variants through wet-lab
experiments, integrate all test results, and summarize the insights.
4.1 Key Conclusions
Through AI-driven global sequence redesign of the DTE enzymes and
systematic computational evaluation, we have drawn the following core
conclusions:
- Sequence Design Strategy is Effective: Subsequent wet-lab
experiments confirmed that the enzyme variants exhibit functional
activity. This demonstrates that the structure-based sequence
redesign strategy (using LigandMPNN) can effectively guide the
design of novel enzyme variants. The rational changes in the
variants are supported by computational simulation data, aided by
various analytical tools for prediction and evaluation.
- Structural Remodeling of Enzyme Variants Suggests Potential
Catalytic Dynamics: In the DTE enzyme sequence prediction project,
balancing the structure of the hydrophobic core and the catalytic
core may enable the enzyme to capture ligands more readily. The
introduced mutations potentially adjusted hydrophobic preferences,
leading to structural remodeling and resulting in more rigid
architectures. This demonstrates that under conditions of highly
compressed enzyme structures, there is an opportunity to construct
numerous novel variants and uncover more potent and stable enzyme
scaffolds.
- Docking Tools Combined with Dynamics Simulations Enable a Rapid
Enzyme Evaluation Pipeline: Utilizing diverse simulation and docking
tools can significantly accelerate the prediction and evaluation of
enzyme variants. A comprehensive technical workflow, encompassing
both static data analysis and dynamic simulation observation,
supports further sequence design efforts.
4.2 Iterative Design Recommendations
When adjusting the structural sequence design of DTE enzymes, attention
should be paid to the key coordination of core catalytic residues. For
instance, optimizing the coordination space of Glu246 is crucial, as
this residue might be highly susceptible to interference from
hydrophobic structures, leading to diminished catalytic performance.
When optimizing the hydrogen-bond network, the focus should not solely
be on the number of hydrogen bonds; balancing the charge distribution at
the dimer interface is equally important. Otherwise, the catalytic core
structure might become unstable, potentially leading to compensatory
changes in secondary structures.
Appropriate introduction of flexible structural elements, such as Glu
residues, could be considered. Key flexible structures might enhance the
enzyme's binding capability.
Finally, the results from each optimization round should undergo rapid
assessment to filter out proteins with incorrect binding poses as
efficiently as possible. Furthermore, during MD simulations,
incorporating topological parameters for Mn²⁺ and applying positional
restraints within the protein might better simulate the authentic
catalytic structure, aiding in the identification of superior enzyme
variants.
4.3 Summary and Outlook
This in silico study showcases the powerful potential of artificial
intelligence in the field of enzyme design. We successfully designed the
mutants DTE-1 and DTE-2 based on a full-sequence redesign of the DTE
enzyme scaffold and preliminarily investigated the remodeling mechanisms
of their structure-function relationships through multi-scale
simulations.
Not only have we computationally designed and validated enzyme variants
possessing novel structural features and potential functional
advantages, but more importantly, we have established a reusable
methodology for the rational design of enzymes.
Looking ahead, with continuous advancements in AI models, computational
power, and experimental automation, AI-assisted enzyme engineering is
poised to break through the bottlenecks of traditional directed
evolution, emerging as an efficient, controllable, and predictable
paradigm for novel enzyme design. We anticipate that this strategy will
contribute to green technology innovation in fields such as synthetic
biology.
Screening and Characterization of DTE Based on Pepper-HBC Fluorescent RNA Biosensors
As shown in Figure 1, the plasmid PSB1C3-psiR-PsiA-pepper was constructed as a biosensor for DTE enzyme activity based on the fluorescent RNA pepper-HBC system. The wild-type DTE background enzyme, which is a homodimer, was derived from the DTE mutant IDF10-3 of Pseudomonas cichorii (GenBank: AB000361.1), while the designed heterodimer DTE enzyme was obtained from the sequence designed with AI assistance and synthesized through full gene synthesis.
Two recombinant plasmids (homodimer and heterodimer designs) and the characterization plasmid pET28-Porin-PsiRA-Pepper (fluorescent RNA pepper-HBC system) were co-transformed into Escherichia coli BL21(DE3) by electroporation. The obtained double-plasmid system strains were then cultured in LB medium at 37°C for 10-12 hours. Different concentrations of substrate D-fructose (0.01, 0.1, 1, 10, 100, 1000 mM) were added, and the culture was continued for another 12 hours. Then, HBC dye was added, and the fluorescence values after the addition of different concentrations of D-fructose were detected. The optimal concentration of D-fructose under this culture condition (i.e., the concentration at which the fluorescence intensity reaches its peak, as shown in Figure 2) was determined.
In parallel, DTE constructs with different activities were used to build recombinant Escherichia coli strains. Fluorescence intensity was measured at the optimal D-fructose concentration under otherwise identical conditions, and the functional relationship between DTE activity/conversion rate and Pepper-HBC fluorescence intensity was established. A blank control strain without recombinant DTE was also included to characterize the background (leaky) expression level.
Modeling Content: Psicose Synthetic Kinetics Model
The affinity of PsiR transcription factor for D-Psicose: When the intracellular concentration of D-Psicose increases, PsiR dissociates from the promoter pPsiA and activates the expression of downstream Pepper fluorescent RNA.
Based on the above analysis of the design of the Pepper-HBC fluorescent RNA biosensor, this system divides the circuit into four parts: The Catalytic Circuit Model, The Characterization Circuit Model, The Population Dynamics Model, and The Half-life System Integration Model, in order to facilitate the testing of the entire system.
Key Assumptions
| No. | Assumption Name | Assumption Content |
|---|---|---|
| 1 | Enzyme Expression | DTE enzyme expression is governed by promoter strength, and homodimers and heterodimers assemble with different efficiencies. |
| 2 | Substrate Diffusion | The extracellular D-fructose concentration is constant, and over short time scales the transmembrane diffusion rate is proportional to the concentration gradient. |
| 3 | Transcriptional Regulation | The PsiR-PsiA system follows the Hill equation, and transcription strength is regulated by the concentration of D-psicose. |
| 4 | Fluorescence Detection | Fluorescence intensity is quasi-linear in the amount of active Pepper-HBC complex (PH*). |
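Assumption 3 can be illustrated with a generic Hill-type response. The sketch below is only one plausible form, assuming that D-psicose relieves PsiR repression so that transcription rises from a basal leak toward a maximum; the parameter names (`k_half`, `n`, `leak`, `v_max`) and their values are hypothetical and are not fitted to the biosensor data.

```python
def hill_activation(psicose: float, k_half: float, n: float,
                    leak: float = 0.0, v_max: float = 1.0) -> float:
    """Transcription rate as a Hill function of D-psicose concentration.

    psicose and k_half share the same concentration unit. `leak` is the
    basal rate with no inducer and `v_max` the saturated rate; all
    parameters here are illustrative assumptions.
    """
    if psicose < 0 or k_half <= 0:
        raise ValueError("concentrations must be non-negative and k_half > 0")
    occupancy = psicose ** n / (k_half ** n + psicose ** n)
    return leak + (v_max - leak) * occupancy


# Hypothetical dose series (arbitrary units) to show the sigmoidal shape.
for dose in (0.0, 0.5, 1.0, 2.0, 4.0, 8.0):
    print(dose, round(hill_activation(dose, k_half=2.0, n=2.0, leak=0.05), 3))
```

The `leak` term corresponds to the background expression that the blank control strain is meant to measure, so it can be fixed from that control before fitting `k_half` and `n`.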
1. The Catalytic Circuit Model
1.1 Kinetics of Expression of Homodimeric DTE Enzyme
The synthesis rate of DTE enzyme depends on the strength of the promoter, the efficiency of transcription and translation, as well as possible negative feedback regulation. Considering that the background DTE enzyme is constitutively expressed and is a homodimer, and assuming it is composed of two identical monomers (denoted as M), the expression process includes transcription, translation, and dimerization, and the following equation can be obtained:
\[
\frac{d\left[ \text{mRNA}_{DTE} \right]}{dt} = \beta_{DTE} - \delta_{m}\left[ \text{mRNA}_{DTE} \right]
\]
\[
\frac{d[M]}{dt} = \frac{k_{TL}\left[ \text{mRNA}_{DTE} \right]}{1 + \frac{\left[ \text{mRNA}_{DTE} \right]}{k_{TL}}} - 2k_{f}[M]^{2} + 2k_{r}[DTE] - \delta_{M}[M]
\]
\[
\frac{d[DTE]}{dt} = k_{f}[M]^{2}\left( 1 - \frac{[DTE]}{[DTE]_{\max}} \right) - k_{r}[DTE] - \delta_{D}[DTE]
\]
Here, the constant \(\beta_{DTE}\) is the transcription rate of the DTE mRNA and \(\delta_{m}\) is the mRNA degradation rate; \(k_{TL}\) is the translation rate constant; \(k_{f}\) and \(k_{r}\) are the dimerization and dissociation rate constants; \(\delta_{M}\) and \(\delta_{D}\) are the degradation rates of the monomer and the dimer, respectively; and \([DTE]_{\max}\) is the maximum dimer concentration, which caps dimer formation.
In Figure 3, the mRNA rapidly rises to a steady state of 165.52 nM, reflecting the rapid equilibrium between transcription and degradation; the monomeric protein accumulates to a steady state of 31.72 nM, which is the result of the dynamic balance between translation and dimerization; the DTE dimer rises slightly to a steady state of 0.155 nM, demonstrating the cascade process of transcription → translation → dimerization, which conforms to the molecular mechanism of homodimer formation from two monomers.
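The three equations above can be integrated numerically to reproduce this kind of rise to steady state. The sketch below uses a plain fourth-order Runge-Kutta loop in pure Python; every parameter value is a hypothetical placeholder in arbitrary units, so the output is not expected to match the steady-state concentrations quoted for Figure 3.

```python
def homodimer_rhs(state, p):
    """Right-hand side of the ODEs for (mRNA, monomer M, dimer DTE)."""
    mrna, mono, dte = state
    d_mrna = p["beta_dte"] - p["delta_m"] * mrna
    # Saturating translation term, written exactly as in the model.
    translation = p["k_tl"] * mrna / (1.0 + mrna / p["k_tl"])
    d_mono = (translation - 2.0 * p["k_f"] * mono ** 2
              + 2.0 * p["k_r"] * dte - p["delta_M"] * mono)
    d_dte = (p["k_f"] * mono ** 2 * (1.0 - dte / p["dte_max"])
             - p["k_r"] * dte - p["delta_D"] * dte)
    return (d_mrna, d_mono, d_dte)


def rk4_step(rhs, state, dt, p):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, scale):
        return tuple(s + scale * dt * ki for s, ki in zip(state, k))

    k1 = rhs(state, p)
    k2 = rhs(shifted(k1, 0.5), p)
    k3 = rhs(shifted(k2, 0.5), p)
    k4 = rhs(shifted(k3, 1.0), p)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(p, t_end=200.0, dt=0.01, state=(0.0, 0.0, 0.0)):
    """Integrate from `state` to t_end and return the final state."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(homodimer_rhs, state, dt, p)
    return state


# Hypothetical parameters (arbitrary units), chosen only for illustration.
PARAMS = {"beta_dte": 1.0, "delta_m": 0.2, "k_tl": 1.0, "k_f": 0.01,
          "k_r": 0.1, "delta_M": 0.1, "delta_D": 0.05, "dte_max": 10.0}

final_state = simulate(PARAMS)
print("mRNA, monomer, dimer at t_end:", final_state)
```

The mRNA equation is decoupled from the other two, so its steady state is simply \(\beta_{DTE}/\delta_{m}\), which gives a quick check that the integrator has converged.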
1.2 Kinetics of Expression of Heterodimeric DTE Enzyme
The newly constructed heterodimer requires a separate expression model. We assume it consists of an A chain and a B chain.
Since the designed A chain and B chain are both expressed continuously and constitutively, their dynamics are described by:
\[
\frac{d\left[ \text{mRNA}_{A} \right]}{dt} = \beta_{A} - \delta_{m}\left[ \text{mRNA}_{A} \right]
\]
\[
\frac{d\left[ \text{mRNA}_{B} \right]}{dt} = \beta_{B} - \delta_{m}\left[ \text{mRNA}_{B} \right]
\]
Here, the mRNA concentration of the A chain is set by its transcription rate \(\beta_{A}\) and the mRNA degradation rate \(\delta_{m}\); likewise, the mRNA concentration of the B chain is set by its transcription rate \(\beta_{B}\) and the same degradation rate \(\delta_{m}\).
\[
\frac{d[A]}{dt} = k_{TLA}\left[ \text{mRNA}_{A} \right] - k_{on}[A][B] + k_{off}\left[ DTE_{AB} \right] - \delta_{A}[A]
\]
\[
\frac{d[B]}{dt} = k_{TLB}\left[ \text{mRNA}_{B} \right] - k_{on}[A][B] + k_{off}\left[ DTE_{AB} \right] - \delta_{B}[B]
\]
Here, the concentration of the A chain changes over time according to the translation rate \(k_{TLA}\), the binding rate \(k_{on}\), the dissociation rate \(k_{off}\), and the degradation rate \(\delta_{A}\). Likewise, the concentration of the B chain is determined by the translation rate \(k_{TLB}\), the binding rate \(k_{on}\), the dissociation rate \(k_{off}\), and the degradation rate \(\delta_{B}\).
\[
\frac{d\left[ DTE_{AB} \right]}{dt} = k_{on}[A][B] - k_{off}\left[ DTE_{AB} \right] - \delta_{AB}\left[ DTE_{AB} \right]
\]
Figure 5 shows that mRNA_A and mRNA_B both reach a steady state of 0.167 nM, indicating that the transcription rates of the A and B chains are matched. The A and B monomers both reach a steady state of 81.85 nM, showing that their translation efficiencies are also matched. The DTE_AB heterodimer reaches a steady state of 0.118 nM, which reflects the efficiency of binding and assembly of the A and B chains. Although this is slightly lower than the homodimer steady state (0.155 nM), heterodimerization still proceeds efficiently, in line with the expected behavior of hetero-subunit assembly.
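The dependence of the heterodimer level on the balance between the two chains can be explored with a short simulation of the five equations above. This is a sketch under stated assumptions: the parameter values are hypothetical, forward Euler with a small step is used instead of a production-grade solver, and the "limited" case simply halves \(\beta_{B}\) to illustrate a limiting chain.

```python
# Hypothetical, symmetric parameter set (illustration only).
BALANCED = {
    "beta_A": 1.0, "beta_B": 1.0, "delta_m": 0.5,
    "k_tlA": 1.0, "k_tlB": 1.0,
    "k_on": 0.01, "k_off": 0.1,
    "delta_A": 0.2, "delta_B": 0.2, "delta_AB": 0.05,
}


def heterodimer_rhs(y, p):
    """Right-hand side for [mRNA_A, mRNA_B, A, B, DTE_AB]."""
    m_a, m_b, a, b, ab = y
    bind = p["k_on"] * a * b      # association of A and B
    release = p["k_off"] * ab     # dissociation of the heterodimer
    return [
        p["beta_A"] - p["delta_m"] * m_a,
        p["beta_B"] - p["delta_m"] * m_b,
        p["k_tlA"] * m_a - bind + release - p["delta_A"] * a,
        p["k_tlB"] * m_b - bind + release - p["delta_B"] * b,
        bind - release - p["delta_AB"] * ab,
    ]


def steady_state(p, t_end=200.0, dt=0.01):
    """Integrate from zero with forward Euler until the system has settled."""
    y = [0.0] * 5
    for _ in range(int(round(t_end / dt))):
        dy = heterodimer_rhs(y, p)
        y = [yi + dt * di for yi, di in zip(y, dy)]
    return y


balanced = steady_state(BALANCED)
limited = steady_state(dict(BALANCED, beta_B=0.5))  # B chain transcribed at half the rate
print("balanced DTE_AB:", round(balanced[4], 3))
print("B-limited DTE_AB:", round(limited[4], 3))
```

In this toy setting the heterodimer level falls when one chain is under-expressed, which is the intuition behind matching the transcription and translation rates of the two chains.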
1.3 Enzymatic Reaction Kinetics
1. D-fructose Transmembrane Diffusion:
The D-fructose concentration differs inside and outside the cell. We assume that the diffusion rate of D-fructose is proportional to this concentration difference, so the rate at which D-fructose diffuses into the cell is:
\[
\gamma_{F}\left( \left[ {Fru}_{out} \right] - \left[ {Fru}_{in} \right] \right)
\]
The concentration change of fructose within the cell is:
\[
\frac{d\left[ {Fru}_{in} \right]}{dt} = \gamma_{F}\left( \left[ {Fru}_{out} \right] - \left[ {Fru}_{in} \right] \right) - \frac{k_{cat}[DTE]\left[ {Fru}_{in} \right]}{k_{M} + \left[ {Fru}_{in} \right]} - \mu\left[ {Fru}_{in} \right]
\]
Here, \(\gamma_{F}\) is the diffusion coefficient of fructose, \([{Fru}_{out}]\) and \([{Fru}_{in}]\) are the extracellular and intracellular D-fructose concentrations, respectively, \(k_{cat}\) and \(k_{M}\) are the catalytic constant and Michaelis constant of DTE, and \(\mu\) is the dilution rate caused by cell growth.
Because the total cell volume is negligible compared with the external solution, the amount of D-fructose that diffuses into the cells is very small. We therefore assume that the extracellular D-fructose concentration remains constant over short time scales:
\[
\frac{d[{Fru}_{out}]}{dt} = 0
\]
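With the extracellular concentration held constant and the DTE level fixed, the intracellular fructose balance has a single steady state, because the right-hand side decreases monotonically in \(\left[ {Fru}_{in} \right]\). A minimal sketch that finds it by bisection is given below; the numerical values in the example call are hypothetical and serve only to exercise the function.

```python
def fructose_balance(fru_in, fru_out, dte, gamma_f, k_cat, k_m, mu):
    """d[Fru_in]/dt for a given intracellular fructose concentration."""
    uptake = gamma_f * (fru_out - fru_in)
    conversion = k_cat * dte * fru_in / (k_m + fru_in)
    return uptake - conversion - mu * fru_in


def steady_state_fructose(fru_out, dte, gamma_f, k_cat, k_m, mu, tol=1e-12):
    """Solve d[Fru_in]/dt = 0 on [0, fru_out] by bisection.

    The balance is non-negative at 0 and non-positive at fru_out, and it is
    strictly decreasing in between, so the root is unique.
    """
    lo, hi = 0.0, fru_out
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if fructose_balance(mid, fru_out, dte, gamma_f, k_cat, k_m, mu) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical values, chosen only to illustrate the call.
example = dict(fru_out=10.0, dte=0.155, gamma_f=1.0, k_cat=100.0, k_m=5.0, mu=0.5)
fru_in_ss = steady_state_fructose(**example)
print(f"steady-state intracellular fructose: {fru_in_ss:.4f}")
```

Bisection is chosen here because it cannot overshoot into negative concentrations; any bracketing root finder would serve equally well.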
2. D-Psicose Synthesis:
D-Psicose is produced after D-fructose enters the cell and is converted by the DTE enzyme. Its production rate depends on the intracellular D-fructose concentration through the Michaelis-Menten term, and its binding to pPsiR is described by a Hill-type term with exponent \(n\). D-Psicose binds to pPsiR to form the CCI complex. Considering the consumption and degradation of D-Psicose, the rate of change of D-Psicose is:
\[
\frac{d[ Psicose]}{dt} = \frac{k_{cat}[DTE]\left[ {Fru}_{in} \right]}{k_{M} + \left[ {Fru}_{in} \right]} - m_{Psicose,pPsiR}[ Psicose]^{n}\left[ \text{pPsiR} \right] - \delta_{P}[ Psicose] + m_{CCI}\,[CCI] - \mu[ Psicose]
\]
Here, \(- m_{Psicose,pPsiR}[ Psicose]^{n}\left[ \text{pPsiR} \right]\) represents the rate at which D-Psicose binds to the pPsiR transcription factor, which is determined by the concentrations of Psicose and pPsiR. \(- \delta_{P}[ Psicose]\) represents the degradation of D-Psicose within the cell. \(+ m_{CCI}\,[CCI]\) represents the rate at which D-Psicose is released by the decomposition of the CCI complex, which is determined by the CCI concentration [CCI] and the decomposition constant \(m_{CCI}\). \(- \mu[ Psicose]\) is the dilution caused by cell growth.
Figure 6 shows that extracellular D-fructose is held at 10.00 nM with almost no consumption. Intracellular D-fructose drops rapidly to 0.346 nM within the first 2 hours and then stabilizes, indicating that it is efficiently consumed by DTE after transmembrane transport. D-Psicose increases gradually from 0 to a steady state of 4.52 nM, with a production rate of 47.59%. This steady state arises because the DTE-catalyzed formation of Psicose from fructose is balanced by the binding of Psicose to PsiR, its degradation, and dilution by cell growth, which demonstrates the directional substrate → enzyme → product metabolic flow.
2. The Characterization Circuit Model
2.1 PsiR-PsiA Transcriptional Regulation System
1. PsiR-pPsiA Regulatory Circuit:
Once D-Psicose has been produced, it binds to the pPsiR transcription factor to form the CCI complex. The whole process can be represented by the following reaction scheme:
\[
{Fru}_{out}\overset{DTE}{\rightarrow}Psicose + PsiR \leftrightarrow CCI\ complexes
\]
The concentration of the pPsiR transcription factor depends on the concentration of D-Psicose and can be described by the Hill equation. Taking into account the consumption and degradation of the pPsiR transcription factor, the generation rate of the pPsiR transcription factor is:
\[
\frac{d\left[ \text{pPsiR} \right]}{dt} = \alpha_{\text{pPsiR}} + \alpha_{\text{leak}} - m_{Psicose,pPsiR}[ Psicose]^{n}\left[ \text{pPsiR} \right] - \delta_{\text{pPsiR}}\left[ \text{pPsiR} \right] + m_{CCI}\,[CCI]
\]
\[
[PsiR]_{total} = [PsiR]_{free} + [CCI]
\]
Here, \(\alpha_{\text{pPsiR}}\) represents the synthesis rate of the pPsiR transcription factor, and \(\alpha_{\text{leak}}\) represents its background expression rate (the leaky expression when there is no D-Psicose). \(- m_{Psicose,pPsiR}[ Psicose]^{n}\left[ \text{pPsiR} \right]\) represents the binding rate of the pPsiR transcription factor to D-Psicose, which is determined by the concentrations of D-Psicose and pPsiR. \(- \delta_{\text{pPsiR}}\left[ \text{pPsiR} \right]\) represents the degradation or dilution rate of pPsiR in the cell. \(+ m_{CCI}\,[CCI]\) represents the rate at which pPsiR is released when the CCI complex decomposes, which is determined by the CCI concentration [CCI] and the decomposition constant \(m_{CCI}\).
2. Changes in CCI Complex Concentration:
Taking into account the formation and degradation rates of the CCI complex, the concentration change of the CCI complex is as follows:
\[
\frac{d[CCI]}{dt} = m_{Psicose,pPsiR}[ Psicose]^{n}\left[ \text{pPsiR} \right] - m_{CCI}\,[CCI] - \delta_{CCI}\,[CCI]
\]
\(m_{Psicose,pPsiR}[ Psicose]^{n}\left[ \text{pPsiR} \right]\) represents the formation rate of the CCI complex, \(- m_{CCI}\,[CCI]\) represents the decomposition rate of the CCI complex, and \(- \delta_{CCI}\,[CCI]\) represents the degradation or dilution rate of the CCI complex.
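Setting \(d[CCI]/dt = 0\) and using the conservation relation \([PsiR]_{total} = [PsiR]_{free} + [CCI]\) gives a Hill-type occupancy, \([CCI]/[PsiR]_{total} = m[ Psicose]^{n} / \left( m[ Psicose]^{n} + m_{CCI} + \delta_{CCI} \right)\), where \(m\) abbreviates \(m_{Psicose,pPsiR}\). The sketch below evaluates this quasi-steady-state expression. It assumes that total PsiR is constant on the time scale of binding, and the parameter values in the example are hypothetical.

```python
def cci_fraction(psicose, m_bind, m_cci, delta_cci, n):
    """Fraction of total PsiR held in the CCI complex at quasi-steady state."""
    drive = m_bind * psicose ** n
    return drive / (drive + m_cci + delta_cci)


def half_occupancy_psicose(m_bind, m_cci, delta_cci, n):
    """Psicose concentration at which half of the total PsiR is in the complex."""
    return ((m_cci + delta_cci) / m_bind) ** (1.0 / n)


# Hypothetical rate constants, for illustration only.
rates = dict(m_bind=0.5, m_cci=0.2, delta_cci=0.05, n=2.0)
p_half = half_occupancy_psicose(**rates)
for psicose in (0.0, p_half, 10.0 * p_half):
    print(f"[Psicose]={psicose:.3f} -> CCI fraction {cci_fraction(psicose, **rates):.3f}")
```

The exponent \(n\) controls how sharply the occupancy switches around the half-occupancy concentration, which is what makes the response sensor-like.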
2.2 Pepper Aptamer/HBC Dye Concentration Variation and Fluorescence Excitation
The concentration of the Pepper aptamer depends on the concentration of the repressor pPsiR (which regulates the pPsiA promoter) and can be described by the Hill equation. Pepper itself is also degraded, and the added HBC dye binds Pepper to form the intermediate complex PH. Free HBC dye has an extremely low fluorescence quantum yield (\(\Phi\) ≈ 0.03), but after it binds specifically to the Pepper RNA aptamer:
- The dye molecules are locked in a rigid structure;
- The intramolecular rotation is restricted;
- The energy transition level changes → the fluorescence quantum yield increases 26-fold (\(\Phi\) ≈ 0.78);
\[
\Phi_{free} = 0.03\quad \rightarrow \quad\Phi_{bound} = 0.78
\]
Here, \(\Phi\) represents the fluorescence quantum yield, which indicates the probability that an excited-state molecule emits a photon. \(\Phi_{free}\) is the quantum yield of free HBC dye, while \(\Phi_{bound}\) is the quantum yield of HBC bound to Pepper.
The fluorescence intensity \(I_{fluor}\) is directly proportional to the concentration of the excited state PH* of the complex:
\[
I_{fluor} \propto \Phi_{bound} \cdot \left[ PH^{*} \right]
\]
After HBC dye is added, it binds Pepper reversibly to form the complex PH. Only the complex fluoresces, and the variable that directly determines the fluorescence intensity is the excited-state complex PH* rather than PH. We therefore track three species: free Pepper (\(P_{f}\)), free HBC (\(H_{f}\)), and the complex (PH). We further distinguish three states of the complex to capture changes in fluorescence: the ground-state complex (PH), which has bound but has not yet reached the optimal emitting conformation; the excited-state complex (PH*), which has adopted the conformation that maximizes fluorescence; and the dissociation intermediate (\(PH_{d}\)), which accounts for the bimodal fluorescence decay.
a) Equilibrium Among Three Free States:
\[
\frac{d\left[ P_{f} \right]}{dt} = \eta_{Pepper}\beta_{Pepper}\left( \beta_{leak} + \frac{1}{1 + \left( \frac{\left[ \text{pPsiR} \right]}{K_{d}} \right)^{n}} \right) - k_{on}\left[ P_{f} \right]\left[ H_{f} \right] + k_{r}\left[ PH_{d} \right] - \delta_{Pepper}\left[ P_{f} \right]
\]
\[
\frac{d\left[ H_{f} \right]}{dt} = \alpha_{HBC} - k_{on}\left[ P_{f} \right]\left[ H_{f} \right] + k_{d}\left[ PH_{d} \right] - \delta_{\text{HBC}}\left[ H_{f} \right]
\]
\[
\frac{d[PH]}{dt} = k_{on}\left[ P_{f} \right]\left[ H_{f} \right] - k_{f}[PH] + k_{b}\left[ PH^{*} \right] - k_{d}[PH]
\]
Here, \(\eta_{Pepper}\beta_{Pepper}\frac{1}{1 + \left( \frac{\left[ \text{pPsiR} \right]}{K_{d}} \right)^{n}}\) represents the regulated expression of the Pepper aptamer, which is determined by the concentration of pPsiR and the maximum expression rate \(\eta_{Pepper}\); \(\beta_{Pepper}\) is a parameter related to the expression of the Pepper aptamer, possibly its expression efficiency, and \(\beta_{leak}\) is the leaky expression term. \(- k_{on}\left[ P_{f} \right]\left[ H_{f} \right]\) represents the consumption of Pepper through binding to the HBC dye, and \(k_{r}\left[ PH_{d} \right]\) represents the recovery of Pepper from the dissociation intermediate. \(- \delta_{Pepper}\left[ P_{f} \right]\) represents the degradation or dilution of the Pepper aptamer.
\(\alpha_{HBC}\) is the addition rate of the HBC dye. In the experiment the dye is added once, so [HBC] is high at the initial moment and is not replenished afterwards, that is, \(\alpha_{HBC} = 0\). The degradation term \(\delta_{\text{HBC}}\) is generally very small and can be ignored. \(k_{d}\left[ PH_{d} \right]\) represents the recovery of HBC from the dissociation intermediate.
\(k_{on}\left[ P_{f} \right]\left[ H_{f} \right]\) represents the binding of the Pepper aptamer to the HBC dye, \(- k_{f}[PH]\) represents the transition of PH to the fluorescent state, and \(k_{b}\left[ PH^{*} \right]\) represents the return of PH* to the ground state PH. \(- k_{d}[PH]\) represents the dissociation of PH into the intermediate state \(PH_{d}\).
b) Conformational Change:
\[
\frac{d[PH]}{dt} = k_{on}\left[ P_{f} \right]\left[ H_{f} \right] - k_{f}[PH] + k_{b}\left[ PH^{*} \right] - k_{d}[PH]
\]
\[
\frac{d\left[ PH^{*} \right]}{dt} = k_{f}[PH] - k_{b}\left[ PH^{*} \right] - k_{nr}\left[ PH^{*} \right] - k_{rad}\left[ PH^{*} \right]
\]
\[
\frac{d\left[ PH_{d} \right]}{dt} = k_{d}[PH] - k_{r}\left[ PH_{d} \right] - k_{d}\left[ PH_{d} \right]
\]
Here, \(k_{on}\left[ P_{f} \right]\left[ H_{f} \right]\) represents the binding of the Pepper aptamer to the HBC dye, \(- k_{f}[PH]\) represents the loss of PH through its transition to the fluorescent state, and \(k_{b}\left[ PH^{*} \right]\) represents the return of PH* to the ground state PH. \(- k_{d}[PH]\) represents the loss of PH through dissociation into the intermediate state \(PH_{d}\).
Similarly, \(k_{f}[PH]\) represents the transition of PH to the fluorescent state, \(- k_{b}\left[ PH^{*} \right]\) represents the loss of PH* as it returns to the ground state PH, \(- k_{nr}\left[ PH^{*} \right]\) represents non-radiative decay (thermal dissipation without emission), and \(- k_{rad}\left[ PH^{*} \right]\) represents radiative decay (emission of photons, i.e. fluorescence).
\(k_{d}[PH]\) represents the formation of the intermediate state by dissociation of PH, \(- k_{r}\left[ PH_{d} \right]\) represents the loss of the intermediate through recombination back to PH, and \(- k_{d}\left[ PH_{d} \right]\) represents the degradation of the dissociation intermediate itself.
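The five equations for \(P_{f}\), \(H_{f}\), PH, PH* and \(PH_{d}\) can be integrated together to follow the fluorescence signal after a single addition of dye. The sketch below implements the terms exactly as written above, with \(\alpha_{HBC} = 0\) and \(\delta_{HBC} = 0\). It is illustrative only: every parameter value is hypothetical, the pPsiR level is held fixed, and the lumped coefficient `GAMMA_STAR` is an assumed scaling between \([PH^{*}]\) and intensity.

```python
# Hypothetical parameter values (illustration only).
PARAMS = {
    "eta": 1.0, "beta": 1.0, "beta_leak": 0.05,  # Pepper expression terms
    "ppsir": 1.0, "K_d": 1.0, "n": 2.0,          # fixed repressor level and Hill terms
    "k_on": 0.5, "k_f": 1.0, "k_b": 0.5,         # binding, PH -> PH*, PH* -> PH
    "k_d": 0.2, "k_r": 0.1,                      # dissociation-intermediate rates
    "k_nr": 0.2, "k_rad": 0.3,                   # non-radiative and radiative decay
    "delta_pepper": 0.3, "delta_hbc": 0.0, "alpha_hbc": 0.0,
}
GAMMA_STAR = 1.0  # assumed lumped instrument coefficient


def pepper_hbc_rhs(y, p=PARAMS):
    """Right-hand side for [P_f, H_f, PH, PH*, PH_d]."""
    p_f, h_f, ph, ph_star, ph_d = y
    expression = p["eta"] * p["beta"] * (
        p["beta_leak"] + 1.0 / (1.0 + (p["ppsir"] / p["K_d"]) ** p["n"]))
    bind = p["k_on"] * p_f * h_f
    return [
        expression - bind + p["k_r"] * ph_d - p["delta_pepper"] * p_f,
        p["alpha_hbc"] - bind + p["k_d"] * ph_d - p["delta_hbc"] * h_f,
        bind - p["k_f"] * ph + p["k_b"] * ph_star - p["k_d"] * ph,
        p["k_f"] * ph - (p["k_b"] + p["k_nr"] + p["k_rad"]) * ph_star,
        p["k_d"] * ph - (p["k_r"] + p["k_d"]) * ph_d,
    ]


def simulate(y0, t_end=20.0, dt=0.01):
    """Integrate with classical RK4 and return the list of visited states."""
    y = list(y0)
    trajectory = [y]
    for _ in range(int(round(t_end / dt))):
        k1 = pepper_hbc_rhs(y)
        k2 = pepper_hbc_rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
        k3 = pepper_hbc_rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
        k4 = pepper_hbc_rhs([yi + dt * ki for yi, ki in zip(y, k3)])
        y = [yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        trajectory.append(y)
    return trajectory


# Single dye addition: no Pepper yet, 10 units of free HBC, no complexes.
trajectory = simulate([0.0, 10.0, 0.0, 0.0, 0.0])
total_hbc = [s[1] + s[2] + s[3] + s[4] for s in trajectory]
fluorescence = [GAMMA_STAR * s[3] for s in trajectory]
print("peak fluorescence (a.u.):", round(max(fluorescence), 4))
```

Summing the HBC-containing species shows that, with no replenishment, total HBC can only decrease in this formulation, because the decay terms of PH* and the \(k_{r}\) term of \(PH_{d}\) act as sinks.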
c) Fluorescence Excitation:
The fluorescence intensity \(I_{fluor}\) is directly dependent on the concentration of the excited-state HBC-Pepper complex (PH∗) and its radiative decay behavior.
\[
I_{fluor} = \gamma \cdot \left( \Phi_{bound} \cdot \left[ PH^{*} \right]_{ss} + \Phi_{leak} \right) \approx \gamma^{*} \cdot \left[ PH^{*} \right]_{ss}
\]
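Here, \(\gamma\) represents the instrument coefficient and \(\Phi_{leak}\) indicates the background fluorescence quantum yield. When the background term is negligible, the two constants combine into \(\gamma^{*} = \gamma \cdot \Phi_{bound}\).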
As shown in Figure 9, the Pepper RNA concentration rises and then falls within the first few minutes, while the total HBC concentration decreases continuously. The PH complex peaks quickly and then declines; the fluorescent PH* state stays low but stable, and the fluorescence intensity decreases over time. At steady state, free Pepper is extremely low and PH* is 0.007105 nM, which corresponds to a fluorescence intensity of 0.428 a.u. This indicates that, after Pepper and HBC bind, fluorescence depends on the PH* conformation and decays as the components are degraded.
3. Population Dynamics Model
To quantify the yield of D-Psicose (Psicose) in large-scale production, cell growth, substrate metabolism, and enzyme kinetics are coupled in a population-level model.
3.1 Error Accumulation
During enzyme expression (especially in long-term fermentation or continuous passage), DNA replication errors and transcription errors can lead to:
- Mutation in the enzyme coding sequence → Change in the amino acid sequence → Decrease in catalytic efficiency \(\frac{k_{cat}}{k_{M}}\);
- The rate of error accumulation is positively correlated with the error rate of replication/transcription, sequence length, and expression duration;
- In directed evolution, it is necessary to quantify the impact of errors on the stability of production.
Enzyme expression kinetics model:
\[
\frac{d[DTE]}{dt} = k_{f}[M]^{2} - k_{r}[DTE] - \delta_{D}[DTE]
\]
\[
\frac{d\left[ DTE_{AB} \right]}{dt} = k_{on}[A][B] - k_{off}\left[ DTE_{AB} \right] - \delta_{AB}\left[ DTE_{AB} \right]
\]
The population model also accounts for the assembly rate. Homodimer formation is usually a spontaneous process, whereas a heterodimer consists of two different subunits, which may make assembly more complex. We therefore introduce an assembly efficiency:
\[
\eta_{assembly} = \frac{k_{on} \cdot k_{\min}}{k_{on} + k_{\deg}}\quad\left( k_{\min} = \min\left( \beta_{A},\beta_{B} \right),\ k_{\deg} = \max\left( \delta_{A},\delta_{B} \right) \right)
\]
Here, \(\eta_{assembly}\) represents the assembly efficiency with which the A and B chains form a dimer. It is determined by the binding rate \(k_{on}\), the limiting transcription rate \(k_{\min}\) (the smaller of \(\beta_{A}\) and \(\beta_{B}\)), and the maximum degradation rate \(k_{\deg}\) (the larger of \(\delta_{A}\) and \(\delta_{B}\)).
Finally, the steady-state expression of the homodimeric DTE enzyme is compared with that of the designed heterodimeric DTE enzyme:
\[
[DTE]_{ss} = \frac{\beta_{DTE}k_{TL}}{2\delta_{M}\delta_{D}}\quad vs\quad\left[ DTE_{AB} \right]_{ss} = \frac{\eta_{assembly} \cdot k_{\min} \cdot k_{TL,avg}}{\delta_{AB}}
\]
Here, "ss" indicates that the enzyme is at steady state, \(k_{TL,avg}\) is the average translation rate constant of the A chain and the B chain, and \(\eta_{assembly}\) is the assembly efficiency of the A and B chains. The other parameters are defined in the kinetic model of DTE enzyme expression.
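The assembly efficiency and the heterodimer steady-state expression are simple enough to evaluate directly. The sketch below is a direct transcription of the two formulas; the function names are illustrative, and the numbers in the example call are hypothetical.

```python
def assembly_efficiency(k_on, beta_a, beta_b, delta_a, delta_b):
    """eta_assembly = k_on * k_min / (k_on + k_deg)."""
    k_min = min(beta_a, beta_b)    # limiting transcription rate
    k_deg = max(delta_a, delta_b)  # fastest subunit degradation rate
    return k_on * k_min / (k_on + k_deg)


def heterodimer_steady_state(eta, k_min, k_tl_avg, delta_ab):
    """[DTE_AB]_ss = eta_assembly * k_min * k_TL,avg / delta_AB."""
    return eta * k_min * k_tl_avg / delta_ab


# Hypothetical values, for illustration only.
eta = assembly_efficiency(k_on=1.0, beta_a=2.0, beta_b=3.0, delta_a=0.5, delta_b=1.0)
dte_ab_ss = heterodimer_steady_state(eta, k_min=2.0, k_tl_avg=1.5, delta_ab=0.5)
print("eta_assembly:", eta)
print("[DTE_AB]_ss:", dte_ab_ss)
```

Because \(k_{\min}\) and \(k_{\deg}\) pick out the slower-transcribed and faster-degraded chain, raising the expression of the already abundant chain leaves both quantities unchanged in this formulation.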
3.2 Source of Error:
i. Replication error rate:
\[
repRat(t) = \left( s_{repRat} \right)^{len_{DNA}}
\]
ii. Transcription error rate:
\[
trsRat(t) = \left( s_{trsRat} \right)^{len_{mRNA}}
\]
\(s_{repRat} \approx 10^{-8}\): the replication error rate per base; \(len_{DNA}\): the base length of the DTE gene;
\(s_{trsRat} \approx 10^{-4}\): the transcription error rate per base; \(len_{mRNA}\): the base length of the mRNA.
iii. The total error rate is the product of the replication error rate and the transcription error rate:
\[
errRat(t) = repRat(t) \bullet trsRat(t)
\]
3.3 Cell population growth kinetics
a) Logistic Equation:
\[
\frac{d[N]}{dt} = rN\left( 1 - \frac{N}{K} \right)
\]
Here, N denotes the density of Escherichia coli (cells/L), r denotes the maximum growth rate (h⁻¹) and K denotes the environmental carrying capacity or the maximum cell density, representing the upper limit of cell growth.
b) Correction for Cumulative Error-Induced Death
The rate at which each cell acquires inactivating mutations per unit time is:
\[
\lambda_{\text{mut}} = \mu_{\text{mut}} \cdot r_{eff} \cdot len_{DNA}
\]
\(\mu_{\text{mut}}\) represents the mutation rate per unit time and per unit length.
The mutation-induced death rate is taken to be \(\mu_{\text{death}} = \lambda_{\text{mut}}\).
The population growth equation is then corrected to include error accumulation:
\[
\frac{d[N]}{dt} = rN\left( 1 - \frac{N}{K} \right) - \mu_{\text{death}}N
\]
c) Increasing Metabolic Burden
Growth rate correction (taking into account metabolic burden):
\[
r_{eff} = \frac{r_{\max}}{\left( 1 + \beta_{burden} \bullet \frac{\left[ DTE_{func} \right]}{\left[ DTE_{ref} \right]} \right)} \bullet \left( 1 - \beta_{os} \bullet \frac{[ Psicose]}{[ Psicose] + K_{os}} \right)
\]
Here:
- \(r_{eff}\): the corrected effective growth rate, which accounts for both the metabolic burden and the osmotic pressure burden.
- \(r_{\max}\): the maximum growth rate, i.e. the theoretical growth rate of the cells in the absence of limiting factors.
- \(\beta_{burden}\): the metabolic burden coefficient, which describes the negative impact of expressing DTE on cell growth.
- \(\frac{\left[ DTE_{func} \right]}{\left[ DTE_{ref} \right]}\): the measure of metabolic burden, given by the ratio of the current DTE concentration to the reference concentration \(DTE_{ref}\).
- \(\beta_{os}\): the osmotic pressure burden coefficient, which describes the osmotic effect on the cells caused by the accumulation of products such as Psicose.
- \(\frac{[ Psicose]}{[ Psicose] + K_{os}}\): the saturating dependence of growth inhibition on the Psicose concentration.
- \(K_{os}\): the half-saturation constant of the osmotic effect. When the Psicose concentration equals \(K_{os}\), the inhibition is half of its maximum, and it saturates at higher concentrations.
The cell growth kinetics model was ultimately revised to:
\[
\frac{d[N]}{dt} = r_{eff} \cdot N\left( 1 - \frac{N}{K} \right) - \mu_{\text{death}}N
\]
In the figure, the population density of the wild-type Escherichia coli reaches 4.96×10⁹ cells/L and follows an S-shaped growth curve, entering the rapid proliferation phase at 10-15 hours. The mutant strain reaches a population density of only 1.95×10⁷ cells/L, and its growth is almost stagnant. The total error rate rises from 0 to 37% within the first 5 hours and then remains stable, reflecting the rapid accumulation of replication and transcription errors. The effective growth rate decreases gradually from the initial 0.50 h⁻¹ to a steady state of approximately 0.45 h⁻¹. The growth of the mutant strain is strongly inhibited by error accumulation and metabolic burden, which illustrates the negative impact of errors on population proliferation.
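The revised growth equation can be checked against its analytical steady state: for constant \(r_{eff}\) and \(\mu_{\text{death}}\) with \(r_{eff} > \mu_{\text{death}}\), setting \(d[N]/dt = 0\) gives \(N^{*} = K\left( 1 - \mu_{\text{death}}/r_{eff} \right)\). The sketch below holds the DTE ratio and Psicose level fixed so that \(r_{eff}\) is constant, which is a simplification of the coupled model. All numerical values are hypothetical.

```python
def effective_growth_rate(r_max, beta_burden, dte_ratio, beta_os, psicose, k_os):
    """r_eff with metabolic burden and osmotic burden corrections."""
    burden = 1.0 + beta_burden * dte_ratio
    osmotic = 1.0 - beta_os * psicose / (psicose + k_os)
    return r_max / burden * osmotic


def simulate_population(n0, r_eff, mu_death, capacity, t_end, dt):
    """Forward-Euler integration of dN/dt = r_eff*N*(1 - N/K) - mu_death*N."""
    n = n0
    for _ in range(int(round(t_end / dt))):
        n += dt * (r_eff * n * (1.0 - n / capacity) - mu_death * n)
    return n


# Hypothetical inputs (illustration only).
r_eff = effective_growth_rate(r_max=0.5, beta_burden=0.1, dte_ratio=1.0,
                              beta_os=0.02, psicose=5.0, k_os=5.0)
mu_mut, len_dna = 1e-5, 900          # assumed mutation rate and gene length
mu_death = mu_mut * r_eff * len_dna  # lambda_mut, used as the death rate
capacity = 1.0e9

n_final = simulate_population(n0=1.0e6, r_eff=r_eff, mu_death=mu_death,
                              capacity=capacity, t_end=100.0, dt=0.01)
n_star = capacity * (1.0 - mu_death / r_eff)
print(f"simulated N = {n_final:.4e}, analytical N* = {n_star:.4e}")
```

The death term lowers the plateau below the carrying capacity by the fraction \(\mu_{\text{death}}/r_{eff}\), so a strain with a higher mutation load saturates at a lower density even when \(K\) is unchanged.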
3.4 Substrate consumption and product formation
The substance concentration at the population level:
\[
C_{total}(t) = \frac{N(t) \bullet C_{single}(t) \bullet V_{cell}}{V}
\]
This formula describes the concentration of a substance at the population level (for example, in the culture medium or fermentation tank). \(C_{total}(t)\): the population-level concentration of a given metabolite or molecule. \(N(t)\): the number of cells or cell density at time t. \(C_{single}(t)\): the concentration of the substance (such as a product or metabolite) within a single cell. \(V_{cell}\): the volume of a single cell, which is usually treated as a constant that depends on the cell type and growth state.
The population-level concentration \(C_{total}(t)\) is thus determined by the cell number \(N(t)\), the single-cell concentration \(C_{single}(t)\), and the cell volume \(V_{cell}\). When the number of cells increases, or the concentration within a single cell increases, the population-level concentration increases accordingly.
Error accumulation also affects the expression of the DTE enzyme, and replication errors cause the proportion of functional enzyme to decay exponentially. For the homodimeric background DTE enzyme and the designed heterodimeric DTE enzyme, the functional enzyme in the cell population follows:
\[
\frac{d\left[ \text{DTE}_{\text{func}} \right]}{dt} = k_{f}[M]^{2} - \delta_{D}\left[ \text{DTE}_{\text{func}} \right] - \lambda_{\text{mut}}\left[ \text{DTE}_{\text{func}} \right]
\]
\[
\frac{d\left[ \text{DTE}_{\text{func,AB}} \right]}{dt} = \eta_{\text{assembly}}k_{TL}\left[ \text{mRNA}_{DTE} \right] - \delta_{D}\left[ \text{DTE}_{\text{func,AB}} \right] - \lambda_{\text{mut}}\left[ \text{DTE}_{\text{func,AB}} \right]
\]
\[
f_{func}(t) = e^{- \lambda_{\text{mut}} \cdot t}(1 - \alpha) + \alpha
\]
\[
k_{cat}^{\text{eff}}(t) = k_{cat}^{wt} \cdot \ f_{func}(t)\left( 1 - \alpha \cdot (1 - f_{func}(t)) \right)
\]
\[
\frac{d\left[ \text{DTE}_{\text{inact}} \right]}{dt} = \lambda_{\text{mut}}\left[ \text{DTE}_{\text{func}} \right] - \gamma\left[ {DTE}_{inact} \right]
\]
Here, \(k_{cat}^{\text{eff}}(t)\) represents the effective catalytic constant, which is the enzyme catalytic rate constant \(k_{cat}\) corrected for the inactivated state of the enzyme. \(\alpha\) sets the long-time limit of the functional fraction: since \(f_{func}(t) \rightarrow \alpha\) as t approaches infinity, \(\alpha\) is the fraction of enzyme that remains functional at steady state. \(\gamma\) is the rate constant for the repair or clearance of inactivated enzymes.
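The two closed-form expressions for \(f_{func}(t)\) and \(k_{cat}^{\text{eff}}(t)\) can be evaluated directly to see their limits: \(f_{func}(0) = 1\) and \(f_{func}(t) \rightarrow \alpha\) for large t. The sketch below is a plain transcription of the formulas; the values of \(\lambda_{\text{mut}}\), \(\alpha\) and \(k_{cat}^{wt}\) are hypothetical.

```python
import math


def functional_fraction(t, lam_mut, alpha):
    """f_func(t) = exp(-lambda_mut * t) * (1 - alpha) + alpha."""
    return math.exp(-lam_mut * t) * (1.0 - alpha) + alpha


def effective_kcat(t, kcat_wt, lam_mut, alpha):
    """k_cat_eff(t) = k_cat_wt * f_func(t) * (1 - alpha * (1 - f_func(t)))."""
    f = functional_fraction(t, lam_mut, alpha)
    return kcat_wt * f * (1.0 - alpha * (1.0 - f))


# Hypothetical values, for illustration only.
LAM_MUT, ALPHA, KCAT_WT = 0.1, 0.05, 100.0
for hours in (0.0, 10.0, 50.0, 500.0):
    f = functional_fraction(hours, LAM_MUT, ALPHA)
    k = effective_kcat(hours, KCAT_WT, LAM_MUT, ALPHA)
    print(f"t={hours:6.1f} h  f_func={f:.4f}  k_cat_eff={k:.3f}")
```

At long times the effective catalytic constant settles at \(k_{cat}^{wt} \cdot \alpha\left( 1 - \alpha(1 - \alpha) \right)\), so \(\alpha\) controls the residual activity of the population.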
In Figure 11, DTE mRNA rises rapidly to 2.75 nM within 2 hours and remains stable thereafter. The concentrations of A chain and B chain mRNA are extremely low and show no significant increase, indicating that DTE is mainly expressed in its homodimeric form during population production. The total DTE enzyme concentration rises continuously and reaches a steady state of 2516.72 nM, but the active DTE is only 125.84 nM, because the high steady-state error rate inactivates most of the enzyme. In the dimer distribution, homodimers increase linearly over time while heterodimers remain almost zero, indicating that heterodimerization is inefficient in population production and that the system is dominated by homodimers. The theoretical and effective rates of the enzymatic reaction both increase, but the effective rate is slightly lower than the theoretical rate. This reflects the low proportion of active DTE and the inefficient heterodimer assembly, which together limit the reaction efficiency and keep the actual enzymatic activity slightly below the theoretical expectation.
For substrate (D-fructose) consumption and product formation, the model uses:
a) Extracellular substrate dynamics:
\[
\frac{d\left[ S_{out} \right]}{dt} = - \frac{1}{Y_{X/S}} \cdot \frac{d[N]}{dt} \cdot \frac{1}{V} - \gamma_{F} \cdot (\left[ S_{out} \right] - \left[ S_{in} \right]) \cdot \frac{N \bullet V_{cell}}{V}
\]
b) Intracellular substrate dynamics:
\[
\frac{d\left[ S_{in} \right]}{dt} = \gamma_{F} \cdot (\left[ S_{out} \right] - \left[ S_{in} \right]) - \frac{k_{cat}^{\text{eff}}\left[ \text{DTE}_{\text{func}} \right]\left[ S_{in} \right]}{k_{M} + \left[ S_{in} \right]} - \mu \cdot \left[ S_{in} \right]
\]
c) Product formation dynamics:
\[
\frac{d\left[ P_{out} \right]}{dt} = \eta \cdot \frac{k_{cat}^{\text{eff}}\left[ \text{DTE}_{\text{func}} \right]\left[ S_{in} \right]}{k_{M} + \left[ S_{in} \right]} \cdot N \bullet V_{cell} - \delta_{P}[P_{out}] + \kappa([P_{in}] - [P_{out}])
\]
d) Intracellular products dynamics:
\[
\frac{d\left[ P_{in} \right]}{dt} = \frac{k_{cat}^{\text{eff}}\left[ \text{DTE}_{\text{func}} \right]\left[ S_{in} \right]}{k_{M} + \left[ S_{in} \right]} - \kappa([P_{in}] - [P_{out}]) - \mu \bullet [P_{in}]
\]
Here, \(\left[ S_{out} \right]\) is the extracellular D-fructose concentration (mM) and \(\left[ S_{in} \right]\) is the intracellular D-fructose concentration (mM); \(\left[ P_{out} \right]\) is the extracellular D-Psicose concentration (mM) and \(\left[ P_{in} \right]\) is the intracellular D-Psicose concentration (mM); \(Y_{X\text{/}S}\) is the yield coefficient of the cell on the substrate (g cell/g substrate); \(\eta\) is the product synthesis efficiency (including transmembrane transport loss); \(\delta_{P}\) is the degradation rate of the product.
\(V\): the culture volume; \(V_{cell}\): the volume of a single cell; \(\gamma_{F}\): the transmembrane diffusion coefficient. \(\mu = \frac{1}{N}\frac{dN}{dt}\) is the specific growth rate. \(\kappa\): the transmembrane secretion rate constant.
In the left panel, extracellular D-fructose is maintained at a stable level of 30 mM, while intracellular fructose rises rapidly from 0 to approximately 6 mM and then increases slowly. In the right panel, extracellular D-Psicose reaches a maximum of 10.31 mM, and the intracellular level reaches approximately 6.5 mM before decreasing slightly. The conversion rate of D-fructose is 92.55%, indicating that fructose is efficiently converted into Psicose by DTE and that the product is mainly secreted outside the cell, in line with a dynamic balance between biosynthesis and secretion.
4. Half-life System Integration
\[
\frac{d[CCI]}{dt} = k_{on} \bullet [PsiR] \bullet [Psicose]^{n} - (k_{off} + \delta_{CCI}) \bullet [CCI]
\]
Half-life of the promoter complex:
\[
t_{\frac{1}{2}} = \frac{\ln 2}{k_{off} + \delta_{CCI}}
\]
4.2 Model of protein thermal stability half-life (DTE enzyme thermal inactivation kinetics)
Add temperature dependence:
\[
\frac{dE}{dt} = k_{syn} - \delta_{D}(T) \bullet E(t)
\]
Temperature-dependent degradation rate:
\[
\delta_{D}(T) = \delta_{D,0} \bullet e^{\left( - \frac{E_{a}}{R}\left( \frac{1}{T} - \frac{1}{T_{0}} \right) \right)}
\]
Protein half-life:
\[
t_{\frac{1}{2}_{protein}} = \frac{\ln 2}{\delta_{D}(T)}
\]
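Here, \(k_{syn}\) represents the synthesis rate of the enzyme and \(E\) is the enzyme concentration. \(\delta_{D,0}\) is the degradation rate at the reference temperature \(T_{0}\), \(E_{a}\) is the activation energy of inactivation, \(R\) is the gas constant, and \(T\) is the absolute temperature.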
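The temperature dependence can be made concrete with a short calculation of \(\delta_{D}(T)\), the protein half-life, and the steady-state enzyme level \(k_{syn}/\delta_{D}(T)\) that follows from setting \(dE/dt = 0\). This is a sketch under assumed inputs: the reference rate, activation energy, reference temperature and synthesis rate below are hypothetical and are not measured values for DTE.

```python
import math

R_GAS = 8.314  # gas constant in J/(mol*K)


def degradation_rate(temp_k, delta_d0=0.1, e_a=50_000.0, t_ref=303.15):
    """Arrhenius-type rate: delta_D0 * exp(-(E_a/R) * (1/T - 1/T0))."""
    return delta_d0 * math.exp(-(e_a / R_GAS) * (1.0 / temp_k - 1.0 / t_ref))


def protein_half_life(temp_k, **kwargs):
    """t_1/2 = ln 2 / delta_D(T)."""
    return math.log(2.0) / degradation_rate(temp_k, **kwargs)


def enzyme_steady_state(k_syn, temp_k, **kwargs):
    """Steady state of dE/dt = k_syn - delta_D(T) * E."""
    return k_syn / degradation_rate(temp_k, **kwargs)


for celsius in (30.0, 40.0, 50.0):
    kelvin = celsius + 273.15
    print(f"{celsius:.0f} C: half-life = {protein_half_life(kelvin):.2f} h, "
          f"E_ss = {enzyme_steady_state(1.0, kelvin):.2f}")
```

With a positive activation energy, raising the temperature above \(T_{0}\) increases \(\delta_{D}(T)\), so both the half-life and the steady-state enzyme level fall.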
4.3 mRNA half-life prediction model
mRNA dynamic change model:
\[
\frac{d[ mRNA] }{dt} = \beta - \delta_{m} \bullet [ mRNA] - k_{dilution} \bullet [ mRNA]
\]
Here, \(k_{dilution}\) represents the dilution effect term caused by cell growth.
mRNA half-life:
\[
t_{\frac{1}{2}_{mRNA}} = \frac{\ln 2}{\delta_{m} + k_{dilution}}
\]
4.4 Fluorescence signal half-life model
The stability of the Pepper-HBC complex shows time-dependent characteristics:
\[
\frac{d[{PH}^{*}]}{dt} = k_{f} \bullet [PH] - (k_{rad} + k_{nr} + \delta_{pepper}) \bullet [{PH}^{*}]
\]
Fluorescence half-life:
\[
t_{\frac{1}{2}_{fluorescence}} = \frac{\ln 2}{k_{rad} + k_{nr} +\delta_{pepper}}
\]
4.5 System integration of half-life model
a) Stability of multi-component systems
Taking into account the half-lives of all components:
\[
t_{\frac{1}{2}_{system}} = \left( \sum_{i}\frac{1}{t_{\frac{1}{2}}^{i}} \right)^{- 1}
\]
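The combination rule above is a harmonic sum, so the system half-life is always shorter than the shortest component half-life. The sketch below applies it to the three wild-type half-lives quoted in the text for Figure 13 (CCI, DTE and mRNA). The fluorescence half-life is not stated numerically and is left out, so the result is only a partial illustration rather than the reported system value.

```python
def system_half_life(half_lives):
    """Combine component half-lives as the inverse of the sum of inverses."""
    return 1.0 / sum(1.0 / t for t in half_lives)


# Wild-type half-lives in hours, as quoted for Figure 13 (fluorescence omitted).
wild_type = {"CCI": 0.2097, "DTE": 7.2309, "mRNA": 0.2544}

t_system = system_half_life(wild_type.values())
print(f"system half-life from three components: {t_system:.4f} h")  # about 0.113 h
```

Because the inverses add, the short-lived CCI complex and mRNA dominate the result, while the long-lived DTE protein contributes very little.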
b) The impact of cumulative errors on the half-life
The corrected half-life is:
\[
t_{\frac{1}{2}_{corrected}} = t_{\frac{1}{2}_{initial}} \bullet (1 - errRat(t) \bullet t)
\]
Here, errRat(t) is the total error rate defined in Section 3.2, which combines the replication error rate and the transcription error rate.
Figure 13 compares the core functional components of the wild-type and mutant systems. The half-life of the mutant CCI (0.1398 hours) is only 66.7% of that of the wild-type (0.2097 hours), the half-life of the mutant DTE (6.3020 hours) is 87.2% of that of the wild-type (7.2309 hours), and the half-life of the mutant mRNA (0.2356 hours) is 92.6% of that of the wild-type (0.2544 hours). All three are shortened to varying degrees, reflecting the reduced stability of the core components caused by the mutation. The fluorescence half-life of the mutant is slightly higher than that of the wild-type, indicating that the stability of the fluorescence-related part improves slightly after the mutation. However, because of the faster decay of the core components, the wild-type system still has better long-term stability.