
From AI-Driven Design to Biosensor-Based Characterization: Systems Modeling of DTE Enzymes Enabled by Pepper-HBC Fluorescent RNA

Content Overview

Because traditional directed evolution cannot optimize DTE quickly enough, the dry lab team used artificial intelligence to carry out a de novo redesign of D-tagatose-3-epimerase (DTE), aiming to discover novel enzyme variants with potential advantages. By integrating LigandMPNN sequence redesign, AlphaFold3 structure prediction, molecular docking, and molecular dynamics simulations, we iterated rapidly at the computational level and then validated the catalytic activity of the designed variants in wet-lab experiments. To our knowledge, this work is the first implementation of AI-driven global sequence redesign for DTE, and it yielded engineered enzymes with novel structural features and functional potential. In addition, building on the screening and characterization of DTE, we developed a systematic model of the Pepper-HBC fluorescent RNA biosensor, comprising the Catalytic Circuit Model, the Characterization Circuit Model, the Population Dynamics Model, and the Half-life System Integration Model, to facilitate testing of the entire system.

Technical Tools

LigandMPNN: https://github.com/dauparas/LigandMPNN
Expasy ProtParam: https://web.expasy.org/protparam/
AlphaFold3: https://alphafoldserver.com/
CB-Dock2: https://cadd.labshare.cn/cb-dock2/php/index.php
DiffDock: https://huggingface.co/spaces/reginabarzilaygroup/DiffDock-Web
SwissDock: https://www.swissdock.ch/
Protein-Ligand Interaction Profiler (PLIP): https://plip-tool.biotec.tu-dresden.de/plip-web/plip/index
GROMACS 2025.2: https://manual.gromacs.org/2025.2

Visualization Tools

PyMOL: https://pymol.org/
VMD: https://www.ks.uiuc.edu/Research/vmd/

Models

The Catalytic Circuit Model
The Characterization Circuit Model
The Population Dynamics Model
The Half-life System Integration Model

1. Design

Objective: To carry out a de novo redesign of the DTE sequence while preserving its catalytic activity.

1.1 Definition of Key Functional Regions

We first analyzed the structure of the wild-type DTE (PDB: 2OU4) to delineate regions that must remain unchanged, and used these as structural constraints for our design strategy (Table 1).

D-Tagatose-3-epimerase (DTE) catalyzes the C3 epimerization of various keto-sugars and is widely exploited for rare-sugar biosynthesis. In this study, the DTE from Pseudomonas cichorii (PDB: 2OU4) was chosen as the design template. The enzyme adopts a canonical metal-dependent (β/α)₈-barrel fold; its active site contains a Mn²⁺ ion coordinated by four strictly conserved residues (Glu152, Asp185, His211, Glu246) that mediate substrate deprotonation and reprotonation. In addition, the native 2OU4 structure forms a stable homodimer; the catalytic pocket is located at the dimer interface and displays an open, promiscuous substrate-binding character, enabling efficient C3 epimerization of both D-tagatose and D-fructose.

DTE_2OU4 sequence:

MNKVGMFYTYWSTEWMVDFPATAKRIAGLGFDLMEISLGEFHNLSDAKKRELKAVADDLGLTVMCCIGLKSEYDFASPDKSVRDAGTEYVKRLLDDCHLLGAPVFAGLTFCAWPQSPPLDMKDKRPYVDRAIESVRRVIKVAEDYGIIYALEVVNRFEQWLCNDAKEAIAFADAVDSPACKVQLDTFHMNIEETSFRDAILACKGKMGHFHLGEANRLPPGEGRLPWDEIFGALKEIGYDGTIVMEPFMRKGGSVSRAVGVWRDMSNGATDEEMDERARRSLQFVRDKLA

DOI: 10.1016/j.jmb.2007.09.033; 10.1002/cbic.201402620

Table 1. Key Functional Regions and Design Strategy for DTE

Region Category	Residue ID (A)	Mutability
Catalytic Core	A152/A185/A211/A246	Fixed
Dimer Interface	A103-263 (key residues)	Fixed
Hydrophobic Core	A30/50/100/120/150/180/200/230/250/280	Mutable
Surface Charge	A10/20/60/90/130/170/220/260/290	Mutable
Remaining 90%	All remaining residues	Mutable

1.2 Sequence Redesign (LigandMPNN)

Using 2OU4 as the template, we performed structure-based sequence prediction with LigandMPNN. The Mn²⁺-coordinating residues A152/A185/A211/A246 and key dimer-interface positions were declared immutable; the remaining ~90% of residues were designated redesignable. Residue-specific biases were applied to favor Val/Ile/Leu in the hydrophobic core and Asp/Glu/Lys/Arg on surface-charge patches, while cysteine was excluded globally to avoid introducing free cysteines prone to oxidation. Symmetry constraints enforced synchronous optimization of both chains, and ligand-aware side-chain packing prevented steric clashes with the metal ion or substrate. After iterative refinement within the Renew framework, 32×2 sequences were sampled and rapidly filtered with Expasy ProtParam, yielding the two variants predicted to be most stable, DTE-1 and DTE-2.
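The constraints described above can be sketched as a LigandMPNN command line. The snippet below only assembles the argument list and does not run anything. The flag names are the ones documented in the LigandMPNN repository README as far as we recall them, so they should be checked against the installed version. The file paths, the reading of "32×2" as a batch size of 32 with 2 batches, and the single global bias (instead of the per-residue biases used in the actual design) are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence


def build_ligandmpnn_command(
    pdb_path: str,
    out_folder: str,
    fixed_residues: Sequence[str],
    omit_aa: str = "C",                       # exclude cysteine globally
    bias_aa: Optional[Dict[str, float]] = None,
    batch_size: int = 32,                     # assumed reading of "32x2"
    number_of_batches: int = 2,
    homo_oligomer: bool = True,               # tie both chains of the dimer
    seed: int = 111,
) -> List[str]:
    """Assemble (but do not execute) a LigandMPNN run.py invocation."""
    cmd = [
        "python", "run.py",
        "--model_type", "ligand_mpnn",
        "--seed", str(seed),
        "--pdb_path", pdb_path,
        "--out_folder", out_folder,
        "--fixed_residues", " ".join(fixed_residues),
        "--omit_AA", omit_aa,
        "--batch_size", str(batch_size),
        "--number_of_batches", str(number_of_batches),
    ]
    if bias_aa:
        # Global bias string such as "V:1.0,I:1.0,L:1.0"
        cmd += ["--bias_AA", ",".join(f"{aa}:{w}" for aa, w in bias_aa.items())]
    if homo_oligomer:
        cmd += ["--homo_oligomer", "1"]
    return cmd


if __name__ == "__main__":
    command = build_ligandmpnn_command(
        pdb_path="./inputs/2OU4.pdb",          # hypothetical path
        out_folder="./outputs/dte_redesign",   # hypothetical path
        fixed_residues=["A152", "A185", "A211", "A246"],
        bias_aa={"V": 1.0, "I": 1.0, "L": 1.0},  # illustrative weights
    )
    print(" ".join(command))
```

Building the command as a list keeps the fixed-residue string intact when it is later passed to a process launcher, and makes the design constraints easy to review or version alongside the results.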

2. Build

Objective: To computationally predict and evaluate the designed proteins before wet-lab experiments.

2.1 Physicochemical Property Prediction (Expasy ProtParam)
We used the Expasy ProtParam web server to rapidly estimate basic theoretical parameters (isoelectric point, molecular weight, instability index, aliphatic index, and grand average of hydropathicity, GRAVY) for every newly generated sequence. This allowed us to rationally discard variants whose predicted instability index was excessively high (Table 2). Relative to the wild type, DTE-1 and DTE-2 show a roughly 30-31% decrease in instability index (from 50.98 to 35.51 and 34.94, which moves them below the value of 40 that ProtParam uses to classify a protein as stable) and a 27-31% increase in aliphatic index (from 79.55 to 101.35 and 104.35), while pI and molecular weight remain virtually unchanged. These metrics suggest that the mutants may possess improved solubility and thermostability, providing a promising scaffold for subsequent functional characterization.

Table 2. Predicted physicochemical parameters of the designed enzymes

Protein id	Theoretical pI	Molecular weight (Da)	Instability index	Aliphatic index	Grand average of hydropathicity (GRAVY)
2OU4	5.21	65081.53	50.98	79.55	-0.234
DTE-1	5.21	64447.74	35.51	101.35	-0.079
DTE-2	5.16	64448.91	34.94	104.35	-0.041
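Two of the parameters in Table 2 are simple enough to recompute locally, which is convenient when screening many sequences. The sketch below uses the published definitions that ProtParam documents: the aliphatic index of Ikai (mole percent of Ala plus 2.9 times Val plus 3.9 times Ile and Leu) and GRAVY as the mean Kyte-Doolittle hydropathy. The function names are ours, and the helper `percent_change` is included only to reproduce the relative changes quoted from Table 2.

```python
# Kyte-Doolittle hydropathy values used for GRAVY
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    seq = sequence.upper()
    n = len(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value over all residues."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def percent_change(new: float, reference: float) -> float:
    """Relative change of `new` with respect to `reference`, in percent."""
    return 100.0 * (new - reference) / reference


if __name__ == "__main__":
    # Relative changes of DTE-1 versus 2OU4, using the values in Table 2
    print(f"instability index: {percent_change(35.51, 50.98):.1f}%")
    print(f"aliphatic index:   {percent_change(101.35, 79.55):.1f}%")
```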

2.2 Three-Dimensional Structure Prediction (AlphaFold3)

AlphaFold3 was used to verify whether the redesigned sequences fold correctly and to assess the spatial orientation of the catalytic residues. After LigandMPNN sequence generation and initial filtering, the top variants DTE-1 and DTE-2 were submitted to AlphaFold3 for rapid structure prediction. The resulting models were inspected in PyMOL and superimposed on the 2OU4 template, focusing on the active-site pocket.

Structural analysis:

Confidence: AlphaFold3 produced high-confidence models for the redesigned enzymes, with ipTM of 0.89-0.90, pTM of 0.92, and ranking scores of 0.90-0.91 (Table 3).

Global superposition: RMSD between the predicted structures and 2OU4 is < 0.56 Å, confirming that the redesigned sequences preserve the native backbone fold (Fig. 1-A1, A2).

2OU4 original pocket:

The native pocket adopts a classic "bowl-like" open conformation (volume 805 Å³). The hydrophobic framework accommodates D-fructose, but the C4-C6 alkyl chain remains solvent-exposed, giving a hydrophobic contact ratio of only ~58%. This loose recognition facilitates substrate entry and exit and reflects the enzyme's promiscuous catalytic profile (Fig. 1-B1).

DTE-1:

While retaining the overall opening and scaffold, DTE-1 introduces subtle side-wall adjustments. Val152 swings inward, shrinking the pocket volume to 501 Å³ and increasing the hydrophobic contact ratio to 65 %. The solvent channel is preserved, representing a refined version of the original open pocket (Fig. 1-B2).

DTE-2:

Key mutations Leu183→Phe and Val181→Tyr install aromatic side chains at the pocket entrance. Together with the pre-existing Phe208, they form a closed "hydrophobic lid". Pocket volume drops to 448 Å³, the entrance diameter narrows to 4.6 Å, and the hydrophobic burial fraction jumps to 82 %. The binding site is thereby transformed from an open "bowl" into a fully enclosed "box" (Fig. 1-B3).

Table 3. Confidence of the predicted structures for the redesigned enzymes

Protein id	ipTM	pTM	ranking_score	RMSD vs. 2OU4 (Å)
2OU4	0.94	0.96	0.94	0.000
DTE-1	0.90	0.92	0.91	0.546
DTE-2	0.89	0.92	0.90	0.555
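The RMSD column in Table 3 comes from a global superposition onto 2OU4. As an illustration of what such a number means, the sketch below computes the minimum RMSD between two matched coordinate sets with the Kabsch algorithm using NumPy. It is not the PyMOL alignment used for the table: it assumes the two arrays already contain the same atoms (for example Cα atoms) in the same order, and the function name is ours.

```python
import numpy as np


def kabsch_rmsd(mobile, reference) -> float:
    """Minimum RMSD between two (N, 3) coordinate sets after optimal
    rigid-body superposition (translation + proper rotation)."""
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(reference, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("both inputs must have shape (N, 3)")

    # Remove the translation by centering both sets on their centroids
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate the mobile set and measure the remaining deviation
    P_rot = Pc @ R.T
    return float(np.sqrt(((P_rot - Qc) ** 2).sum(axis=1).mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(50, 3))
    # A slightly perturbed copy stands in for a predicted model
    model = ref + rng.normal(scale=0.3, size=ref.shape)
    print(f"RMSD after superposition: {kabsch_rmsd(model, ref):.3f}")
```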

Figure 1. Structural comparison and hydrophobic-pocket analysis between the redesigned enzymes and 2OU4

3. Test

Objective: To computationally evaluate the binding affinity and catalytic potential of the designed enzymes toward their substrate.

3.1 Molecular Docking (Docking Tools)

We used the CB-Dock2 web server to dock D-fructose into the structures of DTE-1 and DTE-2. For the reference enzyme 2OU4, CB-Dock2 blind docking first returned five candidate pockets (CurPocket C1-C5); C1 gave the best Vina score (-5.9 kcal mol⁻¹) and was selected as the primary binding site. A template-based refinement was then performed using PDB 2QUM (100% pocket concordance, ligand RMSD 0.23 Å), yielding a final contact score of 64.1.

The same protocol was applied to the two redesigned enzymes. Again, five pockets were sampled. For both DTE-1 and DTE-2 the optimal site was C3, with Vina scores of -5.1 and -5.0 kcal mol⁻¹, respectively. Template refinement with 2QUM (47% pocket concordance) yielded contact scores of 12.4 (DTE-1) and 54.7 (DTE-2). Relative to 2OU4, DTE-1 requires the ligand to shift into the smaller C3 cavity, with a modestly less favorable Vina score. Despite lower pocket similarity, DTE-2 retains most of the hydrogen bonds and hydrophobic clamps required for substrate recognition.

In parallel, DiffDock was used to dock both D-tagatose and D-fructose into the redesigned enzymes for rapid comparison. DTE-1 displayed high confidence for both sugars, whereas DTE-2 showed a clear preference for tagatose, indicating that the DTE-2 scaffold confers higher pairing propensity toward tagatose (Table 4).

Table 4. Summary of molecular docking results

Protein id	CB-Dock2 Vina score (kcal/mol)	CB-Dock2 pocket volume (Å³)	CB-Dock2 contact score	DiffDock D-fructose rank confidence	DiffDock D-tagatose rank confidence
2OU4	-5.9	805	64.1	0.83	0.76
DTE-1	-5.1	501	12.4	0.63	0.71
DTE-2	-5.0	448	54.7	1.34	0.97
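To give the Vina scores in Table 4 a rough physical scale, a score can be read as a binding free energy and converted to an apparent dissociation constant through ΔG = RT ln Kd. This is only an order-of-magnitude illustration: docking scores are empirical estimates, and the temperature of 298.15 K and the function name are assumptions. Under that reading, the three scores correspond to apparent Kd values on the order of 10⁻⁵ to 10⁻⁴ M.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1


def score_to_kd(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Apparent dissociation constant (mol/L) from a binding free energy
    in kcal/mol, using delta_G = R * T * ln(Kd)."""
    return math.exp(delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))


# Vina scores from Table 4 (kcal/mol)
VINA_SCORES = {"2OU4": -5.9, "DTE-1": -5.1, "DTE-2": -5.0}

if __name__ == "__main__":
    for name, score in VINA_SCORES.items():
        print(f"{name}: score {score} kcal/mol, apparent Kd {score_to_kd(score):.2e} M")
```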

3.2 Atomic-interaction analysis (PLIP)

PLIP analysis of atomic interactions shows that, in the wild-type enzyme 2OU4, the substrate D-fructose coordinates the catalytic Mn²⁺ ion directly via its O2 and O3 atoms (coordination distances approximately 2.07-2.53 Å, Tables 5-1 and 5-2) and forms a pre-reaction catalytic conformation with key residues such as Glu152 and Glu246.

In the predicted conformation of DTE-1, D-fructose is anchored within an elongated pocket formed by Ile150-Val152-Gln182-Asp184-Gly207-Phe208 via six geometrically favorable hydrogen bonds (Figure 2). This pocket is spatially separated from the catalytic center (Tables 5-3, 5-6). The hydroxyl groups of D-fructose (e.g., O3) form hydrogen bonds only with the protein backbone or side chains and do not coordinate Mn²⁺. Furthermore, the C3 atom is positioned farther from Glu246, which is responsible for proton abstraction (distance increased from 2.3 Å to 2.59 Å, Tables 5-4, 5-5). This ultra-tight binding may significantly enhance the enzyme's ability to capture the substrate, first fixing it in a secondary pocket before releasing it to the catalytic center, though it may also imply a higher energy barrier hindering proper substrate pairing.

Compared with the wild-type 2OU4, DTE-2 retains the complete tetrahedral Mn²⁺ coordination core formed by Glu152, Asp185, His211, and Glu246. However, the regularity of its coordination geometry may be somewhat compromised: the Glu246-Mn²⁺ coordination distance increases to 2.82 Å, longer than in DTE-1 (2.59 Å) and the wild-type enzyme (2.33-2.53 Å), which may indicate suboptimal Mn²⁺ coordination in DTE-2 (Tables 5-7, 5-8).

Both DTE-1 and DTE-2 form fewer hydrogen bonds with the substrate than 2OU4 (six versus eight; Tables 5-3, 5-6, 5-9). Nevertheless, this streamlined yet potent hydrogen-bonding network, combined with the hydrophobic interactions contributed by the introduced aromatic residues, may help the substrate bind in a conformation that more closely resembles the transition state. Although this sacrifices certain quantitative binding metrics, it may create a more favorable geometric prerequisite for high catalytic efficiency.

Table 5. Protein Structure Analysis

Table 5-1. Metal coordination in model MN-A-1001 (2OU4)

Index	Residue	Chain	Metal	Target	Distance (Å)	Location
1	GLU152	A	2290	1202	2.17	protein.sidechain
2	ASP185	A	2290	1458	2.07	protein.sidechain
3	HIS211	A	2290	1666	2.37	protein.sidechain
4	GLU246	A	2290	1937	2.33	protein.sidechain

Table 5-2. Metal coordination in model MN-D-1004 (2OU4)

Index	Residue	Chain	Metal	Target	Distance (Å)	Location
1	GLU152	D	4705	3617	2.21	protein.sidechain
2	ASP185	D	4705	3873	2.10	protein.sidechain
3	HIS211	D	4705	4081	2.48	protein.sidechain
4	GLU246	D	4705	4353	2.53	protein.sidechain

Table 5-3. Hydrogen-bond network of UNL-Z-999 (D-fructose) in 2OU4

Index	Residue	Chain	D-A (Å)	H-A (Å)	Angle	Protein donor	Description
1	LEU151	D	2.67	2.06	119.1	NO	Backbone NH→O3
2	VAL153	D	3.79	2.99	139.7	YES	Backbone NH→O3
3	VAL153	D	3.95	3.20	134.9	NO	Backbone NH→O2
4	GLN183	D	2.80	2.23	115.7	YES	Backbone NH→O3
5	LEU184	D	4.03	3.44	120.7	NO	Backbone NH→O2
6	ASP185	D	3.03	2.21	138.9	YES	Backbone NH→O3
7	GLY208	D	2.41	1.47	157.1	YES	Strong H-bond
8	HIS209	D	3.02	2.26	132.6	YES	Backbone NH→O3

Table 5-4. Metal coordination in model MN-A-1001 (DTE-1)

Index	Residue	Chain	Metal	Target	Distance	Location
1	GLU152	A	25	O	2.39	protein.sidechain
2	ASP185	A	25	O	2.11	protein.sidechain
3	HIS211	A	25	N	2.27	protein.sidechain
4	GLU246	A	25	O	2.59	protein.sidechain

Table 5-5. Metal coordination in model MN-D-1004 (DTE-1)

Index	Residue	Chain	Metal	Target	Distance	Location
1	GLU151	B	26	O	2.31	protein.sidechain
2	ASP184	B	26	O	2.16	protein.sidechain
3	GLU210	B	26	O	2.03	protein.sidechain
4	GLU245	B	26	O	2.26	protein.sidechain

Table 5-6. Hydrogen-bond network of UNL-Z-999 (D-fructose) in DTE-1

Index	Residue	Chain	D-A (Å)	H-A (Å)	Angle	Protein donor	Description
1	ILE150	B	2.94	2.35	118.0	NO	Backbone→Ligand
2	VAL152	B	3.73	2.81	155.5	YES	Backbone NH→O3
3	GLN182	B	2.96	2.25	128.0	YES	Backbone NH→O3
4	ASP184	B	3.31	2.44	146.6	YES	Backbone NH→O3
5	GLY207	B	2.57	1.62	160.8	YES	Strong H-bond
6	PHE208	B	3.25	2.54	128.9	YES	Backbone NH→O3

Table 5-7. Metal coordination in model MN-A-1001 (DTE-2)

Index	Residue	Chain	Metal	Target	Distance (Å)	Location
1	GLU152	A	25	1182	2.30	protein.sidechain
2	ASP185	A	25	1433	2.36	protein.sidechain
3	HIS211	A	25	1646	2.39	protein.sidechain
4	GLU246	A	25	1921	2.82	protein.sidechain

Table 5-8. Metal coordination in model MN-D-1004 (DTE-2)

Index	Residue	Chain	Metal	Target	Distance	Location
1	GLU151	B	26	O	2.38	protein.sidechain
2	ASP184	B	26	O	2.32	protein.sidechain
3	GLU210	B	26	O	2.42	protein.sidechain
4	GLU245	B	26	O	2.44	protein.sidechain

Table 5-9. Hydrogen-bond network of UNL-Z-999 (D-fructose) in DTE-2

Index	Residue	Chain	D-A (Å)	H-A (Å)	Angle	Protein donor	Description
1	ILE150	B	2.39	2.89	121.3	NO	Backbone→O3
2	VAL152	B	2.89	2.89	161.6	YES	Backbone NH→O3
3	GLN182	B	2.41	2.98	116.7	YES	Backbone NH→O2
4	ASP184	B	2.47	3.32	144.6	YES	Backbone NH→O3
5	GLY207	B	1.48	2.44	163.1	YES	Strong binding
6	PHE208	B	2.39	3.16	134.8	YES	Backbone NH→O3
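The comparison of Mn²⁺ coordination geometry can be made explicit with a few lines of code. The sketch below tabulates the chain A coordination distances and reports, for each enzyme, the mean distance and the most distant ligand residue. The assignment of Tables 5-1, 5-4, and 5-7 to 2OU4, DTE-1, and DTE-2 follows the discussion in Section 3.2; the function name and output layout are ours.

```python
from statistics import mean
from typing import Dict

# Mn(II) coordination distances in angstroms for chain A,
# taken from Tables 5-1 (2OU4), 5-4 (DTE-1), and 5-7 (DTE-2)
MN_DISTANCES: Dict[str, Dict[str, float]] = {
    "2OU4": {"GLU152": 2.17, "ASP185": 2.07, "HIS211": 2.37, "GLU246": 2.33},
    "DTE-1": {"GLU152": 2.39, "ASP185": 2.11, "HIS211": 2.27, "GLU246": 2.59},
    "DTE-2": {"GLU152": 2.30, "ASP185": 2.36, "HIS211": 2.39, "GLU246": 2.82},
}


def summarize_coordination(distances: Dict[str, float]) -> Dict[str, object]:
    """Return the mean distance and the most distant coordinating residue."""
    longest_residue = max(distances, key=distances.get)
    return {
        "mean": mean(distances.values()),
        "longest_residue": longest_residue,
        "longest": distances[longest_residue],
    }


if __name__ == "__main__":
    for enzyme, distances in MN_DISTANCES.items():
        s = summarize_coordination(distances)
        print(f"{enzyme}: mean {s['mean']:.2f} A, "
              f"longest {s['longest_residue']} at {s['longest']:.2f} A")
```

In this summary Glu246 is the most distant ligand in both redesigned enzymes, whereas in 2OU4 all four distances stay within 2.07-2.37 Å.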

Figure 2. PLIP-generated schematic of enzyme–ligand interactions

3.3 Molecular Dynamics and Binding Free Energy

Using the CB-Dock2 docking poses as starting structures, we evaluated the stability and binding free energy of the complexes through molecular dynamics (MD) simulations and MM/PBSA calculations. We also docked the substrate D-fructose and the product D-psicose (D-allulose) into each enzyme with SwissDock to further assess binding. For 2OU4, SwissDock gave a slightly lower (more favorable) binding energy for the product D-psicose than for the substrate D-fructose, consistent with 2OU4 catalyzing the fructose-to-psicose conversion reversibly. In contrast, DTE-1 and DTE-2 bind the substrate D-fructose at least as strongly as the product, which suggests that the redesign may have shifted this balance toward substrate binding. MM/PBSA calculations likewise gave lower binding free energies for DTE-1 and DTE-2 than for 2OU4, although all calculated ΔG values were greater than zero. The positive values may be attributed to the absence of Mn²⁺ topology in the simulated system, which leaves only transient binding between the enzyme and the substrate (Table 6).

Binding free energy decomposition analysis (Figure 3) revealed that in 2OU4, the binding energy primarily originates from the catalytic center and surrounding residues (e.g., Glu158, Asp185, Glu192, etc.; Figure 3-1, A1-2). In contrast, the energy contribution hotspots in DTE-1 and DTE-2 shift to residues forming the secondary pocket (e.g., Glu157, Gln182, Arg216, etc.; Figure 3-2/3, B1-2, C1-2), indicating that the enzyme variants may have optimized the ability of the secondary pocket to capture ligands. Notably, Trp112/113 contributed the largest energy, suggesting that this residue may play a critical role in guiding the ligand into the hydrophobic pocket.

MD simulation results showed that the protein backbone RMSD of DTE-1, DTE-2, and 2OU4 all reached stable plateaus in the later stages of the simulation (Figure 4-A), indicating correct overall folding and conformational stability. The RMSD of the catalytic center and protein secondary structures also stabilized in the later stages (Figure 4-B, C, D), demonstrating that the catalytic structures of the enzyme variants remain functional. Further trajectory analysis indicated that the substrate-binding region of DTE-2 exhibits higher rigidity, while DTE-1 shows moderate rigidity, which may confer advantages in structural transitions (reduced RMSF, Figure 4-H). The radius of gyration (Rg) of both DTE-1 and DTE-2 showed an initial increasing trend followed by stabilization, suggesting possible structural expansion of the enzyme variants during the simulation time, with DTE-1 appearing more loosely packed (elevated Rg). The solvent accessible surface area (SASA) of DTE-1 and DTE-2 increased and exhibited continuous fluctuations compared to 2OU4 (Figure 4-G), indicating that under dynamic conditions, the hydrophobic core of the enzyme variants may undergo continuous opening and closing due to rigid structural transitions. This suggests that the overall structures of the new enzymes may be expanding or drifting, making them slightly less stable compared to 2OU4. However, the number of hydrogen bonds between DTE-1/DTE-2 and the substrate remained consistently high, validating the streamlined and potent hydrogen-bonding network observed in the PLIP analysis.
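For readers less familiar with these trajectory metrics, the sketch below shows how the radius of gyration and the per-atom RMSF are defined, using NumPy on plain coordinate arrays. In practice such quantities are usually obtained with the GROMACS analysis tools (for example `gmx gyrate` and `gmx rmsf`); the functions here are illustrative only, and the RMSF version assumes the frames have already been fitted to a common reference so that overall rotation and translation are removed.

```python
import numpy as np


def radius_of_gyration(coords, masses) -> float:
    """Mass-weighted radius of gyration of one frame.
    coords: (n_atoms, 3), masses: (n_atoms,)"""
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    center_of_mass = np.average(coords, axis=0, weights=masses)
    sq_dist = ((coords - center_of_mass) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def rmsf(trajectory) -> np.ndarray:
    """Per-atom root-mean-square fluctuation around the time-averaged position.
    trajectory: (n_frames, n_atoms, 3), assumed to be pre-aligned."""
    traj = np.asarray(trajectory, dtype=float)
    mean_positions = traj.mean(axis=0)
    sq_dev = ((traj - mean_positions) ** 2).sum(axis=2)
    return np.sqrt(sq_dev.mean(axis=0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(100, 3)) * 10.0           # toy "structure"
    frames = base + rng.normal(scale=0.5, size=(200, 100, 3))
    rg_series = [radius_of_gyration(f, np.ones(100)) for f in frames]
    print(f"mean Rg: {np.mean(rg_series):.2f}")
    print(f"mean RMSF: {rmsf(frames).mean():.2f}")
```

A rising Rg series indicates that atoms move away from the center of mass on average, which is the sense in which the text describes DTE-1 as more loosely packed.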

Together, these results indicate that the enzyme variants achieve higher apparent binding affinity (lower MM/PBSA binding free energy) by optimizing hydrophobic packing and a streamlined hydrogen-bonding network. Kinetically, this may facilitate rapid substrate capture and transfer from the secondary pocket to the catalytic active center.

The simulation trajectories were visualized in VMD by extracting one frame every 200 frames from the first 5000 frames (25 frames in total). The results show that the ligand migrates between the two pockets of 2OU4 and the protein dimer interface (yellow isosurface in Figure 5-A), likely because the lack of Mn²⁺ coordination in the system destabilizes binding, which also explains the positive free energy values. Furthermore, the different catalytic cores identified by PLIP were rendered in VMD: the catalytic pocket of 2OU4 was colored pink (Glu152, Asp182, etc.), which partially overlaps with some secondary structures, while the distinct pockets of DTE-1 and DTE-2 were colored yellow (Ile150, Val152, Gln182, etc.). The figures show an increased probability of ligands being adsorbed into the secondary pockets in DTE-1 and DTE-2 (Figure 5-B, C). Moreover, the ligand distribution in DTE-2 (in both secondary and catalytic pockets) is more concentrated, which may indirectly explain its further reduced binding energy. In terms of overall structural distortion, however, 2OU4 remains more compact, while DTE-1 and DTE-2 display more expanded conformations, consistent with the radius of gyration results from the dynamics simulations.

Table 6. Binding free energies calculated by SwissDock and MM/PBSA

Protein id	SwissDock D-fructose (kcal/mol)	SwissDock D-psicose (kcal/mol)	GROMACS MM/PBSA (kcal/mol)
2OU4	-5.210	-5.497	dG=21.816
DTE-1	-6.070	-5.749	dG=17.797
DTE-2	-4.766	-4.752	dG=16.643

Figure 3-1. Per-residue binding energy decomposition (A1) and energetic landscape along the protein sequence (A2) for 2OU4–substrate complex

Figure 3-2. Per-residue binding energy decomposition (B1) and energetic landscape along the protein sequence (B2) for DTE-1–substrate complex

Figure 3-3. Per-residue binding energy decomposition (C1) and energetic landscape along the protein sequence (C2) for DTE-2–substrate complex

Figure 4. Data-processing results from molecular-dynamics simulations of protein–substrate binding

Figure 5. Trajectory view of protein–substrate binding from molecular-dynamics simulation displayed in VMD

3.4 Wet-Lab Validation Results

Based on the fluorescent biosensor system established through wet-lab experiments, we obtained the activity profiles of different DTE enzymes (Figure 6).

The results indicate that, compared to the Wild-Type (WT) group, the newly designed enzyme variants DTE-1 and DTE-2 did not exhibit a significant advantage in fluorescence intensity during the initial reaction phase, and even showed lower activity in the mid-phase. However, in the mid-to-late stages of the reaction, we detected a substantially higher fluorescence intensity for DTE-1 in the system, and DTE-2 also demonstrated favorable performance.

This observation aligns well with the overall trends noted in our dry-lab molecular dynamics simulation trajectories. Specifically, the flexible structure of 2OU4 (WT) likely allows for greater conformational adaptability and overall stability, reflected in its steady increase in fluorescence intensity. In contrast, DTE-1 and DTE-2 appear slightly less stable. Nevertheless, it is noteworthy that the newly designed enzyme variants, despite structural remodeling, still exhibit significant catalytic potential, as evidenced by the high final fluorescence intensity. This corroborates the transient lower binding free energy observed in the MM/PBSA calculations.

This suggests that maintaining catalytic efficiency over extended periods for these variants might require enhanced structural rigidity or a reinforced hydrogen-bond network. Alternatively, substrate binding within the hydrophobic pocket might influence the flexibility of the secondary pocket structure in the variants, potentially facilitating increased substrate access to the catalytic center.

We anticipate designing additional novel enzyme variants to explore the diverse possibilities underlying these activity changes, and conducting further characterization of different variants in wet-lab experiments to solidify our conclusions. This iterative process benefits significantly from the synergistic integration of dry and wet-lab experimental approaches.

Figure 6. Comparison of enzyme activities characterized by wet-lab biosensor assay

4. Learn

Objective: To characterize the enzyme variants through wet-lab experiments, integrate all test results, and summarize the insights.

4.1 Key Conclusions

Through AI-driven global sequence redesign of the DTE enzymes and systematic computational evaluation, we have drawn the following core conclusions:

Sequence Design Strategy is Effective: Subsequent wet-lab experiments confirmed that the enzyme variants exhibit functional activity. This demonstrates that the structure-based sequence redesign strategy (using LigandMPNN) can effectively guide the design of novel enzyme variants. The rational changes in the variants are supported by computational simulation data, aided by various analytical tools for prediction and evaluation.
Structural Remodeling of Enzyme Variants Suggests Potential Catalytic Dynamics: In the DTE enzyme sequence prediction project, balancing the structure of the hydrophobic core and the catalytic core may enable the enzyme to capture ligands more readily. The introduced mutations potentially adjusted hydrophobic preferences, leading to structural remodeling and resulting in more rigid architectures. This demonstrates that under conditions of highly compressed enzyme structures, there is an opportunity to construct numerous novel variants and uncover more potent and stable enzyme scaffolds.
Docking Tools Combined with Dynamics Simulations Enable a Rapid Enzyme Evaluation Pipeline: Utilizing diverse simulation and docking tools can significantly accelerate the prediction and evaluation of enzyme variants. A comprehensive technical workflow, encompassing both static data analysis and dynamic simulation observation, supports further sequence design efforts.

4.2 Iterative Design Recommendations

When adjusting the structural sequence design of DTE enzymes, attention should be paid to the key coordination of core catalytic residues. For instance, optimizing the coordination space of Glu246 is crucial, as this residue might be highly susceptible to interference from hydrophobic structures, leading to diminished catalytic performance.

When optimizing the hydrogen-bond network, the focus should not solely be on the number of hydrogen bonds; balancing the charge distribution at the dimer interface is equally important. Otherwise, the catalytic core structure might become unstable, potentially leading to compensatory changes in secondary structures.

Appropriate introduction of flexible structural elements, such as Glu residues, could be considered. Key flexible structures might enhance the enzyme's binding capability.

Finally, the results from each optimization round should undergo rapid assessment to filter out proteins with incorrect binding poses as efficiently as possible. Furthermore, during MD simulations, incorporating topological parameters for Mn²⁺ and applying positional restraints within the protein might better simulate the authentic catalytic structure, aiding in the identification of superior enzyme variants.

4.3 Summary and Outlook

This in silico study showcases the powerful potential of artificial intelligence in the field of enzyme design. We successfully designed the mutants DTE-1 and DTE-2 based on a full-sequence redesign of the DTE enzyme scaffold and preliminarily investigated the remodeling mechanisms of their structure-function relationships through multi-scale simulations.

Not only have we computationally designed and validated enzyme variants possessing novel structural features and potential functional advantages, but more importantly, we have established a reusable methodology for the rational design of enzymes.

Looking ahead, with continuous advancements in AI models, computational power, and experimental automation, AI-assisted enzyme engineering is poised to break through the bottlenecks of traditional directed evolution, emerging as an efficient, controllable, and predictable paradigm for novel enzyme design. We anticipate that this strategy will contribute to green technology innovation in fields such as synthetic biology.

Screening and Characterization of DTE Based on Pepper-HBC Fluorescent RNA Biosensors

As shown in Figure 1, the plasmid pSB1C3-psiR-PsiA-pepper was constructed as a biosensor for DTE activity based on the fluorescent RNA Pepper-HBC system. The wild-type background DTE, which is a homodimer, was derived from the DTE mutant IDF10-3 of Pseudomonas cichorii (GenBank: AB000361.1), while the designed heterodimeric DTE was obtained from the AI-assisted sequence design and produced by full gene synthesis.

Two recombinant plasmids (homodimer and heterodimer designs) and the characterization plasmid pET28-Porin-PsiRA-Pepper (fluorescent RNA pepper-HBC system) were co-transformed into Escherichia coli BL21(DE3) by electroporation. The obtained double-plasmid system strains were then cultured in LB medium at 37°C for 10-12 hours. Different concentrations of substrate D-fructose (0.01, 0.1, 1, 10, 100, 1000 mM) were added, and the culture was continued for another 12 hours. Then, HBC dye was added, and the fluorescence values after the addition of different concentrations of D-fructose were detected. The optimal concentration of D-fructose under this culture condition (i.e., the concentration at which the fluorescence intensity reaches its peak, as shown in Figure 2) was determined.

Figure 1: Biosensor based on pepper-HBC fluorescent RNA system

Figure 2: Schematic diagram of Pepper fluorescence changing with D-fructose concentration

At the same time, DTE constructs with different activities were selected to construct recombinant Escherichia coli strains. Fluorescence intensity changes were measured at the optimal D-fructose concentration under otherwise identical conditions, and the functional relationship between DTE activity/conversion rate and Pepper-HBC fluorescence intensity was established. A blank control strain without DTE was also required to characterize the background (leaky) expression level.

Modeling Content: Psicose Synthetic Kinetics Model

The affinity of PsiR transcription factor for D-Psicose: When the intracellular concentration of D-Psicose increases, PsiR dissociates from the promoter pPsiA and activates the expression of downstream Pepper fluorescent RNA.

Based on the above analysis of the design of the Pepper-HBC fluorescent RNA biosensor, this system divides the circuit into four parts: The Catalytic Circuit Model, The Characterization Circuit Model, The Population Dynamics Model, and The Half-life System Integration Model, in order to facilitate the testing of the entire system.

Key Assumptions

No.	Assumption Name	Assumption Content
1	Enzyme Expression	DTE expression is governed by promoter strength, and the assembly efficiency differs between homodimers and heterodimers.
2	Substrate Diffusion	The extracellular D-fructose concentration is constant, and over short time scales the transmembrane diffusion rate is proportional to the concentration gradient.
3	Transcriptional Regulation	The PsiR-PsiA system follows the Hill equation, and the transcriptional intensity is regulated by the concentration of D-psicose (D-allulose).
4	Fluorescence Detection	The fluorescence intensity has a quasi-linear relationship with the activity of the Pepper-HBC complex (PH*).
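Assumption 3 can be illustrated with a generic Hill function for inducer-dependent derepression. The functional form below (a basal leak plus a saturating Hill term) is a common choice rather than the specific equation of this model, and every parameter name and value is hypothetical.

```python
def hill_derepression(psicose: float, k_half: float, n: float,
                      basal: float, max_rate: float) -> float:
    """Transcription rate from pPsiA as a function of D-psicose concentration.

    psicose  : intracellular D-psicose concentration
    k_half   : concentration giving a half-maximal response (hypothetical)
    n        : Hill coefficient (hypothetical)
    basal    : leaky transcription rate without inducer
    max_rate : transcription rate at saturating inducer
    """
    if psicose < 0:
        raise ValueError("concentration must be non-negative")
    occupancy = psicose ** n / (k_half ** n + psicose ** n)
    return basal + (max_rate - basal) * occupancy


if __name__ == "__main__":
    # Illustrative values only; units are arbitrary
    for conc in (0.0, 0.1, 1.0, 10.0, 100.0):
        rate = hill_derepression(conc, k_half=1.0, n=2.0, basal=0.05, max_rate=1.0)
        print(f"psicose = {conc:>6}: rate = {rate:.3f}")
```

The blank control without DTE described above corresponds to the `basal` term, which is why it is needed to calibrate the leak.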

1. The Catalytic Circuit Model

1.1 Kinetics of Expression of Homodimeric DTE Enzyme

The synthesis rate of DTE enzyme depends on the strength of the promoter, the efficiency of transcription and translation, as well as possible negative feedback regulation. Considering that the background DTE enzyme is constitutively expressed and is a homodimer, and assuming it is composed of two identical monomers (denoted as M), the expression process includes transcription, translation, and dimerization, and the following equation can be obtained:

\[ \frac{d\left[ \text{mRNA}_{DTE} \right]}{dt} = \beta_{DTE} - \delta_{m}\left[ \text{mRNA}_{DTE} \right] \]

\[ \frac{d[M]}{dt} = \frac{k_{TL}\left[ \text{mRNA}_{DTE} \right]}{1 + \frac{\left[ \text{mRNA}_{DTE} \right]}{k_{TL}}} - 2k_{f}[M]^{2} + 2k_{r}[DTE] - \delta_{M}[M] \]

\[ \frac{d[DTE]}{dt} = k_{f}[M]^{2}\left( 1 - \frac{[DTE]}{[DTE]_{\max}} \right) - k_{r}[DTE] - \delta_{D}[DTE] \]

Here, β_DTE is the (constant) transcription rate of the DTE mRNA and δ_m is the mRNA degradation rate; k_TL is the translation rate constant; k_f is the dimerization rate constant and k_r is the dissociation rate constant; δ_M is the degradation rate of the monomer and δ_D is the degradation rate of the dimer. [DTE]_max, which appears in the dimerization term, acts as an upper limit on the dimer concentration.

Figure 3: Schematic diagram of the concentration changes of mRNA, monomer protein and homodimer during DTE synthesis over time

In Figure 3, the mRNA rapidly rises to a steady state of 165.52 nM, reflecting the rapid equilibrium between transcription and degradation; the monomeric protein accumulates to a steady state of 31.72 nM, which is the result of the dynamic balance between translation and dimerization; the DTE dimer rises slightly to a steady state of 0.155 nM, demonstrating the cascade process of transcription → translation → dimerization, which conforms to the molecular mechanism of homodimer formation from two monomers.
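The three ODEs of Section 1.1 can be integrated numerically to reproduce this kind of time course. The following is a minimal sketch, not the team's actual simulation: the parameter values in `params` are hypothetical placeholders (they will not reproduce the steady states quoted for Figure 3), and the fixed-step Runge-Kutta integrator is an assumption made to keep the example free of external libraries.

```python
def dte_homodimer_rhs(state, p):
    """Right-hand side of the mRNA, monomer and dimer equations of Section 1.1."""
    mrna, m, dte = state
    d_mrna = p["beta_DTE"] - p["delta_m"] * mrna
    # Saturating translation term, written exactly as in the monomer equation.
    translation = p["k_TL"] * mrna / (1.0 + mrna / p["k_TL"])
    d_m = (translation - 2.0 * p["k_f"] * m ** 2
           + 2.0 * p["k_r"] * dte - p["delta_M"] * m)
    d_dte = (p["k_f"] * m ** 2 * (1.0 - dte / p["DTE_max"])
             - p["k_r"] * dte - p["delta_D"] * dte)
    return (d_mrna, d_m, d_dte)


def integrate(rhs, state, p, t_end, dt=0.02):
    """Fixed-step fourth-order Runge-Kutta integration from t = 0 to t_end."""
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state, p)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), p)
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), p)
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), p)
        state = tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state


# Hypothetical parameter values (arbitrary units), chosen only for illustration.
params = {"beta_DTE": 2.0, "delta_m": 0.5, "k_TL": 1.0, "k_f": 0.1,
          "k_r": 0.05, "delta_M": 0.2, "delta_D": 0.1, "DTE_max": 10.0}

mrna_ss, m_ss, dte_ss = integrate(dte_homodimer_rhs, (0.0, 0.0, 0.0), params, t_end=200.0)
print(mrna_ss, m_ss, dte_ss)
```

With any parameter set, the mRNA equation alone fixes the mRNA steady state at β_DTE/δ_m, which gives a quick consistency check on the integration.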

1.2 Kinetics of Expression of Heterodimeric DTE Enzyme

The newly designed heterodimer requires a separate expression model. We assume that it consists of an A chain and a B chain.

Figure 4: Construction of plasmid for DTE design enzyme

Since the expression of the designed A chain and B chain is continuous and constitutive, the following changes will occur:

\[ \frac{d\left[ \text{mRNA}_{A} \right]}{dt} = \beta_{A} - \delta_{m}\left[ \text{mRNA}_{A} \right] \]

\[ \frac{d\left[ \text{mRNA}_{B} \right]}{dt} = \beta_{B} - \delta_{m}\left[ \text{mRNA}_{B} \right] \]

Here, the mRNA concentration of the A chain is determined by its transcription rate \(\beta_{A}\) and the mRNA degradation rate \(\delta_{m}\). Likewise, the mRNA concentration of the B chain is determined by its transcription rate \(\beta_{B}\) and the same degradation rate \(\delta_{m}\).

\[ \frac{d[A]}{dt} = k_{TLA}\left[ \text{mRNA}_{A} \right] - k_{on}[A][B] + k_{off}\left[ DTE_{AB} \right] - \delta_{A}[A] \]

\[ \frac{d[B]}{dt} = k_{TLB}\left[ \text{mRNA}_{B} \right] - k_{on}[A][B] + k_{off}\left[ DTE_{AB} \right] - \delta_{B}[B] \]

Here, the concentration of the A chain is determined by the translation rate \(k_{TLA}\), the binding rate \(k_{on}\), the dissociation rate \(k_{off}\), and the degradation rate \(\delta_{A}\). The concentration of the B chain is determined by the translation rate \(k_{TLB}\), the binding rate \(k_{on}\), the dissociation rate \(k_{off}\), and the degradation rate \(\delta_{B}\).

\[ \frac{d\left[ DTE_{AB} \right]}{dt} = k_{on}[A][B] - k_{off}\left[ DTE_{AB} \right] - \delta_{AB}\left[ DTE_{AB} \right] \]

Figure 5: Schematic diagram of the changes in mRNA, monomer protein and heterodimer concentrations during DTE synthesis over time

Figure 5 shows that both mRNA_A and mRNA_B reached a steady state of 0.167 nM, indicating that the transcription rates of the A and B chains are consistent; both A and B monomers reached a steady state of 81.85 nM, demonstrating that the translation efficiency is matched; the DTE_AB heterodimer reached a steady state of 0.118 nM, reflecting the efficiency of the binding and assembly of the A and B chains; although the steady state is slightly lower than that of the homodimer (0.155 nM), it still efficiently completes heterodimerization, which is in line with the characteristics of hetero-subunit assembly.
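The five heterodimer equations can be checked with a short simulation. This is a sketch under stated assumptions: the parameter values are hypothetical (symmetric for A and B, not those behind Figure 5), and explicit Euler stepping is used only for brevity.

```python
def simulate_heterodimer(p, t_end=200.0, dt=0.01):
    """Explicit-Euler integration of the five heterodimer ODEs, starting from zero."""
    m_a = m_b = a = b = ab = 0.0
    for _ in range(int(round(t_end / dt))):
        binding = p["k_on"] * a * b   # A + B -> DTE_AB
        release = p["k_off"] * ab     # DTE_AB -> A + B
        d_m_a = p["beta_A"] - p["delta_m"] * m_a
        d_m_b = p["beta_B"] - p["delta_m"] * m_b
        d_a = p["k_TLA"] * m_a - binding + release - p["delta_A"] * a
        d_b = p["k_TLB"] * m_b - binding + release - p["delta_B"] * b
        d_ab = binding - release - p["delta_AB"] * ab
        m_a, m_b = m_a + dt * d_m_a, m_b + dt * d_m_b
        a, b, ab = a + dt * d_a, b + dt * d_b, ab + dt * d_ab
    return {"mRNA_A": m_a, "mRNA_B": m_b, "A": a, "B": b, "DTE_AB": ab}


# Hypothetical, symmetric parameter values (arbitrary units).
params = {"beta_A": 1.0, "beta_B": 1.0, "delta_m": 0.5,
          "k_TLA": 1.0, "k_TLB": 1.0, "k_on": 0.1, "k_off": 0.05,
          "delta_A": 0.2, "delta_B": 0.2, "delta_AB": 0.1}

ss = simulate_heterodimer(params)
print(ss)
```

At steady state the dimer equation requires that formation, k_on[A][B], balances the combined loss, (k_off + δ_AB)[DTE_AB]; with symmetric parameters the A and B chains also settle at the same level.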

1.3 Enzymatic Reaction Kinetics

1. D-fructose Transmembrane Diffusion:

The concentration of D-fructose is different inside and outside the cell. It can be considered that the diffusion rate of D-fructose is proportional to the concentration difference between inside and outside the cell. Therefore, the rate at which D-fructose diffuses into the cell is:

\[ \gamma_{F}\left( \left[ {Fru}_{out} \right] - \left[ {Fru}_{in} \right] \right) \]

The concentration change of fructose within the cell is:

\[ \frac{d[{Fru}_{in}]}{dt} = \gamma_{F}\left( \left[ {Fru}_{out} \right] - \left[ {Fru}_{in} \right] \right) - \frac{k_{cat}[DTE]\left[ {Fru}_{in} \right]}{k_{M} + \left[ {Fru}_{in} \right]} - \mu\left[ {Fru}_{in} \right] \]

Among them, \(\gamma_{F}\) represents the diffusion coefficient of fructose, \([{Fru}_{out}]\) and \([{Fru}_{in}]\) are the concentrations of extracellular and intracellular D-fructose respectively, and \(\mu\) is the dilution effect caused by cell growth rate.

Considering that the volume of the cells is sufficiently small and can be neglected compared to the external solution, the amount of D-fructose diffusing into the cells is very small. We can assume that the concentration of D-fructose outside the cells remains constant over a short period of time and does not change with time.

\[ \frac{d[{Fru}_{out}]}{dt} = 0 \]
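With the extracellular concentration held constant, the intracellular D-fructose level settles where influx balances enzymatic conversion and dilution. The sketch below finds that balance point by bisection. It assumes a fixed DTE concentration, and all numerical values are hypothetical.

```python
def fructose_balance(fru_in, fru_out, gamma_f, k_cat, dte, k_m, mu):
    """Net rate d[Fru_in]/dt for a given intracellular concentration."""
    influx = gamma_f * (fru_out - fru_in)
    conversion = k_cat * dte * fru_in / (k_m + fru_in)
    return influx - conversion - mu * fru_in


def steady_state_fru_in(fru_out, gamma_f, k_cat, dte, k_m, mu, iterations=100):
    """Bisection on [0, fru_out]; the net rate is positive at 0 and negative at fru_out."""
    lo, hi = 0.0, fru_out
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if fructose_balance(mid, fru_out, gamma_f, k_cat, dte, k_m, mu) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical values (arbitrary units).
args = dict(fru_out=10.0, gamma_f=1.0, k_cat=5.0, dte=1.0, k_m=2.0, mu=0.1)
fru_in_ss = steady_state_fru_in(**args)
print(fru_in_ss)
```

Bisection is sufficient here because the net rate decreases monotonically in [Fru_in], so there is exactly one root between 0 and [Fru_out].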

2. D-Psicose Synthesis:

D-Psicose is produced when D-fructose enters the cell and is converted by the DTE enzyme. Its production follows Michaelis-Menten kinetics in the intracellular D-fructose concentration, while its cooperative binding to the PsiR transcription factor (written pPsiR in the equations) to form the CCI complex is described by a Hill-type term with coefficient n. Taking the consumption and degradation of D-Psicose into account, its rate of change is:

\[ \frac{d[Psicose]}{dt} = \frac{k_{cat}[DTE]\left[ {Fru}_{in} \right]}{k_{M} + \left[ {Fru}_{in} \right]} - m_{Psicose,pPsiR}[Psicose]^{n}\left[ \text{pPsiR} \right] - \delta_{P}[Psicose] + m_{CCI}[CCI] - \mu[Psicose] \]

Here, \(- m_{Psicose,pPsiR}[Psicose]^{n}\left[ \text{pPsiR} \right]\) represents the rate at which D-Psicose binds to the PsiR transcription factor, which is determined by the concentrations of Psicose and PsiR. \(- \delta_{P}[Psicose]\) represents the degradation or dilution rate of D-Psicose within the cell. \(+ m_{CCI}[CCI]\) represents the rate at which D-Psicose is released by decomposition of the CCI complex, which is determined by the concentration of the CCI complex [CCI] and the decomposition constant \(m_{CCI}\). \(\mu\) is the dilution effect caused by the cell growth rate.

Figure 6: Schematic diagram of D-fructose and D-Psicose concentrations over time

Figure 6 shows that the extracellular D-fructose concentration is held at 10.00 nM with almost no consumption. The intracellular D-fructose concentration drops rapidly to 0.346 nM within the first 2 hours and then stabilizes, indicating that it is efficiently consumed by DTE after transmembrane transport. D-Psicose gradually increases from 0 to a steady state of 4.52 nM, with a production rate of 47.59%. This steady state arises because the formation of Psicose from fructose by DTE is balanced by the binding of Psicose to PsiR, its degradation, and its dilution through cell growth, which illustrates the directed substrate-enzyme-product metabolic flow.

2. The Characterization Circuit Model

2.1 PsiR-PsiA Transcriptional Regulation System

1. PsiR-pPsiA Regulatory Circuit:

After the generation of D-Psicose, D-Psicose will bind to the pPsiR transcription factor to form the CCI complex. The entire process can be represented by the following reaction equation:

\[ {Fru}_{out}\overset{DTE}{\rightarrow}Psicose + PsiR \leftrightarrow CCI\ complexes \]

The concentration of the pPsiR transcription factor depends on the concentration of D-Psicose and can be described by the Hill equation. Taking into account the consumption and degradation of the pPsiR transcription factor, the generation rate of the pPsiR transcription factor is:

\[ \frac{d\left[ \text{pPsiR} \right]}{dt} = \alpha_{\text{pPsiR}} + \alpha_{\text{leak}} - m_{Psicose,pPsiR}[Psicose]^{n}\left[ \text{pPsiR} \right] - \delta_{\text{pPsiR}}\left[ \text{pPsiR} \right] + m_{CCI}[CCI] \]

\[ [PsiR]_{total} = [PsiR]_{free} + [CCI] \]

Here, \(\alpha_{\text{pPsiR}}\) represents the synthesis rate of the pPsiR transcription factor, and \(\alpha_{\text{leak}}\) represents its background expression rate (the leakage expression when there is no D-Psicose). \(- m_{Psicose,pPsiR}[Psicose]^{n}\left[ \text{pPsiR} \right]\) represents the binding rate of the pPsiR transcription factor to D-Psicose, which is determined by the concentrations of D-Psicose and pPsiR. \(- \delta_{\text{pPsiR}}\left[ \text{pPsiR} \right]\) represents the degradation or dilution rate of pPsiR in the cell. \(+ m_{CCI}[CCI]\) represents the rate at which pPsiR is released when the CCI complex decomposes, which is determined by the concentration of the CCI complex [CCI] and the decomposition constant \(m_{CCI}\).

2. Changes in CCI Complex Concentration:

Figure 7: Schematic diagram of the concentration changes of PsiR transcription factor and CCI complex over time

Taking into account the formation and degradation rates of the CCI complex, the concentration change of the CCI complex is as follows:

\[ \frac{d[CCI]}{dt} = m_{Psicose,pPsiR}[Psicose]^{n}\left[ \text{pPsiR} \right] - m_{CCI}[CCI] - \delta_{CCI}[CCI] \]

\(m_{Psicose,pPsiR}[Psicose]^{n}\left[ \text{pPsiR} \right]\) represents the formation rate of the CCI complex, \(- m_{CCI}[CCI]\) represents the decomposition rate of the CCI complex, and \(- \delta_{CCI}[CCI]\) represents the degradation or dilution rate of the CCI complex.
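The coupled pPsiR and CCI equations can be explored with a small simulation. The sketch below makes two simplifying assumptions that are not part of the model above: the D-Psicose concentration is held fixed, and all parameter values are hypothetical.

```python
def simulate_psir_cci(p, psicose, t_end=200.0, dt=0.01):
    """Explicit-Euler integration of d[pPsiR]/dt and d[CCI]/dt at fixed [Psicose]."""
    psir, cci = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        binding = p["m_bind"] * psicose ** p["n"] * psir   # complex formation
        release = p["m_CCI"] * cci                          # complex decomposition
        d_psir = (p["alpha"] + p["alpha_leak"] - binding
                  - p["delta_psir"] * psir + release)
        d_cci = binding - release - p["delta_CCI"] * cci
        psir, cci = psir + dt * d_psir, cci + dt * d_cci
    return psir, cci


# Hypothetical parameter values (arbitrary units).
params = {"alpha": 1.0, "alpha_leak": 0.1, "m_bind": 0.5, "n": 2,
          "m_CCI": 0.2, "delta_psir": 0.3, "delta_CCI": 0.1}

psir_ss, cci_ss = simulate_psir_cci(params, psicose=1.5)
print(psir_ss, cci_ss)
```

Adding the two equations cancels the binding and release terms, so at steady state total synthesis (α_pPsiR + α_leak) must equal total loss (δ_pPsiR[pPsiR] + δ_CCI[CCI]); this identity is a convenient check on any implementation.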

Figure 9 shows that the Pepper RNA concentration first rises and then falls within the first few minutes, while the total HBC concentration decreases continuously. The PH complex peaks rapidly and then declines; the fluorescent PH* state remains low but stable, and the fluorescence intensity decays over time. At steady state, free Pepper is extremely low and PH* is 0.007105 nM, supporting a fluorescence intensity of 0.428 a.u. This indicates that, once Pepper and HBC are bound, the fluorescence depends on the PH* conformation and decays as the components are degraded.

2.2 Pepper Aptamer/HBC Dye Concentration Variation and Fluorescence Excitation

The concentration of the Pepper aptamer depends on the concentration of the repressor pPsiR (which regulates the pPsiA promoter) and can be described by the Hill equation. Pepper itself is also degraded, and the added HBC dye binds Pepper to form an intermediate complex PH. Free HBC dye has an extremely low fluorescence quantum yield (\(\Phi\) ≈ 0.03), but after it binds specifically to the Pepper RNA aptamer:

The dye molecule is locked in a rigid structure;
Intramolecular rotation is restricted;
The energy transition level changes → the fluorescence quantum yield increases 26-fold (\(\Phi\) ≈ 0.78).

\[ \Phi_{free} = 0.03\quad \rightarrow \quad\Phi_{bound} = 0.78 \]

Here, \(\Phi\) represents the fluorescence quantum yield, which indicates the probability that an excited-state molecule emits a photon. \(\Phi_{free}\) represents the quantum yield of the free HBC dye, while \(\Phi_{bound}\) represents the quantum yield of HBC bound to Pepper.

The fluorescence intensity \(I_{fluor}\) is directly proportional to the concentration of the excited state PH* of the complex:

\[ I_{fluor} \propto \Phi_{bound} \cdot \left[ PH^{*} \right] \]

Figure 8: Dynamic changes of the Pepper-HBC fluorescent RNA biosensor

After the HBC dye is added, HBC binds Pepper reversibly to form the complex PH. Only the complex fluoresces, and the variable that directly determines the fluorescence intensity is the excited-state complex PH* rather than PH. We therefore introduce three species: free Pepper (\(P_{f}\)), free HBC (\(H_{f}\)), and the complex (PH). We further distinguish three states of the complex to capture changes in fluorescence: the ground-state complex (PH), which has bound but has not reached the optimal luminescence conformation; the excited-state complex (PH*), which has adopted the conformation that maximizes fluorescence; and the dissociation intermediate state (\(PH_{d}\)), which accounts for the bimodal fluorescence decay.

a) Equilibrium Among Three Free States:

\[ \frac{d\left[ P_{f} \right]}{dt} = \eta_{Pepper}\beta_{Pepper}\left( \beta_{leak} + \frac{1}{1 + \left( \frac{[pPsiR]}{K_{d}} \right)^{n}} \right) - k_{on}\left[ P_{f} \right]\left[ H_{f} \right] + k_{r}\left[ PH_{d} \right] - \delta_{Pepper}\left[ P_{f} \right] \]

\[ \frac{d\left[ H_{f} \right]}{dt} = \alpha_{HBC} - k_{on}\left[ P_{f} \right]\left[ H_{f} \right] + k_{d}\left[ PH_{d} \right] - \delta_{\text{HBC}}\left[ H_{f} \right] \]

\[ \frac{d[PH]}{dt} = k_{on}\left[ P_{f} \right]\left[ H_{f} \right] - k_{f}[PH] + k_{b}\left[ PH^{*} \right] - k_{d}[PH] \]

Here, \(\eta_{Pepper}\beta_{Pepper}\frac{1}{1 + \left( \frac{[pPsiR]}{K_{d}} \right)^{n}}\) represents the regulated expression of the Pepper aptamer, which is determined by the concentration of PsiR and the maximum expression rate \(\eta_{Pepper}\); \(\beta_{Pepper}\) is an additional expression-related parameter (for example, the expression efficiency), and \(\beta_{leak}\) adds the leaky expression that persists under full repression. \(- k_{on}\left[ P_{f} \right]\left[ H_{f} \right]\) represents the consumption of Pepper by the binding of the Pepper aptamer to the HBC dye, and \(k_{r}\left[ PH_{d} \right]\) represents the recovery of Pepper from the dissociated intermediate state. \(- \delta_{Pepper}\left[ P_{f} \right]\) represents the degradation or dilution of the Pepper aptamer.

\(\alpha_{HBC}\) is the addition rate of the HBC dye. In the experiment the dye is added once, so [HBC] is high at the initial moment and is not replenished afterwards, that is, \(\alpha_{HBC} = 0\). The degradation term \(\delta_{\text{HBC}}\) is generally very small and can be ignored. \(k_{d}\left[ PH_{d} \right]\) represents the recovery of HBC from the dissociated intermediate state.

\(k_{on}\left[ P_{f} \right]\left[ H_{f} \right]\) represents the binding of the Pepper aptamer to the HBC dye, \(- k_{f}[PH]\) represents the transition of PH to the fluorescent state, and \(k_{b}\left[ PH^{*} \right]\) represents the return of PH* to the ground state PH. \(- k_{d}[PH]\) represents the dissociation of PH to the intermediate state \(PH_{d}\).

b) Conformational Change:

\[ \frac{d[PH]}{dt} = k_{on}\left[ P_{f} \right]\left[ H_{f} \right] - k_{f}[PH] + k_{b}\left[ PH^{*} \right] - k_{d}[PH] \]

\[ \frac{d\left[ PH^{*} \right]}{dt} = k_{f}[PH] - k_{b}\left[ PH^{*} \right] - k_{nr}\left[ PH^{*} \right] - k_{rad}\left[ PH^{*} \right] \]

\[ \frac{d\left[ PH_{d} \right]}{dt} = k_{d}[PH] - k_{r}\left[ PH_{d} \right] - k_{d}\left[ PH_{d} \right] \]

As in (a), \(k_{on}\left[ P_{f} \right]\left[ H_{f} \right]\) represents the binding of the Pepper aptamer to the HBC dye, \(- k_{f}[PH]\) represents the loss of PH through its transition to the fluorescent state, and \(k_{b}\left[ PH^{*} \right]\) represents the return of PH* to the ground state PH. \(- k_{d}[PH]\) represents the loss of PH through dissociation into the intermediate state \(PH_{d}\).

Similarly, \(k_{f}[PH]\) represents the transition of PH to the fluorescent state, \(- k_{b}\left[ PH^{*} \right]\) represents the loss of PH* as it returns to the ground state PH, \(- k_{nr}\left[ PH^{*} \right]\) represents non-radiative decay (thermal dissipation without emission), and \(- k_{rad}\left[ PH^{*} \right]\) represents radiative decay (emission of photons, i.e., fluorescence).

\(k_{d}[PH]\) represents the formation of the intermediate state by dissociation of PH. \(- k_{r}\left[ PH_{d} \right]\) represents the loss of the intermediate as free Pepper is recovered, and \(- k_{d}\left[ PH_{d} \right]\) represents the loss of the intermediate as free HBC is recovered; these match the \(+ k_{r}\left[ PH_{d} \right]\) and \(+ k_{d}\left[ PH_{d} \right]\) terms in the equations for \(P_{f}\) and \(H_{f}\).
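The PH* equation is linear in [PH*], so its behaviour is easy to work out when [PH] is treated as constant. That constancy is a simplifying assumption made only for this sketch (in the full model [PH] changes over time), and the rate constants below are hypothetical.

```python
import math


def ph_star_steady_state(ph, k_f, k_b, k_nr, k_rad):
    """Steady state of d[PH*]/dt = k_f[PH] - (k_b + k_nr + k_rad)[PH*] at fixed [PH]."""
    return k_f * ph / (k_b + k_nr + k_rad)


def ph_star_time_course(t, ph, k_f, k_b, k_nr, k_rad, ph_star_0=0.0):
    """Closed-form solution of the same linear ODE, starting from ph_star_0."""
    k_loss = k_b + k_nr + k_rad
    ss = k_f * ph / k_loss
    return ss + (ph_star_0 - ss) * math.exp(-k_loss * t)


# Hypothetical rate constants and a fixed ground-state concentration.
rates = dict(k_f=0.2, k_b=0.5, k_nr=1.0, k_rad=2.5)
print(ph_star_steady_state(1.0, **rates))
print(ph_star_time_course(0.5, 1.0, **rates))
```

The relaxation toward the steady state is governed by the total loss rate k_b + k_nr + k_rad, so faster radiative or non-radiative decay lowers the steady-state [PH*] and shortens the time needed to reach it.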

c) Fluorescence Excitation:

The fluorescence intensity \(I_{fluor}\) depends directly on the concentration of the excited-state HBC-Pepper complex (PH*) and its radiative decay behavior.

\[ I_{fluor} = \gamma \cdot \left( \Phi_{bound} \cdot \left[ PH^{*} \right]_{ss} + \Phi_{leak} \right) \approx \gamma^{*} \cdot \left[ PH^{*} \right]_{ss} \]

Figure 9: Schematic diagram of Pepper RNA, HBC dye, complex PH concentration and fluorescence intensity changes over time

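Here, \(\gamma\) represents the instrument coefficient and \(\Phi_{leak}\) indicates the background fluorescence quantum yield. When \(\Phi_{leak}\) is negligible, the lumped coefficient is \(\gamma^{*} = \gamma \cdot \Phi_{bound}\).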
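The readout relation can be written as two small functions, one for the full expression and one for the lumped approximation. This is an illustrative sketch: the quantum yields are the values quoted above, the instrument coefficient `gamma` is a hypothetical number, and the lumped coefficient derived from the reported steady state (0.007105 nM PH*, 0.428 a.u.) is only what those two figures would imply if the lumped form holds exactly.

```python
PHI_FREE = 0.03    # quantum yield of free HBC (from the text)
PHI_BOUND = 0.78   # quantum yield of Pepper-bound HBC (from the text)


def fluorescence_exact(ph_star_ss, gamma, phi_bound=PHI_BOUND, phi_leak=0.0):
    """I_fluor = gamma * (phi_bound * [PH*]_ss + phi_leak)."""
    return gamma * (phi_bound * ph_star_ss + phi_leak)


def fluorescence_lumped(ph_star_ss, gamma_star):
    """Approximation I_fluor ~ gamma_star * [PH*]_ss, valid when phi_leak is negligible."""
    return gamma_star * ph_star_ss


fold_increase = PHI_BOUND / PHI_FREE

# Lumped coefficient implied by the reported steady state, assuming the
# approximation holds exactly (an assumption, not a fitted value).
implied_gamma_star = 0.428 / 0.007105

gamma = 10.0  # hypothetical instrument coefficient
print(fold_increase)
print(fluorescence_exact(0.007105, gamma), fluorescence_lumped(0.007105, gamma * PHI_BOUND))
```

With a zero leak term the two functions agree when `gamma_star = gamma * PHI_BOUND`, which is why the approximation collapses the instrument coefficient and the bound quantum yield into a single fitted constant.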

3. Population Dynamics Model

To quantify the yield of D-Psicose (Psicose) in large-scale production, cell growth, substrate metabolism, and enzyme kinetics are coupled in a single model.

3.1 Error Accumulation

During enzyme expression (especially in long-term fermentation or continuous passage), DNA replication errors and transcription errors can lead to the following:

Mutation in the enzyme coding sequence → Change in the amino acid sequence → Decrease in the catalytic efficiency \(\frac{k_{cat}}{k_{M}}\);
The rate of error accumulation is positively correlated with the error rate of replication/transcription, the sequence length, and the expression duration;
In directed evolution, the impact of errors on production stability therefore needs to be quantified.

Enzyme expression kinetics model:

\[ \frac{d[DTE]}{dt} = k_{f}[M]^{2} - k_{r}[DTE] - \delta_{D}[DTE] \]

\[ \frac{d\left[ DTE_{AB} \right]}{dt} = k_{on}[A][B] - k_{off}\left[ DTE_{AB} \right] - \delta_{AB}\left[ DTE_{AB} \right] \]

Considering the assembly rate in the population model: The formation of homodimers is usually a spontaneous process, while heterodimers consist of two different subunits, which may make the assembly process more complex. Therefore, assembly efficiency is introduced:

\[ \eta_{assembly} = \frac{k_{on} \cdot k_{\min}}{k_{on} + k_{\deg}}\quad\left( k_{\min} = \min\left( \beta_{A},\beta_{B} \right),\ k_{\deg} = \max\left( \delta_{A},\delta_{B} \right) \right) \]

Here, \(\eta_{assembly}\) represents the assembly efficiency with which the A and B chains form a dimer. Assembly is limited by the scarcer chain, so \(k_{\min}\) is the smaller of the transcription rates \(\beta_{A}\) and \(\beta_{B}\); \(k_{on}\) is the binding rate, and \(k_{\deg}\) is the larger of the degradation rates \(\delta_{A}\) and \(\delta_{B}\).
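The assembly-efficiency expression translates directly into a short function. The sketch below simply evaluates the formula as written; the argument values in the example call are hypothetical.

```python
def assembly_efficiency(k_on, beta_a, beta_b, delta_a, delta_b):
    """eta_assembly = k_on * k_min / (k_on + k_deg), as defined above."""
    k_min = min(beta_a, beta_b)     # the scarcer chain limits assembly
    k_deg = max(delta_a, delta_b)   # the faster-degrading chain sets the loss
    return k_on * k_min / (k_on + k_deg)


# Hypothetical rates (arbitrary units).
print(assembly_efficiency(k_on=1.0, beta_a=2.0, beta_b=3.0, delta_a=0.5, delta_b=1.0))
```

Because of the `min` and `max`, raising the transcription rate of the more abundant chain or stabilizing the more stable chain leaves the value unchanged; only the limiting chain matters.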

Finally, the steady-state expression of the homodimeric DTE enzyme and of the designed heterodimeric DTE enzyme are compared:

\[ [DTE]_{ss} = \frac{\beta_{DTE}k_{TL}}{2\delta_{M}\delta_{D}}\quad vs\quad\left[ DTE_{AB} \right]_{ss} = \frac{\eta_{assembly} \cdot k_{\min} \cdot k_{TL,avg}}{\delta_{AB}} \]

Here, "ss" indicates that the enzyme is in a steady state, and \(k_{TL,avg}\) represents the average translation rate constant of the A chain and the B chain. \(\eta_{assembly}\) represents the assembly efficiency of the A and B chains. Other parameters can be found in the expression of the DTE enzyme in the kinetic model.

3.2 Source of Error:

i. Replication error rate:

\[ repRat(t) = \left( s_{repRat} \right)^{len_{DNA}} \]

ii. Transcription error rate:

\[ trsRat(t) = \left( s_{trsRat} \right)^{len_{mRNA}} \]

\(s_{repRat} \approx 10^{-8}\): the replication error rate per base; \(len_{DNA}\): the length of the DTE gene in bases.

\(s_{trsRat} \approx 10^{-4}\): the transcription error rate per base; \(len_{mRNA}\): the length of the mRNA in bases.

iii. The total error rate is the product of the replication error rate and the transcription error rate:

\[ errRat(t) = repRat(t) \bullet trsRat(t) \]
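Evaluated literally, these expressions raise very small per-base rates to powers of several hundred, which underflows ordinary floating-point arithmetic. The sketch below therefore evaluates the three formulas exactly as written but in log10 space. The sequence lengths in the example are hypothetical, and the per-base rates are the approximate values quoted above.

```python
import math


def log10_error_rates(s_rep, len_dna, s_trs, len_mrna):
    """Return log10 of repRat, trsRat and errRat = repRat * trsRat."""
    log_rep = len_dna * math.log10(s_rep)     # log10(s_repRat ** len_DNA)
    log_trs = len_mrna * math.log10(s_trs)    # log10(s_trsRat ** len_mRNA)
    return log_rep, log_trs, log_rep + log_trs


# Hypothetical gene and transcript lengths (bases).
log_rep, log_trs, log_err = log10_error_rates(1e-8, 900, 1e-4, 900)
print(log_rep, log_trs, log_err)
```

Working with logarithms turns the product in step iii into a sum and keeps the result representable regardless of sequence length.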

3.3 Cell population growth kinetics

a) Logistic Equation:

\[ \frac{d[N]}{dt} = rN\left( 1 - \frac{N}{K} \right) \]

Here, N denotes the density of Escherichia coli (cells/L), r denotes the maximum growth rate (h⁻¹) and K denotes the environmental carrying capacity or the maximum cell density, representing the upper limit of cell growth.

b) Cumulative Error-Induced Fatality Correction

The rate at which each cell undergoes inactivation mutations within a unit of time:

\[ \lambda_{\text{mut}} = \mu_{\text{mut}} \cdot r_{eff} \cdot len_{DNA} \]

\(\mu_{\text{mut}}\) represents the mutation rate per unit time and per unit length.

The error leads to a mutation death probability of: \(\mu_{\text{death}} = \lambda_{\text{mut}}\).

Correction of the population growth equation (introduction of error accumulation):

\[ \frac{d[N]}{dt} = rN\left( 1 - \frac{N}{K} \right) - \mu_{\text{death}}N \]

c) The metabolic burden continues to increase

Growth rate correction (taking into account metabolic burden):

\[ r_{eff} = \frac{r_{\max}}{\left( 1 + \beta_{burden} \bullet \frac{\left[ DTE_{func} \right]}{\left[ DTE_{ref} \right]} \right)} \bullet \left( 1 - \beta_{os} \bullet \frac{[ Psicose]}{[ Psicose] + K_{os}} \right) \]

Here:
\(r_{eff}\): the corrected effective growth rate, which accounts for the metabolic burden and the osmotic pressure burden.
\(r_{\max}\): the maximum growth rate, i.e., the theoretical growth rate of the cells when there are no limiting factors.
\(\beta_{burden}\): the metabolic burden coefficient, describing the negative impact of DTE expression on cell growth.
\(\frac{\left[ DTE_{func} \right]}{\left[ DTE_{ref} \right]}\): the measure of metabolic burden, i.e., the ratio of the current functional DTE concentration to the reference concentration \(DTE_{ref}\).
\(\beta_{os}\): the osmotic pressure burden coefficient, describing the osmotic effect on the cells caused by the accumulation of products such as Psicose.
\(\frac{[ Psicose]}{[ Psicose] + K_{os}}\): the saturating dependence of growth inhibition on the Psicose concentration.
\(K_{os}\): the half-saturation constant of the osmotic effect; at \([Psicose] = K_{os}\) the inhibition reaches half of its maximum, and it saturates at higher concentrations.

Figure 10: Schematic diagram of Escherichia coli, the total error rate and the growth rate changes over time

The cell growth kinetics model was ultimately revised to:

\[ \frac{d[N]}{dt} = r_{eff} \cdot N\left( 1 - \frac{N}{K} \right) - \mu_{\text{death}}N \]

In Figure 10, the population density of the wild-type Escherichia coli reached 4.96×10⁹ cells/L and followed an S-shaped growth curve, entering the rapid proliferation phase at 10 to 15 hours. The mutant strain reached a population density of only 1.95×10⁷ cells/L, and its growth was almost stagnant. The total error rate rose from 0 to 37% within the first 5 hours and then remained stable, reflecting the rapid accumulation of replication and transcription errors. The effective growth rate gradually decreased from the initial 0.50 h⁻¹ to a steady state of approximately 0.45 h⁻¹. The growth of the mutant strain was strongly inhibited by error accumulation and metabolic burden, which illustrates the negative impact of errors on population proliferation.
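The corrected growth model of Section 3.3 can be sketched in a few lines. The example below holds \(r_{eff}\) and \(\mu_{death}\) constant during the run, which is a simplification (in the full model both change with the DTE and Psicose levels), and every numerical value is a hypothetical placeholder rather than a fitted parameter.

```python
def effective_growth_rate(r_max, beta_burden, dte_func, dte_ref, beta_os, psicose, k_os):
    """r_eff with metabolic-burden and osmotic-burden corrections, as defined above."""
    burden = 1.0 + beta_burden * dte_func / dte_ref
    osmotic = 1.0 - beta_os * psicose / (psicose + k_os)
    return r_max / burden * osmotic


def mutation_death_rate(mu_mut, r_eff, len_dna):
    """lambda_mut = mu_mut * r_eff * len_DNA, used directly as mu_death."""
    return mu_mut * r_eff * len_dna


def simulate_population(n0, r_eff, k, mu_death, t_end, dt=0.01):
    """Explicit-Euler integration of dN/dt = r_eff*N*(1 - N/K) - mu_death*N."""
    n = n0
    for _ in range(int(round(t_end / dt))):
        n += dt * (r_eff * n * (1.0 - n / k) - mu_death * n)
    return n


# Hypothetical values: rates in 1/h, densities in cells/L, gene length in bases.
r_eff = effective_growth_rate(0.5, 0.1, 1.0, 1.0, 0.0, 0.0, 1.0)
mu_death = mutation_death_rate(1e-4, r_eff, 900)
n_final = simulate_population(1e6, r_eff, 5e9, mu_death, t_end=300.0)
print(r_eff, mu_death, n_final)
```

With constant rates the population settles at K(1 - μ_death/r_eff) rather than at K, which shows how even a modest mutation-induced death rate lowers the attainable cell density.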

3.4 Substrate consumption and product formation

Substance concentration at the population level:

\[ C_{total}(t) = \frac{N(t) \bullet C_{single}(t) \bullet V_{cell}}{V} \]

This formula gives the concentration of a substance at the population level (for example, in the culture medium or a fermentation tank). \(C_{total}(t)\): the population-level concentration of a given metabolite or molecule. \(N(t)\): the number of cells at time t. \(C_{single}(t)\): the concentration of the substance (such as a product or metabolite) within a single cell. \(V_{cell}\): the volume of a single cell, usually treated as a constant that depends on the cell type and growth state. \(V\): the culture volume.

Accordingly, \(C_{total}(t)\) increases when either the number of cells or the substance concentration within a single cell increases.
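As a quick numerical illustration of the scaling from a single cell to the culture, the formula can be evaluated as follows. The cell count, intracellular concentration, and volumes in the example are hypothetical round numbers.

```python
def total_concentration(n_cells, c_single, v_cell, v_culture):
    """C_total = N * C_single * V_cell / V (same concentration unit as c_single)."""
    return n_cells * c_single * v_cell / v_culture


# Hypothetical example: 1e9 cells, 2.0 mM intracellular, 1e-15 L per cell, 1 L culture.
c_total = total_concentration(1e9, 2.0, 1e-15, 1.0)
print(c_total)
```

The result is far below the intracellular concentration because the combined cell volume is a tiny fraction of the culture volume.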

Error accumulation also affects the expression of DTE enzymes: replication errors cause the proportion of functional enzymes to decay exponentially. For both the homodimeric background DTE enzyme and the designed heterodimeric DTE enzyme, the functional enzyme in the cell population follows:

\[ \frac{d\left[ \text{DTE}_{\text{func}} \right]}{dt} = k_{f}[M]^{2} - \delta_{D}\left[ \text{DTE}_{\text{func}} \right] - \lambda_{\text{mut}}\left[ \text{DTE}_{\text{func}} \right] \]

\[ \frac{d\left[ \text{DTE}_{\text{func,AB}} \right]}{dt} = \eta_{\text{assembly}}k_{TL}\left[ \text{mRNA}_{DTE} \right] - \delta_{D}\left[ \text{DTE}_{\text{func,AB}} \right] - \lambda_{\text{mut}}\left[ \text{DTE}_{\text{func,AB}} \right] \]

\[ f_{func}(t) = e^{- \lambda_{\text{mut}} \cdot t}(1 - \alpha) + \alpha \]

\[ k_{cat}^{\text{eff}}(t) = k_{cat}^{wt} \cdot \ f_{func}(t)\left( 1 - \alpha \cdot (1 - f_{func}(t)) \right) \]

\[ \frac{d\left[ \text{DTE}_{\text{inact}} \right]}{dt} = \lambda_{\text{mut}}\left[ \text{DTE}_{\text{func}} \right] - \gamma\left[ {DTE}_{inact} \right] \]

Figure 11: Schematic diagram of mRNA concentration, DTE enzyme concentration and reaction rate changes over time in group production

Here, \(k_{cat}^{\text{eff}}(t)\) represents the effective catalytic constant, which is the enzyme catalytic rate constant \(k_{cat}\) corrected for the inactivated state of the enzyme. α sets the long-time limit of \(f_{func}(t)\): as t approaches infinity, \(f_{func}(t)\) tends to α, so α is the steady-state fraction of enzyme that remains functional. \(\gamma\) is the rate constant for the repair or clearance of inactivated enzymes.
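The functional fraction and the effective catalytic constant are closed-form expressions, so they can be evaluated directly. The sketch below implements the two formulas as written; the values of `lambda_mut`, `alpha`, and `kcat_wt` are hypothetical.

```python
import math


def functional_fraction(t, lambda_mut, alpha):
    """f_func(t) = exp(-lambda_mut * t) * (1 - alpha) + alpha."""
    return math.exp(-lambda_mut * t) * (1.0 - alpha) + alpha


def effective_kcat(t, kcat_wt, lambda_mut, alpha):
    """k_cat_eff(t) = k_cat_wt * f_func * (1 - alpha * (1 - f_func))."""
    f = functional_fraction(t, lambda_mut, alpha)
    return kcat_wt * f * (1.0 - alpha * (1.0 - f))


# Hypothetical values: lambda_mut in 1/h, kcat_wt in 1/s.
for t in (0.0, 10.0, 100.0):
    print(t, functional_fraction(t, 0.05, 0.05), effective_kcat(t, 100.0, 0.05, 0.05))
```

At t = 0 the functional fraction is 1 and the effective constant equals the wild-type value; both then decay monotonically, with the functional fraction levelling off at `alpha`.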

In Figure 11, DTE mRNA rapidly increased to 2.75 nM within 2 hours and remained stable thereafter. The concentrations of A chain and B chain mRNA were extremely low and showed no significant increase, indicating that DTE is mainly expressed in the homodimeric form during population production. The total DTE enzyme concentration rose continuously over time, reaching a steady state of 2516.72 nM, but the active DTE was only 125.84 nM, because the high steady-state error rate inactivated most of the enzyme. In the dimer distribution, homodimers increased linearly over time, while heterodimers were almost zero, indicating that the heterodimerization efficiency in population production was low and the system was dominated by homodimers. The theoretical rate and the effective rate of the enzymatic reaction both increased, but the effective rate was slightly lower than the theoretical rate. This reflects that the low proportion of active DTE and the inefficient heterodimer assembly jointly limit the reaction efficiency, so that the actual enzymatic activity is slightly lower than the theoretical expectation.

Regarding the consumption of the substrate (D-fructose) and the production of the products, there is:

a) Extracellular substrate dynamics:

\[ \frac{d\left[ S_{out} \right]}{dt} = - \frac{1}{Y_{X/S}} \cdot \frac{d[N]}{dt} \cdot \frac{1}{V} - \gamma_{F} \cdot (\left[ S_{out} \right] - \left[ S_{in} \right]) \cdot \frac{N \bullet V_{cell}}{V} \]

b) Intracellular substrate dynamics:

\[ \frac{d\left[ S_{in} \right]}{dt} = \gamma_{F} \cdot (\left[ S_{out} \right] - \left[ S_{in} \right]) - \frac{k_{cat}^{\text{eff}}\left[ \text{DTE}_{\text{func}} \right]\left[ S_{in} \right]}{k_{M} + \left[ S_{in} \right]} - \mu \cdot \left[ S_{in} \right] \]

c) Product formation dynamics:

\[ \frac{d\left[ P_{out} \right]}{dt} = \eta \cdot \frac{k_{cat}^{\text{eff}}\left[ \text{DTE}_{\text{func}} \right]\left[ S_{in} \right]}{k_{M} + \left[ S_{in} \right]} \cdot N \bullet V_{cell} - \delta_{P}[P_{out}] + \kappa([P_{in}] - [P_{out}]) \]

d) Intracellular products dynamics:

\[ \frac{d\left[ P_{in} \right]}{dt} = \frac{k_{cat}^{\text{eff}}\left[ \text{DTE}_{\text{func}} \right]\left[ S_{in} \right]}{k_{M} + \left[ S_{in} \right]} - \kappa([P_{in}] - [P_{out}]) - \mu \bullet [P_{in}] \]

Here, \(\left[ S_{out} \right]\) is the extracellular D-fructose concentration (mM) and \(\left[ S_{in} \right]\) is the intracellular D-fructose concentration (mM); \(\left[ P_{out} \right]\) is the extracellular D-Psicose concentration (mM) and \(\left[ P_{in} \right]\) is the intracellular D-Psicose concentration (mM); \(Y_{X\text{/}S}\) is the yield coefficient of the cell for the substrate (g cell/g substrate); \(\eta\) is the product synthesis efficiency (including transmembrane transport loss); \(\delta_{P}\) is the degradation rate of the product.

Figure 12: Schematic diagram of the changes in D-fructose and D-psicose concentrations over time in group production

\(V\): the culture volume; \(V_{cell}\): the volume of a single cell; \(\gamma_{F}\): the transmembrane diffusion coefficient. \(\mu = \frac{1}{N}\frac{dN}{dt}\) is the specific growth rate. \(\kappa\): the transmembrane secretion rate constant.

In the left panel of Figure 12, the extracellular D-fructose was maintained at a stable level of 30 mM, while the intracellular fructose rose rapidly from 0 to approximately 6 mM and then increased slowly. In the right panel, the maximum yield of extracellular D-Psicose reached 10.31 mM, and the intracellular level reached approximately 6.5 mM before decreasing slightly. The conversion rate of D-fructose was 92.55%, indicating that under the catalysis of DTE, fructose was efficiently converted into Psicose and the product was mainly secreted extracellularly, consistent with a dynamic balance between biosynthesis and secretion.

4. Half-life System Integration

4.1 Promoter-transcription factor binding stability model

\[ \frac{d[CCI]}{dt} = k_{on} \bullet [PsiR] \bullet [Psicose]^{n} - (k_{off} + \delta_{CCI}) \bullet [CCI] \]

Half-life of the promoter complex:

\[ t_{\frac{1}{2}} = \frac{\ln 2}{k_{off} + \delta_{CCI}} \]

4.2 Model of protein thermal stability half-life (DTE enzyme thermal inactivation kinetics)

Add temperature dependence:

\[ \frac{dE}{dt} = k_{syn} - \delta_{D}(T) \bullet E(t) \]

Temperature-dependent degradation rate:

\[ \delta_{D}(T) = \delta_{D,0} \bullet e^{\left( - \frac{E_{a}}{R}\left( \frac{1}{T} - \frac{1}{T_{0}} \right) \right)} \]

Protein half-life:

\[ t_{\frac{1}{2}_{protein}} = \frac{\ln 2}{\delta_{D}(T)} \]

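Here, \(k_{syn}\) represents the synthesis rate of the enzyme, \(\delta_{D,0}\) is the degradation rate at the reference temperature \(T_{0}\), \(E_{a}\) is the activation energy, \(R\) is the gas constant, and \(T\) is the absolute temperature.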
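The temperature dependence and the resulting half-life can be evaluated together. The sketch below codes the expression for \(\delta_{D}(T)\) exactly as written above; the reference rate, activation energy, and temperatures are hypothetical values chosen only to show the trend.

```python
import math

R_GAS = 8.314  # gas constant in J/(mol*K)


def degradation_rate(temp_k, delta_d0, e_a, t_ref_k):
    """delta_D(T) = delta_D0 * exp(-(E_a / R) * (1/T - 1/T_0))."""
    return delta_d0 * math.exp(-(e_a / R_GAS) * (1.0 / temp_k - 1.0 / t_ref_k))


def half_life(rate):
    """t_1/2 = ln 2 / rate for first-order decay."""
    return math.log(2.0) / rate


# Hypothetical inputs: delta_D0 = 0.1 1/h at 310.15 K, E_a = 50 kJ/mol.
for temp in (303.15, 310.15, 323.15):
    rate = degradation_rate(temp, 0.1, 50_000.0, 310.15)
    print(temp, rate, half_life(rate))
```

At the reference temperature the rate equals \(\delta_{D,0}\); above it the exponent becomes positive, so the degradation rate rises and the half-life shortens.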

4.3 mRNA half-life prediction model

mRNA dynamic change model:

\[ \frac{d[ mRNA] }{dt} = \beta - \delta_{m} \bullet [ mRNA] - k_{dilution} \bullet [ mRNA] \]

Here, \(k_{dilution}\) represents the dilution effect term caused by cell growth.

mRNA half-life:

\[ t_{\frac{1}{2}_{mRNA}} = \frac{\ln 2}{\delta_{m} + k_{dilution}} \]

4.4 Fluorescence signal half-life model

The stability of the Pepper-HBC complex shows time-dependent characteristics:

\[ \frac{d[{PH}^{*}]}{dt} = k_{f} \bullet [PH] - (k_{rad} + k_{nr} + \delta_{pepper}) \bullet [{PH}^{*}] \]

Fluorescence half-life:

\[ t_{\frac{1}{2}_{fluorescence}} = \frac{\ln 2}{k_{rad} + k_{nr} +\delta_{pepper}} \]

4.5 System integration of half-life model

a) Stability of multi-component systems

Taking into account the half-lives of all components:

\[ t_{\frac{1}{2}_{system}} = \left( \sum_{i}^{}\frac{1}{t_{\frac{1}{2}}^{i}} \right)^{- 1} \]
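The combined half-life is the reciprocal of the summed reciprocal half-lives, which is the same as adding the first-order decay rates of the components. The sketch below evaluates it for the three wild-type component half-lives reported with Figure 13 (CCI, DTE, mRNA); restricting the sum to these three is an assumption made for illustration, since the fluorescence half-life value is not quoted in the text.

```python
def system_half_life(half_lives):
    """t_1/2,system = (sum_i 1 / t_1/2,i) ** -1."""
    return 1.0 / sum(1.0 / t for t in half_lives)


# Wild-type half-lives in hours as reported for Figure 13: CCI, DTE, mRNA.
wild_type = [0.2097, 7.2309, 0.2544]
print(system_half_life(wild_type))
```

The combined value is always shorter than the shortest component half-life, so the least stable components (here CCI and mRNA) dominate the stability of the whole system.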

b) The impact of cumulative errors on the half-life

The corrected half-life is:

\[ t_{\frac{1}{2}_{corrected}} = t_{\frac{1}{2}_{initial}} \bullet (1 - errRat(t) \bullet t) \]

Figure 13: Schematic diagram of the half-life of each component of different DTE generators

Here, errRat(t) is the total error rate defined in Section 3.2, i.e., the product of the replication error rate and the transcription error rate.

Figure 13 compares the core functional components of the wild-type and mutant systems. The half-life of the mutant CCI (0.1398 hours) is only 66.7% of that of the wild-type (0.2097 hours), the half-life of the mutant DTE (6.3020 hours) is 87.2% of that of the wild-type (7.2309 hours), and the half-life of the mutant mRNA (0.2356 hours) is 92.6% of that of the wild-type (0.2544 hours). All three are shortened to varying degrees, reflecting the reduced stability of the core components caused by the mutation. The fluorescence half-life of the mutant is slightly higher than that of the wild-type, indicating that the stability of the fluorescence-related part improves slightly after the mutation, but the faster decay of the core components still gives the wild-type system better long-term stability.