Taxadiene-5α-hydroxylase (T5αH)


Construction of Heterologous T5OH Synthesis Pathway

To validate the generality and robustness of REvoDesign, we first verified its performance and then applied it to engineer taxadiene-5α-hydroxylase (T5αH), which is involved in the biosynthesis of paclitaxel (Fig.1). Reconstituting the biosynthetic pathway in microorganisms is a highly promising route to heterologous paclitaxel production. However, efficient de novo synthesis of paclitaxel in microbial hosts remains challenging because of bottlenecks along the pathway. T5αH is one such enzyme-level bottleneck: it shows poor activity and selectivity when expressed in microbial hosts. Enhancing the conversion of taxadiene into taxadien-5α-ol (T5OH) and reducing byproduct accumulation would therefore improve pathway control and performance.

Fig.1 Schematic diagram of the T5OH biosynthesis pathway.

Vector Construction of Genes Related to the T5OH Biosynthesis Pathway

Through molecular cloning, the taxadiene synthesis genes GGPPs and TS were cloned into the vector pMASC02-Ura3, yielding the recombinant plasmid pMASC02-TTPI1-TS-Linker-GGPPs-PGAL10-Ura3 (Fig.2a). Meanwhile, T5αH and CPR were cloned into the vector pMASC04-HIS3, yielding the recombinant plasmid pMASC04-TFBA1-CPR-PGAL10-PGAL1-T5αH-TPDC1-His3 (Fig.2b). The plasmid maps and the E. coli colony PCR results are shown in Fig.2. The colony PCR results were consistent with the expected outcomes, and the sequencing results were correct.

Fig.2 Construction of gene expression vector.

a The plasmid carrying the GGPPs and TS genes. Key functional elements are labeled: promoter GAL10, terminator TPI1, and the selectable marker gene Ura3. b The plasmid carrying the T5αH and CPR genes, with key elements including the GAL1 and GAL10 promoters, the PDC1 and FBA1 terminators, and the His3 selectable marker gene. The figure below shows the PCR amplification bands of the recombinant plasmid.

Construction of T5OH-Engineering Strains

The linearized recombinant plasmid pMASC02-TTPI1-TS-Linker-GGPPs-PGAL10-Ura3 was integrated into the genome of BY4742 to give strain TY01. Subsequently, the linearized plasmid pMASC04-TFBA1-CPR-PGAL10-PGAL1-T5αH-TPDC1-His3 was transformed into TY01 (used as the chassis cell), resulting in the T5OH-producing strain TY02. Genomic PCR confirmed correct integration of the plasmid (Fig.3a). The recombinant yeast was plated on Leu-deficient plates, and white single colonies were observed on the surface of the medium after 2 days (Fig.3b).

Fig.3 Construction of T5OH engineering strain.

a The electrophoresis profile of PCR detection for transformed yeast strains. b The colony morphology of recombinant yeast inoculated onto Leu-deficient (leucine-deficient) medium plates after 2 days of cultivation.

Rational Design of T5αH Mutant Proteins Using REvoDesign

We set out to design and engineer T5αH variants using our REvoDesign approach. An apo structural model of T5αH was generated with AlphaFold2, and the heme group from its top-hit crystal structure (PDB: 6A15) was transplanted into the model using the pair-fit wizard of PyMOL. The resulting model was then relaxed with Rosetta FastRelax to remove steric clashes. Next, DiffDock blind docking was performed to place the taxadiene molecule into the relaxed T5αH-heme model. After aligning the diffusion trajectories, the top-ranked pose was selected as the initial model and refined further with RosettaLigand (Fig.4). The final optimized model was used for the subsequent design and evaluation steps.

Fig.4 Optimized T5αH-heme complex model.

Functional Verification of T5αH Mutant

To systematically explore how mutations targeting activity-related sites and stability-related sites affect T5OH (taxadien-5α-ol) production, we first defined two functional regions of T5αH to guide site-specific mutation design:

1. Activity-related region (substrate pocket): Residues within a 6-angstrom radius of T5OH (used as a substrate analog) were designated as the substrate pocket—these residues directly interact with the substrate taxadiene and participate in catalytic reactions, so mutations here primarily regulate T5αH’s catalytic activity (Tab.1).

Tab. 1 Screening of activity-related mutants.

2. Stability-related region: Residues that are non-overlapping with the substrate pocket and cofactor pocket (heme-binding region) but critical for maintaining T5αH’s three-dimensional structure were defined as stability-related sites—mutations here aim to improve the enzyme’s thermal stability or structural robustness without interfering with catalytic function (Tab. 2).

Tab. 2 Screening of Mutants Based on Thermal Stability
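
The distance-based pocket definition above can be illustrated with a minimal sketch. It assumes atom coordinates are already available as plain tuples; the function name `residues_within_cutoff` and the toy coordinates are hypothetical and are not taken from REvoDesign or from the T5αH model.

```python
import math

def residues_within_cutoff(protein_atoms, ligand_atoms, cutoff=6.0):
    """Return sorted residue numbers that have at least one atom
    within `cutoff` angstroms of any ligand atom.

    protein_atoms: list of (residue_number, (x, y, z)) tuples
    ligand_atoms:  list of (x, y, z) tuples
    """
    pocket = set()
    for resnum, coord in protein_atoms:
        if resnum in pocket:
            continue  # residue already selected via another atom
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            pocket.add(resnum)
    return sorted(pocket)

# Toy coordinates for illustration only (hypothetical values).
protein_atoms = [
    (72, (1.0, 0.0, 0.0)),
    (72, (9.0, 0.0, 0.0)),
    (122, (20.0, 0.0, 0.0)),
    (226, (0.0, 5.0, 0.0)),
]
ligand_atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

pocket_residues = residues_within_cutoff(protein_atoms, ligand_atoms)
print(pocket_residues)
```

Under this sketch, stability-related candidates would be the residues left over after removing both the substrate pocket and the heme-binding pocket from the full residue list.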

During mutant design, substitutions involving proline (Pro) or cysteine (Cys) were excluded. Pro has a rigid cyclic side chain that constrains the protein backbone, so adding or removing it may disrupt the local structure; Cys can form unwanted disulfide bonds or interfere with heme binding. Either effect would be detrimental to T5αH’s function. Using the "Visualize" function of REvoDesign, we further screened mutants on two key metrics: (1) for activity-related mutants, the change in substrate binding energy (to predict catalytic efficiency); (2) for stability-related mutants, the change in protein folding free energy (to predict structural stability). In total, 23 mutants were selected, 15 targeting activity-related sites and 8 targeting stability-related sites, to verify separately how mutations in each type of site influence T5OH production. The catalytic activity and stability of these mutants were measured, and the results are shown in Fig.5.
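
The exclusion and ranking step can be sketched as a simple filter. This is an illustration under stated assumptions, not REvoDesign's implementation: the mutation labels and scores below are hypothetical placeholders, "involving Pro or Cys" is read here as excluding either residue on the wild-type or the mutant side, and a more negative predicted energy change is assumed to be better.

```python
import re

EXCLUDED_RESIDUES = {"P", "C"}  # proline and cysteine
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse_mutation(label):
    """Split a label such as 'L72M' into (wild_type, position, mutant)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant

def screen_mutations(candidates, max_keep):
    """Drop Pro/Cys substitutions, then keep the `max_keep` lowest scores.

    candidates: dict mapping mutation label -> predicted energy change
                (assumed convention: more negative is more favorable)
    """
    allowed = {}
    for label, score in candidates.items():
        wild_type, _, mutant = parse_mutation(label)
        if wild_type in EXCLUDED_RESIDUES or mutant in EXCLUDED_RESIDUES:
            continue
        allowed[label] = score
    ranked = sorted(allowed, key=allowed.get)
    return ranked[:max_keep]

# Hypothetical candidates and scores, for illustration only.
candidates = {
    "A10V": -1.8,
    "S25T": -1.2,
    "G40P": -2.5,  # introduces Pro, excluded
    "C55S": -0.9,  # removes Cys, excluded
    "L60C": -3.0,  # introduces Cys, excluded
    "K80R": 0.4,
}
selected = screen_mutations(candidates, max_keep=2)
print(selected)
```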

Fig.5 Effects of T5αH activity-related and stability-related mutations on T5OH production.

a. T5OH production of activity-related mutants relative to WT T5αH. The vertical axis represents relative T5OH yield. Among the mutants targeting activity-related sites, L72M and V226E showed highly significant increases in T5OH yield, reaching ~2.3-fold and ~2.1-fold that of WT, respectively; we attribute this to enhanced substrate binding affinity and catalytic turnover. b. Residual activity of stability-related mutants and its correlation with T5OH yield. Mutants targeting stability-related sites showed a clear correlation between stability and yield; Q122A and Q266A exhibited highly significant yield improvements. Statistical significance is marked by asterisks: ***p < 0.001.

T5OH Yield Verification of T5αH Double Mutants

To further enhance T5OH (taxadien-5α-ol) production, we built on the single-site results (Fig.5) and rationally combined activity-related and stability-related mutations. High-performing single mutants that target non-overlapping functional regions were chosen for site-directed combination, to avoid mutual interference between mutations. This design was guided by REvoDesign’s predictive analysis: the selected activity-related sites (in the substrate pocket) and stability-related sites (in the structural maintenance region) are spatially separated and functionally independent, so combining them was not expected to disrupt the enzyme’s overall structure or catalytic core.

We constructed recombinant plasmids carrying the different double-mutant sequences via molecular cloning. The high-yield single mutants described above were combined in various ways to obtain double mutants, which were then tested for a further increase in yield.
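
As an illustration of the combination strategy, the sketch below enumerates double mutants by pairing one activity-related hit with one stability-related hit. The four single mutants are those highlighted in Fig.5; which combinations were actually built is not listed in the text, so the cross-class pairing and the `A/B` naming are assumptions.

```python
from itertools import product

# Single mutants highlighted in Fig.5.
activity_hits = ["L72M", "V226E"]
stability_hits = ["Q122A", "Q266A"]

def double_mutants(activity, stability):
    """Pair each activity-related mutation with each stability-related one."""
    return [f"{a}/{s}" for a, s in product(activity, stability)]

combos = double_mutants(activity_hits, stability_hits)
print(combos)
```

Pairing across the two classes keeps every double mutant within the "non-overlapping functional regions" constraint described above.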

All strains were subjected to shake-flask fermentation under the same conditions, and T5OH yields were quantified by HPLC. The results are shown in Fig.6.

Fig.6 Comparison of T5OH yields among T5αH Wild-Type, single mutants, and double mutants.

In summary, the yield data for the double mutants confirm that multi-dimensional optimization of T5αH can substantially improve T5OH production, and they further support REvoDesign’s effectiveness as a tool for rational enzyme engineering.

Bifunctional phytoene synthase/lycopene cyclase (CarRP)


Construction of yeast chassis

Saccharomyces cerevisiae can use its intrinsic MVA pathway to provide precursors for terpenoid synthesis; however, its natural metabolic network lacks the complete enzyme system required for lycopene biosynthesis. Therefore, in this study, the CarRP and CarB genes, which encode the bifunctional phytoene synthase/lycopene cyclase and phytoene desaturase, respectively, were cloned into gene integration vectors. These genes were randomly integrated into the genome of Saccharomyces cerevisiae to achieve stable lycopene synthesis.

In the MVA pathway, acetyl-CoA is converted through a series of enzymatic reactions into the intermediates that feed lycopene synthesis. However, the limited catalytic capacity of the enzymes at each step keeps the rate of product synthesis low. Therefore, when constructing the lycopene synthesis pathway, we introduced CarG together with CarRP and CarB (Fig.7) to enhance lycopene production.

Fig. 7 Lycopene synthesis pathway.

Vector Construction of Genes Related to the Lycopene Biosynthesis Pathway

Through molecular cloning, the lycopene synthesis genes CarB and CarG, derived from Mucor circinelloides, were cloned into the vector pMASC02-Ura3, yielding the recombinant plasmid pMASC02-TTPI1-CarB-PGAL10-PGAL1-CarG-TPGI-Ura3 (Fig.8a). Meanwhile, CarRP was cloned into the vector pMASC04-HIS3, yielding the recombinant plasmid pMASC04-PGAL1-CarRP-TPDC1-His3 (Fig.8b). The plasmid maps and the E. coli colony PCR results are shown in Fig.8. The colony PCR results were consistent with the expected outcomes, and the sequencing results were correct.

Fig. 8 Construction of gene expression vector.

a. The plasmid carrying the CarB and CarG genes. Key functional elements are labeled: promoter GAL10, promoter GAL1, terminator PGI, terminator TPI1, and the selectable marker gene Ura3. b. The plasmid carrying the CarRP gene, with key elements including the GAL1 promoter, PDC1 terminator, and His3 selectable marker gene. The figure below shows the PCR amplification bands of the recombinant plasmid.

Construction of Lycopene Engineering Strains

The linearized recombinant plasmid pMASC02-TTPI1-CarB-PGAL10-PGAL1-CarG-TPGI-Ura3 was integrated into the genome of BY4742. After verification by genomic PCR (Fig.9a), the strain BY01 was obtained. Subsequently, the linearized plasmid pMASC04-PGAL1-CarRP-TPDC1-His3 was transformed into BY01 (used as the chassis cell), resulting in the lycopene-producing strain BY02. Lycopene is a natural pigment with a vivid red color, allowing positive strains to be screened based on color. The recombinant yeast was plated on His-deficient plates, and orange single colonies were observed on the surface of the medium after 2 days (Fig.9b).

Fig. 9 Construction of lycopene engineering strain.

a The electrophoresis profile of PCR detection for transformed yeast strains. b The colony morphology of recombinant yeast inoculated onto His-deficient (Histidine-deficient) medium plates after 2 days of cultivation.

CarG, CarB, and CarRP were integrated into the genome of Saccharomyces cerevisiae to construct the lycopene biosynthesis pathway. However, CarRP acts as both a phytoene synthase and a lycopene cyclase, and its cyclization activity dominates in the heterologous Saccharomyces cerevisiae expression system. As a result, lycopene, a key intermediate in the pathway, is rapidly cyclized into β-carotene and cannot accumulate substantially. Shake-flask fermentation followed by HPLC detection revealed that the products of BY02 (https://doi.org/10.1038/s41467-022-28277-w) contained a large amount of β-carotene (1.4 g/L), while the lycopene yield was only 0.024 g/L (Fig.10a,b).

Fig. 10 HPLC Determination of β-Carotene and Lycopene Contents.

a. There are no characteristic peaks of β-carotene and lycopene in the chromatogram of the wild-type strain; two key peaks appear in the chromatogram of the BY02 strain: one strong peak corresponding to β-carotene, and one weak peak corresponding to lycopene. b. The β-carotene yield of the BY02 strain reaches 1.4 g/L, while the lycopene yield is only 0.024 g/L.

CarRP: Inactivation of the Lycopene Cyclase Function

Rational Design of CarRP Mutant Proteins Using REvoDesign

Therefore, functional modification of the key enzyme CarRP through protein engineering, attenuating or even eliminating its cyclization activity while preserving its phytoene synthase activity, is a crucial step toward efficient synthesis and accumulation of lycopene. CarRP contains two functional domains: the R domain exhibits lycopene cyclase activity and the P domain possesses phytoene synthase activity, but the P domain requires the correct conformation of the R domain to function. Loss-of-function mutations in the R domain must therefore leave its overall fold intact. To this end, the enzyme-substrate complex was first modeled and docked using AlphaFold3 and DiffDock. Residues in the R domain within 6 Å of the substrate lycopene were selected as the potential active center. Furthermore, to avoid disrupting the phytoene synthase function of the P domain, we targeted only residues far from the P domain for mutagenesis and experiments. Through rational inspection with the "visualization" function of REvoDesign, we finally chose two mutations, F81A and Y145F (Fig. 11 and Tab. 3).

Fig. 11 Complex model of CarRP and lycopene and the active center sites.

These two sites were used for mutant verification. The substitutions may alter the electric field distribution in this region, and thereby interactions with other charged molecules, or change the channel size. Either change could modify the binding properties of CarRP's active site and weaken the interaction between CarRP and lycopene.

Tab. 3 Design strategies for protein modification

Functional Verification of CarRP Mutant

The mutated CarRP sequences were cloned into the vector pMASC04-HIS3, yielding the recombinant plasmids pMASC04-PGAL1-CarRP_F81A-TPDC1-His3 and pMASC04-PGAL1-CarRP_Y145F-TPDC1-His3. The mutant plasmids were linearized and separately transformed into strain BY01, resulting in strains BY03 and BY04. These two strains, together with strain BY02, were subjected to shake flask fermentation simultaneously, and the products were detected by HPLC.

HPLC analysis showed that the fermentation products of CarRP_F81A and CarRP_Y145F mutants both exhibited a high lycopene peak, with yields of 1.42 g/L and 1.46 g/L, respectively. CarRP_F81A produced a very small amount of β-carotene, while CarRP_Y145F produced no β-carotene at all. Furthermore, the lycopene yields of each mutant strain were comparable to the β-carotene yield of the strain containing wild-type CarRP, indicating that the two CarRP mutants designed in this study had almost lost their lycopene cyclase activity. As a result, lycopene no longer cyclized to form β-carotene but accumulated as the final product in Saccharomyces cerevisiae.
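
The size of the shift toward lycopene can be made explicit with a short calculation on the titers reported above. This is a sketch only: the variable names are illustrative, and the β-carotene titer of BY03 is not quantified in the text, so it is left out rather than guessed.

```python
# Titers in g/L as reported in the text.
lycopene = {"BY02": 0.024, "BY03": 1.42, "BY04": 1.46}
beta_carotene_by02 = 1.4

def fold_change(value, reference):
    """Ratio of a titer to the reference titer."""
    if reference <= 0:
        raise ValueError("reference titer must be positive")
    return value / reference

# Lycopene titer of each mutant strain relative to wild-type CarRP (BY02).
fold_vs_wt = {
    strain: fold_change(titer, lycopene["BY02"])
    for strain, titer in lycopene.items()
    if strain != "BY02"
}

# Share of lycopene among the two quantified carotenoids in BY02.
lycopene_share_by02 = lycopene["BY02"] / (lycopene["BY02"] + beta_carotene_by02)

for strain, fold in fold_vs_wt.items():
    print(f"{strain}: {fold:.1f}-fold lycopene vs BY02")
print(f"BY02 lycopene share: {lycopene_share_by02:.1%}")
```

On these numbers, both mutants raise the lycopene titer roughly 60-fold over BY02, in which lycopene makes up under 2% of the two quantified carotenoids.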

The cellular appearance further confirmed that the mutations significantly impaired cyclase activity, leading to massive accumulation of lycopene and giving the cells a bright red color. All these experimental results are presented in Fig. 12.

Fig. 12 Functional validation of CarRP mutants.

a Chromatograms of fermentation products from the three strains (BY02: wild-type CarRP; BY03: CarRP_F81A; BY04: CarRP_Y145F). b Product content of each strain: the lycopene yield of BY03 is 1.42 g/L, and that of BY04 is 1.46 g/L; the lycopene yields of both strains are comparable to the β-carotene yield of BY02.

These results indicate that REvoDesign is efficient and practical: it quickly identified key sites that play a decisive role in the binding of CarRP to lycopene and in the cyclization of lycopene to β-carotene. Mutations at these sites can significantly reduce or even completely abolish cyclization activity, thereby promoting lycopene accumulation. Of the two mutants, Y145F completely lost its lycopene cyclase function and gave the highest lycopene yield; strain BY04 was therefore selected for the subsequent fermentation.

High-Cell-Density Fermentation (HCDF) of Recombinant Strains in Fermenters

Recombinant strain BY04, containing the pMASC04-PGAL1-CarRP_Y145F-TPDC1-His3 gene expression cassette, was activated and then inoculated into a 3 L fermenter for high-cell-density fermentation. During the initial phase, a higher aeration rate was set, and the stirring speed was adjusted in real time to maintain dissolved oxygen at 20% saturation, promoting cell proliferation and a rapid increase in biomass. 0.2 M PBS was added to stabilize the pH of the medium and to regulate the ion balance, optimizing the growth environment. After 48 hours of cell growth, low-rate fed-batch feeding was initiated to maintain a low glucose concentration and optimize the carbon-nitrogen ratio, thereby balancing carbon flux between the MVA pathway and the lipid synthesis pathway and ultimately promoting efficient synthesis and accumulation of lycopene (Fig.13a).

Fig. 13 Growth and lycopene yield of BY04 fermented in fermenter.

a High-density fermentation in a 3L fermenter. b Comparison of cell growth rate and lycopene productivity at different stages of fermentation.

As shown in Fig.13b, during the early stage of fermentation (0-48 h), cells grew rapidly and carbon flux mainly supported primary metabolism and cell proliferation. At this stage the biomass was low, and lycopene productivity was also low (0.033 g/L/h). As glucose in the medium was depleted, carbon utilization shifted from cell proliferation and lipid synthesis toward lycopene synthesis, and biomass growth gradually slowed. After 216 h of continuous fed-batch fermentation, ~9.0 g/L of lycopene was obtained, corresponding to a productivity of 0.042 g/L/h. These data demonstrate the robustness of our REvoDesign-guided engineered strains in scale-up and high-cell-density fermentation.
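
The overall productivity figure follows directly from the final titer and the fermentation time. The sketch below reproduces that arithmetic. The function name is illustrative, and the early-stage titer is only an implied value, assuming the reported 0.033 g/L/h is an average over the first 48 h.

```python
def volumetric_productivity(titer_g_per_l, hours):
    """Average volumetric productivity in g/L/h."""
    if hours <= 0:
        raise ValueError("fermentation time must be positive")
    return titer_g_per_l / hours

# Final titer (~9.0 g/L) and total fermentation time (216 h) from the text.
overall = volumetric_productivity(9.0, 216)

# Titer implied at 48 h if 0.033 g/L/h is the average early-stage rate
# (an assumption; the 48 h titer itself is not reported).
implied_titer_48h = 0.033 * 48

print(f"overall productivity: {overall:.3f} g/L/h")
print(f"implied titer at 48 h: {implied_titer_48h:.2f} g/L")
```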