Utilizing Fatty Acids
Objective: To connect the fatty-acid catabolic pathway with the ginsenoside Ro biosynthetic pathway so that Saccharomyces cerevisiae can use fatty acids as a carbon source to synthesize Ro.
Cycle 1: Peroxisomal Localization
In S. cerevisiae, fatty acids are naturally catabolized via β-oxidation in the peroxisome to generate acetyl-CoA. From the literature, we learned that acetyl-CoA cannot freely pass through the peroxisomal membrane, whereas the subsequent steps of the mevalonate (MVA) pathway take place in the cytosol. Therefore, the acetyl-CoA released from fatty acids must first be made accessible to the MVA pathway before it can be used for Ro synthesis.
Design:
The literature indicates that metabolites produced from acetyl-CoA through the MVA pathway, such as mevalonate (MVA), isopentenyl pyrophosphate (IPP), and dimethylallyl pyrophosphate (DMAPP), can freely cross the peroxisomal membrane. Therefore, we introduced the key MVA enzymes ERG10, ERG13, and HMG1 into S. cerevisiae and targeted their expression to the peroxisome.
Build:
We designed plasmids driven by constitutive promoters and carrying a tryptophan selection marker. To ensure that the proteins localize to the peroxisome, we fused the genes of interest to the peroxisomal targeting signal 1 (PTS1) via a flexible linker.

Test:
Using equimolar carbon amounts of lauric acid, oleic acid, and palmitic acid as fatty-acid carbon sources, with glycerol added at a constant concentration as an additional carbon source, we monitored the growth curves of Saccharomyces cerevisiae to assess fatty-acid utilization. The results showed that, after transformation with plasmid 1, the yeast's ability to utilize fatty acids was markedly improved; however, its utilization of lauric acid was weaker than that of palmitic acid and oleic acid.
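As a minimal sketch of what "equimolar carbon amounts" means in practice, the snippet below converts a target carbon concentration into the concentration of each fatty acid. The 72 mM carbon target is a hypothetical value chosen for illustration, not the concentration used in our experiments, and the molar masses are approximate.

```python
# Carbon atoms per molecule and approximate molar mass (g/mol) of each fatty acid.
FATTY_ACIDS = {
    "lauric acid": {"carbons": 12, "molar_mass": 200.32},    # C12H24O2
    "palmitic acid": {"carbons": 16, "molar_mass": 256.42},  # C16H32O2
    "oleic acid": {"carbons": 18, "molar_mass": 282.46},     # C18H34O2
}


def equimolar_carbon_doses(carbon_mM):
    """Return the fatty-acid dose (mM and g/L) that supplies carbon_mM of carbon."""
    doses = {}
    for name, props in FATTY_ACIDS.items():
        conc_mM = carbon_mM / props["carbons"]
        grams_per_litre = conc_mM / 1000 * props["molar_mass"]
        doses[name] = {"mM": conc_mM, "g_per_L": grams_per_litre}
    return doses


# Hypothetical target: every condition receives 72 mM of fatty-acid carbon.
doses = equimolar_carbon_doses(72.0)
for name, dose in doses.items():
    print(f"{name}: {dose['mM']:.2f} mM ({dose['g_per_L']:.3f} g/L)")
```

Because shorter chains carry fewer carbons per molecule, the lauric acid condition needs the highest molar concentration, which is relevant when considering concentration-dependent toxicity.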

Learn:
Based on this anomalous observation, we considered that lauric acid may not efficiently enter yeast cells and is therefore difficult to utilize. Literature indicates that:
(1) Palmitic acid is one of the predominant saturated fatty acids in the phospholipids of S. cerevisiae cell membranes. Accordingly, yeast fatty-acid transporters (e.g., Fat1p) exhibit high affinity for fatty acids of C14-C18 chain length.
(2) Faa1p and Faa4p are the key enzymes primarily responsible for activating long-chain fatty acids in yeast. Their active-site architecture is best suited to C12-C18 fatty acids, with typically highest activity toward C14-C16.
(3) Medium-chain fatty acids possess stronger detergent effects than long-chain fatty acids and can disrupt membrane integrity more effectively. When environmental lauric acid concentrations are high, this imposes greater membrane toxicity on yeast, thereby inhibiting cell growth.
Cycle 2: Mixed Substrate Optimization
Design:
We redesigned the fatty-acid carbon source, favoring mixed (combinational) fatty-acid inputs. To verify whether the oleic acid + palmitic acid pair is the optimal combination, we formulated four mixed fatty-acid media and measured the growth curves.
Build:
Same as Cycle 1.
Test:
Using four mixed fatty-acid carbon sources, we monitored the growth curves of Saccharomyces cerevisiae to evaluate its fatty-acid utilization.
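One way to compare growth curves across media is to estimate the maximum specific growth rate from the steepest part of the ln(OD) curve. The sketch below is an illustration under stated assumptions: the function name, the three-point window, and the readings are all hypothetical and do not represent our measured data.

```python
import math


def max_specific_growth_rate(times_h, od_values, window=3):
    """Largest least-squares slope of ln(OD) vs time over `window` consecutive points."""
    if len(times_h) != len(od_values) or len(times_h) < window or window < 2:
        raise ValueError("need matching lists with at least `window` (>= 2) points")
    log_od = [math.log(od) for od in od_values]
    best = float("-inf")
    for start in range(len(times_h) - window + 1):
        t = times_h[start:start + window]
        y = log_od[start:start + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        # Ordinary least-squares slope for this window.
        num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        den = sum((ti - t_mean) ** 2 for ti in t)
        best = max(best, num / den)
    return best


# Synthetic, perfectly exponential readings (illustration only, not measured data).
times_h = [0, 2, 4, 6, 8, 10]
od_exponential = [0.05 * math.exp(0.3 * t) for t in times_h]
mu = max_specific_growth_rate(times_h, od_exponential)
print(f"mu_max = {mu:.3f} per hour")
```

Applying the same function to each of the four mixed-substrate curves would give one number per medium that can be compared directly.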

Learn:
With mixed fatty-acid carbon sources, Saccharomyces cerevisiae showed the fastest growth when oleic acid + palmitic acid were used as the mixture, suggesting that this medium is better suited for S. cerevisiae to produce ginsenoside Ro.
Constructing the OA→CE Metabolic Pathway
Objective: To screen and build the optimal metabolic route to achieve the conversion from OA to CE.
Cycle 1: Enzyme Discovery
Design:
Through gene family analysis, we constructed a phylogenetic tree covering Csl enzymes from the genomes of ginseng, notoginseng, and soybean, and screened Csl enzymes capable of catalyzing the conversion of OA to CE.
Build:
We first obtained the genome files of ginseng, notoginseng, and soybean from NCBI. After genome annotation, we extracted protein-coding sequence FASTA files. Using Pn022859 as the reference sequence, we searched for candidate Csl enzymes in the three species with BLAST and hidden Markov model (HMM) searches. We then performed domain and motif analyses of the candidates using NCBI CD-Search and MEME Suite to determine the distribution of functional domains and motifs. Next, we merged the candidate Csl FASTA files from the three species, removed low-similarity and overly short sequences, and built a phylogenetic tree using the neighbor-joining (NJ) method to classify the candidates into gene families and subfamilies.
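The "remove overly short sequences" step can be sketched as a simple length filter on the merged FASTA file. This is an illustrative sketch only: the function names, the 50% length cut-off, and the toy sequences are assumptions, not the thresholds or data used in our pipeline.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {identifier: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            header = fields[0] if fields else ""
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def drop_short_sequences(records, reference_id, min_fraction=0.5):
    """Keep sequences at least min_fraction as long as the reference sequence."""
    ref_len = len(records[reference_id])
    return {
        name: seq
        for name, seq in records.items()
        if len(seq) >= min_fraction * ref_len
    }


# Toy sequences for illustration only (not real Csl sequences).
toy_fasta = """>Pn022859 reference
MKTAYIAKQR
QISFVKSHFS
>candidate_full
MKTAYIAKQRQISFVKSH
>candidate_fragment
MKTAY
"""
kept = drop_short_sequences(read_fasta(toy_fasta), "Pn022859")
print(sorted(kept))
```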

Test:
By constructing a phylogenetic tree, we identified 21 Csl enzyme sequences belonging to the same family as Pn022859. Further subfamily delineation revealed that the closest evolutionary relatives to Pn022859 are EVM0017102 and EVM0010403, which are Csl sequences from Panax ginseng.

Learn:
Gene family classification clusters genes by sequence similarity; however, whether closely related genes necessarily share similar functions remains to be determined.
Cycle 2: Computational Validation
Design:
To investigate whether genes closely related to Pn022859 in the phylogenetic tree possess similar functions, or even superior catalytic efficiency, we validated the sequences in the same family as Pn022859 using molecular docking and molecular dynamics.
Build:
We first used AlphaFold 3 to construct three-dimensional models for the 21 Csl candidate sequences in the same family as Pn022859. The 3D structure of the Csl substrate OA was obtained from PubChem. We performed molecular docking between each candidate Csl enzyme and OA using AutoDock Vina, selected the top five Csl enzymes by best binding affinity, and then conducted molecular dynamics simulations with GROMACS to verify whether these Csl enzymes in the same family as Pn022859 exhibit similar or even better catalytic performance toward OA.
Test:
We validated the above with molecular docking and molecular dynamics simulations. The docking results showed that Pn022859 binds OA well (Best Affinity = -12.6 kcal/mol). Four other Csl enzymes docked with equal or stronger predicted affinity: EVM0004671 from ginseng (-12.6 kcal/mol, equal to Pn022859) and three from soybean, XM_041006423 (-12.7 kcal/mol), XM_003536208 (-13.0 kcal/mol), and NM_001365113 (-13.2 kcal/mol). Based on the docking results, there remains room to improve the Csl enzyme used for the OA→CE reaction. We therefore carried out MD simulations in GROMACS for the top-five Csl-OA docked complexes. The Pn022859 complex shows the smallest radius-of-gyration (Rg) distribution among the complexes, indicating favorable dynamic stability. For details, see the Model page.
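The docking scores reported above can be ranked with a few lines of code. In the sketch below the affinities are the values stated in this section, more negative scores are treated as stronger predicted binding, and the variable names are our own illustrative choices.

```python
REFERENCE = "Pn022859"

# Best docking affinity (kcal/mol) for each enzyme, as reported in this section.
best_affinity = {
    "Pn022859": -12.6,
    "EVM0004671": -12.6,
    "XM_041006423": -12.7,
    "XM_003536208": -13.0,
    "NM_001365113": -13.2,
}

# More negative means stronger predicted binding, so sort ascending.
ranked = sorted(best_affinity.items(), key=lambda item: item[1])
ref_score = best_affinity[REFERENCE]

strictly_better = [name for name, score in ranked if score < ref_score]
tied = [
    name for name, score in ranked
    if score == ref_score and name != REFERENCE
]

for name, score in ranked:
    print(f"{name}: {score:.1f} kcal/mol")
print("Stronger than reference:", strictly_better)
print("Equal to reference:", tied)
```

Differences of a few tenths of a kcal/mol are within the typical uncertainty of docking scores, which is why the ranking was followed by MD simulations rather than used on its own.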


Learn:
At the dry-lab level, we screened Csl sequences related to Pn022859 and used molecular docking and molecular dynamics simulations to assess their binding to OA. Pn022859 bound OA strongly and formed the most compact complex in MD, supporting its use for the OA→CE conversion. However, several candidates docked with equal or slightly stronger affinity, indicating that this reaction step still has room for further optimization.
Cycle 3: UDP-GlcA Test
Design:
Through dry-lab screening, we identified Pn022859 as the key enzyme for converting OA to CE. This step requires UDP-GlcA (uridine diphosphate glucuronic acid); based on the literature, we selected the UDP-glucose dehydrogenase AtUGDH and transformed it into Saccharomyces cerevisiae.
Build:
We designed two plasmids driven by constitutive promoters, each carrying a leucine selection marker.

Test:
On day 3 post-transformation, we harvested cells, extracted intracellular contents, and quantified R1 by high-performance liquid chromatography (HPLC).

Learn:
After introducing AtUGDH, R1 levels decreased rather than increased. We hypothesize that AtUGDH, which converts UDP-glucose (UDPG) into UDP-GlcA, depletes the UDPG pool and thereby reduces downstream glycosylation efficiency. Therefore, we chose not to introduce AtUGDH at this step.
Construction of the CE-to-Ro Metabolic Pathway
Objective: Screening Glycosyltransferases and Optimizing the Metabolic Pathway
Cycle 1: In Vivo Assembly
Design:
Based on literature research, we selected two glycosyltransferases, UGT73P40 and UGT73F3, from Panax notoginseng and co-expressed them with Pn022859 in Saccharomyces cerevisiae.
Build:
The genes for UGT73P40, UGT73F3, and Pn022859 were codon-optimized, synthesized, and assembled into a single expression plasmid. This plasmid was driven by constitutive promoters and carried a leucine selection marker for stable transformation and maintenance in the yeast chassis.
Test:
Three days post-transformation, intracellular metabolites were extracted from Saccharomyces cerevisiae and analyzed by high-performance liquid chromatography (HPLC). The results showed that the ginsenoside Ro standard peak appeared at t = 59 min. Since Ro is a trisaccharide saponin, the appearance of an additional peak at t = 85 min in the sample chromatogram is likely attributable to a disaccharide ginsenoside byproduct.
Learn:
To explain this unexpected result, we hypothesize that it may stem from compatibility issues after transferring the Panax notoginseng glycosyltransferase system into Saccharomyces cerevisiae. In plants, glycosylation levels of these enzymes are generally low, whereas yeast exhibits comparatively higher glycosylation activity. Consequently, the transferred glycosyltransferases may undergo hyperglycosylation, which could reduce their enzymatic activity and ultimately prevent the formation of the trisaccharide ginsenoside Ro.
Cycle 2: In Vitro Activity
Objective: Assessment of Glycosyltransferase Activity in Saccharomyces cerevisiae.
Design:
Crude intracellular enzymes were extracted from Saccharomyces cerevisiae and incubated in vitro with the respective substrates R1 or IVa. The formation of the final product, ginsenoside Ro, was monitored to determine whether the enzymatic conversion was successful. The plasmid design was the same as described in Cycle 1.
Build:
We cultured the engineered yeast strain from Cycle 1, lysed the cells, and prepared active crude enzyme extracts. Parallel in vitro reaction systems were established, each containing the enzyme extract supplemented with either substrate R1 or IVa and the required sugar donor UDP-glucose.
Test:
The results showed that the conversion efficiency of R1 to ginsenoside Ro was significantly lower than that of IVa.
Learn:
In vitro experiments demonstrated that the glycosyltransferases expressed in Saccharomyces cerevisiae are enzymatically active. However, in vivo assays revealed substantial accumulation of disaccharide byproducts, suggesting that substrate specificity issues may lead to intermediate buildup. The results indicated that the conversion efficiency from R1 to ginsenoside Ro is considerably lower than that from IVa to Ro. Based on these findings, we aim to apply directed evolution to the enzyme to alter its substrate specificity and thereby reduce the formation of the intermediate R1.
Single-point fully saturated computational screening for enhanced activity of UGT73F3
Objective: To identify potential activity-enhancing mutants via '497 × 19 single-point saturation mutation + machine learning prediction + molecular docking validation', thereby defining targets for subsequent validation.
Cycle 1: Machine Learning Models
Objective: Training and Evaluating Mutation Effect Predictors
Design:
We used dataset S10988 (10,510 mutations; 5,255 positive and 5,255 negative) for five-fold cross-validation, with S2814 (5,978 mutations; 1,118 positive and 4,860 negative) as an independent test set. Features comprised 1,280-dimensional sequence-embedding differences (WT-Mut, dESM) from ESM1v/ESM2b, combined with AlphaFold3 structure mapping within a 10 Å neighbourhood of each mutation site. Models included GCN/GAT (structure + sequence), MLP (sequence differences only), and SVM/RF (comparative baselines).
Build:
GCN (GraphConv ×2) and GAT (GATConv ×2, multi-head attention) were implemented in DGL, each followed by global average pooling. The MLP architecture was 1280→512→1024→2 with ReLU activations. SVM and RF hyperparameters were chosen by grid search. Training used binary cross-entropy loss, Adam (learning rate 1e-4), and early stopping.
Test:
On S10988 cross-validation, the graph models and MLP achieved AUC ≈ 0.96-0.99 and Accuracy/F1 ≈ 0.94-0.95. On the independent set S2814, AUC was ≈ 0.62-0.70 and specificity (TNR) was 0.86-0.95, but sensitivity (TPR) was only ≈ 0.21-0.35 (e.g., AF3-GCN TPR = 0.24).
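To illustrate why high specificity and low sensitivity can coexist on an imbalanced test set, the sketch below rebuilds a confusion matrix from the S2814 class sizes. The TPR of 0.24 is the AF3-GCN value quoted above; the TNR of 0.90 is an assumed value inside the reported range, so the resulting accuracy and F1 are illustrative rather than measured.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, precision, accuracy and F1 from counts."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    return {
        "tpr": tpr,
        "tnr": tnr,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Class sizes of the independent set S2814.
positives, negatives = 1118, 4860

# TPR = 0.24 is quoted for AF3-GCN; TNR = 0.90 is an ASSUMED illustrative value.
tp = round(0.24 * positives)
tn = round(0.90 * negatives)
metrics = binary_metrics(tp, positives - tp, tn, negatives - tn)

for key, value in metrics.items():
    print(f"{key}: {value:.3f}")
```

With negatives outnumbering positives by more than four to one, accuracy stays high even when most true positives are missed, so sensitivity and F1 are the more informative metrics for choosing candidate mutants.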
Learn:
Graph model cross-validation yields excellent results, yet sensitivity markedly declines on independent datasets, indicating generalisation challenges (potentially due to strong structural/contextual dependencies). Decision: Retain the multi-model comparison framework for computational screening of all UGT73F3 variants, proceeding to the next round.
Cycle 2: Top Mutant Screening
Objective: Full-sequence single-point mutant screening + molecular docking verification
Design:
Generate 9,443 single-point mutations of UGT73F3; score using trained model and select Top-100; perform 5 independent dockings per mutation to assess energy stability.
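The mutant library size follows from 497 positions × 19 alternative amino acids = 9,443. A minimal sketch of the enumeration is shown below; the function name is illustrative and the sequence is a placeholder of the right length, not the real UGT73F3 sequence.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_point_mutants(sequence):
    """Yield (label, mutant_sequence) for every single-residue substitution."""
    for index, wild_type in enumerate(sequence):
        for substitute in AMINO_ACIDS:
            if substitute == wild_type:
                continue  # skip the wild-type residue itself
            # Label in the usual style, e.g. S175P: wild type, 1-based position, mutant.
            label = f"{wild_type}{index + 1}{substitute}"
            mutant = sequence[:index] + substitute + sequence[index + 1:]
            yield label, mutant


# Placeholder 497-residue sequence (NOT the real UGT73F3 sequence).
placeholder = ("MSTAG" * 100)[:497]
mutants = list(single_point_mutants(placeholder))
print(len(mutants))
```

Each labelled mutant sequence can then be embedded, scored by the trained models, and ranked to select the Top-100.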
Build:
AutoDock Vina 1.2.x batch script automatically configures grids, subdirectories, and logging; extracts lowest-energy conformations; converts PDBQT to MOL2 using Open Babel 3.x; aggregates duplicate binding energies.
Test:
Multiple docking runs confirmed energy stability; S175P demonstrated optimal performance with an average binding energy of –8.64 kcal·mol⁻¹, indicating highest potential activity.
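Aggregating repeated docking runs can be sketched as follows. AutoDock Vina writes one `REMARK VINA RESULT:` line per pose in its output PDBQT file, with the affinity as the first number; the helper name and the five mock outputs below are hypothetical and do not reproduce our S175P data.

```python
from statistics import mean, stdev


def best_vina_score(pdbqt_text):
    """Return the lowest affinity (kcal/mol) among 'REMARK VINA RESULT' lines."""
    scores = [
        float(line.split()[3])
        for line in pdbqt_text.splitlines()
        if line.startswith("REMARK VINA RESULT:")
    ]
    if not scores:
        raise ValueError("no Vina result lines found")
    return min(scores)


def mock_output(best, second):
    """Build a minimal two-pose Vina-style output for demonstration."""
    return (
        "MODEL 1\n"
        f"REMARK VINA RESULT:    {best:.1f}      0.000      0.000\n"
        "ENDMDL\n"
        "MODEL 2\n"
        f"REMARK VINA RESULT:    {second:.1f}      1.532      2.871\n"
        "ENDMDL\n"
    )


# Five hypothetical independent runs for one mutant (illustration only).
runs = [
    mock_output(-8.0, -7.6),
    mock_output(-8.2, -7.9),
    mock_output(-8.1, -7.7),
    mock_output(-7.9, -7.5),
    mock_output(-8.3, -8.0),
]
best_scores = [best_vina_score(text) for text in runs]
average = mean(best_scores)
spread = stdev(best_scores)
print(f"mean = {average:.2f} kcal/mol, sd = {spread:.2f}")
```

Reporting the standard deviation alongside the mean makes it clear whether a difference between two mutants exceeds the run-to-run variation.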

Learn:
S175P identified as current top candidate (top hit); next step: conduct molecular dynamics (MD) simulations to validate stability and binding mode.
Increase UDP-glucose (UDPG) availability
Cycle: Overexpression of the key enzymes
Objective: The biosynthesis of ginsenoside Ro involves multiple glycosylation steps. By increasing intracellular UDP-glucose (UDPG) availability, the efficiency of these glycosylation reactions can be enhanced, leading to higher Ro production.
Design:
Literature reports indicate that yeast possesses a native glycerol-to-glucose metabolic pathway. To increase intracellular UDP-glucose (UDPG) levels, we overexpressed the key enzymes FBPase1, PGM2, and UGP1. The genes were assembled into a plasmid driven by constitutive promoters and carrying a tryptophan selection marker for stable expression in Saccharomyces cerevisiae.
Build:
The genes encoding the key enzymes FBPase1, PGM2, and UGP1 were codon-optimized for Saccharomyces cerevisiae, synthesized, and assembled into a single expression plasmid. This plasmid, based on the pRS314 backbone, was driven by strong constitutive promoters and carried a tryptophan selection marker.
Test:
Learn:
By introducing the key enzymes responsible for converting glycerol to glucose-derived UDP-glucose (UDPG), intracellular UDPG levels in Saccharomyces cerevisiae can be increased, facilitating the glycosylation of intermediates to produce the final product, ginsenoside Ro. To explore optimal strategies, we plan to investigate the best glycerol-to-medium ratio. Specifically, we will measure intracellular UDPG levels under different glycerol concentrations and correlate these data with yeast growth curves to identify the optimal cultivation conditions.
References:
1. Wang, Z., Su, C., Zhang, Y., Shangguan, S., Wang, R., & Su, J. (2024). Key enzymes involved in the utilization of fatty acids by Saccharomyces cerevisiae: a review. Frontiers in Microbiology, 14, 1294182. https://doi.org/10.3389/fmicb.2023.1294182
2. Qiu, F., Kang, N., Tan, J., Yan, S., Lin, L., Cai, L., Goodman, J. M., & Gao, Q. (2023). Fatty acyl coenzyme A synthetase Fat1p regulates vacuolar structure and stationary-phase lipophagy in Saccharomyces cerevisiae. Microbiology Spectrum, 11(1), e0462522. https://doi.org/10.1128/spectrum.04625-22
3. Tang, H., Wang, S., Wang, J., Song, M., Xu, M., Zhang, M., Shen, Y., Hou, J., & Bao, X. (2016). N-hypermannose glycosylation disruption enhances recombinant protein production by regulating secretory pathway and cell wall integrity in Saccharomyces cerevisiae. Scientific Reports, 6, 25654. https://doi.org/10.1038/srep25654
4. Faergeman, N. J., Black, P. N., Zhao, X. D., Knudsen, J., & DiRusso, C. C. (2001). The acyl-CoA synthetases encoded within FAA1 and FAA4 in Saccharomyces cerevisiae function as components of the fatty acid transport system linking import, activation, and intracellular utilization. The Journal of Biological Chemistry, 276(40), 37051-37059. https://doi.org/10.1074/jbc.M100884200
5. Li, Z., Chen, Y., Liu, D., Zhao, N., Cheng, H., Ren, H., Guo, T., Niu, H., Zhuang, W., Wu, J., & Ying, H. (2015). Involvement of glycolysis/gluconeogenesis and signaling regulatory pathways in Saccharomyces cerevisiae biofilms during fermentation. Frontiers in Microbiology, 6, 139. https://doi.org/10.3389/fmicb.2015.00139
6. Mukherjee, M., Blair, R. H., & Wang, Z. Q. (2022). Machine-learning guided elucidation of contribution of individual steps in the mevalonate pathway and construction of a yeast platform strain for terpenoid production. Metabolic Engineering, 74, 139-149. https://doi.org/10.1016/j.ymben.2022.10.004

