Dry Lab Design | XJTLU-AI-China

What is Plaszyme — Framework at a Glance

After reading the following three parts, you will have covered the entire Model section of this project:

Database

More about PlaszymeDB

Predictor

More about the Plaszyme AI Model

WebApp

More about the WebApp

Plaszyme is an integrated AI–biochemistry framework that predicts and interprets enzyme–plastic interactions through a unified modeling pipeline.

It combines curated biochemical data (PlaszymeDB), deep representation learning (ESM + GNN/GVP), and a cloud-based prediction platform accessible to both computational and experimental researchers.

Our framework progresses through four iterative stages — from statistical baselines to structure-aware interaction models — ultimately forming a dual-tower architecture that jointly encodes enzyme structure and polymer chemistry.

This design enables not only accurate prediction but also mechanistic understanding of how enzyme folds and substrate features determine degradability.
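The dual-tower idea can be illustrated with a minimal sketch: each tower maps its own input into a shared embedding space, and the match between an enzyme and a plastic is scored by the similarity of the two embeddings. Everything below is an assumption made for illustration. The class name `DualTower`, the random untrained projections, the embedding size, the feature vectors, and the plastic names are all hypothetical; in the framework described above the enzyme tower would be built on ESM and GNN/GVP encoders rather than a single linear map.

```python
import math
import random


def linear(weights, features):
    """Apply a weight matrix (list of rows) to a feature vector."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


class DualTower:
    """Two independent encoders that project into one shared embedding space."""

    def __init__(self, enzyme_dim, polymer_dim, embed_dim=8, seed=0):
        rng = random.Random(seed)
        # Untrained random projections stand in for the real encoders (assumption).
        self.enzyme_w = [[rng.gauss(0, 1) for _ in range(enzyme_dim)]
                         for _ in range(embed_dim)]
        self.polymer_w = [[rng.gauss(0, 1) for _ in range(polymer_dim)]
                          for _ in range(embed_dim)]

    def embed_enzyme(self, features):
        return l2_normalize(linear(self.enzyme_w, features))

    def embed_polymer(self, features):
        return l2_normalize(linear(self.polymer_w, features))

    def score(self, enzyme_features, polymer_features):
        """Cosine similarity between the two tower outputs."""
        e = self.embed_enzyme(enzyme_features)
        p = self.embed_polymer(polymer_features)
        return sum(a * b for a, b in zip(e, p))

    def rank(self, enzyme_features, polymers):
        """Return (polymer name, score) pairs sorted from best to worst match."""
        scored = [(name, self.score(enzyme_features, feats))
                  for name, feats in polymers.items()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder inputs: made-up feature vectors, not real descriptors.
model = DualTower(enzyme_dim=4, polymer_dim=3)
enzyme = [0.2, -1.0, 0.5, 0.3]
polymers = {"PET": [1.0, 0.0, 0.5], "PLA": [0.1, 0.9, 0.2], "PCL": [0.3, 0.4, 0.8]}
ranking = model.rank(enzyme, polymers)
```

Because the two towers only meet at the similarity step, polymer embeddings can be computed once and reused for every enzyme query, which is one common reason for choosing this layout.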

At the system level, Plaszyme integrates:

Data reliability: A standardized benchmark dataset with biological validation;

Model interpretability: Structure-aware backbones highlighting active residues;

Practical accessibility: A web platform supporting sequence upload, structure prediction (ColabFold), and interactive result visualization (Mol* API).

Visit PlaszymeDB

Access our comprehensive plastic-degrading enzyme database

Try PlaszymeAlpha / PlaszymeX

Test our AI-powered enzyme prediction models

Key Results — Performance at a Glance

To show how each model performs across critical dimensions, we summarize six representative metrics in five categories:

Category	Metric	Meaning
Accuracy	Micro F1	Overall predictive performance, pooled across all samples
Robustness	Macro F1	Stability under class imbalance
Generalization	Hard-tier F1	Performance on low-similarity (hard) enzymes
Ranking Power	Hit@1 / Hit@3	Whether a correct plastic appears among the Top-K predictions (K = 1 or 3)
Coverage	Recall@3	Fraction of true plastics recovered among the Top-3 predictions
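To make the metrics in the table concrete, here is a minimal sketch of how they could be computed for a multi-label setting in which one enzyme may degrade several plastics. The definitions are the standard ones and are an assumption about this project's evaluation: the section does not specify thresholds, tie handling, or how predicted label sets are derived, so the function names, the toy data, and the choice of using the top-1 prediction as the predicted set are all hypothetical. Hard-tier F1 would be the same F1 computation restricted to the low-similarity subset of enzymes.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); taken as 0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(true_sets, pred_sets, labels):
    """Micro F1 pools counts over all plastics; Macro F1 averages per-plastic F1."""
    counts = {label: [0, 0, 0] for label in labels}  # [tp, fp, fn] per plastic
    for truth, pred in zip(true_sets, pred_sets):
        for label in labels:
            if label in pred and label in truth:
                counts[label][0] += 1
            elif label in pred:
                counts[label][1] += 1
            elif label in truth:
                counts[label][2] += 1
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    micro = f1_from_counts(tp, fp, fn)
    macro = sum(f1_from_counts(*c) for c in counts.values()) / len(labels)
    return micro, macro


def hit_at_k(true_sets, rankings, k):
    """Share of enzymes with at least one true plastic in the top-k ranking."""
    hits = sum(1 for truth, ranked in zip(true_sets, rankings)
               if truth & set(ranked[:k]))
    return hits / len(true_sets)


def recall_at_k(true_sets, rankings, k):
    """Mean fraction of each enzyme's true plastics found in its top-k ranking."""
    per_enzyme = [len(truth & set(ranked[:k])) / len(truth)
                  for truth, ranked in zip(true_sets, rankings)]
    return sum(per_enzyme) / len(per_enzyme)


# Toy data (illustrative only): three enzymes, four candidate plastics.
labels = ["PET", "PLA", "PCL", "PU"]
true_sets = [{"PET"}, {"PLA", "PCL"}, {"PU"}]
rankings = [["PET", "PLA", "PU"], ["PCL", "PET", "PU"], ["PET", "PU", "PLA"]]
pred_sets = [{ranked[0]} for ranked in rankings]  # assumption: predict the top-1 plastic

micro_f1, macro_f1 = micro_macro_f1(true_sets, pred_sets, labels)
hit1 = hit_at_k(true_sets, rankings, 1)
hit3 = hit_at_k(true_sets, rankings, 3)
recall3 = recall_at_k(true_sets, rankings, 3)
```

In this toy example Hit@3 is higher than Recall@3 because the second enzyme has two true plastics but only one of them reaches the top three, which is the distinction between the Ranking Power and Coverage rows.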

Model Highlights

Baseline-ML demonstrates strong ranking precision (Hit@3 ≈ 0.73), reflecting the effectiveness of classical feature engineering and decision-based learning, though its robustness to class imbalance and its generalization remain limited.

PlaszymeAlpha (sequence-based) achieves consistent and interpretable gains, showing that protein language models (ESM) capture biochemical regularities even without 3D information.

PlaszymeX (structure-aware) further enhances generalization on hard enzyme tiers (Hard F1 ≈ 0.65) and maintains stable overall accuracy, though its ranking precision does not surpass the simpler ML models. This result highlights the trade-off between structural expressiveness and ranking bias.

Takeaway: architectures are complementary — ML excels in ranking precision and coverage, while structure-aware models add robustness and mechanistic insight.

Database