What is Plaszyme — Framework at a Glance

This overview condenses the entire MODEL section of this project:
Plaszyme is an integrated AI–biochemistry framework that predicts and interprets enzyme–plastic interactions through a unified modeling pipeline.
It combines curated biochemical data (PlaszymeDB), deep representation learning (ESM + GNN/GVP), and a cloud-based prediction platform accessible to both computational and experimental researchers.
Our framework progresses through four iterative stages — from statistical baselines to structure-aware interaction models — ultimately forming a dual-tower architecture that jointly encodes enzyme structure and polymer chemistry.
This design enables not only accurate prediction but also mechanistic understanding of how enzyme folds and substrate features determine degradability.
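
The dual-tower idea can be illustrated with a minimal sketch. It assumes each tower maps its input into a shared embedding space and that enzyme–plastic compatibility is scored by cosine similarity; both are illustrative assumptions, not a description of Plaszyme's actual layers or scoring head. The function names, weight matrices, and feature values are hypothetical toy stand-ins for the real encoders (ESM + GNN/GVP for the enzyme, a polymer-chemistry encoder for the plastic).

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def tower(features, weights):
    """Hypothetical tower: a single linear projection into the shared space.

    A real tower would be a deep encoder; this is only a placeholder.
    """
    projected = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return l2_normalize(projected)


def rank_plastics(enzyme_embedding, plastic_embeddings):
    """Rank plastics by cosine similarity to the enzyme embedding.

    Embeddings are unit-length, so the dot product equals the cosine.
    """
    scores = {
        name: sum(a * b for a, b in zip(enzyme_embedding, emb))
        for name, emb in plastic_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy inputs (made-up numbers, not Plaszyme data).
ENZYME_WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # 3 features -> 2 dims
PLASTIC_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]            # 2 features -> 2 dims

enzyme = tower([1.0, 0.0, 1.0], ENZYME_WEIGHTS)
plastics = {
    "PET": tower([1.0, 1.0], PLASTIC_WEIGHTS),
    "PLA": tower([1.0, 0.0], PLASTIC_WEIGHTS),
    "PE": tower([1.0, -1.0], PLASTIC_WEIGHTS),
}

ranking = rank_plastics(enzyme, plastics)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the two towers only meet at the similarity step, plastic embeddings can be computed once and reused for every new enzyme, which is one common reason to choose this layout.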
At the system level, Plaszyme integrates:
- Data reliability: A standardized benchmark dataset with biological validation;
- Model interpretability: Structure-aware backbones highlighting active residues;
- Practical accessibility: A web platform supporting sequence upload, structure prediction (ColabFold), and interactive result visualization (Mol* API).
Key Results — Performance at a Glance
To visualize how each model performs across critical dimensions, we summarized six representative metrics:
Category | Metric | Meaning |
---|---|---|
Accuracy | Micro F1 | Overall predictive performance, pooled across all samples |
Robustness | Macro F1 | Stability under class imbalance |
Generalization | Hard-tier F1 | Performance on low-similarity (hard) enzymes |
Ranking Power | Hit@1 / Hit@3 | Correct plastic appearing in Top-K predictions |
Coverage | Recall@3 | Ability to recover true positives among Top-3 |
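
The metrics in the table can be computed as in the sketch below. It assumes a multi-label setup in which each enzyme has a set of true plastics and the model returns a ranked list of candidate plastics; the helper names and the toy rankings are hypothetical and are not taken from the Plaszyme benchmark. Hard-tier F1 would be the same F1 computation restricted to the low-similarity subset of enzymes.

```python
def hit_at_k(ranked, truth, k):
    """1.0 if at least one true plastic appears in the top-k, else 0.0."""
    return 1.0 if any(label in truth for label in ranked[:k]) else 0.0


def recall_at_k(ranked, truth, k):
    """Fraction of the true plastics recovered within the top-k."""
    if not truth:
        return 0.0
    return len(set(ranked[:k]) & truth) / len(truth)


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(predicted, truths, labels):
    """Micro F1 pools counts over all classes; Macro F1 averages per-class F1."""
    counts = {label: [0, 0, 0] for label in labels}  # [tp, fp, fn]
    for pred, truth in zip(predicted, truths):
        for label in labels:
            if label in pred and label in truth:
                counts[label][0] += 1
            elif label in pred:
                counts[label][1] += 1
            elif label in truth:
                counts[label][2] += 1
    micro = f1(*(sum(c[i] for c in counts.values()) for i in range(3)))
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    return micro, macro


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Toy data (made-up, for illustration only).
LABELS = ["PET", "PLA", "PE", "PU"]
rankings = [
    ["PET", "PLA", "PE", "PU"],
    ["PLA", "PE", "PU", "PET"],
    ["PE", "PU", "PET", "PLA"],
]
truths = [{"PET"}, {"PET", "PLA"}, {"PLA"}]

hit1 = mean(hit_at_k(r, t, 1) for r, t in zip(rankings, truths))
hit3 = mean(hit_at_k(r, t, 3) for r, t in zip(rankings, truths))
recall3 = mean(recall_at_k(r, t, 3) for r, t in zip(rankings, truths))

# Assumption: the hard prediction for F1 is the single top-ranked plastic.
predicted = [{r[0]} for r in rankings]
micro, macro = micro_macro_f1(predicted, truths, LABELS)

print(f"Hit@1={hit1:.3f} Hit@3={hit3:.3f} Recall@3={recall3:.3f}")
print(f"Micro F1={micro:.3f} Macro F1={macro:.3f}")
```

In this toy case Macro F1 falls below Micro F1 because two classes are never predicted correctly, which is the class-imbalance sensitivity that the Robustness row is meant to capture.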
Model Highlights

- Baseline-ML demonstrates strong ranking precision (Hit@3 ≈ 0.73), reflecting the effectiveness of classical feature engineering and decision-based learning, though its class balance and generalization remain limited.
- PlaszymeAlpha (sequence-based) achieves consistent and interpretable gains, showing that protein language models (ESM) capture biochemical regularities even without 3D information.
- PlaszymeX (structure-aware) further enhances generalization on hard enzyme tiers (Hard F1 ≈ 0.65) and maintains stable overall accuracy, though its ranking precision does not surpass the simpler ML models, a result that highlights the trade-off between structural expressiveness and ranking bias.
- Takeaway: architectures are complementary — ML excels in ranking precision and coverage, while structure-aware models add robustness and mechanistic insight.