Project Description | XJTLU-AI-China

Background

Plastic waste is damaging ecosystems around the world.
Credit: Mark Rightmire / MediaNews Group / Orange County Register / Getty.

Plastics are among the most emblematic materials of modern civilization—lightweight, durable, and versatile in application.

However, their persistence has created a global environmental challenge: traditional plastics can remain in nature for centuries with little to no degradation.

In recent years, enzymatic degradation has emerged as a sustainable and promising approach to address this issue, becoming an active research focus across microbiology, chemistry, and synthetic biology.

Surprisingly, although synthetic plastics have been mass-produced for less than a century, seemingly too short a time for evolution to have produced enzymes specifically adapted to them, microorganisms capable of degrading plastics have already been found in soil, marine, compost, and landfill environments (Urbanek et al., 2020; Danso et al., 2019).

This phenomenon is a result of functional convergence.

Most known plastic-degrading enzymes originate from natural hydrolase families such as cutinases, lipases, and carboxylesterases, whose native substrates—like cutin or natural polyesters—share structural similarities with the chemical bonds found in synthetic plastics (Wei & Zimmermann, 2017).

Through long-term environmental exposure and subtle mutations, these enzymes gradually expanded their substrate spectrum, accidentally acquiring hydrolytic activity toward synthetic polymers (Austin et al., 2018; Kawai et al., 2020).

However, this evolutionary process is stochastic and fragmented, with degradation activities distributed across multiple enzyme families and lacking unified sequence patterns (Lu et al., 2022).

As a result, traditional sequence alignment and homology-based search methods often fail to identify such enzymes, implying that a vast number of potentially efficient plastic-degrading enzymes remain hidden in nature and metagenomic datasets.

Inspiration

Protein power. Cover of Nature, Volume 596, Issue 7873 (26 August 2021). The issue highlights the AlphaFold breakthrough, marking a major advance in accurate protein structure prediction.
Credit: © Nature (2021), Cover image by DeepMind.

In recent years, artificial intelligence has profoundly transformed biology — from decoding genomes to predicting protein structures through AlphaFold (Jumper et al., 2021).

Yet the next frontier of AI for Science (AI4Sci) raises a deeper question: can AI go beyond shape to predict what proteins actually do, that is, how they function and how they interact with their substrates (Sanchez-Lengeling et al., 2023)?

Plastic-degrading enzymes provide an ideal testing ground for this question (Lu et al., 2022; Tournier et al., 2020). Their catalytic functions did not evolve under selection for plastic degradation; they appear to be adaptive innovations arising by chance, subtle functional shifts that traditional bioinformatics often fails to capture.

If AI can learn the relationship between enzyme sequence, structure, and substrate specificity, it could not only help us understand existing enzymes, but also design new, more efficient catalysts, accelerating both enzyme discovery and the development of sustainable biotechnologies (Hie et al., 2022; Yang et al., 2024).

Our Goal

Our project begins with the question of how AI can learn to understand and predict enzyme function, seeking to bring artificial intelligence closer to the core challenges of life science — from prediction to understanding.

By harnessing the analytical power of deep learning, we aim to explore whether AI can truly capture how enzyme sequence and structure together determine catalytic activity.

Building on this foundation, our goal is to develop an interpretable intelligent framework — one that allows AI not only to generate predictions but to reveal the biological logic behind them, making it a tool that is usable, understandable, and trustworthy in scientific research.

In the long term, we aspire to move AI from assisting research to empowering science: enabling researchers from diverse backgrounds to interact intuitively with intelligent models, and turning artificial intelligence into a bridge between data and understanding, a shared language for discovery in the life sciences.

The Plaszyme Initiative

Building upon this vision, the XJTLU AI China Team sought to integrate artificial intelligence into biological research — combining deep learning with biochemical expertise to explore whether AI can truly understand how enzymes degrade plastics.

To achieve this, we developed Plaszyme, an integrated framework that bridges data organization, enzyme discovery, and predictive modeling.

At its foundation lies PlaszymeDB, a curated knowledge base of literature-reported plastic-degrading enzymes, substrates, and activities — serving as a reference for model training and biological validation.
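
To make the idea of a curated knowledge base concrete, the sketch below shows one way an entry linking an enzyme, its substrates, and its literature source could be represented and indexed. The schema, field names, identifiers, and placeholder entries are all hypothetical illustrations and do not describe the actual PlaszymeDB format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EnzymeRecord:
    """One literature-reported enzyme entry (hypothetical schema)."""
    enzyme_id: str      # illustrative identifier, not a real accession
    name: str
    sequence: str       # amino acid sequence in one-letter code
    substrates: tuple   # plastics the enzyme is reported to degrade
    citation: str       # literature source for the reported activity


def index_by_substrate(records):
    """Map each plastic type to the IDs of enzymes reported to degrade it."""
    index = defaultdict(list)
    for rec in records:
        for plastic in rec.substrates:
            index[plastic].append(rec.enzyme_id)
    return dict(index)


# Placeholder entries: sequences are short dummies, not real data.
records = [
    EnzymeRecord("demo-001", "example PET hydrolase", "MKTLLA", ("PET",), "placeholder citation"),
    EnzymeRecord("demo-002", "example cutinase", "MSAPLL", ("PET", "PCL"), "placeholder citation"),
]
by_substrate = index_by_substrate(records)
```

An index keyed by substrate supports the kind of lookup a training pipeline might need, such as collecting all enzymes reported for a given plastic.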

Complementing the main framework, PlaszymeHMM provides a lightweight Hidden Markov Model (HMM) tool for rapid metagenomic sequence discovery.
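
As a rough illustration of profile-based sequence scanning, the sketch below builds a position-specific log-odds profile from a toy alignment and slides it along a query sequence. This is a deliberate simplification that keeps only the match states of a profile HMM; a full profile HMM also models insertions and deletions. The function names, the uniform background frequency, and the toy alignment (loosely modeled on the G-x-S-x-G motif common in serine hydrolases) are assumptions for illustration, not the PlaszymeHMM implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # uniform background frequency (assumption)


def build_profile(alignment, pseudocount=1.0):
    """Return per-column log-odds scores from equal-length aligned sequences."""
    length = len(alignment[0])
    profile = []
    for col in range(length):
        # Pseudocounts keep unseen residues from getting a score of -infinity.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in alignment:
            counts[seq[col]] += 1.0
        total = sum(counts.values())
        profile.append(
            {aa: math.log((counts[aa] / total) / BACKGROUND) for aa in AMINO_ACIDS}
        )
    return profile


def best_window_score(profile, sequence):
    """Slide the profile along the sequence; return (best score, start index)."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        score = sum(profile[i][sequence[start + i]] for i in range(width))
        best = max(best, (score, start))
    return best


# Toy alignment and query, invented for this example.
toy_alignment = ["GYSQG", "GHSMG", "GWSLG"]
profile = build_profile(toy_alignment)
score, start = best_window_score(profile, "AAAGYSMGAAA")
```

A positive best score means the window resembles the alignment more than the background does. In practice a threshold would have to be calibrated on known positives and negatives.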

The core of the framework centers on two interconnected models, PlaszymeAlpha and PlaszymeX, which represent successive steps in AI-driven understanding of enzyme function:

PlaszymeAlpha is a sequence-based machine learning model built upon protein language models (ESM).

It extracts evolutionary and biochemical signals directly from amino acid sequences, enabling fast, large-scale predictions of potential plastic-degrading activity.

This model demonstrates how AI can learn the underlying logic of enzyme function directly from sequence data, making it ideal for metagenomic exploration and high-throughput enzyme screening.
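
The general pattern of sequence-based prediction can be sketched as three steps: embed each residue, pool the per-residue vectors into one fixed-length vector, and apply a classifier head. In the minimal sketch below, a toy two-feature residue encoding stands in for protein language model embeddings, and the weights are arbitrary placeholders rather than trained values. None of the names or numbers are taken from PlaszymeAlpha itself.

```python
import math

HYDROPHOBIC = set("AVILMFWY")
CHARGED = set("DEKR")


def embed_residues(sequence):
    """Toy per-residue embedding; a real pipeline would call a protein language model here."""
    return [[float(aa in HYDROPHOBIC), float(aa in CHARGED)] for aa in sequence]


def mean_pool(residue_vectors):
    """Average per-residue vectors into one fixed-length sequence embedding."""
    n = len(residue_vectors)
    dim = len(residue_vectors[0])
    return [sum(vec[d] for vec in residue_vectors) / n for d in range(dim)]


def predict_probability(embedding, weights, bias):
    """Logistic-regression head: sigmoid of a weighted sum of embedding features."""
    logit = sum(w * x for w, x in zip(weights, embedding)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Placeholder sequence and weights, chosen only to exercise the code path.
pooled = mean_pool(embed_residues("AVKD"))
prob = predict_probability(pooled, weights=[2.0, -1.0], bias=0.0)
```

Because pooling yields a vector of the same length for any sequence, the same classifier head can be applied across very large sequence collections, which is what makes this pattern suitable for high-throughput screening.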

PlaszymeX builds upon the same ESM-based embeddings but extends them into a structure-aware dual-tower architecture.

On the enzyme side, it integrates 3D geometric encoders (GNN/GVP) to model residue-level spatial relationships; on the plastic side, it encodes physicochemical and topological polymer descriptors.

Through a dedicated interaction layer — combining attention and bilinear mappings — the model captures how enzyme structure and substrate chemistry jointly determine catalytic specificity.

This design not only refines prediction accuracy but also enhances mechanistic interpretability, allowing visualization of active sites and identification of key structure–function features.
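
One possible reading of an interaction layer that combines attention with a bilinear mapping is sketched below. The polymer vector is projected into the enzyme embedding space, each residue is scored against that projection, a softmax turns the scores into attention weights, and the attention-pooled enzyme vector is combined with the projection through a bilinear form. The shapes, names, and numbers are assumptions for illustration; the actual PlaszymeX layer may differ.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def interaction_score(residue_vectors, polymer_vector, bilinear):
    """Return (probability, per-residue attention weights)."""
    # W p: polymer descriptors mapped into the enzyme embedding space.
    projected = matvec(bilinear, polymer_vector)
    # Attention: residues that align with the projected polymer get more weight.
    attention = softmax([dot(r, projected) for r in residue_vectors])
    dim = len(residue_vectors[0])
    pooled = [
        sum(a * r[d] for a, r in zip(attention, residue_vectors)) for d in range(dim)
    ]
    # Bilinear form e^T W p, squashed to a probability.
    logit = dot(pooled, projected)
    return 1.0 / (1.0 + math.exp(-logit)), attention


# Placeholder inputs: three residues (dim 2), one polymer vector (dim 3).
residues = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
polymer = [1.0, 0.0, 0.0]
W = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
prob, weights = interaction_score(residues, polymer, W)
```

The returned attention weights are what make such a layer inspectable: residues with high weight are candidates for highlighting on a structure, subject to the usual caution that attention weights are not proof of mechanism.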

To ensure accessibility beyond computational domains, we launched the Plaszyme Web Platform, a cloud-based system automating the full workflow — from sequence upload to structure prediction (via ColabFold) and interactive visualization (Mol* API).

Seamlessly linked to PlaszymeDB, it allows researchers to run predictions and cross-reference experimental data within a single, user-friendly interface.

Through close collaboration with experts in microbiology, environmental science and synthetic biology, we continuously refine the Plaszyme data framework and user experience to ensure its reliability, interpretability, and practicality in real scientific research.

Future Outlook

In the next phase, we aim to expand Plaszyme’s role in predicting and interpreting plastic degradation, transitioning from enzyme discovery to deciphering the molecular logic of polymer breakdown.

This mechanistic understanding will guide the rational design of sustainable biocatalysts and inspire new strategies to address persistent environmental challenges.

At the same time, Plaszyme’s modular and extensible architecture provides a foundation for broader applications — integrating new biological tasks such as enzyme classification, activity modeling, and structure–function analysis.

Through iterative optimization and multi-task learning, Plaszyme could gradually evolve into a shared, generalizable framework for understanding protein function.

Beyond enzyme discovery, Plaszyme also aligns with a core principle of synthetic biology: turning understanding into design.

By linking AI-driven insights with biological engineering, it paves the way for rational design, evolution and optimization of new enzymes and metabolic pathways, accelerating the development of sustainable and programmable biotechnologies.

Looking ahead, we envision Plaszyme as part of a broader transformation — where artificial intelligence and synthetic biology converge to create a new paradigm of interpretable, designable, and sustainable life science.

By maintaining openness, scalability, and strong biological foundations, we aim to empower interdisciplinary researchers to explore life’s mechanisms with greater precision, deeper understanding, and lower barriers.

Ultimately, Plaszyme is more than a model — it is a vision: an intelligent foundation for the next era of AI-driven synthetic biology and sustainable biotechnology.

References

Austin, H. P., Allen, M. D., Donohoe, B. S., Rorrer, N. A., Kearns, F. L., Silveira, R. L., … Beckham, G. T. (2018). Characterization and engineering of a plastic-degrading aromatic polyesterase. Proceedings of the National Academy of Sciences, 115(19), E4350–E4357. https://doi.org/10.1073/pnas.1718804115

Danso, D., Chow, J., & Streit, W. R. (2019). Plastics: Microbial degradation, environmental and biotechnological perspectives. Applied and Environmental Microbiology, 85(19), e01095-19. https://doi.org/10.1128/aem.01095-19

Han, X., Liu, W., Huang, J. W., Ma, J., Zheng, Y., Ko, T. P., … Guo, R. T. (2017). Structural insight into catalytic mechanism of PET hydrolase. Nature Communications, 8, 2106. https://doi.org/10.1038/s41467-017-02255-z

Hie, B., Zhong, E. D., Berger, B., & Bryson, B. D. (2022). Learning the language of viral evolution and escape. Science, 377(6604), 480–486. https://doi.org/10.1126/science.abd7331

Joo, S., Cho, I. J., Seo, H., Son, H. F., Sagong, H. Y., Shin, T. J., … Kim, K. J. (2018). Structural insight into molecular mechanism of poly(ethylene terephthalate) degradation. Nature Communications, 9, 382. https://doi.org/10.1038/s41467-018-02881-1

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2

Kawai, F., Kawabata, T., & Oda, M. (2020). Current knowledge on enzymatic PET degradation and its possible application to waste stream management and other fields. Applied Microbiology and Biotechnology, 104, 493–504. https://doi.org/10.1007/s00253-019-09717-y

Lu, H., Diaz, D. J., Czarnecki, N. J., Zhu, C., Kim, W., Shroff, R., … Alper, H. S. (2022). Machine learning-aided engineering of hydrolases for PET depolymerization. Nature, 604(7907), 662–667. https://doi.org/10.1038/s41586-022-04599-z

Sanchez-Lengeling, B., Wei, J. N., Lee, B. K., Gerkin, R. C., Aspuru-Guzik, A., & Wiltschko, A. B. (2023). A structure-based generative model of olfactory molecules. Science, 379(6639), 1103–1110. https://doi.org/10.1126/science.ade4401

Tournier, V., Topham, C. M., Gilles, A., David, B., Folgoas, C., Moya-Leclair, E., … Marty, A. (2020). An engineered PET depolymerase to break down and recycle plastic bottles. Nature, 580(7802), 216–219. https://doi.org/10.1038/s41586-020-2149-4

Urbanek, A. K., Rymowicz, W., & Mirończuk, A. M. (2020). Degradation of plastics and plastic-degrading bacteria in cold marine habitats. Applied Microbiology and Biotechnology, 104, 493–504. https://doi.org/10.1007/s00253-018-9195-y

Wei, R., & Zimmermann, W. (2017). Microbial enzymes for the recycling of recalcitrant petroleum-based plastics: How far are we? Microbial Biotechnology, 10(6), 1308–1322. https://doi.org/10.1111/1751-7915.12710

Yang, K. K., Wu, Z., Bedbrook, C. N., & Arnold, F. H. (2024). Machine learning-guided directed evolution for protein engineering. Nature Machine Intelligence, 6, 13–28. https://doi.org/10.1038/s41592-019-0496-6

Yoshida, S., Hiraga, K., Takehana, T., Taniguchi, I., Yamaji, H., Maeda, Y., … Oda, K. (2016). A bacterium that degrades and assimilates poly(ethylene terephthalate). Science, 351(6278), 1196–1199. https://doi.org/10.1126/science.aad6359