
Software

mATChmaker

Nonribosomal peptide synthetases (NRPSs) and their modular recombination hold great potential for discovering novel antimicrobial compounds. However, this approach is fundamentally limited by one key challenge: predicting which module combinations will yield a functional enzyme complex.

We observed the same limitation in our project: while we successfully generated numerous hybrid NRPSs, many failed to produce detectable peptides. Specifically, during our first attempt at derivatization by Golden Gate cloning, only 27% of combinations were catalytically active. Knowing beforehand which units work together and which do not would save us and other researchers considerable time and resources. This is why we developed mATChmaker, a software tool to guide NRPS hybrid design.

mATChmaker explores two strategies to facilitate hybrid NRPS design. Since closely related clusters tend to recombine more successfully, the tool helps assess how similar two NRPS clusters are. Phylogenetic analysis of our library has already shown that using related clusters greatly increases engineering success rates.
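As a rough illustration of what similarity between two clusters can mean at the sequence level, the sketch below computes percent identity between two pre-aligned sequences (for example, two rows taken from a multiple sequence alignment). The function name and the choice to skip gap columns are assumptions made for this example; it is not the metric that mATChmaker itself reports.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap.

    Illustrative helper only (hypothetical name, not part of mATChmaker).
    Both inputs must come from the same alignment and have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        # Columns with a gap in either sequence are ignored.
        if residue_a == gap or residue_b == gap:
            continue
        compared += 1
        if residue_a == residue_b:
            matches += 1

    # Two sequences without any shared columns are treated as 0% identical.
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

Ignoring gap columns is only one of several common conventions; counting gaps as mismatches would give lower values for sequences of very different length.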

Furthermore, as the condensation reaction is the heart of peptide synthesis, the formation of the condensation complex seems to play a critical role in hybrid NRPS functionality. Our structural pipeline greatly facilitates the 3D structure prediction of condensation complexes including their substrates, enabling experimental design and model-building for a more mechanistic understanding of NRPS unit incompatibility.

Fig. 1: 3D structure of a condensation complex predicted by our pipeline.

While integrating bioinformatic tools into our pipelines, we encountered many problems with dependency conflicts and requirements for specific operating systems. Some programs took our drylab team members up to a day to install correctly, which makes them essentially inaccessible to people without coding experience. To make our drylab work easily available to future iGEM teams and researchers, we developed our software tool mATChmaker.

At its core, mATChmaker is built on a Docker-based architecture, so every user, regardless of operating system, runs the same environment without having to manage dependencies. The Docker image, based on Ubuntu 20.04, comes pre-configured with all required libraries (Python, Biopython, RDKit, Clustal Omega, OpenBabel, and others) as well as multiple conda environments for specialized tasks such as PARAS and GetContacts. This setup offers several advantages that help our software tool reach a wide and diverse user base:

First, mATChmaker ensures cross-platform compatibility. Because the entire system runs inside a Docker container, it behaves identically on Windows, macOS, and Linux. All dependencies and libraries are pre-installed, ensuring that analyses yield the same results regardless of the user’s operating system.

This also means that the user does not need to manually install the bioinformatic tools included in the pipelines: installing Docker and mATChmaker already covers everything. Step-by-step guides for both installations are available on our GitLab. After the first installation, starting the program takes only a minute.
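A quick way to confirm that an environment really ships the expected command-line tools is to look them up on the PATH. The sketch below does this with the standard library only. The executable names (`clustalo` for Clustal Omega, `obabel` for OpenBabel, `conda`, `python3`) are assumptions based on how these tools are commonly installed, and the function name is hypothetical; the actual container may expose different names.

```python
import shutil
from typing import Iterable, List

# Assumed executable names; adjust to whatever the container actually provides.
EXPECTED_TOOLS = ("python3", "clustalo", "obabel", "conda")


def missing_tools(tools: Iterable[str] = EXPECTED_TOOLS) -> List[str]:
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Called without arguments inside the container, an empty list would indicate that all assumed executables are available.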

To ensure a user-friendly experience, a simple and intuitive menu-driven command-line interface guides the user through the available options, such as the T-TE phylogenetic analysis and the condensation complex structure prediction. The interface also offers the option to use PARAS (an A domain specificity predictor) and GetContacts (a tool to extract interactions from protein structures) separately from the structure prediction pipeline. This design makes the tool accessible even to users with no prior Python experience.
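To make the idea of a menu-driven interface concrete, here is a minimal sketch of a menu that maps a typed number to a handler function. The menu labels follow the options named above, but all function names and return strings are placeholders invented for this example and do not reflect mATChmaker's actual code.

```python
from typing import Callable, Dict, Tuple


# Placeholder handlers (hypothetical); a real tool would start a pipeline here.
def run_t_te_phylogenetics() -> str:
    return "T-TE phylogenetics selected"


def run_structure_prediction() -> str:
    return "structure prediction selected"


def run_paras() -> str:
    return "PARAS selected"


def run_getcontacts() -> str:
    return "GetContacts selected"


# Menu key -> (label shown to the user, handler to call).
MENU: Dict[str, Tuple[str, Callable[[], str]]] = {
    "1": ("T-TE phylogenetic analysis", run_t_te_phylogenetics),
    "2": ("Condensation complex structure prediction", run_structure_prediction),
    "3": ("PARAS only", run_paras),
    "4": ("GetContacts only", run_getcontacts),
}


def render_menu() -> str:
    """Build the text that lists all options, one per line."""
    return "\n".join(f"[{key}] {label}" for key, (label, _) in MENU.items())


def dispatch(choice: str) -> str:
    """Run the handler for `choice`, or explain which inputs are valid."""
    entry = MENU.get(choice.strip())
    if entry is None:
        return "Unknown option, please choose one of: " + ", ".join(MENU)
    _, handler = entry
    return handler()
```

An interactive loop would print `render_menu()`, read a line with `input()`, and pass it to `dispatch()`. Keeping the options in one dictionary means a new entry only needs one extra line.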

Since mATChmaker unifies both pipelines in a single tool, it eliminates the need for separate installations or configurations. More advanced users can use a Jupyter Notebook interface to combine different parts of our pipelines and integrate their own functions by adding new Python scripts under workspace/utils/ and calling them in main.py. More information is given in our developer’s guide, which is tailored to experienced bioinformaticians who want to build on our tool. The modular design of our software allows rapid integration of additional algorithms, visualization tools, or data-processing utilities.
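The extension mechanism described above could look roughly like the following. The file name `my_filter.py`, the function `filter_by_min_length`, and the import line shown in the comment are hypothetical; the developer's guide is the authoritative source for how scripts under workspace/utils/ are actually wired into main.py.

```python
# Hypothetical contents of workspace/utils/my_filter.py
from typing import Dict


def filter_by_min_length(sequences: Dict[str, str], min_length: int) -> Dict[str, str]:
    """Keep only the sequences that have at least `min_length` residues.

    `sequences` maps a record name to its sequence string.
    """
    return {
        name: sequence
        for name, sequence in sequences.items()
        if len(sequence) >= min_length
    }


# In main.py (or a notebook cell) the helper might then be used like this,
# assuming the utils folder is importable as a package:
#     from utils.my_filter import filter_by_min_length
#     kept = filter_by_min_length(all_sequences, min_length=400)
```

Writing helpers as small, pure functions like this one makes them easy to try out in the Jupyter Notebook before calling them from the main script.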

Both pipelines require no input other than GenBank files annotated by antiSMASH, which represents the state of the art in secondary metabolite research. All results are automatically stored in structured directories under workspace/results/ and include log files for reproducibility and traceability. All directories that are part of our software tool are accessible both from within the Docker container and from the regular file system of the user’s computer.
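One common way to organize per-run output with a log file is sketched below. The directory layout (one subfolder per pipeline and timestamp), the file name `run.log`, and both function names are assumptions chosen for illustration; the actual structure under workspace/results/ may differ.

```python
import logging
from datetime import datetime
from pathlib import Path


def prepare_run_directory(results_root: Path, pipeline: str) -> Path:
    """Create and return results_root/<pipeline>/<timestamp>/ (assumed layout)."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    run_dir = results_root / pipeline / stamp
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


def open_run_logger(run_dir: Path) -> logging.Logger:
    """Return a logger that writes timestamped messages to run_dir/run.log."""
    logger = logging.getLogger(f"sketch:{run_dir}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages out of the root logger
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(run_dir / "run.log", encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A run would call `prepare_run_directory(Path("workspace/results"), "structure_prediction")`, open the logger, and record inputs and parameters with `logger.info(...)` so that the run can be retraced later.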

The tool is designed to handle high-throughput analysis efficiently. For both T-TE analysis and structure prediction, users can input entire folders of GenBank files, so that multiple sequences or complexes are analyzed automatically in a single run. This reduces manual effort and accelerates data processing for large projects.
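Folder-based batch processing of this kind can be sketched with the standard library alone. In the example below, the accepted file extensions, the function names, and the `analyse` callback are all assumptions; the callback stands in for whichever pipeline step is applied to each GenBank file.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Assumed set of GenBank file extensions.
GENBANK_SUFFIXES = {".gb", ".gbk", ".genbank"}


def find_genbank_files(folder: Path) -> List[Path]:
    """Return all GenBank files directly inside `folder`, sorted by name."""
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in GENBANK_SUFFIXES
    )


def run_batch(folder: Path, analyse: Callable[[Path], None]) -> Dict[str, str]:
    """Apply `analyse` to every GenBank file and record the outcome per file.

    A failure in one file is recorded and does not stop the remaining files.
    """
    outcomes: Dict[str, str] = {}
    for path in find_genbank_files(folder):
        try:
            analyse(path)
            outcomes[path.name] = "ok"
        except Exception as error:  # broad on purpose: keep the batch running
            outcomes[path.name] = f"failed: {error}"
    return outcomes
```

Recording failures per file instead of aborting is a deliberate choice for large batches: one malformed input should not cost the user the results of all the others.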

You can find the tool in the GitLab Repository. Our documentation contains all information needed for installing and using our software tool. To see mATChmaker in action, you can watch the video below.




Please click on one of the buttons below to learn more about our phylogenetic approach for donor module selection or our structural prediction pipeline!

References

[1] Baunach, M., Chowdhury, S., Stallforth, P., & Dittmann, E. (2021). The Landscape of Recombination Events That Create Nonribosomal Peptide Diversity. Molecular Biology and Evolution, 38(5), 2116–2130. https://doi.org/10.1093/molbev/msab015

[2] Blin, K. et al. (2025). antiSMASH 8.0: extended gene cluster detection capabilities and analyses of chemistry, enzymology, and regulation. Nucleic Acids Research, 53(W1), W32-W38. https://doi.org/10.1093/nar/gkaf334

[3] Sievers, F. et al. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology, 7:539. https://doi.org/10.1038/msb.2011.75

[4] Terlouw, B. et al. (2025). PARAS: high-accuracy machine-learning of substrate specificities in nonribosomal peptide synthetases. bioRxiv. https://doi.org/10.1101/2025.01.08.631717

[5] Chai Discovery team (2024). Chai-1: Decoding the molecular interactions of life. bioRxiv https://doi.org/10.1101/2024.10.10.615955

[6] Heberlig, G. W., La Clair, J. J. & Burkart, M. D. (2024). Crosslinking intermodular condensation in non-ribosomal peptide biosynthesis. Nature. 638, 261–269. https://doi.org/10.1038/s41586-024-08306-y

[7] Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, 28(1), 31–36. https://doi.org/10.1021/ci00057a005

[8] Rego, N. & Koes, D. (2015). 3Dmol.js: molecular visualization with WebGL. Bioinformatics, 31(8), 1322–1324. https://doi.org/10.1093/bioinformatics/btu829

[9] Pistofidis, A., Ma, P., Li, Z., Munro, K., Houk, K. N. & Schmeing, T. M. (2025). Structures and mechanism of condensation in non-ribosomal peptide synthesis. Nature 638, 270–278. https://doi.org/10.1038/s41586-024-08417-6
