NRPS Engineering

The Molecular Basis

Key Points

Nonribosomal peptide synthetases (NRPS) are molecular assembly lines that synthesize peptides independently of the ribosome.
Their modular architecture allows them to be engineered to produce complex products with unique chemical structures and biological activities.
NRPS engineering remains challenging, as exchanging NRPS parts is technically demanding and novel combinations often fail to produce functional peptides.

Non-Ribosomal Peptide Synthesis

It is widely known that cells synthesize peptides using the ribosome. Less well known is a second biosynthetic strategy: non-ribosomal peptide synthetases (NRPSs). These large, multimodular enzyme complexes are found across all domains of life, but are especially common in bacteria and fungi^[1].

NRPSs catalyze the assembly of non-ribosomal peptides (NRPs) - bioactive secondary metabolites that often provide ecological advantages to their producers, such as improved competitiveness, defense, or nutrient acquisition. They include clinically important compounds such as antibiotics (e.g. vancomycin and daptomycin) and others with immunosuppressive (e.g. cyclosporin) or anticancer (e.g. romidepsin) activity^[2].

Non-Ribosomal versus Ribosomal Synthesis

One of the most remarkable features of NRPSs is their ability to incorporate more than 400 distinct monomers. These include the standard proteinogenic L-amino acids, D-amino acids, other non-proteinogenic amino acids, and fatty acids. In addition, after NRPS-mediated peptide assembly, further enzymatic tailoring can occur, allowing monomers to be linked by different types of bonds, such as disulfide or phenolic bonds. Some monomers can connect to as many as five other units, enabling the formation of cyclic or branched structures in NRPs. This extraordinary structural and functional diversity makes NRPSs a valuable resource for natural product discovery, drug development, and synthetic biology^[3]^[4].

Ribosomal peptide synthesis translates messenger RNA (mRNA) sequences into amino acid chains with high efficiency and precision, but it is restricted to the 20 canonical L-amino acids. NRPSs, in contrast, operate independently of mRNA templates. This template-independent mechanism enables the incorporation of a broad range of building blocks, including D-amino acids and non-proteinogenic amino acids, as well as chemical modifications such as cyclization, N-methylation, and acylation. Consequently, NRPSs generate structurally complex and chemically diverse peptides that often have potent bioactive properties far beyond the capabilities of ribosomally synthesized peptides (Fig. 1)^[3]^[5].

**Fig. 1:** Comparison of ribosomal and non-ribosomal monomers.
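As a rough illustration of the difference in accessible chemical space, consider only linear peptides of length $n$ and assume that every monomer can occupy every position. This is a simplification: cyclic and branched structures are ignored, and 400 is used as a lower bound for the NRPS monomer pool.

```latex
\[
N_{\mathrm{ribosomal}}(n) = 20^{n}, \qquad
N_{\mathrm{NRPS}}(n) \ge 400^{n} = 20^{n} \cdot 20^{n}
\]
\[
n = 10: \qquad 20^{10} \approx 1.0 \times 10^{13}, \qquad
400^{10} \approx 1.0 \times 10^{26}
\]
```

Under these assumptions, the non-ribosomal sequence space for a decapeptide is larger than the ribosomal one by a factor of about $10^{13}$.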

The Principle of NRPS

NRPSs operate like molecular assembly lines in which each module is a distinct station responsible for incorporating one amino acid into the growing peptide chain. Modules are further divided into domains, each of which performs a specific step in that incorporation. The smallest functional unit of an NRPS is a C-A-T module: a condensation (C) domain (ca. 50 kDa) for peptide bond formation, an adenylation (A) domain (ca. 60 kDa) for substrate activation, and a thiolation (T) domain (ca. 10 kDa) for substrate transport. Offloading is carried out by a thioesterase (TE) domain (ca. 30 kDa), which releases the final peptide product through hydrolysis or cyclization. Epimerization (E) domains (ca. 50 kDa) or dual E/C domains are also often part of NRPSs. As a result, NRPS enzymes have molecular masses ranging from about one hundred kilodaltons (kDa) to a few megadaltons (MDa).
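The approximate domain masses given above can be added up to estimate the size of a whole assembly line. The following is a minimal Python sketch, not a validated calculation: the three-module architecture is a hypothetical example, linker regions are ignored, and the function name is an illustrative choice.

```python
# Approximate domain masses in kDa, as stated in the text above.
DOMAIN_MASS_KDA = {"C": 50, "A": 60, "T": 10, "E": 50, "TE": 30}


def estimate_mass_kda(modules):
    """Sum the approximate domain masses of an NRPS.

    `modules` is a list of modules, each given as a list of domain names.
    Linkers between domains are ignored (simplifying assumption).
    """
    return sum(DOMAIN_MASS_KDA[domain] for module in modules for domain in module)


# Hypothetical NRPS: one C-A-T module, one C-A-T-E module, one C-A-T-TE module.
example_nrps = [
    ["C", "A", "T"],
    ["C", "A", "T", "E"],
    ["C", "A", "T", "TE"],
]

print(estimate_mass_kda(example_nrps))  # 120 + 170 + 150 = 440 kDa
```

Even this small hypothetical three-module enzyme lands in the range of several hundred kilodaltons, which is consistent with the mass range quoted above.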

Before initiating peptide assembly, NRPS enzymes require post-translational activation by phosphopantetheinyl transferases (PPTases). These enzymes convert inactive NRPSs into their active forms by attaching a 4′-phosphopantetheine prosthetic group to the thiolation (T) domain. This modification provides the flexible arm (Ppant arm) necessary for substrate binding and transfer during peptide elongation (Fig. 2, 9)^[5].

**Fig. 2:** Non-ribosomal peptide synthetase - modular organization is illustrated using different shapes for distinct domains.

Domains are categorized into distinct functional types^[5]:

Adenylation (A) domain: Selects and activates a specific amino acid (or non-standard building blocks) by adenylation, forming an aminoacyl-AMP intermediate; in schematic representations, the amino acid that is activated by the domain is typically indicated by its one-letter amino acid code inside the A domain circle (Fig. 3, 4, 9).

**Fig. 3:** Adenylation - amino acids that are activated by the A domains are indicated by their one-letter amino acid codes inside the A domain circles.

**Fig. 4:** Adenylation - aminoacyl-AMP intermediates inside the A domain circles.

Thiolation (T) domain (Peptidyl Carrier Protein, PCP): Takes up the activated aminoacyl-AMP substrate via a covalently attached 4′‑phosphopantetheinyl arm, forming a thioester bond and enabling substrate shuttling between enzymatic domains (Fig. 5, 9).

**Fig. 5:** Thiolation - T domains carrying amino acids by forming thioester bonds.

Condensation (C) domain: Catalyzes peptide bond formation between the growing peptide chain and the amino acid activated by the current module by interacting with the upstream and downstream T domains. In this reaction, the amine of the downstream amino acid attacks the thioester of the upstream amino acid (Fig. 6, 9).
Epimerization (E) domain: Converts L-amino acids to D-amino acids, altering the stereochemistry of the resulting peptide and contributing to structural diversity (Fig. 7, 9).
Dual Condensation/Epimerization (C/E) domain: A bifunctional domain that combines epimerization and condensation activity, found in some NRPSs. It epimerizes the incoming amino acid and catalyzes peptide bond formation in a single step (Fig. 7, 9).

**Fig. 6:** Condensation - formation of the growing peptide at the C domain by interaction with the upstream and downstream T domain.

**Fig. 7:** Epimerization of L-alanine to D-alanine by the dual E/C domain.

Thioesterase (TE) domain: Releases the completed peptide from the NRPS, either through hydrolysis (Fig. 8, 9) or macrocyclization, thereby defining the final structure of the product.

**Fig. 8:** Hydrolysis - the TE domain uses water to offload the final peptide in linear form.

**Fig. 9:** Non-ribosomal peptide synthesis - formation of a peptide chain by a non-ribosomal peptide synthetase consisting of A, T, C, E/C and TE domains.

In some cases, a starter C domain is located at the beginning of the NRPS, responsible for the incorporation of fatty acids or other acyl groups to modify the N-terminus of the peptide. Additional auxiliary domains - such as cyclization (Cy), methyltransferase (MT), reductase (R), formylation (F) or halogenase (Hal) domains - can be incorporated to introduce further chemical complexity and functional diversity^[5].
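The division of labor between the domains described above can be summarized as a toy simulation. This is a deliberately simplified Python sketch, not a mechanistic model: the `Module` class, its fields, and the `holo` flag (standing in for PPTase activation of the T domains) are illustrative assumptions, and epimerization is reduced to a per-module switch.

```python
from dataclasses import dataclass


@dataclass
class Module:
    substrate: str           # monomer selected and activated by the A domain
    epimerize: bool = False  # True if an E or dual E/C domain converts it to the D-form


def run_assembly_line(modules, release="hydrolysis", holo=True):
    """Walk through the modules in order and return the released product as a string."""
    if not holo:
        # Without the Ppant arm the T domains cannot carry substrates.
        raise RuntimeError("T domains are not activated by a PPTase")
    if release not in ("hydrolysis", "macrocyclization"):
        raise ValueError(f"unknown release mode: {release}")

    chain = []
    for module in modules:
        # A domain: select the substrate; T domain: carry it as a thioester.
        stereo = "D" if module.epimerize else "L"
        # C domain: append the carried residue to the growing chain.
        chain.append(f"{stereo}-{module.substrate}")

    peptide = "-".join(chain)
    # TE domain: release as a linear or a cyclic product.
    return f"cyclo({peptide})" if release == "macrocyclization" else peptide


line = [Module("Ala", epimerize=True), Module("Val"), Module("Leu")]
print(run_assembly_line(line))                              # D-Ala-L-Val-L-Leu
print(run_assembly_line(line, release="macrocyclization"))  # cyclo(D-Ala-L-Val-L-Leu)
```

The order of the modules directly determines the order of residues in the product, which is the property that makes module and domain swapping attractive for engineering.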

This modularity resembles a building block system, which naturally suggests the possibility of swapping and varying these blocks. To pursue this approach, several engineering strategies have been developed to modify and engineer NRPSs to produce novel peptides.

NRPS-Engineering

Building on this architecture, engineering strategies to recombine NRPS assembly lines focus on manipulating individual units to create novel peptide variants. These approaches can involve swapping entire modules between different NRPS systems or targeting specific domains within modules. In either case, the native NRPS must be split at defined positions^[5].

Split Sites

The most intuitive approach to engineering these enzymes is to recombine modules from different NRPSs (Fig. 10).

However, NRPSs are large, complex, and highly interdependent. Their interfaces are not universally compatible, so swapping fragments at random junctions often disrupts enzyme function, leading to reduced activity or complete loss of function. As highlighted in the literature, the outcomes of module swapping are frequently unpredictable, and production yields are usually poor^[6].

To overcome these problems, the concept of split sites and eXchange Units (XUs) was introduced. These units are based on conserved motifs located within or between domains that can be used to recombine NRPS parts^[7]. This strategy offers several advantages:

It preserves the natural “handshake” between functional elements, improving compatibility.
It is more modular and predictable, since smaller pieces are easier to recombine.
It increases the success rate of producing functional, chimeric NRPSs.

In NRPS engineering, four different eXchange Unit (XU) strategies have been established, each with its own application^[7]^[8]:

XU strategy: Uses XU sites at the C-A interface, specifically inside the conserved WNATE motif (right after W). The XU approach enables modular recombination by fusing short units between condensation and adenylation domains, preserving specificity, but it often suffers from reduced production titers.
XUC strategy: Uses a fusion point inside the condensation (C) domain. This approach produces peptides at significantly higher yields, while also reducing side-product diversity^[9].
XUT^I and XUT^IV strategies: XUT^I uses a site within the linker region between the A-T domains that is located 90 bp upstream from the conserved FFxxGGxS motif in the T domain. XUT^IV uses a conserved motif inside the T domain. Inspired by evolutionary recombination, the XUT (exchange between T domains) method permits the assembly of NRPS fragments from diverse sources (with varying GC content, similarity, and specificity), thereby broadening chemical diversity and enabling the design of complex molecules^[8].

By choosing specific engineering sites, NRPSs can be recombined from different units to alter their function more reliably (Fig. 11).
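The motif-based split sites described above can be located in a protein sequence with a simple pattern search. The following Python sketch is illustrative only: the toy sequence is made up, real motifs can deviate from the consensus, and it assumes that the 90 bp (30 codons) for XUT^I are counted from the first residue of the FFxxGGxS motif.

```python
import re

CA_MOTIF = re.compile(r"WNATE")    # conserved motif at the C-A interface (XU site)
T_MOTIF = re.compile(r"FF..GG.S")  # conserved FFxxGGxS motif in the T domain
XUT1_OFFSET_AA = 90 // 3           # 90 bp upstream corresponds to 30 codons


def find_split_sites(protein):
    """Return candidate split positions in a protein sequence.

    Each position is the 0-based index of the first residue after the cut.
    """
    # XU: cut inside WNATE, right after the W.
    xu = [m.start() + 1 for m in CA_MOTIF.finditer(protein)]
    # XUT^I: cut 30 residues upstream of the FFxxGGxS motif (assumed reference point).
    xut1 = [
        m.start() - XUT1_OFFSET_AA
        for m in T_MOTIF.finditer(protein)
        if m.start() >= XUT1_OFFSET_AA
    ]
    return {"XU": xu, "XUT_I": xut1}


# Made-up toy sequence containing one WNATE and one FFxxGGxS motif.
toy_protein = "M" * 10 + "WNATE" + "A" * 40 + "FFELGGHS" + "K" * 5
print(find_split_sites(toy_protein))  # {'XU': [11], 'XUT_I': [25]}
```

For real NRPS sequences, a profile-based domain annotation would be more robust than exact consensus matching; the regular expressions here only mirror the motifs named in the text.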

Unit Exchange

In our project, we chose the XUT^I strategy for NRPS engineering because it combines flexibility and broad applicability with the novelty of an evolution-based recombination method, making it particularly promising for generating functional hybrid NRPS assembly lines. Between XUT^I and XUT^IV, we preferred XUT^I because it keeps the thiolation domain intact, thereby preserving complete native interfaces to the downstream condensation domain. We decided against XUT^IV because splitting inside the thiolation domain could introduce further incompatibilities within that domain, reducing overall engineering success.

We therefore consider XUT^I to be more universally reliable and less likely to compromise NRPS function when applied across diverse clusters.

However, unit exchange also has limitations: inter-module interfaces, such as those between the upstream A domain and the downstream T domain, may still be incompatible, potentially reducing overall success. Furthermore, module swaps are very challenging because of the larger genetic constructs and highly repetitive sequences. The engineering sites can also be combined to exchange only specific domains within the assembly line. Combining XUT^I with the XU strategy allows single A domains to be cut out and exchanged, increasing flexibility (Fig. 12).

**Fig. 12:** A-domain exchange using XU and XUT^I split sites.

Unit Incompatibility & Solutions

Overall, while both strategies aim to engineer NRPSs, module exchange generally provides a more robust and reliable approach, as it preserves interactions within the unit that was exchanged. However, even when using the described engineering strategies, unit incompatibilities may still occur, raising the critical question of why such mismatches persist and how they might be avoided or overcome.

The unpredictability of functionality in recombinant NRPS constructs remains a substantial challenge in the field. Current data have not yet yielded comprehensive insights into the underlying mechanisms governing this functionality. Advanced computational analyses hold promise for revealing these intricate mechanisms. Such approaches could not only make NRPS engineering more predictable but also enhance its potential as a biotechnological tool. Moreover, the development of predictive models could substantially increase the throughput of NRPS construct generation by enabling the pre-selection of functional combinations. We have implemented this using phylogenetics in our software.

Biosynthetic Gene Clusters

NRPSs are encoded in biosynthetic gene clusters (BGCs) that span multiple kilobases. BGCs are contiguous sets of co-localized genes within microbial genomes and contain additional auxiliary elements such as genes for tailoring enzymes^[10]. The architecture of BGCs presents major difficulties for cloning NRPSs.

NRPS Cloning

When cloning NRPSs, the following key challenges have to be considered:

Repetitive sequence architecture: Due to the modular nature of NRPSs, the genes encoding them are highly repetitive. This complicates primer design for PCR-based cloning. If no unique binding sites can be found, split primers have to be designed.
High GC content: Some NRPS clusters, particularly from species like Pseudomonas fluorescens^[11], have a high GC content. This increases primer annealing temperatures and further complicates amplification, occasionally making certain clusters experimentally inaccessible.
Product toxicity and metabolic burden: Many NRPS products are bioactive and can be toxic to the host organism. In the absence of resistance mechanisms, accumulation of the compound can inhibit growth or kill the producer. Additionally, expression of large multidomain proteins consumes substantial cellular resources, reducing proliferation and yield. Workarounds such as co-culturing with natural product absorbing beads can improve production titers.
Large gene cluster size: NRPS biosynthetic gene clusters (BGCs) are often very large (e.g. ~65 kbp for vancomycin (MIBiG BGC0000455)), complicating DNA assembly, transformation, plasmid stability, and heterologous expression. Dividing clusters into separate plasmids alleviates cloning issues.
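Two of the challenges listed above, high GC content and non-unique primer binding sites in repetitive genes, can be screened for computationally before primers are ordered. The following is a minimal Python sketch under simplifying assumptions: the function names and toy sequences are illustrative, and only exact matches on a single strand are counted.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def count_binding_sites(template, primer):
    """Count exact (possibly overlapping) occurrences of a primer in a template.

    A count above 1 indicates that the primer is not unique on this strand.
    """
    template, primer = template.upper(), primer.upper()
    return sum(
        1
        for i in range(len(template) - len(primer) + 1)
        if template.startswith(primer, i)
    )


# Toy example: a repeated stretch makes the primer bind twice.
template = "ATGCGTATGCGT"
print(round(gc_content(template), 2))          # 0.5
print(count_binding_sites(template, "ATGCG"))  # 2
```

A real primer check would also consider the reverse complement strand and near matches; this sketch only illustrates why repeated module sequences make unique binding sites hard to find.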

Split Inteins

Some NRPS systems naturally consist of several proteins that interact through their own docking domains. This makes it easy to distribute their genes across multiple plasmids for expression. However, docking domains do not occur in all NRPSs and are not always in the same position, meaning this method cannot be applied as a universal way to divide clusters. Additionally, natural docking domains are often located after T domains, a position that does not work well for NRPS engineering.

One solution to this challenge is the use of split inteins. Inteins are small protein segments that can excise themselves from a larger polypeptide and simultaneously join the flanking exteins. Split inteins are naturally occurring or engineered intein fragments that function only when present as a pair. Each pair consists of an N-terminal (N) and a C-terminal (C) fragment, which associate non-covalently to reconstitute the active intein. Once assembled, the intein catalyzes protein splicing, ligating the attached NRPS fragments with a native peptide bond.

**Fig. 13:** XUT^I split sites combined with inteins: The NRPS gene is split at XUT^I split sites and the sequence is distributed across three orthogonal expression plasmids. The plasmids encode the two halves of each of the two orthogonal split inteins, gp41-8 and NrdJ-1, to which the NRPS gene fragments are genetically fused.

Recently, split inteins have been established as a novel tool to express a single NRPS protein from multiple plasmids, and we chose these inteins for our project^[12] (Fig. 13).

For engineering purposes, three amino acids from the natural extein sequence are usually retained on each side of the intein to ensure efficient splicing. Those amino acids are left as a six-amino acid scar (three residues contributed by each half) in the spliced NRPS protein. The split intein pairs used in this project are:

gp41-8_N and gp41-8_C
NrdJ-1_N and NrdJ-1_C

The scar residues introduced by gp41-8 are: Leucine (L) - Asparagine (N) - Arginine (R), and Serine (S) - Alanine (A) - Valine (V).

The scar residues introduced by NrdJ-1 are: Asparagine (N) - Proline (P) - Cysteine (C) on the N-terminal side, and Serine (S) - Glutamic acid (E) - Isoleucine (I) on the C-terminal side. In both cases, the rest of the intein is removed.

The gp41-8_N / gp41-8_C pair, a variant of the gp41 intein family, has been characterized as fast splicing. The NrdJ-1_N / NrdJ-1_C pair, from the bacterial ribonucleotide reductase nrdJ gene, is likewise among the fastest split inteins. Together, these pairs provide a reliable and orthogonal set for reconnecting split NRPS modules into continuous, functional assembly lines (Fig. 14)^[12]^[13].
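The effect of intein splicing on the final protein sequence can be sketched in a few lines of Python. This is an illustration under stated assumptions: the fragment sequences are placeholders, the function name is hypothetical, and each scar is written in the order in which its two halves are listed in the text (N-terminal side first).

```python
# Six-residue scars (three residues per side) left by each split intein pair,
# taken from the text above.
SCARS = {
    "gp41-8": ("LNR", "SAV"),
    "NrdJ-1": ("NPC", "SEI"),
}


def splice(fragments, inteins):
    """Join protein fragments, leaving the intein-specific scar at each junction.

    `fragments` are the NRPS parts in order; `inteins` names the split intein
    used at each junction, so one intein is needed per junction.
    """
    if len(inteins) != len(fragments) - 1:
        raise ValueError("need exactly one intein per junction")
    product = fragments[0]
    for intein, fragment in zip(inteins, fragments[1:]):
        n_side, c_side = SCARS[intein]
        product += n_side + c_side + fragment
    return product


# Placeholder fragments standing in for three NRPS parts on three plasmids.
parts = ["MKT", "GGG", "HHH"]
print(splice(parts, ["gp41-8", "NrdJ-1"]))  # MKTLNRSAVGGGNPCSEIHHH
```

Using a different intein at each junction, as in this example, reflects the orthogonality requirement: each N-terminal half must pair only with its own C-terminal half so that the fragments are joined in the intended order.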

**Fig. 14:** Intein-based NRPS expression: The three expression plasmids are co-transformed in E. coli. After protein expression, the two split inteins assemble and autocatalytically remove themselves, producing a single covalently linked NRPS protein.

NRPS & Golden Gate Cloning

Golden Gate cloning is a widely used DNA assembly method that enables the seamless insertion of multiple DNA fragments into a vector in a single, one-pot reaction. In this strategy, DNA parts from so-called donor plasmids are inserted into a plasmid backbone brought into this reaction by a so-called acceptor vector.

The technique relies on Type IIS restriction enzymes, which recognize specific DNA sequences but cut outside of them. We used the Type IIS restriction enzyme BsaI. It recognizes the sequence 5′-GGTCTC-3′ and cuts at a defined distance from this site: one nucleotide downstream on the recognition strand (5′-GGTCTC(N)₁-3′) and five nucleotides away on the complementary strand (3′-CCAGAG(N)₅-5′). This staggered cut results in sticky ends with a four-nucleotide overhang (Fig. 15) (New England Biolabs. (2025). BsaI-HF®v2 (R3733): A high-fidelity Type IIS restriction enzyme with reduced star activity. New England Biolabs).

**Fig. 15:** BsaI recognition site and alanine-serine "scar".

When applying this method to NRPS engineering, Golden Gate cloning requires the manual insertion of restriction sites at the chosen split sites. Because the overhangs must be designed to be compatible, this process usually leaves behind a short “scar” in the final construct at the recombination junction^[14]. At the protein level, this corresponds to the addition of amino acids (Fig. 15).
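Because BsaI cuts at a fixed distance from its recognition site, the overhangs a fragment will produce can be read directly from its sequence. The following Python sketch illustrates this for both site orientations; the donor sequence is a made-up toy example and the function name is hypothetical. Overhangs are reported as they read on the top strand.

```python
SITE_FWD = "GGTCTC"  # BsaI recognition site on the top strand
SITE_REV = "GAGACC"  # reverse complement: site lies on the bottom strand


def bsai_overhangs(seq):
    """Return (site index, orientation, 4-nt overhang) for each BsaI site.

    A forward site cuts 1 nt downstream on the top strand and 5 nt downstream
    on the bottom strand; a reverse site cuts the same distances upstream.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        window = seq[i:i + 6]
        if window == SITE_FWD and i + 11 <= len(seq):
            hits.append((i, "+", seq[i + 7:i + 11]))
        elif window == SITE_REV and i >= 5:
            hits.append((i, "-", seq[i - 5:i - 1]))
    return hits


# Toy donor: site + spacer + overhang + insert + overhang + spacer + reverse site.
donor = "GGTCTC" + "A" + "GCTT" + "CATGAAACCC" + "GCAA" + "G" + "GAGACC"
print(bsai_overhangs(donor))  # [(0, '+', 'GCTT'), (26, '-', 'GCAA')]
```

In a Golden Gate design, the overhang pair of each donor fragment has to match the overhangs of its neighbors, which is why the junction sequence, and thus the scar, is fixed by the chosen overhangs.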

NRPS Libraries

Application of the described engineering approaches offers great opportunities for the creation of part libraries.

Exchanging modules or domains within nonribosomal peptide synthetases (NRPSs) enables the generation of peptide derivatives with altered structures and functions. By systematically combining such engineered variants, researchers can construct NRPS libraries that serve as powerful resources for exploring chemical diversity, identifying novel bioactive compounds, and optimizing peptide properties.

Current NRPS library approaches^[15] enable modular assembly of NRPS variants but require donor fragments with specific Type IIS overhangs for each position in the NRPS at which they are to be inserted, which limits throughput and flexibility. In contrast, our NRPieceS concept combines Golden Gate cloning with a split-intein-based expression system. This reduces cloning complexity and the need for multiple donor plasmids per position, facilitating the generation of larger and more versatile NRPS libraries.
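The size of such a combinatorial library grows multiplicatively with the number of exchangeable positions. The short Python sketch below enumerates a hypothetical library; the unit names and the number of variants per position are invented for illustration and do not describe an actual NRPieceS design.

```python
from itertools import product
from math import prod


def library_size(variants_per_position):
    """Number of distinct assembly lines: the product of variants at each position."""
    return prod(len(variants) for variants in variants_per_position)


def enumerate_library(variants_per_position):
    """List every combination of one unit per position, in position order."""
    return ["-".join(combo) for combo in product(*variants_per_position)]


# Hypothetical library: 2, 3 and 1 interchangeable units at three positions.
positions = [["Ala", "Ser"], ["Val", "Leu", "Ile"], ["Phe"]]
print(library_size(positions))          # 6
print(enumerate_library(positions)[0])  # Ala-Val-Phe
```

With, for example, ten variants at each of three positions, the same calculation gives 1000 possible assembly lines, which illustrates why reducing the cloning effort per unit matters for library generation.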

References

[1] Wang, H., Fewer, D. P., Holm, L., Rouhiainen, L., & Sivonen, K. (2014). Atlas of nonribosomal peptide and polyketide biosynthetic pathways reveals common occurrence of nonmodular enzymes. Proceedings of the National Academy of Sciences of the United States of America, 111(25), 9259–9264. https://doi.org/10.1073/pnas.1401734111

[2] Pham, V. H. T., Quach, D. T., Hossain, M. B., & Islam, M. T. (2019). A review of the microbial production of bioactive natural products. Frontiers in Microbiology, 10, 1404. https://doi.org/10.3389/fmicb.2019.01404

[3] Flissi, A., Ricart, E., Campart, C., Chevalier, M., Dufresne, Y., Michalik, J., Jacques, P., Flahaut, C., Lisacek, F., Leclère, V., & Pupin, M. (2020). Norine: update of the nonribosomal peptide resource. Nucleic Acids Research, 48(D1), D465–D469. https://doi.org/10.1093/nar/gkz1000

[4] Peng, Y., Chen, Y., Zhou, C., Miao, W., Jiang, Y., Zeng, X., & Zhang, C. (2024). Modular catalytic activity of nonribosomal peptide synthetases depends on the dynamic interaction between adenylation and condensation domains. Structure. https://doi.org/10.1016/j.str.2024.01.010

[5] Süssmuth, R. D., & Mainz, A. (2017). Nonribosomal peptide synthesis—principles and prospects. Angewandte Chemie International Edition, 56(14), 3770–3821. https://doi.org/10.1002/anie.201609079

[6] Winn, M., Fyans, J. K., Zhuo, Y., & Micklefield, J. (2016). Recent advances in engineering nonribosomal peptide assembly lines. Natural Product Reports, 33(2), 317–347. https://doi.org/10.1039/c5np00099h

[7] Bozhüyük, K. A. J., Fleischhacker, F., Linck, A., Wesche, F., Tietze, A., Niesert, C.-P., & Bode, H. B. (2018). De novo design and engineering of non-ribosomal peptide synthetases. Nature Chemistry, 10(3), 275–281. https://doi.org/10.1038/nchem.2890

[8] Bozhüyük, K. A. J., Präve, L., Kegler, C., Schenk, L., Kaiser, S., Schelhas, C., Shi, Y.-N., Kuttenlochner, W., Schreiber, M., Kandler, J., Alanjary, M., Mohiuddin, T. M., Groll, M., Hochberg, G. K. A., & Bode, H. B. (2024). Evolution-inspired engineering of nonribosomal peptide synthetases. Science, 383(6689). https://doi.org/10.1126/science.adg4320

[9] Bozhüyük, K. A. J., Linck, A., Tietze, A., et al. (2019). Modification and de novo design of non-ribosomal peptide synthetases using specific assembly points within condensation domains. Nature Chemistry, 11, 653–661. https://doi.org/10.1038/s41557-019-0276-z

[10] Meesil, W., Muangpat, P., Sitthisak, S., Rattanarojpong, T., Chantratita, N., Machado, R. A. R., Shi, Y.-M., Bode, H. B., Vitta, A., & Thanwisai, A. (2023). Genome mining reveals novel biosynthetic gene clusters in entomopathogenic bacteria. Scientific Reports, 13, 20764. https://doi.org/10.1038/s41598-023-47121-9

[11] Martínez-García, P. M., Ruano-Rosa, D., Schilirò, E., Prieto, P., Ramos, C., Rodríguez-Palenzuela, P., & Mercado-Blanco, J. (2015). Complete genome sequence of Pseudomonas fluorescens strain PICF7, an indigenous root endophyte from olive (Olea europaea L.) and effective biocontrol agent against Verticillium dahliae. Standards in Genomic Sciences, 10, 10. https://doi.org/10.1186/1944-3277-10-10

[12] Gonschorek, P., Policarpo, R., Heck, T., Podolski, A., Lindeboom, T. A., Schindler, D., George, J., & Bode, H. B. (2025). Split inteins for generating combinatorial non-ribosomal peptide libraries. bioRxiv. https://doi.org/10.1101/2025.10.02.680031

[13] Pinto, F., Thornton, E. L., & Wang, B. (2020). An expanded library of orthogonal split inteins enables modular multi-peptide assemblies. Nature Communications, 11, 1529. https://doi.org/10.1038/s41467-020-15272-2

[14] Bird, J. E., Marles-Wright, J., & Giachino, A. (2022). A User’s Guide to Golden Gate Cloning Methods and Standards. ACS Synthetic Biology, 11(11), 3551–3563. https://doi.org/10.1021/acssynbio.2c00355

[15] Podolski, A., Lindeboom, T. A., Präve, L., Kranz, J., Schindler, D., & Bode, H. B. (2025). High-throughput engineering and modification of non-ribosomal peptide synthetases based on Golden Gate assembly. bioRxiv. https://doi.org/10.1101/2025.04.23.650154
