
NRPS for Dry Lab

Key points

  • T domains carry a covalently attached phosphopantetheine (Ppant) arm that holds activated amino acids as thioesters and transports them into the active sites of other domains.

  • The term ‘condensation complex’ describes the interaction between two T domains and one C domain in which the peptide bond is formed.

Abstract

An important part of our software is a pipeline for the automated prediction of the 3D structures of condensation complexes, the three-domain interactions in which the peptide bond formation is catalyzed. The development of this pipeline required an in-depth understanding of the chemical mechanisms underlying NRP synthesis. This text provides all important information on NRPS domains and condensation complexes.

How do NRPS catalyze peptide synthesis?

A domains are the largest NRPS domains, with an approximate mass of 60 kDa. They bind an amino acid via a conserved lysine residue and a mostly conserved aspartate residue, which form salt bridges with the carboxylate and ammonium groups of the amino acid, respectively. The residues around this binding site confer specificity for one amino acid substrate, although so-called promiscuous A domains that can incorporate two or more different substrates are also common[1]. To predict the structure of NRPS products, it is vital to know which amino acid each A domain will incorporate into the peptide. For this reason, our pipeline uses two different tools, antiSMASH[2] and PARAS[3], to predict this substrate specificity.
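
When two predictors are available, their outputs have to be reconciled in some way. The sketch below shows one simple option: keep a prediction only where both sources agree. This is an illustrative assumption, not a description of how the pipeline actually combines antiSMASH and PARAS. The function name, the dictionary layout and the example values are hypothetical, and no real tool output or API is used.

```python
from typing import Dict, Optional


def combine_predictions(
    pred_a: Dict[str, str], pred_b: Dict[str, str]
) -> Dict[str, Optional[str]]:
    """Return the agreed substrate per A domain, or None on disagreement.

    Both inputs map a (hypothetical) A-domain identifier to a predicted
    amino acid. Domains missing from either input are reported as None.
    """
    combined: Dict[str, Optional[str]] = {}
    for domain_id in sorted(set(pred_a) | set(pred_b)):
        a = pred_a.get(domain_id)
        b = pred_b.get(domain_id)
        combined[domain_id] = a if a is not None and a == b else None
    return combined


# Made-up example values, not real tool output.
tool_1 = {"A1": "Trp", "A2": "Ala", "A3": "Thr"}
tool_2 = {"A1": "Trp", "A2": "Val", "A3": "Thr"}
consensus = combine_predictions(tool_1, tool_2)
print(consensus)
```

Domains flagged with `None` could then be inspected manually or treated as candidates for promiscuous A domains.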

A domains catalyze two successive activation steps (Fig. 1). First, the carboxylate group of the amino acid reacts with adenosine triphosphate (ATP), forming an aminoacyl-AMP species and pyrophosphate. The activated amino acid then reacts with the thiol group of a phosphopantetheine (Ppant) arm, which is covalently attached to the T domain directly downstream, forming a thioester and releasing adenosine monophosphate (AMP).[1] The structure and function of the Ppant arm are explained in the next section. Thioesters are high in energy, which facilitates the formation of the peptide bond, as described in the section on C domains.

Fig. 1: Reaction sequence catalyzed by the A domain in the context of the larger NRPS system. The empty Ppant arms of each T domain are loaded with an amino acid by the A domain directly upstream (left). The mechanism of the reaction catalyzed by the A domain (right). The red ‘R’ is the side chain of the amino acid, the blue ‘R’ represents the Ppant arm bound to the T domain.
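
The two activation steps can also be written as overall reaction equations. The block below is a compact summary of the text above, not a mechanism. The label "HS-Ppant-T" is shorthand introduced here for the free thiol of the Ppant arm on the downstream T domain, and the `align*` environment assumes the `amsmath` package.

```latex
\begin{align*}
\text{amino acid} + \text{ATP}
  &\longrightarrow \text{aminoacyl-AMP} + \text{PP}_i \\
\text{aminoacyl-AMP} + \text{HS-Ppant-T}
  &\longrightarrow \text{aminoacyl-S-Ppant-T} + \text{AMP}
\end{align*}
```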

T domains are small (about 10 kDa) four-helix domains that transport the activated amino acids between catalytically active domains. They contain a conserved FFxxGGxS motif whose serine is post-translationally modified by the attachment of phosphopantetheine (Ppant).[1] This Ppant arm (Fig. 2) resembles other typical carriers of activated acyl groups, such as coenzyme A in acetyl-CoA: it has a long, linear structure and delivers the activated amino acids to the catalytic centers of all other domains.

Fig. 2: Structure of the Ppant arm. It is connected to the T domain via a serine residue (shown as R-O).
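
As a minimal sketch of how the FFxxGGxS motif could be located in a protein sequence, the snippet below translates it into a regular expression and reports the position of the serine that would carry the Ppant arm. The function name and the toy sequence are hypothetical. Real T domains deviate from the idealized motif, so a strict pattern like this will miss many of them, and profile-based domain detection would be more robust.

```python
import re
from typing import List

# "x" in the motif means "any residue", which is "." in a regex.
# The lookahead lets overlapping matches be reported as well.
MOTIF = re.compile(r"(?=(FF..GG.S))")


def find_ppant_serines(sequence: str) -> List[int]:
    """Return 1-based positions of the serine that closes each motif match."""
    # The motif is 8 residues long, so the serine sits 7 residues after the
    # 0-based match start, i.e. at 1-based position start + 8.
    return [m.start() + 8 for m in MOTIF.finditer(sequence.upper())]


toy_sequence = "MKAFFELGGHSLLA"  # invented sequence for illustration only
print(find_ppant_serines(toy_sequence))  # prints [11]
```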

C domains are 50 kDa pseudodimers that catalyze the formation of a peptide bond. Their two subdomains, often called the N-lobe and the C-lobe, form a V-shape with the active center at the interface between them, although the conserved active site motif HHxxxDG belongs to the N-lobe.[4] Two T domains can bind at opposite ends of the C domain, with their Ppant arms extending along the interface between the two lobes towards the active site. The interaction of the C domain with the two T domains is referred to as a condensation complex, and our dry lab project mainly concerned the high-throughput 3D structure prediction of these complexes.

Experimentally, it is difficult to determine 3D structures of condensation complexes because NRPS are highly dynamic: all domains are mobile and take part in several different interactions, which makes it hard to observe the condensation complex specifically. Nonetheless, structures of condensation complexes (Fig. 3) can be determined if the Ppant arms are modified so that they form a covalent crosslink in the active site, thereby trapping the NRPS in the condensation complex state.[5]

Fig. 3: Cryo-EM structure of a condensation complex with covalently crosslinked Ppant arms (PDB 9BFD, modified).[5] Schematic representation of the condensation complex in the context of a larger NRPS system.

The condensation reaction catalyzed by the C domain is illustrated in Fig. 4. The catalytic mechanism has not yet been fully elucidated; in particular, the role of the two conserved histidine residues remains unresolved. Generally, the upstream (donor) T domain carries a peptidyl-Ppant arm, to which the growing peptide chain is covalently attached, while the downstream (acceptor) T domain bears an aminoacyl-Ppant arm carrying a single activated amino acid. The free amino group of this aminoacyl substrate attacks the thioester bond of the peptidyl-Ppant intermediate, which forms the peptide bond and releases the donor Ppant thiolate as the leaving group. After condensation, the donor T domain is left with a free Ppant arm that can return to the A domain to be loaded with another amino acid. Meanwhile, the acceptor T domain now holds a peptidyl-Ppant arm, elongated by one residue, which is subsequently delivered to the next downstream C domain, where it serves as the donor in the following condensation step.[1]

Fig. 4: Scheme of the reaction catalyzed by the C domain.
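
The donor/acceptor hand-off described above can be mimicked with a toy model that tracks which chain each acceptor T domain holds after its condensation step. This is a bookkeeping sketch only. It ignores chemistry, timing and tailoring domains, and the function name and residue labels are made up for illustration.

```python
from typing import List


def elongate(loaded_residues: List[str]) -> List[List[str]]:
    """Simulate successive condensations along an assembly line.

    loaded_residues[i] is the amino acid on the Ppant arm of T domain i.
    Returns the chain held by the acceptor T domain after each condensation,
    written from N-terminus to C-terminus.
    """
    if not loaded_residues:
        return []
    chain = [loaded_residues[0]]  # the first T domain is the first donor
    history: List[List[str]] = []
    for acceptor_residue in loaded_residues[1:]:
        # The acceptor's amino group attacks the donor thioester, so the
        # whole chain moves onto the acceptor arm and gains one residue at
        # its C-terminal (thioester-bound) end.
        chain = chain + [acceptor_residue]
        history.append(chain)
        # This acceptor now acts as the donor for the next C domain.
    return history


steps = elongate(["Trp", "Ala", "Thr"])
for step in steps:
    print("-".join(step))
```

Note that the chain always grows at the end attached to the Ppant arm, which is why the most recently added residue is the C-terminal one.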

In some NRPS systems (including those that our wet lab work centered on), the first domain is not an A domain but a C domain. These condensation-starter domains attach a carboxylic acid other than an amino acid, such as acetic acid, to the first aminoacyl-Ppant. In the final peptide, this appears as an N-terminal modification. Condensation-starter domains can often accept various substrates: for instance, the starter domain of the chaiyaphumine synthetase incorporates phenylacetic acid, butyric acid, propionic acid and acetic acid, leading to the respective Chaiyaphumines A-D (Fig. 5).

Fig. 5: Structures of Chaiyaphumines A-D with the N-terminal modification shown in color: phenylacetic acid in red, butyric acid in violet, propionic acid in green and acetic acid in violet.

A common modification during NRP synthesis is the epimerization of amino acids. This refers to inverting the configuration of exactly one stereocenter: in the case of NRPS, the α-carbon of the amino acid attached directly to the Ppant arm, which is the C-terminal residue of the peptidyl chain.

All proteinogenic amino acids (except the achiral glycine) are L-amino acids, as are almost all substrates for A domains. NRPS have evolved two distinct methods of turning these L-amino acids into D-amino acids, which are then incorporated into the final peptide. These strategies greatly increase the chemical space accessible to NRPS without having to increase the number of building blocks.

Epimerizations are most commonly performed by E domains. These are related and structurally similar to C domains, but they can only bind a T domain at the acceptor site, while the donor site is blocked. In the peptidyl-Ppant bound to this T domain, only the α-carbon of the C-terminal amino acid (i.e. the last one that was incorporated) is epimerized. The C domain directly downstream is a so-called DCL domain, because it connects a D-amino acid to an L-amino acid. Regular C domains are correspondingly called LCL domains.

On the other hand, dual E/C domains combine epimerization and DCL activity. They are structurally very similar to other C domains and seem to have evolved separately from E domains. Independent of the domains involved, epimerization will always happen to the amino acid that is on the upstream T domain (Fig. 6).[1]

Fig. 6: Chemical structure of Chaiyaphumine A. L-amino acids are shown in red, D-amino acids are shown in green. (Please note that it would not be correct to refer to the stereocenters in the final peptide as D or L. We only refer to the chirality of the amino acid building blocks.) In the schematic overview of the chaiyaphumine synthetase, arrows show which E domains epimerize which amino acid.
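
The two epimerization routes can be turned into a simple rule for reading building-block configuration off a domain architecture. The sketch below assumes that every A domain loads an L-amino acid, that an E domain acts on the residue of its own module, and that a dual E/C domain acts on the residue of the module directly upstream. The domain labels (including "Cs" for a condensation-starter domain) and the example architecture are hypothetical and do not represent the chaiyaphumine synthetase.

```python
from typing import List


def residue_configurations(modules: List[List[str]]) -> List[str]:
    """Return "D" or "L" for the building block contributed by each module.

    Each module is a list of domain labels in N-to-C order, for example
    ["C", "A", "T", "E"]. The label "E/C" marks a dual E/C domain.
    """
    configs = []
    for i, domains in enumerate(modules):
        # List membership is an exact match, so "E/C" and "TE" are not "E".
        has_e_domain = "E" in domains
        next_is_dual = i + 1 < len(modules) and "E/C" in modules[i + 1]
        configs.append("D" if has_e_domain or next_is_dual else "L")
    return configs


example_modules = [
    ["Cs", "A", "T", "E"],   # E domain epimerizes this module's residue
    ["C", "A", "T"],         # epimerized by the dual E/C domain downstream
    ["E/C", "A", "T"],
    ["C", "A", "T", "TE"],
]
print(residue_configurations(example_modules))  # prints ['D', 'D', 'L', 'L']
```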

Finally, TE domains catalyze the release of the peptide chain from the final T domain. A nucleophilic residue of the TE domain cleaves the thioester, and the resulting TE domain-peptide intermediate can be attacked by several different nucleophiles: (i) attack by water leads to a linear peptide with a carboxylic acid C-terminus; (ii) attack by ammonia or an amine leads to a linear peptide with an amide C-terminus; (iii) attack by the N-terminal free amino group of the peptide leads to a cyclic peptide; (iv) attack by the hydroxy group of a threonine or serine side chain leads to a cyclic peptide with an ester bond, called a depsipeptide.[1] All clusters that we worked with produced depsipeptides.
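
The four release routes amount to a lookup from the attacking nucleophile to the product class, which the following sketch encodes directly. The enum and function names are hypothetical. The code only restates the cases listed above and does not predict which route a given TE domain will take.

```python
from enum import Enum


class Nucleophile(Enum):
    WATER = "water"
    AMINE = "ammonia or an amine"
    N_TERMINAL_AMINE = "N-terminal amino group of the peptide"
    SIDE_CHAIN_HYDROXY = "Ser/Thr side-chain hydroxy group"


RELEASE_PRODUCTS = {
    Nucleophile.WATER: "linear peptide with a carboxylic acid C-terminus",
    Nucleophile.AMINE: "linear peptide with an amide C-terminus",
    Nucleophile.N_TERMINAL_AMINE: "cyclic peptide",
    Nucleophile.SIDE_CHAIN_HYDROXY: "cyclic depsipeptide (ester bond)",
}


def release_product(nucleophile: Nucleophile) -> str:
    """Return the product class formed when this nucleophile attacks."""
    return RELEASE_PRODUCTS[nucleophile]


for nuc in Nucleophile:
    print(f"{nuc.value}: {release_product(nuc)}")
```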

References

[1] Süssmuth, R. D., & Mainz, A. (2017). Nonribosomal Peptide Synthesis-Principles and Prospects. Angewandte Chemie International Edition, 56(14), 3770–3821. https://doi.org/10.1002/anie.201609079

[2] Blin, K. et al. (2025). antiSMASH 8.0: extended gene cluster detection capabilities and analyses of chemistry, enzymology, and regulation. Nucleic Acids Research, 53(W1), W32–W38. https://doi.org/10.1093/nar/gkaf334

[3] Terlouw, B. et al. (2025). PARAS: high-accuracy machine-learning of substrate specificities in nonribosomal peptide synthetases. bioRxiv. https://doi.org/10.1101/2025.01.08.631717

[4] Bloudoff, K. & Schmeing, T. M. (2017). Structural and functional aspects of the nonribosomal peptide synthetase condensation domain superfamily: discovery, dissection and diversity. Biochimica et Biophysica Acta, 1865(11B), 1587–1604. https://doi.org/10.1016/j.bbapap.2017.05.010

[5] Heberlig, G. W., La Clair, J. J. & Burkart, M. D. (2024). Crosslinking intermodular condensation in non-ribosomal peptide biosynthesis. Nature, 638, 261–269. https://doi.org/10.1038/s41586-024-08306-y
