NRPS for Dry Lab
Key points
T domains carry a covalently bound phosphopantetheine (Ppant) arm that holds activated amino acids as thioesters and transports them into the active sites of other domains.
The term ‘condensation complex’ describes the interaction between two T domains and one C domain in which the peptide bond is formed.
Abstract
An important part of our software is a pipeline for the automated prediction of the 3D structures of condensation complexes, the three-domain interactions in which the peptide bond formation is catalyzed. The development of this pipeline required an in-depth understanding of the chemical mechanisms underlying NRP synthesis. This text provides all important information on NRPS domains and condensation complexes.
How do NRPS catalyze peptide synthesis?
A domains are the largest NRPS domains, with an approximate mass of 60 kDa. They bind an amino acid via a conserved lysine residue and a mostly conserved aspartate residue, which form salt bridges with the carboxylate and ammonium groups of the amino acid, respectively. The residues around this binding site confer specificity towards one amino acid substrate, although so-called promiscuous A domains that can incorporate two or more different substrates are also common[1]. To predict the structure of NRPS products, it is vital to know which amino acid each A domain will incorporate into the peptide. For this reason, our pipeline uses two different tools, antiSMASH[2] and PARAS[3], to predict this substrate specificity.
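When two predictors are used side by side, their outputs can be compared position by position. The following is only a minimal sketch of such a comparison, not the method used in our pipeline: the function name `compare_predictions` and the input lists are hypothetical, and the inputs are assumed to be already parsed into one amino acid name per A domain rather than the native output of antiSMASH or PARAS.

```python
def compare_predictions(pred_a, pred_b):
    """Compare two per-A-domain substrate predictions position by position.

    Both arguments are hypothetical, already-parsed lists of amino acid
    names (one entry per A domain), not the native output of any tool.
    Returns a list of (A domain number, prediction A, prediction B, status).
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("both predictors must cover the same A domains")
    report = []
    for index, (a, b) in enumerate(zip(pred_a, pred_b), start=1):
        status = "agree" if a == b else "conflict"
        report.append((index, a, b, status))
    return report


# Made-up example values, for illustration only.
tool_one = ["Thr", "Phe", "Ala", "Phe"]
tool_two = ["Thr", "Phe", "Val", "Phe"]
for row in compare_predictions(tool_one, tool_two):
    print(row)
```

Positions marked as "conflict" are candidates for manual inspection, for example because the A domain might be promiscuous.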
A domains catalyze two consecutive activation steps (Fig. 1). First, the carboxylate group of the amino acid reacts with adenosine triphosphate (ATP), forming an aminoacyl-AMP species and pyrophosphate. The activated amino acid then reacts with a thiol group, which belongs to a phosphopantetheine (Ppant) arm covalently attached to the T domain directly downstream, forming a thioester and releasing adenosine monophosphate (AMP).[1] The structure and function of the Ppant arm are explained in the next section. Thioesters are high in energy, which facilitates the formation of the peptide bond, as we will see in the section on C domains.
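The two activation steps described above can be summarized as a schematic reaction scheme. The abbreviations are ours and only serve as illustration: "aa" stands for the amino acid backbone, PP_i for pyrophosphate, and "HS-Ppant-T" for the thiol of the Ppant arm on the downstream T domain. Charges and protons are not balanced.

```latex
\begin{align*}
\text{aa-COO}^{-} + \text{ATP}
  &\longrightarrow \text{aa-CO-AMP} + \text{PP}_i
  && \text{(adenylation)} \\
\text{aa-CO-AMP} + \text{HS-Ppant-T}
  &\longrightarrow \text{aa-CO-S-Ppant-T} + \text{AMP}
  && \text{(thioester formation)}
\end{align*}
```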
T domains are small 10 kDa domains that consist of four α-helices and transport the activated amino acids between catalytically active domains. They contain a conserved FFxxGGxS motif that is post-translationally modified by the attachment of phosphopantetheine (Ppant).[1] This Ppant arm (Fig. 2) resembles other typical carriers of activated acyl groups such as coenzyme A: the cofactor has a long, linear structure and is responsible for transferring the activated amino acids to the catalytic centers of all other domains.
C domains are 50 kDa pseudodimers that catalyze the formation of a peptide bond. Their two subdomains, often called C-lobe and N-lobe, form a V-shape in which the active center lies at the interface of both subdomains, although the conserved active site motif HHxxxDG is part of the N-lobe.[4] Two T domains can bind at opposite ends of the C domain, with their Ppant arms extending along the interface between the two lobes towards the active site. The interaction of the condensation domain with the two thiolation domains is referred to as a condensation complex, and our dry lab project mainly concerned the high-throughput 3D structure prediction of these complexes.
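To illustrate which domains make up a condensation complex, the sketch below lists the T-C-T triples in a linear domain order. It is a simplified illustration, not the code of our pipeline: the function name and the single-letter domain labels are hypothetical, and it assumes that each C domain pairs with the nearest T domain before it and the nearest T domain after it in the sequence.

```python
def find_condensation_complexes(domains):
    """Return (upstream T index, C index, downstream T index) triples.

    `domains` is a list of domain labels in N- to C-terminal order,
    e.g. ["A", "T", "C", "A", "T", "TE"]. Assumption: each C domain
    pairs with the nearest T domain before it and the nearest T domain
    after it. C domains lacking either partner are skipped.
    """
    complexes = []
    for i, label in enumerate(domains):
        if label != "C":
            continue
        upstream = [j for j in range(i) if domains[j] == "T"]
        downstream = [j for j in range(i + 1, len(domains)) if domains[j] == "T"]
        if upstream and downstream:
            complexes.append((upstream[-1], i, downstream[0]))
    return complexes


# Hypothetical three-module assembly line.
example = ["A", "T", "C", "A", "T", "C", "A", "T", "TE"]
print(find_condensation_complexes(example))  # [(1, 2, 4), (4, 5, 7)]
```

A C domain at the very start of the assembly line has no upstream T domain and is therefore not reported by this sketch.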
Experimentally, it is difficult to determine 3D structures of condensation complexes because of the dynamic nature of NRPS: all domains are mobile and involved in different interactions, which makes it difficult to observe the condensation complex specifically. Nonetheless, structures of condensation complexes (Fig. 3) can be determined if the Ppant arms are modified to form a covalent crosslink in the active site, thereby trapping the NRPS in the condensation complex state.[5]
The condensation reaction catalyzed by the C domain is illustrated in Fig. 4. The catalytic mechanism has not yet been fully elucidated; in particular, the role of the two conserved histidine residues remains unresolved. Generally, the upstream (donor) T domain carries a peptidyl-Ppant arm, to which the growing peptide chain is covalently attached, while the downstream (acceptor) T domain bears an aminoacyl-Ppant arm containing a single activated amino acid. The free amino group of this aminoacyl substrate attacks the thioester bond of the peptidyl-Ppant intermediate, resulting in peptide bond formation and the release of a thiolate leaving group. Following condensation, the donor T domain is left with a free Ppant arm that can return to the A domain to be loaded with another activated amino acid. Meanwhile, the acceptor T domain now holds a peptidyl-Ppant arm, elongated by one residue, which will subsequently move to the next downstream C domain, where it serves as the donor in the following condensation step.[1]
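The donor and acceptor roles described above can be traced with a small toy model. This is a deliberately simplified sketch with hypothetical function names: it loads every T domain with one residue first and then performs the condensations from the first to the last module, whereas loading and condensation are interleaved in a real NRPS.

```python
def condense(donor_chain, acceptor_residue):
    """Perform one condensation step.

    The acceptor's amino group attacks the donor thioester, so the whole
    donor chain ends up on the acceptor T domain, N-terminal to the new
    residue. Returns (donor arm afterwards, acceptor arm afterwards).
    """
    return [], donor_chain + [acceptor_residue]


def run_assembly_line(residues):
    """Load one residue per module, then condense from first to last module."""
    arms = [[residue] for residue in residues]  # one Ppant arm per T domain
    for i in range(len(arms) - 1):
        arms[i], arms[i + 1] = condense(arms[i], arms[i + 1][0])
    return arms


# Made-up three-residue example.
print(run_assembly_line(["Ala", "Val", "Leu"]))
# [[], [], ['Ala', 'Val', 'Leu']]
```

After the last step, all upstream arms are free again and the final T domain carries the full-length chain, written from N- to C-terminus.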
In some NRPS systems (including those that our wet lab work was centered on), the first domain is not an A domain, but a C domain. These specific condensation-starter domains connect the first aminoacyl-Ppant to a carboxylic acid that is not an amino acid, such as acetic acid. In the final peptide, this appears as an N-terminal modification. Condensation-starter domains can often bind various substrates: for instance, the starter domain of the Chaiyaphumine synthetase incorporates phenylacetic acid, butyric acid, propionic acid and acetic acid, leading to the respective Chaiyaphumines A-D (Fig. 5).
A common modification during NRP synthesis is the epimerization of amino acids. This refers to inverting the configuration of exactly one stereocenter; in the case of NRPS, this is the α-carbon of the amino acid directly attached to the Ppant arm, i.e. the C-terminal residue of the peptidyl chain.
All proteinogenic amino acids (except the achiral glycine) are L-amino acids, as are almost all substrates for A domains. NRPS have evolved two distinct methods of turning these L-amino acids into D-amino acids, which are then incorporated into the final peptide. These strategies greatly increase the chemical space accessible to NRPS without having to increase the number of building blocks.
E domains most commonly perform epimerizations. They are related and structurally similar to C domains, but they can only bind a T domain at the acceptor site, while the donor site is blocked. In the peptidyl-Ppant bound to this T domain, only the α-carbon of the C-terminal amino acid (i.e. the last one that was incorporated) is epimerized. The C domain directly downstream is a so-called DCL domain, because it connects a D-amino acid to an L-amino acid. Regular C domains are accordingly called LCL domains.
On the other hand, dual E/C domains combine epimerization and DCL activity. They are structurally very similar to other C domains and seem to have evolved separately from E domains. Independent of the domains involved, epimerization will always happen to the amino acid that is on the upstream T domain (Fig. 6).[1]
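The effect of an epimerization on a peptidyl chain can be expressed in a few lines. This is an illustrative sketch only: the function name and the representation of the chain as (residue, configuration) tuples are our own assumptions, and achiral residues such as glycine are not handled.

```python
def apply_epimerization(chain):
    """Invert the configuration of the C-terminal residue of a peptidyl chain.

    `chain` is a list of (residue, configuration) tuples in N- to
    C-terminal order, with configuration "L" or "D". Only the last entry
    is changed, and a new list is returned.
    """
    if not chain:
        return []
    residue, config = chain[-1]
    flipped = "D" if config == "L" else "L"
    return chain[:-1] + [(residue, flipped)]


# Made-up dipeptidyl chain on the upstream T domain.
print(apply_epimerization([("Ala", "L"), ("Phe", "L")]))
# [('Ala', 'L'), ('Phe', 'D')]
```

All residues further upstream in the chain keep their configuration, which matches the statement that only the residue attached to the upstream T domain is epimerized.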
Finally, TE domains catalyze the release of the peptide chain from the final T domain. A nucleophilic residue of the TE domain cleaves the thioester, and the resulting TE domain-peptide intermediate can be attacked by several different nucleophiles: (i) attack by water leads to a linear peptide with a carboxyl C-terminus; (ii) attack by ammonia or an amine leads to a linear peptide with an amide C-terminus; (iii) attack by the N-terminal free amino group of the peptide leads to a cyclic peptide; (iv) attack by the hydroxy group of a threonine or serine side chain leads to a cyclic peptide with an ester bond, called a depsipeptide.[1] All clusters that we worked with produced depsipeptides.
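The four release routes listed above can be written as a simple lookup from the attacking nucleophile to the product type. The key names and the function `release_product` are hypothetical labels chosen for this sketch, which does not attempt to predict which route a given TE domain actually takes.

```python
# Nucleophile attacking the TE domain-peptide intermediate -> product type.
RELEASE_PRODUCTS = {
    "water": "linear peptide with a carboxyl C-terminus",
    "ammonia_or_amine": "linear peptide with an amide C-terminus",
    "n_terminal_amino_group": "cyclic peptide",
    "side_chain_hydroxy_group": "cyclic depsipeptide (ester bond)",
}


def release_product(nucleophile):
    """Return the product type for a given nucleophile label."""
    try:
        return RELEASE_PRODUCTS[nucleophile]
    except KeyError:
        raise ValueError(f"unknown nucleophile: {nucleophile!r}") from None


print(release_product("side_chain_hydroxy_group"))
# cyclic depsipeptide (ester bond)
```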
References
[1] Süssmuth, R. D., & Mainz, A. (2017). Nonribosomal Peptide Synthesis-Principles and Prospects. Angewandte Chemie International Edition, 56(14), 3770–3821. https://doi.org/10.1002/anie.201609079
[2] Blin, K. et al. (2025). antiSMASH 8.0: extended gene cluster detection capabilities and analyses of chemistry, enzymology, and regulation. Nucleic Acids Research, 53(W1), W32–W38. https://doi.org/10.1093/nar/gkaf334
[3] Terlouw, B. et al. (2025). PARAS: high-accuracy machine-learning of substrate specificities in nonribosomal peptide synthetases. bioRxiv. https://doi.org/10.1101/2025.01.08.631717
[4] Bloudoff, K., & Schmeing, T. M. (2017). Structural and functional aspects of the nonribosomal peptide synthetase condensation domain superfamily: discovery, dissection and diversity. Biochimica et Biophysica Acta, 1865(11B), 1587–1604. https://doi.org/10.1016/j.bbapap.2017.05.010
[5] Heberlig, G. W., La Clair, J. J., & Burkart, M. D. (2024). Crosslinking intermodular condensation in non-ribosomal peptide biosynthesis. Nature, 638, 261–269. https://doi.org/10.1038/s41586-024-08306-y