Molecular Dynamics

To study molecular interactions under realistic conditions, we employ molecular dynamics (MD) simulations using GROMACS1 to analyze CC-domains and the protease AvrRpt2 in the signaling pathways. AvrRpt2 cleaves RIN4 sequences in linkers connecting weakly associated CC-domain peptides, thereby allowing the strongly interacting peptide pairs to associate and mediate downstream signaling. Therefore, the differential binding affinities among CC-domains and the efficient cleavage activity of AvrRpt2 play a crucial role in the recognition of Pst.

MD allows us to reproduce the laboratory environment on a computer, including water, ions, temperature, and pressure, so that proteins, peptide complexes, and other biomolecules can adopt conformations closer to their natural states. Beyond static structural predictions or docking, MD provides dynamic trajectories that show how molecules move, stabilize, or rearrange over time, especially during production MD simulation. This makes it possible to examine interaction strength, structural stability, and even residue-level contributions to binding free energy. In this way, GROMACS serves as a powerful tool to validate molecular interactions and to uncover the detailed mechanisms behind stability and recognition in protein or peptide systems.

Fig.1 The signal route including AvrRpt2, A-A', A-B, C-C', C-D, and C'-E

 

Cleavage of RIN4 by AvrRpt2

Pst AvrRpt2 (AvrRpt2PS) is an effector protease that specifically cleaves the linkers between leucine zippers in our designed signaling routes, thereby controlling downstream responses. Understanding how this enzyme recognizes and processes its substrates is crucial for interpreting the molecular basis of the signal cascade. We aimed to confirm the binding stability and identify key residues involved in substrate recognition in this part. To address this, we carried out simulations to capture its binding dynamics under near-physiological conditions. We evaluated the energy of the enzyme–substrate binding state and the examined structure interface. Such interactions are closely related to the positioning of the cleavage site and may therefore influence the cleavage efficiency.

System Assembly

Basic Idea

Although no experimental structure is currently available for AvrRpt2PS, the resolved structure of Erwinia amylovora AvrRpt2 (AvrRpt2EA) provides a reliable template for homology modeling. After AvrRpt2PS undergoes autocleavage, it reaches the mature sequence and shows high sequence similarity with the reference AvrRpt2EA structure 6HQZ2.

The catalytic triad in AvrRpt2PS is residues Cys122–His208–Asp2263, which can be set as the active core site for the cleavage sequence of RIN4 to bind.

Procedure

  • Structure Preparation

  We downloaded the AvrRpt2PS structure (predicted) from AlphafoldDB4 as chain A. The linker (including RIN4 minimum cleavage sequence) structural model was obtained from Robetta5 as chain B.

  • Docking with HADDOCK6

  Cys122, His208, and Asp226 in AvrRpt2 and RIN4 sequence VPKFGNW were defined as active residues, ensuring that the RIN4 peptide fragment is properly oriented near the proteolytic site to generate realistic binding conformations.

Molecular Dynamics

  • Simulation

  We used a combination of steepest descent (SD) and conjugate gradient (CG) to minimize the energy of the system, followed by NVT and NPT equilibration with a gradient reduction of position constraint. The production MD lasts 100 ns, output 50000 frames.

  • Trajectory Processing

Results

Overall Stability

We carried out Root Mean Square Deviation (RMSD) analysis of the protein complex, which shows the deviation of the whole complex from the reference structure (frame 1 of the trajectory) during the production MD simulation.

Fig.2 Cα RMSD of the protein complex relative to the initial structure after least-squares fitting to the Cα atoms. The red line shows the moving average smoothed over a 300-frame window.

Relative to the initial conformation, the system’s RMSD value gradually increases during the production simulation, reaching a plateau around 0.75 nm after 50 ns with slight fluctuations, and indicating the complex becomes stable and experiences less conformational change during the last 25 ns as shown in fig.2.

We adopted Molecular Mechanics Poisson–Boltzmann Surface Area (MMPBSA)7 to calculate the energy of the complex. MMPBSA is a post-process method used to calculate ΔG of complexes based on molecular dynamics trajectories gained from MD simulation. The ΔG calculated here included gas-phase molecular mechanics energy (ΔGgas) and solvation free energy (ΔGsolv), without entropy contribution.

Based on the RMSD result, we calculated MMPBSA from 50 ns to 100 ns, and set dt=250 ps to get a sample of 200 frames.

Fig.3 Total ΔG, calculated by MMPBSA using the normal PB model.

The binding free energy of the protein system rises from approximately –120 kcal/mol to around –90 kcal/mol, where it fluctuates, indicating that the system has converged to a steady low-energy state as depicted in fig.3.

Fig.4 Energy component decomposition of ΔG calculated by MMPBSA using the normal PB model.

The ΔGgas decomposition includes van der Waals interactions (VDWAALS) and electrostatic energy (EEL), while ΔGsolv consists of the polar solvation term (EPB) and the nonpolar solvation term (ENPOLAR). Van der Waals interactions arise from close packing and dispersion forces, while electrostatic energy results from complementary charge interactions. Both VDWAALS and EEL are negative, reflecting the favorable attractive forces between the enzyme and substrate. EPB represents the polar solvation penalty, which is positive because desolvating polar groups upon binding is energetically costly, partially offsetting the favorable gas-phase interactions. ENPOLAR is slightly negative, reflecting favorable nonpolar interactions with the solvent that are retained upon binding. Overall, the total binding free energy ΔG, which is the sum of ΔGgas and ΔGsolv, remains negative, indicating that the enzyme–substrate complex formation is thermodynamically favorable.

Overall, the binding structure is well supported by the calculated energetic profile, consistent with a stable protein–protein interaction.

Binding Interface and Energy Distribution

To examine the residues located at the protein–protein interface and their contributions to ΔG, we used a residue decomposition in MMPBSA. Residues were defined as interface residues if any of their heavy atoms were within 6 Å of at least one heavy atom from the opposite chain at a given frame. This criterion ensured that only residues in direct spatial contact were included in the analysis.

Residues with lower (more negative) ΔG values contribute more strongly to stabilizing the binding, while higher values indicate weaker or destabilizing contributions.

Fig.5 Per-residue decomposition of ΔG calculated by MMPBSA using the normal PB model, A for chain A, B for chain B.

Two of the three catalytic triad residues—Cys122 and Asp226—are included, both showing negative contributions to binding free energy. In addition, several other residues exhibit relatively large contributions to ΔG, and they are Gln115, Asn117, and Glu135 in chain A, as well as Gly1 and Trp21 in chain B.

We also looked into the Root Mean Square Fluctuation (RMSF) of the complex, which calculates how much a residue fluctuates around its average position over the course of the simulation, and mapped it on the structure. Regions with higher RMSF values are more flexible, while regions with lower values are more rigid and stable.

Fig.6 RMSF of the complex. The color scale ranges from blue to white to red, representing RMSF values from low to high.

AvrRpt2 and its bound cleavage sequence exhibit relatively constrained fluctuations around their reference conformations, in contrast to the higher flexibility of the GS motif. The enhanced mobility of the free GS sequence likely accounts for the elevated RMSD compared with typical levels, and this flexibility in turn facilitates the proper engagement of AvrRpt2 with its cleavage site.

In summary, the per-residue energy decomposition highlights the key residues that stabilize binding, while RMSF analysis reveals that flexibility is mainly concentrated in the linker region. Together, these results support the expected binding mode and provide a clear picture of how structural dynamics contribute to the interaction.

 

Peptide affinity in water

For the CC-domain peptides in our signal routes, we applied MD analysis to check whether their binding affinities match expected theoretical values. First, structural models of peptide pairs were generated through prediction and docking. These complexes were then placed in explicit water with ions, equilibrated, and simulated with GROMACS. The production runs provided trajectories of peptides from which we analyzed stability and calculated binding free energies (ΔG). These ΔG values allow us to compare peptide pairs quantitatively and can be related to dissociation constants (Kd). Moreover, apart from confirming the expected ranking of affinities, the simulation can also reveal which residues play the largest role in stabilizing or weakening peptide binding, providing a reference for mutation design.

System Preparation

5 pairs of CC-domains - AA’, AB, CC’, CD, and CE - were involved. The structures of single chains were obtained from Robetta and put into HADDOCK for docking.

Molecular Dynamics

  • Simulation

All the structures were placed in GROMACS, and the simulation setup involved defining the simulation box, solvating the system, adding counterions for neutralization, followed by energy minimization, NVT equilibration, and NPT equilibration.

We carried out 50 ns of production MD simulation to let the peptides relax and adopt energetically favorable structures.

  • Trajectory Processing

This step is to make the entire complex whole by combining the fractions caused by periodic boundary conditions and to center it within the solvent box.

Results

We conducted MMPBSA mainly to compare the ΔG of different CC-domains. We use the negative value of ΔG to reflect binding strength. ΔG is the free energy released during the binding process - the greater the energy release, the more stable the resulting complex. Thus, a more negative ΔG indicates stronger binding affinity, allowing us to compare and rank the relative affinities of different complexes.

Table1. ΔG decomposition (kcal·mol⁻¹) of A-A' and A-B.

 ΔGgasΔGsolvΔG
A-A'-364.86299.53-65.32
A-B-223.57181.90-41.68

Table2. ΔG decomposition (kcal·mol⁻¹) of C-C', C-D, and C'-E.

 ΔGgasΔGsolvΔG
C-C'-197.78127.26-70.51
C-D-178.08114.29-63.79
C'-E-201.35153.72-47.63

The two charts indicate that A–A′ binds more strongly than A–B, and that C–C′ binds more strongly than C–D or C′–E.

Though the ΔG values are lower than the normal level for reasons which would be discussed later, the relative ranks in both groups support that the binding affinity difference is true under simulated lab conditions and can be applied to experiments.

As for the abnormally low ΔG, the factors listed below have been considered:

  • Entropy (TΔS) not considered

 We did not include the entropy contribution, which normally increases the system ΔG.

  • Sensitivity of electrostatic or solvation parameters

Analysis of the Two Lowest-ΔG CC-domains

We selected A–A′ and C–C′ as the two representative CC-domain pairs for detailed analysis, as they showed the strongest binding within their respective groups. To characterize their structural stability and binding energetics, we examined their RMSD trajectories and binding free energies from molecular dynamics (MD) simulations and MMPBSA calculations.

Below are the RMSD values of the two CC-domains as a function of time during the production MD simulation.

Fig.7 Cα RMSD of CC-domain A-A' relative to the initial structure after least-squares fitting to the Cα atoms. The red line shows the moving average smoothed over a 300-frame window.

Fig.8 Cα RMSD of CC-domain C-C' relative to the initial structure after least-squares fitting to the Cα atoms. The red line shows the moving average smoothed over a 300-frame window.

Both A–A′ and C–C′ exhibit an initial increase in RMSD within the first 25 ns, indicating structural adaptation from their starting conformations. Afterward, the moving average reaches a relatively stable plateau, suggesting that both complexes achieve equilibrium. However, A–A′ displays larger fluctuations during the final 25 ns, implying greater flexibility, while C–C′ stabilizes at a higher RMSD (~0.8 nm), consistent with a more adjusted overall conformation.

The ΔG of the equilibrated portion of the trajectories (frames 1500–5000, corresponding to 15–50 ns) was evaluated using the MMPBSA method with the normal Poisson–Boltzmann (PB) model.

Fig.9 Total ΔG of A-A', calculated by MMPBSA using the normal PB model.

Fig.10 Total ΔG of C-C', calculated by MMPBSA using the normal PB model.

Although fluctuations exist, the average ΔG values of both systems remain stable over time, confirming a converged and equilibrated region. The overall energy profiles suggest comparable binding stability, with subtle differences arising from sequence-specific interactions and conformational flexibilities.

We performed a decomposition analysis of ΔG to examine the contributions of individual residues in A–A′ and C–C′, as well as how these contributions fluctuate over time during the production MD simulation. The results are presented as heat maps below. The X-axis represents simulation time, while the Y-axis lists the residues within a 6 Å distance cutoff. The color scale denotes each residue’s binding energy contribution at a given frame, as indicated by the color bar on the left. A negative value corresponds to a favorable contribution to binding, with darker green shades indicating stronger stabilizing effects.

Fig. 11 Per-residue decomposition of ΔG calculated by MMPBSA using the normal PB model, A for peptide A', B for peptide A.

In the A–A′ complex, Arg5 and Arg12 from peptide A′, along with Asp2, Glu3, and Asp9 in peptide A, contribute significantly to interface stabilization, likely through salt bridges and electrostatic interactions.

Fig. 12 Per-residue decomposition of ΔG calculated by MMPBSA using the normal PB model, A for peptide C, B for peptide C'.

In peptide C, the key contributors to binding are Lys24, Lys26, and Lys31, whereas most other residues show negligible energetic contributions. For peptide C′, residues Glu12, Lys19, Lys24, Lys26, and Lys31 play dominant roles, suggesting that electrostatic complementarity is a primary factor stabilizing the C–C′ interface.

Overall, the CC-domains A–A′ and C–C′ exhibit stable binding according to our energy analysis, although further optimization of the sequences could be made regarding the residues that do not favor binding.

 

Conclusion

The MD approach provides mechanistic insight into the regulation of our synthetic pathway and establishes a framework for both rational design and future experimental validation. We assessed the binding affinity of CC-domains as well as the interaction between AvrRpt2 and Rin4, both of which yielded encouraging results. Although the computational tools and methods have inherent limitations and the absolute energy values should be interpreted with caution, the findings may still offer useful indications of protein conformational changes and their underlying energetic landscape.

More files and detailed instructions can be found on the igem gitLab.

Reference

  1. Abraham, M. J., Murtola, T., Schulz, R., Páll, S., Smith, J. C., Hess, B., & Lindahl, E. (2015). GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX, 1–2, 19–25.
  2. J. D., Demitri, N., Wuerges, J., & Benini, S. , Crystal structure of the type III effector protein AvrRpt2 from Erwinia amylovora, a C70 family cysteine protease (2019)
  3. Bartho, J. D., Demitri, N., Bellini, D., Flachowsky, H., Peil, A., Walsh, M. A., & Benini, S. (2019). The structure of Erwinia amylovora AvrRpt2 provides insight into protein maturation and induced resistance to fire blight by Malus × robusta 5. Journal of Structural Biology, 206(2), 233–242.
  4. Fleming, J., Magana, P., Nair, S., Tsenkov, M., Bertoni, D., Pidruchna, I., Querino Lima Afonso, M., Midlik, A., Paramval, U., Žídek, A., Laydon, A., Kovalevskiy, O., Pan, J., Cheng, J., Avsec, Ž., Bycroft, C., Wong, L. H., Last, M., Mirdita, M., Steinegger, M., … Velankar, S. (2025). AlphaFold Protein Structure Database and 3D-Beacons: New data and capabilities. Journal of Molecular Biology, 437(15), 168967.
  5. Kim, D. E., Chivian, D., & Baker, D. (2004). Protein structure prediction and analysis using the Robetta server. Nucleic Acids Research, 32, W526–W531.
  6. Honorato, R. V., Trellet, M. E., Jiménez-García, B., Schaarschmidt, J. J., Giulini, M., Reys, V., Koukos, P. I., Rodrigues, J. P. G. L. M., Karaca, E., van Zundert, G. C. P., Roel-Touris, J., van Noort, C. W., Jandová, Z., Melquiond, A. S. J., & Bonvin, A. M. J. J. (2024). The HADDOCK2.4 web server for integrative modeling of biomolecular complexes. Nature Protocols, 19, 3219–3241.
  7. Valdés-Tresanco, M. S., Valdés-Tresanco, M. E., Valiente, P. A., & Moreno, E. (2021). gmx_MMPBSA: A new tool to perform end-state free energy calculations with GROMACS. Journal of Chemical Theory and Computation, 17 (10), 6281–6291.

© 2025 - Content on this site is licensed under a Creative Commons Attribution 4.0 International license.

The repository used to create this website is available at gitlab.igem.org/2025/ucas-china.