MOLECULAR DYNAMICS

INTRODUCTION

Molecular dynamics (abbreviated to MD) is a computational method that simulates the physical movements of atoms and molecules over time to understand their behavior and properties. In contrast to experimental procedures, molecular dynamics offer a more flexible, safe, and cost effective way of analyzing structures of molecules and can be applied to the areas of biomedical engineering. In our project we strive to study the structure of fusion protein and assess its stability and propensity to bind to guide RNA. With the help of computational tools such as AMBER, a suite of biomolecular simulation programs, we are able to simulate a nine step relaxation of the cas13a complex with a diphtheria toxin and glycoprotein 120 and observe its structural integrity.

An additional, and rather crucial part of our project here was the analysis of protein structure even prior to our relaxation steps. This includes a comparison of generated alphafold structures to similar, existing files from the Protein Data Bank (PDB). Please refer to our team’s engineering section for changes we made based on the structure of our synthetic fusion protein.

GOALS

Our project inherently holds concerns regarding physical and structural design/integrity:

How can such a large, novel combination of 3 proteins play a role in the folding of each subunit?
How will explicit solvent, such as water and ions, impact/change our system?
Are our regions accurate? Will we receive our intended function with each protein linked together?

There are three requirements needed for an MD simulation"

An input structure
A Force Field
A Solvent

CREATING AN INPUT STRUCTURE

When creating an input structure for a simulation an input structure is a necessity. Using the plasmid sequences (from wet lab), AlphaFold Server was able to generate model structures of our fusion protein. AlphaFold Server is a web-service that can generate highly accurate biomolecular structure predictions containing proteins, DNA, RNA, ligands, ions, and also model chemical modifications for proteins and nucleic acids in one platform. As shown below is the fusion protein as generated by alphafold consisting of a three cargo system. The Cas13a, the DTA toxin, and the GP 120.

Figure 1: Fusion protein with labeled markers.

FORCE FIELD & SOLVENT PARAMETERIZATION

As mentioned above, having a force field to describe interactions between molecules in a system is an absolute must for an MD simulation. The force field chosen was ff14SB. Furthermore, the choice of an explicit water solvent model was chosen. Explicit models treat individual water molecules as distinct entities and are represented with their own atoms and interactions. For these experiments TIP3P water model was used.

In order to ascertain the stability of the fusion protein a relaxation was run. A relaxation can be thought of as a gradual adaptation to its environment and that is what is done here.

“We are simply trying to slowly adapt the system to the simulation environment, and needs to do this very carefully so that the initial structure details from experiment are not lost due to instability during the change from the (usually low-temperature, crystallographic) experimental setup to the simulation model with a force field in solution.” - Carlos Simmerling

PREPROCESSING

Topology files were generated using AMBERS tleap command and due to the choice of explicit water tleap was run twice: once to add water and determine the volume of the simulation system, and a second time after we use the volume to determine how many ions need to be added to give a specific salt concentration.

The relaxation consists of nine steps:

Energy Minimization
Heating up the system
Relaxation of the system using pressure and temperature control to adjust the density of water to experimental values
Constant pressure constant temperature relaxation for 1 ns with lower restraint force
Energy minimization of side chains
ns MD with restrained backbone, constant Temperature and Pressure
Reduce the backbone restraint force constant to slowly let go of the restraints
Further reduce the backbone restraint force constant to slowly let go of the restraints
Completely release the restraints from the protein - last step of relaxation, still constant pressure and temperature

RESULTS

It was important to analyze the energy minimization results to ensure both the maximum force and potential energy are within reasonable thresholds.

Figure 2. Potential energy minimization.

Simulations are carried out in two stages: NVT (constant temperature) and NPT (constant pressure) equilibration, in order to attain thermal and mechanical equilibrium of the system. These are steps 2 and 3 of the relaxation process. The system temperature is progressively raised to the desired 300K to achieve stable conditions, and the system's stability is confirmed by monitoring the temperature, pressure, volume, and density curves over time.

Figure 3a,b. Figure of the potential energy from the second heating step as well as the temperature.

Figure 4a,b. Volume of the system over time and the density of the system over time for step 3.

PRODUCT SIMULATION, VISUALIZATION STRUCTURE OPTIMIZATION & INSPECTION

Under the conditions of equilibrium, a long-term production simulation is conducted. In order to further analyze the intermolecular interactions and dynamic conformational changes.

Using VMD the key interactions of the protein system can be visualized through selection and rotation commands, identifying important residues and binding pockets that play significant roles during the simulation process.

Figure 5: Fusion protein surrounded by water molecules.

ANALYZING THE STABILITY OF THE CRISPR SECTION OF THE FUSION PROTEIN

As stated in literature Cas13a relies on two regions for cleavage, they are named HEPN-I and HEPN-II.

Figure 6: Higher eukaryotes and prokaryotes nucleotide-binding site for Cas13a.

Within these domains lie two pairs of amino acids each that have been known to be highly conserved as shown below. Thus the distance over time between the pairs of histidine and arginine were measured. The RMSD of both HEPN regions was also measured compared to the original structure as a reference. Root mean square deviation is a fundamental metric for measuring structural deviation during the simulation process. By calculating the deviation of structures at different time points, it is possible to assess the stability of the system.

Figure 7: The two pairs of amino acids on each HEPN domain. HEPN-I contains the 1255th and 1260th which are Histidine and Arginine respectively. The histidine and arginine pairs for HEPN-II are 1936 and 1941 (Wang et al).

Figure 8a,b. Distance over time for both pairs of amino acids.

Figure 9a,b. Root mean square deviations of each back bone.

It is observed in figure 7 that both amino acid pair distances remain stable throughout the process of the simulation. Furthermore, as shown by the RMSD graphs the values range from 0-2. Generally if the value is small (under 1 or 2 angstroms), the match is pretty good and we call these "very similar" structures. If the value is higher, such as 3 or 4, it's hard to know if that means all of the selected atoms are about 3-4 angstroms from where they are in the reference, or if it means that many of the atoms match very closely (1-2), and that other atoms deviate more strongly (4-5, or maybe more). The goal of this specific RMSD is to compare how the simulated structure changes over time with respect to the experimental structure.

Overall we can conclude that the general stability of the Cas13a is maintained once the fusion protein is created.

REFERENCES

“Amber Force Fields.” Ambermd.org, ambermd.org/AmberModels.php.
“Building Protein Systems in Explicit Solvent.” Ambermd.org, 2020, ambermd.org/tutorials/basic/tutorial7/index.php.
“Building Systems.” Ambermd.org, 2025, ambermd.org/tutorials/BuildingSystems.php.
Data, Protein. “RCSB PDB - 7DMQ: Cryo-EM Structure of LshCas13a-CrRNA-Anti-Tag RNA Complex.” Rcsb.org, 2020, www.rcsb.org/structure/7DMQ.
“Model | Tsinghua - IGEM 2024.” Igem.wiki, 2024, 2024.igem.wiki/tsinghua/model.
Pancera, M., et al. “Structure of HIV-1 Gp120 with Gp41-Interactive Region Reveals Layered Envelope Architecture and Basis of Conformational Mobility.” Proceedings of the National Academy of Sciences, vol. 107, no. 3, 28 Dec. 2009, pp. 1166–1171, https://doi.org/10.1073/pnas.0911004107.
Wang, Beibei, et al. “Structural Basis for Self-Cleavage Prevention by Tag:anti-Tag Pairing Complementarity in Type vi Cas13 CRISPR Systems.” Vol. 81, no. 5, 4 Mar. 2021, pp. 1100-1115.e5, https://doi.org/10.1016/j.molcel.2020.12.033.