MSA Tool — Evolutionary Structural Analysis

Secondary analytical module for exploring evolutionary and structural conservation of RNA elements

Overview

The MSA Tool is a secondary analytical module developed as part of our modelling and software framework. While our main software (see Software and Software Technicalities pages) focuses on designing and simulating functional RNA switches, this complementary tool was created to explore the evolutionary and structural conservation of naturally occurring RNA elements.

We specifically designed the MSA Tool during our attempt to use an SCR element as our Functional RNA Element (FRE) (see Model and Software pages). Because the SCR's structure was not well characterized, we needed a way to test whether it showed conserved structural motifs that could indicate a functional secondary structure.

When studying protein function, researchers often rely on structure prediction tools such as AlphaFold[1] or conservation analysis pipelines that use evolutionary information to identify key residues. However, for RNA, there is no equivalent integrated tool that allows researchers to visualize structural conservation in an evolutionary context.
This tool provided the framework to perform such an analysis, allowing us to visualize structural conservation directly on the RNA secondary structure. Ultimately, experiments designed using the insights from this tool confirmed that no important structure was conserved, helping us move forward with other FRE candidates.

Access Points

💻 GitLab Source Code (Official Repository)

All MSA Tool source code is openly available on our GitLab repository under an OSI-approved license.

🌐 Web Deployment (For Non-Experts)

A live web deployment of the MSA Tool can be accessed directly from any browser, with no installation required.

📋 Docker, Package & API Instructions (README)

Step-by-step installation guide including instructions for Docker, the Python package, and the MSA Tool API.

How It Works

The MSA Tool takes a Multiple Sequence Alignment (MSA) file and analyzes both sequence conservation and predicted RNA secondary structures, integrating evolutionary information with RNA folding predictions.

Supported formats:

  • FASTA (.fasta, .fa) and Clustal (.aln) — up to 200 MB per file
  • You can drag and drop your file, or upload it manually

A folder named "examples" is available in the tool's web interface — it includes a sample FASTA file you can try directly on the platform.

The first sequence in the alignment is considered the reference sequence, and its predicted structure is used as a baseline for comparison and visualization.
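
As a rough illustration of the input handling described above, the following sketch reads an aligned FASTA file and treats the first record as the reference. The function names (`parse_aligned_fasta`, `reference_sequence`) and the gap characters handled (`-` and `.`) are assumptions made for this example, not the tool's actual code.

```python
from typing import List, Tuple


def parse_aligned_fasta(text: str) -> List[Tuple[str, str]]:
    """Return (name, aligned_sequence) pairs in file order."""
    records: List[Tuple[str, str]] = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def reference_sequence(records: List[Tuple[str, str]]) -> Tuple[str, str]:
    """First record of the alignment, with gap characters removed for folding."""
    name, aligned = records[0]
    return name, aligned.replace("-", "").replace(".", "")


# Hypothetical two-sequence alignment, for illustration only.
example = """>ref
GGGA-AACCC
>seq2
GGGAUAA-CC
"""
records = parse_aligned_fasta(example)
ref_name, ref_seq = reference_sequence(records)
print(ref_name, ref_seq)
```

The ungapped reference sequence is what would then be passed to a folding program; the aligned form is kept for column-wise comparisons.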

Output Overview

Once the file is processed, three result tabs are generated:

1. General Information

This section provides the predicted structure and folding energy of the reference sequence, alongside a conservation map showing how conserved each base is across the alignment.

Example structure:

.((.(((.....))).)).....(((((.(((.....((((........))))....)))))))). 
          Energy: -15.00 kcal/mol
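
The structure is written in dot-bracket notation: each `(` pairs with a matching `)`, and `.` marks an unpaired base. A minimal sketch of how such a string can be converted into a list of base pairs is shown below, using a short hairpin in the same notation; the helper name `dot_bracket_pairs` is ours and is only illustrative.

```python
from typing import List, Tuple


def dot_bracket_pairs(structure: str) -> List[Tuple[int, int]]:
    """Return sorted 0-based (i, j) index pairs for every '(' ... ')' match."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# A short hairpin with an internal bulge, written in dot-bracket notation.
hairpin = ".((.(((.....))).))"
pairs = dot_bracket_pairs(hairpin)
print(pairs)
```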

The conservation map displays a score between 0 (variable) and 1 (fully conserved) for each nucleotide, visualized on the structure predicted by RNAfold[2]. High conservation across base-paired regions indicates possible structural or functional importance.
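
The exact scoring formula is not given here. One simple way to obtain a per-nucleotide score between 0 and 1 is to take, for each alignment column, the fraction of sequences that share the most common base. The sketch below makes that assumption, and also assumes gaps are written as `-` and that columns where the reference has a gap are skipped; the function name `conservation_scores` is hypothetical.

```python
from collections import Counter
from typing import List


def conservation_scores(aligned: List[str]) -> List[float]:
    """Score each reference position by the share of sequences with the modal base.

    Assumption: the first sequence is the reference, and columns where the
    reference has a gap are skipped so scores map onto reference positions.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    reference = aligned[0]
    total = len(aligned)
    scores: List[float] = []
    for col, ref_base in enumerate(reference):
        if ref_base == "-":
            continue
        # Count non-gap bases in this column; gaps lower the score implicitly.
        counts = Counter(seq[col] for seq in aligned if seq[col] != "-")
        most_common_count = counts.most_common(1)[0][1]
        scores.append(most_common_count / total)
    return scores


# Hypothetical four-sequence alignment, for illustration only.
alignment = ["GGGAAACCC", "GGGAUACCC", "GGCAAAGCC", "GGGA-ACCC"]
scores = conservation_scores(alignment)
print(scores)
```

Other definitions (for example entropy-based scores) would also fit the 0 to 1 range; this one was chosen only because it is easy to read.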

Example image legend:

  • always: present in all sequences
  • almost always conserved: absent in up to two sequences
  • others: less consensus
  • not paired (white): unpaired positions
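
The legend categories can be read as a simple rule applied to each position of the reference structure. The sketch below assumes that the categories refer to base pairs of the reference, that every structure has already been mapped to a shared coordinate system, and that "almost always" means missing from at most two sequences. The function name `classify_positions` and the `max_missing` parameter are hypothetical.

```python
from typing import List, Set, Tuple

Pair = Tuple[int, int]


def classify_positions(
    length: int,
    ref_pairs: Set[Pair],
    all_pair_sets: List[Set[Pair]],
    max_missing: int = 2,
) -> List[str]:
    """Label each reference position with a legend category."""
    labels = ["not paired"] * length
    for i, j in ref_pairs:
        # Number of sequences whose predicted structure lacks this pair.
        missing = sum(1 for pairs in all_pair_sets if (i, j) not in pairs)
        if missing == 0:
            label = "always"
        elif missing <= max_missing:
            label = "almost always"
        else:
            label = "others"
        labels[i] = labels[j] = label
    return labels


# Hypothetical pair sets for five sequences of length 9 (reference first).
ref = {(0, 8), (1, 7), (2, 6)}
pair_sets = [
    ref,
    {(0, 8), (1, 7), (2, 6)},
    {(0, 8)},
    {(0, 8), (1, 7)},
    {(0, 8), (1, 7)},
]
labels = classify_positions(9, ref, pair_sets)
print(labels)
```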

2. Clustering Analysis

This tab summarizes the structural diversity observed across the alignment — showing how many unique structures exist and how they cluster together.

Example:

Total Structures: 20
Unique Structures: 4
Cluster 0 — 9 structures
Cluster 1 — 9 structures
Cluster 2 — 1 structure
Cluster 3 — 1 structure

A low number of unique clusters, compared with the number of sequences, indicates a highly conserved overall fold, while higher diversity suggests that the sequence may not rely on a stable conserved structure for its function.
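
The clustering method used by the tool is not described here, so the following is only a sketch of how such a summary could be produced. It counts unique dot-bracket strings and groups them greedily by base-pair distance (the number of pairs present in one structure but not the other). The greedy strategy, the `max_distance` threshold, and all function names are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def pair_set(structure: str) -> Set[Tuple[int, int]]:
    """Base pairs of a balanced dot-bracket string as a set of (i, j) indices."""
    stack: List[int] = []
    pairs: Set[Tuple[int, int]] = set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def base_pair_distance(a: str, b: str) -> int:
    """Number of base pairs found in exactly one of the two structures."""
    return len(pair_set(a) ^ pair_set(b))


def greedy_clusters(structures: List[str], max_distance: int = 1) -> Dict[str, List[str]]:
    """Assign each structure to the first representative within max_distance."""
    counts = Counter(structures)
    clusters: Dict[str, List[str]] = {}
    # Visit the most frequent structures first so they become representatives.
    for structure, count in counts.most_common():
        for representative in clusters:
            if base_pair_distance(representative, structure) <= max_distance:
                clusters[representative].extend([structure] * count)
                break
        else:
            clusters[structure] = [structure] * count
    return clusters


# Hypothetical predicted structures for six aligned sequences.
structures = ["((...))"] * 3 + ["(.....)"] * 2 + ["..(.).."]
clusters = greedy_clusters(structures)
print("Total Structures:", len(structures))
print("Unique Structures:", len(set(structures)))
for index, members in enumerate(clusters.values()):
    print(f"Cluster {index}: {len(members)} structures")
```

With a threshold of 1, the two hairpins that differ by a single pair fall into one cluster, while the unrelated fold forms its own.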

3. Visualise All Structures

This section allows you to explore each predicted structure individually — including sequence, folding energy, and base-pairing pattern — for a deeper look into how mutations may influence the RNA's folding.

Significance

The MSA Tool was designed to shed light on the potential structural role of naturally occurring RNA elements, offering an evidence-based approach rather than working "in the dark."

By integrating evolutionary conservation, structure prediction, and clustering analysis, it helps determine:

  • Whether a conserved structure exists: whether the RNA element maintains consistent folding patterns across evolutionary time.
  • Which regions may be functionally important: conserved base-paired regions that likely contribute to biological function.
  • How much structural diversity occurs among homologous sequences: the range of structural variation within related sequences.

In our case, it ultimately demonstrated that the SCR element lacked significant structural conservation, leading us to conclude that it was unlikely to function as a reliable FRE. This insight was crucial in guiding the next steps of our design–build–test–learn cycle.

References

  1. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., ... & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589.
  2. Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F., & Hofacker, I. L. (2011). ViennaRNA Package 2.0. Algorithms for Molecular Biology, 6(1), 26.