Introduction
Computational modeling is an essential precursor to in vitro and in vivo experimentation because it provides a cost- and time-effective, scalable way to test experimental designs and validate hypotheses. Our project centers on the development of many novel parts, such as antisense oligonucleotides (ASOs) and aptamers. Computational modeling was crucial for our research: we designed and scored many ASOs and aptamers in silico to quantify their efficacy in preventing ALS before testing our most effective designs in vitro. Using our wet-lab data, we also developed a linear model that predicts ASO-based protein knockdown from sequence features, for future incorporation into this novel pipeline.
TAR DNA-binding protein 43 (TDP-43) aggregation is a pathological hallmark of amyotrophic lateral sclerosis (ALS; Suk & Rousseaux, 2020). Evidence suggests that oxidative stress and high local concentrations of TDP-43 in stress granules (SGs) are prerequisites for TDP-43 aggregation (Dewey et al., 2012). One protein responsible for recruiting these stress granules is DAZAP1, which is also known to enhance the accumulation of RNA stress granules once they have begun to aggregate (An et al., 2021). Another protein, FAM98A, contributes to stress granule stability; it localizes to stress granules and worsens aggregation (Ozeki et al., 2019).
Design Pipeline
Our ASOs were designed to knock down key proteins (abbreviated DAZ, FAM, and SND) that are implicated in stress granule formation in neurodegenerative diseases such as ALS. We adapted a computational pipeline for ASO generation (Yeo, 2024); this pipeline incorporates sequence tiling, biophysical filtering, conservation scoring, specificity mapping, and chemical modifications to assign a numerical score for each candidate ASO. Throughout the pipeline, we parameterized specific scoring and development factors to create the most effective ASOs for our target protein knockdowns.
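The sequence-tiling step can be illustrated with a minimal sketch. The function names, the step size of 1, and the choice to write candidates in the DNA alphabet are assumptions made for illustration and are not taken from the adapted pipeline; only the idea of sliding a fixed-length window along the target transcript comes from the description above.

```python
# Minimal tiling sketch. Helper names, step size, and the DNA output
# alphabet are illustrative assumptions, not the adapted pipeline's API.

# Map each RNA base of the target to its complementary DNA base.
_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def reverse_complement(rna: str) -> str:
    """Return the antisense strand (5'->3') of an RNA window."""
    return rna.translate(_COMPLEMENT)[::-1]


def tile_asos(target_rna: str, length: int = 20, step: int = 1):
    """Yield (start, candidate) for every window of `length` bases.

    `start` is the 0-based offset of the window in the target transcript.
    """
    target = target_rna.upper().replace("T", "U")
    for start in range(0, len(target) - length + 1, step):
        window = target[start:start + length]
        yield start, reverse_complement(window)


if __name__ == "__main__":
    # Toy target, not a real transcript.
    for start, aso in tile_asos("AUGCAUGCAUGCAUGCAUGCAUGC", length=20):
        print(start, aso)
```

Each candidate produced this way would then be passed to the filtering and scoring steps.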
ASO Scoring
Scoring Factors
Factor | Description |
---|---|
GC Content | Fraction of nucleotides in the sequence that are guanine (G) or cytosine (C). We altered the pipeline to accept a GC fraction in the range 0.4–0.6. Because G–C pairs form three hydrogen bonds whereas A–T pairs form only two, higher GC content corresponds to greater duplex stability and resistance to denaturation. Restricting GC content to this range filters out both weakly binding ASOs and excessively stable duplexes, which may increase the risk of off-target binding. \[ GC = \frac{\#\{G,C\}}{\text{sequence length}} \] |
Homopolymers | Long runs of the same consecutive base. We altered the pipeline to penalize “CCCCC” and “GGGGG.” Homopolymers are harder to synthesize chemically and tend to form unusual secondary structures, which can lower binding specificity, so the model penalizes sequences that contain these runs. |
Conservation | Mean phastCons score, measuring how conserved the target sequence is across species. We chose a conservation threshold of 0.9. Highly conserved sequences are more likely to be functionally important, so our model penalizes sequences that are not highly conserved (Perez et al., 2025). \[ PC = \frac{1}{L} \sum_{i=1}^{L} \text{phastCons}(i) \] where \(L\) is the length of the target region and \(\text{phastCons}(i)\) is the per-base score at position \(i\). |
Secondary Structure | Predicted accessibility of the ASO's target binding region. If the mRNA region that the ASO is complementary to is predicted to be inaccessible because of RNA folding (such as hairpin loops), the ASO will be ineffective at knocking down the target protein. ASOs complementary to inaccessible mRNA regions are therefore penalized. |
Mapping Specificity | Score reflecting how specifically the ASO binds a single genomic locus. We allowed at most 1 mismatch. Each ASO sequence is aligned to the human genome and tested for binding (Perez et al., 2025). The number of genomic sites is determined as \[ \text{mappednum} = 1 + \big(\text{number of commas in mapping string}\big) \] If the ASO maps to multiple locations in the genome, it is penalized for nonspecific binding, with the penalty increasing as specificity decreases: \[ \text{Penalty}_{\text{map}} = 100 \times (\text{mappednum} - 1) \] |
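The formulas in the table can be written out as a short sketch. The function names are hypothetical, and the mapping string is assumed to be a comma-separated list of aligned loci (the table only states that commas are counted); the 0.4–0.6 range, the penalized runs, and the penalty formula come from the table.

```python
# Sketch of the per-factor calculations. Function names and the exact
# mapping-string format are assumptions made for illustration.

def gc_content(seq: str) -> float:
    """GC = (#G + #C) / sequence length."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc_in_range(seq: str, low: float = 0.4, high: float = 0.6) -> bool:
    """True if the GC fraction lies inside the accepted window."""
    return low <= gc_content(seq) <= high


def has_homopolymer(seq: str, runs=("CCCCC", "GGGGG")) -> bool:
    """True if the sequence contains any penalized run."""
    seq = seq.upper()
    return any(run in seq for run in runs)


def mean_phastcons(scores) -> float:
    """PC = (1/L) * sum of per-base phastCons scores."""
    scores = list(scores)
    return sum(scores) / len(scores)


def mapped_num(mapping: str) -> int:
    """Number of genomic sites = 1 + number of commas."""
    return 1 + mapping.count(",")


def mapping_penalty(mapping: str) -> int:
    """Penalty_map = 100 * (mappednum - 1)."""
    return 100 * (mapped_num(mapping) - 1)
```

For example, a mapping string that lists three loci gives `mapped_num` of 3 and a penalty of 200, while a uniquely mapping ASO incurs no penalty.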

Each scoring block subtracts the same number of points from the starting score of 1000, so that every characteristic is weighted equally.
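A rough sketch of how the blocks might combine into a single score is given below. The per-block penalty of 100 points, the `Candidate` fields, and the rule that a mean phastCons score below 0.9 triggers the conservation penalty are all assumptions; only the starting score of 1000, the 0.4–0.6 GC range, and the mapping penalty formula are taken from the text.

```python
# Combined-score sketch. BLOCK_PENALTY, the Candidate fields, and the
# threshold comparisons are illustrative assumptions.
from dataclasses import dataclass

BASE_SCORE = 1000
BLOCK_PENALTY = 100  # assumed size of each equal-weight deduction


@dataclass
class Candidate:
    gc: float                # GC fraction of the ASO
    has_homopolymer: bool    # contains "CCCCC" or "GGGGG"
    mean_phastcons: float    # mean conservation of the target region
    target_accessible: bool  # predicted accessibility of the target site
    mapped_num: int          # number of genomic sites the ASO maps to


def score(c: Candidate) -> int:
    """Start from BASE_SCORE and subtract one penalty per failed block."""
    total = BASE_SCORE
    if not 0.4 <= c.gc <= 0.6:
        total -= BLOCK_PENALTY
    if c.has_homopolymer:
        total -= BLOCK_PENALTY
    if c.mean_phastcons < 0.9:
        total -= BLOCK_PENALTY
    if not c.target_accessible:
        total -= BLOCK_PENALTY
    # Mapping penalty grows with the number of extra genomic sites.
    total -= 100 * (c.mapped_num - 1)
    return total
```

Under these assumptions, a candidate that passes every block and maps to a single locus keeps the full 1000 points, and candidates can be ranked by sorting on `score`.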
Chemical Modifications
Since unmodified ASO sequences (20-base RNA sequences) are unlikely to “survive” harsh nuclear conditions, top-scoring candidate ASOs were converted into gapmers to increase nuclease resistance and binding affinity. The pipeline uses 2′-O-methoxyethyl ribose (2MOEr) and 5-methylcytosine (iMe-dC) modifications.
Modification Name | Modification Process | Purpose |
---|---|---|
2′-O-methoxyethyl ribose (2MOEr) | Alteration of 2′ ribose sugar | Increases binding strength and resistance to nucleases |
iMe-dC (5-methylcytosine substitution) | Replaces cytosine bases in “DNA gap” with methylated cytosine | Improves binding affinity and reduces immune stimulation |
We altered the pipeline to use a 5-10-5 gapmer design, which, for a 20-base ASO, modifies positions 1–5 (5′ flank) with 2MOEr, applies iMe-dC substitutions to the cytosines in positions 6–15 (DNA gap), and modifies positions 16–20 (3′ flank) with 2MOEr. This allowed us to prepare physiologically relevant ASOs for in vitro testing.
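The 5-10-5 layout can be sketched as a per-position annotation. This assumes a 20-base ASO and 1-based positions, so that the two flanks are the first and last five bases and the gap is the ten bases between them. The function name and the "DNA" label for unmodified gap bases are hypothetical; the 2MOEr and iMe-dC labels follow the table above.

```python
# Gapmer annotation sketch. The function name and the "DNA" label are
# illustrative; positions are 1-based.

def annotate_gapmer(aso: str, flank: int = 5, gap: int = 10):
    """Return (position, base, modification) for each base of the ASO."""
    if len(aso) != 2 * flank + gap:
        raise ValueError("ASO length must equal 2 * flank + gap")
    annotated = []
    for pos, base in enumerate(aso.upper(), start=1):
        if pos <= flank or pos > flank + gap:
            mod = "2MOEr"   # 5' and 3' flanks
        elif base == "C":
            mod = "iMe-dC"  # methylated cytosine inside the DNA gap
        else:
            mod = "DNA"     # unmodified gap base
        annotated.append((pos, base, mod))
    return annotated


if __name__ == "__main__":
    # Toy sequence, not one of the designed ASOs.
    for pos, base, mod in annotate_gapmer("ACGTACGTACGTACGTACGT"):
        print(pos, base, mod)
```

Keeping the flank and gap lengths as parameters makes it straightforward to try other gapmer layouts without changing the annotation logic.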
Conclusion and References
Without this model, we would have had to test hundreds of unfiltered ASOs experimentally. The model helped us focus on the most promising candidates and saved significant wet-lab time. Thus, our ASO design pipeline allowed us to develop and test the most effective ASOs for use in in vitro experimentation with limited resources. This pipeline can also be easily adapted by other scientists to target different disease genes, making it a reusable modeling framework.
References:
- Suk, T. R., & Rousseaux, M. W. C. (2020). The role of TDP-43 mislocalization in amyotrophic lateral sclerosis. Molecular Neurodegeneration, 15(45). https://doi.org/10.1186/s13024-020-00397-1
- Dewey, C. M., et al. (2012). TDP-43 aggregation in neurodegeneration: Are stress granules the key? Brain Research, 1462, 16–25. https://doi.org/10.1016/j.brainres.2012.02.032
- An, H., et al. (2021). Compositional analysis of ALS-linked stress granule-like structures reveals factors and cellular pathways dysregulated by mutant FUS under stress. bioRxiv. (Preprint.) https://doi.org/10.1101/2021.03.02.433611
- Ozeki, K., et al. (2019). FAM98A is localized to stress granules and associates with multiple stress granule-localized proteins. Molecular and Cellular Biochemistry, 451(1–2), 107–115. https://doi.org/10.1007/s11010-018-3397-6
- Yeo Lab. (n.d.). GitHub organization page. GitHub. Accessed 7 Oct. 2025. https://github.com/yeolab
- UCSC Genome Browser on Human (GRCh38/hg38), position chr19:1,435,544–1,435,764. University of California, Santa Cruz. Accessed 7 Oct. 2025. https://genome.ucsc.edu