Model

mascot-model

miRNA biomarkers


Aim: Select the most suitable miRNA biomarkers for a diagnostic test: those that will readily bind to detectors to produce a readout, those more specific to LTB, and those that combine to form the least crosstalk in a multiplex system.

Binding availability

To gauge how readily each miRNA biomarker could bind to a detector, e.g. a toehold switch, we decided to model their secondary structures. Just as secondary structure in DNA primers is detrimental to their binding and therefore function, we applied this logic to our miRNAs and sought to select the most suitable miRNAs based on minimal secondary structure.

To model this, we used the Utilities feature of the NUPACK web application. RNA was selected as the material, and the structure was predicted at 37°C. By inputting an miRNA sequence into the sequence box, then ticking the MFE (minimum free energy) setting above the structure box, we received predicted structures alongside base pair probabilities within these structures for each miRNA. Alternatively, a free web application that performs a similar function, providing predicted structures and base pair probabilities for DNA and RNA, is RNAfold by the University of Vienna

These results shown in Figure 1 below, suggest hsa-miR-1306-5p and hsa-miR-6529-5p were the two best candidates for biomarkers in terms of binding availability as it formed the least secondary structure. However, there were still more properties to investigate before determining the best combination of miRNA biomarkers for a multiplex system.

miRNA biomarker secondary structures
Figure 1: The secondary structures of each miRNA biomarker as generated and visualised on NUPACK. Those on the lefthand side show the least secondary structure, whilst those on the righthand side show the most secondary structure. Despite hsa-miR-1306-5p and hsa-miR-6529-5p having the same overall structure, the higher base pairing probabilities of hsa-miR-1306-5p predict hsa-miR-1306-5p to be more likely to remain in an unfolded state than hsa-miR-6529-5p. Similarly with hsa-miR-7850-5p and hsa-miR-363-5p, based on the base pairing probabilities of hsa-miR-363-5p it is more likely to stay in a folded state compared to hsa-miR-7850-5p.

Biomarker specificity

We wanted to investigate if any of these miRNA biomarkers were strongly associated with other diseases other than LTB, the detection of upregulated miRNAs not entirely specific to LTB would lead to false positives in a diagnostic test. As mentioned before, to mitigate this risk we aim to test for two different upregulated biomarkers to increase specificity to LTB.

Using MalaCards, we searched each of the four miRNA biomarkers and looked at their associated diseases other than LTB. We found that hsa-miR-7850-5p, hsa-miR-1306-5p and hsa-miR-363-5p were associated with a variety of different conditions and particularly cancers, breast cancer being a common factor between all three although with differing in the strength of the association. This shows that more research is needed to verify these biomarkers as being specific to LTB, however we are still developing an effective platform and method for confirmed biomarkers to be quickly implemented into a diagnostic test. Little information was available about hsa-miR-6529-5p, we could not find any other disease associations. For this reason we selected hsa-miR-6529-5p only as the most suitable in this category, with the information we had it was the most specific to LTB although more research is needed to confirm this.

Crosstalk in a multiplex system

To finalise the modelling for our biomarkers we wanted to look into potential crosstalk between different miRNAs. This crosstalk could lead to false positives if miRNAs are similar enough to bind and activate the same detector, or lead to false positives if they are similar enough to bind to a detector without activating it, preventing it from being activated by its designed trigger. To gain and insight into this we took two approaches. The first approach focused on crosstalk within the mechanism of the test itself, seeing which combinations of miRNAs are least likely to interact with one another. Secondly we were interested in crosstalk arising from other nucleic acid sequences that could be present alongside our target miRNAs in a sample.

First focusing on the interactions between the four LTB miRNA biomarkers, we used the analysis feature in NUPACK to simulate any binding between them through a wide temperature range (0°C - 40°C). This temperature feature was especially useful, allowing us to get an idea about how consistently our test would across a range of temperatures and especially in the climates of high-burden TB populations. We created a test tube containing all of the different miRNAs at the same concentration (1µM), and set the max complex size to 2 so that the model would only account for duplexes that could form, this was done to reduce computational load.

NUPACK analysis results showing miRNA interactions
Figure 2: On the left is a graph generated by NUPACK, showing the fraction of bases unpaired over the temperature melt range we inputted (0°C - 40°C). As temperature increases, there are fewer interactions between the miRNAs. The chart on the right shows the concentrations of individual miRNAs and miRNA duplexes formed at 20°C.

This modelling showed hsa-miR-363-5p forms the least interactions with other miRNAs, seemingly making it a good candidate for use in a multiplex system. However, based on previous modelling on the structure of hsa-miR-363-5p, this may be because of its naturally folded state that prevents it from binding to other miRNAs. If this was the case then hsa-miR-363-5p may not readily bind with a detector like a toehold switch, therefore we decided to choose other miRNAs. The miRNA that formed the second least interactions was hsa-miR-6529-5p, a miRNA with minimal secondary structure, and the miRNA it interacted the least with (excluding hsa-miR-363-5p) was hsa-miR-7850-5p.Therefore we believed these two miRNAs were the most suitable for use together in a multiplex system. However, hsa-miR-6529-5p also showed significant interaction with itself which we decided to investigate further through modelling the structure of the complex.

Structure of hsa-miR-6529-5p duplex
Figure 3: Structure of the hsa-miR-6529-5p duplex. Predicted using the utilities function in NUPACK, both sequences entered in the sequence box with a '+' between them to indicate two separate strands binding together. An alternative free software for modelling interactions is the RNAcofold webserver by the University of Vienna.

Looking at the base pair probabilities of this duplex the interaction between two hsa-miR-6529-5p molecules seems fairly stable. However, the greater prevalence of individual hsa-miR-6529-5p in the crosstalk analysis suggests that this interaction might form and separate regularly within the system. Alongside this, the majority of the duplex structure remains open and unfolded, suggesting that in the presence of a toehold switch the one of the hsa-miR-6529-5ps in the duplex can still bind to a toehold switch, separating from the second hsa-miR-6529-5p.

When modelling the interaction between hsa-miR-6529-5p and hsa-miR-7850-5p (Figure 4) we noticed the interaction seemed even more stable than the hsa-miR-6529-5p duplex. To gauge whether these structures significantly affected binding to detectors we decided to model example situations using the analysis feature, using toehold switches as the example of a detector molecule. One modelled situation included a system with a higher detector concentration than miRNA, the other having higher miRNA concentrations than detector (Figure 5)

Predicted structure of hsa-miR-6529-5p and hsa-miR-7850-5p duplex
Figure 4: Predicted structure of the hsa-miR-6529-5p and hsa-miR-7850-5p duplex. Predicted using the utilities function in NUPACK, both sequences entered in the sequence box with a '+' between them to indicate two separate strands binding together. The interaction consists of a greater number of base pairs compared to the duplex with base pair probabilities ~0.7. An alternative free software for modelling interactions is the RNAcofold webserver by the University of Vienna.
NUPACK analysis of miRNA and toehold switch interactions
Figure 5: Lefthand graph: hsa-miR-6529-5p and hsa-miR-7850-5p at 1µM, hsa-miR-6529-5p toehold switch and hsa-miR-7850-5p toehold switch at 0.5µM. Righthand graph: hsa-miR-6529-5p and hsa-miR-7850-5p at 0.5µM, hsa-miR-6529-5p toehold switch and hsa-miR-7850-5p toehold switch at 1µM. Generated by using the analysis feature of NUPACK, RNA as the material, 20°C, max complex size of 2.

In both situations the detectors became almost as saturated as possible with miRNA triggers, suggesting that these duplexes formed between the miRNAs may not significantly prevent the activation of detectors and therefore a readout. From this we decided that hsa-miR-6529-5p and hsa-miR-7850-5p are likely the best candidates to be used together in a multiplex diagnostic test, however, we aimed to test all combinations within the lab to see if lab results would reflect this modelling or show a better pair of biomarkers for detection.

Secondly, and the final step we took in analysing each of our miRNAs, was looking for other sequences within a blood exosome sample that could potentially interact with the miRNAs or miRNA complementary region on detectors able to produce false positives and negatives. To do this we used NCBI's Nucleotide BLAST web application to search for similar sequences in the human genome for every miRNA biomarker. For each miRNA, whilst other sequences had contained exact matches or close matches within them, we used GeneCards to learn more about the places these sequences can be found. Of the sequences highly similar to miRNAs and therefore detectors, potentially to a degree of creating false positives or negatives, none of them were likely to be found within the exosomes that our miRNA biomarkers could be found. From this information we concluded that none of the miRNA biomarkers had the risk of being too similar to additional sequences that could falsely activate or inhibit detectors from providing a readout.

Taking each of these modelling angles into account: The minimal secondary structures, the most specific the LTB, the minimal interactions with other biomarkers and off target molecules, we predicted hsa-miR-6529-5p and hsa-miR-7850-5p as the best combination to use in a multiplex system. This we aimed to test in the lab but unfortunately, due to time constraints, we did not get the chance to.

Take our work further:

We tried to go into as much depth as we could, but there are a number of ways future iGEM teams could optimise our work and methods even further.

One area in particular is the crosstalk modelling, where we did not have precise concentrations of detectors or triggers to base our model on. The results of our modelling are likely to change significantly when more precise concentrations are used to reflect what could actually occur within the diagnostic test, and whether interactions are inhibitory. Selecting a precise temperature to focus on based on the usage situation of the diagnostic test would also increase the accuracy of the modelled interactions. We did not get to test our prediction of the best biomarkers to use in our test, and due to limited time we may have overlooked more optimal combinations. It would be a much more thorough model if more combinations could have been tested, rather than filtering down results immediately after testing them all together in one condition.

A more in depth approach to identifying biomarker specificity and crosstalk potential could involve the use of machine learning to better expose common factor disease associations between biomarkers, and identify nucleic acid sequences present in blood exosomes able to interfere with the detectors of the diagnostic test.

Toehold switches


Aim: Understand and optimise toehold switch structure and function by modelling: secondary structures in various different states, basic thermodynamics of binding and crosstalk between toehold switch sequences and off target miRNAs.

Secondary structure is essential to the function of a toehold switch10. We needed to visualise the secondary structures of the toehold switches in both an inactive state without and miRNA trigger bound, and in an active state with the miRNA bound, opening the toehold switch structure and allowing for the expression of the reporter gene. Alongside this we did basic thermodynamic calculations to evaluate how favourable binding between the toehold switches and trigger are, and more crosstalk analysis between toehold switch sequences themselves and off target miRNA binding.

Inactive state modelling

Once again using the NUPACK utilities feature10, we predicted the secondary structure of each of our toehold switches to check if we had designed and built our sequences correctly. Initially we only modelled the toehold switch itself, not including a linker sequence or any part of the reporter gene, as many previous iGEM teams had done. While this is good for checking if the sequence folds correctly into the intended structure of a toehold switch, we later found this is not a good representation of toehold switch function. This is because it cannot take into account any off target binding that may occur between the toehold switch, linker sequence and reporter gene which may prevent a toehold switch functioning in various ways, for example, preventing a trigger from binding and activating the toehold switch. We also found modelling the entire secondary structure of the toehold switch sequence was could be computationally intensive and not helpful to gauge functionality, while it can be useful to see if the toehold switch maintains its stable structure when modelled entirely, with so many base pairing probabilities the final model is only a snapshot of many other stable conformations the sequence can adopt, and reporter gene mRNA sequences introduce a lot of secondary structure unrelated to the toehold switch making the model difficult to interpret. The most effective method of modelling we found, as recommended in literature10, was to include 50 nucleotides after the toehold switch which includes the entire linker sequence and reporter gene in a standard toehold switch structure10. This is minimally computationally intensive and gives enough sequence context to predict off target binding which could impede on function.

Toehold switch secondary structure
Figure 6: An example of one of our toehold switches which formed correct secondary structures hsa-miR-363-5p toehold switch with RFP as a reporter gene.

Using this modelling we were able to recognise off target structures that could be inhibitory to the function of the toehold switch, for example, preventing the miRNA from fully binding and activating the toehold switch. We were then able to use this modelling to design and test optimisations for our toehold switch sequences to remove/decrease off target binding. Please see our Engineering page to learn a lot more about how we used this modelling to optimise our sequences, an example is given in Figure 7 below.

Toehold switch optimization comparison
Figure 7: Lefthand side: hsa-miR-7850-5p toehold switch with RFP before adding a linker sequence, off target structure forms between the trigger binding region and the beginning of the reporter gene. Righthand side: hsa-miR-7850-5p toehold switch with RFP before adding a linker sequence, this off target secondary structure is no longer present and the toehold switch is in theory much more open to hsa-miR-7850-5p correctly binding and opening the toehold switch, leading to RFP expression.

Figure 7. Lefthand side: hsa-miR-7850-5p toehold switch with RFP before adding a linker sequence, off target structure forms between the trigger binding region and the beginning of the reporter gene. Righthand side: hsa-miR-7850-5p toehold switch with RFP before adding a linker sequence, this off target secondary structure is no longer present and the toehold switch is in theory much more open to hsa-miR-7850-5p correctly binding and opening the toehold switch, leading to RFP expression.

Toehold switch binding comparison
Figure 8: Left side: hsa-miR-363-5p toehold switch with RFP, hsa-miR-7850-5p binds completely to the toehold switch and there is minimal secondary structure around the RBS with a low base pair probability. Right side: hsa-miR-1306-5p toehold switch with RFP. Hsa-miR-7850-5p binds almost binds completely except for one nucleotide, however, the RBS is surrounded by and partially forms secondary structures of a higher base pairing probability.

We were unsure how much of an effect secondary structures sequestering the RBS would have on reporter gene expression, especially since the base pairing probabilities were often below 0.5 and therefore structure was likely variable. We aimed to compare the sensitivity between each of our toehold switches to see if varying secondary structures surrounding the RBS would have a significant impact over reporter gene translation rates.

Favourable thermodynamics

It is important that the binding between a toehold switch and trigger molecule is thermodynamically favourable and can occur spontaneously. Not only this, we want to judge whether within a system, the trigger and toehold switch will favour a bound state rather than remaining as individual molecules. As previous iGEM teams have done11, we used the following equation to calculate the free energy difference of miRNA triggers binding to toehold switches:

CFE of active Toehold switch&trigger - (CFE of inactive toehold switch + CFE of trigger)
CFE = Complex Free Energy (kcal/mol)

A negative answer confirms that the trigger favours a bound state with a toehold switch, and that this reaction can occur spontaneously. This is also shown through the crosstalk modelling for toehold switches, which we will discuss in the next section. When applying this equation to our toehold switches, we used the CFE value provided with each secondary structure prediction, an example of this being shown in Figure 9.

NUPACK properties and CFE calculation
Figure 9: Below predicted secondary structures, NUPACK provides more detailed properties of each structure under the 'Properties of complex ensemble' section. This example is of active hsa-miR-1306-5p toehold switch with RFP, having a complex energy of -57.26 kcal/mol. An alternative free method for generating values for the equation is through RNAfold and RNAcofold. With every structure prediction a MFE value is provided alongside which can be substituted into the equation. MFE or CFE values can be used for the calculation. However it has to be one or the other as each value is calculated in a different way using a different webserver, therefore they are not comparable.

Crosstalk in a multiplex system

There were two crosstalk situations we had to model: crosstalk between toehold switches and crosstalk between toehold switches and off target miRNAs.

Beginning with crosstalk between toehold switches, previous literature mentions how toehold switches are effective detector molecules that can show little crosstalk, with some toehold switch libraries having crosstalk levels <12%10. In order to gauge crosstalk between our toehold switches, we used the same method applied to miRNA crosstalk modelling, instead focusing only on toehold switches in a multiplex system. The results are shown in Figure 10.

Toehold switch crosstalk analysis
Figure 10: On the left is a graph generated by NUPACK, showing the fraction of bases unpaired over the temperature melt range we inputted (0°C - 40°C). As temperature increases, there are fewer interactions between the toehold switches. The chart on the right shows the concentrations of individual toehold switches and toehold switch duplexes formed at 20°C. The input concentrations of all four toehold switches was 1µM.

Three of our toehold switches, particularly hsa-miR-7850-5p toehold switch with RFP and hsa-miR-6529-5p toehold switch with RFP, showed little crosstalk between themselves and other toehold switches. This paired well with our previous miRNA modelling, where we predicted hsa-miR-6529-5p and hsa-miR-7850-5p as the optimal miRNA combination in a multiplex system. The minimal crosstalk between these two toehold switches would further optimise the detection system, minimising the risk of false negatives and positives.

However, these toehold switches could still be activated by unintended miRNA targets. To model this we used the analysis feature in NUPACK to model a test tube containing one toehold switch and all four miRNAs. In reality all four miRNA biomarkers will be present in a patient's blood exosome samples, therefore modelling gave us a more realistic insight into how crosstalk could impact the sensitivity and readout of the test. Figure 11 shows an example of the results.

Off-target miRNA activation analysis
Figure 11: Left graph: hsa-miR-7850-5p toehold switch with RFP at 1µM, all four miRNAs at 0.5µM. Right graph: hsa-miR-7850-5p toehold switch with RFP at 0.5µM, all four miRNAs at 1µM Generated by using the analysis feature of NUPACK, RNA as the material, 20°C, max complex size of 2.

We found in our modelling that each of our toehold switches much more favourably bind to the correct trigger miRNA over any other. In situations when the concentration of miRNA is higher than that of the toehold switch, the toehold switches are fully saturated, it is only when excess toehold switches are present that other miRNAs can bind. The next step forward was to model the interactions between these toehold switches and off target miRNA, however due to time constraints we were unable to finish this.

Take our work further:

Similar to our work with miRNAs, we did not have any precise concentrations to perform our modelling on, and this may significantly impact the results of each model.

Additionally, we did not get around to modelling the leakage of our toehold switches, something crucial to take into account as leaky expression of the reporter gene could create false positive results in the diagnostic test. Through this modelling further optimisations of the toehold switch structure could be made to reduce this leakiness, for example, increasing the length of the main stem to increase the stability of the stem and fully sequester the RBS. By increasing stability, sensitivity of the toehold switches decreases10. This is a delicate balancing act that could be used to optimise toehold switches as much as possible.

Creating a model of these cell free systems from ordinary differential equations (ODEs) would also be a step forward in implementing these detectors in a diagnostic test. This model could be used to find the best concentrations of components within the cell free system of a diagnostic test e.g. toehold switch concentration which could affect readout times.

Crispr Cas13a


Aim: Understand and design crRNAs through secondary structure modelling, miRNa-crRNA binding and crosstalk modelling, testing suitability for a multiplex test.

Secondary structure and binding

Each crRNA consists of two segments: a spacer sequence complementary to a mRNA trigger and a direct sequence that associates with Cas13a. For more details about the mechanism and design of this detection method please see our project description page and Engineering page. Using previous literature as a base for the design of our crRNAs14, we wanted to test if our designed crRNA sequences formed the same secondary structures and interactions as those used in research. Most importantly, we prioritised modelling the secondary structure of trigger miRNAs binding to their respective crRNAs, as correct binding between these two molecules is required to activate Cas13a. Without the activation of Cas13a, no readout on the lateral flow test would be possible, or in the context of a diagnostic test there would be a high risk of false negatives.

Continuing to use NUPACK to model secondary structures, we modelled both our designed crRNA sequences bound to the correct miRNAs,and an example provided in literature for comparison. We were looking to see the miRNA completely binding to the spacer sequence of the crRNA and the retention of the secondary structure formed by the direct sequence that associated with Cas13a.

crRNA binding comparison
Figure 12: The structure on the left depicts our designed crRNA for hsa-miR-7850-5p binding to hsa-miR-7850-5p. The structure on the left is a crRNA complementary and bound to a SARS-CoV-2 mRNA created by Sun H, Bu S, Wang C, Wang J, et al14.

There is very little difference between our designed crRNA sequences and the example we are trying to replicate, suggesting our sequences could function effectively as crRNAs able to activate Cas13a, producing a readout in a lateral flow test. The main difference between the two structures is the length of the spacer sequence. This is due to our miRNA biomarkers being smaller than the mRNA SARS-CoV-2 mRNA biomarker used to build the crRNA, meaning a smaller spacer sequence had to be included. We were hoping to test both the example crRNA alongside our designed crRNA to test if these differences in structure had any effect on overall function, however we did not get the chance to due to time constraints.

Crosstalk in a multiplex system

Applying the crosstalk modelling we performed with the toehold switches to the crRNAs we designed, we found crosstalk more prevalent in a multiplex system using crRNAs.

Figure 13 below shows some of the crRNAs form very few interactions with others such as hsa-miR-7850-5p crRNA and hsa-miR-363-5p crRNA, while others such as hsa-miR-6529-5p crRNA and hsa-miR-1306-5p crRNA form many interactions with other crRNAs. These interactions could prevent miRNA biomarkers from binding properly to the correct spacer sequences, consequently preventing Cas13a activation and diagnostic readout.

crRNA crosstalk analysis
Figure 13: On the left is a graph generated by NUPACK, showing the fraction of bases unpaired over the temperature melt range we inputted (0°C - 40°C). As temperature increases, there are fewer interactions between the crRNAs. The chart on the right shows the concentrations of individual crRNAs and crRNA duplexes formed at 20°C. The input concentrations of all four toehold switches was 1µM.

As we did with toehold switches, to gauge whether different miRNAs could bind to off target crRNAs and trigger a readout. As before using NUPACK, we modelled test tubes containing one toehold crRNA and all four miRNAs, with hsa-miR-7850-5p crRNA shown as an example below in Figure 14.

Off-target miRNA activation with crRNA
Figure 14: Left graph: hsa-miR-7850-5p crRNA at 1µM, all four miRNAs at 0.5µM. Right graph: hsa-miR-7850-5p crRNA at 0.5µM, all four miRNAs at 1µM Generated by using the analysis feature of NUPACK, RNA as the material, 20°C, max complex size of 2.

This modelling showed that cRNAs will bind most favourably to miRNA biomarkers they are designed to target, showing off target binding likely does not prevent target miRNAs from binding to activate Cas13a. However, particularly in the case of hsa-miR-7850-5p crRNA in Figure 14, other off target miRNAs like hsa-miR-7850-5p can bind to form stable interactions that may also trigger Cas13a and produce a readout. crRNAs may be more prone to off target binding compared to toehold switches due to their minimal secondary structure, with many exposed nucleotides that could interact with other molecules. After taking all of this modelling into account we decided that our crRNA sequences may not be a suitable replacement for toehold switches in our diagnostic test, and so we focused on the final detection method we had researched (see next section).

Take our work further

There is more modelling that we did not have time to perform for our designed crRNAs, for example, modelling the secondary structures of interactions shown through crosstalk modelling to get a better insight into how stable these complexes are, and how much they could interfere with a diagnostic test.

As mentioned previously with toehold switch modelling, using precise concentrations that would accurately reflect those we could extract from a sample and include in a diagnostic test could significantly change the results of the crosstalk modelling. The same applies for the temperatures used in modelling, tailoring these settings to the exact situations and climates the test will operate in will provide a more accurate in silico model that can inform lab experimentation.

CCDR


Aim: Use secondary structure modelling of complexes to understand and test designed DNA hairpin probes and compare this to those made in previous research.

The CCDR relies on a number of different interactions to occur in order to produce a readout: a trigger has to bind with a DNA hairpin, this bound hairpin then binds to a second DNA hairpin while the trigger detaches from the first hairpin, finally this DNA hairpin duplex produces a readout on a lateral flow test. More details about the mechanism and design of this detection method can be found on our Project description page and Engineering page. This is the last detection method we researched and we had little time to perform modelling. For this reason we took a similar approach to crRNA modelling, focusing only on secondary structures of each component and the interactions they form, comparing this to a modelled example from previous literature15.

Secondary structure and interactions

Using the same method we did for modelling the crRNAs we compared the secondary structures and interactions of our designed DNA hairpins compared to those from the research paper we were using as an example15. Figure 15 shows a set of hairpins we designed for hsa-miR-7850-5p compared to those from our reference paper15.

Our DNA hairpins for hsa-miR-7850-5p (DNA3 and DNA4) Control DNA hairpins for miRNA-223 (DNA1 and DNA2 - as stated in the paper)
Primary DNA hairpin DNA3 DNA1 Control
Primary DNA hairpin + miRNA DNA 3 + miRNA 2 miRNA-223 + DNA 1 Control
Secondary DNA hairpin DNA 4 DNA 2 Control
Secondary DNA hairpin + primary DNA hairpin DNA 4 + DNA 3 Control DNA 2 + DNA 1

Table 1: Comparison of DNA3 and DNA4 with the positive controls (DNA1 and DNA2 - as listed from the literature15)

This modelling shows our designed sequences to be nearly identical to those we were comparing them to, and in theory they should be comparably functional. In the complex between DNA 4 and DNA 3 in the figure above we noticed there was a smaller stem compared to the control structure of DNA 2 + DNA 1. We hypothesised these additional unpaired nucleotides may make binding to one of the detector proteins (e.g. Biotin) less specific, other miRNAs and molecules may be able to bind, preventing a detector protein from binding and ultimately preventing a readout from being produced. Further modelling and testing in the lab is needed to investigate this further.

Take our work further

A lot more modelling could be performed on these parts, for example, crosstalk in a multiplex system as we had done for previous parts. Creating a set of ODEs to model the behaviour of each of these components in a cell-free system would be beneficial for optimising the properties, such as concentrations, of the test in silico to save resources for wet lab experiments.

We were unable to test the function and performance of these sequences in the lab. Laboratory experiments looking at the sensitivity of these parts in a detection system, how long it takes to produce a readout, background signal and crosstalk in a multiplex system would well characterise these parts and their efficacy in a diagnostic test for LTB.

Epidemiological model


To better understand the scale of TB in the Dominican Republic and further investigate the benefit an effective LTB diagnostic could have, we decided to construct an epidemiological model based upon recent TB data from the Dominican Republic. This page focuses on the construction and assumptions behind the model, please see our Human Practices | KCL-UK - iGEM 2025 for analysis of the model.

We created our epidemiological model using python code. Throughout this page we will talk through what the code does, and what future iGEM teams can do to improve it further and apply this code to their own epidemiology studies. The script can be accessed here: DR Epidemiological Model Plain Text Script.txt

Base

This epidemiological model is based on the SEIR-D format. Rather than using the simplistic SIR format (susceptible, infectious, recovered), it was important we included the 'Exposed' category to distinguish between individuals with LTB and those with ATB. As we are trying to estimate the scale and burden of TB in the Dominican Republic, we also included the 'Dead' category to show individuals who have passed away due to a TB, when their infection could have been cured.

Building the model

To create this model, we calculated starting populations and parameters (except the β parameter, determining the transmission rate) from the most recent TB data we could find from the Dominican Republic, this came from the WHO's Global Tuberculosis report in 20245. The most recent years covered are 2022 and 2023. To make sure our model accurately reflects real TB trends in the Dominican Republic, we calculated all of our starting populations and parameters from 2022 data and designed the model to predict ending populations after one year (2022 to 2023). We then gradually increased the β parameter until the ending populations matched the actual data from 2023 as closely as possible, calibrating our β value to the recent TB trends in the Dominican Republic. To see how we calculated each of our populations and parameters, please see the comments in the script of this model.

After establishing a trend based on real-world data, we extended the simulation to 10 years to predict the scale and impact of TB in the longer-term. We initially wanted to show this data in the format of a graph, where increases or plateaus in the distribution of TB could be clearly seen. The script produces a graph of the model, shown in Figure 15.

SEIR-D epidemiological model graph
Figure 15: The graph of the SEIR-D epidemiological model, predicting the change in distribution of TB across a 10 year period.

We realised due to the large differences between population numbers, a graph would not be the best way to represent the results as the infectious, recovered and dead populations are dwarfed in comparison to the susceptible and exposed populations. The limited resolution of the graph showed very little change in each of the populations that was not very interpretable, and therefore did not convey any kind of scale or prediction of TB distribution.

To improve the resolution of our model, we added code that printed the populations at the end of each year within the 10 year period and built a table from this data (Table 2). From this table the trends and scale of TB is much clearer to follow. While the infectious, recovered and dead populations are a very small fraction of the whole population, this model clearly shows many people are still suffering with TB, and while not increasing drastically, is not predicted to improve from this model.

SEIR-D Epidemiological Model of Tuberculosis in the Dominican Republic

Year Susceptible population (4 s.f.) Exposed population (latent TB, 4 s.f.) Infectious population (Active TB) Recovered population Dead population
Year 0 (2022) 8.065e+6 2.690e+6 4620 0 0
Year 1 8.061e+6 2.690e+6 4683 4102 466
Year 2 8.056e+6 2.690e+6 4707 8240 936
Year 3 8.051e+6 2.690e+6 4716 12391 1408
Year 4 8.046e+6 2.690e+6 4720 16547 1880
Year 5 8.042e+6 2.691e+6 4721 20705 2353
Year 6 8.037e+6 2.691e+6 4722 24864 2826
Year 7 8.032e+6 2.691e+6 4722 29024 3298
Year 8 8.028e+6 2.691e+6 4722 33184 3771
Year 9 8.023e+6 2.691e+6 4723 37344 4244
Year 10 8.018e+6 2.691e+6 4723 41462 4712

Table 2. Showing the trend of TB distribution in the Dominican Republic predicted by the epidemiological model over a period of 10 years.

Assumptions

While this model can still provide useful insights into TB trends and progression, it is a drastic over simplification of an extremely complex global health crisis. These assumptions tie in closely with our human practices research. This model assumes:

  • The total number of individuals in the model does not increase or decrease, birth rates and death rates not related to TB are not taken into account
  • Everyone exposed had the same probability of transitioning from a latent infection to an active infection every year. In reality certain individuals face a far greater risk of developing ATB than others, and risk of developing ATB does not remain static in a person's lifetime17.
  • Everyone exposed who contracted ATB was diagnosed. Not all individuals with TB are diagnosed, the WHO produces two values for the incidence rates of TB, one based on the actual number of diagnosis, and another higher estimate taking into account predicted numbers of individuals with ATB who are not diagnosed18.
  • Everyone infectious who was diagnosed with ATB received treatment. The WHO had predicted that in 2021, 24.29% of ATB treatments were lost to follow-up16, meaning they did not continue with treatment until it was successful. Those who are infectious but not diagnosed are also unlikely to receive treatment.
  • Everyone infectious was treated for the same strain of TB. Different strains of TB have varying antibiotic resistances and this leads to treatments of varying lengths and success rates16.
  • Everyone infectious had the same probability of recovering with successful treatment every year. Treatment success is impacted by a large range of factors such as TB strain and comorbidities. Treatment also can take up to 3-6 months, with different individuals starting treatment at different points in the year17, 19.
  • Everyone recovered could no longer be reinfected or experience a relapse in their infection. TB reinfection and relapse does occur and often come with even lower success rates with treatments16.
  • Everyone infectious has the same chance of dying from TB each year. As with ATB contraction and recovery, this changes with comorbidities, living conditions and a large range of other factors17.

Take our work further:

There is a large range of assumptions innate in our ordinary differential equation based mode, future iGEM teams could significantly improve the accuracy of the model by improving detail of the current parameters, such as not all individuals who are infectious getting diagnosed or completing treatment.

Future iGEM teams could also implement interventions in their models to show how an intervention might change TB trends compared to trends without the intervention. A tool we found for this, but did not get the chance to try in time, is PyRoss20. This python library facilitates the creation of more detailed epidemiological models, taking into account many different features such as differences in age. It also enables the implementation of non-pharmaceutical interventions in a model, for example, a diagnostic test for LTB.

References


  1. Fornace ME, Huang J, Newman CT, Porubsky NJ, Pierce MB, Pierce NA. NUPACK: Analysis and Design of Nucleic Acid Structures, Devices, and Systems. ChemRxiv. 2022 Nov 11;(Version 1).
  2. Zadeh JN, Steenberg CD, Bois JS, Wolfe BR, Pierce MB, Khan AR, et al. NUPACK: Analysis and design of nucleic acid systems. Journal of Computational Chemistry. 2010 Nov 17;32(1):170–3.
  3. RNAfold web server [Internet]. rna.tbi.univie.ac.at. [cited 2025 Oct 7]. Available from: http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi
  4. MalaCards - human disease database [Internet]. Malacards.org. 2017 [cited 2025 Oct 7]. Available from: https://www.malacards.org/
  5. NUPACK [Internet]. Nupack.org. 2024 [cited 2025 Oct 7]. Available from: https://nupack.org/
  6. RNAcofold web server [Internet]. Univie.ac.at. 2025 [cited 2025 Oct 7]. Available from: http://rna.tbi.univie.ac.at//cgi-bin/RNAWebSuite/RNAcofold.cgi
  7. Nucleotide BLAST: Search nucleotide databases using a nucleotide query [Internet]. Nih.gov. 2006 [cited 2025 Oct 7]. Available from: https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch
  8. GeneCards [Internet]. Genecards.org. 2019 [cited 2025 Oct 7]. Available from: https://www.genecards.org/
  9. Green Alexander A, Silver Pamela A, Collins James J, Yin P. Toehold Switches: De-Novo-Designed Regulators of Gene Expression. Cell. 2014 Nov;159(4):925–39.
  10. Team:ULaval/Model - 2019.igem.org [Internet]. Igem.org. 2019 [cited 2025 Oct 7]. Available from: https://2019.igem.org/Team:ULaval/Model
  11. McSweeney MA, Zhang Y, Styczynski MP. Short Activators and Repressors of RNA Toehold Switches. ACS Synthetic Biology. 2023 Feb 21;12(3):681–8.
  12. Sun H, Bu S, Wang C, Wang J, Gao Y, Xu M, et al. A novel CRISPR/Cas13a biosensing platform comprising dual hairpin probe and traditional lateral flow assays. Sensors and Actuators B: Chemical [Internet]. 2024 Oct 9 [cited 2025 Oct 7];423:136752. Available from: https://www.sciencedirect.com/science/article/abs/pii/S0925400524014825
  13. Zhou P, Lu F, Pan W, Yin J, Li N, Tang B. Cyclic chain displacement amplification-based dual-miRNA detection: a triple-line lateral flow strip for the diagnosis of lung cancer. Chemical Communications [Internet]. 2021 Jan 1 [cited 2025 Oct 7];57(92):12301–4. Available from: https://pubs.rsc.org/en/content/articlelanding/2021/cc/d1cc05442b
  14. World Health Organization. Global Tuberculosis Report 2024 [Internet]. Who.int. 2024 [cited 2025 Oct 7]. Available from: https://www.who.int/teams/global-programme-on-tuberculosis-and-lung-health/tb-reports/global-tuberculosis-report-2024
  15. World Health Organization. Tuberculosis and Vulnerable Populations [Internet]. www.who.int. 2017 [cited 2025 Oct 7]. Available from: https://www.who.int/europe/news-room/fact-sheets/item/tuberculosis-and-vulnerable-populations
  16. TB profile [Internet]. Shinyapps.io. 2025 [cited 2025 Oct 7]. Available from: https://worldhealthorg.shinyapps.io/tb_profiles/?_inputs_&tab=%22tables%22&lan=%22EN%22&iso3=%22DOM%22&entity_type=%22country%22
  17. NHS. Tuberculosis (TB) [Internet]. NHS. NHS; 2023 [cited 2025 Oct 7]. Available from: https://www.nhs.uk/conditions/tuberculosis-tb/
  18. PyRoss API — PyRoss 1.0.0 documentation [Internet]. Readthedocs.io. 2020 [cited 2025 Oct 7]. Available from: https://pyross.readthedocs.io/en/latest/