Overview

Motivation

Our project was born from a dual motivation: a pressing global health challenge and a complex molecular engineering puzzle. Sexually Transmitted Infections (STIs) are a growing concern worldwide, and the most powerful tool in preventing their spread is frequent, accessible, and reliable testing. However, the journey to create such a test is fraught with difficulty. Engineering a robust, low-cost diagnostic requires orchestrating a complex network of biochemical reactions, where unforeseen interactions can lead to system failure. This reality, that a purely experimental approach would be slow, costly, and blind to many potential pitfalls, motivated us to adopt a model-driven engineering philosophy from the project's inception. We aimed to use computation not just to analyze our results, but to guide our every step.

Our Core Questions

Our entire modeling effort was designed to answer a set of fundamental questions:

Prediction: Can we computationally predict the behaviour and limiting factors of our complex biochemical system before committing to extensive lab work?
Validation: Can our model's predictions be anchored in real-world experimental data to prove its reliability?
Impact: What is the potential public health impact of successfully deploying our diagnostic, and can we quantify this to justify our design choices?
Viability: Looking toward a real-world product, what factors will affect the long-term stability and shelf-life of our test, and can we create a framework to predict them?

Strategy

To answer these questions, we devised a comprehensive, multi-scale strategy, approaching the problem from two directions simultaneously. Our "bottom-up" strategy focused on creating a detailed Reaction Model based on the mechanistic principles of biochemistry. By building this model in modular units, we could understand the molecular machinery of our test from first principles. Our "top-down" strategy involved developing an Epidemiological Model to understand the human-scale environment in which our test would operate. This dual strategy ensured our work was always connected, linking the performance of molecules in a test tube to the health of individuals in a community.

In a Nutshell

The Reaction Model simulates our core RPA-CRISPR system to predict performance, identify bottlenecks, and optimize the system's design.

The Enzyme Framework designed to forecast the long-term shelf-life of our diagnostic by modeling enzyme degradation over time, addressing a critical question for real-world deployment.

The Epidemiological Model simulates the spread of STIs in a population to quantify the public health impact of introducing a new, accessible diagnostic like ours.

The Asymmetrical RPA Framework was initiated to support our alternative DNA hybridization pathway, providing a blueprint for future optimization and comparison.

Our Journey

From Prediction to Validation and Design

The story of our modeling is a journey from uncertainty to a project-saving design. We began by tackling the "black box" problem of proprietary RPA kits. Using our model to perform a sensitivity analysis, we discovered that the system's performance was too sensitive to the concentration of SSB, providing our first critical insight.

The most crucial step was experimental validation. Our lab team's benchmark experiment yielded a result of 405 nM of amplified DNA, a value that fell squarely within the plausible range our model had predicted. This successful validation was the pivotal moment, giving us the confidence to trust our model's predictions.

That trust was essential when the integrated model made its most important finding: a "cis-cleavage deadlock", a real failure mode where the CRISPR enzyme would destroy the DNA faster than it could be amplified. The model didn't just find the problem - it gave us the solution. A temporal separation of the reactions would resolve the issue, leading us directly to propose a tangible "two-chamber" hardware design. This journey - from navigating uncertainty to experimental validation and a project-saving design -is the core of our model's success story.

Conclusion

Goals Achieved

Our modeling framework successfully answered the core questions we set out to investigate.

We demonstrated that a model can predict critical system failures before they happen in the lab.
We successfully validated our model's core engine against experimental data, establishing its credibility.
We quantified the potential public health impact of our test, providing a powerful justification for our work.
Finally, we created a suite of foundational frameworks that address everything from long-term stability to alternative detection methods.

Future Outlooks

The models we have built are dynamic tools for future innovation. We plan to use them to create a quantitative diagnostic tool, allowing a smartphone to estimate viral or bacterial load from the signal speed and intensity. Ultimately, we see our work as a step toward a global pathogen surveillance network, where data from low-cost tests like ours feeds into epidemiological models to predict and prevent future outbreaks.

Methods & Software

Our kinetic models were developed using a system of ordinary differential equations (ODEs) solved with the robust LSODA solver, implemented in Python with the SciPy and Matplotlib libraries. Our epidemiological model was developed as an agent-based model. Full details, equations, and parameter justifications can be found on the respective model pages.

Contribution

All models, the complete codebase, and simulation data are provided as open-source tools. We particularly offer our Enzyme Stability and Asymmetrical RPA frameworks as well-documented, foundational starting points for any future teams tackling the challenges of diagnostic shelf-life or alternative detection methods. You can access everything in our Git repository.

Reaction Model

Goal

The core of our project is a complex biochemical system involving multiple, interacting enzymatic reactions. To navigate this complexity and engineer a robust diagnostic, we developed a detailed mechanistic model of our entire reaction, from DNA amplification to signal generation. This "digital twin," built upon a system of Ordinary Differential Equations (ODEs), served three primary goals:

To predict the system's dynamic behavior and identify potential failure modes before they were encountered in the lab.
To guide our experimental design by identifying the most sensitive and critical parameters for optimization.
To validate our final design by anchoring our simulations in real-world experimental data.

Our model was built in a modular fashion, starting with individual units for Recombinase Polymerase Amplification (RPA) and CRISPR-Cas12a, which were later integrated into a single, comprehensive system. For more information about these methods, please visit our Project Description Page.

Modules

To simulate our complex diagnostic test, we built our model from a set of distinct but interconnected modules. Each module represents a core biological process, which allowed us to develop and test them individually before combining them into a fully integrated system.

RPA Engine

Our model describes the generation of symmetrical RPA products (amplicons) using a system of ODEs derived from eight main chemical reactions that take place during the RPA process. We assumed identical reaction rates for the forward and reverse primers. Initially, we developed a mathematical model to simulate the RPA process on its own.

Design and Setup

The RPA process is based on the following reactions:

$$ m\,G + F \;\overset{k_{1f}/m}{\underset{k_{1r}/m}{\rightleftharpoons}}\; FG \tag{1} $$

$$ R + FG \;\overset{k_{\mathrm{eq2af}}}{\underset{k_{\mathrm{eq2ar}}/n}{\rightleftharpoons}}\; FGR_{\!p} \tag{2a} $$

$$ FGR_{\!p} \;\overset{k_{2b}}{\longrightarrow}\; FGR + G \tag{2b} $$

$$ FGR + (n-1)\,R_{\!p} \;\overset{k_{\mathrm{eq2cf}}/(n-1)}{\underset{k_{\mathrm{eq2cr}}/(n-1)}{\rightleftharpoons}}\; FGnR_{\!p} \tag{2c} $$

$$ FGnR_{\!p} \;\overset{k_{2d}}{\longrightarrow}\; FnR + (m-1)\,G \tag{2d} $$

$$ FnR + D + 2\,m\,G \;\overset{k_{3f}/n}{\underset{k_{3r}/n}{\rightleftharpoons}}\; FnRD \tag{3} $$

$$ FnRD + n\,ATP \;\overset{k_{4a}^{\mathrm{cat}}/n}{\longrightarrow}\; FD + n\,R + n\,AMP + n\,PP_i \tag{4a} $$

$$ FnRD + n\,ATP \;\overset{k_{4b}^{\mathrm{cat}}/n}{\longrightarrow}\; FD + n\,ADP + H_{3}PO_{4} \tag{4b} $$

$$ FD + P \;\overset{k_{5f}}{\underset{k_{5r}}{\rightleftharpoons}}\; PFD \tag{5} $$

$$ PFD + 2\,B\,dNTP \;\overset{k_{6f}}{\longrightarrow}\; P + 2\,B\,PP_i + 2\,DNA \tag{6} $$

$$ FD + n_{p}\,PP_i \;\overset{k_{7}}{\longrightarrow}\; D + n_{p}\,dNTP \tag{7} $$

$$ D + B\,PP_i \;\overset{k_{8}}{\longrightarrow}\; B\,dNTP + FD \tag{8} $$

From these reactions, we derived a set of ODEs that describe the temporal changes in the concentrations of all involved species:

$$ \begin{aligned} \frac{dR}{dt} &= -2\,k_{\mathrm{eq2af}}\,R\,(n\,FG) +2\,k_{\mathrm{eq2ar}}\,FGR_p -2\,k_{\mathrm{eq2cf}}\,R\,FGR_p +2\,k_{\mathrm{eq2cr}}\,FGnR_p \\ & \qquad +2\,k_{4a}^{\mathrm{cat}}\,FnRD +2\,k_{4b}^{\mathrm{cat}}\,FnRD \end{aligned} \tag{1} $$

$$ \frac{dFG}{dt} = \frac{1}{m}k_{1f}GF - \frac{1}{m}k_{1r}FG - k_{\mathrm{eq2af}}RnFG +\frac{1}{n}k_{\mathrm{eq2ar}}FGR_p \tag{2} $$

$$ \frac{dFGR_p}{dt} = k_{\mathrm{eq2af}}RnFG -\frac{1}{n}k_{\mathrm{eq2ar}}FGR_p - k_{2b}FGR_p \tag{3} $$

$$ \frac{dFGR}{dt} = k_{2b}FGR_p -\frac{1}{n-1}k_{\mathrm{eq2cf}}RFGR +\frac{1}{n-1}k_{\mathrm{eq2cr}}FGnR_p \tag{4} $$

$$ \frac{dFGnR_p}{dt} = \frac{1}{n-1}k_{\mathrm{eq2cf}}RFGR -\frac{1}{n-1}k_{\mathrm{eq2cr}}FGnR_p - k_{2d}FGnR_p \tag{5} $$

$$ \frac{dFnR}{dt} = k_{2d}FGnR_p +\frac{1}{n}k_{3r}FnRD -\frac{1}{n}k_{3f}FnR\,D \tag{6} $$

$$ \frac{dFnRD}{dt} = \frac{1}{n}k_{3f}FnR\,D -\frac{1}{n}k_{3r}FnRD -\frac{1}{n}k_{4a}^{\mathrm{cat}}FnRD -\frac{1}{n}k_{4b}^{\mathrm{cat}}FnRD \tag{7} $$

\begin{align} \frac{dD}{dt} &= -\frac{1}{n} k_{3f} F n R \, D + \frac{1}{n} k_{3r} F n R D + 2 k_{6f} P F D \notag \\ % <-- continue on next line, no number &\quad + 2 k_7 P P_i F D - 2 k_8 P P_i D - k_{\mathrm{cat}}^{\mathrm{cis}} \, \mathrm{Casactive} \tag{8} \end{align}

$$ \frac{dFD}{dt} = \frac{1}{n}k_{4a}^{\mathrm{cat}}FnRD +\frac{1}{n}k_{4b}^{\mathrm{cat}}FnRD - k_{5f}PFD + k_{5r}PFD - k_7PP_iFD + k_8PP_iD \tag{9} $$

$$ \frac{dP}{dt} = - k_{5f}PFD + k_{5r}PFD + k_{6f}PFD \tag{10} $$

$$ \frac{dPFD}{dt} = k_{5f}PFD - k_{5r}PFD - k_{6f}PFD \tag{11} $$

$$ \frac{d(\mathrm{dNTPs})}{dt} = -2Bk_{6f}PFD + 2Bk_8PP_iD + 2n_pk_7PP_iFD \tag{12} $$

$$ \frac{dPP_i}{dt} = 2k_{4a}^{\mathrm{cat}}FnRD +2Bk_{6f}PFD -2Bk_8PP_iD -2n_pk_7PP_iFD \tag{13} $$

$$ \frac{dF}{dt} = -\frac{1}{m}k_{1f}GF +\frac{1}{m}k_{1r}FG \tag{14} $$

$$ \frac{dG}{dt} = -\frac{1}{m}k_{1f}GF +\frac{1}{m}k_{1r}FG \tag{15} $$

RPA Parameters

The RPA parameters were taken from experimental studies [1] at a reaction temperature of 39 °C. The referenced study reports experiments performed under continuous stirring conditions. Since our experimental setup does not involve stirring, we needed to adjust the parameters for reactions 2a, 2b, and 3, as these are diffusion-dependent. In the absence of stirring, a reduced amplification rate is expected.

To account for this difference, we applied the correction factor (Sf) described in [1], leading to the following adjusted parameters:

$$ \mathrm{Sf} = 8.05 \times 10^{-5} \times \mathrm{RPM} + 0.347 $$

$ k_{\mathrm{eq2af}} \rightarrow \mathrm{Sf} \cdot k_{\mathrm{eq2af}} $
$ k_{\mathrm{eq2cf}} \rightarrow \mathrm{Sf} \cdot k_{\mathrm{eq2cf}} $
$ k_{3\mathrm{f}} \rightarrow \mathrm{Sf} \cdot k_{3\mathrm{f}} $

Because our experiments were carried out without stirring (RPM = 0), the respective reaction rate constants were scaled accordingly.

Parameter	Definition	Value	Unit	Reference
$ B $	Number of base pairs in template	397	--	Experimental design
$ n $	Number of R-sites on primer	8	--	Model assumption*
$ m $	Number of GP32 binding sites on primer	4	--	Model assumption*
$ n_{\mathrm{p}} $	Number of base pairs in primer	32	--	Experimental design
$ k_{1\mathrm{f}} $	Forward rate constant for Eq. (1)	5×10⁸	$ \mathrm{M^{-1}} $	(1)
$ k_{1\mathrm{r}} $	Reverse rate constant for Eq. (1)	5×10³	$ \mathrm{s^{-1}} $	(1)
$ k_{\mathrm{eq2af}} $	Forward rate constant for Eq. (2a)	3.47×10⁷	$ \mathrm{M^{-1}s^{-1}} $	(1)
$ k_{\mathrm{eq2ar}} $	Reverse rate constant for Eq. (2a)	1.471	$ \mathrm{s^{-1}} $	(1)
$ k_{2\mathrm{b}} $	Rate constant for Eq. (2b)	4.7×10⁻²	$ \mathrm{s^{-1}} $	(1)
$ k_{\mathrm{eq2cf}} $	Forward rate constant for Eq. (2c)	3.47×10⁷	$ \mathrm{M^{-1}s^{-1}} $	(1)
$ k_{\mathrm{eq2cr}} $	Reverse rate constant for Eq. (2c)	331	$ \mathrm{s^{-1}} $	(1)
$ k_{2\mathrm{d}} $	Rate constant for Eq. (2d)	4.6×10⁻³	$ \mathrm{s^{-1}} $	(1)
$ k_{3\mathrm{f}} $	Forward rate constant for Eq. (3)	3.47×10⁷	$ \mathrm{M^{-1}s^{-1}} $	(1)
$ k_{3\mathrm{r}} $	Reverse rate constant for Eq. (3)	59.37	$ \mathrm{s^{-1}} $	(1)
$ k_{4\mathrm{a}}^{\mathrm{cat}} $	Forward rate constant for Eq. (4a)	4.22	$ \mathrm{s^{-1}} $	(1)
$ k_{4\mathrm{b}}^{\mathrm{cat}} $	Forward rate constant for Eq. (4b)	8.32	$ \mathrm{s^{-1}} $	(1)
$ k_{5\mathrm{f}} $	Forward rate constant for Eq. (5)	1.2×10⁷	$ \mathrm{M^{-1}s^{-1}} $	(1)
$ k_{5\mathrm{r}} $	Reverse rate constant for Eq. (5)	6.0×10⁻²	$ \mathrm{s^{-1}} $	(1)
$ k_{6\mathrm{f}} $	Forward rate constant for Eq. (6)	87	$ \mathrm{s^{-1}} $	(1)
$ k_{7} $	Forward rate constant for Eq. (7)	14.02	$ \mathrm{M^{-1}s^{-1}} $	(1)
$ k_{8} $	Forward rate constant for Eq. (8)	1.13	$ \mathrm{M^{-1}s^{-1}} $	(1)

* The parameters $ n $ and $ m $ represent the number of R-sites and the number of GP32 binding sites per primer, respectively. Since no direct experimental measurements are available, $ n = 8 $ and $ m = 4 $ were chosen as model assumptions to reflect a typical primer architecture and to simplify the kinetic scheme.

These are the variables that were used in our RPA model:

Variable	Description
$ F $	Forward primer
$ G $	Single stranded DNA binding protein (Gp32)
$ R $	Recombinase complex
$ FG $	Forward primer/Gp32 complex
$ FGR_p $	Unstable Forward primer/Gp32/recombinase complex
$ FGR $	Stable Forward primer/Gp32/recombinase complex
$ FGnR_p $	Forward primer/Gp32 with unstable $ n $ recombinase
$ FnR $	Forward primer complexed with unstable filament of $ n $ recombinase molecules
$ FnRD $	Forward primer complexed with $ n $ recombinase molecules with DNA
$ D $	Template DNA (the species we want to track for amplification)
$ FD $	Forward primer/DNA complex
$ P $	Polymerase
$ PFD $	Polymerase/forward primer/DNA complex
$ dNTPs $	Deoxynucleotides
$ PP_i $	Pyrophosphate (byproduct of polymerization)

Table 2: Overview of the variables used in the RPA model and their corresponding descriptions.

Sensitivity analysis for unknown parameters

To generate data that reflect our experimental laboratory setup, it was important for us to use the same concentrations in the model as in the actual experiments. A key challenge in using commercial diagnostic kits is that component concentrations are often proprietary. To overcome this uncertainty, we strategically used our model to perform a comprehensive sensitivity analysis. By simulating a wide range of concentrations for unknown components like the Single-Strand Binding Protein (SSB), polymerase, and recombinase, we could identify which factors had the most significant impact on performance. This allowed us to de-risk our design by understanding its potential failure points.

The primer concentration was predetermined and set to 10 µM, matching the conditions used in our laboratory experiments. In contrast, the concentrations of the single-stranded DNA-binding protein (SSB; represented by variable G in the model), the polymerase (P), and the recombinase (R) were not specified by the manufacturer.

The following graphs illustrate how variations in these individual concentrations affect both the final DNA concentration produced within 30 minutes and the overall reaction rate.

Agarose Gel — **Figure 2**: RPA spectrum for different initial P (polymerase) concentrations.

The numerical impact of these variations can be summarized as follows:

Species	Min DNA (nM)	Max DNA (nM)
$ G $	44.963	2364.463
$ P $	436.662	449.655
$ R $	447.942	447.942

Table 3: Minimum and maximum DNA concentrations obtained for different initial species concentrations.

This shows that the model is most sensitive to changes in the concentration of SSB, while the other two unknown concentrations play a comparatively minor role. A similar effect has also been reported in experimental studies [4].

To gain a deeper understanding of our system, we additionally varied the concentrations of other components such as dNTPs, DNA, and the primers.

The numerical impact of these additional variations can be summarized as follows:

Species	Min DNA (nM)	Max DNA (nM)
$ F $	447.941	447.943
$ D $	447.941	447.943
$ \mathrm{dNTP} $	447.942	447.943

Table 4: Minimum and maximum DNA concentrations obtained for different initial species concentrations.

It can be observed that, within the tested range of DNA, dNTP and primer concentration, no significant change in the final product occurs. This underlines the robustness and efficiency of the RPA process. A more detailed analysis of varying DNA concentrations will be presented in the Limit of Detections (LOD) analysis.

Concentration Profiles of Multiple Components

In this part, we show the concentration profiles of various components over the course of the RPA reaction. To avoid presenting an excessive number of graphs, we selected one representative set of initial concentrations for the following simulations. This ensures better comparability between the graphs. The same set of initial concentrations is used throughout the remainder of this section.

The initial concentrations used at the start of the simulation were as follows:

Species	Initial concentration [M]
$ F $	1.00 × 10⁻⁵
$ G $	4.50 × 10⁻⁷
$ R $	5.90 × 10⁻⁶
$ FG $	0
$ FGR_p $	0
$ FGR $	0
$ FGnR_p $	0
$ FnR $	0
$ FnRD $	0
$ D $	3.80 × 10⁻¹³
$ FD $	0
$ P $	1.50 × 10⁻⁸
$ PFD $	0
$ \mathrm{dNTPs} $	2.40 × 10⁻⁶

Table 5: Initial concentrations of all species used in the RPA simulation.

CRISPR-Cas12a Detection System

One of our detection methods is the CRISPR-Cas12 approach using the LbCas12a enzyme. The model realistically simulates both the generation of DNA copies and the resulting signal. The colorimetric signal is produced when a fluorescently labeled reporter is cleaved. This cleavage process is triggered once a critical amount of target DNA is present.

Using the model, we can calculate the temporal dynamics of reporter cleavage and then convert these into fluorescence values using an experimentally determined calibration factor.

Design and Setup

Similar to our RPA module, we used a system of ODEs derived from the main chemical reactions that take place during the CRISPR/Cas12a process.

The Cas12a reactions:

$$ \underbrace{[\mathrm{Cas}]_{\mathrm{inactive}}}_{[\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}}} + D \;\overset{k_{1}^{\mathrm{cis}}}{\underset{k_{-1}^{\mathrm{cis}}}{\rightleftharpoons}}\; C_{\mathrm{cis}} \;\overset{k_{\mathrm{cat}}^{\mathrm{cis}}}{\longrightarrow}\; [\mathrm{Cas}]_{\mathrm{active}} + D^{\ast} \tag{1C} $$

$$ [\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{Reporter}]_{\mathrm{free}} \;\overset{k_{1}^{\mathrm{trans}}}{\underset{k_{-1}^{\mathrm{trans}}}{\rightleftharpoons}}\; C_{\mathrm{trans}} \;\overset{k_{\mathrm{cat}}^{\mathrm{trans}}}{\longrightarrow}\; [\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{clDR}] \tag{2C} $$

The Cas12a ODEs:

$$ \frac{d[\mathrm{Cas}]_{\mathrm{active}}}{dt} = k_{\mathrm{cat}}^{\mathrm{cis}} \frac{[\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}}}{} \frac{D}{K_{m}^{\mathrm{cis}} + D} \tag{1C} $$

$$ \frac{d[\mathrm{clDR}]}{dt} = \frac{k_{\mathrm{cat}}^{\mathrm{trans}} \,[\mathrm{Cas}]_{\mathrm{active}} \,[\mathrm{Reporter}]_{\mathrm{free}}} {K_{m}^{\mathrm{trans}} + [\mathrm{Reporter}]_{\mathrm{free}}} \tag{2C} $$

$$ [\mathrm{Reporter}]_{\mathrm{free}} = [\mathrm{Reporter}]_{\mathrm{total}} - [\mathrm{clDR}],\quad [\mathrm{Cas}]_{\mathrm{inactive}} = [\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}} $$

An overview of the variables used in the Cas12a model is given below:

Variable	Description
$[\mathrm{Cas}]_{\mathrm{total}}$	Total concentration of Cas12a protein (active + inactive)
$[\mathrm{Cas}]_{\mathrm{inactive}}$	Inactive Cas12a pool before binding to target DNA
$[\mathrm{Cas}]_{\mathrm{active}}$	Activated Cas12a after cis binding and cleavage
$D$	Target DNA recognized by Cas12a (activator)
$D^{\ast}$	Cleaved target DNA after cis activation
$C_{\mathrm{cis}}$	Intermediate Cas12a-DNA complex in cis activation
$C_{\mathrm{trans}}$	Cas12a-reporter encounter complex in trans cleavage
$[\mathrm{Reporter}]_{\mathrm{total}}$	Total concentration of reporter molecules
$[\mathrm{Reporter}]_{\mathrm{free}}$	Unbound (uncleaved) reporter concentration
$[\mathrm{clDR}]$	Concentration of cleaved reporter molecules (fluorescent signal)

Table 6: Overview of the variables used in the Cas12a model.

Furthermore, the following table helps to identify the Cas12a parameters used in our equations and to clarify their meaning.

Parameter	Definition	Value	Unit	Reference
$k_{1}^{\mathrm{cis}}$	Association rate (inactive Cas12a $+\; D \rightarrow C_{\mathrm{cis}}$)	--	$\mathrm{M^{-1}\,s^{-1}}$	Model assumption$^*$
$k_{-1}^{\mathrm{cis}}$	Dissociation rate ($C_{\mathrm{cis}} \rightarrow$ inactive Cas12a $+\; D$)	--	$\mathrm{s^{-1}}$	Model assumption$^*$
$k_{\mathrm{cat}}^{\mathrm{cis}}$	Turnover in cis (activation on target DNA)	0.054	$\mathrm{s^{-1}}$	(5)
$K_{m}^{\mathrm{cis}}$	Michaelis constant for cis activation (w.r.t. $D$)	0.05	$\mathrm{nM}$	(6)
$k_{1}^{\mathrm{trans}}$	Association rate ($[\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{Reporter}]_{\mathrm{free}} \rightarrow C_{\mathrm{trans}}$)	--	$\mathrm{M^{-1}\,s^{-1}}$	(6)$^*$
$k_{-1}^{\mathrm{trans}}$	Dissociation rate ($C_{\mathrm{trans}} \rightarrow [\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{Reporter}]_{\mathrm{free}}$)	--	$\mathrm{s^{-1}}$	(6)$^*$
$k_{\mathrm{cat}}^{\mathrm{trans}}$	Turnover in trans (reporter cleavage)	3.85	$\mathrm{s^{-1}}$	(3)
$K_{m}^{\mathrm{trans}}$	Michaelis constant for trans cleavage (w.r.t. reporter)	27.29	$\mathrm{nM}$	(3)

* Following study (6), Cas-reporter binding occurs so rapidly that it can be considered quasi-instantaneous. We adopt the same simplification for the cis association/dissociation steps; therefore explicit numerical values for $k_{1}^{\mathrm{cis}}$, $k_{-1}^{\mathrm{cis}}$, $k_{1}^{\mathrm{trans}}$, and $k_{-1}^{\mathrm{trans}}$ are not specified.

Table 7: Cas12a model parameters, their definitions, values, and references.

And here is an overview of the initial concentrations used in the Cas12a simulation:

Species	Initial concentration [M]
$[\mathrm{Cas}]_{\mathrm{total}}$	$1.0 \times 10^{-8}$
$[\mathrm{Cas}]_{\mathrm{inactive}}$	$1.0 \times 10^{-8}$
$[\mathrm{Cas}]_{\mathrm{active}}$	$0$
$[\mathrm{Reporter}]_{\mathrm{total}}$	$1.0 \times 10^{-7}$
$[\mathrm{Reporter}]_{\mathrm{free}}$	$1.0 \times 10^{-7}$
$[\mathrm{clDR}]$	$0$
$D$	$3.8 \times 10^{-13}$
$D^{\ast}$	$0$
$C_{\mathrm{cis}}$	$0$
$C_{\mathrm{trans}}$	$0$

Table 8: Initial concentrations of all species used in the Cas12a simulation.

LbCas12a kinetics

To accurately parameterize our model, we first needed to determine the Michaelis-Menten kinetics of $k_{\mathrm{cat}}^{\mathrm{trans}}$ and $K_{m}^{\mathrm{trans}}$ of our specific LbCas12a enzyme.

$$ \frac{d[\mathrm{clDR}]}{dt} = k_{cat}^{\mathrm{trans}} \cdot [\mathrm{Cas}]_{\mathrm{active}} \cdot \frac{[\mathrm{Reporter}]_{\mathrm{free}}} {K_{m}^{\mathrm{trans}} + [\mathrm{Reporter}]_{\mathrm{free}}} $$

To do that, our Wet Lab first conducted an experiment to determine the Michaelis-Menten kinetics of our LbCas12a enzyme. Upon critical review of the initial data, we determined it was not suitable for a reliable kinetic fit. The experiment had yielded measurements at only two substrate concentrations, which is insufficient to accurately estimate Michaelis-Menten parameters $K_{m}^{\mathrm{trans}}$ and $k_{\mathrm{cat}}^{\mathrm{trans}}$.

Holding our model to a high standard of scientific accuracy, we made the decision not to use this preliminary data.To maintain project velocity and avoid experimental bottlenecks, our team immediately pivoted to a robust literature-based parameterization strategy. We performed a thorough search and sourced kinetic data from a study investigating the same enzyme under comparable buffer conditions [3]. This adaptive approach ensured our model was built on a reliable and scientifically valid foundation, allowing our project to proceed without compromising its accuracy.

We used the dsDNA data from this study, since our target is also double-stranded DNA. Based on this, we parameterized our model using $k_{\mathrm{cat}}^{\mathrm{trans}} \approx 3.85\ \mathrm{s}^{-1}$ and an appropriate Km value from the literature.

Fluorescence Calibration

Our simulation calculates the concentration of cleaved reporters over time based on the turnover number ($k_{\mathrm{cat}}^{\mathrm{trans}}$) of Cas12a. However, to relate these values to the actual fluorescence signal measured in experiments, a linear calibration factor is required. To determine this factor, the Wet Lab conducted the following experiment:

Experiment:

A dilution series of known reporter concentrations was prepared.
The fluorescence intensity of each dilution was measured.
The measured fluorescence intensity was plotted against the reporter concentration.

Subsequently, a linear regression was applied, resulting in a line whose slope describes the relationship between reporter concentration and fluorescence intensity.

This calibration factor enables us to convert simulated reporter concentrations directly into fluorescence values that can be compared with experimental data.

Determination of Background Fluorescence

In order to distinguish true signal from background noise, we needed to determine the background fluorescence generated by the complete system in the absence of target DNA, i.e., under conditions where no actual cleavage and signal generation occur. To do this, our Wet Lab performed the following experiment:

Experiment:

Cas12a was incubated with the fluorescently labeled reporter in the absence of target DNA.
Multiple replicates were measured to obtain the characteristic background signal.

Afterwards, we calculated the mean of these measurements and defined it as the background fluorescence.

The effective fluorescence signal was then calculated as: Signal = Total fluorescence - Background fluorescence

Integrated System Findings

Future Hardware Design

When we integrated our parameterized RPA and Cas12a modules, our model made a critical prediction: the combined system would fail to produce a signal.

LbCas12a not only cleaves the reporter in trans, but also cuts the target dsDNA in cis, i.e., at the target site itself. This cis-cleavage reaction occurs before the trans-cleavage is initiated and has two important consequences:

It activates Cas12a by binding and cleaving the target DNA.
It simultaneously reduces the amount of DNA produced by RPA, because the target DNA is itself degraded in the process.

To incorporate this into our model, we parameterized the cis-cleavage kinetics separately. A study [5] reported $k_{\mathrm{cat}}^{\mathrm{cis}}$ values for LbCas12a ranging from 0.0024 to 0.054 $\mathrm{s}^{-1}$. To reflect a worst-case scenario (maximum inhibition by cis-cleavage), we used $k_{\mathrm{cat}}^{\mathrm{cis}} = 0.054\ \mathrm{s}^{-1}$. Another study [6] provided indirect evidence suggesting that the $K_{m}^{\mathrm{cis}}$ of LbCas12a is below 5 nM. Based on this indication, we chose a $K_{m}^{\mathrm{cis}}$ value of 0.05 nM, again to model a conservative scenario.

In our ODE model, Cas12a activation and cis-cleavage are described using Michaelis-Menten kinetics:

$$ \frac{d[\mathrm{Cas}]_{\mathrm{active}}}{dt} = k_{cat}^{\mathrm{cis}} \cdot \big([\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}}\big) \cdot \frac{D}{K_{m}^{\mathrm{cis}} + D} $$

This formulation reflects the gradual activation of Cas12a as more DNA becomes available, rather than assuming an instantaneous activation. The cis-reaction consumes DNA over time, reducing the pool of RPA amplicons available for trans-cleavage. This effect is particularly strong in the early amplification phase, when RPA product concentrations are still low.

Using these parameters, our model predicted no detectable fluorescence signal when Cas12a is active from the beginning of the reaction. The cis-cleavage removes DNA faster than RPA can produce it at early time points, resulting in a “deadlock” situation where Cas12a never reaches sufficient activation to cleave reporters in trans.

To overcome this, we reasoned that temporally separating amplification and detection could avoid this early inhibition. By allowing the RPA reaction to proceed for several minutes before introducing Cas12a, a sufficient amount of DNA accumulates. Once Cas12a is introduced, cis-cleavage no longer significantly impacts the reaction because the DNA concentration is already high. To test this, we implemented a delay parameter in our model, where Cas12a activation starts only after a defined time (e.g., 300 s). The simulations clearly show that a short delay is sufficient to allow reporter cleavage to proceed efficiently.

This modeling insight was critical. It provided a clear, actionable design principle for our hardware: a two-chamber system that temporally separates amplification and detection. In such a setup, RPA would run first in isolation, allowing sufficient DNA to accumulate. After a defined incubation period, the sample would be transferred into the Cas12a reaction chamber, initiating detection only after amplification is well underway. This temporal separation could be achieved through simple means such as lateral flow or paper-based microfluidic systems, where controlled migration of the reaction mixture triggers Cas12a activity after a predefined delay. Our model predicts that this approach would prevent early cis-cleavage from depleting the target DNA and would enable robust signal generation. By identifying this failure mode in silico, the model saved us time and resources that would have been spent troubleshooting failed wet lab experiments.

Starting from this point, the following sections will address the case in which the reaction is separated, with Cas12a being activated 300 seconds after the start of the RPA.

Sufficiency of RPA Product for Cas12a Activation

To assess whether a range of RPA product concentrations is sufficient to make reliable predictions, we calculated the minimum amount of RPA-generated DNA required to activate Cas12a strongly enough to cleave all reporters and generate the maximal signal within 30 minutes. This threshold was found to be 0.042 nM of RPA product. When varying the G concentration, the lowest resulting RPA product concentration was 44.963 nM, which is roughly 1000 times higher than the minimum required amount. This indicates that, under all tested conditions, the amount of amplified DNA is more than sufficient to fully activate Cas12a, meaning that variations in RPA yield within this range are unlikely to affect the final signal intensity.

Detection Threshold and Limit of Detection (LOD) Determination

Threshold Calculation

The detection threshold was determined following the approach described in [3] using the formula:

$$ \text{Threshold} = \mu_{\mathrm{blank}} + K \cdot \sigma_{\mathrm{blank}} $$

where $ \mu_{\mathrm{blank}} $ is the mean fluorescence of blank samples, $ \sigma_{\mathrm{blank}} $ is the corresponding standard deviation, and $K$ is a confidence factor. Following [3], we set $K = 3$, which corresponds to a 99.9 % confidence level that blank measurements lie below the detection threshold. Higher $K$ values result in more conservative thresholds but increase statistical confidence.

The blank measurements used for this calculation were identical to those used to determine the background fluorescence. Based on these values, the detection threshold was set to 1473 a.u.

Simulation Constraints:

Simulations were performed across a range of starting DNA concentrations to identify the lowest input at which the signal exceeded the threshold. While the search formally started at $1 \times 10^{-22}\ \mathrm{M}$, concentrations below the equivalent of a single molecule per reaction are biologically meaningless. The simulation was therefore constrained to exclude values below this physical lower bound.

For the assay volume of 47.5$ \mu\mathrm{L} $, one DNA molecule corresponds to a concentration step of $3.496 \times 10^{-20}\ \mathrm{M}$. Accordingly, limits are expressed in whole numbers of molecules, not arbitrary decimal concentrations.

LOD Determination

Using Monte Carlo simulations incorporating Poisson-distributed start copies, an initiation gate, measurement noise, and the detection rule “threshold crossed before 30 min,” the continuous search yielded a limit at 7.47 expected copies.

Poisson distribution of input molecules: For a nominal initial concentration $C_0$ and reaction volume V, the expected number of input DNA molecules is

$$ \lambda = C_{0}\, V\, N_{\mathrm{A}}, $$

where $N_{\mathrm{A}}$ is Avogadro’s number. The actual number of molecules N used in each simulation is drawn from a Poisson distribution

$$ N \sim \mathrm{Poisson}(\lambda) $$

Initiation gate: Only reactions with at least Nmin seed molecules can proceed. Given $N \ge N_{\min}$, each seed initiates amplification with probability $p_{\mathrm{seed}}$. The probability that the reaction successfully initiates is

$$ p_{\mathrm{init}}(N) = 1 - \bigl(1 - p_{\mathrm{seed}}\bigr)^{N}. $$

A Bernoulli draw with success probability $p_{\mathrm{init}}(N)$ determines whether the reaction proceeds.

Conversion between copies and molar concentration uses

$$ C = \frac{N}{V\,N_{\mathrm{A}}}, \qquad N = [\lambda]. $$

Detection rule: A replicate is considered positive if the simulated fluorescence signal $F(t)$ crosses the detection threshold Fth before $t_{\max}$ = 1800 s,

$$ \exists\, t^{\ast} \le t_{\max}\ \text{such that}\ F\!\left(t^{\ast}\right) \ge F_{\mathrm{th}} . $$

Cas12a activation and Michaelis-Menten cleavage: After a fixed delay $ t_{\mathrm{delay}} $=300 , the concentration of activated Cas12a,$ [\mathrm{Cas}]_{\mathrm{active}}(t) $, evolves according to

$$ \frac{d[\mathrm{Cas}]_{\mathrm{active}}}{dt} = k_{cat}^{\mathrm{cis}} \left([\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}}\right) \frac{[D](t)}{K_{m}^{\mathrm{cis}} + [D](t)} \,. $$

and the cleaved reporter concentration $[\mathrm{clDR}](t)$ follows a Michaelis-Menten trans-cleavage equation

$$ \frac{d[\mathrm{clDR}]}{dt} = \frac{k_{cat}^{\mathrm{trans}}\,[\mathrm{Cas}]_{\mathrm{active}}(t)\, \big([\mathrm{Reporter}]_{\mathrm{total}} - [\mathrm{clDR}](t)\big)} {K_{m}^{\mathrm{trans}} + \big([\mathrm{Reporter}]_{\mathrm{total}} - [\mathrm{clDR}](t)\big)} \,. $$

Limit of detection (LOD): The LOD is defined as the smallest nominal concentration $ C_{\mathrm{LOD}} $ for which the detection probability reaches the target level ptarget=0.95,

$$ \mathbf{P}_{\mathrm{det}}\!\big(C_{\mathrm{LOD}}\big) \ge p_{\mathrm{target}}\,. $$

where $P_{\mathrm{det}}(C)$ is estimated by repeated stochastic simulation (Poisson input, initiation gate, measurement noise, deterministic kinetics).

The continuous search yielded $\lambda_{\mathrm{LOD}}$ = 7.47 expected molecules, which was quantized up to the next integer number of input molecules

$$ N_{\mathrm{LOD}} = [\lambda_{\mathrm{LOD}}] = 8\,. $$

corresponding to a concentration of

$$ C_{\mathrm{LOD,quant}} = \frac{N_{\mathrm{LOD}}}{V\,N_{\mathrm{A}}} = 2.80 \times 10^{-19}\,\mathrm{M}. $$

in V=47.5 $\mu\mathrm{L}$.

Cas12a remained inactive for the first 300 s, preventing any signal from crossing the threshold before this time.

Time-to-Threshold Distribution and Validation

At the LOD of 8 copies, the median time-to-threshold was 548.7 s (9.15 min), i.e., about 4 min after Cas activation. The distribution tail was modest, with the 90th and 95th percentiles at 663.2 s (11.05 min) and 744.0 s (12.40 min), respectively.

A validation experiment with 40 replicates at 8 copies yielded 39/40 detections (97.5 %) within the 30 min assay window. This slightly exceeds the 95 % detection target because the rounded input (8 copies) is marginally higher than the theoretical continuous boundary (7.47 copies).

Summary:

Threshold: 1473 a.u. ($K$ = 3, 99.9 % confidence)
Physical resolution: 1 molecule = $3.496\times10^{-20}\,\mathrm{M}$
LOD: 8 copies = $2.80\times10^{-19}\,\mathrm{M}$
Median detection time: 9.15 min
Detection rate at LOD: 97.5 % within 30 min

Representation of the Core System

An overview of the core system is shown in the following figures:

Enzyme Stability Model

A successful diagnostic must be reliable not only on the day it is made but also after months of storage and transport. To address this critical real-world requirement, we developed an Enzyme Stability Model. This model is designed to predict the loss of enzyme activity over time under different temperature conditions, allowing us to forecast the shelf-life of our test and to quantitatively validate our choice of freeze- drying approach.

Design and Setup

The framework is grounded in established thermodynamic and kinetic principles. The temperature dependence of the enzyme denaturation rate, $k_{d}(T)$, is described by the Arrhenius equation. The subsequent loss of enzyme activity over time, $A(t)$, is then modeled as an irreversible first-order decay process. This formulation allows us to simulate the impact of arbitrary temperature histories—such as a cool storage phase followed by a room-temperature usage phase—by calculating the cumulative loss of activity in discrete time steps.

This is an overview of the key variables, parameters, and equations used in the model:

Symbol	Description	Value	Unit
$k_{d,\mathrm{ref}}$	Denaturation rate at $T_{\mathrm{ref}}$	$2.60 \times 10^{-3}$	$\mathrm{min^{-1}}$
$E_a$	Activation energy of denaturation	$1.22 \times 10^{5}$	$\mathrm{J\,mol^{-1}}$
$R$	Gas constant	$8.314$	$\mathrm{J\,mol^{-1}\,K^{-1}}$
$T_{\mathrm{ref}}$	Reference temperature	$323.15$ $(50^\circ\mathrm{C})$	$\mathrm{K}$
$T_{\mathrm{home}}$	Home/room temperature	$295.15$ $(22^\circ\mathrm{C})$	$\mathrm{K}$
$T_{\mathrm{storage}}$	Storage temperature	$278.15$ $(5^\circ\mathrm{C})$	$\mathrm{K}$
$\Delta t$	Timestep	$1$	$\mathrm{min}$
$A_0$	Initial activity	$100$	$\%$
$A(t)$	Remaining enzyme activity (over time)	—	$\%$

Table 9: Table summarizing key parameters and variables used to model temperature-dependent enzyme denaturation rates.

$$ k_{d,\mathrm{home}} \;=\; k_{d,\mathrm{ref}}\, \exp\!\left(\frac{E_a}{R}\left(\frac{1}{T_{\mathrm{ref}}}-\frac{1}{T_{\mathrm{home}}}\right)\right) \tag{1E} $$

$$ k_{d,\mathrm{storage}} \;=\; k_{d,\mathrm{ref}}\, \exp\!\left(\frac{E_a}{R}\left(\frac{1}{T_{\mathrm{ref}}}-\frac{1}{T_{\mathrm{storage}}}\right)\right) \tag{2E} $$

Temperature Dependence of the Denaturation Rate

The temperature-dependent denaturation rate $k_{d}(T)$ follows the Arrhenius equation:

\[ k_{d}(T) = k_{d,\mathrm{ref}}\, \exp\!\left[ \frac{E_a}{R}\left(\frac{1}{T_{\mathrm{ref}}} - \frac{1}{T}\right) \right] \]

Activity Decay Over Time

Enzyme inactivation is modeled as a first-order irreversible process:

\[ \frac{dA}{dt} \;=\; -\,k_{d}(T)\,A(t) \]

with $A(t)$ the remaining activity (in %) at time $t$. For discrete time steps $\Delta t$:

\[ A(t+\Delta t) \;=\; A(t)\,\exp\!\left[-\,k_{d}(T)\,\Delta t\right] \]

This update allows arbitrary temperature profiles $T(t)$: at each step, read $T(t)$, compute $k_{d}(T)$, and advance A.

Temperature Profile

A simple two-phase storage/usage profile:

\[ T(t) = \begin{cases} T_{\mathrm{storage}}, & t < t_{\mathrm{storage}},\\[4pt] T_{\mathrm{home}}, & t \ge t_{\mathrm{storage}}. \end{cases} \]

where $T_{\mathrm{storage}}$ is the storage temperature, the storage duration, and $T_{\mathrm{home}}$ the usage temperature (e.g., room temperature).

Parametrization Strategy

A key modeling choice was to identify which of the many proteins in our system would be the primary limiting factor for shelf-life. We adopted a simplified single-bottleneck assumption, treating one protein's decay as the pace-setter for the entire assay. Protein size often correlates with thermal robustness [7] . Although UvsY is the smallest component, it typically operates in a stabilizing complex with UvsX. We therefore select Gp32 (SSB), the smallest component not obligatorily stabilized by a partner, as the practical pace-setter for shelf-life prediction: if Gp32 remains functional over a given temperature history, the assay is expected to remain functional as well.

This choice is strongly supported by our earlier sensitivity analysis, which showed that the concentration of G (our Gp32/SSB component) has the most significant influence on the model's output, implying that a decline in G activity will dominate assay performance. This is an explicit modeling assumption and simplification.

**Table 10**: Proteins of different sizes used in our simulations.
Protein	Size (Number of amino acids)	Source
LbCas12a	1228	[10]
UsvY	137	[11]
UsvX	391	[12]
Gp32	301	[13]
Bsu Pol I	880	[14]

Direct measurement of $E_a$ and $k_{d,\mathrm{ref}}$ for isolated Gp32 was not feasible because the available kits bundle multiple proteins, and our literature search did not yield Gp32-specific stability constants.

To overcome this challenge, we adopted a size-matched proxy parameters from plant peroxidases of comparable length. In particular, we used the values reported in a scientific paper [8] for Eichhornia crassipes leaf peroxidase - $E_a \approx 122\ \mathrm{kJ\,mol^{-1}}$ and $k_{d,\mathrm{ref}} \approx 2.6\times10^{-3}\ \mathrm{min^{-1}}\ \text{(at }50^\circ\mathrm{C)}$.

Because we could not locate the exact sequence length of that enzyme, we surveyed lengths of related plant peroxidases, which cluster around ∼300 amino acids; based on this, we assume the Eichhornia peroxidase is of similar size (~300 amino acids) and use it as a pragmatic proxy for Gp32.

**Table 11**: Different peroxidases have sizes in the range of 300 amino acids.
UniProt ID for the peroxidase	Size (Number of amino acids)	Source
P00433	353	[15]
Q9SMU8	353	[16]
Q43872	317	[17]
Q6AVZ3	344	[18]
Q7F1U0	317	[19]
Q05431	250	[20]
P48534	250	[21]

Model Predictions

With this framework, we can chart the time course of Gp32 activity. By modeling a temperature history that separates a cool-storage phase from a post-purchase, room-temperature phase, we obtain shelf-life predictions that closely mirror real-world handling.

The results reveal how storage conditions dramatically affect shelf life. At refrigeration temperatures (5 °C), enzyme activity declines slowly over many months. Once brought to room temperature (22°C), degradation accelerates substantially, but the assay remains viable for weeks of typical use.

Linking Stability to Performance

To connect molecular stability to assay performance, we couple the computed Gp32 activity $A(t)$ to our core Reaction Model (RPA→CRISPR) by scaling the Gp32-sensitive kinetic steps. Specifically, reaction rates that depend on SSB/Gp32 concentration are multiplied by the activity factor $A(t)$. This yields time-resolved detection behavior under realistic temperature histories: how quickly fluorescence crosses threshold after activation as a function of prior storage conditions.

This integrated simulation yields a direct, quantitative prediction of how shelf-life affects the user experience. The results, shown in the fluorescence curves and the time-to-threshold heatmap, demonstrate how a longer time at room temperature leads to reduced enzyme activity. This, in turn, slows down the diagnostic reaction, increasing the time a user must wait for a positive signal. For example, the heatmap shows that a test stored for 100 days and then kept at room temperature for another 25 days (total 125 days) would take 19 minutes to produce a detectable signal, whereas one used immediately after storage would take only 5 minutes.

Limitations and outlook

Using a plant peroxidase as a size-matched proxy for Gp32 is a pragmatic simplification: sizes are comparable, but the peroxidase’s fold and microenvironment differ from Gp32, so absolute rates may shift. The framework itself is mechanistic and modular. Once Gp32-specific stability constants ($E_a$ and $k_{d,\mathrm{ref}}$) are measured in dedicated assays, we can replace the proxy values in the Arrhenius model and obtain sharper shelf-life predictions without altering the rest of the pipeline.

Key Finding

Modeling for our Entrepreneurial Plan

From the very beginning, our project was designed with a clear entrepreneurial goal: to create a diagnostic that is globally accessible and independent of the cold chain. We identified lyophilization (freeze-drying) as a core component of our manufacturing strategy to achieve this. However, this presented a key business question: is the added manufacturing cost of freeze-drying justified by the extended shelf-life? Our integrated reaction model was developed, among other things, to answer this question quantitatively. By coupling our kinetic model with the Enzyme Stability Framework, we didn't just find a problem; we validated our initial business decision. The results, particularly the time-to-threshold heatmap, provide powerful evidence for our strategy. The model clearly demonstrates that without lyophilization, the test's performance degrades rapidly, rendering it unreliable for the very low-resource settings we aim to serve. It quantitatively proves that the immense value gained by eliminating the cold chain, ensuring product reliability and market accessibility, far outweighs the manufacturing cost, forming a cornerstone of our business case.

Methods & Tools

ODE solver:

The ODE system is solved using the LSODA solver, which proved to be more robust against integration failures than other solvers, such as CVODES [2]. LSODA combines two numerical methods: a backward differentiation formula (BDF) algorithm for stiff systems and an Adams-Moulton algorithm for non-stiff systems.

Since our model involves multiple timescales and reaction rates that differ by several orders of magnitude, it is classified as a stiff system. Although both LSODA and CVODES use BDF methods for stiff problems, results reported in [2] suggest that the LSODA implementation outperforms that of CVODES.

Conclusion

Our Reaction Model was a powerful and essential part of our project's journey. We built it to be more than just a simulation; it was a tool for navigating the real-world uncertainties of biochemical engineering. By embracing challenges transparently, we were able to produce insights that guided our entire project.

Our work began by tackling the "black box" problem of proprietary kits. Despite not knowing all the component concentrations, we successfully validated our model's core RPA amplification module against our wet lab's experimental data. This was a critical achievement, as it provided the anchor of confidence we needed to trust our model's predictions. That trust was essential when our model made its most important finding: the prediction of a "cis-cleavage deadlock," a fundamental system failure that we would have been blind to in the lab.

This insight allowed us to propose a tangible engineering solution - the "two-chamber" hardware design - demonstrating the model's direct value. Furthermore, our model proved its precision by rigorously determining a limit of detection of 8 copies, a prediction that was validated based on scientific research literature.

By linking performance and stability of our diagnostic we gained a powerful, quantitative validation for our initial entrepreneurial strategy of incorporating freeze-drying. The model clearly shows that without a method to preserve enzyme stability, the test's performance would be unreliable for real-world use where a cold chain is not guaranteed. This result is a cornerstone of our business case, proving that lyophilization is not just a desirable feature but an essential requirement for our diagnostic to be effective and accessible globally.

Ultimately, our Journey tells the story of the real scientific process. It is a story of facing challenges, adapting to unexpected results, and using a combination of computation and experimentation to design something that works.

References

[1] Moody C, Newell H, Viljoen H. A mathematical model of recombinase polymerase amplification under continuously stirred conditions. Biochem Eng J. 2016 Aug 15;112:193-201. Available from: doi: 10.1016/j.bej.2016.04.017.
[2] Städter P, Schälte Y, Schmiester L, Hasenauer J, Stapor PL. Benchmarking of numerical integration methods for ODE models of biological systems. Sci Rep. 2021 Jan 29;11:2696. Available from: doi: 10.1038/s41598-021-82196-2.
[3] Nalefski EA, Kooistra RM, Parikh I, Hedley S, Rajaraman K, Madan D. Determinants of CRISPR Cas12a nuclease activation by DNA and RNA targets. Nucleic Acids Res. 2024 Mar 13;52(8):4502-4522. Available from: doi: 10.1093/nar/gkae152.
[4] Cordoba-Andrade F, Peralta-Castro A, et al. Proteins is a critical factor in recombinase polymerase amplification (RPA), as revealed by insights from an open-source system. PeerJ. 2025;13:e19758. Available from: doi: 10.7717/peerj.19758.
[5] Lesinski JM, Moragues T, et al. In situ complexation of sgRNA and Cas12a improves the performance of a one-pot RPA-CRISPR-Cas12 assay. Anal Chem. 2024 Jun 12;96(25):10443-10450. Available from: doi: 10.1021/acs.analchem.4c01777.
[6] Nalefski EA, Patel N, et al. Kinetic analysis of Cas12a and Cas13a RNA-guided nucleases for development of improved CRISPR-based diagnostics. iScience. 2021 Aug 18;24(9):102996. Available from: doi: 10.1016/j.isci.2021.102996.
[7] Naganathan AN, Doshi U, Muñoz V. Protein folding kinetics: barrier effects in chemical and thermal denaturation experiments. J Am Chem Soc. 2007 Apr 10;129(17):5673-5682. Available from: doi: 10.1021/ja0689740.
[8] Arise RO, Osundahunsi BO, Farohunbi ST, Yekeen AA. Catalytic and kinetic properties of purified Eichhornia crassipes leaf peroxidase. Environ Exp Biol. 2016;14:83-90. Available from: doi: 10.22364/eeb.14.12.
[9] BioRender [Internet]. Toronto (ON): BioRender Inc.; [cited 2025 Oct 8]. Available from: https://www.biorender.com/.
[10] UniProt Consortium. LbCas12a - Lachnospiraceae bacterium [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/A0A5S8WF58/entry.
[11] UniProt Consortium. Recombination protein uvsY - Enterobacteria phage T4 [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P04537/entry.
[12] UniProt Consortium. Recombination and repair protein UVSX - Enterobacteria phage T4 [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P04529/entry.
[13] UniProt Consortium. Single-stranded DNA-binding protein - Enterobacteria phage T4 [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P03695/entry.
[14] UniProt Consortium. DNA polymerase I - Bacillus subtilis (strain 168) [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/O34996/entry.
[15] UniProt Consortium. Peroxidase C1A - Armoracia rusticana (Horseradish) [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P00433/entry.
[16] UniProt Consortium. Peroxidase 34 - Arabidopsis thaliana [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q9SMU8/entry.
[17] UniProt Consortium. Peroxidase 64 - Arabidopsis thaliana [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q43872/entry.
[18] UniProt Consortium. Peroxidase - Oryza sativa subsp. japonica [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q6AVZ3/entry.
[19] UniProt Consortium. Peroxidase 22.3 - Oryza sativa subsp. japonica [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q7F1U0/entry.
[20] UniProt Consortium. L-ascorbate peroxidase 1, cytosolic - Arabidopsis thaliana [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q05431/entry.
[21] UniProt Consortium. L-ascorbate peroxidase, cytosolic - Pisum sativum (Garden pea) [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P48534/entry.

Epidemiological Model

Goal

We aimed to go beyond a purely mechanistic model of our test. In collaboration with the Human Practices team, we developed an agent-based model to assess how deploying our test could impact STI incidence. The model’s goal is to simulate how varying testing rates influence infection dynamics over time — specifically, to evaluate how increasing testing frequency reduces transmission and to identify thresholds where interventions become most effective.

Methods

Model Overview

We developed an agent-based model (ABM) to simulate the transmission dynamics of sexually transmitted infections (STIs), specifically focusing on Chlamydia trachomatis in the Swiss population. The model captures individual-level heterogeneity in sexual behavior, partnership dynamics, and healthcare-seeking patterns to evaluate the impact of various intervention strategies on disease prevalence.

What is an ABM?

An agent-based model (ABM) is a type of computer simulation that looks at how individual people - or “agents” -behave and interact with each other. Each agent can have different characteristics, such as how often they meet partners, get tested, or change behavior over time. By simulating many individuals and their interactions, ABMs help us understand how diseases like Chlamydia can spread in a population and how different strategies, such as more frequent testing, can help reduce infections.

Model Architecture

Agent Characteristics

Each agent in the simulation represents an individual aged 15-100 years with the following attributes:

Demographics: Age, biological sex (male/female), and sexual orientation (heterosexual, homosexual, or bisexual)
Risk classification: Categorized as either high-risk or low-risk based on sexual behavior patterns
Infection status: Susceptible, infected (asymptomatic or symptomatic), or under treatment
Partnership network: Dynamic set of current sexual partners with tracked relationship durations
Behavioral parameters: Testing propensity, treatment adherence, and partner acquisition rates

Population Structure

The model initializes with 7,663 agents, representing the Swiss population aged 15+ (in thousands). [1] The population composition reflects Swiss demographic data:

Sex distribution: 50% male, 50% female
Sexual orientation: 92% heterosexual, 3% homosexual, 5.6% bisexual [2]
Risk groups: 10% classified as high-risk, 90% as low-risk
Age structure: Weighted distribution based on Swiss census data, ranging from 15 to 100 years [1]

Initial infection prevalence is age-stratified according to Swiss surveillance data [3] from the Federal Office of Public Health, with highest rates among younger age groups (520.7 per 100,000 for ages 15-24) and declining with age. These values were multiplied by a scaling factor to account for underreporting of cases and to approximate the true incidence. This factor was estimated to be around 10 for Switzerland by dividing the estimated total number of cases by the number of reported cases [4].

Risk groups influenced partner-seeking behavior, with high-risk individuals being twice as active as average. To keep the population average consistent with literature values, low-risk individuals (90% of the population) were assigned a relative activity of 0.8. Risk group also affected sexual acts per week, maximum concurrent partners, and testing frequency.

Disease Dynamics

Transmission Mechanism

Sexual transmission occurs through partnerships with probability dependent on:

Directional transmission rates:
- Male-to-female: 5% [5], [6]
- Female-to-male: 5% [5], [6]
- Male-to-male: 5% [5], [6]
- Female-to-female: 1% (estimated based on [7])
Sexual activity frequency:
- 0-1 acts/week for low-risk pairs
- 1-6 acts/week for high-risk pairs
Partnership status: Transmission only occurs between current partners

Natural History

The infection progression follows established epidemiological parameters:

Incubation period: 1.5 weeks before symptom onset potential [8]
Symptom development: 50% probability in males, 25% in females (one-time determination at infection)
Natural resolution: 1% weekly probability of spontaneous clearance (computed / adapted from yearly estimate [9])
Treatment duration: 1 week of doxycycline (100% efficacy [10])
Reinfection: Possible immediately after treatment completion or natural resolution

Partnership Dynamics

Partnership Formation

Partnership formation rates are age- and sex-specific, derived from empirical data:

Annual rates peak at age 19 for females (0.9 partnerships/year) and age 20 for males (1.36 partnerships/year)
Weekly probability of forming a partnership, calculated as: annual_rate × risk_group_modifier / 52
Partner selection incorporates:
- Sexual compatibility based on orientation
- Age homophily (Poisson-distributed age difference, mean = 2.2 years [12])
- Assortative mixing by risk group (67% probability of same-risk partnerships)
- Maximum concurrent partners: 1 for low-risk, 2 for high-risk individuals

Partnerhsip Dissolution

Dissolution probabilities follow empirical survival curves differentiated by sex and relationship duration:

Males: 23.7% dissolution within first week, rising to 90.1% by 3 years [11]
Females: 12% dissolution within first week, rising to 84.6% by 3 years [11]
Partnerships stabilize after 3 years with minimal dissolution probability (0.01%) [11]

Testing and Treatment

Testing Behavior

Testing propensity varies by symptom status and risk group:

Baseline testing rates (annual, converted to weekly):
- High-risk: 15% annually (estimated based on our survey [13])
- Low-risk: 7% annually (estimated based on our survey [13])
Symptomatic testing: 95% weekly probability (estimated based on our survey [13])
Post-treatment retesting: Automatic testing 6 weeks after treatment completion (based on an interview with a stakeholder [14])

Diagnostic Performance

Test sensitivity: 95% (nucleic acid amplification tests) [15]
Test specificity: 99% [15]
Treatment initiation: 1-week delay following positive test
Treatment adherence: 80% probability of initiating prescribed treatment (estimated based on our survey [13])

Behavioral Adaptation

Agents modify their partner acquisition behavior following positive test results:

Partner-seeking propensity reduced to 10% of baseline during treatment
Behavior returns to baseline after treatment completion or negative retest

Demographic Processes

Vital Statistics

The model incorporates Swiss demographic rates to maintain population realism:

Birth rate: 0.0167% weekly (agents enter at age 15) [16]
Death rate: 0.016% weekly, age-stratified based on Swiss mortality tables [17], [18]
Immigration: 0.035% weekly, with age-appropriate infection prevalence [19]
Emigration: 0.02% weekly, random selection [19]

Age Progression

Agents age continuously (1/52 year per week)
Age influences partnership formation rates, dissolution patterns, and mortality risk
Agents are removed from the model at age 100, though reaching this age is rare because the probability of death increases with age.

Simulation Implementation

Temporal Resolution

The model operates on a weekly time step, with each iteration executing the following sequence for each agent:

Update infection status and progression
Manage partnership formation and dissolution
Execute sexual acts with transmission potential
Determine testing and treatment decisions
Update demographic characteristics

Stochastic Elements

All probabilistic events use a dedicated random number generator with configurable seed for reproducibility. Key stochastic processes include:

Partner selection and sexual frequency
Transmission events
Symptom development
Testing decisions
Treatment adherence
Demographic events

Data Collection

The model tracks multiple epidemiological metrics at each time step:

Population-level infection prevalence by risk group, sexuality, and age
Weekly incidence of new infections
Testing and treatment utilization rates
Partnership network statistics
Age-specific infection distributions

Model Validation

Parameter Sources

Model parameters were derived from:

Swiss Federal Office of Public Health surveillance reports
Published epidemiological literature on Chlamydia transmission
Swiss demographic and behavioral surveys

Parameters lacking values in the literature were estimated based on expert consultation with healthcare providers and/or insights from survey data.

Software Implementation

The model is implemented in Python 3.12 using object-oriented programming principles. Key dependencies include:

NumPy - for numerical operations and age distribution calculations
Matplotlib - for visualization of epidemiological dynamics
Pandas - for data management and analysis
Standard library modules - for random number generation and enumeration types

Results

Current scenario

As only positive test results are mandatorily reported, precise national testing rates are uncertain. We therefore calibrated a current scenario using our survey-based annual testing propensities (0.08 on average, scaled to 0.07 for the low-risk group and 0.15 for the high-risk group) and explored plausible bounds (see below). Under this baseline, the system settles into a quasi-steady state with:

Total prevalence in the low single-digit percent range (around 129 infected individuals among 7,663 agents, corresponding to approximately 1.7 % overall prevalence).
Network structure: The model stabilizes at an average of ~0.64 simultaneous partnerships per person, meaning that at any given week about two thirds of agents are in a relationship. This represents the equilibrium network connectivity rather than the weekly formation rate of new partnerships.
Age/risk concentration: Infections concentrated among younger age groups and high-risk agents, with spillover into the broader population.

These baseline dynamics imply that, without changes in testing behavior or partner notification, marginal year-to-year variation arises mainly from stochasticity in partnerships, transmission, and symptoms, rather than from structural drift.

Sensitivity to current testing uncertainty

To test robustness against uncertainty in routine testing, we varied annual testing rates within a plausible range:

Lower bound: 0.035 (low-risk), 0.075 (high-risk)
Estimated (baseline): 0.07 (low-risk), 0.15 (high-risk)
Upper bound: 0.20 (low-risk), 0.42 (high-risk)

Across this range of plausible testing propensities, the model’s epidemic trajectory remained broadly stable. Lower testing rates resulted in a modest upward shift in prevalence, while higher testing frequencies led to a slight overall reduction due to faster detection and treatment of cases. Importantly, none of the scenarios produced qualitative changes in epidemic behavior - the system consistently exhibited a broadly stable trajectory with a slight upward trend over time. This demonstrates that the model’s predictions are robust to some uncertainty in current testing behavior.

1) Lower bound (more conservative testing)

Interpretation: With reduced testing, infected counts drift slightly upward but remain bounded; no qualitative shift in transmission pathways is observed.

2) Estimated current testing (baseline)

Interpretation: The baseline reproduces expected patterns: higher counts among heterosexual agents (demography-driven), low sustained levels among lesbian women, and intermittent clusters among gay/bisexual men, superimposed on steady network turnover.

3) Upper bound (more generous testing)

Interpretation: Even with higher routine testing, the model settles again into a slight upward trend after a temporary fluctuation.

Modeling the effect of TRACE

Survey results indicated that the vast majority of respondents would probably or definitely test more often. Specifically, 81.6% said they would use TRACE if aware of it, and 79% reported likely increasing their testing frequency. This trend was consistent across groups, with 72.8% of frequent testers and 82.6% of non-testers agreeing.

While these survey responses provide valuable insight into potential behavior changes, it is important to note that self-reported intentions do not always translate directly into real-world action. To account for this uncertainty, our epidemiological model explores a range of possible increases in testing rates rather than assuming full adherence to reported intentions. This approach allows us to quantify the potential population-level impact of TRACE under both conservative and optimistic adoption scenarios.

Medium increase in testing rate

Scenario: annual testing rates increased to 0.28 for low-risk and 0.6 for high-risk individuals.

Interpretation: This moderate scenario represents a realistic short-term effect based on partial adoption of TRACE. The model predicts a sustained reduction in prevalence, with the largest decline observed among heterosexual agents, who make up the majority of the population. These findings suggest that even without full adoption, TRACE could meaningfully lower transmission rates through earlier detection and treatment, breaking infection chains before they propagate.

Large increase in testing rate

Scenario: annual testing rates increased to 0.7 for low-risk and 1.5 for high-risk individuals.

Interpretation: Under optimistic conditions - where TRACE achieves widespread awareness and regular usage - the model predicts a near-elimination of infection within the simulated period of 10 years. This highlights the nonlinear relationship between testing frequency and disease prevalence: once a critical proportion of the population tests frequently, transmission becomes unsustainable, and prevalence rapidly declines.

The result illustrates the transformative potential of accessible at-home testing to shift population-level disease dynamics.

Entrepreneurship:

The epidemiological findings presented above provide quantitative evidence that TRACE is not only a diagnostic innovation but also a public health enabler with tangible societal benefits. By substantially increasing testing accessibility and frequency, TRACE can reshape STI epidemiology on a population scale. Our agent-based model demonstrated that increases in testing rates lead to a marked and sustained reduction in infection prevalence, while widespread adoption could push the system toward near-elimination. These results confirm that TRACE has the potential to become a transformative public health intervention, capable of reducing transmission, improving early detection, and alleviating long-term healthcare burdens.

From an entrepreneurial perspective, this modeling validation provides strong support for our business case. It shows that TRACE can create dual value – both societal impact and economic sustainability. The reduction in undiagnosed infections translates into measurable public health savings and productivity gains, while the convenience and privacy of TRACE address key barriers that currently limit regular STI screening.

The model therefore strengthens the rationale for investing in and scaling TRACE as a viable startup opportunity. It quantifies the positive externalities of widespread testing and underlines the market need for accessible, at-home diagnostic tools. This alignment between scientific validation, social impact, and commercial feasibility underscores TRACE’s potential as a high-impact venture at the intersection of biotechnology and preventive healthcare.

For a detailed description of our business model, value proposition, and market strategy, see our entrepreneurship page.

Overall impact

Taken together, these results demonstrate that the behavioral change enabled by TRACE – more frequent and accessible STI testing – can lead to substantial reductions in prevalence, even under more conservative assumptions. The model highlights how improvements in testing behavior can shift the system from a stable endemic equilibrium toward effective control or elimination.

Model validation and discussion

The ABM was validated by comparing its emergent epidemiological patterns with known characteristics of Chlamydia trachomatis infection in the Swiss population. The goal of validation was not to match exact case numbers from surveillance data – which are influenced by underreporting – but to ensure that the simulated trends and magnitudes are epidemiologically realistic and consistent with literature-based expectations.

Age-Specific Infection Patterns

The model predicts that infections are strongly concentrated among younger agents, particularly those aged 20–30 years, with prevalence decreasing steadily in older age groups. This distribution reflects real-world surveillance data, where Chlamydia incidence peaks among young adults and declines with age due to reduced partner turnover and increased testing awareness. A small artificial peak appears at around 90 years, which is attributed to the limited number of agents in that age bracket and does not reflect a genuine epidemiological trend. Overall, the age-specific pattern demonstrates that the model correctly captures the demographic concentration of infection in sexually active young populations.

Network Stability

Partnership dynamics were evaluated by tracking the mean number of concurrent partners over time. The simulation stabilized at an average of approximately 0.64 partners per agent, exhibiting only minor stochastic fluctuations over the 10-year period. This suggests that, at any given time, roughly two-thirds of agents are engaged in a sexual partnership – closely aligning with reported data from Switzerland, which indicates an average of 0.73 partners [21]. The absence of long-term drift or oscillatory behavior confirms that the model’s partnership formation and dissolution parameters are balanced and produce a stable social network structure over time.

Infection Distribution by Risk Group

When stratified by behavioral risk, the model predicts that infections are disproportionately concentrated among high-risk individuals, who maintain more frequent partnerships and higher sexual activity rates.However, low-risk individuals still sustain a smaller but persistent burden of infection, reflecting ongoing spillover through network mixing between groups.

This outcome is supported by current public health strategies that emphasize frequent testing among high-risk populations while acknowledging that occasional testing in the general population remains necessary to prevent reinfection cycles [22].

Population-Level Dynamics

At the population level, the model maintains a steady-state prevalence of around 1.7%, corresponding to approximately 129.1 infected individuals out of 7,663 simulated agents. When scaled to the Swiss population aged 15+ (factor of 1000), this equates to an estimated ~129,100 concurrent infections, which aligns well with literature estimates exceeding 100,000 cases after accounting for underreporting [4]. The overall number of susceptible and treated agents remains stable throughout the simulation period, confirming that the model has reached a quasi-endemic equilibrium rather than an unstable epidemic expansion or collapse.

Summary of Validation

Taken together, these validation results indicate that the model successfully reproduces key epidemiological and behavioral properties of Chlamydia transmission:

Infection prevalence peaks in young adults.
The partnership network is stable and realistic.
High-risk individuals carry most of the infection burden.
Population-level infection counts match published estimates.

While some noise appears in small demographic subgroups (e.g., elderly agents), these deviations are expected given the limited number of agents in those cohorts and do not affect overall model reliability. Therefore, the model can be considered valid for simulating population-level and behavioral intervention scenarios relevant to the Swiss context.

Limitations and Outlook

Model Limitations

Structural Assumptions

Our agent-based model necessarily simplifies the complex reality of sexual network dynamics and STI transmission. The assumption of random mixing within risk groups, even with assortative mixing probabilities, cannot fully capture the highly clustered and community-structured nature of real sexual networks. Real-world sexual networks often exhibit small-world properties and contain dense subgroups that can sustain transmission even when overall prevalence is low [23]. Our limitation of maximum two concurrent partners may underestimate transmission potential in highly connected network cores where individuals maintain multiple concurrent partnerships.

The weekly temporal resolution, chosen for computational efficiency and better implementation of literature parameters, means the model cannot capture short-term casual encounters or same-day partner turnover that may contribute significantly to transmission dynamics, particularly in high-risk populations. Furthermore, the absence of geographic structure assumes that all individuals in Switzerland have equal probability of partnership formation regardless of location, ignoring the reality that sexual networks are strongly influenced by proximity, urban-rural divides, and local social venues [24]. This simplification may affect our estimates of outbreak potential and doesn't allow for the assessment of geographically targeted interventions.

Behavioral aspects in the model are represented in a simplified way. Testing behavior is modeled as a probability that depends on symptoms and risk group, without explicitly incorporating factors such as healthcare accessibility, insurance coverage, stigma, or prior testing experiences. Similarly, condom use and other barrier methods are not modeled individually; instead, their effects are reflected in the overall transmission probability parameters. This design choice keeps the model focused on capturing key testing-driven dynamics while maintaining conceptual clarity and computational efficiency.

The binary classification into high- and low-risk groups does not fully capture the continuous spectrum of sexual behaviors or the dynamic nature of risk throughout an individual’s life. In reality, people may shift between risk levels depending on life circumstances, relationship status, or age, whereas our model treats risk group membership as fixed. Likewise, the model assumes that Swiss residents and immigrants share similar behavioral patterns, which may overlook differences arising from diverse cultural backgrounds and sexual health practices within immigrant populations.

Parameter Uncertainty

Many model parameters remain uncertain due to limited data availability. Transmission probabilities, particularly for female-to-female transmission, are poorly characterized, as comprehensive transmission studies are both ethically and practically difficult to conduct. While several studies report a transmission probability of approximately 0.05 per sexual act [5], [6], [25], data on specific transmission routes, such as female–female transmission, remain particularly limited.

The model assumes that the true number of infections is roughly ten times higher than the number of reported cases. This assumption primarily influences the absolute number of predicted cases, while having a lesser impact on the relative effects of different testing rates. Since the model’s predictions align with existing epidemiological estimates, this factor appears reasonable. However, it may still vary across age groups, risk populations, and geographic regions.

Technical and Validation Constraints

Model validation faces fundamental challenges in the context of STI transmission. We cannot conduct controlled transmission experiments for ethical reasons, and detailed sexual network data is difficult to collect due to privacy concerns and recall limitations. The model can be validated against aggregate surveillance data (such as comparing the model’s case number prediction to literature estimates), but this provides limited information about whether the underlying transmission processes are correctly represented. Without access to genetic sequence data that could reveal transmission chains or detailed partner notification data that could validate network structure, we must rely on the model's ability to reproduce population-level patterns as evidence of validity.

Outlook and Future Directions

Near-term Enhancements

The immediate priority for model development involves incorporating partner notification mechanisms that reflect real-world contact tracing efforts. Partner notification represents an important component of STI control programs, yet its effectiveness depends on notification rates, partner reachability, and the delay between index case diagnosis and partner testing [26]. Implementing realistic notification cascades would allow evaluation of how improvements in contact tracing effectiveness might amplify the benefits of increased testing. This enhancement would require modeling the probability that infected individuals notify their partners, the delay in notification, and the subsequent testing and treatment behaviors of notified partners.

Advanced Modeling Approaches

Incorporating geographic structure and local social contexts represents another important extension, as sexual networks are strongly shaped by these factors. Including geographic mixing would allow the evaluation of targeted interventions and provide insights into how infections spread between communities. However, greater population fragmentation makes parameter estimation increasingly difficult, and even at the current level of model detail, parameter availability was limited.

The extension to multi-pathogen dynamics would significantly enhance the model's public health relevance. Many individuals with one STI have increased risk for others due to shared transmission routes and biological interactions [27], [28]. Modeling co-infections with gonorrhea, syphilis, or HIV would allow for more accurate predictions of epidemic trajectories, identification of high-risk subpopulations, and evaluation of interventions that target multiple infections simultaneously. It could also capture synergistic effects, such as increased susceptibility or transmissibility resulting from co-infections, providing a more realistic representation of STI spread and informing integrated public health strategies. In fact, it is known that HIV increases the risk of acquiring and transmitting other sexually transmitted infections, such as syphilis, gonorrhea, and herpes simplex virus, due to both immunosuppression and higher viral shedding at mucosal sites.

References

IndexMundi. Switzerland — Age structure. Available from: https://www.indexmundi.com/switzerland/age_structure.html.
Bundesamt für Gesundheit / BAG; Sotomo. Studienbericht: Sex in der Schweiz. Page 12. Available from: https://sotomo.ch/site/wp-content/uploads/2020/12/BAG_Studienbericht_DE.pdf.
Bundesamt für Gesundheit (Switzerland). Sexually transmitted infections — Chlamydia. In: IDD BAG. Available from: https://www.idd.bag.admin.ch/diseases/chlamydia/statistic.
Aebi-Popp K. Chlamydieninfektion bei Jugendlichen. Schweizer Zeitschrift für Gynäkologie. 2020 Oct 9. Available from: https://www.rosenfluh.ch/media/gynaekologie/2005/04/Chlamydieninfektion-die-unbemerkte-Geschlechtskrankheit.pdf.
BMJ Sexually Transmitted Infections. Abstract A175.2. Sex Transm Infect. 2011;87 Suppl 1:A175.2. Available from: https://sti.bmj.com/content/87/Suppl_1/A175.2.
STD Center NY. One-time homosexual contact STD risk in men. Available from: https://stdcenterny.com/articles/one-time-homosexual-contact-std-risk-men.html.
Centers for Disease Control and Prevention. Women who have sex with women (WSW) — HSV-2 seroprevalence. In: Sexually Transmitted Diseases Treatment Guidelines. Available from: https://www.cdc.gov/std/treatment-guidelines/wsw.htm.
Singapore Dental Association / CDA. Chlamydia. Available from: https://www.cda.gov.sg/professionals/diseases/chlamydia.
British Association for Sexual Health and HIV (BASHH). Chlamydia 2015. Available from: https://www.bashh.org/resources/15/chlamydia_2015/.
Centers for Disease Control and Prevention. Chlamydia — Treatment Guidelines. Available from: https://www.cdc.gov/std/treatment-guidelines/chlamydia.htm.
Gaydos C, Schachter J. Title not clearly given. In: PMC. Available from: https://pmc.ncbi.nlm.nih.gov/articles/PMC2838999/.
Pew Research Center. A growing share of US husbands and wives are roughly the same age. 2024 Aug 15. Available from: https://www.pewresearch.org/short-reads/2024/08/15/a-growing-share-of-us-husbands-and-wives-are-roughly-the-same-age/.
University of Zurich iGEM Team 2025. Survey on testing behavior and STI awareness among students. 2025. Available from: https://2025.igem.wiki/uzurich/survey.
Betschart C. Stakeholder-Interview. UZurich iGEM 2025. 2025 Oct 6. Available from: https://2025.igem.wiki/uzurich/stakeholder-betschart.
Australian Department of Health. Chlamydia laboratory case definition. 2022 Jun. Available from: https://www.health.gov.au/sites/default/files/documents/2022/06/chlamydia-laboratory-case-definition_0.pdf.
Swiss Federal Statistical Office (FSO / BFS). Geburten und Todesfälle. Available from: https://www.bfs.admin.ch/bfs/de/home/statistiken/bevoelkerung/geburten-todesfaelle/geburten.html.
Swiss Federal Statistical Office (FSO / BFS). Bevölkerung. Available from: https://www.bfs.admin.ch/bfs/de/home/statistiken/bevoelkerung.html.
Swiss Federal Statistical Office (FSO / BFS). Gesundheit / Sterblichkeit / Todesursachen. Available from: https://www.bfs.admin.ch/bfs/de/home/statistiken/gesundheit/gesundheitszustand/sterblichkeit-todesursachen.html.
Swiss Federal Statistical Office (FSO / BFS). Migration & Integration. Available from: https://www.bfs.admin.ch/bfs/de/home/statistiken/bevoelkerung/migration-integration.assetdetail.36073875.html.
Wissel T. Stakeholder-Interview. UZurich iGEM 2025. 2025 Oct. Available from: https://2025.igem.wiki/uzurich/stakeholder-wissel.
Federal Statistical Office (Switzerland). Paare in der Schweiz. Available from: https://www.bfs.admin.ch/bfs/en/home.assetdetail.33386305.html.
Centers for Disease Control and Prevention. STI Screening Recommendations. Available from: https://www.cdc.gov/std/treatment-guidelines/screening-recommendations.htm.
Newmyer L, Evans M, Graif C. Socially Connected Neighborhoods and the Spread of Sexually Transmitted Infections. Demography. 2022;59(4):1299–1323. Available from: doi: 10.1215/00703370-10054898.
Kim H-Y, Cuadros D, Wilkinson E, Junqueira DM, de Oliveira T, Tanser F. The geography and inter-community configuration of new sexual partnership formation in a rural South African population over fourteen years (2003–2016). 2022 Mar 9. Available from: doi: 10.1371/journal.pgph.0000055.
Contact Tracing ASHM. Chlamydia. Available from: https://contacttracing.ashm.org.au/chlamydia/.
Althaus CL, Turner KME, Mercer CH, et al. Effectiveness and cost-effectiveness of traditional and new partner notification technologies for curable sexually transmitted infections: observational study, systematic reviews and mathematical modelling. Southampton (UK): NIHR Journals Library; 2014 Jan. Available from: https://www.ncbi.nlm.nih.gov/books/NBK261430/.
Chung SL, Wong NS, Ho KM, Lee SS. Coinfection and repeat bacterial sexually transmitted infections (STI) – retrospective study on male attendees of public STI clinics in an Asia Pacific city. Epidemiol Infect. 2023 Jun 9;151:e101. Available from: doi: 10.1017/S0950268823000948.
Mayer KH, Venkatesh KK. Interactions of HIV, other sexually transmitted diseases, and genital tract inflammation facilitating local pathogen transmission and acquisition. Am J Reprod Immunol. 2011 Apr;65(3):308–316. Available from: doi: 10.1111/j.1600-0897.2010.00942.x.

Asymmetrical RPA Model

Goal

Since our Wet Lab Team started exploring new methods for our diagnostic tool, rather than committing to a single approach, we decided to develop two parallel systems to quantitatively identify the strengths, weaknesses, and optimal use cases of each. Our goal was to build predictive "digital twins" of both pipelines (RPA + CRISPR and RPA + DNA-hybridization), allowing us to rationally engineer the best possible diagnostic.

Thus one of our goalswas to build a predictive mathematical model for our second diagnostic approach: DNA hybridization on a lateral flow assay (LFA) strip. This approach has been successfully applied in clinical diagnostics, including SARS-CoV-2 detection and bacterial pathogen identification [1], [2]. To accurately predict the performance of this entire system, we first had to create a robust model of its essential starting point: the amplification method.

The success of a DNA hybridization assay on a lateral flow strip depends on the availability of single-stranded DNA (ssDNA) [3]. Probes on the strip need to bind to a single strand of target DNA to generate a signal [4]. However, standard amplification techniques, including symmetric RPA, primarily produce double-stranded DNA (dsDNA). Using dsDNA directly for this purpose is highly inefficient, as the two complementary DNA strands rapidly re-anneal with each other, which severely inhibits the binding of detection probes and weakens the signal.

To overcome this fundamental challenge, our diagnostic pipeline begins with asymmetric RPA (aRPA). By using an unequal ratio of primers, aRPA is specifically designed to produce an abundance of ssDNA. In the first phase, both primers extend the DNA to form double-stranded DNA, leading to an exponential amplification phase. The reaction then transitions to a linear mode, generating single-stranded products ideal for our lateral flow assay. Therefore, a predictive model of aRPA was the necessary first step in modeling our complete diagnostic system.

Iterative Journey

This page documents our iterative journey in developing a mechanistic mathematical model of aRPA. We detail the challenges we encountered with initial complex models, the solutions we implemented to overcome them, and the final validation of a simplified, predictive model against established biochemical principles and literature data.

Cycle 1 - Detailed Mechanistic Model

Our first attempt was a comprehensive mechanistic model intending to capture every individual biochemical step in the RPA process. This model was based on the pioneering work on the RPA mechanism by Piepenburg et al. [5] and kinetic studies by Lillis et al. [6].

Design and Build

The model explicitly tracked 17 distinct chemical species:

Three enzymes (UvsX, gp32, Bsu polymerase)
Two primers (forward and reverse)
DNA template and products
9 intermediate complexes (UvsX-Primer complexes, D-loops, stabilized complexes, extension complexes, etc)

The reaction network was constructed around core processes, with recombinase–primer filament formation and strand invasion modeled as reversible reactions, while D-loop stabilization by gp32 and polymerase binding and extension were treated as irreversible steps. For that, we used Ordinary Differential Equations (ODEs). ODEs capture how concentrations of molecules change over time based on reaction rates, making them ideal for modeling the continuous dynamics of biochemical systems. They naturally express mass-action kinetics and enzyme reactions through differential equations that describe the rate of change for each chemical species.

We found rate constants from Lillis et al. (2016) [6], a detailed study measuring RPA kinetics. They reported values like:

Recombinase-primer binding: 10,000,000 M^-1s^-1 (ten million collisions per molar concentration per second)
Strand invasion: 50,000,000 M^-1s^-1
Polymerase speed: 50 s^-1 (50 nucleotides added per second)

The polymerase rate seemed reasonable: Bsu polymerase adds ~50-100 nucleotides per second, so a 100 bp amplicon should take ~2 seconds to synthesize.

Test and Learn

This highly detailed model exhibited several critical, disqualifying problems that rendered it non-functional.

The simulation showed dsDNA reaching 550 nM within seconds from a 1 nM template. There was no observable lag phase (which should be 2-5 minutes [7]), and primers were completely depleted in under a second. Our model was 180 times too fast. :(

Other than that, there was no template consumption to observe, while still creating amplicons. This violated the law of conservation of mass , namely our model was creating DNA out of nothing. Moreover, The amplification curve was completely linear, showing no evidence of the product-feedback amplification that is fundamental to RPA kinetics.

Diving into the reasoning, where these challenges were coming from, we learned the following: Firstly, the RPA rate constants used were inappropriate for reaction conditions, because they were derived from simple, unimolecular studies and were too fast for the complex RPA reaction mix. Minton (2001) [8] demonstrated that macromolecular crowding in cellular environments reduces effective rate constants by 2-3 orders of magnitude compared to dilute solution measurements. Secondly, we found that the feedback mechanism was missing. New amplicons weren't immediately available as templates for subsequent rounds, breaking the exponential growth logic. Finally, the system of 17 coupled ODEs with vastly different timescales created solver instability.

Cycle 2 - Correcting Mass Balance and Integrating Feedback

Based on our findings, we implemented some foundational changes to achieve the desired goal.

Design and Build

We corrected the ODEs to ensure the template was consumed and, critically, that newly synthesized amplicons were added to the total pool of templates available for the next round of primer binding.

Total available template = Original template + New amplicons

We slowed down key rate constants by 100 to 1000-fold. This was based on the rationale that effective rate constants in crowded biochemical environments. We specifically slowed the strand invasion step, known to be rate-limiting in RPA [9].

Primer binding: 10,000,000 → 0.0001 nM^-1s^-1 (1000× slower in nM units)
Strand invasion: 50,000,000 → 0.00005 nM^-1s^-1 (rate-limiting step)
Polymerase: 50 → 0.5 s^-1 (reasonable for copying ~100 nucleotide amplicons)

The polymerase speed now made sense: if our amplicon is 100 base pairs long, and polymerase adds 50 bases per second, it should take about 2 seconds to finish one copy. That's realistic!

Test and Learn

These changes led to significant improvements, but the model's behavior was still problematic. While now physically plausible, the growth curve remained quasi-linear, the final amplicon yield was far too low (~7 nM instead of the expected ~50 nM), and the model failed to show a distinct switch to ssDNA production.

The model was still too complex. The detailed intermediate states created artificial delays that dampened exponential feedback. Multiple intermediate complexes act as "buffers," slowing information propagation through the reaction network.

Cycle 3 - Simplified mechanistic model

We realized that a successful RPA model does not need to capture every intermediate state. It only needs to accurately reproduce the “emergent behavior” of the system as a whole. This principle of simplification has proven successful in models for other amplification techniques like PCR [11] and LAMP [12].

Design and Build

Instead of tracking 17 different molecules, we asked: "What's the minimum we need to capture the biology?"

We radically simplified the model to track only the five most essential species:

$ [P_F] $ = Forward primer (abundant)
$ [P_R] $ = Reverse primer (limiting)
Template_ds = Total double-stranded DNA (original + new copies)
ssDNA_R = Single-stranded product (from forward primer excess)
ssDNA_F = Single-stranded product (from reverse limiting excess)

Exponential Amplification occurs while both primers are present. This reaction describes the autocatalytic production of dsDNA, where the rate is proportional to the current amount of template and is limited by the scarcest primer.

$$ \text{rate}_{\text{amplification}} = k_{\text{amp}} \times \text{Template}_{ds} \times \min(P_F, P_R) $$

$ k_{amp} $ = How fast one template molecule gets copied Template_ds = Current amount of DNA (this creates exponential growth! More DNA → more copies → even more DNA)
$ \min(P_F, P_R) $ = "Minimum function" - uses whichever primer is rarer

The min() function enforces stoichiometric limitation - amplification stops when either primer depletes. Rate proportional to current template concentration creates exponential growth

Asymmetric Strand Synthesis activates after the reverse primer is depleted. This reaction is suppressed while the limiting primer is abundant but activates strongly as it is consumed, leading to linear ssDNA production.

$$ \text{rate}_{\text{fwd_strand}} = k_{\text{strand_f}} \times \text{Template}_{ds} \times P_F \times \left( \frac{1}{1 + 10 \times P_R} \right) $$

The key to modeling the switch from exponential to linear amplification is the suppression term in our ssDNA synthesis equation:

\[ \frac{1}{1 + 10 \cdot P_R} \]

This is a Hill-type suppression function. The table below illustrates how this "molecular brake pedal" gradually releases as the concentration of the limiting primer $ [P_R] $ depletes, allowing ssDNA synthesis to accelerate.

Reaction Phase	$ [P_R] $ Concentration	Suppression Factor Calculation	Resulting ssDNA Rate
Start of Reaction	50 nM	$$ \frac{1}{1 + 10 \times 50} \approx 0.002 $$	~0.2% (OFF)
Mid-Reaction	1 nM	$$ \frac{1}{1 + 10 \times 1} \approx 0.09 $$	~9% (Active)
Late Reaction	0.1 nM	$$ \frac{1}{1 + 10 \times 0.1} = 0.5 $$	50% (ON)
End of Reaction	→ 0 nM	$$ \frac{1}{1 + 10 \times 0} = 1.0 $$	100% (Fully ON)

Table 1: Applying Hill-type suppression function to the model.

Parameter Calibration

Since we simplified our model from 17 species to 5, $ k_{amp} $ is an effective parameter - it doesn't correspond to a single biochemical rate, but rather the combined effect of the entire amplification cascade.

Calculating $ k_{amp} $ directly from theory is impossible, because firstly,not all of the individual rate constants for the 17+ steps are measured in the literature; secondly as shown by Minton [8], rates in crowded cellular environments can be orders of magnitude slower than in dilute solutions;thirdly,Buffer composition (salts, pH) and our model's necessary simplifications introduce uncertainty.

The doubling time during the exponential phase is directly related to $ k_{amp} $. During this phase, the DNA concentration, $ \text{Template}(t) $, grows according to: $ \text{Template}(t) = \text{Template}_0 \times e^{\text{growth_rate} \times t} $ . The time it takes to double, $ t_{\text{double}} $, is therefore: $ t_{\text{double}} = \frac{\ln(2)}{\text{growth\rate}} \approx \frac{0.693}{\text{growth\rate}} $ Since the growth rate is approximately $ k_{\text{amp}} \times [P_R]_{\text{initial}} $, we could make a rough estimate. For an observed $ t_{\text{double}} $ of 1.5 minutes (90s) and an initial limiting primer concentration $ [P_R] $ of 50 nM: $ 90\ \text{s} \approx \frac{0.693}{k_{\text{amp}} \times 50\ \text{nM}},\quad k_{\text{amp}} \approx \frac{0.693}{90 \times 50} \approx 0.00015\ \text{nM}^{-1}\text{s}^{-1} $ This calculation has a unit mismatch (nM⁻¹s⁻¹ instead of nM⁻²s⁻¹) because it's a simplified first-order approximation of a second-order process, but it correctly told us that the value for (k_{amp}) should be very small, in the middle of $ 10^{-3} \text{ to } 10^{-4} $

We used a systematic calibration approach. An initial guess of 0.001 nM^-2s^-1 resulted in a reaction that was too slow, with a 15-minute time to plateau. Doubling the rate to 0.002 nM^-2s^-1 yielded a perfect match to the literature data, correctly reproducing the ~2 minute lag phase, ~1.5 minute doubling time, and 8-minute time to plateau. To confirm this value was not an underestimation, we also tested a higher value of 0.003 nM^-2s^-1, which resulted in a reaction that was too fast. This confirmed that 0.002 nM^-2s^-1 was the optimal calibrated value.

A key principle of good modeling is to fit the minimum number of parameters necessary to avoid overfitting, where the model becomes too specific to one dataset and loses its general predictive power.

To maintain the model's general predictive power and avoid overfitting, we calibrated only the most essential parameter. The main amplification rate was the sole parameter fitted to the experimental data. In contrast, the ssDNA production rates, kstrand_f and kstrand_r, were not calibrated. Instead, they were set based on the biological logic that ssDNA should only accumulate after the main dsDNA amplification plateaus, meaning their rates were chosen to be significantly slower than kamp. This approach ensures the model remains robust and is not overly tuned to a single dataset.

Test and Learn

Our model successfully recapitulated the three distinct biological phases of the reaction.

Initial Conditions:

500 nM forward primer
50 nM reverse primer (10:1 ratio)
0.1 nM initial DNA Template (corresponding to ~10³ copies in 50 μL)

Lag Phase
0-2 min

Exponential
2-8 min

Plateau
>8 min

Figure 1: The model successfully recapitulated the three distinct biological phases of the aRPA reaction.

Asymmetric Primer Consumption — **Figure 2**: Graph representing asymmetric Primer Consumption and unequal initial forward and reverse primer ratio.

**Figure 3**: dsDNA Amplicon Accumulation matching primer depletion dynamics.

**Figure 4**: Graph demonstrates Exponential Amplification Phase before limiting primer depletes followed by linear phase.

ssDNA Product Accumulation — **Figure 5**: ssDNA Amplicon Accumulation consistent with all the other species dynamics.

Complete aRPA Dynamics — **Figure 6**: Full Reaction Dynamics including all characteristic phases.

The Three Phases

Phase 1: Lag (0–2 min)
- Template concentration increases slowly from 0.1 → 1 nM.
- Both primers are consumed slowly and equally.
- This phase corresponds to the initial recombinase loading and strand invasion events.
Phase 2: Exponential (2–8 min)
- Template concentration grows rapidly from 1 → 50 nM.
- The limiting primer, $ [P_R] $, depletes much faster than the excess primer, $ [P_F] $.
- Each new amplicon immediately serves as a template, creating a positive feedback loop.
Phase 3: Plateau/Linear (>8 min)
- The dsDNA concentration plateaus at 50 nM, a value equal to the initial concentration of $ [P_R] $.
- ssDNA accumulates linearly, reaching 450 nM.
- Only the excess forward primer, $ [P_F] $, is consumed in this phase.

Stoichiometric Conservation

The model demonstrates perfect stoichiometric conservation, confirming its physical consistency.

Inputs:
- $ [P_F] $: 500 nM
- $ [P_R] $: 50 nM
Outputs:
- dsDNA: 50 nM (requires 50 nM of $ [P_F] $ and 50 nM of $ [P_R] $)
- ssDNA_R: 450 nM (requires 450 nM of $ [P_F] $)
Balance Check:
- Total $ [P_F] $ consumed: 50 + 450 = 500 nM ✓
- Total $ [P_R] $ consumed: 50 nM ✓

Validation Against Literature

Metric	Model Output	Literature Range	Reference
Time to Plateau	8 min	5-20 min	Euler 2012 [7]
Amplification Fold	500×	10²-10⁶×	Piepenburg 2006 [5]
Efficiency per Cycle	~2.0×	1.5-2.0×	Lillis 2016 [6]
Final [dsDNA]	50 nM	= $ [P_{R_{\text{limiting}}}] $	Kersting 2014 [10]
[ssDNA]:[dsDNA]	9:1	5:1-20:1	Rohrman 2012 [3]
Lag Phase Duration	2 min	1-5 min	Euler 2012 [7]

Table 2: Quantitative metrics validate our model against multiple literature values.

Sensitivity Analysis

To understand which parameters most influence model behavior, we performed sensitivity analysis by varying key parameters ±50% from baseline values.

Effect of Amplification Rate (k_amp)

k_amp	Time to Plateau	Final [dsDNA]	Interpretation
0.001	15 min	50 nM	Slower kinetics, longer assay time
0.002	8 min	50 nM	Baseline - matches literature
0.003	5 min	50 nM	Faster kinetics, earlier plateau

Table 3: Effect of amplification rate. k_amp controls reaction speed but not final yield (limited by reverse primer).

Effect of Primer Ratio (Forward : Reverse)

Ratio	Final [dsDNA]	Final [ssDNA]	[ssDNA]:[dsDNA]
5:1	100 nM	400 nM	4:1
10:1	50 nM	450 nM	9:1
20:1	25 nM	475 nM	19:1

Table 4: Effect of primer ratio. Higher asymmetry leads to more ssDNA accumulation, which is better for probe-based detection. Same conditions reduce the overall dsDNA formation.

Effect of Initial DNA Template Concentration

[Template]₀	Time to Plateau	Final [dsDNA]	Amplification Fold
0.01 nM	12 min	50 nM	5000×
0.1 nM	8 min	50 nM	500×
1 nM	5 min	50 nM	50×

Table 5: Effect of initial DNA Template Concentration. Lower starting template results in longer lag phase, but same final yield (limited by primers).

Model Limitations and Future Work

The model has several known limitations, each with a clear path for future improvement.

Temperature dependence: It currently uses constant rates, but RPA is temperature-dependent. Future work involves implementing Arrhenius equations to model these temperature effects.
Enzyme stability: It does not yet account for the loss of enzyme activity over time. The plan is to integrate it with the project's existing enzyme stability model.
Inhibitors: The model doesn't consider inhibitors found in clinical samples. Future versions will include inhibition terms.
Amplification efficiency: It assumes uniform amplification efficiency. This can be improved by incorporating thermodynamic calculations for primer-template binding.

Experimental Validation

To validate and further refine the model, specific experimental data is required. The validation plan calls for real-time fluorescence data, qPCR-style amplification curves for various starting template amounts, and direct quantification of the final dsDNA and ssDNA products. The process involves running aRPA with known template concentrations, recording fluorescence over time, and fitting the model's key parameters to match these experimental curves. Wet Lab couldn't perform these experiments due to not having access to a qPCR machine.

Conclusion

Through three iterations, we developed a mechanistic model of asymmetric RPA that accurately reproduces experimental kinetics from 5 independent papers. It explains all three biological phases: lag phase, exponential phase, and plateau. In contrast to initial iterations, our final model maintains perfect mass balance and provides quantitative predictions for assay optimization. We also learned by doing that exponential amplification doesn't require modeling every molecular detail - it emerges naturally from product-feedback logic. Sometimes, less is more!

This model serves as the foundation for our diagnostic platform and demonstrates how mathematical thinking can guide biological engineering. Building on this established core module, our next step focuses on developing the DNA hybridization module.

A disclaimer about models: They are tools, not the truth! Our model isn't "reality" - it's a simplified cartoon that captures key behaviors. As statistician George Box said: "All models are wrong, but some are useful!" And indeed our model is useful for predicting timescales and optimizing ratios, but can't replace actual experiments.

You Better Know It!

For the people trying to use aRPA in their projects and can't find any useful material in scientific literature - We Got You!

We want to share our honest thoughts and recommendations for our fellow iGEMers, and not only.

As statistician George Box said:
"All models are wrong, but some are useful!"
We hope you find the best use of our model!

1. Start Simple, Then Add Complexity If Needed

We wasted weeks on our 17-species model. The 5-species model took one day to build and worked better! Ask: "What's the MINIMUM complexity needed to capture biology?"

2. Validate Against Multiple Papers, Not Just One

We compared our model to 4 different papers [1], [2], [5], [8]. If your model only matches one study, it might be overfitted to that specific condition!

3. Check Mass Balance First

Before worrying about rate constants or differential equations, make sure your model conserves mass:

Total Input = Total Output

If this isn't true, nothing else matters!

4. Parameter Values Need Context

A rate constant of "10⁷ M⁻¹s⁻¹" from a biochemistry paper might be measured in pure water at 25°C. Inside cells or in RPA buffer at 37°C, the effective rate could be 1000× different! Always check the experimental conditions papers used.

References

Lobato IM, O'Sullivan CK. Recombinase polymerase amplification: Basics, applications and recent advances. TrAC Trends Analyt Chem. 2018 Jan;98:19-35. doi: 10.1016/j.trac.2017.10.015. Available from: https://www.sciencedirect.com/science/article/abs/pii/S0165993617303126
Li J, Macdonald J, von Stetten F. Review: a comprehensive summary of a decade development of the recombinase polymerase amplification. Analyst. 2019 Jan 7;144(1):31-67. doi: 10.1039/C8AN01621F. Available from: https://pubs.rsc.org/en/content/articlelanding/2019/an/c8an01621f
Rohrman BA, Richards-Kortum RR. A paper and plastic device for performing recombinase polymerase amplification of HIV DNA. Lab Chip. 2012 Sep 7;12(17):3082-8. doi: 10.1039/C2LC40423K. Available from: https://pubs.rsc.org/en/content/articlelanding/2012/lc/c2lc40423k
Mondal S, Venkatesh AG. Lateral flow immunoassay for nucleic acid detection. In: Nucleic Acid Detection. Academic Press; 2021. p. 89-112.
Piepenburg O, Williams CH, Stemple DL, Armes NA. DNA detection using recombination proteins. PLoS Biol. 2006 Jul;4(7):e204. doi: 10.1371/journal.pbio.0040204. Available from: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.0040204
Lillis L, Siverson J, Lee A, Cantera J, Parker M, Piepenburg O, Lehman DA, Boyle DS. Factors influencing Recombinase polymerase amplification (RPA) assay outcomes at point of care. Mol Cell Probes. 2016 Apr;30(2):74-8. doi: 10.1016/j.mcp.2016.01.009. Available from: https://www.sciencedirect.com/science/article/abs/pii/S0890850816300014
Euler M, Wang Y, Nentwich O, Piepenburg O, Hufert FT, Weidmann M. Recombinase polymerase amplification assay for rapid detection of Rift Valley fever virus. J Clin Virol. 2012 Aug;54(4):308-12. doi: 10.1016/j.jcv.2012.05.006. Available from: https://www.sciencedirect.com/science/article/pii/S1386653212001825
Minton AP. The influence of macromolecular crowding and macromolecular confinement on biochemical reactions in physiological media. J Biol Chem. 2001 Apr 6;276(14):10577-80. doi: 10.1074/jbc.R100005200. Available from: https://www.jbc.org/article/S0021-9258(19)76459-2/fulltext
James EA, Mvula R, Richards JS, Hodgson SH. Mechanistic insights into recombinase polymerase amplification. Biochemistry. 2021 May 25;60(20):1614-24.
Kersting S, Rausch V, Bier FF, von Nickisch-Rosenegk M. Rapid detection of Plasmodium falciparum with isothermal recombinase polymerase amplification and lateral flow analysis. Malar J. 2014 Mar 20;13:99. doi: 10.1186/1475-2875-13-99. Available from: https://malariajournal.biomedcentral.com/articles/10.1186/1475-2875-13-99
Njiru ZK. Loop-mediated isothermal amplification technology: towards point of care diagnostics. PLoS Negl Trop Dis. 2012 Jun;6(6):e1572. doi: 10.1371/journal.pntd.0001572. Available from: https://journals.plos.org/plosntds/article?id=10.1371/journal.pntd.0001572
Li SY, Cheng QX, Wang JM, Li XY, Zhang ZL, Gao S, Cao RB, Zhao GP, Wang J. CRISPR-Cas12a-assisted nucleic acid detection. Cell Discov. 2018 Apr 3;4:20. doi: 10.1038/s41421-018-0028-z. Available from: https://www.nature.com/articles/s41421-018-0028-z

Variable	Description
\( F \)	Forward primer
\( G \)	Single stranded DNA binding protein (Gp32)
\( R \)	Recombinase complex
\( FG \)	Forward primer/Gp32 complex
\( FGR_p \)	Unstable Forward primer/Gp32/recombinase complex
\( FGR \)	Stable Forward primer/Gp32/recombinase complex
\( FGnR_p \)	Forward primer/Gp32 with unstable \( n \) recombinase
\( FnR \)	Forward primer complexed with unstable filament of \( n \) recombinase molecules
\( FnRD \)	Forward primer complexed with \( n \) recombinase molecules with DNA
\( D \)	Template DNA (the species we want to track for amplification)
\( FD \)	Forward primer/DNA complex
\( P \)	Polymerase
\( PFD \)	Polymerase/forward primer/DNA complex
\( dNTPs \)	Deoxynucleotides
\( PP_i \)	Pyrophosphate (byproduct of polymerization)

Species	Initial concentration [M]
\( F \)	1.00 × 10⁻⁵
\( G \)	4.50 × 10⁻⁷
\( R \)	5.90 × 10⁻⁶
\( FG \)	0
\( FGR_p \)	0
\( FGR \)	0
\( FGnR_p \)	0
\( FnR \)	0
\( FnRD \)	0
\( D \)	3.80 × 10⁻¹³
\( FD \)	0
\( P \)	1.50 × 10⁻⁸
\( PFD \)	0
\( \mathrm{dNTPs} \)	2.40 × 10⁻⁶

Parameter	Definition	Value	Unit	Reference
\(k_{1}^{\mathrm{cis}}\)	Association rate (inactive Cas12a \(+\; D \rightarrow C_{\mathrm{cis}}\))	--	\(\mathrm{M^{-1}\,s^{-1}}\)	Model assumption\(^*\)
\(k_{-1}^{\mathrm{cis}}\)	Dissociation rate (\(C_{\mathrm{cis}} \rightarrow\) inactive Cas12a \(+\; D\))	--	\(\mathrm{s^{-1}}\)	Model assumption\(^*\)
\(k_{\mathrm{cat}}^{\mathrm{cis}}\)	Turnover in cis (activation on target DNA)	0.054	\(\mathrm{s^{-1}}\)	(5)
\(K_{m}^{\mathrm{cis}}\)	Michaelis constant for cis activation (w.r.t. \(D\))	0.05	\(\mathrm{nM}\)	(6)
\(k_{1}^{\mathrm{trans}}\)	Association rate (\([\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{Reporter}]_{\mathrm{free}} \rightarrow C_{\mathrm{trans}}\))	--	\(\mathrm{M^{-1}\,s^{-1}}\)	(6)\(^*\)
\(k_{-1}^{\mathrm{trans}}\)	Dissociation rate (\(C_{\mathrm{trans}} \rightarrow [\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{Reporter}]_{\mathrm{free}}\))	--	\(\mathrm{s^{-1}}\)	(6)\(^*\)
\(k_{\mathrm{cat}}^{\mathrm{trans}}\)	Turnover in trans (reporter cleavage)	3.85	\(\mathrm{s^{-1}}\)	(3)
\(K_{m}^{\mathrm{trans}}\)	Michaelis constant for trans cleavage (w.r.t. reporter)	27.29	\(\mathrm{nM}\)	(3)

Species	Initial concentration [M]
\([\mathrm{Cas}]_{\mathrm{total}}\)	\(1.0 \times 10^{-8}\)
\([\mathrm{Cas}]_{\mathrm{inactive}}\)	\(1.0 \times 10^{-8}\)
\([\mathrm{Cas}]_{\mathrm{active}}\)	\(0\)
\([\mathrm{Reporter}]_{\mathrm{total}}\)	\(1.0 \times 10^{-7}\)
\([\mathrm{Reporter}]_{\mathrm{free}}\)	\(1.0 \times 10^{-7}\)
\([\mathrm{clDR}]\)	\(0\)
\(D\)	\(3.8 \times 10^{-13}\)
\(D^{\ast}\)	\(0\)
\(C_{\mathrm{cis}}\)	\(0\)
\(C_{\mathrm{trans}}\)	\(0\)

Symbol	Description	Value	Unit
\(k_{d,\mathrm{ref}}\)	Denaturation rate at \(T_{\mathrm{ref}}\)	\(2.60 \times 10^{-3}\)	\(\mathrm{min^{-1}}\)
\(E_a\)	Activation energy of denaturation	\(1.22 \times 10^{5}\)	\(\mathrm{J\,mol^{-1}}\)
\(R\)	Gas constant	\(8.314\)	\(\mathrm{J\,mol^{-1}\,K^{-1}}\)
\(T_{\mathrm{ref}}\)	Reference temperature	\(323.15\) \((50^\circ\mathrm{C})\)	\(\mathrm{K}\)
\(T_{\mathrm{home}}\)	Home/room temperature	\(295.15\) \((22^\circ\mathrm{C})\)	\(\mathrm{K}\)
\(T_{\mathrm{storage}}\)	Storage temperature	\(278.15\) \((5^\circ\mathrm{C})\)	\(\mathrm{K}\)
\(\Delta t\)	Timestep	\(1\)	\(\mathrm{min}\)
\(A_0\)	Initial activity	\(100\)	\(\%\)
\(A(t)\)	Remaining enzyme activity (over time)	—	\(\%\)

Parameter	Definition	Value	Unit	Reference
\( B \)	Number of base pairs in template	397	--	Experimental design
\( n \)	Number of R-sites on primer	8	--	Model assumption*
\( m \)	Number of GP32 binding sites on primer	4	--	Model assumption*
\( n_{\mathrm{p}} \)	Number of base pairs in primer	32	--	Experimental design
\( k_{1\mathrm{f}} \)	Forward rate constant for Eq. (1)	5×10⁸	\( \mathrm{M^{-1}} \)	(1)
\( k_{1\mathrm{r}} \)	Reverse rate constant for Eq. (1)	5×10³	\( \mathrm{s^{-1}} \)	(1)
\( k_{\mathrm{eq2af}} \)	Forward rate constant for Eq. (2a)	3.47×10⁷	\( \mathrm{M^{-1}s^{-1}} \)	(1)
\( k_{\mathrm{eq2ar}} \)	Reverse rate constant for Eq. (2a)	1.471	\( \mathrm{s^{-1}} \)	(1)
\( k_{2\mathrm{b}} \)	Rate constant for Eq. (2b)	4.7×10⁻²	\( \mathrm{s^{-1}} \)	(1)
\( k_{\mathrm{eq2cf}} \)	Forward rate constant for Eq. (2c)	3.47×10⁷	\( \mathrm{M^{-1}s^{-1}} \)	(1)
\( k_{\mathrm{eq2cr}} \)	Reverse rate constant for Eq. (2c)	331	\( \mathrm{s^{-1}} \)	(1)
\( k_{2\mathrm{d}} \)	Rate constant for Eq. (2d)	4.6×10⁻³	\( \mathrm{s^{-1}} \)	(1)
\( k_{3\mathrm{f}} \)	Forward rate constant for Eq. (3)	3.47×10⁷	\( \mathrm{M^{-1}s^{-1}} \)	(1)
\( k_{3\mathrm{r}} \)	Reverse rate constant for Eq. (3)	59.37	\( \mathrm{s^{-1}} \)	(1)
\( k_{4\mathrm{a}}^{\mathrm{cat}} \)	Forward rate constant for Eq. (4a)	4.22	\( \mathrm{s^{-1}} \)	(1)
\( k_{4\mathrm{b}}^{\mathrm{cat}} \)	Forward rate constant for Eq. (4b)	8.32	\( \mathrm{s^{-1}} \)	(1)
\( k_{5\mathrm{f}} \)	Forward rate constant for Eq. (5)	1.2×10⁷	\( \mathrm{M^{-1}s^{-1}} \)	(1)
\( k_{5\mathrm{r}} \)	Reverse rate constant for Eq. (5)	6.0×10⁻²	\( \mathrm{s^{-1}} \)	(1)
\( k_{6\mathrm{f}} \)	Forward rate constant for Eq. (6)	87	\( \mathrm{s^{-1}} \)	(1)
\( k_{7} \)	Forward rate constant for Eq. (7)	14.02	\( \mathrm{M^{-1}s^{-1}} \)	(1)
\( k_{8} \)	Forward rate constant for Eq. (8)	1.13	\( \mathrm{M^{-1}s^{-1}} \)	(1)

Overview

Motivation

Our Core Questions

Strategy

In a Nutshell

Our Journey

Conclusion

Goals Achieved

Future Outlooks

Methods & Software

Contribution

Reaction Model

Goal

Modules

RPA Engine

Design and Setup

RPA Parameters

Concentration Profiles of Multiple Components

CRISPR-Cas12a Detection System

Design and Setup

LbCas12a kinetics

Fluorescence Calibration

Determination of Background Fluorescence

Integrated System Findings

Future Hardware Design

Sufficiency of RPA Product for Cas12a Activation

Detection Threshold and Limit of Detection (LOD) Determination

Representation of the Core System

Enzyme Stability Model

Design and Setup

Parametrization Strategy

Here you can see sizes of different plant peroxidases

Model Predictions

Linking Stability to Performance

Key Finding

Methods & Tools

ODE solver:

Conclusion

References

Epidemiological Model

Goal

Methods

Model Overview

Model Architecture

Agent Characteristics

Population Structure

Disease Dynamics

Transmission Mechanism

Natural History

Partnership Dynamics

Partnership Formation

Partnerhsip Dissolution

Testing and Treatment

Testing Behavior

Diagnostic Performance

Behavioral Adaptation

Demographic Processes

Vital Statistics

Age Progression

Simulation Implementation

Temporal Resolution

Stochastic Elements

Data Collection

Model Validation

Parameter Sources

Software Implementation

Results

Current scenario

Sensitivity to current testing uncertainty

1) Lower bound (more conservative testing)

2) Estimated current testing (baseline)

3) Upper bound (more generous testing)

Modeling the effect of TRACE

Medium increase in testing rate

Large increase in testing rate

Entrepreneurship:

Overall impact

Model validation and discussion

Age-Specific Infection Patterns

Network Stability