Overview

Motivation

Our project was born from a dual motivation: a pressing global health challenge and a complex molecular engineering puzzle. Sexually Transmitted Infections (STIs) are a growing concern worldwide, and the most powerful tool in preventing their spread is frequent, accessible, and reliable testing. However, the journey to create such a test is fraught with difficulty. Engineering a robust, low-cost diagnostic requires orchestrating a complex network of biochemical reactions, where unforeseen interactions can lead to system failure. This reality, that a purely experimental approach would be slow, costly, and blind to many potential pitfalls, motivated us to adopt a model-driven engineering philosophy from the project's inception. We aimed to use computation not just to analyze our results, but to guide our every step.

Our Core Questions

Our entire modeling effort was designed to answer a set of fundamental questions:

  • Prediction: Can we computationally predict the behaviour and limiting factors of our complex biochemical system before committing to extensive lab work?
  • Validation: Can our model's predictions be anchored in real-world experimental data to prove its reliability?
  • Impact: What is the potential public health impact of successfully deploying our diagnostic, and can we quantify this to justify our design choices?
  • Viability: Looking toward a real-world product, what factors will affect the long-term stability and shelf-life of our test, and can we create a framework to predict them?

Strategy

To answer these questions, we devised a comprehensive, multi-scale strategy, approaching the problem from two directions simultaneously. Our "bottom-up" strategy focused on creating a detailed Reaction Model based on the mechanistic principles of biochemistry. By building this model in modular units, we could understand the molecular machinery of our test from first principles. Our "top-down" strategy involved developing an Epidemiological Model to understand the human-scale environment in which our test would operate. This dual strategy ensured our work was always connected, linking the performance of molecules in a test tube to the health of individuals in a community.

Our Journey

From Prediction to Validation and Design

The story of our modeling is a journey from uncertainty to a project-saving design. We began by tackling the "black box" problem of proprietary RPA kits. Using our model to perform a sensitivity analysis, we discovered that the system's performance was too sensitive to the concentration of SSB, providing our first critical insight.

The most crucial step was experimental validation. Our lab team's benchmark experiment yielded a result of 405 nM of amplified DNA, a value that fell squarely within the plausible range our model had predicted. This successful validation was the pivotal moment, giving us the confidence to trust our model's predictions.

That trust was essential when the integrated model made its most important finding: a "cis-cleavage deadlock", a real failure mode where the CRISPR enzyme would destroy the DNA faster than it could be amplified. The model didn't just find the problem - it gave us the solution. A temporal separation of the reactions would resolve the issue, leading us directly to propose a tangible "two-chamber" hardware design. This journey - from navigating uncertainty to experimental validation and a project-saving design -is the core of our model's success story.

Conclusion

Goals Achieved

Our modeling framework successfully answered the core questions we set out to investigate.

  • We demonstrated that a model can predict critical system failures before they happen in the lab.
  • We successfully validated our model's core engine against experimental data, establishing its credibility.
  • We quantified the potential public health impact of our test, providing a powerful justification for our work.
  • Finally, we created a suite of foundational frameworks that address everything from long-term stability to alternative detection methods.

Future Outlooks

The models we have built are dynamic tools for future innovation. We plan to use them to create a quantitative diagnostic tool, allowing a smartphone to estimate viral or bacterial load from the signal speed and intensity. Ultimately, we see our work as a step toward a global pathogen surveillance network, where data from low-cost tests like ours feeds into epidemiological models to predict and prevent future outbreaks.

Methods & Software

Our kinetic models were developed using a system of ordinary differential equations (ODEs) solved with the robust LSODA solver, implemented in Python with the SciPy and Matplotlib libraries. Our epidemiological model was developed as an agent-based model. Full details, equations, and parameter justifications can be found on the respective model pages.

Contribution

All models, the complete codebase, and simulation data are provided as open-source tools. We particularly offer our Enzyme Stability and Asymmetrical RPA frameworks as well-documented, foundational starting points for any future teams tackling the challenges of diagnostic shelf-life or alternative detection methods. You can access everything in our Git repository.

Reaction Model

Goal

The core of our project is a complex biochemical system involving multiple, interacting enzymatic reactions. To navigate this complexity and engineer a robust diagnostic, we developed a detailed mechanistic model of our entire reaction, from DNA amplification to signal generation. This "digital twin," built upon a system of Ordinary Differential Equations (ODEs), served three primary goals:

  1. To predict the system's dynamic behavior and identify potential failure modes before they were encountered in the lab.
  2. To guide our experimental design by identifying the most sensitive and critical parameters for optimization.
  3. To validate our final design by anchoring our simulations in real-world experimental data.

Our model was built in a modular fashion, starting with individual units for Recombinase Polymerase Amplification (RPA) and CRISPR-Cas12a, which were later integrated into a single, comprehensive system. For more information about these methods, please visit our Project Description Page.

Modules

To simulate our complex diagnostic test, we built our model from a set of distinct but interconnected modules. Each module represents a core biological process, which allowed us to develop and test them individually before combining them into a fully integrated system.

RPA Engine

Our model describes the generation of symmetrical RPA products (amplicons) using a system of ODEs derived from eight main chemical reactions that take place during the RPA process. We assumed identical reaction rates for the forward and reverse primers. Initially, we developed a mathematical model to simulate the RPA process on its own.

Design and Setup

The RPA process is based on the following reactions:

$$ m\,G + F \;\overset{k_{1f}/m}{\underset{k_{1r}/m}{\rightleftharpoons}}\; FG \tag{1} $$
$$ R + FG \;\overset{k_{\mathrm{eq2af}}}{\underset{k_{\mathrm{eq2ar}}/n}{\rightleftharpoons}}\; FGR_{\!p} \tag{2a} $$
$$ FGR_{\!p} \;\overset{k_{2b}}{\longrightarrow}\; FGR + G \tag{2b} $$
$$ FGR + (n-1)\,R_{\!p} \;\overset{k_{\mathrm{eq2cf}}/(n-1)}{\underset{k_{\mathrm{eq2cr}}/(n-1)}{\rightleftharpoons}}\; FGnR_{\!p} \tag{2c} $$
$$ FGnR_{\!p} \;\overset{k_{2d}}{\longrightarrow}\; FnR + (m-1)\,G \tag{2d} $$
$$ FnR + D + 2\,m\,G \;\overset{k_{3f}/n}{\underset{k_{3r}/n}{\rightleftharpoons}}\; FnRD \tag{3} $$
$$ FnRD + n\,ATP \;\overset{k_{4a}^{\mathrm{cat}}/n}{\longrightarrow}\; FD + n\,R + n\,AMP + n\,PP_i \tag{4a} $$
$$ FnRD + n\,ATP \;\overset{k_{4b}^{\mathrm{cat}}/n}{\longrightarrow}\; FD + n\,ADP + H_{3}PO_{4} \tag{4b} $$
$$ FD + P \;\overset{k_{5f}}{\underset{k_{5r}}{\rightleftharpoons}}\; PFD \tag{5} $$
$$ PFD + 2\,B\,dNTP \;\overset{k_{6f}}{\longrightarrow}\; P + 2\,B\,PP_i + 2\,DNA \tag{6} $$
$$ FD + n_{p}\,PP_i \;\overset{k_{7}}{\longrightarrow}\; D + n_{p}\,dNTP \tag{7} $$
$$ D + B\,PP_i \;\overset{k_{8}}{\longrightarrow}\; B\,dNTP + FD \tag{8} $$

From these reactions, we derived a set of ODEs that describe the temporal changes in the concentrations of all involved species:

$$ \begin{aligned} \frac{dR}{dt} &= -2\,k_{\mathrm{eq2af}}\,R\,(n\,FG) +2\,k_{\mathrm{eq2ar}}\,FGR_p -2\,k_{\mathrm{eq2cf}}\,R\,FGR_p +2\,k_{\mathrm{eq2cr}}\,FGnR_p \\ & \qquad +2\,k_{4a}^{\mathrm{cat}}\,FnRD +2\,k_{4b}^{\mathrm{cat}}\,FnRD \end{aligned} \tag{1} $$
$$ \frac{dFG}{dt} = \frac{1}{m}k_{1f}GF - \frac{1}{m}k_{1r}FG - k_{\mathrm{eq2af}}RnFG +\frac{1}{n}k_{\mathrm{eq2ar}}FGR_p \tag{2} $$
$$ \frac{dFGR_p}{dt} = k_{\mathrm{eq2af}}RnFG -\frac{1}{n}k_{\mathrm{eq2ar}}FGR_p - k_{2b}FGR_p \tag{3} $$
$$ \frac{dFGR}{dt} = k_{2b}FGR_p -\frac{1}{n-1}k_{\mathrm{eq2cf}}RFGR +\frac{1}{n-1}k_{\mathrm{eq2cr}}FGnR_p \tag{4} $$
$$ \frac{dFGnR_p}{dt} = \frac{1}{n-1}k_{\mathrm{eq2cf}}RFGR -\frac{1}{n-1}k_{\mathrm{eq2cr}}FGnR_p - k_{2d}FGnR_p \tag{5} $$
$$ \frac{dFnR}{dt} = k_{2d}FGnR_p +\frac{1}{n}k_{3r}FnRD -\frac{1}{n}k_{3f}FnR\,D \tag{6} $$
$$ \frac{dFnRD}{dt} = \frac{1}{n}k_{3f}FnR\,D -\frac{1}{n}k_{3r}FnRD -\frac{1}{n}k_{4a}^{\mathrm{cat}}FnRD -\frac{1}{n}k_{4b}^{\mathrm{cat}}FnRD \tag{7} $$
\begin{align} \frac{dD}{dt} &= -\frac{1}{n} k_{3f} F n R \, D + \frac{1}{n} k_{3r} F n R D + 2 k_{6f} P F D \notag \\ % <-- continue on next line, no number &\quad + 2 k_7 P P_i F D - 2 k_8 P P_i D - k_{\mathrm{cat}}^{\mathrm{cis}} \, \mathrm{Casactive} \tag{8} \end{align}
$$ \frac{dFD}{dt} = \frac{1}{n}k_{4a}^{\mathrm{cat}}FnRD +\frac{1}{n}k_{4b}^{\mathrm{cat}}FnRD - k_{5f}PFD + k_{5r}PFD - k_7PP_iFD + k_8PP_iD \tag{9} $$
$$ \frac{dP}{dt} = - k_{5f}PFD + k_{5r}PFD + k_{6f}PFD \tag{10} $$
$$ \frac{dPFD}{dt} = k_{5f}PFD - k_{5r}PFD - k_{6f}PFD \tag{11} $$
$$ \frac{d(\mathrm{dNTPs})}{dt} = -2Bk_{6f}PFD + 2Bk_8PP_iD + 2n_pk_7PP_iFD \tag{12} $$
$$ \frac{dPP_i}{dt} = 2k_{4a}^{\mathrm{cat}}FnRD +2Bk_{6f}PFD -2Bk_8PP_iD -2n_pk_7PP_iFD \tag{13} $$
$$ \frac{dF}{dt} = -\frac{1}{m}k_{1f}GF +\frac{1}{m}k_{1r}FG \tag{14} $$
$$ \frac{dG}{dt} = -\frac{1}{m}k_{1f}GF +\frac{1}{m}k_{1r}FG \tag{15} $$

RPA Parameters

The RPA parameters were taken from experimental studies [1] at a reaction temperature of 39 °C. The referenced study reports experiments performed under continuous stirring conditions. Since our experimental setup does not involve stirring, we needed to adjust the parameters for reactions 2a, 2b, and 3, as these are diffusion-dependent. In the absence of stirring, a reduced amplification rate is expected.

To account for this difference, we applied the correction factor (Sf) described in [1], leading to the following adjusted parameters:

$$ \mathrm{Sf} = 8.05 \times 10^{-5} \times \mathrm{RPM} + 0.347 $$
  • \( k_{\mathrm{eq2af}} \rightarrow \mathrm{Sf} \cdot k_{\mathrm{eq2af}} \)
  • \( k_{\mathrm{eq2cf}} \rightarrow \mathrm{Sf} \cdot k_{\mathrm{eq2cf}} \)
  • \( k_{3\mathrm{f}} \rightarrow \mathrm{Sf} \cdot k_{3\mathrm{f}} \)

Because our experiments were carried out without stirring (RPM = 0), the respective reaction rate constants were scaled accordingly.

Parameter Definition Value Unit Reference
\( B \)Number of base pairs in template397--Experimental design
\( n \)Number of R-sites on primer8--Model assumption*
\( m \)Number of GP32 binding sites on primer4--Model assumption*
\( n_{\mathrm{p}} \)Number of base pairs in primer32--Experimental design
\( k_{1\mathrm{f}} \)Forward rate constant for Eq. (1)5×10⁸\( \mathrm{M^{-1}} \)(1)
\( k_{1\mathrm{r}} \)Reverse rate constant for Eq. (1)5×10³\( \mathrm{s^{-1}} \)(1)
\( k_{\mathrm{eq2af}} \)Forward rate constant for Eq. (2a)3.47×10⁷\( \mathrm{M^{-1}s^{-1}} \)(1)
\( k_{\mathrm{eq2ar}} \)Reverse rate constant for Eq. (2a)1.471\( \mathrm{s^{-1}} \)(1)
\( k_{2\mathrm{b}} \)Rate constant for Eq. (2b)4.7×10⁻²\( \mathrm{s^{-1}} \)(1)
\( k_{\mathrm{eq2cf}} \)Forward rate constant for Eq. (2c)3.47×10⁷\( \mathrm{M^{-1}s^{-1}} \)(1)
\( k_{\mathrm{eq2cr}} \)Reverse rate constant for Eq. (2c)331\( \mathrm{s^{-1}} \)(1)
\( k_{2\mathrm{d}} \)Rate constant for Eq. (2d)4.6×10⁻³\( \mathrm{s^{-1}} \)(1)
\( k_{3\mathrm{f}} \)Forward rate constant for Eq. (3)3.47×10⁷\( \mathrm{M^{-1}s^{-1}} \)(1)
\( k_{3\mathrm{r}} \)Reverse rate constant for Eq. (3)59.37\( \mathrm{s^{-1}} \)(1)
\( k_{4\mathrm{a}}^{\mathrm{cat}} \)Forward rate constant for Eq. (4a)4.22\( \mathrm{s^{-1}} \)(1)
\( k_{4\mathrm{b}}^{\mathrm{cat}} \)Forward rate constant for Eq. (4b)8.32\( \mathrm{s^{-1}} \)(1)
\( k_{5\mathrm{f}} \)Forward rate constant for Eq. (5)1.2×10⁷\( \mathrm{M^{-1}s^{-1}} \)(1)
\( k_{5\mathrm{r}} \)Reverse rate constant for Eq. (5)6.0×10⁻²\( \mathrm{s^{-1}} \)(1)
\( k_{6\mathrm{f}} \)Forward rate constant for Eq. (6)87\( \mathrm{s^{-1}} \)(1)
\( k_{7} \)Forward rate constant for Eq. (7)14.02\( \mathrm{M^{-1}s^{-1}} \)(1)
\( k_{8} \)Forward rate constant for Eq. (8)1.13\( \mathrm{M^{-1}s^{-1}} \)(1)

* The parameters \( n \) and \( m \) represent the number of R-sites and the number of GP32 binding sites per primer, respectively. Since no direct experimental measurements are available, \( n = 8 \) and \( m = 4 \) were chosen as model assumptions to reflect a typical primer architecture and to simplify the kinetic scheme.

These are the variables that were used in our RPA model:

Variable Description
\( F \)Forward primer
\( G \)Single stranded DNA binding protein (Gp32)
\( R \)Recombinase complex
\( FG \)Forward primer/Gp32 complex
\( FGR_p \)Unstable Forward primer/Gp32/recombinase complex
\( FGR \)Stable Forward primer/Gp32/recombinase complex
\( FGnR_p \)Forward primer/Gp32 with unstable \( n \) recombinase
\( FnR \)Forward primer complexed with unstable filament of \( n \) recombinase molecules
\( FnRD \)Forward primer complexed with \( n \) recombinase molecules with DNA
\( D \)Template DNA (the species we want to track for amplification)
\( FD \)Forward primer/DNA complex
\( P \)Polymerase
\( PFD \)Polymerase/forward primer/DNA complex
\( dNTPs \)Deoxynucleotides
\( PP_i \)Pyrophosphate (byproduct of polymerization)
Table 2: Overview of the variables used in the RPA model and their corresponding descriptions.

Sensitivity analysis for unknown parameters

To generate data that reflect our experimental laboratory setup, it was important for us to use the same concentrations in the model as in the actual experiments. A key challenge in using commercial diagnostic kits is that component concentrations are often proprietary. To overcome this uncertainty, we strategically used our model to perform a comprehensive sensitivity analysis. By simulating a wide range of concentrations for unknown components like the Single-Strand Binding Protein (SSB), polymerase, and recombinase, we could identify which factors had the most significant impact on performance. This allowed us to de-risk our design by understanding its potential failure points.

The primer concentration was predetermined and set to 10 µM, matching the conditions used in our laboratory experiments. In contrast, the concentrations of the single-stranded DNA-binding protein (SSB; represented by variable G in the model), the polymerase (P), and the recombinase (R) were not specified by the manufacturer.

The following graphs illustrate how variations in these individual concentrations affect both the final DNA concentration produced within 30 minutes and the overall reaction rate.

Graph
Figure 1: RPA spectrum for different initial G (Gp32) concentrations.
Agarose Gel
Figure 2: RPA spectrum for different initial P (polymerase) concentrations.
Agarose Gel
Figure 3: RPA spectrum for different initial R (recombinase) concentrations.

The numerical impact of these variations can be summarized as follows:

Species Min DNA (nM) Max DNA (nM)
\( G \) 44.963 2364.463
\( P \) 436.662 449.655
\( R \) 447.942 447.942
Table 3: Minimum and maximum DNA concentrations obtained for different initial species concentrations.

This shows that the model is most sensitive to changes in the concentration of SSB, while the other two unknown concentrations play a comparatively minor role. A similar effect has also been reported in experimental studies [4].

To gain a deeper understanding of our system, we additionally varied the concentrations of other components such as dNTPs, DNA, and the primers.

Agarose Gel
Figure 4: RPA spectrum for different initial F (primer) concentrations.
Graph
Figure 5: RPA spectrum for different initial D (template DNA) concentrations.
Graph
Figure 6: RPA spectrum for different initial dNTP concentrations.

The numerical impact of these additional variations can be summarized as follows:

Species Min DNA (nM) Max DNA (nM)
\( F \) 447.941 447.943
\( D \) 447.941 447.943
\( \mathrm{dNTP} \) 447.942 447.943
Table 4: Minimum and maximum DNA concentrations obtained for different initial species concentrations.

It can be observed that, within the tested range of DNA, dNTP and primer concentration, no significant change in the final product occurs. This underlines the robustness and efficiency of the RPA process. A more detailed analysis of varying DNA concentrations will be presented in the Limit of Detections (LOD) analysis.

Concentration Profiles of Multiple Components

In this part, we show the concentration profiles of various components over the course of the RPA reaction. To avoid presenting an excessive number of graphs, we selected one representative set of initial concentrations for the following simulations. This ensures better comparability between the graphs. The same set of initial concentrations is used throughout the remainder of this section.

The initial concentrations used at the start of the simulation were as follows:

Species Initial concentration [M]
\( F \)1.00 × 10⁻⁵
\( G \)4.50 × 10⁻⁷
\( R \)5.90 × 10⁻⁶
\( FG \)0
\( FGR_p \)0
\( FGR \)0
\( FGnR_p \)0
\( FnR \)0
\( FnRD \)0
\( D \)3.80 × 10⁻¹³
\( FD \)0
\( P \)1.50 × 10⁻⁸
\( PFD \)0
\( \mathrm{dNTPs} \)2.40 × 10⁻⁶
Table 5: Initial concentrations of all species used in the RPA simulation.
Agarose Gel
Figure 7: Concentration change of FnRD over the course of the RPA reaction.
Agarose Gel
Figure 8: Concentration change of FGR_p over the course of the RPA reaction.
Agarose Gel
Figure 9: DNA amplification over the course of the RPA reaction.
Agarose Gel
Figure 10: Concentration change of FnR over the course of the RPA reaction.
Agarose Gel
Figure 11: Concentration change of FGnR_p over the course of the RPA reaction.
Agarose Gel
Figure 12: Concentration change of FGR over the course of the RPA reaction.

CRISPR-Cas12a Detection System

One of our detection methods is the CRISPR-Cas12 approach using the LbCas12a enzyme. The model realistically simulates both the generation of DNA copies and the resulting signal. The colorimetric signal is produced when a fluorescently labeled reporter is cleaved. This cleavage process is triggered once a critical amount of target DNA is present.

Using the model, we can calculate the temporal dynamics of reporter cleavage and then convert these into fluorescence values using an experimentally determined calibration factor.

Design and Setup

Similar to our RPA module, we used a system of ODEs derived from the main chemical reactions that take place during the CRISPR/Cas12a process.

The Cas12a reactions:

$$ \underbrace{[\mathrm{Cas}]_{\mathrm{inactive}}}_{[\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}}} + D \;\overset{k_{1}^{\mathrm{cis}}}{\underset{k_{-1}^{\mathrm{cis}}}{\rightleftharpoons}}\; C_{\mathrm{cis}} \;\overset{k_{\mathrm{cat}}^{\mathrm{cis}}}{\longrightarrow}\; [\mathrm{Cas}]_{\mathrm{active}} + D^{\ast} \tag{1C} $$
$$ [\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{Reporter}]_{\mathrm{free}} \;\overset{k_{1}^{\mathrm{trans}}}{\underset{k_{-1}^{\mathrm{trans}}}{\rightleftharpoons}}\; C_{\mathrm{trans}} \;\overset{k_{\mathrm{cat}}^{\mathrm{trans}}}{\longrightarrow}\; [\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{clDR}] \tag{2C} $$

The Cas12a ODEs:

$$ \frac{d[\mathrm{Cas}]_{\mathrm{active}}}{dt} = k_{\mathrm{cat}}^{\mathrm{cis}} \frac{[\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}}}{} \frac{D}{K_{m}^{\mathrm{cis}} + D} \tag{1C} $$
$$ \frac{d[\mathrm{clDR}]}{dt} = \frac{k_{\mathrm{cat}}^{\mathrm{trans}} \,[\mathrm{Cas}]_{\mathrm{active}} \,[\mathrm{Reporter}]_{\mathrm{free}}} {K_{m}^{\mathrm{trans}} + [\mathrm{Reporter}]_{\mathrm{free}}} \tag{2C} $$

$$ [\mathrm{Reporter}]_{\mathrm{free}} = [\mathrm{Reporter}]_{\mathrm{total}} - [\mathrm{clDR}],\quad [\mathrm{Cas}]_{\mathrm{inactive}} = [\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}} $$

An overview of the variables used in the Cas12a model is given below:

Variable Description
\([\mathrm{Cas}]_{\mathrm{total}}\) Total concentration of Cas12a protein (active + inactive)
\([\mathrm{Cas}]_{\mathrm{inactive}}\) Inactive Cas12a pool before binding to target DNA
\([\mathrm{Cas}]_{\mathrm{active}}\) Activated Cas12a after cis binding and cleavage
\(D\) Target DNA recognized by Cas12a (activator)
\(D^{\ast}\) Cleaved target DNA after cis activation
\(C_{\mathrm{cis}}\) Intermediate Cas12a-DNA complex in cis activation
\(C_{\mathrm{trans}}\) Cas12a-reporter encounter complex in trans cleavage
\([\mathrm{Reporter}]_{\mathrm{total}}\) Total concentration of reporter molecules
\([\mathrm{Reporter}]_{\mathrm{free}}\) Unbound (uncleaved) reporter concentration
\([\mathrm{clDR}]\) Concentration of cleaved reporter molecules (fluorescent signal)
Table 6: Overview of the variables used in the Cas12a model.

Furthermore, the following table helps to identify the Cas12a parameters used in our equations and to clarify their meaning.

Parameter Definition Value Unit Reference
\(k_{1}^{\mathrm{cis}}\) Association rate (inactive Cas12a \(+\; D \rightarrow C_{\mathrm{cis}}\)) -- \(\mathrm{M^{-1}\,s^{-1}}\) Model assumption\(^*\)
\(k_{-1}^{\mathrm{cis}}\) Dissociation rate (\(C_{\mathrm{cis}} \rightarrow\) inactive Cas12a \(+\; D\)) -- \(\mathrm{s^{-1}}\) Model assumption\(^*\)
\(k_{\mathrm{cat}}^{\mathrm{cis}}\) Turnover in cis (activation on target DNA) 0.054 \(\mathrm{s^{-1}}\) (5)
\(K_{m}^{\mathrm{cis}}\) Michaelis constant for cis activation (w.r.t. \(D\)) 0.05 \(\mathrm{nM}\) (6)
\(k_{1}^{\mathrm{trans}}\) Association rate (\([\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{Reporter}]_{\mathrm{free}} \rightarrow C_{\mathrm{trans}}\)) -- \(\mathrm{M^{-1}\,s^{-1}}\) (6)\(^*\)
\(k_{-1}^{\mathrm{trans}}\) Dissociation rate (\(C_{\mathrm{trans}} \rightarrow [\mathrm{Cas}]_{\mathrm{active}} + [\mathrm{Reporter}]_{\mathrm{free}}\)) -- \(\mathrm{s^{-1}}\) (6)\(^*\)
\(k_{\mathrm{cat}}^{\mathrm{trans}}\) Turnover in trans (reporter cleavage) 3.85 \(\mathrm{s^{-1}}\) (3)
\(K_{m}^{\mathrm{trans}}\) Michaelis constant for trans cleavage (w.r.t. reporter) 27.29 \(\mathrm{nM}\) (3)

* Following study (6), Cas-reporter binding occurs so rapidly that it can be considered quasi-instantaneous. We adopt the same simplification for the cis association/dissociation steps; therefore explicit numerical values for \(k_{1}^{\mathrm{cis}}\), \(k_{-1}^{\mathrm{cis}}\), \(k_{1}^{\mathrm{trans}}\), and \(k_{-1}^{\mathrm{trans}}\) are not specified.

Table 7: Cas12a model parameters, their definitions, values, and references.

And here is an overview of the initial concentrations used in the Cas12a simulation:

Species Initial concentration [M]
\([\mathrm{Cas}]_{\mathrm{total}}\) \(1.0 \times 10^{-8}\)
\([\mathrm{Cas}]_{\mathrm{inactive}}\) \(1.0 \times 10^{-8}\)
\([\mathrm{Cas}]_{\mathrm{active}}\) \(0\)
\([\mathrm{Reporter}]_{\mathrm{total}}\) \(1.0 \times 10^{-7}\)
\([\mathrm{Reporter}]_{\mathrm{free}}\) \(1.0 \times 10^{-7}\)
\([\mathrm{clDR}]\) \(0\)
\(D\) \(3.8 \times 10^{-13}\)
\(D^{\ast}\) \(0\)
\(C_{\mathrm{cis}}\) \(0\)
\(C_{\mathrm{trans}}\) \(0\)
Table 8: Initial concentrations of all species used in the Cas12a simulation.

LbCas12a kinetics

To accurately parameterize our model, we first needed to determine the Michaelis-Menten kinetics of \(k_{\mathrm{cat}}^{\mathrm{trans}}\) and \(K_{m}^{\mathrm{trans}}\) of our specific LbCas12a enzyme.

$$ \frac{d[\mathrm{clDR}]}{dt} = k_{cat}^{\mathrm{trans}} \cdot [\mathrm{Cas}]_{\mathrm{active}} \cdot \frac{[\mathrm{Reporter}]_{\mathrm{free}}} {K_{m}^{\mathrm{trans}} + [\mathrm{Reporter}]_{\mathrm{free}}} $$

To do that, our Wet Lab first conducted an experiment to determine the Michaelis-Menten kinetics of our LbCas12a enzyme. Upon critical review of the initial data, we determined it was not suitable for a reliable kinetic fit. The experiment had yielded measurements at only two substrate concentrations, which is insufficient to accurately estimate Michaelis-Menten parameters \(K_{m}^{\mathrm{trans}}\)​ and \(k_{\mathrm{cat}}^{\mathrm{trans}}\).

Holding our model to a high standard of scientific accuracy, we made the decision not to use this preliminary data.To maintain project velocity and avoid experimental bottlenecks, our team immediately pivoted to a robust literature-based parameterization strategy. We performed a thorough search and sourced kinetic data from a study investigating the same enzyme under comparable buffer conditions [3]. This adaptive approach ensured our model was built on a reliable and scientifically valid foundation, allowing our project to proceed without compromising its accuracy.

We used the dsDNA data from this study, since our target is also double-stranded DNA. Based on this, we parameterized our model using \(k_{\mathrm{cat}}^{\mathrm{trans}} \approx 3.85\ \mathrm{s}^{-1}\) and an appropriate Km value from the literature.

Fluorescence Calibration

Our simulation calculates the concentration of cleaved reporters over time based on the turnover number (\(k_{\mathrm{cat}}^{\mathrm{trans}}\)) of Cas12a. However, to relate these values to the actual fluorescence signal measured in experiments, a linear calibration factor is required. To determine this factor, the Wet Lab conducted the following experiment:

Experiment:

  • A dilution series of known reporter concentrations was prepared.
  • The fluorescence intensity of each dilution was measured.
  • The measured fluorescence intensity was plotted against the reporter concentration.

Subsequently, a linear regression was applied, resulting in a line whose slope describes the relationship between reporter concentration and fluorescence intensity.

This calibration factor enables us to convert simulated reporter concentrations directly into fluorescence values that can be compared with experimental data.

Agarose Gel
Figure 13: Calibration factor of fluorescence vs. reporter concentration with linear fit.

Determination of Background Fluorescence

In order to distinguish true signal from background noise, we needed to determine the background fluorescence generated by the complete system in the absence of target DNA, i.e., under conditions where no actual cleavage and signal generation occur. To do this, our Wet Lab performed the following experiment:

Experiment:

  • Cas12a was incubated with the fluorescently labeled reporter in the absence of target DNA.
  • Multiple replicates were measured to obtain the characteristic background signal.

Afterwards, we calculated the mean of these measurements and defined it as the background fluorescence.

The effective fluorescence signal was then calculated as: Signal = Total fluorescence − Background fluorescence

Integrated System Findings

Future Hardware Design

When we integrated our parameterized RPA and Cas12a modules, our model made a critical prediction: the combined system would fail to produce a signal.

LbCas12a not only cleaves the reporter in trans, but also cuts the target dsDNA in cis, i.e., at the target site itself. This cis-cleavage reaction occurs before the trans-cleavage is initiated and has two important consequences:

  1. It activates Cas12a by binding and cleaving the target DNA.
  2. It simultaneously reduces the amount of DNA produced by RPA, because the target DNA is itself degraded in the process.

To incorporate this into our model, we parameterized the cis-cleavage kinetics separately. A study [5] reported \(k_{\mathrm{cat}}^{\mathrm{cis}}\) values for LbCas12a ranging from 0.0024 to 0.054 \(\mathrm{s}^{-1}\). To reflect a worst-case scenario (maximum inhibition by cis-cleavage), we used \(k_{\mathrm{cat}}^{\mathrm{cis}} = 0.054\ \mathrm{s}^{-1}\). Another study [6] provided indirect evidence suggesting that the \(K_{m}^{\mathrm{cis}}\) of LbCas12a is below 5 nM. Based on this indication, we chose a \(K_{m}^{\mathrm{cis}}\) value of 0.05 nM, again to model a conservative scenario.

In our ODE model, Cas12a activation and cis-cleavage are described using Michaelis-Menten kinetics:

$$ \frac{d[\mathrm{Cas}]_{\mathrm{active}}}{dt} = k_{cat}^{\mathrm{cis}} \cdot \big([\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}}\big) \cdot \frac{D}{K_{m}^{\mathrm{cis}} + D} $$

This formulation reflects the gradual activation of Cas12a as more DNA becomes available, rather than assuming an instantaneous activation. The cis-reaction consumes DNA over time, reducing the pool of RPA amplicons available for trans-cleavage. This effect is particularly strong in the early amplification phase, when RPA product concentrations are still low.

Using these parameters, our model predicted no detectable fluorescence signal when Cas12a is active from the beginning of the reaction. The cis-cleavage removes DNA faster than RPA can produce it at early time points, resulting in a “deadlock” situation where Cas12a never reaches sufficient activation to cleave reporters in trans.

Agarose Gel
Figure 14: RPA-CRISPR simulation showing no detectable signal.

To overcome this, we reasoned that temporally separating amplification and detection could avoid this early inhibition. By allowing the RPA reaction to proceed for several minutes before introducing Cas12a, a sufficient amount of DNA accumulates. Once Cas12a is introduced, cis-cleavage no longer significantly impacts the reaction because the DNA concentration is already high. To test this, we implemented a delay parameter in our model, where Cas12a activation starts only after a defined time (e.g., 300 s). The simulations clearly show that a short delay is sufficient to allow reporter cleavage to proceed efficiently.

Agarose Gel
Figure 15: A short Cas12a delay (100 s) results in a weak signal.
Agarose Gel
Figure 16: A 120 s Cas12a delay produces a slightly stronger signal as DNA amplification has progressed further.
Agarose Gel
Figure 17: At 150 s delay, reporter cleavage becomes more pronounced as DNA concentration increases.
Agarose Gel
Figure 18: A 180 s Cas12a delay yields a strong and rapid fluorescence signal following sufficient DNA accumulation.
Agarose Gel
Figure 19: A 200 s Cas12a delay leads to full reporter cleavage.
Agarose Gel
Figure 20: A 250 s Cas12a delay leads to full reporter cleavage and full DNA amplification.
Agarose Gel
Figure 21: A 300 s Cas12a delay leads to full reporter cleavage.

This modeling insight was critical. It provided a clear, actionable design principle for our hardware: a two-chamber system that temporally separates amplification and detection. In such a setup, RPA would run first in isolation, allowing sufficient DNA to accumulate. After a defined incubation period, the sample would be transferred into the Cas12a reaction chamber, initiating detection only after amplification is well underway. This temporal separation could be achieved through simple means such as lateral flow or paper-based microfluidic systems, where controlled migration of the reaction mixture triggers Cas12a activity after a predefined delay. Our model predicts that this approach would prevent early cis-cleavage from depleting the target DNA and would enable robust signal generation. By identifying this failure mode in silico, the model saved us time and resources that would have been spent troubleshooting failed wet lab experiments.

Starting from this point, the following sections will address the case in which the reaction is separated, with Cas12a being activated 300 seconds after the start of the RPA.

Sufficiency of RPA Product for Cas12a Activation

To assess whether a range of RPA product concentrations is sufficient to make reliable predictions, we calculated the minimum amount of RPA-generated DNA required to activate Cas12a strongly enough to cleave all reporters and generate the maximal signal within 30 minutes. This threshold was found to be 0.042 nM of RPA product. When varying the G concentration, the lowest resulting RPA product concentration was 44.963 nM, which is roughly 1000 times higher than the minimum required amount. This indicates that, under all tested conditions, the amount of amplified DNA is more than sufficient to fully activate Cas12a, meaning that variations in RPA yield within this range are unlikely to affect the final signal intensity.

Detection Threshold and Limit of Detection (LOD) Determination

Threshold Calculation

The detection threshold was determined following the approach described in [3] using the formula:

$$ \text{Threshold} = \mu_{\mathrm{blank}} + K \cdot \sigma_{\mathrm{blank}} $$

where \( \mu_{\mathrm{blank}} \) is the mean fluorescence of blank samples, \( \sigma_{\mathrm{blank}} \)​ is the corresponding standard deviation, and \(K\) is a confidence factor. Following [3], we set \(K = 3\), which corresponds to a 99.9 % confidence level that blank measurements lie below the detection threshold. Higher \(K\) values result in more conservative thresholds but increase statistical confidence.

The blank measurements used for this calculation were identical to those used to determine the background fluorescence. Based on these values, the detection threshold was set to 1473 a.u.

Simulation Constraints:

Simulations were performed across a range of starting DNA concentrations to identify the lowest input at which the signal exceeded the threshold. While the search formally started at \(1 \times 10^{-22}\ \mathrm{M}\), concentrations below the equivalent of a single molecule per reaction are biologically meaningless. The simulation was therefore constrained to exclude values below this physical lower bound.

For the assay volume of 47.5\( \mu\mathrm{L} \), one DNA molecule corresponds to a concentration step of \(3.496 \times 10^{-20}\ \mathrm{M}\). Accordingly, limits are expressed in whole numbers of molecules, not arbitrary decimal concentrations.

LOD Determination

Using Monte Carlo simulations incorporating Poisson-distributed start copies, an initiation gate, measurement noise, and the detection rule “threshold crossed before 30 min,” the continuous search yielded a limit at 7.47 expected copies.

Poisson distribution of input molecules: For a nominal initial concentration \(C_0\) and reaction volume V, the expected number of input DNA molecules is

$$ \lambda = C_{0}\, V\, N_{\mathrm{A}}, $$

where \(N_{\mathrm{A}}\) is Avogadro’s number. The actual number of molecules N used in each simulation is drawn from a Poisson distribution

$$ N \sim \mathrm{Poisson}(\lambda) $$

Initiation gate: Only reactions with at least Nmin seed molecules can proceed. Given \(N \ge N_{\min}\)​, each seed initiates amplification with probability \(p_{\mathrm{seed}}\)​. The probability that the reaction successfully initiates is

$$ p_{\mathrm{init}}(N) = 1 - \bigl(1 - p_{\mathrm{seed}}\bigr)^{N}. $$

A Bernoulli draw with success probability \(p_{\mathrm{init}}(N)\) determines whether the reaction proceeds.

Conversion between copies and molar concentration uses

$$ C = \frac{N}{V\,N_{\mathrm{A}}}, \qquad N = [\lambda]. $$

Detection rule: A replicate is considered positive if the simulated fluorescence signal \(F(t)\) crosses the detection threshold Fth before \(t_{\max}\) = 1800 s,

$$ \exists\, t^{\ast} \le t_{\max}\ \text{such that}\ F\!\left(t^{\ast}\right) \ge F_{\mathrm{th}} . $$

Cas12a activation and Michaelis-Menten cleavage: After a fixed delay \( t_{\mathrm{delay}} \)=300 , the concentration of activated Cas12a,\( [\mathrm{Cas}]_{\mathrm{active}}(t) \), evolves according to

$$ \frac{d[\mathrm{Cas}]_{\mathrm{active}}}{dt} = k_{cat}^{\mathrm{cis}} \left([\mathrm{Cas}]_{\mathrm{total}} - [\mathrm{Cas}]_{\mathrm{active}}\right) \frac{[D](t)}{K_{m}^{\mathrm{cis}} + [D](t)} \,. $$

and the cleaved reporter concentration \([\mathrm{clDR}](t)\) follows a Michaelis-Menten trans-cleavage equation

$$ \frac{d[\mathrm{clDR}]}{dt} = \frac{k_{cat}^{\mathrm{trans}}\,[\mathrm{Cas}]_{\mathrm{active}}(t)\, \big([\mathrm{Reporter}]_{\mathrm{total}} - [\mathrm{clDR}](t)\big)} {K_{m}^{\mathrm{trans}} + \big([\mathrm{Reporter}]_{\mathrm{total}} - [\mathrm{clDR}](t)\big)} \,. $$

Limit of detection (LOD): The LOD is defined as the smallest nominal concentration \( C_{\mathrm{LOD}} \) for which the detection probability reaches the target level ptarget=0.95,

$$ \mathbf{P}_{\mathrm{det}}\!\big(C_{\mathrm{LOD}}\big) \ge p_{\mathrm{target}}\,. $$

where \(P_{\mathrm{det}}(C)\) is estimated by repeated stochastic simulation (Poisson input, initiation gate, measurement noise, deterministic kinetics).

The continuous search yielded \(\lambda_{\mathrm{LOD}}\) = 7.47 expected molecules, which was quantized up to the next integer number of input molecules

$$ N_{\mathrm{LOD}} = [\lambda_{\mathrm{LOD}}] = 8\,. $$

corresponding to a concentration of

$$ C_{\mathrm{LOD,quant}} = \frac{N_{\mathrm{LOD}}}{V\,N_{\mathrm{A}}} = 2.80 \times 10^{-19}\,\mathrm{M}. $$

in V=47.5 \(\mu\mathrm{L}\).

Cas12a remained inactive for the first 300 s, preventing any signal from crossing the threshold before this time.

Agarose Gel
Figure 22: Detection probability as a function of input DNA concentration. The dashed line indicates the 95 % target probability used to define the LOD.

Time-to-Threshold Distribution and Validation

At the LOD of 8 copies, the median time-to-threshold was 548.7 s (9.15 min), i.e., about 4 min after Cas activation. The distribution tail was modest, with the 90th and 95th percentiles at 663.2 s (11.05 min) and 744.0 s (12.40 min), respectively.

A validation experiment with 40 replicates at 8 copies yielded 39/40 detections (97.5 %) within the 30 min assay window. This slightly exceeds the 95 % detection target because the rounded input (8 copies) is marginally higher than the theoretical continuous boundary (7.47 copies).

Summary:

  • Threshold: 1473 a.u. (\(K\) = 3, 99.9 % confidence)
  • Physical resolution: 1 molecule = \(3.496\times10^{-20}\,\mathrm{M}\)
  • LOD: 8 copies = \(2.80\times10^{-19}\,\mathrm{M}\)
  • Median detection time: 9.15 min
  • Detection rate at LOD: 97.5 % within 30 min

Representation of the Core System

An overview of the core system is shown in the following figures:

Agarose Gel
Figure 23: Illustration of Recombinase Polymerase Amplification mechanism (Created with BioRender [9]).
Agarose Gel
Figure 24: Illustration of Cas12a mechanism, showing cis-and trans-cleavage activity followed by fluoresence signal generation (Created with BioRender [9])

Enzyme Stability Model

A successful diagnostic must be reliable not only on the day it is made but also after months of storage and transport. To address this critical real-world requirement, we developed an Enzyme Stability Model. This model is designed to predict the loss of enzyme activity over time under different temperature conditions, allowing us to forecast the shelf-life of our test and to quantitatively validate our choice of freeze- drying approach. Connect Entrepreneurship

Design and Setup

The framework is grounded in established thermodynamic and kinetic principles. The temperature dependence of the enzyme denaturation rate, \(k_{d}(T)\), is described by the Arrhenius equation. The subsequent loss of enzyme activity over time, \(A(t)\), is then modeled as an irreversible first-order decay process. This formulation allows us to simulate the impact of arbitrary temperature histories—such as a cool storage phase followed by a room-temperature usage phase—by calculating the cumulative loss of activity in discrete time steps.

This is an overview of the key variables, parameters, and equations used in the model:

Symbol Description Value Unit
\(k_{d,\mathrm{ref}}\) Denaturation rate at \(T_{\mathrm{ref}}\) \(2.60 \times 10^{-3}\) \(\mathrm{min^{-1}}\)
\(E_a\) Activation energy of denaturation \(1.22 \times 10^{5}\) \(\mathrm{J\,mol^{-1}}\)
\(R\) Gas constant \(8.314\) \(\mathrm{J\,mol^{-1}\,K^{-1}}\)
\(T_{\mathrm{ref}}\) Reference temperature \(323.15\) \((50^\circ\mathrm{C})\) \(\mathrm{K}\)
\(T_{\mathrm{home}}\) Home/room temperature \(295.15\) \((22^\circ\mathrm{C})\) \(\mathrm{K}\)
\(T_{\mathrm{storage}}\) Storage temperature \(278.15\) \((5^\circ\mathrm{C})\) \(\mathrm{K}\)
\(\Delta t\) Timestep \(1\) \(\mathrm{min}\)
\(A_0\) Initial activity \(100\) \(\%\)
\(A(t)\) Remaining enzyme activity (over time) \(\%\)
Table 9: Table summarizing key parameters and variables used to model temperature-dependent enzyme denaturation rates.
$$ k_{d,\mathrm{home}} \;=\; k_{d,\mathrm{ref}}\, \exp\!\left(\frac{E_a}{R}\left(\frac{1}{T_{\mathrm{ref}}}-\frac{1}{T_{\mathrm{home}}}\right)\right) \tag{1E} $$
$$ k_{d,\mathrm{storage}} \;=\; k_{d,\mathrm{ref}}\, \exp\!\left(\frac{E_a}{R}\left(\frac{1}{T_{\mathrm{ref}}}-\frac{1}{T_{\mathrm{storage}}}\right)\right) \tag{2E} $$

Temperature Dependence of the Denaturation Rate

The temperature-dependent denaturation rate \(k_{d}(T)\) follows the Arrhenius equation:

\[ k_{d}(T) = k_{d,\mathrm{ref}}\, \exp\!\left[ \frac{E_a}{R}\left(\frac{1}{T_{\mathrm{ref}}} - \frac{1}{T}\right) \right] \]

Activity Decay Over Time

Enzyme inactivation is modeled as a first-order irreversible process:

\[ \frac{dA}{dt} \;=\; -\,k_{d}(T)\,A(t) \]

with \(A(t)\) the remaining activity (in %) at time \(t\). For discrete time steps \(\Delta t\):

\[ A(t+\Delta t) \;=\; A(t)\,\exp\!\left[-\,k_{d}(T)\,\Delta t\right] \]

This update allows arbitrary temperature profiles \(T(t)\): at each step, read \(T(t)\), compute \(k_{d}(T)\), and advance A.

Temperature Profile

A simple two-phase storage/usage profile:

\[ T(t) = \begin{cases} T_{\mathrm{storage}}, & t < t_{\mathrm{storage}},\\[4pt] T_{\mathrm{home}}, & t \ge t_{\mathrm{storage}}. \end{cases} \]

where \(T_{\mathrm{storage}}\) is the storage temperature,​ the storage duration, and \(T_{\mathrm{home}}\) the usage temperature (e.g., room temperature).

Parametrization Strategy

A key modeling choice was to identify which of the many proteins in our system would be the primary limiting factor for shelf-life. We adopted a simplified single-bottleneck assumption, treating one protein's decay as the pace-setter for the entire assay. Protein size often correlates with thermal robustness [7] . Although UvsY is the smallest component, it typically operates in a stabilizing complex with UvsX. We therefore select Gp32 (SSB), the smallest component not obligatorily stabilized by a partner, as the practical pace-setter for shelf-life prediction: if Gp32 remains functional over a given temperature history, the assay is expected to remain functional as well.

This choice is strongly supported by our earlier sensitivity analysis, which showed that the concentration of G (our Gp32/SSB component) has the most significant influence on the model's output, implying that a decline in G activity will dominate assay performance. This is an explicit modeling assumption and simplification.

Protein Size (Number of amino acids) Source
LbCas12a 1228 [10]
UsvY 137 [11]
UsvX 391 [12]
Gp32 301 [13]
Bsu Pol I 880 [14]
Table 10: Proteins of different sizes used in our simulations.

Direct measurement of \(E_a\) and \(k_{d,\mathrm{ref}}\)​ for isolated Gp32 was not feasible because the available kits bundle multiple proteins, and our literature search did not yield Gp32-specific stability constants.

To overcome this challenge, we adopted a size-matched proxy parameters from plant peroxidases of comparable length. In particular, we used the values reported in a scientific paper [8] for Eichhornia crassipes leaf peroxidase - \(E_a \approx 122\ \mathrm{kJ\,mol^{-1}}\) and \(k_{d,\mathrm{ref}} \approx 2.6\times10^{-3}\ \mathrm{min^{-1}}\ \text{(at }50^\circ\mathrm{C)}\).

Because we could not locate the exact sequence length of that enzyme, we surveyed lengths of related plant peroxidases, which cluster around ∼300 amino acids; based on this, we assume the Eichhornia peroxidase is of similar size (~300 amino acids) and use it as a pragmatic proxy for Gp32.

UniProt ID for the peroxidase Size (Number of amino acids) Source
P00433 353 [15]
Q9SMU8 353 [16]
Q43872 317 [17]
Q6AVZ3 344 [18]
Q7F1U0 317 [19]
Q05431 250 [20]
P48534 250 [21]
Table 11: Different peroxidases have sizes in the range of 300 amino acids.

Model Predictions

With this framework, we can chart the time course of Gp32 activity. By modeling a temperature history that separates a cool-storage phase from a post-purchase, room-temperature phase, we obtain shelf-life predictions that closely mirror real-world handling.

The results reveal how storage conditions dramatically affect shelf life. At refrigeration temperatures (5 °C), enzyme activity declines slowly over many months. Once brought to room temperature (22°C), degradation accelerates substantially, but the assay remains viable for weeks of typical use.

Agarose Gel
Figure 25: Graph describes activity decay of rate-limiting enzyme (SBB/Gp32) over time, taking different temperatures in account.

Linking Stability to Performance

To connect molecular stability to assay performance, we couple the computed Gp32 activity \(A(t)\) to our core Reaction Model (RPA→CRISPR) by scaling the Gp32-sensitive kinetic steps. Specifically, reaction rates that depend on SSB/Gp32 concentration are multiplied by the activity factor \(A(t)\). This yields time-resolved detection behavior under realistic temperature histories: how quickly fluorescence crosses threshold after activation as a function of prior storage conditions.

This integrated simulation yields a direct, quantitative prediction of how shelf-life affects the user experience. The results, shown in the fluorescence curves and the time-to-threshold heatmap, demonstrate how a longer time at room temperature leads to reduced enzyme activity. This, in turn, slows down the diagnostic reaction, increasing the time a user must wait for a positive signal. For example, the heatmap shows that a test stored for 100 days and then kept at room temperature for another 25 days (total 125 days) would take 19 minutes to produce a detectable signal, whereas one used immediately after storage would take only 5 minutes.

Agarose Gel
Figure 26: Fluorescence curves for different storage and total aging times (up to 200 days). Line styles indicate storage duration; only G activity is aged.
Agarose Gel
Figure 27: Heatmap describing different scenarios of cold storage time followed by room temperature storage, leading to different time-to-result capacities.

Limitations and outlook

Using a plant peroxidase as a size-matched proxy for Gp32 is a pragmatic simplification: sizes are comparable, but the peroxidase’s fold and microenvironment differ from Gp32, so absolute rates may shift. The framework itself is mechanistic and modular. Once Gp32-specific stability constants (\(E_a\) and \(k_{d,\mathrm{ref}}\)) are measured in dedicated assays, we can replace the proxy values in the Arrhenius model and obtain sharper shelf-life predictions without altering the rest of the pipeline.

Key Finding

Modeling for our Entrepreneurial Plan

From the very beginning, our project was designed with a clear entrepreneurial goal: to create a diagnostic that is globally accessible and independent of the cold chain. We identified lyophilization (freeze-drying) as a core component of our manufacturing strategy to achieve this. However, this presented a key business question: is the added manufacturing cost of freeze-drying justified by the extended shelf-life? Our integrated reaction model was developed, among other things, to answer this question quantitatively. By coupling our kinetic model with the Enzyme Stability Framework, we didn't just find a problem; we validated our initial business decision. The results, particularly the time-to-threshold heatmap, provide powerful evidence for our strategy. The model clearly demonstrates that without lyophilization, the test's performance degrades rapidly, rendering it unreliable for the very low-resource settings we aim to serve. It quantitatively proves that the immense value gained by eliminating the cold chain, ensuring product reliability and market accessibility, far outweighs the manufacturing cost, forming a cornerstone of our business case.

Methods & Tools

ODE solver:

The ODE system is solved using the LSODA solver, which proved to be more robust against integration failures than other solvers, such as CVODES [2]. LSODA combines two numerical methods: a backward differentiation formula (BDF) algorithm for stiff systems and an Adams-Moulton algorithm for non-stiff systems.

Since our model involves multiple timescales and reaction rates that differ by several orders of magnitude, it is classified as a stiff system. Although both LSODA and CVODES use BDF methods for stiff problems, results reported in [2] suggest that the LSODA implementation outperforms that of CVODES.

Conclusion

Our Reaction Model was a powerful and essential part of our project's journey. We built it to be more than just a simulation; it was a tool for navigating the real-world uncertainties of biochemical engineering. By embracing challenges transparently, we were able to produce insights that guided our entire project.

Our work began by tackling the "black box" problem of proprietary kits. Despite not knowing all the component concentrations, we successfully validated our model's core RPA amplification module against our wet lab's experimental data. This was a critical achievement, as it provided the anchor of confidence we needed to trust our model's predictions. That trust was essential when our model made its most important finding: the prediction of a "cis-cleavage deadlock," a fundamental system failure that we would have been blind to in the lab.

This insight allowed us to propose a tangible engineering solution - the "two-chamber" hardware design - demonstrating the model's direct value. Furthermore, our model proved its precision by rigorously determining a limit of detection of 8 copies, a prediction that was validated based on scientific research literature.

By linking performance and stability of our diagnostic we gained a powerful, quantitative validation for our initial entrepreneurial strategy of incorporating freeze-drying. The model clearly shows that without a method to preserve enzyme stability, the test's performance would be unreliable for real-world use where a cold chain is not guaranteed. This result is a cornerstone of our business case, proving that lyophilization is not just a desirable feature but an essential requirement for our diagnostic to be effective and accessible globally.

Ultimately, our Journey tells the story of the real scientific process. It is a story of facing challenges, adapting to unexpected results, and using a combination of computation and experimentation to design something that works.

References

  1. [1] Moody C, Newell H, Viljoen H. A mathematical model of recombinase polymerase amplification under continuously stirred conditions. Biochem Eng J. 2016 Aug 15;112:193-201. Available from: doi: 10.1016/j.bej.2016.04.017.
  2. [2] Städter P, Schälte Y, Schmiester L, Hasenauer J, Stapor PL. Benchmarking of numerical integration methods for ODE models of biological systems. Sci Rep. 2021 Jan 29;11:2696. Available from: doi: 10.1038/s41598-021-82196-2.
  3. [3] Nalefski EA, Kooistra RM, Parikh I, Hedley S, Rajaraman K, Madan D. Determinants of CRISPR Cas12a nuclease activation by DNA and RNA targets. Nucleic Acids Res. 2024 Mar 13;52(8):4502-4522. Available from: doi: 10.1093/nar/gkae152.
  4. [4] Cordoba-Andrade F, Peralta-Castro A, et al. Proteins is a critical factor in recombinase polymerase amplification (RPA), as revealed by insights from an open-source system. PeerJ. 2025;13:e19758. Available from: doi: 10.7717/peerj.19758.
  5. [5] Lesinski JM, Moragues T, et al. In situ complexation of sgRNA and Cas12a improves the performance of a one-pot RPA-CRISPR-Cas12 assay. Anal Chem. 2024 Jun 12;96(25):10443-10450. Available from: doi: 10.1021/acs.analchem.4c01777.
  6. [6] Nalefski EA, Patel N, et al. Kinetic analysis of Cas12a and Cas13a RNA-guided nucleases for development of improved CRISPR-based diagnostics. iScience. 2021 Aug 18;24(9):102996. Available from: doi: 10.1016/j.isci.2021.102996.
  7. [7] Naganathan AN, Doshi U, Muñoz V. Protein folding kinetics: barrier effects in chemical and thermal denaturation experiments. J Am Chem Soc. 2007 Apr 10;129(17):5673-5682. Available from: doi: 10.1021/ja0689740.
  8. [8] Arise RO, Osundahunsi BO, Farohunbi ST, Yekeen AA. Catalytic and kinetic properties of purified Eichhornia crassipes leaf peroxidase. Environ Exp Biol. 2016;14:83-90. Available from: doi: 10.22364/eeb.14.12.
  9. [9] BioRender [Internet]. Toronto (ON): BioRender Inc.; [cited 2025 Oct 8]. Available from: https://www.biorender.com/.
  10. [10] UniProt Consortium. LbCas12a - Lachnospiraceae bacterium [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/A0A5S8WF58/entry.
  11. [11] UniProt Consortium. Recombination protein uvsY - Enterobacteria phage T4 [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P04537/entry.
  12. [12] UniProt Consortium. Recombination and repair protein UVSX - Enterobacteria phage T4 [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P04529/entry.
  13. [13] UniProt Consortium. Single-stranded DNA-binding protein - Enterobacteria phage T4 [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P03695/entry.
  14. [14] UniProt Consortium. DNA polymerase I - Bacillus subtilis (strain 168) [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/O34996/entry.
  15. [15] UniProt Consortium. Peroxidase C1A - Armoracia rusticana (Horseradish) [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P00433/entry.
  16. [16] UniProt Consortium. Peroxidase 34 - Arabidopsis thaliana [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q9SMU8/entry.
  17. [17] UniProt Consortium. Peroxidase 64 - Arabidopsis thaliana [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q43872/entry.
  18. [18] UniProt Consortium. Peroxidase - Oryza sativa subsp. japonica [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q6AVZ3/entry.
  19. [19] UniProt Consortium. Peroxidase 22.3 - Oryza sativa subsp. japonica [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q7F1U0/entry.
  20. [20] UniProt Consortium. L-ascorbate peroxidase 1, cytosolic - Arabidopsis thaliana [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/Q05431/entry.
  21. [21] UniProt Consortium. L-ascorbate peroxidase, cytosolic - Pisum sativum (Garden pea) [Internet]. UniProtKB; 2025. Available from: https://www.uniprot.org/uniprotkb/P48534/entry.

Epidemiological Model

Goal

We aimed to go beyond a purely mechanistic model of our test. In collaboration with the Human Practices team, we developed an agent-based model to assess how deploying our test could impact STI incidence. The model’s goal is to simulate how varying testing rates influence infection dynamics over time — specifically, to evaluate how increasing testing frequency reduces transmission and to identify thresholds where interventions become most effective.

Methods

Model Overview

We developed an agent-based model (ABM) to simulate the transmission dynamics of sexually transmitted infections (STIs), specifically focusing on Chlamydia trachomatis in the Swiss population. The model captures individual-level heterogeneity in sexual behavior, partnership dynamics, and healthcare-seeking patterns to evaluate the impact of various intervention strategies on disease prevalence.

What is an ABM?

An agent-based model (ABM) is a type of computer simulation that looks at how individual people - or “agents” -behave and interact with each other. Each agent can have different characteristics, such as how often they meet partners, get tested, or change behavior over time. By simulating many individuals and their interactions, ABMs help us understand how diseases like Chlamydia can spread in a population and how different strategies, such as more frequent testing, can help reduce infections.

Model Architecture

Agent Characteristics

Each agent in the simulation represents an individual aged 15-100 years with the following attributes:

  • Demographics: Age, biological sex (male/female), and sexual orientation (heterosexual, homosexual, or bisexual)
  • Risk classification: Categorized as either high-risk or low-risk based on sexual behavior patterns
  • Infection status: Susceptible, infected (asymptomatic or symptomatic), or under treatment
  • Partnership network: Dynamic set of current sexual partners with tracked relationship durations
  • Behavioral parameters: Testing propensity, treatment adherence, and partner acquisition rates

Population Structure

The model initializes with 7,663 agents, representing the Swiss population aged 15+ (in thousands). [1] The population composition reflects Swiss demographic data:

  • Sex distribution: 50% male, 50% female
  • Sexual orientation: 92% heterosexual, 3% homosexual, 5.6% bisexual [2]
  • Risk groups: 10% classified as high-risk, 90% as low-risk
  • Age structure: Weighted distribution based on Swiss census data, ranging from 15 to 100 years [1]

Initial infection prevalence is age-stratified according to Swiss surveillance data [3] from the Federal Office of Public Health, with highest rates among younger age groups (520.7 per 100,000 for ages 15-24) and declining with age. These values were multiplied by a scaling factor to account for underreporting of cases and to approximate the true incidence. This factor was estimated to be around 10 for Switzerland by dividing the estimated total number of cases by the number of reported cases [4].

Risk groups influenced partner-seeking behavior, with high-risk individuals being twice as active as average. To keep the population average consistent with literature values, low-risk individuals (90% of the population) were assigned a relative activity of 0.8. Risk group also affected sexual acts per week, maximum concurrent partners, and testing frequency.

Disease Dynamics

Transmission Mechanism

Sexual transmission occurs through partnerships with probability dependent on:

  • Directional transmission rates:
    • Male-to-female: 5% [5][6]
    • Female-to-male: 5%[5][6]
    • Male-to-male: 5%[5][6]
    • Female-to-female: 1% (estimated based on [7])
  • Sexual activity frequency:
    • 0-1 acts/week for low-risk pairs
    • 1-6 acts/week for high-risk pairs
  • Partnership status: Transmission only occurs between current partners

Natural History

The infection progression follows established epidemiological parameters:

  • Incubation period: 1.5 weeks before symptom onset potential [8]
  • Symptom development: 50% probability in males, 25% in females (one-time determination at infection)
  • Natural resolution: 1% weekly probability of spontaneous clearance (computed / adapted from yearly estimate [9])
  • Treatment duration: 1 week of doxycycline (100% efficacy [10])
  • Reinfection: Possible immediately after treatment completion or natural resolution

Partnership Dynamics

Partnership Formation

Partnership formation rates are age- and sex-specific, derived from empirical data:

  • Annual rates peak at age 19 for females (0.9 partnerships/year) and age 20 for males (1.36 partnerships/year)
  • Weekly probability of forming a partnership, calculated as: annual_rate × risk_group_modifier / 52
  • Partner selection incorporates:
    • Sexual compatibility based on orientation
    • Age homophily (Poisson-distributed age difference, mean = 2.2 years [12])
    • Assortative mixing by risk group (67% probability of same-risk partnerships)
    • Maximum concurrent partners: 1 for low-risk, 2 for high-risk individuals

Partnerhsip Dissolution

Dissolution probabilities follow empirical survival curves differentiated by sex and relationship duration:

  • Males: 23.7% dissolution within first week, rising to 90.1% by 3 years[11]
  • Females: 12% dissolution within first week, rising to 84.6% by 3 years[11]
  • Partnerships stabilize after 3 years with minimal dissolution probability (0.01%)[11]

Testing and Treatment

Testing Behavior

Testing propensity varies by symptom status and risk group:

  • Baseline testing rates (annual, converted to weekly):
    • High-risk: 15% annually (estimated based on our survey [13])
    • Low-risk: 7% annually (estimated based on our survey [13])
  • Symptomatic testing: 95% weekly probability (estimated based on our survey [13])
  • Post-treatment retesting: Automatic testing 6 weeks after treatment completion (based on an interview with a stakeholder [14])

Diagnostic Performance

  • Test sensitivity: 95% (nucleic acid amplification tests) [15]
  • Test specificity: 99% [15]
  • Treatment initiation: 1-week delay following positive test
  • Treatment adherence: 80% probability of initiating prescribed treatment (estimated based on our survey [13])

Behavioral Adaptation

Agents modify their partner acquisition behavior following positive test results:

  • Partner-seeking propensity reduced to 10% of baseline during treatment
  • Behavior returns to baseline after treatment completion or negative retest

Demographic Processes

Vital Statistics

The model incorporates Swiss demographic rates to maintain population realism:

  • Birth rate: 0.0167% weekly (agents enter at age 15) [16]
  • Death rate: 0.016% weekly, age-stratified based on Swiss mortality tables [17][18]
  • Immigration: 0.035% weekly, with age-appropriate infection prevalence [19]
  • Emigration: 0.02% weekly, random selection [19]

Age Progression

  • Agents age continuously (1/52 year per week)
  • Age influences partnership formation rates, dissolution patterns, and mortality risk
  • Agents are removed from the model at age 100, though reaching this age is rare because the probability of death increases with age.

Simulation Implementation

Temporal Resolution

The model operates on a weekly time step, with each iteration executing the following sequence for each agent:

  1. Update infection status and progression
  2. Manage partnership formation and dissolution
  3. Execute sexual acts with transmission potential
  4. Determine testing and treatment decisions
  5. Update demographic characteristics

Stochastic Elements

All probabilistic events use a dedicated random number generator with configurable seed for reproducibility. Key stochastic processes include:

  • Partner selection and sexual frequency
  • Transmission events
  • Symptom development
  • Testing decisions
  • Treatment adherence
  • Demographic events

Data Collection

The model tracks multiple epidemiological metrics at each time step:

  • Population-level infection prevalence by risk group, sexuality, and age
  • Weekly incidence of new infections
  • Testing and treatment utilization rates
  • Partnership network statistics
  • Age-specific infection distributions

Model Validation

Parameter Sources

Model parameters were derived from:

  • Swiss Federal Office of Public Health surveillance reports
  • Published epidemiological literature on Chlamydia transmission
  • Swiss demographic and behavioral surveys

Parameters lacking values in the literature were estimated based on expert consultation with healthcare providers and/or insights from survey data.

Software Implementation

The model is implemented in Python 3.12 using object-oriented programming principles. Key dependencies include:

  • NumPy - for numerical operations and age distribution calculations
  • Matplotlib - for visualization of epidemiological dynamics
  • Pandas - for data management and analysis
  • Standard library modules - for random number generation and enumeration types

Results

Current scenario

As only positive test results are mandatorily reported, precise national testing rates are uncertain. We therefore calibrated a current scenario using our survey-based annual testing propensities (0.08 on average, scaled to 0.07 for the low-risk group and 0.15 for the high-risk group) and explored plausible bounds (see below). Under this baseline, the system settles into a quasi-steady state with:

  • Total prevalence in the low single-digit percent range (around 129 infected individuals among 7,663 agents, corresponding to approximately 1.7 % overall prevalence).
  • Network structure: The model stabilizes at an average of ~0.64 simultaneous partnerships per person, meaning that at any given week about two thirds of agents are in a relationship. This represents the equilibrium network connectivity rather than the weekly formation rate of new partnerships.
  • Age/risk concentration: Infections concentrated among younger age groups and high-risk agents, with spillover into the broader population.

These baseline dynamics imply that, without changes in testing behavior or partner notification, marginal year-to-year variation arises mainly from stochasticity in partnerships, transmission, and symptoms, rather than from structural drift.

Agarose Gel
Figure 1: Overall Chlamydia dynamics under the current scenario, showing mostly stable infection levels with a slight upward trend.

Sensitivity to current testing uncertainty

To test robustness against uncertainty in routine testing, we varied annual testing rates within a plausible range:

  • Lower bound: 0.035 (low-risk), 0.075 (high-risk)
  • Estimated (baseline): 0.07 (low-risk), 0.15 (high-risk)
  • Upper bound: 0.20 (low-risk), 0.42 (high-risk)

Across this range of plausible testing propensities, the model’s epidemic trajectory remained broadly stable. Lower testing rates resulted in a modest upward shift in prevalence, while higher testing frequencies led to a slight overall reduction due to faster detection and treatment of cases. Importantly, none of the scenarios produced qualitative changes in epidemic behavior - the system consistently exhibited a broadly stable trajectory with a slight upward trend over time. This demonstrates that the model’s predictions are robust to some uncertainty in current testing behavior.

1) Lower bound (more conservative testing)

Agarose Gel
Figure 2: Dynamics by sexuality under lower-bound testing. Most infections occur among heterosexual agents, while other groups remain at low, fluctuating levels.

Interpretation: With reduced testing, infected counts drift slightly upward but remain bounded; no qualitative shift in transmission pathways is observed.

2) Estimated current testing (baseline)

Agarose Gel
Figure 3: Dynamics by sexuality under the estimated current testing propensities. Series oscillate around steady levels with small stochastic excursions, consistent with a stable endemic regime.

Interpretation: The baseline reproduces expected patterns: higher counts among heterosexual agents (demography-driven), low sustained levels among lesbian women, and intermittent clusters among gay/bisexual men, superimposed on steady network turnover.

3) Upper bound (more generous testing)

Agarose Gel
Figure 4: Dynamics by sexuality under the upper-bound testing scenario. Increased testing yields a gradual reduction in infected counts across groups, with the largest absolute effect in the heterosexual population (largest stratum).

Interpretation: Even with higher routine testing, the model settles again into a slight upward trend after a temporary fluctuation.

Modeling the effect of TRACE

Survey results indicated that the vast majority of respondents would probably or definitely test more often. Specifically, 81.6% said they would use TRACE if aware of it, and 79% reported likely increasing their testing frequency. This trend was consistent across groups, with 72.8% of frequent testers and 82.6% of non-testers agreeing.

While these survey responses provide valuable insight into potential behavior changes, it is important to note that self-reported intentions do not always translate directly into real-world action. To account for this uncertainty, our epidemiological model explores a range of possible increases in testing rates rather than assuming full adherence to reported intentions. This approach allows us to quantify the potential population-level impact of TRACE under both conservative and optimistic adoption scenarios.

Medium increase in testing rate

Scenario: annual testing rates increased to 0.28 for low-risk and 0.6 for high-risk individuals.

Agarose Gel
Figure 5: Dynamics by sexuality under a medium increase in testing rate. Infection levels decline and stabilize at lower equilibrium values across all sexual orientations.

Interpretation: This moderate scenario represents a realistic short-term effect based on partial adoption of TRACE. The model predicts a sustained reduction in prevalence, with the largest decline observed among heterosexual agents, who make up the majority of the population. These findings suggest that even without full adoption, TRACE could meaningfully lower transmission rates through earlier detection and treatment, breaking infection chains before they propagate.

Large increase in testing rate

Scenario: annual testing rates increased to 0.7 for low-risk and 1.5 for high-risk individuals.

Agarose Gel
Figure 6: Dynamics by sexuality under a large increase in testing rate. The infection burden collapses toward near-elimination across all sexual groups.

Interpretation: Under optimistic conditions - where TRACE achieves widespread awareness and regular usage - the model predicts a near-elimination of infection within the simulated period of 10 years. This highlights the nonlinear relationship between testing frequency and disease prevalence: once a critical proportion of the population tests frequently, transmission becomes unsustainable, and prevalence rapidly declines.

The result illustrates the transformative potential of accessible at-home testing to shift population-level disease dynamics.

Entrepreneurship:

The epidemiological findings presented above provide quantitative evidence that TRACE is not only a diagnostic innovation but also a public health enabler with tangible societal benefits. By substantially increasing testing accessibility and frequency, TRACE can reshape STI epidemiology on a population scale. Our agent-based model demonstrated that increases in testing rates lead to a marked and sustained reduction in infection prevalence, while widespread adoption could push the system toward near-elimination. These results confirm that TRACE has the potential to become a transformative public health intervention, capable of reducing transmission, improving early detection, and alleviating long-term healthcare burdens.

From an entrepreneurial perspective, this modeling validation provides strong support for our business case. It shows that TRACE can create dual value – both societal impact and economic sustainability. The reduction in undiagnosed infections translates into measurable public health savings and productivity gains, while the convenience and privacy of TRACE address key barriers that currently limit regular STI screening.

The model therefore strengthens the rationale for investing in and scaling TRACE as a viable startup opportunity. It quantifies the positive externalities of widespread testing and underlines the market need for accessible, at-home diagnostic tools. This alignment between scientific validation, social impact, and commercial feasibility underscores TRACE’s potential as a high-impact venture at the intersection of biotechnology and preventive healthcare.

For a detailed description of our business model, value proposition, and market strategy, see our entrepreneurship page.

Overall impact

Taken together, these results demonstrate that the behavioral change enabled by TRACE – more frequent and accessible STI testing – can lead to substantial reductions in prevalence, even under more conservative assumptions. The model highlights how improvements in testing behavior can shift the system from a stable endemic equilibrium toward effective control or elimination.

Model validation and discussion

The ABM was validated by comparing its emergent epidemiological patterns with known characteristics of Chlamydia trachomatis infection in the Swiss population. The goal of validation was not to match exact case numbers from surveillance data – which are influenced by underreporting – but to ensure that the simulated trends and magnitudes are epidemiologically realistic and consistent with literature-based expectations.

Age-Specific Infection Patterns

The model predicts that infections are strongly concentrated among younger agents, particularly those aged 20–30 years, with prevalence decreasing steadily in older age groups. This distribution reflects real-world surveillance data, where Chlamydia incidence peaks among young adults and declines with age due to reduced partner turnover and increased testing awareness. A small artificial peak appears at around 90 years, which is attributed to the limited number of agents in that age bracket and does not reflect a genuine epidemiological trend. Overall, the age-specific pattern demonstrates that the model correctly captures the demographic concentration of infection in sexually active young populations.

Agarose Gel
Figure 7: Age-specific infection prevalence at week 520. The highest infection rates occur among young adults, consistent with real-world epidemiology. Minor fluctuations in the elderly are artifacts of small sample size.

Network Stability

Partnership dynamics were evaluated by tracking the mean number of concurrent partners over time. The simulation stabilized at an average of approximately 0.64 partners per agent, exhibiting only minor stochastic fluctuations over the 10-year period. This suggests that, at any given time, roughly two-thirds of agents are engaged in a sexual partnership – closely aligning with reported data from Switzerland, which indicates an average of 0.73 partners [21]. The absence of long-term drift or oscillatory behavior confirms that the model’s partnership formation and dissolution parameters are balanced and produce a stable social network structure over time.

Agarose Gel
Figure 8: Network dynamics showing the average number of concurrent partners per agent over time. The system stabilizes around 0.64, indicating a realistic and equilibrated network structure.

Infection Distribution by Risk Group

When stratified by behavioral risk, the model predicts that infections are disproportionately concentrated among high-risk individuals, who maintain more frequent partnerships and higher sexual activity rates.However, low-risk individuals still sustain a smaller but persistent burden of infection, reflecting ongoing spillover through network mixing between groups.

This outcome is supported by current public health strategies that emphasize frequent testing among high-risk populations while acknowledging that occasional testing in the general population remains necessary to prevent reinfection cycles [22].

Agarose Gel
Figure 9: Chlamydia dynamics by risk group. Infections are concentrated among high-risk individuals but not limited to them, highlighting the importance of broad testing coverage.

Population-Level Dynamics

At the population level, the model maintains a steady-state prevalence of around 1.7%, corresponding to approximately 129.1 infected individuals out of 7,663 simulated agents. When scaled to the Swiss population aged 15+ (factor of 1000), this equates to an estimated ~129,100 concurrent infections, which aligns well with literature estimates exceeding 100,000 cases after accounting for underreporting [4]. The overall number of susceptible and treated agents remains stable throughout the simulation period, confirming that the model has reached a quasi-endemic equilibrium rather than an unstable epidemic expansion or collapse.

Agarose Gel
Figure 10: Overall Chlamydia dynamics over time. The proportions of susceptible, infected, and treated agents remain stable, indicating a consistent endemic equilibrium.

Summary of Validation

Taken together, these validation results indicate that the model successfully reproduces key epidemiological and behavioral properties of Chlamydia transmission:

  • Infection prevalence peaks in young adults.
  • The partnership network is stable and realistic.
  • High-risk individuals carry most of the infection burden.
  • Population-level infection counts match published estimates.

While some noise appears in small demographic subgroups (e.g., elderly agents), these deviations are expected given the limited number of agents in those cohorts and do not affect overall model reliability. Therefore, the model can be considered valid for simulating population-level and behavioral intervention scenarios relevant to the Swiss context.

Limitations and Outlook

Model Limitations

Structural Assumptions

Our agent-based model necessarily simplifies the complex reality of sexual network dynamics and STI transmission. The assumption of random mixing within risk groups, even with assortative mixing probabilities, cannot fully capture the highly clustered and community-structured nature of real sexual networks. Real-world sexual networks often exhibit small-world properties and contain dense subgroups that can sustain transmission even when overall prevalence is low [23]. Our limitation of maximum two concurrent partners may underestimate transmission potential in highly connected network cores where individuals maintain multiple concurrent partnerships.

The weekly temporal resolution, chosen for computational efficiency and better implementation of literature parameters, means the model cannot capture short-term casual encounters or same-day partner turnover that may contribute significantly to transmission dynamics, particularly in high-risk populations. Furthermore, the absence of geographic structure assumes that all individuals in Switzerland have equal probability of partnership formation regardless of location, ignoring the reality that sexual networks are strongly influenced by proximity, urban-rural divides, and local social venues [24]. This simplification may affect our estimates of outbreak potential and doesn't allow for the assessment of geographically targeted interventions.

Behavioral aspects in the model are represented in a simplified way. Testing behavior is modeled as a probability that depends on symptoms and risk group, without explicitly incorporating factors such as healthcare accessibility, insurance coverage, stigma, or prior testing experiences. Similarly, condom use and other barrier methods are not modeled individually; instead, their effects are reflected in the overall transmission probability parameters. This design choice keeps the model focused on capturing key testing-driven dynamics while maintaining conceptual clarity and computational efficiency.

The binary classification into high- and low-risk groups does not fully capture the continuous spectrum of sexual behaviors or the dynamic nature of risk throughout an individual’s life. In reality, people may shift between risk levels depending on life circumstances, relationship status, or age, whereas our model treats risk group membership as fixed. Likewise, the model assumes that Swiss residents and immigrants share similar behavioral patterns, which may overlook differences arising from diverse cultural backgrounds and sexual health practices within immigrant populations.

Parameter Uncertainty

Many model parameters remain uncertain due to limited data availability. Transmission probabilities, particularly for female-to-female transmission, are poorly characterized, as comprehensive transmission studies are both ethically and practically difficult to conduct. While several studies report a transmission probability of approximately 0.05 per sexual act [5, 6, 25], data on specific transmission routes, such as female–female transmission, remain particularly limited.

The model assumes that the true number of infections is roughly ten times higher than the number of reported cases. This assumption primarily influences the absolute number of predicted cases, while having a lesser impact on the relative effects of different testing rates. Since the model’s predictions align with existing epidemiological estimates, this factor appears reasonable. However, it may still vary across age groups, risk populations, and geographic regions.

Technical and Validation Constraints

Model validation faces fundamental challenges in the context of STI transmission. We cannot conduct controlled transmission experiments for ethical reasons, and detailed sexual network data is difficult to collect due to privacy concerns and recall limitations. The model can be validated against aggregate surveillance data (such as comparing the model’s case number prediction to literature estimates), but this provides limited information about whether the underlying transmission processes are correctly represented. Without access to genetic sequence data that could reveal transmission chains or detailed partner notification data that could validate network structure, we must rely on the model's ability to reproduce population-level patterns as evidence of validity.

Outlook and Future Directions

Near-term Enhancements

The immediate priority for model development involves incorporating partner notification mechanisms that reflect real-world contact tracing efforts. Partner notification represents an important component of STI control programs, yet its effectiveness depends on notification rates, partner reachability, and the delay between index case diagnosis and partner testing [26]. Implementing realistic notification cascades would allow evaluation of how improvements in contact tracing effectiveness might amplify the benefits of increased testing. This enhancement would require modeling the probability that infected individuals notify their partners, the delay in notification, and the subsequent testing and treatment behaviors of notified partners.

Advanced Modeling Approaches

Incorporating geographic structure and local social contexts represents another important extension, as sexual networks are strongly shaped by these factors. Including geographic mixing would allow the evaluation of targeted interventions and provide insights into how infections spread between communities. However, greater population fragmentation makes parameter estimation increasingly difficult, and even at the current level of model detail, parameter availability was limited.

The extension to multi-pathogen dynamics would significantly enhance the model's public health relevance. Many individuals with one STI have increased risk for others due to shared transmission routes and biological interactions [27, 28]. Modeling co-infections with gonorrhea, syphilis, or HIV would allow for more accurate predictions of epidemic trajectories, identification of high-risk subpopulations, and evaluation of interventions that target multiple infections simultaneously. It could also capture synergistic effects, such as increased susceptibility or transmissibility resulting from co-infections, providing a more realistic representation of STI spread and informing integrated public health strategies. In fact, it is known that HIV increases the risk of acquiring and transmitting other sexually transmitted infections, such as syphilis, gonorrhea, and herpes simplex virus, due to both immunosuppression and higher viral shedding at mucosal sites.

References

  1. [1] IndexMundi. Switzerland — Age structure. Available from: https://www.indexmundi.com/switzerland/age_structure.html.
  2. [2] Bundesamt für Gesundheit / BAG; Sotomo. Studienbericht: Sex in der Schweiz. Page 12. Available from: https://sotomo.ch/site/wp-content/uploads/2020/12/BAG_Studienbericht_DE.pdf.
  3. [3] Bundesamt für Gesundheit (Switzerland). Sexually transmitted infections — Chlamydia. In: IDD BAG. Available from: https://www.idd.bag.admin.ch/diseases/chlamydia/statistic.
  4. [4] Aebi-Popp K. Chlamydieninfektion bei Jugendlichen. Schweizer Zeitschrift für Gynäkologie. 2020 Oct 9. Available from: https://www.rosenfluh.ch/media/gynaekologie/2005/04/Chlamydieninfektion-die-unbemerkte-Geschlechtskrankheit.pdf.
  5. [5] BMJ Sexually Transmitted Infections. Abstract A175.2. Sex Transm Infect. 2011;87 Suppl 1:A175.2. Available from: https://sti.bmj.com/content/87/Suppl_1/A175.2.
  6. [6] STD Center NY. One-time homosexual contact STD risk in men. Available from: https://stdcenterny.com/articles/one-time-homosexual-contact-std-risk-men.html.
  7. [7] Centers for Disease Control and Prevention. Women who have sex with women (WSW) — HSV-2 seroprevalence. In: Sexually Transmitted Diseases Treatment Guidelines. Available from: https://www.cdc.gov/std/treatment-guidelines/wsw.htm.
  8. [8] Singapore Dental Association / CDA. Chlamydia. Available from: https://www.cda.gov.sg/professionals/diseases/chlamydia.
  9. [9] British Association for Sexual Health and HIV (BASHH). Chlamydia 2015. Available from: https://www.bashh.org/resources/15/chlamydia_2015/.
  10. [10] Centers for Disease Control and Prevention. Chlamydia — Treatment Guidelines. Available from: https://www.cdc.gov/std/treatment-guidelines/chlamydia.htm.
  11. [11] Gaydos C, Schachter J. Title not clearly given. In: PMC. Available from: https://pmc.ncbi.nlm.nih.gov/articles/PMC2838999/.
  12. [12] Pew Research Center. A growing share of US husbands and wives are roughly the same age. 2024 Aug 15. Available from: https://www.pewresearch.org/short-reads/2024/08/15/a-growing-share-of-us-husbands-and-wives-are-roughly-the-same-age/.
  13. [13] University of Zurich iGEM Team 2025. Survey on testing behavior and STI awareness among students. 2025. Available from: https://2025.igem.wiki/uzurich/survey.
  14. [14] Betschart C. Stakeholder-Interview. UZurich iGEM 2025. 2025 Oct 6. Available from: https://2025.igem.wiki/uzurich/stakeholder-betschart.
  15. [15] Australian Department of Health. Chlamydia laboratory case definition. 2022 Jun. Available from: https://www.health.gov.au/sites/default/files/documents/2022/06/chlamydia-laboratory-case-definition_0.pdf.
  16. [16] Swiss Federal Statistical Office (FSO / BFS). Geburten und Todesfälle. Available from: https://www.bfs.admin.ch/bfs/de/home/statistiken/bevoelkerung/geburten-todesfaelle/geburten.html.
  17. [17] Swiss Federal Statistical Office (FSO / BFS). Bevölkerung. Available from: https://www.bfs.admin.ch/bfs/de/home/statistiken/bevoelkerung.html.
  18. [18] Swiss Federal Statistical Office (FSO / BFS). Gesundheit / Sterblichkeit / Todesursachen. Available from: https://www.bfs.admin.ch/bfs/de/home/statistiken/gesundheit/gesundheitszustand/sterblichkeit-todesursachen.html.
  19. [19] Swiss Federal Statistical Office (FSO / BFS). Migration & Integration. Available from: https://www.bfs.admin.ch/bfs/de/home/statistiken/bevoelkerung/migration-integration.assetdetail.36073875.html.
  20. [20] Wissel T. Stakeholder-Interview. UZurich iGEM 2025. 2025 Oct. Available from: https://2025.igem.wiki/uzurich/stakeholder-wissel.
  21. [21] Federal Statistical Office (Switzerland). Paare in der Schweiz. Available from: https://www.bfs.admin.ch/bfs/en/home.assetdetail.33386305.html.
  22. [22] Centers for Disease Control and Prevention. STI Screening Recommendations. Available from: https://www.cdc.gov/std/treatment-guidelines/screening-recommendations.htm.
  23. [23] Newmyer L, Evans M, Graif C. Socially Connected Neighborhoods and the Spread of Sexually Transmitted Infections. Demography. 2022;59(4):1299–1323. Available from: doi: 10.1215/00703370-10054898.
  24. [24] Kim H-Y, Cuadros D, Wilkinson E, Junqueira DM, de Oliveira T, Tanser F. The geography and inter-community configuration of new sexual partnership formation in a rural South African population over fourteen years (2003–2016). 2022 Mar 9. Available from: doi: 10.1371/journal.pgph.0000055.
  25. [25] Contact Tracing ASHM. Chlamydia. Available from: https://contacttracing.ashm.org.au/chlamydia/.
  26. [26] Althaus CL, Turner KME, Mercer CH, et al. Effectiveness and cost-effectiveness of traditional and new partner notification technologies for curable sexually transmitted infections: observational study, systematic reviews and mathematical modelling. Southampton (UK): NIHR Journals Library; 2014 Jan. Available from: https://www.ncbi.nlm.nih.gov/books/NBK261430/.
  27. [27] Chung SL, Wong NS, Ho KM, Lee SS. Coinfection and repeat bacterial sexually transmitted infections (STI) – retrospective study on male attendees of public STI clinics in an Asia Pacific city. Epidemiol Infect. 2023 Jun 9;151:e101. Available from: doi: 10.1017/S0950268823000948.
  28. [28] Mayer KH, Venkatesh KK. Interactions of HIV, other sexually transmitted diseases, and genital tract inflammation facilitating local pathogen transmission and acquisition. Am J Reprod Immunol. 2011 Apr;65(3):308–316. Available from: doi: 10.1111/j.1600-0897.2010.00942.x.

Asymmetrical RPA Model

Asymmetrical RPA Model Page as PDF
Top