We engineered GenOMe as a next-generation platform for genome-integrated BioBricks, transforming the idea of “one-time insertion” into a reusable modular system. Guided by the Design–Build–Test–Learn cycle, we started by re-designing ORBIT for dual attB landing sites, then constructed the GenOMe strain, validated stepwise multi-site integrations, and iteratively optimized the workflow. Through this process, we created a platform that is not only robust and scalable, but also simple enough for iGEM teams to use as a genome “LEGO baseplate” for building ambitious genetic designs.
In iGEM projects, many genetic circuits are initially built on plasmids. However, plasmids face key limitations: restricted cargo capacity, fluctuating copy numbers, and the need for continuous antibiotic selection. For designs that require stability, inheritance, and stepwise scalability, chromosomal integration is the better option.
Traditional genome integration methods such as CRISPR/Cas9 or λ-Red recombineering, however, are often slow, complex, and costly, and they are difficult to extend to multiple successive insertions. Integrating large DNA fragments can take weeks, require expensive synthesis, and demand extensive colony screening.
This poses a major bottleneck for iGEM teams that want to build stable, inheritable multi-gene arrays. GenOMe was conceived as a platform to make genome editing modular and user-friendly — so that iGEM teams can build chromosomal circuits like stacking LEGO bricks.
The GenOMe system is built on Bxb1-mediated site-specific recombination. Bxb1 integrase recognizes the phage attachment site (attP) and the bacterial attachment site (attB), brings them together, and catalyzes recombination into attL and attR. Importantly, each attB/attP variant used here (sites 1, 2, 3, 5, and 6) pairs strictly with its cognate partner (1–1, 2–2, 3–3, 5–5, 6–6), enabling accurate and independent integrations without cross-interference.
A defining feature of this mechanism is its directionality. In the absence of a Recombination Directionality Factor (RDF), the forward reaction (attP + attB → attL + attR) is irreversible. Once integrated, DNA fragments remain stably inherited, making Bxb1 particularly valuable for synthetic biology applications requiring long-term stability and predictable circuit behavior.
The phage attachment site (attP) and the bacterial attachment site (attB) are recognized by Bxb1 integrase. The enzyme catalyzes a 180° DNA rotation, producing the hybrid products attL and attR.
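The cognate-pairing rule can be expressed as a small check. The following is a minimal sketch, not part of the GenOMe software: the function names and the representation of sites as integer indices are assumptions made for illustration. It only encodes the rule stated above, that attP-n recombines with attB-n and with no other variant.

```python
# Hypothetical helpers for illustration only; sites are represented by their index (1, 2, 3, 5, 6).

def cognate_pairs(payload_attP, genome_attB):
    """Return the (attP, attB) index pairs that could recombine.

    A site pairs only with its cognate partner, i.e. attP-n with attB-n.
    """
    available = set(genome_attB)
    return [(p, p) for p in payload_attP if p in available]


def can_integrate_two_to_two(payload_attP, genome_attB):
    """True when the payload has two distinct attP sites, each with a cognate attB in the genome."""
    distinct = sorted(set(payload_attP))
    return len(distinct) == 2 and len(cognate_pairs(distinct, genome_attB)) == 2


# A fragment flanked by attP1/attP6 matches a chromosome carrying attB1/attB6,
# whereas a fragment flanked by attP3/attP2 finds no partner there.
print(can_integrate_two_to_two((1, 6), {1, 6}))
print(can_integrate_two_to_two((3, 2), {1, 6}))
```

Keeping the check purely index-based mirrors the "no cross-interference" claim: a mismatch in either flanking site is enough to rule out the dual-site event.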
Our starting point for developing GenOMe (Genome-integrated Modular Engineering) was ORBIT (Oligonucleotide Recombineering followed by Bxb1 Integrase Targeting). In E. coli, ORBIT functions by first installing a short attB site into the chromosome using a single-stranded oligonucleotide, and then employing Bxb1 integrase to insert a non-replicating DNA fragment carrying the cognate attP site, which recombines to generate stable attL/attR sites (Saunders & Ahmed, 2024). This provides a powerful method for stable chromosomal integration.
However, ORBIT still faces practical bottlenecks. Beyond being commonly applied to single-site integration, the system requires a complex helper plasmid that provides multiple functions simultaneously: a single-strand annealing protein (SSAP) to promote oligo recombination, a mismatch repair inhibitor (MutL_E32K) to prevent correction of the oligo, and inducible expression of Bxb1 integrase. Together with the targeting oligo and a non-replicating integration plasmid, these components must all be co-delivered and tightly regulated. This multi-component requirement increases system complexity and makes ORBIT less accessible for iterative applications such as iGEM projects.
This limitation naturally raises the question: if we had two attB sites in the chromosome, could we integrate a linear DNA fragment that carries two attP sites, and thereby achieve multi-locus insertion in one step?
By designing a chromosome with two attB sites and a DNA fragment carrying two attP sites, a genetic payload can be integrated into the chromosome at both sites in a single step, simplifying chromosomal insertion.
In principle, introducing two attB sites could expand ORBIT beyond single-site applications, but it also raises challenges such as low efficiency, the need for synchronization, and the complexity of helper plasmids. To address these issues, we re-engineered the system into GenOMe, embedding the Bxb1 integration machinery directly into the host genome together with optimized attB sites, thereby creating a simplified and BioBrick-compatible platform for iterative genome integrations.
To evaluate whether dual attB integration was feasible, we used GenOMe Navigator — our genome integration decision software — to analyze key factors influencing recombination efficiency. Instead of building abstract equations, the Navigator translated real experimental data into actionable guidance for the wet-lab, identifying the optimal timing and expression balance required for efficient Two-to-Two recombination.
A payload flanked by attP ends is inserted into dual attB sites in the genome by Bxb1 integrase, resulting in stable chromosomal integration through attL/attR recombination.
Drawing from prior literature, systems such as ORBIT typically achieve only ~0.1–1% single-locus efficiency (Saunders & Ahmed, 2024), while λ-Red dsDNA recombineering performs at 10⁻⁵–10⁻³ per cell. Even CRISPR/Cas9 + λ-Red hybrids (Pyne et al., 2015) reached only 39–47% correct clones without CFU normalization. Given these baselines, achieving synchronized dual-site integration in one step was initially considered unrealistic.
However, GenOMe Navigator simulated the Two-to-Two process across three mechanistic layers — protein, DNA, and population — and predicted parameter ranges that could dramatically improve performance. Its recommendations included:
Simulation of Bxb1 and SSAP expression levels showing the optimal induction window for synchronized recombination (~200–300 min, 4–8 μM Bxb1).
Following these software-guided adjustments, our wet-lab team achieved reproducible ~80% integration efficiency, transforming what was once thought infeasible into a predictable and repeatable genome integration strategy.
Through this process, GenOMe Navigator demonstrated that even complex dual-site events can be rationally optimized — turning theory into design and computation into real experimental success.
Guided by GenOMe Navigator, our genome integration decision software, we engineered the GenOMe strain, a redesigned genome chassis that extends the ORBIT concept beyond single-site integration into a scalable and modular engineering platform.
In this chassis, the chromosome is equipped with dual attB sites, functioning as expandable “genomic sockets,” while the Cassette BioBrick carries matching attP “plugs” to deliver genetic payloads. The detailed construction of the strain — including the installation of the Landing Pad with recombination sites, Bxb1 integrase, and selection markers — is described in the Build section below.
Genome engineering in GenOMe is based on two components: a Cassette BioBrick carrying attP sites, and a GenOMe strain with pre-installed attB sites in the chromosome. Together, they form a modular “plug-and-socket” system that enables scalable genome integrations.
With the design principles established and dual attB integration shown to be feasible in our modeling, the next step was to physically realize GenOMe in E. coli. We therefore set out to construct a working chassis by embedding attB sites and a multifunctional Landing Pad into the chromosome, creating the foundation of the GenOMe system.
To implement this design, we established a stepwise workflow:
1. Installing attB sites by oligonucleotide recombineering.
2. Designing and integrating the Landing Pad with recombination sites, Bxb1 integrase, and a selectable marker.
3. Establishing the GenOMe strain by co-electroporation of all components.
This systematic approach ensured that each component of GenOMe was embedded into the chromosome in a stable and verifiable way.
The first step in constructing GenOMe was to design a single-stranded DNA oligonucleotide (ssDNA) capable of embedding attB1 and attB6 sites into the E. coli chromosome. This was achieved using oligonucleotide recombineering, a method adapted from the ORBIT system (Saunders & Ahmed, 2024).
We targeted the lacZ locus as the integration site, providing a defined chromosomal position and an easy screening phenotype. To achieve this, we designed a 150-nt ssDNA oligonucleotide, TargetingOligo_B1_B6, with homology arms to lacZ that carried two attB cores (attB1 and attB6) separated by a short neutral spacer, enabling their precise installation into the genome.
A synthetic 150-nt ssDNA (TargetingOligo_B1_B6) was designed with lacZ-derived homology arms (blue), attB1 and attB6 cores, and a short spacer (yellow).
The ssDNA anneals to the lagging strand template during DNA replication. Assisted by the SSAP encoded on the helper plasmid, attB1/attB6 are precisely embedded into the chromosome, creating genomic sockets for downstream integration.
The Landing Pad (LandingPad_B3B2B5) is an integrated DNA fragment that equips the GenOMe chassis with modular integration capacity. It is pre-encoded with attP1 and attP6 for recombination with chromosomal attB1/attB6, and contains additional attB3, attB2, and attB5 sites reserved for future insertions. To support stable selection and autonomous recombination, the fragment also carries a kanamycin resistance marker (KanR) and the Bxb1 integrase gene. Together, these elements form a compact platform embedded in the genome, enabling stepwise and expandable integrations.
LandingPad_B3B2B5 includes attP1/attP6 for recombination, additional attB2/attB3/attB5 sites for future integrations, KanR for selection, and the Bxb1 integrase gene for recombination.
To construct the GenOMe chassis, E. coli MG1655 was first transformed with the ORBIT helper plasmid pHELP_TS_V2_ampR (Addgene plasmid #214467; gift from the Saunders Lab). Upon induction, this plasmid expresses the SSAP and Bxb1 integrase, which are essential for efficient recombineering and integration.
After induction, two elements were co-electroporated:
1. the ssDNA oligonucleotide carrying attB1/attB6, and
2. the Integrated DNA fragment (Landing Pad) containing attP1/attP6, attB2/attB3/attB5, KanR, and bxb1.
The ssDNA introduced attB1/attB6 into the genome, while the Integrated DNA fragment recombined with these sites to form attL/attR junctions. This process generated the GenOMe strain, embedding the Landing Pad into the chromosome as the foundation for modular genome engineering.
Overview of the GenOMe design, showing how attB sites are installed and a Landing Pad fragment integrates into the chromosome, establishing the modular chassis. E. coli MG1655 carrying the helper plasmid pHELP_TS_V2_ampR was induced to express SSAP and Bxb1. A ssDNA (TargetingOligo_B1_B6) and the LandingPad_B3B2B5 fragment were then co-electroporated, resulting in insertion of attB1/attB6 and integration of the Landing Pad into the chromosome.
Having designed and built the GenOMe platform, the next step was to evaluate whether our system truly worked as intended. Importantly, our validation showed that GenOMe not only supports modular integration but also achieves high integration efficiency, with test fragments reaching up to ~80%. We divided this evaluation into two stages: first confirming the construction of the GenOMe strain itself, and then functionally validating it with test DNA fragments.
After designing and building the GenOMe system, our first step was to verify that the chassis had been correctly established. To achieve this, we co-electroporated three elements into E. coli: (i) TargetingOligo_B1_B6, a synthetic ssDNA oligo carrying attB sites; (ii) the helper plasmid pHELP_TS_V2_ampR, which provides recombineering proteins (SSAP, Bxb1, and a mismatch repair inhibitor); and (iii) LandingPad_B3B2B5, an integration fragment flanked by attP sites.
Stepwise procedure for building the GenOMe chassis: induction of the helper plasmid, co-electroporation of ssDNA and the Landing Pad, and site-specific recombination forming attL/attR junctions.
Successful construction of the strain was confirmed by blue-white screening and cross-boundary PCR (one primer on the payload and the other on the flanking lacZ gene). The expected PCR band provided direct evidence that the Landing Pad was correctly integrated, marking the completion of the GenOMe strain.
With the GenOMe strain in place, we next evaluated whether it could support modular genome editing. For this purpose, we designed three linear DNA fragments (test fragments), each flanked by attP sites, and introduced them into the chassis through the Two-to-Two mechanism.
To evaluate genome integration efficiency, electroporated cells were plated on both antibiotic-containing LB agar (selection) and plain LB agar (non-selection). Colony counts from the two plates were compared to calculate the proportion of cells in which the test fragment was successfully integrated.
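The comparison of selective and non-selective plates can be written out explicitly. This is a minimal sketch under stated assumptions: the function names, the dilution-factor convention, and the example counts are hypothetical and are not measured data from this project. It assumes efficiency is defined as the ratio of colony-forming units (CFU) per mL on the selective plate to CFU per mL on the non-selective plate.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU/mL of the undiluted culture from one plate count.

    dilution_factor is the fold dilution before plating (e.g. 100 for a 10^-2 dilution).
    """
    if dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution_factor and plated_volume_ml must be positive")
    return colonies * dilution_factor / plated_volume_ml


def integration_efficiency(selective_cfu_per_ml, nonselective_cfu_per_ml):
    """Fraction of viable cells that carry the integrated fragment."""
    if nonselective_cfu_per_ml <= 0:
        raise ValueError("non-selective count must be positive")
    return selective_cfu_per_ml / nonselective_cfu_per_ml


# Illustrative counts only (not experimental results).
selective = cfu_per_ml(colonies=30, dilution_factor=100, plated_volume_ml=0.1)
nonselective = cfu_per_ml(colonies=120, dilution_factor=100, plated_volume_ml=0.1)
print(f"integration efficiency = {integration_efficiency(selective, nonselective):.2%}")
```

Normalizing both plates to CFU/mL first means the two plates do not have to be spread at the same dilution, which is often necessary when the non-selective plate would otherwise be confluent.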
Both dual-gene insertion (TestVer2) and complete Landing Pad replacement (TestVer3) occurred at high efficiency, confirming the robustness of the GenOMe platform.
Verification of all three test DNA fragments relied on cross-boundary PCR (one primer on the payload and one on the flanking genome), ensuring that only correctly integrated fragments produced the expected band size. In parallel, GFP fluorescence and gentamicin selection provided additional confirmation.
Together, these results demonstrated that GenOMe can perform single-gene integration, dual-gene insertion, and complete Landing Pad replacement at high efficiency, establishing it as a robust and scalable platform for modular genome engineering.
Schematic overview of three test integrations (TestVer1–3) through the Two-to-Two mechanism. Red arrows mark cross-boundary PCR primer sites for verifying correct integration.
From our test results, we understood the key factors for reliable recombination: high-quality competent cells, immediate recovery after electroporation, extended SSAP induction (~45 min) for efficient attB installation, and only short Bxb1 expression (~30 min) once attB sites were present. These lessons showed us that the true bottleneck was attB availability rather than integrase activity, and gave us the confidence to scale GenOMe into a modular genome engineering platform.
The refined design of GenOMe works as follows:
Step 1. Pre-installation of genomic slots
Multiple recombination sites (e.g., attP1, attB2, attB3, attP6) are embedded into the E. coli chromosome, serving as expandable slots for later insertions.
Step 2. Alternating insert modules (Slot A/B cassettes)
Two BioBrick cassettes are used in alternation:
Step 3. Iterative integration
The cycle proceeds step by step: Slot A inserts one fragment, Slot B the next, then Slot A again, enabling fragments to be added in a stable and reproducible manner.
Step 4. Modular expansion
By repeating this alternation, multiple fragments can be sequentially stacked into long genomic arrays. GenOMe thus functions as a “genome LEGO baseplate,” and any BioBrick flanked with attP/attB sites can directly enter this pipeline. In later replacement cycles, selection cassettes can be swapped out, leaving only minimal att scars.
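The alternation in Steps 2 to 4 can be sketched as a simple schedule. This is an illustration rather than the team's protocol: the function name and record layout are hypothetical, and the marker assignment (Slot A with GenR, Slot B with KanR) is taken from the accompanying figure caption.

```python
from itertools import cycle

# Marker assignment follows the Slot A/B figure caption; the rest is illustrative.
SLOTS = (("Slot A", "GenR"), ("Slot B", "KanR"))


def integration_schedule(fragments):
    """Pair each fragment with the slot cassette and selection marker for its round."""
    return [
        {"round": i, "fragment": frag, "slot": slot, "select_on": marker}
        for i, (frag, (slot, marker)) in enumerate(zip(fragments, cycle(SLOTS)), start=1)
    ]


for step in integration_schedule(["fragment_1", "fragment_2", "fragment_3"]):
    print(step)
```

Because consecutive rounds select on different markers, a colony that grows in a given round must have received that round's cassette rather than simply retaining the previous one.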
This iterative framework goes beyond single insertions. It allows iGEM teams to assemble multi-gene constructs directly in the chromosome in a predictable, stepwise manner — free from plasmid instability, size limitations, or antibiotic dependence — and provides the community with a powerful tool to pursue more ambitious and creative projects.
Alternating Slot A (GenR) and Slot B (KanR) cassettes enable stepwise, expandable chromosomal integration, yielding stable genomic arrays.
By applying the Design–Build–Test–Learn cycle, we transformed the idea of dual attB integration into GenOMe, a fully functional modular genome engineering platform. From GenOMe Navigator simulation to wet-lab validation, we proved that the Two-to-Two mechanism not only works, but also achieves remarkably high integration efficiency (~80%). Along the way, we identified the true bottlenecks and optimized the workflow into a scalable, iterative framework.
The result is a system that is stable, efficient, and easy to use — enabling reliable stepwise genome integrations. With replacement of selection cassettes leaving only minimal att scars, GenOMe delivers a final marker-free genome. This completes our engineering journey, providing iGEM teams with a genome “LEGO baseplate” to build ambitious and creative designs.
Ref: Saunders, H. & Ahmed, A. (2024). ORBIT in Escherichia coli: Oligonucleotide recombineering with Bxb1 integrase targeting enables kilobase-scale genome editing. Nucleic Acids Research, 52(5), 227. https://doi.org/10.1093/nar/gkae227
The objective of our design was to enable the host strain to express the single-strand annealing protein (ssAP) and Bxb1 integrase. Literature reports indicate that the attB/attP recombination system integrates a designed DNA sequence into the host genome through the coordinated action of ssAP and Bxb1 integrase. To supply these factors, we introduced a Helper plasmid into the host cells. This plasmid encodes CspRecT, which serves as the ssAP, and Bxb1 Int, the Bxb1 integrase.
We selected E. coli MG1655 as the host strain. The Helper plasmid was first isolated from E. coli DH5α using a miniprep procedure and subsequently introduced into MG1655 by electroporation. The rationale for choosing MG1655 as the host will be explained in the Two-to-Two cycle section. The Helper plasmid carries two inducible systems: the xylS system, which is activated by m-toluic acid, and the araC system, which is activated by L-arabinose. The sequence and purpose of using these two induction systems will also be described in the Two-to-Two cycle section.
Fig. 1. Schematic of the Helper plasmid.
Before initiating liquid culture, we performed colony PCR using our designed primers to verify the presence of the plasmid.
Fig. 2. The colony PCR result of the Helper plasmid.
Correct length: 860 bp
Lane 1, Helper plasmid, 860 bp
Lane 2, Helper plasmid, 860 bp
Lane 3, Helper plasmid, 860 bp
Lane 4, Helper plasmid, 860 bp
Plasmids were extracted from the bacterial culture and electrocompetent MG1655 cells were prepared. The Helper plasmid was then introduced into MG1655 via electroporation, after which the cells were plated on selective agar. Colonies appearing on the plates were subsequently verified by colony PCR.
Fig. 3. The colony PCR result of the Helper plasmid.
→ The colony PCR results show that the plasmid was lost.
Although colony growth on the selective plates appeared robust, the colony PCR results were negative. Possible explanations include reduced electroporation efficiency due to excessive salt concentration or low plasmid concentration, the emergence of satellite colonies caused by prolonged incubation, and the metabolic burden imposed by the Helper plasmid on the host strain.
Based on Cycle 1, we redesigned the protocol for preparing electrocompetent cells by adjusting the number of glycerol washes to remove residual salts more thoroughly. This modification allowed us to increase the concentration of the Helper plasmid used during electroporation. In addition, we freshly prepared LB agar plates to ensure selection efficiency.
Electrocompetent cells were prepared using the revised protocol, and a higher concentration of the Helper plasmid was employed for electroporation.
PCR was performed to confirm that the plasmid stock remained intact.
Fig. 4. The colony PCR result of the Helper plasmid.
Correct length: 860 bp
Lane 1, Helper plasmid, 860 bp
Lane 2, Helper plasmid, 860 bp
Lane 3, Helper plasmid, 860 bp
Lane 4, Helper plasmid, 860 bp
Following electroporation, the transformed strains were plated on selective agar. After overnight incubation, colony PCR was performed to assess whether the transformation was successful.
Fig. 5. The colony PCR result of the Helper plasmid.
→ The colony PCR results show that the transformation failed again.
The results did not meet our expectations, as PCR yielded negative outcomes. Since the other steps were carried out as intended, we hypothesize that the issue may have arisen during the preparation of electrocompetent cells. Specifically, the bacterial culture may have had an excessively high OD, making it difficult to completely remove residual salts. This could have resulted in arcing during electroporation and subsequently reduced transformation efficiency.
Based on the insights from Cycle 2, we suspected that the failure of transformation was caused by poor electrocompetent cell quality. Specifically, high OD during culture preparation likely led to incomplete salt removal, which caused arcing during electroporation. To address this, we redesigned the protocol with two main adjustments:
Fig. 6. The colony PCR result of the Helper plasmid.
→ The result shows that the transformation succeeded this time.
By implementing stricter OD control and additional glycerol washes, we successfully addressed the issues identified in Cycle 2. These adjustments led to improved transformation efficiency, with colony PCR confirming the presence of the Helper plasmid in MG1655. This outcome highlights the importance of precise OD monitoring and thorough salt removal during competent cell preparation. With the Helper plasmid now stably maintained in MG1655, the strain is capable of expressing ssAP (CspRecT) and Bxb1 integrase, providing the essential components for subsequent genome integration experiments.
Discover our step-by-step development of the Two-to-Two recombineering system. This design builds on the principle of modular genome engineering, where attP sites act as plugs and attB sites act as sockets. Under the action of Bxb1 integrase, these plugs and sockets snap together like building blocks, enabling precise and repeatable genome modification.
We constructed the Ex2Ver2 payload, which includes:
The targeting oligo was designed with homology arms that anneal to the host genome with the help of ssAP.
To initiate the process, the xylS system was induced with m-toluic acid to drive ssAP expression, after which electrocompetent cells were prepared for co-electroporation. The DNA template (Ex2Ver2) was amplified by PCR, and together with the targeting oligo, was prepared for transformation. Following co-electroporation, the cells were resuspended in LB broth supplemented with L-arabinose, thereby activating the araC regulatory system and enabling the expression of bxb1 integrase.
No colonies survived on the agar plates containing kanamycin for antibiotic selection.
The absence of surviving colonies on the kanamycin agar plates indicates that the DNA template (Ex2Ver2) was not integrated into the E. coli genome, as the kanamycin resistance gene encoded on Ex2Ver2 would otherwise have conferred survival.
This outcome suggests several possible explanations:
Increase the amount of targeting oligo & DNA template:
Increase induction time:
The extended ssAP induction time is based on the M2 data provided by Dry Lab, which indicate that the intracellular ssAP concentration approaches its peak at about 45 min.
The extended bxb1 integrase induction time is likewise based on the Dry Lab M2 data, which indicate that the intracellular bxb1 integrase concentration approaches its peak at about 4 h.
In Cycle 2, the experimental workflow followed the same initial procedure as in Cycle 1, but the amounts of targeting oligo and DNA template were increased. Following co-electroporation, the cells were resuspended in LB broth supplemented with L-arabinose. In contrast to Cycle 1, the induction period for the araC system was extended from 1 h to 4 h, aiming to achieve stronger and more sustained expression of the bxb1 integrase.
A distinct band at 741 bp was observed, matching the expected product size based on our primer design, thereby confirming successful amplification of the target sequence.
The modifications introduced in Cycle 2 proved effective. By increasing the amounts of DNA template and targeting oligo, we enhanced the likelihood of recombination events. Extending the induction period for the araC system from 1 hour to 4 hours allowed higher levels of bxb1 integrase to accumulate, which was consistent with predictions from Dry Lab modeling. As a result, colonies were able to grow on kanamycin selection plates, and colony PCR confirmed the successful integration of the Ex2Ver2 construct at the landing pad, producing the expected 741 bp fragment.
This cycle demonstrates the critical importance of optimizing both the amount of donor DNA and the timing of integrase induction in order to achieve reliable recombination efficiency. These insights will guide further refinements in subsequent cycles, particularly in balancing integrase expression with cellular growth to maximize transformation outcomes.
The objective of Test 1 was to verify whether the GenOMe system could successfully integrate a designed DNA sequence into the E. coli genome. We constructed IntTest_P3_GFP_P2, which contained two engineered attP sites (attP2 and attP3) corresponding to the chromosomal attB2 and attB3 loci in our GenOMe strain. The construct carried a GFP reporter gene, allowing confirmation of successful integration via green fluorescence.
Our rationale was straightforward:
By combining antibiotic selection with GFP fluorescence as a reporter, Test 1 provided a direct validation of the GenOMe system’s site-specific genome integration.
We introduced the IntTest_P3_GFP_P2 sequence into GenOMe strains via electroporation. Since the bxb1 integrase expression cassette had already been placed downstream of the constitutive promoter BBa_J23101, the transformed cells contained abundant bxb1 integrase at the time of DNA entry, thereby enabling efficient genomic integration of the IntTest_P3_GFP_P2 sequence.
After electroporating the IntTest_P3_GFP_P2 sequence into the GenOMe strains, the transformed colonies were observed using an OLYMPUS CKX53 fluorescence microscope. However, no colonies showed green fluorescence, suggesting that the designed integration did not occur under these conditions.
From the results, it was evident that the integration attempt was unsuccessful. Upon reviewing the experimental process, we identified two major issues:
To overcome the limitations observed in Cycle 1, we redesigned the experiment with two key improvements:
Our revised rationale:
If integration succeeds under the revised conditions, colonies with stable insertions should be clearly identifiable by their green fluorescence under the fluorescence microscope.
We repeated the electroporation of IntTest_P3_GFP_P2 into GenOMe strains, this time ensuring:
Following recovery and plating, colonies were observed using an OLYMPUS CKX53 fluorescence microscope. Colonies were also monitored for background growth reduction compared to Cycle 1.
The modifications introduced in Cycle 2 proved effective in overcoming the challenges observed in Cycle 1. By incorporating a short recovery phase in antibiotic-free LB broth after electroporation, the host cells had sufficient time to express the integration machinery before being subjected to selection. This significantly improved colony survival and reduced background growth. Additionally, serial dilution effectively lowered colony density, enabling clearer identification of individual colonies for downstream screening.
GFP expression confirmed successful integration of the IntTest_P3_GFP_P2 sequence, validating the robustness of the optimized protocol. These results demonstrate that both recovery optimization and careful control of plating density are critical factors for achieving reliable genome editing outcomes with the GenOMe system.
Building on the success of our initial integration, we next sought to expand the system’s utility by testing both dual-gene integration and gene replacement using specifically designed donor templates.
Test Version 2 (Dual-gene integration):
This construct was designed to introduce GFP under the control of promoter BBa_J23100 and RBS B0034, followed by a gentamicin resistance gene driven by a secondary RBS B0030. By combining a visible marker (GFP fluorescence) with an antibiotic resistance marker (gentamicin), successful integration could be confirmed through two independent selection strategies: fluorescence screening and resistance-based colony growth.
Test Version 3 (Gene replacement):
This construct was designed to replace the GenOMe strain’s original kanamycin resistance cassette with a dual module carrying GFP and gentamicin resistance, driven by the same promoter and RBS architecture as Test Version 2 (IntTest_P3_GFP_P2). This design allowed us to evaluate whether our integration system could not only insert new DNA but also precisely replace existing genomic elements with stable expression.
Both donor designs incorporated attP recombination sites (attP2/attP3 in TestVer2 and attP3/attP5 in TestVer3, IntTest_P3_GFP_genR_P5) to mediate integration into the corresponding attB sites in the host genome. Verification was planned through colony phenotype screening (fluorescence and antibiotic resistance).
We introduced the Test Version 2 (IntTest_P3_GFP_P2) and Test Version 3 (IntTest_P3_GFP_genR_P5) sequences into GenOMe cells via electroporation. Since the bxb1 integrase expression cassette was placed downstream of the constitutive promoter BBa_J23101, the transformed cells already contained abundant bxb1 integrase at the time of DNA entry, thereby facilitating efficient genomic integration. To improve transformation outcomes, cells were immediately transferred into pre-warmed LB broth and allowed to recover at 37 °C for 30 min. Prior to plating, serial dilution was performed to reduce colony density and improve downstream screening. Finally, cells were plated on gentamicin-containing agar plates to enable antibiotic selection; under these conditions, only successfully integrated colonies were expected to survive.
Following recovery and plating, colonies were observed using an OLYMPUS CKX53 fluorescence microscope.




The results of Test Version 2 and Test Version 3 demonstrated that our integration system is both effective and versatile. In Test 2, dual-gene integration was validated by the successful insertion of GFP and gentamicin resistance, confirmed through fluorescence screening and antibiotic selection. In Test 3, we further established that our method is capable of precise gene replacement, as the kanamycin resistance locus was successfully substituted with GFP and gentamicin resistance, again with high efficiency.
These outcomes highlight several important insights:
Together, these findings validate the feasibility of our GenOMe approach as a generalizable framework for stable, marker-assisted genome editing in E. coli.
We constructed a minimal mRNA dynamics model [1][2]: $$ \frac{dm}{dt} = \beta - k_{\mathrm{deg}} m $$ where m(t) denotes the intracellular mRNA concentration, β is the transcription rate, and kdeg is the first-order degradation constant. This model does not consider induction windows or gene-specific differences; instead, it serves as the mathematical backbone for subsequent protein-level modeling (M2). The analytical steady state is: $$ m^{*} = \frac{\beta}{k_{\mathrm{deg}}} $$
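For reference, this linear ODE also has a closed-form solution. Assuming the initial condition m(0) = 0 and constant β and kdeg, integrating with the factor e^{kdeg t} gives: $$ m(t) = \frac{\beta}{k_{\mathrm{deg}}}\left(1 - e^{-k_{\mathrm{deg}} t}\right) = m^{*}\left(1 - e^{-k_{\mathrm{deg}} t}\right) $$ The curve therefore rises monotonically toward m* with time constant 1/kdeg and reaches half of m* at: $$ t_{1/2} = \frac{\ln 2}{k_{\mathrm{deg}}} $$ Substituting back confirms consistency with the model: dm/dt = β e^{-kdeg t} = β − kdeg m(t).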
We implemented the model as an ordinary differential equation (ODE) with the initial condition m(0)=0. Baseline parameters were chosen from literature estimates:
We perturbed β and kdeg by ±20% and simulated mRNA concentration profiles over a 16-hour window. Results are shown in Fig. M1-1:
mRNA concentration profiles under ±5–20% perturbations of transcription rate (β, left) and degradation rate (kdeg, right). The black line denotes the baseline (β = 0.20 μM h⁻¹, kdeg = 0.30 h⁻¹). Colored curves represent perturbations, with annotated values showing concentrations and relative deviations at 12 h.
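As a minimal sketch of this perturbation analysis, the closed-form solution for m(0) = 0, m(t) = (β/kdeg)(1 − e^(−kdeg·t)), can be evaluated directly. The baseline values are those given in the Fig. M1-1 caption; the function name and the printed layout are illustrative assumptions, not the team's original script.

```python
import math

BETA = 0.20   # transcription rate, uM/h (baseline from the Fig. M1-1 caption)
K_DEG = 0.30  # first-order degradation constant, 1/h (same caption)


def mrna(t, beta=BETA, k_deg=K_DEG):
    """Closed-form solution of dm/dt = beta - k_deg*m with m(0) = 0."""
    return beta / k_deg * (1.0 - math.exp(-k_deg * t))


if __name__ == "__main__":
    base = mrna(12.0)
    cases = [
        ("baseline", BETA, K_DEG),
        ("beta +20%", 1.2 * BETA, K_DEG),
        ("beta -20%", 0.8 * BETA, K_DEG),
        ("k_deg +20%", BETA, 1.2 * K_DEG),
        ("k_deg -20%", BETA, 0.8 * K_DEG),
    ]
    for label, beta, k_deg in cases:
        m12 = mrna(12.0, beta, k_deg)
        # Relative deviation from the baseline concentration at 12 h
        print(f"{label:>11}: m(12 h) = {m12:.3f} uM ({100 * (m12 / base - 1):+.1f}%)")
```

Because m(t) is linear in β, a ±20% change in β rescales the whole curve by ±20%, whereas a change in kdeg alters both the steady state and the time needed to approach it.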
This cycle confirmed:
Building on the mRNA model from M1, we introduced translation and protein degradation to construct the simplest protein-level dynamics [1][2]: $$ \frac{dP}{dt} = k_{\mathrm{tl}} \cdot m(t) - d_{p} \cdot P(t) $$ where P(t) represents protein concentration, ktl is the translation rate constant, and dp is the protein degradation rate. At this stage, translational delay and cellular burden were not considered; instead, we focused on baseline dynamics of three representative cases: ssAP, Bxb1, and Bxb1 under promoter J23105[3].
We implemented the model using an ODE solver with the following baseline parameters:
The selected ktl values are within the range of E. coli translation rates [4], and the dp values are consistent with measured protein turnover times [5].
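The coupled mRNA and protein equations can be integrated with a simple fixed-step scheme. The sketch below assumes constant transcription (no induction window) and uses placeholder values for ktl and dp; these are hypothetical and are not the team's baseline parameters. `simulate_expression` and `half_rise_time` are illustrative names.

```python
def simulate_expression(beta=0.20, k_deg=0.30, k_tl=1.0, d_p=0.5,
                        t_end=16.0, dt=0.001):
    """Forward-Euler integration of the two-stage model with m(0) = P(0) = 0.

    dm/dt = beta - k_deg * m
    dP/dt = k_tl * m - d_p * P
    k_tl and d_p are placeholders (assumptions), not fitted values.
    """
    m, p = 0.0, 0.0
    times, mrna, protein = [0.0], [m], [p]
    for step in range(1, int(round(t_end / dt)) + 1):
        dm_dt = beta - k_deg * m
        dp_dt = k_tl * m - d_p * p
        m += dt * dm_dt
        p += dt * dp_dt
        times.append(step * dt)
        mrna.append(m)
        protein.append(p)
    return times, mrna, protein


def half_rise_time(times, values):
    """First sampled time at which the curve reaches half of its final value."""
    target = 0.5 * values[-1]
    for t, v in zip(times, values):
        if v >= target:
            return t
    return times[-1]


if __name__ == "__main__":
    t, m, p = simulate_expression()
    print(f"mRNA half-rise time:    {half_rise_time(t, m):.2f} h")
    print(f"protein half-rise time: {half_rise_time(t, p):.2f} h")
```

Under these assumptions the protein curve reaches half of its final value later than the mRNA curve, which is the lag expected when protein integrates the mRNA signal.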
Simulation results (Fig. M2-C1-1) showed that protein trajectories (thick lines) were smoother and more delayed compared to their corresponding mRNA dynamics (thin lines), demonstrating the “low-pass filtering” effect of protein accumulation [1][2]. ssAP quickly peaked and declined, Bxb1 steadily accumulated and maintained a high level, while Bxb1_J23105 displayed intermediate behavior [3].
We further performed parameter sensitivity analysis for Bxb1 (Fig. M2-C1-2), perturbing ktl and dp by ±20%. Results indicated that:
Simulated concentration curves for ssAP, Bxb1, and Bxb1_J23105. Thin lines = mRNA; thick lines = protein. Shaded regions denote induction windows. Distinct dynamics are observed: ssAP declines rapidly, Bxb1 accumulates persistently, and Bxb1_J23105 lies between the two.
Protein concentration trajectories under ±20% perturbations of translation rate (ktl) and degradation rate (dp). Left: time courses; upper right: relative change in mean concentration (12–16 h); lower right: relative change in AUC (0–16 h).
From Cycle 1 we concluded:
To extend the basic protein dynamics framework, we incorporated two biologically realistic factors:
$$ \frac{dP}{dt} = k_{\mathrm{tl}} \cdot m(t - \tau) - d_{p} \cdot P(t) $$
Simulation results (Fig. M2-C2-1) showed that introducing a 2-minute translational delay shifted the half-max time (t50) by only 1–2 minutes in the first 30 minutes, confirming that delay effects are negligible for induction regimes longer than 1 h [7].


Simulated protein trajectories of ssAP and Bxb1 with (blue) and without (gray dashed) a 2-minute translational delay. Delay shifted the t50 by 1–2 min, confirming negligible influence under practical induction conditions (>1 h).
We next compared protein trajectories of ssAP and Bxb1 (Fig. M2-C2-2):
These contrasting behaviors highlight ssAP as a short-term supporter and Bxb1 as the primary long-term supplier [10].
Protein expression profiles of ssAP (orange, short half-life) and Bxb1 (blue, long half-life). Shaded regions mark induction windows (ssAP: 0–0.5 h pre-induction; Bxb1: 0–4 h induction). ssAP peaks early then decays rapidly, while Bxb1 accumulates steadily and sustains high levels, reflecting complementary biological roles.
To translate these observations into actionable design rules, we quantified time-over-threshold (ToT)—the duration for which protein levels exceed the functional requirement P*. The heatmap in Fig. M2-C2-3 shows ToT as a function of induction length and translation rate (ktl):
Heatmap of ToT (hours above threshold P*) as a function of induction length (x-axis) and translation rate ktl (y-axis). Warmer colors indicate longer ToT. The dashed box highlights the top 10% ToT combinations, revealing that long inductions coupled with strong translation maximize effective protein availability.
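Time-over-threshold can be computed from any sampled protein trajectory. The following is a minimal sketch that interpolates linearly between samples to locate threshold crossings; the function name and the example trajectory are hypothetical and do not come from the simulations behind Fig. M2-C2-3.

```python
def time_over_threshold(times, values, threshold):
    """Total time during which the sampled curve is at or above `threshold`.

    Crossing times inside a sampling interval are estimated by linear
    interpolation between the two neighbouring samples.
    """
    total = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        a = values[i] - threshold
        b = values[i + 1] - threshold
        if a >= 0 and b >= 0:
            total += t1 - t0                      # whole interval above
        elif a < 0 and b < 0:
            continue                              # whole interval below
        else:
            t_cross = t0 + (t1 - t0) * (-a) / (b - a)
            total += (t1 - t_cross) if b >= 0 else (t_cross - t0)
    return total


if __name__ == "__main__":
    # Hypothetical triangular pulse: rises to 4 at t = 2 h, returns to 0 at t = 4 h
    ts = [0.0, 1.0, 2.0, 3.0, 4.0]
    ps = [0.0, 2.0, 4.0, 2.0, 0.0]
    print(time_over_threshold(ts, ps, threshold=1.0))
```

Scanning this function over a grid of induction lengths and ktl values would produce the kind of ToT map described in the caption.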
After establishing the protein dynamics framework (Cycle 1) and incorporating translational delay and gene specificity (Cycle 2), we focused on two factors most relevant to experimental design:
Simulated protein trajectories for ssAP (left) and Bxb1 (right). ssAP shows little difference between 1 h and 4 h induction, while Bxb1 requires ≥4 h induction to exceed threshold P*.
ToT as a function of induction duration under different induction strengths (0.1% vs 1%). ssAP is effective with short induction, while Bxb1 requires ≥4 h induction to achieve functional coverage.
Cycle 3 provided three major insights:
[1] Paulsson, J. (2004). Summing up the noise in gene networks. Nature, 427(6973), 415–418. https://doi.org/10.1038/nature02257
[2] Thattai, M., & van Oudenaarden, A. (2001). Intrinsic noise in gene regulatory networks. PNAS, 98(15), 8614–8619. https://doi.org/10.1073/pnas.151588598
[3] Kelly, J. R., Rubin, A. J., Davis, J. H., Ajo-Franklin, C. M., Cumbers, J., Czar, M. J., ... & Endy, D. (2009). Measuring the activity of BioBrick promoters using an in vivo reference standard. Journal of Biological Engineering, 3, 4. https://doi.org/10.1186/1754-1611-3-4
[4] Li, G.-W., Oh, E., & Weissman, J. S. (2012). The anti-Shine-Dalgarno sequence drives translational pausing and codon choice in bacteria. Nature, 484(7395), 538–541. https://doi.org/10.1038/nature10965
[5] Larrabee, K. L., Phillips, J. O., Williams, G. J., & Larrabee, A. R. (1980). The relative rates of protein synthesis and degradation in Escherichia coli. Journal of Biological Chemistry, 255(9), 4125–4130.
[6] Raj, A., & van Oudenaarden, A. (2008). Nature, nurture, or chance: stochastic gene expression and its consequences. Cell, 135(2), 216–226. https://doi.org/10.1016/j.cell.2008.09.050
[7] Swain, P. S., Elowitz, M. B., & Siggia, E. D. (2002). Intrinsic and extrinsic contributions to stochasticity in gene expression. PNAS, 99(20), 12795–12800. https://doi.org/10.1073/pnas.162041399
[8] Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., & Walter, P. (2015). Molecular Biology of the Cell (6th ed.). Garland Science.
[9] Phadtare, S., & Inouye, M. (2003). Cold shock response and cold shock proteins. Genes & Development, 17(8), 971–981. https://doi.org/10.1101/gad.1077503
[10] Ghosh, P., Kim, A. I., & Hatfull, G. F. (2003). The orientation of mycobacteriophage Bxb1 integration is solely dependent on the central dinucleotide of attP and attB. Molecular Cell, 12(4), 1101–1111. https://doi.org/10.1016/S1097-2765(03)00440-6
[11] Raj, A., Peskin, C. S., Tranchina, D., Vargas, D. Y., & Tyagi, S. (2006). Stochastic mRNA synthesis in mammalian cells. PLoS Biology, 4(10), e309. https://doi.org/10.1371/journal.pbio.0040309
[12] Bennett, M. R., Pang, W. L., Ostroff, N. A., Baumgartner, B. L., Nayak, S., Tsimring, L. S., & Hasty, J. (2008). Metabolic gene regulation in a dynamically changing environment. Nature, 454(7208), 1119–1122. https://doi.org/10.1038/nature07211
[13] Shachrai, I., Zaslaver, A., Alon, U., & Dekel, E. (2010). Cost of unneeded proteins in E. coli is reduced after several generations in exponential growth. Molecular Cell, 38(5), 758–767. https://doi.org/10.1016/j.molcel.2010.05.015
[14] Scott, M., Gunderson, C. W., Mateescu, E. M., Zhang, Z., & Hwa, T. (2010). Interdependence of cell growth and gene expression: origins and consequences. Science, 330(6007), 1099–1102. https://doi.org/10.1126/science.1192588
[15] Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M., O’Shea, E., Pilpel, Y., & Barkai, N. (2006). Noise in protein expression scales with natural protein abundance. Nature Genetics, 38(6), 636–643. https://doi.org/10.1038/ng1807
Across Cycles 1–5, the M3 model progresses from a population-averaged description of attB formation to a stochastic, replication-aware framework.
Successive refinements reveal that oligo uptake heterogeneity, ssAP noise, and replication-fork gating collectively determine integration efficiency—culminating in Cycle 5, where locus distance from oriC emerges as the dominant factor.
In the initial formulation, we adopted a mass-action kinetics framework, assuming that single-stranded oligonucleotides (Oligo, ssDNA) and the single-strand annealing protein (ssAP, e.g., RecT or CspRecT) reversibly bind in the cytoplasm to form a complex C (ssDNA:ssAP). This complex subsequently acts on the chromosome to trigger the formation of an attB site.
Key assumptions:
This design extends from classical protein–DNA binding models [1].
The model is represented by a system of three ODEs:
$$ \frac{d[\mathrm{ssDNA}]}{dt} = -k_{\mathrm{bind}}[\mathrm{ssDNA}][\mathrm{ssAP}] + k_{\mathrm{unbind}}[C] $$
$$ \frac{d[\mathrm{ssAP}]}{dt} = -k_{\mathrm{bind}}[\mathrm{ssDNA}][\mathrm{ssAP}] + k_{\mathrm{unbind}}[C] $$
$$ \frac{d[C]}{dt} = k_{\mathrm{bind}}[\mathrm{ssDNA}][\mathrm{ssAP}] - (k_{\mathrm{unbind}} + k_{\mathrm{deg,complex}})[C] $$
where:
Output formulation:
The attB formation rate is assumed proportional to the complex concentration:
$$ \frac{d[\text{attB}]}{dt} \propto [C] $$
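The three binding ODEs and the attB output can be integrated numerically. The sketch below uses a classical fourth-order Runge–Kutta step; all parameter values and initial concentrations are hypothetical, the proportionality constant for attB formation is set to 1, and ssAP is given a fixed initial pool rather than the M2 input trajectory.

```python
def rhs(state, k_bind, k_unbind, k_deg_complex):
    """Right-hand side of the mass-action model; state = (ssDNA, ssAP, C, attB)."""
    ssdna, ssap, cplx, _attb = state
    bind = k_bind * ssdna * ssap
    return (
        -bind + k_unbind * cplx,
        -bind + k_unbind * cplx,
        bind - (k_unbind + k_deg_complex) * cplx,
        cplx,  # d[attB]/dt proportional to [C]; constant assumed to be 1
    )


def rk4_step(state, dt, *params):
    """One classical Runge-Kutta (RK4) step."""
    k1 = rhs(state, *params)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_binding(ssdna0=1.0, ssap0=2.0, k_bind=1.0, k_unbind=0.1,
                     k_deg_complex=0.05, t_end=10.0, dt=0.01):
    """Integrate from (ssDNA0, ssAP0, 0, 0); all defaults are placeholders."""
    state = (ssdna0, ssap0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, k_bind, k_unbind, k_deg_complex)
    return state


if __name__ == "__main__":
    ssdna, ssap, cplx, attb = simulate_binding()
    print(f"ssDNA={ssdna:.3f}  ssAP={ssap:.3f}  C={cplx:.3f}  attB signal={attb:.3f}")
```

Since ssDNA and ssAP have identical right-hand sides, their difference stays constant over time, which provides a quick sanity check on any implementation of these equations.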
Using ssAP dynamics from the M2 model as input, we simulated under varying conditions:
Simulation results showed:
Insights from Cycle 1 include:
For these reasons, Cycle 2 transitioned to a Michaelis–Menten framework, replacing intractable kbind, kunbind with effective parameters (Vmax, Km, kcat) that better capture saturation effects and can be experimentally inferred.
In Cycle 1, we described ssAP–Oligo binding using simple mass-action kinetics. However, two major issues emerged:
Therefore, in Cycle 2, we reformulated the process using Michaelis–Menten saturation kinetics, treating attB formation as an enzyme-like reaction:
In addition, the initial assumption of a reversible two-step binding between ssAP and Oligo (association and dissociation of the complex) was simplified into a single effective step, represented by a Michaelis–Menten form, consistent with enzyme-like kinetics observed in recombineering systems [3][4].
The refined reaction rate is given by:
$$v(t) = \frac{k^{AP}_{cat} [ssAP](t)[Oligo]}{K^{AP}_{m} + [Oligo]}$$
where:
The attB formation rate is then expressed as:
$$\frac{d[attB]}{dt} = v(t)[Genome]$$
When [Oligo] ≫ KmAP, this reduces to:
$$\frac{d[attB]}{dt} \approx k^{AP}_{cat}[ssAP](t)[Genome]$$
This formulation naturally accounts for rate saturation at high Oligo levels, overcoming the overestimation problem in Cycle 1.
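The saturation behaviour can be illustrated by evaluating the rate law at increasing Oligo levels. This is a minimal sketch with hypothetical parameter values (kcat = 2, Km = 1, [ssAP] = 1, arbitrary units); `attb_rate` is an illustrative name.

```python
def attb_rate(ssap, oligo, k_cat, k_m):
    """Michaelis-Menten rate v = k_cat * [ssAP] * [Oligo] / (K_m + [Oligo])."""
    return k_cat * ssap * oligo / (k_m + oligo)


if __name__ == "__main__":
    k_cat, k_m, ssap = 2.0, 1.0, 1.0   # placeholder values, arbitrary units
    v_max = k_cat * ssap               # limiting rate when [Oligo] >> K_m
    for fold in (0.1, 1.0, 10.0, 100.0):
        v = attb_rate(ssap, fold * k_m, k_cat, k_m)
        print(f"[Oligo] = {fold:>5} x Km  ->  v / v_max = {v / v_max:.3f}")
```

At [Oligo] = Km the rate is exactly half of its limiting value, and further increases in Oligo yield diminishing returns, unlike the mass-action form, which grows without bound.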
Simulation setup:
Results:
Cycle 2 brought the following improvements:
These limitations motivated Cycle 3, where we reframed attB formation as a probabilistic write-in event (“at least one success per cell”) and explicitly introduced genome copy number G.
Although Cycle 2 resolved the issue of rate saturation at high Oligo concentrations, two major limitations remained:
Therefore, in Cycle 3, we reframed attB formation as a probabilistic write-in process:
The instantaneous success rate at time t is given by:
$$k(t) = \frac{k^{AP}_{cat}[ssAP](t)[Oligo](t)}{K^{AP}_{m} + [Oligo](t)}$$
The probability of attB formation follows:
$$\frac{dp}{dt} = k(t)[1 - p(t)], \quad p(0) = 0$$
with solution:
$$p(t) = 1 - \exp\left(-\int_{0}^{t} k(\tau)\, d\tau\right)$$
If k(t) is approximated as constant:
$$p(t) = 1 - e^{-kt}$$
Here, p(t) represents the probability that a single cell has formed at least one attB site by time t.
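For a time-varying rate, the integral in the solution can be evaluated numerically. The sketch below applies the trapezoidal rule to a user-supplied k(t); the rate values in the example are hypothetical.

```python
import math


def success_probability(k_of_t, t_end, n_steps=10_000):
    """p(t_end) = 1 - exp(-integral of k from 0 to t_end), trapezoidal rule."""
    dt = t_end / n_steps
    integral = 0.0
    for i in range(n_steps):
        integral += 0.5 * (k_of_t(i * dt) + k_of_t((i + 1) * dt)) * dt
    return 1.0 - math.exp(-integral)


if __name__ == "__main__":
    # Constant rate: reduces to the closed form 1 - exp(-k t)
    print(success_probability(lambda t: 0.1, 4.0))
    # Decaying rate, e.g. as ssAP or Oligo is depleted (hypothetical profile)
    print(success_probability(lambda t: 0.1 * math.exp(-t), 4.0))
```

With a constant rate the numerical result coincides with 1 − e^(−kt); with a decaying rate the probability saturates below that value because the cumulative hazard is bounded.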
For this stage, genome copy number was simplified to G = 1.
Results showed:
Cycle 3 provided an important conceptual shift:
These limitations motivated Cycle 4, where we expanded to a non-homogeneous Poisson process with Monte Carlo simulations, incorporating population-level variability in Oligo uptake, ssAP expression, and genome copy number.
While Cycle 3 reframed attB formation as a single-cell probabilistic event, two key issues remained:
To address these limitations, Cycle 4 models attB formation as a non-homogeneous Poisson process at the single-cell level and applies Monte Carlo simulations to incorporate distributions of Oligo uptake, ssAP expression, and genome copy number. This yields population-level attB success rates and explains the observed polarization.
$$\Lambda_i(t) = \int_0^t k_i(\tau)\, d\tau$$
with instantaneous rate:
$$k_i(t) = \frac{k^{AP}_{cat}[ssAP]_i(t)[Oligo]_i(t)}{K^{AP}_{m} + [Oligo]_i(t)} \cdot G_i$$
where Gi ∈ {1, 2, 4} is the genome copy number [6].
The probability of at least one successful attB formation in cell i is:
$$P_i(t) = 1 - e^{-\Lambda_i(t)}$$
Averaging over the N simulated cells gives the population-level success fraction:
$$p_{\ge 1}(t) = \frac{1}{N} \sum_{i=1}^{N} P_i(t)$$
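A Monte Carlo estimate of the population-level success fraction can be sketched as follows. The distributions (log-normal Oligo uptake, Gaussian ssAP level, uniform choice of G from {1, 2, 4}), all numerical values, and the simplification to time-constant rates are assumptions made for illustration; they are not the team's simulation settings.

```python
import math
import random


def simulate_population(n_cells=5000, t_end=4.0, k_cat=0.05, k_m=1.0,
                        ssap_mean=2.0, oligo_sigma=1.0, seed=0):
    """Return (population success fraction, list of per-cell probabilities)."""
    rng = random.Random(seed)
    probs = []
    for _ in range(n_cells):
        oligo = rng.lognormvariate(0.0, oligo_sigma)          # uptake heterogeneity
        ssap = max(0.0, rng.gauss(ssap_mean, 0.3 * ssap_mean))  # expression noise
        genomes = rng.choice((1, 2, 4))                        # genome copy number G_i
        k_i = k_cat * ssap * oligo / (k_m + oligo) * genomes   # per-cell rate
        cumulative = k_i * t_end       # Lambda_i(t) for a time-constant rate
        probs.append(1.0 - math.exp(-cumulative))
    return sum(probs) / len(probs), probs


if __name__ == "__main__":
    fraction, per_cell = simulate_population()
    print(f"population success fraction: {fraction:.3f}")
    print(f"cells with P_i > 0.9: {sum(q > 0.9 for q in per_cell)} of {len(per_cell)}")
```

Replacing the constant per-cell rate with an integrated time-dependent rate recovers the non-homogeneous Poisson form given above.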
Simulation setup:
Simulation results:
Time landmarks:
The ~33% plateau reflects:
This agrees with reported 10–30% recombineering efficiency in λ-Red/RecT systems [5].
Single-cell distributions from Monte Carlo simulations revealed skewed probabilities concentrated around Pi ≈ 0.1–0.3. A dedicated polarization histogram (Fig. M3-2D) highlighted that only under higher Oligo uptake or reduced Km do a fraction of cells achieve Pi ≈ 1, explaining the experimentally observed “all-or-none” recombineering outcomes [7].
Most cells cluster at low–intermediate probabilities (Pi≈0.1–0.3); polarization with a fraction reaching Pi≈1 appears only when Oligo uptake is increased or Km is reduced.
While Cycle 4 identified Oligo delivery (gm) as the dominant determinant, followed by kcat and then Km, these molecular factors alone cannot explain why some genomic loci (e.g., leuD, lacZ) remain far below the predicted 30% plateau. This motivated Cycle 5, where locus-dependent replication timing and fork accessibility were explicitly incorporated.
Future work should extend to Gillespie-based stochastic simulations and integrate single-cell measurements for calibration. In addition, locus-specific replication timing and copy number dynamics should be incorporated, extending the non-homogeneous Poisson framework to include oriC–ter gradients. This motivates the subsequent Cycle 5, which explicitly models position- and fork-gated accessibility.
In previous cycles, we formalized the stochastic process of attB formation and initially assumed that two-to-two recombination would be intrinsically less efficient than one-to-one due to steric hindrance.
However, experimental data from Saunders & Ahmed (2024) challenged this assumption:
Both datasets confirm that genomic position, rather than site number, dominates integration probability.
Thus, this cycle introduces the locus-dependent replication-timing hypothesis [11][12]:
We extended the Cycle 4 non-homogeneous Poisson framework by adding a locus-specific replication-timing variable [12]:
$$ \lambda_i(x,t) = G_i(x,t)\, f([O]_i(t), [ssAP]_i(t))\, \chi_{\text{rep}}(x,t) $$
where Gi(x,t) is the locus-specific copy number determined by replication timing, f([O]i, [ssAP]i) is the molecular rate from Cycle 4, and χrep is the replication-window indicator. Replication timing was approximated by the Cooper–Helmstetter model [12]; six loci were simulated:
oriC → metA → hisA → lacZ → galK → leuD → (oriC)
Simulations reproduced the gradient measured by Saunders & Ahmed (2024) [10]: integration efficiency declines exponentially with fork distance from oriC, following
$$ P(d) \approx 0.6\%\, e^{-0.0007d} $$
Each additional ~1000 kb causes ≈ 50–70 % loss of success probability.
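The fitted decay can be evaluated directly. In this minimal sketch, d is assumed to be measured in kb (implied by the ~1000 kb statement but not stated explicitly), and the function name is illustrative.

```python
import math


def integration_efficiency(d_kb, p0_percent=0.6, decay_per_kb=0.0007):
    """P(d) = p0 * exp(-decay * d), returned in percent; d assumed to be in kb."""
    return p0_percent * math.exp(-decay_per_kb * d_kb)


if __name__ == "__main__":
    for d in (0, 500, 1000, 2000):
        p = integration_efficiency(d)
        retained = p / integration_efficiency(0)
        print(f"d = {d:>4} kb: P = {p:.4f}%  (retains {100 * retained:.1f}% of the d = 0 value)")
```

With these constants, each additional 1000 kb retains a factor e^(−0.7) ≈ 0.50 of the efficiency, which corresponds to the lower end of the quoted 50–70% loss.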
At lacZ (two-pair), efficiency (~0.0195%) was only ~5–7% of the one-to-one baseline (metA ~0.35%), showing that replication-fork accessibility, not steric hindrance, limits success.
As forks initiate at oriC and travel bidirectionally (oriC → metA → hisA → lacZ → galK → leuD → oriC), early-replicated loci (metA, hisA) exhibit high accessibility, mid-replichore sites (lacZ, galK) moderate accessibility, and the terminus-proximal locus (leuD) the lowest [10][12].
By explicitly integrating locus dependence and replication-fork timing, the model provides a quantitative bridge between literature data (leuD, one pair, ~0.0386%) and our experimental lacZ (two pairs, 0.0195%) results [10][12].
These datasets, together with the ORBIT recombineering map [10], reveal a continuous gradient of integration success along the oriC → terC axis, consistent with the Cooper–Helmstetter replication-timing model [12].
By embedding these positional variables into the stochastic Poisson formulation, Cycle 5 transforms the abstract single-cell model of Cycle 4 into a chromosome-aware predictive system.
Contrary to initial expectations, steric hindrance from two-to-two recombination was not the dominant limiting factor.
Integration probability scales exponentially with locus distance from oriC, while the number of attB/attP pairs contributes only marginally at accessible sites.
At ter-proximal loci, the replication window is so narrow that additional site pairs do not compensate for the lack of fork overlap.
This positional dominance clarifies why hisA and metA (ori-proximal) outperform galK, leuD, and lacZ by over an order of magnitude [10].
Although lacZ may be numerically closer to hisA along the genome map, they lie on opposite replichores. Two-to-two recombination requires both ends to be simultaneously exposed within the same replication-fork window; cross-replichore pairs therefore suffer a geometric timing penalty, explaining the particularly low efficiency at lacZ.
[1] Segel, I. H. (1975). Enzyme Kinetics: Behavior and Analysis of Rapid Equilibrium and Steady-State Enzyme Systems. Wiley.
[2] Datta, S., Costantino, N., & Court, D. L. (2008). A set of recombineering plasmids for Gram-negative bacteria. Gene, 379, 109–115. https://doi.org/10.1016/j.gene.2006.12.009
[3] Wang, H. H., Isaacs, F. J., Carr, P. A., et al. (2009). Programming cells by multiplex genome engineering and accelerated evolution. Nature, 460(7257), 894–898. https://doi.org/10.1038/nature08187
[4] Raj, A., & van Oudenaarden, A. (2008). Nature, nurture, or chance: stochastic gene expression and its consequences. Cell, 135(2), 216–226. https://doi.org/10.1016/j.cell.2008.09.050
[5] Elowitz, M. B., Levine, A. J., Siggia, E. D., & Swain, P. S. (2002). Stochastic gene expression in a single cell. Science, 297(5584), 1183–1186. https://doi.org/10.1126/science.1070919
[6] Cooper, S., & Helmstetter, C. E. (1968). Chromosome replication and the division cycle of Escherichia coli B/r. Journal of Molecular Biology, 31(3), 519–540. https://doi.org/10.1016/0022-2836(68)90425-7
[7] Murphy, K. C. (2016). λ Red recombineering: Genetic engineering at its finest. Annual Review of Genetics, 50, 367–387. https://doi.org/10.1146/annurev-genet-120215-035104
[8] Scott, M., Gunderson, C. W., Mateescu, E. M., Zhang, Z., & Hwa, T. (2010). Interdependence of cell growth and gene expression: origins and consequences. Science, 330(6007), 1099–1102. https://doi.org/10.1126/science.1192588
[9] Modrich, P. (1991). Mechanisms and biological effects of mismatch repair. Annual Review of Genetics, 25, 229–253. https://doi.org/10.1146/annurev.ge.25.120191.001305
[10] Saunders, S. H., & Ahmed, A. M. (2024). ORBIT for E. coli: kilobase-scale oligonucleotide recombineering at high throughput and high efficiency. Nucleic Acids Research, gkae227. https://doi.org/10.1093/nar/gkae227
[11] Mosberg, J. A., Gregg, C. J., Lajoie, M. J., Wang, H. H., & Church, G. M. (2012). Improving oligonucleotide-mediated recombineering by mismatch repair protein mutS. Nucleic Acids Research, 40(14), e111. https://doi.org/10.1093/nar/gks412
[12] Cooper, S., & Helmstetter, C. E. (1968). Chromosome replication and the division cycle of Escherichia coli B/r. Journal of Molecular Biology, 31(3), 519–540. https://doi.org/10.1016/0022-2836(68)90425-7
[13] Modrich, P. (1991). Mechanisms and biological effects of mismatch repair. Annual Review of Genetics, 25, 229–253. https://doi.org/10.1146/annurev.ge.25.120191.001305
To provide the theoretical foundation for multi-site recombination modeling, we first constructed a minimal one-to-one model describing recombination between attB and attP mediated by Bxb1 integrase. At this stage, forward recombination and a provisional product inhibition (α) term were included. Subsequent cycles demonstrated that catalytic inhibition requires RDF expression; in its absence, attL/attR only reversibly bind Bxb1 without driving backward conversion. Therefore, α was later removed and replaced by a passive sequestration term in the Bxb1 mass-balance [1][2].
The system was formalized as a set of ordinary differential equations (ODEs) (Fig. M4-C1-1), with key parameters including the binding rate constant (kon), dissociation rate constant (koff), catalytic turnover rate (kcat), and product inhibition coefficient (α). Numerical simulations were implemented on the Google Colab platform.
System of ODEs describing forward recombination of attB and attP mediated by Bxb1 integrase, including product inhibition. RDF and orthogonal mismatches were excluded in this baseline formulation.
Simulation results (Fig. M4-C1-2) showed a progressive depletion of attB/attP and a concomitant accumulation of attL/attR, consistent with reported ranges of Bxb1 binding affinities (Kd ≈ 10⁻⁸ M) and catalytic efficiencies [3][4]. This agreement confirmed that the baseline parameterization is biophysically reasonable.
Time-resolved trajectories of substrates (attB, attP) and products (attL/attR) generated under baseline parameterization. The model reproduces the expected depletion of substrates and accumulation of products.
Cycle 1 established that a minimal one-to-one model adequately captures the essential recombination dynamics of Bxb1. This provides a robust foundation for subsequent incorporation of site-specific efficiencies, decoy competition, and dual-end synchronization mechanisms.
Following the establishment of the one-to-one baseline model, we next incorporated the dynamic supply of Bxb1 integrase. Since recombination efficiency is determined not only by the peak enzyme concentration but also by the duration for which Bxb1 levels remain above a functional threshold (P*), we introduced time-over-threshold (ToT) as a key design metric [5][6].
Bxb1 trajectories were obtained from transcriptional and translational dynamics (M1/M2) and implemented under three biologically relevant scenarios:
Simulation results (Fig. M4-C2-1) demonstrated that these supply profiles yield distinct recombination outcomes. ToT values were directly proportional to recombination efficiency: Cycle1 = 240 min, Cycle2 = 180 min, Cycle3 = 120 min. Importantly, ToT provided a stronger predictive relationship than peak concentration alone, confirming its utility as a design rule [7]. Moreover, short inductions (<1 h) yielded negligible benefits, while longer inductions markedly extended ToT and enhanced recombination probability.
Simulated Bxb1 concentration trajectories under three supply scenarios: (Cycle1) ideal rise to ~53 μM, (Cycle2) plateau at ~11.16 μM, and (Cycle3) limited to ~8.53 μM. The red dashed line denotes the functional threshold (P* = 5.0 μM). Shaded regions above P* define the effective time-over-threshold (ToT) for each scenario, with ToT values: Cycle1 = 240 min, Cycle2 = 180 min, Cycle3 = 120 min.
Cycle 2 established that Bxb1 supply dynamics critically constrain recombination. By formalizing ToT as a quantitative design principle, we defined an operational guideline: induction strategies must ensure Bxb1 concentrations remain above the functional threshold for a sufficient duration to achieve robust recombination. This kinetic framework provided essential context for subsequent two-to-two modeling in Cycle 3.
After establishing the one-to-one baseline and defining Bxb1 supply dynamics, we extended the framework to two-to-two recombination, where two pairs of attB/attP sites must recombine simultaneously to yield a successful event. Because our integration sites were designed orthogonally, cross-pairing between non-cognate attB/attP sites was excluded [1]. Under this assumption, the initial success probability was modeled as the product of two independent one-to-one reactions: $$ p_{2\to2}^{\text{ind}} = p_1 \, p_2 $$
However, experimental trends suggested that this independence assumption consistently overestimated recombination efficiencies. To address this, we introduced a decoy correction factor, denoted as ffree, representing the fraction of free Bxb1 available for productive recombination after accounting for sequestration by additional att sites [8].
The conditional success rate was formulated in an ODE framework. The effective forward rate constants were scaled by $ f_{\text{free}} $, producing a reduced plateau value relative to the independence assumption. Model parameters included $ k_{\text{on}}, k_{\text{off}}, k_{cat}, f_{\text{free}} $, with enzyme input profiles taken from Cycle 2.
At the population level, the independence model predicted a conditional success plateau of ~85%. Incorporating the decoy correction $ f_{\text{free}} = 0.65 $ reduced the plateau to ~58%, in line with the expected trend that non-productive binding events limit recombination efficiency (Fig. M4-C3-1). Importantly, these results describe recombination conditional on attB being present; overall experimental outcomes (e.g., 0.0195% for M416 vs. 87.5% for M423) additionally depend on the probability of attB formation, which is addressed in Cycle 6.
ODE simulations of conditional success probabilities for two-to-two recombination, assuming attB formation. Blue line: independent assumption $ p_{2\to2}^{\text{ind}} $; Orange line: decoy-corrected model $ f_{\text{free}} = 0.65 $. The decoy-corrected model yields a lower plateau (~58% vs. ~85%), capturing the effect of integrase sequestration. Note: These curves represent conditional success given attB presence; overall success rates (e.g., M416 and M423) are analyzed in Cycle 6.
Cycle 3 demonstrated that two-to-two recombination cannot be treated as the simple product of two independent one-to-one reactions. While orthogonality prevents cross-pairing, enzyme sequestration (decoy effect) significantly reduces the conditional efficiency. This established the foundation for subsequent cycles: Cycle 4 incorporates dynamic enzyme supply, while Cycle 5 introduces synchronization and competing failure pathways.
In Cycle 3, we established the conditional two-to-two model and demonstrated that the independence assumption overestimates recombination efficiency, while decoy correction provides a more realistic description. However, that framework still assumed static Bxb1 levels. In reality, Bxb1 expression changes dynamically under induction conditions (Cycle 2, described via ToT). Therefore, in Cycle 4 we incorporated dynamic Bxb1 supply trajectories into the two-to-two framework and introduced a synchronization parameter to account for temporal coordination between the two recombination ends. This extension was necessary because decoy correction alone could not fully explain the discrepancies in slope and plateau observed experimentally [10].
Simulation results are shown in Fig. M4-C4-1:
Simulated recombination trajectories under dynamic Bxb1 supply (Cycle 2 plateau profile) with progressive model refinement. Blue curve: independent assumption model, showing rapid rise and overestimated plateau (~85%). Orange curve: decoy-corrected model (ffree = 0.65), which reduces plateau but retains steep slope. Green curve: decoy + synchronization model (khalf = 0.15), incorporating temporal coupling between the two recombination sites, which simultaneously reduces both slope and plateau to match experimental kinetics. Black circles with error bars: experimental data from M423-like conditions (n = 3, mean ± SEM). Only the synchronization-corrected model recapitulates both the trajectory shape and final plateau, demonstrating that time-over-threshold (ToT) and inter-site coordination are both necessary for accurate prediction. Note: These data represent conditional success rates (precomb|attB); overall outcomes incorporating pattB gating are presented in Cycle 6.
Cycle 4 demonstrated three key points:
Thus, two-to-two recombination efficiency depends not only on attB availability but also on both Bxb1 temporal dynamics and end-to-end synchronization. This finding motivated Cycle 5, where failure pathways are explicitly modeled and single-cell variability becomes essential.
While Cycle 4 demonstrated that dynamic enzyme supply and decoy correction improved model performance, experimental data still revealed discrepancies in trajectory shape and final efficiency. In particular, colony-level outcomes exhibited a bimodal pattern: some cells achieved complete two-to-two recombination, while others failed entirely. Such behavior is reminiscent of well-known stochastic phenomena in gene regulation, where single-cell trajectories diverge due to intrinsic noise [12][13]. We therefore hypothesized that synchronization between the two recombination ends is essential, and that failure pathways must be explicitly incorporated into the model. Specifically, if one end recombines earlier than the other, the system may irreversibly enter a failure state with probability khalf.
We extended the Cycle 4 framework by incorporating:
ODE and stochastic simulations under Cycle-2 plateau enzyme supply. Blue: independent assumption; Orange: decoy-corrected $ f_{free} = 0.65 $; Green: decoy + synchronization $ k_{half} = 0.15 $. Black circles: M423-like experimental data (mean ± SEM). Only the synchronization-corrected model reproduces both slope and plateau.
Population success fraction simulated with decoy + synchronization. Solid line: mean success; shaded area: 95% binomial CI across replicate populations. This captures aggregate uncertainty and validates reproducibility across experimental replicates.
Stochastic simulations of 200 single cells under Cycle-2 plateau supply. Green: cells achieving complete recombination (~60%); Red: cells stalled in incomplete states (~40%). The bimodal distribution highlights the impact of synchronization failure.
Cycle 5 established that:
Although Cycle 5 established conditional recombination efficiency under enzyme dynamics and synchronization effects, experimental data revealed a dramatic discrepancy between two test cases: M416 exhibited only 0.0195% overall success, whereas M423 reached 87.5%. Such a gap cannot be explained by decoy effects or Bxb1 kinetics alone. We therefore hypothesized that the true bottleneck lies in the probability of attB formation, which gates the downstream recombination process. Conceptually, the overall probability of successful recombination can be factorized as:
$$ p_{\text{overall}} = p_{\text{attB}} \times p_{\text{recomb}\mid \text{attB}} $$
where pattB represents the likelihood of oligo integration into the genome to form attB, and precomb|attB denotes the conditional efficiency of Bxb1 recombination once attB exists [1][4].
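As a worked illustration of this factorization, the two reported efficiencies can be combined to back out an implied attB-formation probability. The sketch assumes, purely for illustration, that the conditional efficiency measured for M423 also applies to M416; the function names are hypothetical.

```python
def overall_success(p_attb, p_recomb_given_attb):
    """p_overall = p_attB * p_recomb|attB."""
    return p_attb * p_recomb_given_attb


def implied_p_attb(p_overall, p_recomb_given_attb):
    """Invert the factorization to estimate p_attB."""
    return p_overall / p_recomb_given_attb


P_OVERALL_M416 = 0.0195 / 100   # reported overall success for M416
P_RECOMB_M423 = 0.875           # reported conditional efficiency for M423

if __name__ == "__main__":
    p_attb = implied_p_attb(P_OVERALL_M416, P_RECOMB_M423)
    print(f"implied p_attB = {100 * p_attb:.4f}%")
    print(f"check: p_overall = {100 * overall_success(p_attb, P_RECOMB_M423):.4f}%")
```

Under that assumption the implied attB-formation probability is on the order of 0.02%, so the upstream step alone accounts for almost the entire gap between the two cases.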
We extended the Cycle 5 framework into a two-layer hierarchical model:
Parameters were calibrated to reflect two experimental regimes:
Cycle 6 revealed that:
Spaghetti plot with across-batch confidence bands for conditional recombination precomb|attB. Plateau efficiency reached ~87.5%, consistent with experimental M423 data, demonstrating robust downstream recombination once attB is pre-formed.
Spaghetti plot with across-batch confidence bands for overall recombination efficiency. Plateau collapsed to ~0.0195%, despite similar downstream parameters, highlighting attB gating as the critical upstream bottleneck.
Cycle 6 demonstrated that attB formation is the dominant gating bottleneck: without attB, overall success collapses to near zero (M416), while with pre-formed attB, conditional recombination efficiency can reach ~87.5% (M423). However, even in the presence of attB, experimental outcomes revealed a significant dependence on insert length (L). We therefore designed Cycle 7 to quantify the length effect and to determine whether resistance marker configuration (one vs two markers) plays any significant role.
We extended the two-layer framework into a length-dependent conditional model:
$$ p_{\text{overall}} = p_{\text{attB}} \times p_{\text{recomb}\mid \text{attB},L} $$
where $p_{\text{attB}}$ is the attB gating probability (Cycle 6), and $p_{\text{recomb}\mid \text{attB},L}$ is the conditional efficiency given attB presence and insert length $L$.
To incorporate length effects, we assumed that longer inserts:
Formally,
$$ k_{f2}(L) = k_{f2}^{(0)} e^{-\alpha(L-L_0)}, \quad k_{\text{half}}(L) = k_{\text{half}}^{(0)} e^{+\beta(L-L_0)} $$
which yields a logit-linear approximation for success:
$$ \log \frac{p}{1-p} \approx \text{const.} - (\alpha+\beta)(L-L_0) $$
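One way to see where the logit-linear form comes from is to assume that the success odds scale with the ratio $k_{f2}(L)/k_{\text{half}}(L)$, with a length-independent prefactor $C$. This proportionality and the symbol $C$ are illustrative assumptions and are not part of the stated model. Substituting the two exponentials gives

$$ \frac{p}{1-p} = C\,\frac{k_{f2}(L)}{k_{\text{half}}(L)} = C\,\frac{k_{f2}^{(0)}}{k_{\text{half}}^{(0)}}\, e^{-(\alpha+\beta)(L-L_0)} $$

$$ \log \frac{p}{1-p} = \log\!\left( C\,\frac{k_{f2}^{(0)}}{k_{\text{half}}^{(0)}} \right) - (\alpha+\beta)(L-L_0) $$

so the first term plays the role of the constant, and the two length penalties add because one rate sits in the numerator and the other in the denominator.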
Fitting to experimental data produced a length sensitivity slope of $\text{SLOPE} = \alpha + \beta = 1.70 \times 10^{-3}/\text{bp}$, with $\alpha=\beta=8.5\times 10^{-4}/\text{bp}$.
This implies that for every additional base pair, success odds decrease by ~0.17%.
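The per-base-pair figure follows directly from exponentiating the slope. The sketch below is a minimal illustration of that arithmetic using the reported slope of 1.70×10⁻³/bp; the function name and the choice of example increments (1, 100, 500 bp) are hypothetical.

```python
import math

slope_per_bp = 1.70e-3  # alpha + beta from the reported fit, per base pair


def odds_multiplier(extra_bp: float, slope: float = slope_per_bp) -> float:
    """Factor applied to the success odds after adding extra_bp base pairs."""
    return math.exp(-slope * extra_bp)


for extra_bp in (1, 100, 500):
    drop_percent = (1.0 - odds_multiplier(extra_bp)) * 100
    print(f"+{extra_bp} bp: success odds reduced by {drop_percent:.2f} %")
```

Note that the reduction applies to the odds $p/(1-p)$, not to the efficiency itself, so the effect on the percentage depends on the starting efficiency. For one extra base pair the odds fall by about 0.17%; for 100 extra base pairs they fall by roughly 16%.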
Three constructs were analyzed under M423-like conditions (attB pre-formed):
Results showed that Test2 and Test3 efficiencies were nearly identical (79% vs 80%) despite different resistance marker configurations, while both were lower than Test1. This indicates that insert length, not resistance configuration, is the primary determinant of conditional recombination efficiency.
Logit regression accurately reproduced the plateau differences, capturing the monotonic decline in efficiency with increasing length (Fig. M4-C7-1).
Black crosses: experimental data (1039 bp: 87.5%; 1605 bp: 79%; 1613 bp: 80%). Blue line: logit regression fit (slope ≈ $-1.03 \times 10^{-3}/\text{bp}$), showing the decline in recombination efficiency with increasing insert length.
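The regression slope in the caption can be reproduced from the three quoted data points. The sketch below assumes an unweighted ordinary least-squares fit on logit-transformed efficiencies; the fitting procedure actually used for the figure is not described, so this is only one plausible reconstruction, and all function names are hypothetical.

```python
import math

# Data points quoted in the figure caption: (insert length in bp, efficiency)
data = [(1039, 0.875), (1605, 0.79), (1613, 0.80)]


def logit(p: float) -> float:
    """Log-odds of a probability in (0, 1)."""
    return math.log(p / (1.0 - p))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


lengths = [length for length, _ in data]
logits = [logit(p) for _, p in data]
slope, intercept = fit_line(lengths, logits)


def predicted_efficiency(length_bp: float) -> float:
    """Map the fitted log-odds back to an efficiency."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * length_bp)))


print(f"slope = {slope:.3e} per bp")
for length, observed in data:
    print(f"{length} bp: observed {observed:.3f}, fitted {predicted_efficiency(length):.3f}")
```

With only three points, two of which are 8 bp apart, the fit is effectively determined by the contrast between the 1039 bp construct and the two longer ones, so the slope should be read as a rough estimate.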
[1] Ghosh, P., Kim, A. I., & Hatfull, G. F. (2003). The orientation of mycobacteriophage Bxb1 integration is solely dependent on the central dinucleotide of attP and attB. Molecular Cell, 12(4), 1101–1111. https://doi.org/10.1016/S1097-2765(03)00427-0
[2] Rutherford, K., & Van Duyne, G. D. (2014). The ins and outs of serine integrase site-specific recombination. Current Opinion in Structural Biology, 24, 125–131. https://doi.org/10.1016/j.sbi.2013.12.004
[3] Smith, M. C. M., & Thorpe, H. M. (2002). Diversity in the serine recombinases. Molecular Microbiology, 44(2), 299–307. https://doi.org/10.1046/j.1365-2958.2002.02891.x
[4] Mandali, S., Gupta, K., & Johnson, R. C. (2020). Mechanistic insights into the Bxb1 integrase recombination reaction. Nucleic Acids Research, 48(7), 3669–3682. https://doi.org/10.1093/nar/gkaa095
[5] Elowitz, M. B., & Leibler, S. (2000). A synthetic oscillatory network of transcriptional regulators. Nature, 403(6767), 335–338. https://doi.org/10.1038/35002125
[6] Rosenfeld, N., Young, J. W., Alon, U., Swain, P. S., & Elowitz, M. B. (2005). Gene regulation at the single-cell level. Science, 307(5717), 1962–1965. https://doi.org/10.1126/science.1106914
[7] Potvin-Trottier, L., Lord, N. D., Vinnicombe, G., & Paulsson, J. (2016). Synchronous long-term oscillations in a synthetic gene circuit. Nature, 538(7626), 514–517. https://doi.org/10.1038/nature19841
[8] Sneppen, K., & Dodd, I. B. (2012). A simple histone code opens many paths to epigenetics. PLoS Computational Biology, 8(11), e1002643. https://doi.org/10.1371/journal.pcbi.1002643
[9] Weber, M., & Buceta, J. (2013). Noise in cell–cell communication: lessons from viruses. Cellular and Molecular Life Sciences, 70, 3475–3496. https://doi.org/10.1007/s00018-012-1226-2
[10] Dhar, G., Sanders, E. R., & Johnson, R. C. (2004). Architecture of the Bxb1 integrase–DNA synaptic complex: site-specific recombination in action. Genes & Development, 18(23), 2798–2810. https://doi.org/10.1101/gad.1241004
[11] Johnson, R. C. (2015). Site-specific DNA recombination: a historical perspective and lessons from phage integrases. FEMS Microbiology Reviews, 39(3), 316–336. https://doi.org/10.1093/femsre/fuv002
[12] Elowitz, M. B., Levine, A. J., Siggia, E. D., & Swain, P. S. (2002). Stochastic gene expression in a single cell. Science, 297(5584), 1183–1186. https://doi.org/10.1126/science.1070919
[13] Rosenfeld, N., Young, J. W., Alon, U., Swain, P. S., & Elowitz, M. B. (2005). Gene regulation at the single-cell level. Science, 307(5717), 1962–1965. https://doi.org/10.1126/science.1106914
[14] Shahrezaei, V., & Swain, P. S. (2008). Analytical distributions for stochastic gene expression. PNAS, 105(45), 17256–17261. https://doi.org/10.1073/pnas.0803850105
[15] Schleif, R. (1992). DNA looping. Annual Review of Biochemistry, 61(1), 199–223. https://doi.org/10.1146/annurev.bi.61.070192.001215
[16] Becker, N. A., & Peters, J. P. (1984). Dependence of site-specific recombination efficiency on DNA distance. PNAS, 81(22), 6948–6952. https://doi.org/10.1073/pnas.81.22.6948
[17] Rutherford, K., & Van Duyne, G. D. (2014). The ins and outs of serine integrase site-specific recombination. Current Opinion in Structural Biology, 24, 125–131. https://doi.org/10.1016/j.sbi.2013.12.004