Introduction
The protein Balcp19K (abbreviated to CP19K) is found in barnacle cement and plays a crucial role in the adhesive, allowing barnacles to attach strongly to surfaces in aquatic environments. Since this project aims to find a protease that can degrade CP19K, we need to be able to synthesise CP19K in the lab.
This model has one goal: to simulate the production of CP19K under different promoters and operators in order to maximise its yield. The CP19K synthesised is then used to test different proteases and identify which one degrades CP19K most efficiently.
Methodology
To find a sequence that optimises CP19K production, we compare different promoters, which control how strongly transcription is initiated, and different operators, which regulate whether transcription proceeds depending on repressor binding. We first pick two promoters and two operators to model, chosen because they are widely used:

| Promoter | Description | Sequence |
|---|---|---|
| T7 | A promoter that serves as the binding site for T7 RNA polymerase, a highly specific, DNA-dependent enzyme from the T7 bacteriophage that synthesises RNA from a DNA template in the 5' to 3' direction. | taatacgactcactatagg (19 bp) |
| J23119 | The strongest promoter of the Anderson promoter family, frequently used as a standard for gene expression in E. coli. | ttgacagctagctcagtcctaggtataatgctagc (35 bp) |

| Operator | Description | Sequence |
|---|---|---|
| LacO | An operator bound by the repressor LacI, which blocks gene expression until an inducer (IPTG) inactivates LacI. | ggaattgtgagcggataacaattcc (25 bp) |
| TetO (tetracycline operator) | An operator bound by the repressor TetR, which blocks gene expression until an inducer (aTc) inactivates TetR. | tccctatcagtgatagaga (19 bp) |

We consider four different sequences to model, one for each combination of these two promoters and two operators:
- T7_LacO
- T7_TetO
- J23119_LacO
- J23119_TetO
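The four promoter-operator pairings can be enumerated, and the quoted element lengths checked, with a short Python sketch. The sequences are copied from the tables above; the function names and the `promoter_operator` naming pattern are illustrative choices, and the sketch deliberately does not assemble a full construct, since the layout of the complete sequence is not specified here.

```python
from itertools import product

# Element sequences copied from the promoter and operator tables.
PROMOTERS = {
    "T7": "taatacgactcactatagg",
    "J23119": "ttgacagctagctcagtcctaggtataatgctagc",
}
OPERATORS = {
    "LacO": "ggaattgtgagcggataacaattcc",
    "TetO": "tccctatcagtgatagaga",
}


def element_lengths():
    """Return the length in bp of every promoter and operator."""
    return {name: len(seq) for name, seq in {**PROMOTERS, **OPERATORS}.items()}


def construct_names():
    """Pair every promoter with every operator (2 x 2 = 4 constructs)."""
    return [f"{p}_{o}" for p, o in product(PROMOTERS, OPERATORS)]


lengths = element_lengths()
constructs = construct_names()
print(lengths)
print(constructs)
```

The lengths should agree with the bp counts given in the tables, and the construct names with the row labels used in the results table.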
 
       
We represent the system with the following state variables:
- DNA: free DNA template
- DNA_LacI: DNA bound by LacI repressor
- DNA_TetR: DNA bound by TetR repressor
- mRNA: transcribed messenger RNA
- CP19K: barnacle cement protein
- LacI: free repressor protein for the Lac operon
- TetR: free repressor protein for the Tet operon
- IPTG: inducer molecule for LacI
- aTc: inducer molecule for TetR
- LacI_IPTG: LacI-IPTG complex (inactive repressor)
- TetR_aTc: TetR-aTc complex (inactive repressor)
- ATP: available cellular energy
- Transcription rate*
- Translation rate*

*The transcription and translation rates are defined in the parameters section below.
With these state variables, the following reactions can be modelled:
 
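- Repressor binding/unbinding to DNA for the Lac operon: DNA + LacI ⇌ DNA_LacI
- Inducer binding/unbinding to repressor for the Lac operon: LacI + IPTG ⇌ LacI_IPTG
- Repressor binding/unbinding to DNA for the Tet operon: DNA + TetR ⇌ DNA_TetR
- Inducer binding/unbinding to repressor for the Tet operon: TetR + aTc ⇌ TetR_aTc
- Transcription (requiring ATP): DNA → DNA + mRNA
- Translation (requiring ATP): mRNA → mRNA + CP19K
- mRNA and CP19K degradation: mRNA → ∅, CP19K → ∅
- ATP production and usage: ∅ → ATP, with ATP consumed by transcription and translation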
A set of parameters is also needed in order to characterise these reactions:
 
ª The transcription rate for J23119 is an estimate: no measured value was found, so it was set on the basis that J23119 is the strongest promoter of the Anderson series but weaker than the T7 promoter.
Now, we can fully write out the ordinary differential equations(ODEs) for these reactions:
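As a rough guide to the structure of these equations, the mass-action form implied by the reactions can be sketched in Python for a Lac construct. This is an illustrative sketch, not the MATLAB model itself: every rate-constant name and numerical value in `PARAMS` is a placeholder, and the saturating ATP factor `ATP / (K_atp + ATP)` is an assumed way of scaling synthesis by available energy. The Tet constructs would follow by swapping LacI for TetR and IPTG for aTc.

```python
# Hypothetical placeholder parameters (per molecule, per minute).
# These are NOT the values used in the model; they only make the sketch runnable.
PARAMS = {
    "k_on": 1.0, "k_off": 0.1,     # LacI binding to / unbinding from DNA
    "k_f": 0.01, "k_r": 0.1,       # IPTG binding to / unbinding from LacI
    "k_tx": 5.0, "k_tl": 10.0,     # transcription and translation rates
    "d_m": 0.1, "d_p": 0.001,      # mRNA and CP19K degradation rates
    "k_atp": 1.0e6,                # constant ATP production rate
    "c_tx": 500.0, "c_tl": 800.0,  # ATP consumed per mRNA / per protein
    "K_atp": 1.0e5,                # ATP level giving half-maximal synthesis (assumed form)
}


def rhs(state, p=PARAMS):
    """Time derivatives of all species for a Lac construct (mass-action sketch)."""
    atp_factor = state["ATP"] / (p["K_atp"] + state["ATP"])
    # Net forward flux of each reversible binding reaction.
    bind_dna = p["k_on"] * state["DNA"] * state["LacI"] - p["k_off"] * state["DNA_LacI"]
    bind_iptg = p["k_f"] * state["LacI"] * state["IPTG"] - p["k_r"] * state["LacI_IPTG"]
    # Only free DNA is transcribed; both synthesis steps are scaled by ATP.
    tx = p["k_tx"] * state["DNA"] * atp_factor
    tl = p["k_tl"] * state["mRNA"] * atp_factor
    return {
        "DNA": -bind_dna,
        "DNA_LacI": bind_dna,
        "LacI": -bind_dna - bind_iptg,
        "IPTG": -bind_iptg,
        "LacI_IPTG": bind_iptg,
        "mRNA": tx - p["d_m"] * state["mRNA"],
        "CP19K": tl - p["d_p"] * state["CP19K"],
        "ATP": p["k_atp"] - p["c_tx"] * tx - p["c_tl"] * tl,
    }


# Initial conditions as reported for the simulation.
Y0 = {"DNA": 1.0, "DNA_LacI": 0.0, "LacI": 100.0, "IPTG": 10000.0,
      "LacI_IPTG": 0.0, "mRNA": 0.0, "CP19K": 0.0, "ATP": 2.0e6}

dydt = rhs(Y0)
print(dydt)
```

A useful sanity check on any such right-hand side is that the derivatives of DNA + DNA_LacI, of LacI + DNA_LacI + LacI_IPTG, and of IPTG + LacI_IPTG each sum to zero, since those totals are conserved by the reactions.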
 
Implementation and Assumptions
The system of ODEs was simulated in MATLAB. Simulations were run with ode15s, a variable-step solver for stiff systems, for a duration of 12,000 minutes. Initial conditions were set to 1 DNA copy, 0 mRNA, 0 DNA bound by LacI/TetR, 0 CP19K proteins, 100 repressor molecules (LacI/TetR), 10,000 inducer molecules (IPTG/aTc), 0 repressor-inducer complexes (LacI_IPTG/TetR_aTc), and 2e6 ATP molecules.
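ode15s is an implicit solver, which suits models where fast binding reactions coexist with slow protein turnover. As a rough illustration of how an implicit step behaves, the Python sketch below applies fixed-step backward Euler to a reduced version of the model: one free DNA copy, no repression, and no ATP limitation, leaving only mRNA and CP19K. This is not the MATLAB implementation; the reduction and all parameter values (`k_tx`, `k_tl`, `d_m`, `d_p`) are assumptions chosen only for illustration.

```python
def backward_euler(k_tx, k_tl, d_m, d_p, t_end, h):
    """Integrate dm/dt = k_tx - d_m*m and dP/dt = k_tl*m - d_p*P with implicit Euler.

    Because both equations are linear, each implicit step can be solved in closed form.
    """
    m, p = 0.0, 0.0
    for _ in range(int(round(t_end / h))):
        m = (m + h * k_tx) / (1.0 + h * d_m)      # implicit update for mRNA
        p = (p + h * k_tl * m) / (1.0 + h * d_p)  # implicit update for CP19K, using the new mRNA value
    return m, p


# Placeholder rates (per minute), not the parameters of the model.
K_TX, K_TL, D_M, D_P = 5.0, 10.0, 0.1, 0.001

m_end, p_end = backward_euler(K_TX, K_TL, D_M, D_P, t_end=12000.0, h=1.0)

# Analytical steady state of the reduced system, for comparison.
m_ss = K_TX / D_M
p_ss = K_TL * m_ss / D_P
print(m_end, m_ss)
print(p_end, p_ss)
```

With these placeholder rates, 12,000 minutes is about twelve protein lifetimes (1/`d_p` = 1,000 min), so the integrated values should sit very close to the analytical steady state.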
The following assumptions have been made in the model:
- The cell is treated as a homogeneous, well-mixed compartment with no spatial variation. There is no cell division, and the cell does not change in any way other than those represented in the ODEs above. Protein production is not limited by the volume of the cell.
- All molecules are assumed to diffuse instantly and interact uniformly.
- ATP is produced at a constant rate that does not depend on nutrient level, stress, etc. No processes other than transcription and translation use ATP in this model.
- The repressor (LacI, and analogously TetR) is not synthesised or degraded in this model, only redistributed among free LacI, DNA_LacI, and LacI_IPTG.
Results
The results after 12000 min are shown in the table below:
| Sequence | mRNA Synthesised | CP19K Synthesised | ATP Count | 
|---|---|---|---|
| T7_LacO | 78.55 | 710317.59 | 4604376969655.85 | 
| T7_TetO | 76.34 | 688721.85 | 4604511736093.33 | 
| J23119_LacO | 35.93 | 324939.42 | 4606343706155.49 | 
| J23119_TetO | 34.92 | 315060.31 | 4606405355944.23 | 
*Absolute units are arbitrary; the model is intended to compare relative promoter/operator strengths rather than to predict exact molecule counts. We therefore compare the constructs by fold change and percentage difference rather than by absolute values.
From these results, we see that the T7 promoter is roughly two-fold stronger than the J23119 promoter (about 2.2-fold more CP19K with either operator). This reflects the higher transcription rate constant assigned to the T7 promoter.
Between the Lac and Tet operators, the Lac operator gives a marginally higher output, by approximately 3% for both mRNA and CP19K. This suggests that under full induction, operator choice has only a small effect on protein synthesis, because most of the repressor is inactivated by the inducer. The slightly higher CP19K count in the LacO sequences nevertheless indicates that, with the parameters used, LacI is inactivated more completely than TetR.
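These comparisons can be reproduced directly from the results table. The Python sketch below uses the CP19K counts as reported; the helper names `fold_change` and `percent_increase` are illustrative.

```python
# CP19K counts at 12,000 min, copied from the results table.
CP19K = {
    "T7_LacO": 710317.59,
    "T7_TetO": 688721.85,
    "J23119_LacO": 324939.42,
    "J23119_TetO": 315060.31,
}


def fold_change(a, b):
    """Ratio of CP19K produced by construct a to that produced by construct b."""
    return CP19K[a] / CP19K[b]


def percent_increase(a, b):
    """Percentage by which construct a exceeds construct b."""
    return 100.0 * (CP19K[a] - CP19K[b]) / CP19K[b]


# T7 versus J23119, holding the operator fixed.
promoter_fold = {op: fold_change(f"T7_{op}", f"J23119_{op}") for op in ("LacO", "TetO")}
# LacO versus TetO, holding the promoter fixed.
operator_pct = {pr: percent_increase(f"{pr}_LacO", f"{pr}_TetO") for pr in ("T7", "J23119")}

for op, fold in promoter_fold.items():
    print(f"T7 vs J23119 with {op}: {fold:.2f}-fold")
for pr, pct in operator_pct.items():
    print(f"LacO vs TetO with {pr}: +{pct:.1f}%")
```

Holding one element fixed while varying the other separates the promoter effect from the operator effect; both comparisons give nearly identical values across the two backgrounds.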
ATP levels end several orders of magnitude above normal cellular values; this results from our third assumption, under which no process other than transcription and translation consumes ATP. As a result, ATP never became a limiting factor, and transcription and translation proceeded at their unscaled rates for the majority of the simulation. Because CP19K is synthesised under laboratory conditions, where near-optimal conditions are provided, we do not expect this simplification to affect the comparison between constructs.
With this in mind, we chose the strongest sequence for CP19K production, which uses the T7 promoter and the Lac operator (T7_LacO). The full sequence is shown below.
 
References
- Elf, J., Li, G.-W., & Xie, X. S. (2007). Probing Transcription Factor Dynamics at the Single-Molecule Level in a Living Cell. Science, 316(5828), 1191–1194. https://doi.org/10.1126/science.1141967
- Xu, J., & Matthews, K. (2009, June 9). Flexibility in the Inducer Binding Region is Crucial for Allostery in the Escherichia coli Lactose Repressor. Nih.gov. https://pmc.ncbi.nlm.nih.gov/articles/PMC2772868/pdf/nihms112361.pdf
- Normanno, D., Boudarene, L., Dugast-Darzacq, C., Chen, J., Richter, C., Proux, F., Bénichou, O., Voituriez, R., Darzacq, X., & Dahan, M. (2015). Probing the target search of DNA-binding proteins in mammalian cells using TetR as model searcher. Nature Communications, 6(1). https://doi.org/10.1038/ncomms8357
- Nguyen, T., Chen, M., Baer, R., Galagan, J., & Dennis, A. (2020). A Förster Resonance Energy Transfer-Based Ratiometric Sensor with the Allosteric Transcription Factor TetR. Small. https://doi.org/10.1002/smll.201907522
- Arnold, S., Siemann, M., Scharnweber, K., Werner, M., Baumann, S., & Reuss, M. (2001). Kinetic modeling and simulation of in vitro transcription by phage T7 RNA polymerase. Biotechnology and Bioengineering, 72(5), 548–561. https://doi.org/10.1002/1097-0290(20010305)72:5%3C548::aid-bit1019%3E3.3.co;2-u
- Part:BBa J23119 - parts.igem.org. (n.d.). Igem.org. Retrieved September 16, 2025, from https://parts.igem.org/Part%3ABBa_J23119
- Zhu, M., & Dai, X. (2019). Maintenance of translational elongation rate underlies the survival of Escherichia coli during oxidative stress. Nucleic Acids Research, 47(14), 7592–7604. https://doi.org/10.1093/nar/gkz467
- Global half-lives of mRNA in laboratory-grown - Bacteria Escherichia coli - BNID 111927. (2013). Harvard.edu. https://bionumbers.hms.harvard.edu/bionumber.aspx?id=111927&s=n&v=1
- Typical protein half-life - bacteria - BNID 111930. (2013). Harvard.edu. https://bionumbers.hms.harvard.edu/bionumber.aspx?s=n&v=2&id=111930
- Deng, Y., Beahm, D. R., Ionov, S., & Sarpeshkar, R. (2021). Measuring and modeling energy and power consumption in living microbial cells with a synthetic ATP reporter. BMC Biology, 19(1). https://doi.org/10.1186/s12915-021-01023-2
- Cost of amino acid polymerization - Bacteria Escherichia coli - BNID 114971. (2013). Harvard.edu. https://bionumbers.hms.harvard.edu/bionumber.aspx?s=n&v=1&id=114971
- Hu, X.-P., Dourado, H., Schubert, P., & Lercher, M. J. (2020). The protein translation machinery is expressed for maximal efficiency in Escherichia coli. Nature Communications, 11(1). https://doi.org/10.1038/s41467-020-18948-x