Software - Plot2Curve Simulations

The script uses os to manage directories and file paths: it creates an output folder for simulation results, builds paths for CSVs and figures, and reads environment variables to decide whether plots are displayed. It prints relative file paths when saving outputs, which keeps the logs readable. It uses datetime to generate a timestamp, so that output folders and files get unique names tied to the time of execution.
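The folder handling described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper names make_output_dir and output_path are hypothetical, and the sim_results prefix follows the folder pattern used elsewhere on this page.

```python
import os
from datetime import datetime


def make_output_dir(base_dir=".", prefix="sim_results", now=None):
    """Create a timestamped results folder and return its path."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # e.g. 20240102_030405
    out_dir = os.path.join(base_dir, f"{prefix}_{stamp}")
    os.makedirs(out_dir, exist_ok=True)
    return out_dir


def output_path(out_dir, filename):
    """Build a path inside the results folder and log it relative to the cwd."""
    path = os.path.join(out_dir, filename)
    print(f"Saving {os.path.relpath(path)}")
    return path
```

Passing now explicitly is only there to make the naming reproducible in a test; a normal run would rely on the current time.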

For plotting and visualization, matplotlib and its submodules play a central role. The backend is managed to allow headless operation when GUI plots are disabled. The pyplot library is used to create all figures, including stacked panels, overlays, and gallery plots, with dual y-axes for mRNA and protein data. The library also handles figure layout, axes, titles, grids, and optional log scales. PdfPages is used to collect all figures into a single multi-page PDF. Custom legends are implemented using Line2D, and helper functions ensure that color-coded legends and line styles are displayed cleanly in separate panels.
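A rough sketch of this plotting pattern is shown below, assuming a non-interactive run. The function names plot_dual_axis and save_figures_pdf, the colors, and the axis labels are illustrative choices rather than the script's own; the sketch simply forces the Agg backend instead of reading an environment variable.

```python
import matplotlib

matplotlib.use("Agg")  # non-GUI backend: figures are only written to disk

import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from matplotlib.lines import Line2D


def plot_dual_axis(times, mrna, protein, title):
    """Plot mRNA (left y-axis) and protein (right y-axis) against time."""
    fig, ax_m = plt.subplots(figsize=(6, 3.5))
    ax_p = ax_m.twinx()  # second y-axis sharing the same x-axis
    ax_m.plot(times, mrna, color="tab:blue", linestyle="--")
    ax_p.plot(times, protein, color="tab:green")
    ax_m.set_xlabel("time (s)")
    ax_m.set_ylabel("mRNA per cell")
    ax_p.set_ylabel("protein per cell")
    ax_m.set_title(title)
    ax_m.grid(True, alpha=0.3)
    # Custom legend built from proxy Line2D handles so both axes share one box.
    handles = [
        Line2D([0], [0], color="tab:blue", linestyle="--", label="mRNA"),
        Line2D([0], [0], color="tab:green", label="protein"),
    ]
    ax_m.legend(handles=handles, loc="lower right")
    fig.tight_layout()
    return fig


def save_figures_pdf(figures, pdf_path):
    """Write each figure as one page of a single PDF, then close it."""
    with PdfPages(pdf_path) as pdf:
        for fig in figures:
            pdf.savefig(fig)
            plt.close(fig)
    return pdf_path
```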

We utilize numpy for numerical computation and array handling. It is used to generate time arrays, simulate mRNA and protein dynamics, compute decay rates, binding probabilities, and translation kinetics. It also assists in interpolation for first-passage time calculations (t50) and numerical operations in sequence scoring and translation efficiency calculations.
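To illustrate the first-passage calculation, here is a minimal sketch of a first-crossing time with linear interpolation between samples. It is written in plain Python for readability (a numpy version would follow the same logic), it assumes the target is a fraction of the final value of the trace, and the name first_crossing_time is hypothetical.

```python
def first_crossing_time(times, values, fraction=0.5):
    """Return the interpolated time at which `values` first reaches
    `fraction` of its final value, or None if it never does."""
    target = fraction * values[-1]
    for i, value in enumerate(values):
        if value >= target:
            if i == 0:
                return times[0]  # already at or above the target at the start
            v0, v1 = values[i - 1], value
            t0, t1 = times[i - 1], times[i]
            # v0 < target <= v1 here, so the denominator is strictly positive.
            return t0 + (target - v0) * (t1 - t0) / (v1 - v0)
    return None
```

With fraction=0.5 this gives t50; fraction=0.8 gives a t80-style metric from the same trace.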

The script relies on pandas for data organization and storage. Simulation results are stored in dataframes, grouped by promoter, and used to compute summary metrics. The pandas library handles sorting and filtering of data for plotting, as well as exporting results to CSV files for promoter dynamics and summary statistics.

Overall, each library covers one stage of the workflow: numpy handles the numerical simulation, pandas organizes and stores the resulting data, matplotlib generates figures and PDFs, and os and datetime manage file organization and naming. Together, they form a full pipeline from promoter sequence and translation analysis to simulation, summary, and visual output.

Plot2Curve: AHL Induction Simulation

The goal of the AHL induction simulation is to estimate the intensity of light emitted by sfGFP in E. coli containing the Metlock system. The amount of sfGFP is simulated by tracking the chain of precursors that leads to it. At each timestep, the amount of each species in the chain is calculated and added to the system, where it remains until it is used to produce the next species or decays.

Assumptions

Cells have all resources required for growth: The simulation does not take into account how the cells are being fed, but merely assumes that a plentiful source of food is available and there are no limiters on growth.
All cells identical: The simulation assumes that each cell and its contents are identical; it does not account for an imbalance of resources causing inefficiencies in sfGFP production.
Production of sfGFP is free: The simulation assumes that the production of sfGFP takes place alongside normal cell behavior and does not drain the cell's resources at all.
Availability of Ribosomes: The simulation assumes that a limited number of ribosomes are always available to translate mRNA and that every mRNA molecule will be found by a ribosome so it can be translated.
Linearity of sfGFP intensity: We assume that the intensity of the light produced will vary linearly with the amount of sfGFP in the system.

Chain of Production

Create LuxR mRNA

An amount of LuxR mRNA is added to the system based on the number of bacteria present. The plasmid encoding LuxR is always active and continuously produces mRNA.
Create LuxR

An amount of LuxR is created and added to the system based on the amount of LuxR mRNA and the amount of available ribosomes.
Add AHL dosage

At certain OD600 thresholds, an amount of AHL is added to the system to activate LuxR.
Activate LuxR

AHL in the system activates LuxR, creating activated LuxR. The amount of LuxR activated is calculated from the expected number of collisions between LuxR and AHL and an activation percentage. The expected number of particle collisions is proportional to the amount of AHL, the amount of LuxR, the cross-sectional area of their collision, their relative speed, and the size of the container. Our values for the cross-sectional area, the relative speed of AHL and LuxR, and the chance that a collision results in an activation are all estimates.
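One way to write this collision estimate is sketched below. It assumes the amounts are converted to number densities (count per unit volume), so that the total collision count over a timestep is density × density × cross-section × relative speed × volume × dt. The function name, the parameter names, and the cap at the available molecules are assumptions made for illustration, not the simulation's actual code.

```python
def expected_activations(n_ahl, n_luxr, volume, cross_section,
                         rel_speed, p_activate, dt):
    """Estimate how many LuxR molecules are activated during one timestep.

    n_ahl, n_luxr : molecule counts in the container
    volume        : container volume (same length unit as cross_section)
    cross_section : effective collision cross-section (estimate)
    rel_speed     : relative speed of AHL and LuxR (estimate)
    p_activate    : chance that a collision activates LuxR (estimate)
    """
    dens_ahl = n_ahl / volume
    dens_luxr = n_luxr / volume
    # Expected collisions in the whole container during dt.
    collisions = dens_ahl * dens_luxr * cross_section * rel_speed * volume * dt
    activated = collisions * p_activate
    # Cannot activate more LuxR than exists, or use more AHL than is present.
    return min(activated, n_ahl, n_luxr)
```

The same function shape would apply to the plasmid activation step, with activated LuxR and plasmid counts in place of AHL and LuxR.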
Produce sfGFP mRNA

Activated LuxR in the system binds to sfGFP-producing plasmids, activating them and causing them to produce sfGFP mRNA. As with the activation of LuxR, the number of plasmids that begin producing sfGFP mRNA is based on the expected collisions between the sfGFP-producing plasmids and activated LuxR. Here too, the cross-sectional area of the collision, the relative speed of the plasmid and activated LuxR, and the interaction ratio are estimates.
Produce sfGFP

An amount of sfGFP is produced based on the amount of sfGFP mRNA and the available ribosomes.
Calculate Intensity

The intensity of the produced light is a scalar multiple of the amount of sfGFP in the system.
Maintain the system

Each species (mRNA, LuxR, AHL, and sfGFP) has a half-life, and an appropriate amount decays at each timestep. The bacteria also divide during this step, increasing the cell count.
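The maintenance step can be sketched as below. For a species with half-life t_half, the fraction surviving a timestep dt is 0.5^(dt / t_half). Growth is written as unrestricted exponential doubling, in line with the assumption that nothing limits growth. The function name, dictionary layout, and doubling_time parameter are hypothetical.

```python
def maintain(pools, half_lives, cells, doubling_time, dt):
    """Apply one timestep of decay to every pool and of growth to the cells.

    pools      : dict of species name -> current amount
    half_lives : dict of species name -> half-life (same time unit as dt)
    """
    decayed = {
        name: amount * 0.5 ** (dt / half_lives[name])  # surviving fraction
        for name, amount in pools.items()
    }
    cells_next = cells * 2.0 ** (dt / doubling_time)  # exponential growth
    return decayed, cells_next
```

For example, a pool of 100 sfGFP with a half-life of 10 time units drops to 50 after a step of dt = 10, while 1000 cells with a doubling time of 20 become 2000 after dt = 20.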

Results

The model produced a curve in line with our expectations of how the system should behave. As bacteria grow and AHL is added into the system, the intensity of light grows quickly and then stabilizes at a maximum. At this maximum, the rate of sfGFP decay balances the rate at which the cells produce sfGFP. After a short while, the supply of AHL runs out and the remaining sfGFP decays quickly, quieting the system.

Plot of normalized intensity vs. time in seconds.

Plot2Curve: Anderson Family of Promoters Simulator — Developer Overview

Software

Plot2Curve is a single Python script that covers the full pipeline from simulation to publication-ready figures:

os + datetime lets us create a timestamped output folder (sim_results_YYYYmmdd_HHMMSS/) and print relative paths for readable logs.
Environment variables for quick configuration:
- HEADLESS=1 saves figures without opening windows.
- THROTTLE=0/1 toggles ribosome resource sharing.
- K_R=<number> sets the resource budget scale.
- CDS_SEQ=<DNA or path> supplies a custom coding sequence (validated).
numpy: time grids, ODE updates, interpolation for t50/t80, and vectorized coupled simulations.
pandas: stores long (timecourses) and wide (per-promoter summary) tables and writes tidy CSVs.
matplotlib: stacked panels, overlays, gallery pages, diagnostics; PdfPages bundles all figures into one PDF; a helper renders a right-side legend panel that never collides with axes.
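A possible way to read the environment variables listed above is sketched here. The defaults (flags off, K_R = 1.0) and the names read_config and validate_cds are assumptions made for illustration. The sketch only treats CDS_SEQ as a literal DNA string, not a FASTA file or path, and its validation is limited to alphabet, reading frame, and a terminal stop codon.

```python
import os

STOP_CODONS = {"TAA", "TAG", "TGA"}


def validate_cds(seq):
    """Check that a coding sequence is plain DNA, in frame, and ends in a stop."""
    seq = seq.strip().upper()
    if not seq or set(seq) - set("ACGT"):
        return False  # empty or contains non-DNA characters
    if len(seq) % 3 != 0:
        return False  # not a whole number of codons
    return seq[-3:] in STOP_CODONS


def read_config(env=None):
    """Collect run options from environment variables (defaults are assumed)."""
    env = os.environ if env is None else env
    return {
        "headless": env.get("HEADLESS", "0") == "1",
        "throttle": env.get("THROTTLE", "0") == "1",
        "k_r": float(env.get("K_R", "1.0")),
        "cds_seq": env.get("CDS_SEQ"),  # None means: use the default CDS
    }
```

A run would then be configured from the shell, for example with HEADLESS=1 THROTTLE=1 K_R=50 set before invoking the script.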

Gene Expression Simulation - Overview

Goal: Predict how bright each construct gets and how fast it becomes visible, so we can pick promoter swaps and schedule plate-reader windows before wet lab.

States per promoter: (1) mRNA, (2) immature protein (non-fluorescent), (3) mature protein (fluorescent).

Main outputs: a ranked k_tx ladder, a one-row-per-promoter summary (k_tx, m_eq, protein_eq), and timing metrics (t80 and detection time).

Assumptions

Identical, well-mixed cells (deterministic averages; no single-cell noise).
Promoter strength is a steady scalar (no TF dynamics).
Protein is diluted by growth; no active proteolysis modeled.
Fluorescence ∝ mature GFP molecules.
Optional global ribosome throttle approximates resource sharing.
RBS scoring (SD pairing + spacing) is a quick heuristic for initiation likelihood.

Simulation Steps (Chain of Production)

Select promoters from the Anderson set (J23100–J23118) plus CustomStrong/Weak, or choose a subset interactively.
Choose CDS: the default is sfGFP, or supply CDS_SEQ (string/FASTA/path). The parser checks frame and stop codon.
RBS sanity check: SD match + spacing → a 0–1 initiation factor.
Translation throughput: β = min(k_init_max · P_bind, v_nt/footprint).
Time grid: conservative dt from decay, growth, and maturation rates keeps Euler stable.
mRNA ODE: dm/dt = k_tx − δ_m m, with k_tx = k_tx_baseline × promoter_strength.
Protein + maturation: immature → mature at rate k_mat; growth dilution applies.
Optional coupling: β_eff(t) = β / (1 + M_tot/K_R) shares ribosomes across constructs.
Summaries: steady states; t50/t80 via first-crossing (with numeric fallback); detection time = first absolute threshold crossing (~10% of library max by default).
Write outputs: dynamics, summary, dashboard, k_tx ladder, and all figures (PNGs + combined PDF).
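The core of these steps can be sketched as a forward-Euler loop over the three states. This is a simplified illustration under stated assumptions, not the script itself: v_nt is taken to be 3 × aa_per_sec, dt is chosen as a safety factor divided by the fastest rate, growth dilution is applied to both protein states, and the P_bind and K_R values are placeholders. The names translation_rate and simulate are hypothetical.

```python
import math

LN2 = math.log(2)
DELTA_M = LN2 / 360.0   # mRNA decay rate (s^-1), 6 min half-life
ALPHA = LN2 / 1800.0    # growth dilution rate (s^-1), 30 min doubling
K_MAT = LN2 / 420.0     # maturation rate (s^-1), 7 min half-time


def translation_rate(p_bind, k_init_max=1.0, aa_per_sec=20.0, footprint_nt=30.0):
    """beta = min(initiation-limited rate, elongation-limited rate)."""
    v_nt = 3.0 * aa_per_sec  # assumed: nucleotides per second per ribosome
    return min(k_init_max * p_bind, v_nt / footprint_nt)


def simulate(strengths, p_bind=0.5, throttle=False, k_r=50.0,
             k_tx_baseline=0.02, t_max=5000.0, safety=0.1):
    """Euler-integrate mRNA, immature and mature protein for each promoter.

    strengths : dict of promoter name -> relative strength
    Returns the final state of every promoter at roughly t_max.
    """
    beta = translation_rate(p_bind)
    dt = safety / max(DELTA_M, K_MAT + ALPHA)  # conservative step size
    n_steps = int(t_max / dt)
    state = {name: [0.0, 0.0, 0.0] for name in strengths}
    for _ in range(n_steps):
        # Optional coupling: all constructs share one ribosome budget.
        m_tot = sum(s[0] for s in state.values())
        beta_eff = beta / (1.0 + m_tot / k_r) if throttle else beta
        for name, s in state.items():
            m, p_imm, p_mat = s
            k_tx = k_tx_baseline * strengths[name]
            s[0] = m + dt * (k_tx - DELTA_M * m)
            s[1] = p_imm + dt * (beta_eff * m - (K_MAT + ALPHA) * p_imm)
            s[2] = p_mat + dt * (K_MAT * p_imm - ALPHA * p_mat)
    return {name: {"m": s[0], "p_immature": s[1], "p_mature": s[2]}
            for name, s in state.items()}
```

Without the throttle the system is linear, so final protein levels scale directly with promoter strength; enabling the throttle lowers every construct's output as total mRNA rises.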

Parameters & Defaults

k_tx_baseline = 0.02 RNAs/s (scaled by promoter strength)
mRNA half-life ≈ 6 min → δ_m = ln(2)/360 s⁻¹
Doubling time ≈ 30 min → α = ln(2)/1800 s⁻¹
Maturation (sfGFP-like) ≈ 7 min → k_mat = ln(2)/420 s⁻¹
Translation: aa_per_sec = 20, footprint_nt = 30, k_init_max = 1.0
Detection threshold: ~10% of brightest construct (configurable)
Window: t_max = 5000 s; dt auto-selected for stability
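As a worked check of these defaults, the sketch below converts half-times to rates and evaluates the closed-form steady states. It assumes the uncoupled model, with dm/dt = k_tx − δ_m·m, immature protein made at rate β·m and lost to maturation and dilution, and mature protein lost to dilution only. The function names and the example β value are hypothetical.

```python
import math


def rate_from_half_time(t_half_s):
    """Convert a half-life or doubling time in seconds to a first-order rate."""
    return math.log(2) / t_half_s


def steady_state(strength, beta, k_tx_baseline=0.02, mrna_half_life=360.0,
                 doubling_time=1800.0, maturation_half_time=420.0):
    """Closed-form equilibrium of the uncoupled three-state model."""
    delta_m = rate_from_half_time(mrna_half_life)
    alpha = rate_from_half_time(doubling_time)
    k_mat = rate_from_half_time(maturation_half_time)
    k_tx = k_tx_baseline * strength
    m_eq = k_tx / delta_m                          # production = decay
    p_imm_eq = beta * m_eq / (k_mat + alpha)       # made = matured + diluted
    p_mat_eq = k_mat * p_imm_eq / alpha            # matured = diluted
    return {"k_tx": k_tx, "m_eq": m_eq,
            "p_immature_eq": p_imm_eq, "protein_eq": p_mat_eq}
```

With the defaults and a promoter strength of 1, m_eq = 0.02 × 360 / ln(2), which is about 10.4 mRNA molecules per cell.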

Results & Deliverables

P2C “Anderson” report: anderson_k_tx_ladder.csv + bar-ladder figure; use for promoter swap shortlists.
CSV for Learn/Design-2: promoter_summary.csv with k_tx, m_eq, protein_eq, plus t80, detection time, maturation lag.
Raw traces: promoter_dynamics.csv for any time-resolved analysis.
Figures: PNGs and a single figures.pdf (overlays, galleries, diagnostics).

Limitations & Next Steps

No stochastic single-cell noise (fast, smooth ranking focus).
Global throttle is a trend-level approximation, not full resource economics.
RBS model is a heuristic; could be replaced with ΔG-based models later.
Easy upgrades: CLI flags (argparse), run logging, Numba/JAX for very large promoter sets, optional higher-order ODE solvers.

Glossary

k_tx: RNAs/s per cell (promoter speed)
m_eq: steady mRNA molecules per cell
protein_eq: steady mature protein per cell (brightness proxy)
t80: time to 80% of final brightness
Detection time: first crossing of a practical threshold
Throttle / coupling: translation slowdown when total mRNA is high
