Similar to many contemporary research groups, our team aims to leverage the background of each of its members to its advantage. Accordingly, part of the project’s efforts are dedicated to applying the computational approach of bioinformatics to, on one hand, optimize aerobactin biosynthesis, and on the other, deepen our understanding of the siderophore’s internalization by Klebsiella pneumoniae.
Considering that aerobactin uses citrate as its initial building block, it is important to optimize its production within the synthesis environment to maximize aerobactin yields. In this context, optimization predictions of citrate based on E. coli genome-scale metabolic models will be particularly useful.
On a different note, despite the well-documented specificity of siderophores, some studies report the phenomenon of xenosiderophores — internalized siderophores produced by other bacterial species. To understand how our siderophore might function as a xenosiderophore, it becomes essential to investigate, at the atomic level, how aerobactin is internalized by K. pneumoniae, and how it can be replicated within other bacterial species.
Our modeling effort combined two distinct scales: (i) a genome-scale metabolic model (GEM) predicting optimal biosynthetic flux distributions, and (ii) a molecular-level model simulating siderophore–membrane interactions. This multi-scale approach bridges metabolic engineering and structural biology.
Genome-Scale Metabolic Models (GEMs) are computational, mass-balanced reconstructions of all the biochemical reactions present in the metabolism of an organism. They integrate genomic, biochemical, and physiological data to provide a structured representation of cellular metabolism.
We used BiGG Models, a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. BiGG provides standardized, curated models that can be readily used for computational analysis.
COBRApy 0.29 (COnstraint-Based Reconstruction and Analysis in Python) [2] is a Python package or constructing, manipulating, and interrogating genome-scale metabolic network models. Through its intuitive object-oriented interface, users can load and save models in common formats (SBML, MATLAB, JSON) and apply standard constraint-based techniques, such as flux balance analysis (FBA), simulated gene/reaction knockouts, and parsimonious FBA. Therefore, it is possible to predict metabolic behavior under specified environmental or genetic perturbations[3].
To run our simulations, we used Python 3.11 because of its flexibility, readability, and compatibility with scientific computing libraries. All analyses were performed in Jupyter and terminal environments on Windows and Linux systems, allowing parallelized computation across multiple cores. Python’s ecosystem made it possible to combine mathematical modeling, data analysis, and visualization within the same workflow[4].
In our iGEM project, GEM-based simulations allow us to predict which environmental and genetic conditions maximize citrate and siderophore production. These predictions guided our experimental design, saving lab time and resources. The model helped us understand how citrate flux competes with growth in E. coli, explaining why siderophore production is limited under standard culture conditions. It also revealed that acidic pH and alternative carbon sources reroute carbon flux toward citrate, a key insight that guided our medium formulation for aerobactin production.
Our GEM aimed to predict how environmental parameters (pH, carbon source, phosphate levels) modulate the citrate flux feeding aerobactin biosynthesis. We hypothesized that acidic pH and Pi-rich conditions would reroute carbon flux toward citrate export, thus enhancing siderophore production.
This model does not include gene regulation or kinetics, but it captures steady-state metabolic trade-offs between growth and secondary metabolite secretion, providing quantitative guidance for our lab experiments.
The main impacts we evaluated were:
For our simulations, we focused on iML1515, one of the most comprehensive GEMs of E. coli K-12 MG1655. This model includes 1,516 genes, 2,712 metabolic reactions, and 1,877 unique metabolites, covering central carbon metabolism, amino acid biosynthesis, energy generation, and transport processes. Its scale and curation make iML1515 the reference point for constraint-based metabolic engineering in E. coli[5].
The tables below summarize the top 5 conditions for growth, citrate export, and siderophore production. These predictions highlight the trade-offs between cell growth and metabolite secretion, providing valuable insights for metabolic engineering.
Growth — What Stands Out?
Across substrates and temperatures, the wild type consistently reaches a maximal specific growth rate of 1.85 h⁻¹ at near-neutral pH with ammonium. This robustness suggests that, in silico, the core network can sustain optimal biomass production under multiple “favorable” carbon sources (glucose, acetate, glycerol, succinate). The fact that several entries share the same maximum highlights that growth is less sensitive than production metrics to modest pH/temperature shifts within these ranges.
| Model | Carbon source | pH | Ion | Temperature | Growth rate (h-1) |
|---|---|---|---|---|---|
| iML1515 | Glucose | ~7 | NH4+ | 30–45 °C | 1.85 |
| Acetate | ~7 | NH4+ | 30–40 °C | 1.85 | |
| Glycerol | ~7 | NH4+ | 30–40 °C | 1.85 | |
| Succinate | ~7–9 | NH4+ | 30–40 °C | 1.85 | |
| Acetate | ~5–9 | NH4+ | Standard (37 °C) | 1.85 |
Citrate export — What drives high flux?
Maximal citrate secretion (~18 mmol/gDW/h) occurs with glycerol as the carbon source, notably under more acidic conditions (~pH 5) and with Pi as the phosphate source. High-performing succinate/acetate scenarios also appear but at slightly lower fluxes (~15.5–17.9). Importantly, our full dataset shows that maximizing citrate export often coincides with low growth, indicating a classic trade-off: routing carbon toward secretion instead of biomass.
| Model | Condition | Carbon source | pH | Ion | Temperature | Citrate export (mmol/gDW/h) |
|---|---|---|---|---|---|---|
| iML1515 | WT | Glycerol | ~5 | Pi | 34 °C | 18.17 |
| WT | Glycerol | ~7 | NH4+ | 30–40 °C | 18.17 | |
| WT | Acetate | ~7 | NH4+ | 30–37 °C | 15.53 | |
| WT | Succinate | ~5-7 | Pi | 34–37 °C | 17.93 | |
| WT | Succinate | ~9 | Pi | 34–40 °C | 17.93 |
Design takeaway: If citrate is the priority product, test glycerol feeds at lower pH with Pi, but plan for strategies (co-feeding, staged cultivation, or two-stage processes) to rescue growth.
Predicted siderophore fluxes cluster around ~6.67 mmol/gDW/h across many environments. Notably, ICDHyr knockouts can preserve high siderophore flux while abolishing growth in silico, suggesting that biosynthesis can decouple from biomass formation. This robustness makes siderophores a compelling target when a stable production phenotype is desired across variable conditions.
| Model | Condition | Carbon source | pH | Ion | Temperature | Siderophore secretion rate (mmol/gDW/h) |
|---|---|---|---|---|---|---|
| iML1515 | WT | Glucose | ~7 | NH4+ | 30-45 °C | 6.67 |
| WT | Acetate | ~7 | NH4+ | 30–37 °C | 6.67 | |
| WT | Glycerol | ~7 | NH4+ | Standard (37 °C) | 6.67 | |
| KO - ICDHyr | Glucose | ~5 | NH4+ | 30–37 °C | 6.67 | |
| KO - ICDHyr | Acetate | ~5-9 | NH4+ | 34–40 °C | 6.67 |
The general mechanisms of siderophore uptake have been well characterized in both Gram-negative and Gram-positive bacteria [7]. Our bacterium of interest, Klebsiella pneumoniae, belongs to the Gram-negative group, which introduces the additional complexity of transporting siderophores through multiple stages, from the extracellular space, across the periplasm, and into the cytoplasm.
Given their molecular weight approximating 600 Da, siderophores must use an active process to traverse both the outer (OM) and inner (IM) membranes, engaging multiple proteins in the process [7]. Specifically, transport across the OM requires a TonB-dependent Transporter (TBDT), whereas crossing the IM relies on an ATP-binding cassette (ABC) transporter.
Transport across the OM relies on the TonB protein complex. As previously mentioned, the TBDT is embedded in the OM as a β-barrel containing an internal globular domain, referred to as the plug domain (PD), that serves as a selective gate for siderophores and other molecules. The critical role of this channel is reflected in its structural conservation: all annotated TBDTs feature a 22-stranded β-barrel incorporating the PD [8]. Upon activation by the interaction of the iron-bound siderophore, the TonB Box — a polypeptide motif near the N-terminal signal region of the TBDT — extends toward the IM and interacts with the β-sheet of the TonB protein, effectively adding an additional strand [9]. Anchored to the IM via the TonB-ExbB-ExbD complex, this interaction harnesses the proton motive force present in the IM to trigger a conformational change in the PD, allowing the siderophore to enter the periplasm [10]. Without this mechanism, transport would not be possible, as the OM lacks ATP or an ion gradient [7].
Following entry into the periplasm, two main mechanisms have been proposed to explain how iron ions are transferred to the cytoplasm. In the first, the Fe³⁺-siderophore complex interacts with a reductase that releases the iron in its Fe²⁺ form within the periplasm. As the siderophore has only a low affinity for Fe²⁺, it is either degraded or recycled back to the extracellular space to scavenge for new Fe³⁺ ions. The liberated Fe²⁺ is then transported across the IM to the cytoplasm via an ABC transporter [11], possibly assisted by a carrier protein as well.
In the second mechanism, a carrier protein binds the siderophore complex immediately after its entry into the periplasm. This interaction directs the complex to an ABC transporter, which imports it into the cytoplasm. Once inside, the iron is released through the action of an esterase, which degrades the siderophore in the process[12, 13].
Once released into the cytoplasm, the journey of the Fe²⁺ ions culminates, in part, in their role as a cofactor for Ferric Uptake Regulator (Fur), a transcriptional repressor [8]. In the presence of Fe²⁺, Fur binds to specific DNA sequences known as Fur boxes. Upon binding, Fur represses the expression of a dozen genes, including those involved in the biosynthesis of siderophores and TBDTs [8].
The literature provides a clear overview of the siderophore’s journey — from iron scavenging to its release into the cytoplasm. However, as we learn more about this process,new questions arise. One of the first we sought to understand concerned the PD of the TBDT.
A widely accepted model suggests that the PD must undergo a structural rearrangement to allow the siderophore to pass through. The extent of which, however, remains a matter of debate. Minimal shifts would result in the formation of narrow nanopores through which the siderophore could diffuse. However, this model fails to account for the translocation of much larger protein cargos such as colicins, which could range from 29 to 69 kDa, up to 100 times bigger than our siderophore [8].
This discrepancy leaves us with the two prevailing hypotheses: either the PD is partially displaced, functioning like a hinged door, or it is completely removed, like the cork of a champagne bottle [8]. One of our experts, Dr. Patrick Lagüe, pointed out that completely removing the PD could represent a simpler mechanism for the siderophore to enter the periplasm. However, this newly opened passage could also become an easy point of entry for many molecules, creating a significant vulnerability if this model were accurate.
To better understand the importance of the plug domain as a structural component of the TBDT, we employed molecular dynamics simulations, which enable us to analyze how the different parts of the protein interact and behave over time. To do so, we first had to acquire a structure of Klebsiella Pneumonia’s TBDT. Since none were found at the beginning of this iGEM cycle, and considering that a high degree of structural conservation is observed between all TBDTs [8], we opted to predict the structure using Alphafold3 [14] and Chai-1 [15]. To do so, we used the sequence from the Uniprot entry Q6U607 as an input sequence for both predictions. Both models predicted similar structures, with the two of them presenting a root mean square deviation (RMSD) of only 0.958 Å. In this case, the RMSD reflects the average distance between two atoms of superimposed structures [16]. A lower RMSD therefore, reflects a high similarity between the structures. To create the biomodelling system, we used the structure generated from AlphaFold3.
Furthermore, we constructed two biomodelling systems using the platform CHARMM-GUI [17]. The first system included the full TBDT domain, with the only exception being the signal peptide to reduce the size of the water box, effectively starting the polypeptide sequence at the 24th residue. The second system had the PD removed, beginning the structure at the 157th residues, immediately before the first β-strand of the β-barrel (Fig. 4).
The membrane was based on CHARMM-GUI’s outer membranes of Gram-negative bacteria [18], a generic membrane template used for any Gram- that we scaled up while preserving the original lipid ratios. Our two models were built with the membrane composition presented in the table below. Both systems were supplemented with 0.15M NaCl, and trajectories were conducted at a temperature of 303.15K.
| Leaflet composition | |||||||
|---|---|---|---|---|---|---|---|
| Lipid Name | Lipid Head/Tail | Area Per Lipids (APL) | Leaflets | ||||
| Outer | Inner | ||||||
| PPPE (PYPE) | PE (16:0/16:1 (9Z)) | 60.54 Å2 | - | 180 | |||
| PVPG (POPG) | PG(16:0/18:1(11Z)) | 62.1 Å2 | - | 48 | |||
| PVCL2 | CL(1'-[16:0/18:1(11Z)],3'-[16:0/18:1(11Z)]) | 124.47 Å2 | - | 12 | |||
| ECLIPA | E.coli R1 core-Lipid A* | 183 Å2 | 85 | - | |||
| Total | - | 85 | 240 | ||||
Using Nanoscale Molecular Dynamics (NAMD) [18], we calculated six (6) trajectories of at least 350 ns each — one set of triplicates for the first system created and another for the second system.
An additional important question to address is the interaction of the siderophore with the extracellular portion of the TBDT. This interaction serves as the initial step of the internalization process. The most suitable approach for this prediction was using Chai-1 [15]. To keep biological significance within our prediction, this method required the inclusion of the siderophore, the TBDT and a Fe3+ ions, since the siderophore will present itself bound to the ion for uptake.
However, SMILES codes can only represent covalent bonds, as they were designed to represent small molecules. As such, adding the ions to the SMILE would incorrectly model covalent bonds, rendering our prediction invalid. Given the siderophore’s intrinsic high affinity for Fe3+, we instead opted to predict the siderophore and Fe3+ ion separately, assuming that the ions would naturally associate in higher probability with the siderophore. The resulting predictions were then compared with empirically determined structures available in the Protein Data Bank.
Developing a deeper understanding of siderophores internalization is a crucial step towards using them efficiently as a “Trojan Horse”. Analyzing every component of the proteins involved in this pathway is therefore essential, both to optimize our strategy and to explore new approaches for tackling the problem of antibacterial resistance.
For a detailed exploration of this part of the project, please refer to the Results section via this link.
This research was enabled in part by support provided by Calcul Québec and the Digital Research Alliance of Canada