Loading Progress
0%

Mathematical modelling


Key Achievements

The REE part of our project focuses on the binding of rare earth elements to their specific binding protein, so that we can isolate them in the objective of recycling them from electronic waste. This is why we identified certain proteins used by natural bacteria to bind these REE with a high specificity. The initial goal of our project was to create a product that could be sold on an industrial scale. However, since currently we do not have the infrastructure required to be able to test the up-scaling of our project, we came to the idea to replace these experiments with a simulative model that would enable us to calculate how many rare earth elements we could bind and extract later on. For the up-scaling to be relevant for industry, optimisations of initial parameters are crucial, and therefore one of the central points of our model is also to deliver some optimisations that would take a very long time being done in the wet lab.

The modeling project is divided into four individual models. This mirrors the wet lab procedure where different steps of the total extraction process happen in different environments at different times. The first step of the process requires the bacteria to grow and to produce curli and metal binding proteins (we focused on lanthanum binding Lanmodulin, however a change of the parameter would allow us to simulate the binding of other binding proteins). Once that is done, the binding protein and the curli cells will then be extracted before being put into contact to form protein complexes. Once the proteins have formed complexes they are incubated with dissolved REE for the REE capturing to occur. To model these different processes we will use differential equations which track the dynamics of different bacteria and protein concentrations as can be seen in Figure 1.

The first and second parts of the model will each use similar differential equations to model the growth of an initial amount of bacteria growing with a limited amount of nutrients to produce a certain number of proteins. The first part will be specific to the curli protein producing bacteria and keep track of how many curli proteins the bacteria have produced. The second part will be specific to the metal binding protein producing bacteria. Because our project focuses on many different REE and that each REE needs its own protein, the parameters used in the model will change. For example, we focused on lanthanum binding, so we have adjusted our model parameters so that the values concerning the binding protein are specific for lanthanum. The third part of the model simulates the binding between curli proteins and metal binding proteins. Both protein concentrations will be determined by the first two parts of the model, but the ratio between both concentrations will be subject to optimization, as being one of the key results of the model. The fourth and final part of the model simulates the binding of an initial concentration of REE to an amount of protein complexes formed from curli protein and binding protein. This will then tell us how much REE we have captured.

figure 1
Figure 1 Graphical representation of the different parts, forming the model, mimicking the wet lab protocol for REE capture. Each submodule represents one step of the total process, modeled using ordinary differential equations.

The combination of all the models should give us a good idea of how efficient our project is for the capturing of specific REE.

We chose to code our model in RStudio with a Euler integration scheme. As mentioned previously we used differential equations to represent the evolution of different variables throughout time.

First and Second models

For the first and second model, we looked at the changes of three variables over time:

  • The biomass concentration (X)
  • The substrate concentration (S)
  • The curli or metal binding protein concentration (P, representing curli or BP respectively)

For each variable, we used a differential equation:

formula 1

  • For species growth, the per capita growth rate is calculated by a Monod equation, in which the bacterial growth rate is influenced by the decreasing substrate concentration. μmax is the maximal possible growth rate and Ks is the half saturation constant. They are multiplied by the current concentration of biomass X to determine the next concentration of biomass.
  • For resource utilization, Y is the yield factor describing the need for food for a certain biomass which is then multiplied by the current amount of biomass produced to determine how much concentration of substrate is left. The equation is fully negative because the substrate is being consumed.
  • In protein/curli production, α is the protein production rate multiplied by the current amount of biomass X to determine the next concentration of protein.

Third model

For the third model, we also looked at the changes of three variables over time:

  • The curli protein concentration (curli)
  • The binding protein concentration (BP)
  • The complex concentration (Complex)

For this part we transformed the concentrations of proteins from g/L to mol/L to make sure that the ratio of one protein binding to another is respected as the proteins do not have the same molar mass:

formula 2

Where kf is the binding rate between the curli protein and the metal binding protein and curli and BP are the current concentrations of the curli and metal binding proteins. dcurli/dt and dBP/dt are both only negative due to the fact they form a covalent bond which is irreversible.

Fourth model

For the fourth model, we also looked at the changes of three variables over time:

  • The complex concentration (Complex)
  • The REE concentration (ion)
  • The final product concentration (bound_ion)

formula 3

Where k_on is the binding rate between the binding proteins and the REE, k_off is the unbinding rate between the binding proteins and the REE, γ is the number of binding sites of one binding protein, Complex is the current concentration of protein complexes, ion is the current concentration of REE and bound_ion is the current concentration of metal-complexes.

The different parameters and initial variables values used in the models to obtain the results were taken from literature.

Table 1: Table of all the parameters, with their values, units and a short description, with the references at the bottom of the page.
Parameter Description Value Reference
mu_max maximal growth rate for E. coli strain MG1655. (as close as we could find to the actual strain of M037) 1.99 h-1 Wang et al. (2010)
ks_1 half saturation constant for MG1655 0,486 gL-1 Smaluch et al. (2022)
y_1 Yield factor describing the need for food for a certain biomass for E. coli strain MG1655. 0.63 gg-1 Waegeman et al. (2011)
alpha_1 curli production rate for E. coli 1.26 *10-5 gg-1s-1 Siri et al. (2023) *
mu_max2 maximal growth rate for E. coli strain BL21 1.1 h-1 Survyla et al. (2021)
ks_2 half saturation constant for E. coli strain BL21 0.7 gL-1 Survyla et al. (2021)
y_2 yield factor describing the need for food for a certain biomass for E. coli strain BL21 0.8 gg-1 Survyla et al. (2021)
alpha_2 lanM production rate for E. coli strain BL21 6.55*10-6 gg-1s-1 Babaeipour et al. (n.d.) (not for lanmodulin but for a protein similar in size) **
kf binding rate between the curli protein and the binding protein 760 Lmol-1s-1 Keeble et al. (2021)
k_on binding rate between the lanM and lanthanum 109 M-1s-1 Featherston et al. (2021)
k_off unbinding rate between lanM and lanthanum 0,05 s-1 Featherston et al. (2021b)
gamma number of binding sites 4 found by looking at the structure of the proteins through chimeraX (for Lanmodulin : PDB file related to Mattocks et al. (2023))
molar mass curli molar mass of the curli protein 14000 gmol-1 PDB file related to Bu et al. (2024)
molar mass BP molar mass of the binding protein 12460 gmol-1 PDB file related to Mattocks et al. (2023)

* The rate has been calculated as follows: (mass(curli) / mass(cells)) / time of assembly [s] = (0.016 / 6.34) / 200. The time of assembly (200 s) was calculated as the length of a curli fiber (1 μm, Hu, 2017) divided by the elongation rate (5 nm/s, Sleutel et al., 2017).

** The rate has been calculated as follows: (mass(protein) / mass(cells)) / time of growth [h] = (51 / 127) / 17h.

As described above, we based our simulation with Lanmodulin as binding protein and Lanthanum as ion, but our analysis is easily extendable to other binding proteins given the rate and affinity constants. Here we show intermediate steps of our model to allow for a better understanding of our model:

For the first part, we simulated the production of curli fiber. To recap, this used three variables: the biomass concentration (X1, [g/L]), the substrate concentration (S1, [g/L]) and the curli protein concentration (curli, [g/L]). We set the initial values of our variables, where X1 = 80 μg/L, S1 = 2 g/L and curli = 0 g/L, which corresponds to rough estimates of the substrate and initial bacterial biomass used experimentally. According to literature, curli is mainly produced during the stationary phase (Barnhart & Chapman, 2006), so we launched the curli production as soon as the stationary phase was reached. As we let the bacteria grow and produce curli fiber for four days, we simulated the curli production rate over 96h. To test whether the model and the parameters result in dynamics that are reasonable, we plotted them (figure 2).

figure 2
Figure 2 Dynamics of csgA producing strain growth,substrate concentration and curli concentration over time. The continuous production of curli comes from the model assumptions.

The binding protein production followed similar dynamics, except for different parameters which have been outlined in the table, and the fact that lanmodulin is expressed in the exponential phase, as being induced by IPTG, instead of the stationary phase for curli. The initial conditions were the same, but the simulation (figure 3) runs only over 24h to again mimic experimental procedures.

figure 3
Figure 3 Dynamics of binding protein producing strain, substrate concentration, and binding protein concentration produced over time.

For the third part, we took the final concentration of the curli and the binding proteins of their respective previous parts (respectively figures 2 and 3), and we converted their unit from g/L to mol/L, by dividing the concentrations by their respective molar mass. In order to stick closely to our experiments, we also added the specific volumes we used in the wet lab, where we tested our system by adding 2 μL of curli proteins and 1 mL of binding protein, resulting in ~ 1 mL total volume, and we ran the simulation for 3 hours. Then, we observe how the complex is formed (figure 4), which takes into account two protein concentrations and one parameter : kf (binding rate of DogTag - DogCatcher):

figure 4
Figure 4 Curli and binding protein concentration decreasing over time due to complex formation due to binding.

For the fourth part, we finally simulated the metal binding. To do this, we took the final concentration of the complex from the third simulation (figure 4) and the initial concentration of lanthanum ion (set at 250 μM which was our experimentally used value). During the simulation, we took into account that Lanmodulin has four binding sites by multiplying the amount of complexes by four. To avoid the complexity of simulating each binding site separately, we make an approximation for the total rate of association and dissociation (k_on and k_off) as an average between all four binding sites. We let this simulation (figure 5) run for about half an hour, because the very high k_on value from literature (10^9) indicated that the binding would probably be very quick, which mirrors the reason the wet lab protocol also used a short time for the ion binding.

figure 5
Figure 5 Dynamics showing decrease of curli-BP complex over time as it binds to the metal ion to form a bound ion state.

Despite using parameters that matched exactly the ones in the wet lab protocol, the model predicts that we capture less than what we observed in the wet lab, by one degree of magnitude (we observed around 15 μM captured experimentally, but the model predicts 3 μM in figure 5).

The reason for this could be that we had to include a lot of parameters that were unknown for the specific processes we were modelling. This meant that we had to use approximate values from the literature, either for other similar binding reactions, or in some cases where no such values were available, we had to make rough estimates of the rates (as we have mentioned in the table). Hence, many parameters occurring in the experimental setup might be different from what is considered in the model, despite probably being in similar orders of magnitude, and their accumulation can lead to this bias. Of course, the best option would be to verify each of the parameters experimentally, but we didn’t have the time nor the material needed to do so.

Despite these limitations, we can still proceed to investigate whether upscaling of REE ions captured can work. To this end, we coded a heatmap, where we vary the initial substrate concentrations. This permits to grow more bacteria, that will produce more proteins, and more proteins should lead to more ion binding. But in the process of varying the substrate concentration, we will understand the effect of each protein on the whole process.

figure 6
Figure 6 Heatmap comparing the bound ion concentration in function of the initial substrate concentrations, where an equal volume of curli solution and binding protein solution were mixed.

In figure 6, while using the same parameter set as before, we see that our results are binding-protein limited, because the increase of curli protein doesn’t affect the final bound ion concentration. Despite not exactly matching the experimental ion capture, we note that binding protein limitation is also what we discovered experimentally. In the wet lab we decided after some tests to go with a curli:BP ratio of 1:500.

figure 7
Figure 7 Heatmap comparing the bound ion concentration in function of the initial substrate concentrations, where the curli solution was mixed with a 500 times larger binding protein solution (based on experimental).

When the new ratio is implemented, we see through figure 7 that we jump into a regime of curli-limitation because the increase of binding protein doesn’t affect the final bound ion concentration, as the unbound curli fibers concentration reaches 0, disenabling any more complex to be formed. This concludes that the 1:500 curli:BP ratio is a too big ratio.

Given that the production of curli and binding protein are different, even if we optimize the substrate concentration such that the concentrations are the same for environment C, we do not see a complete capture (figure 8).

figure 8
Figure 8 Curli concentration used over time; BP concentration used over time; complex concentration formed over time, where we have the same concentration of curli and binding proteins.

As we observed some possibility of unspecific binding in the wet lab, and that unspecific binding was caused by curli, we state that it is preferable to have all the curli fibers bound. Therefore, we saturate the curli proteins by adding a little bit bigger concentration of binding proteins. In figure 8, the ratio was ~ 1:122 curli:BP. So, we decided to keep going when 99% of the curli proteins were bound to binding proteins, which occurs at 1:136 curli:BP ratio.

figure 9
Figure 9 Heatmap comparing the bound ion concentration in function of the initial substrate concentrations, where the curli solution was mixed with a 136 times larger binding protein solution (based on model calculations).

And in fact, we see that this 1:136 ratio seems to offer an optimal ratio for scaling up (figure 9). And the calculation of the diagonal (when both substrate concentrations are the same) gives us an increase of ~4.95 μM captured per g/L substrate increased in both conditions. If this is a linear relation, we would expect to catch the total 250 μM when we have initial substrate concentrations of 50.5 g/L, which is confirmed by the model (figure 10).

figure 10
Figure 10 Heatmap comparing the bound ion concentration in function of the initial substrate concentrations, where the curli solution was mixed with a 136 times larger binding protein solution (based on model calculations). The initial ion concentration is 250 μM, and the scale up went up to 100 g/L of substrate.

All the code used for figures 2-10 is available here

The model offers a way to investigate the role of different parameters of the process and its effect on capture. Given parameters we could find and estimates from rough calculations, we see a 3 μM capture of metal ion as opposed to experimentally observed 15 μM. However the model does provide qualitatively agreeing results, such as without additional manipulation, we are in the binding protein limited regime and need a higher curli:BP ratio. This ratio needs to be optimal for scaling up capture - if it is too high, the curli-limited regime is suboptimal. The current parameters suggest a 1:136 ratio, which can be verified experimentally. Furthermore, this validation can also inform back model parameters that can then more accurately predict capture, that is for now predicted to increase by ~5 μM per g/L substrate added in each cell culture.

So, in general, our two biggest motivations, seeing if an up-scaling was possible (and at which rate) and optimizing the ratio between the curli proteins and the binding proteins, have been achieved, even though the results have to be taken cautiously, as we saw some variations to the experimental values.

For building the model, we had to make some assumptions, to not overcomplicate the code with too many parameters.

For this model, we assumed that the cells don’t die (no death rate in both productions), we omitted the possible saturation of substrate on the cells (that would reduce the growth rate when there is too much substrate), we assumed we had no protein degradation, and we omitted the unspecific binding we observed experimentally curli having on the ions.

Due to these assumptions, we realize that our model is an approximation, as every other model. “Essentially, all models are wrong, but some are useful” (George Box). Well, we sure hope that our model is among the useful ones.