n
Modelling
Models are simplifications of reality: as the statistician George Box stated, “all models are wrong, but some are useful” [1]. Here, we use models for two purposes: to predict what we could not address in the lab and to improve the experimental design. Firstly, we have used an individual-based model (IBM) of the repressilator to gain insight into how long the bacterial population could oscillate synchronously in a gut-like environment. We track the number of proteins and undergone divisions in each bacterium, and set simple rules on how they divide, die or update their protein amounts guided by a system of ordinary differential equations (ODEs). This model was validated with previous simulations or experimental findings that we managed to reproduce. Secondly, we have developed an ODE model to optimise the experimental setup of a kill switch.
ODE model of the repressilator
Model Elowitz and Leibler 2000[3]
The first genetic circuits appeared already over two decades ago, changing the prospect of synthetic biology [2]. One of them, the repressilator, is a simple network in which LacI, TetR and CI respectively inhibit each other’s transcription, producing an oscillatory pattern of expression over time [3]. Its publication came along with a theoretical model where six coupled first-order differential equations describe how the mRNA and protein levels of the three above-mentioned nodes change over time. The detailed derivation of this model using Hill functions has been previously described [4,5]; and the final equations are as follows:
$$\frac{d\,[mRNA_{LacI}]}{dt} = \alpha_0 + \frac{\alpha}{1+[CI]^n} - [mRNA_{LacI}] $$ $$\frac{d\,[LacI]}{dt} = -\beta * ([LacI] - [mRNA_{LacI}])$$ $$\frac{d\,[mRNA_{TetR}]}{dt} = \alpha_0 + \frac{\alpha}{1+[LacI]^n} - [mRNA_{TetR}]$$ $$\frac{d\,[TetR]}{dt} = -\beta * ([TetR]- [mRNA_{TetR}])$$ $$\frac{d\,[mRNA_{CI}]}{dt} = \alpha_0 + \frac{\alpha}{1+[TetR]^n} - [mRNA_{CI}]$$ $$\frac{d\,[CI]}{dt} = -\beta * ([CI] - [mRNA_{CI}])$$Assumptions to obtain these dimensionless equations [3,4]:
- 1.1. All three repressors are identical regarding their transcription, translation and degradation rates, as well as cooperativity of their binding to the operator.
- 1.2. Time is rescaled in units of the mRNA lifetime.
- 1.3. Protein concentrations are written in units of KM, the number of repressors necessary to achieve half of the theoretical maximum repression of a promoter.
- 1.4. mRNA concentrations are rescaled by their translation efficiency (average number of proteins produced per mRNA molecule).
Note that the results from the model should be rescaled back to plot real units of protiens or time (which is what we show in all our figures)
Parameter | Value |
---|---|
promoter strength (not repressed) | 0.5 transcripts/s |
promoter strength (repressed) | 0.0005 transcripts/s |
translation efficiency | 20 proteins/transcript |
protein half-life | 10 min |
mRNA half-life | 2 min |
KM | 40 monomers/cell |
Hill coefficient (n) | 2 |
With the information in Table 1 we can obtain the values for the parameters α, α0 and β:
Parameter | Meaning | Value |
---|---|---|
α | Protein copies per cell produced in absence of the repressor | 216 (see here how to obtain it) |
α0 | Protein copies per cell produced in presence of the repressor (leakiness) | 0.216 (= α/1000) |
β | Ratio of the protein decay rate to the mRNA decay rate | 0.2 (see here how to obtain it) |
We initially found it hard to obtain the values in Table 2. To clarify how to get the proper value of α we contacted Michael B.Elowitz and he kindly solved our doubts. Here you can find a more detailed description.
The model shows that the value of α is directly correlated with the amplitude of the oscillations (Figure 1 a). Another interesting property is that β is inversely related to the period of the oscillation (Figure 1 b) (i.e. the longer the protein lifetime or the shorter the mRNA lifetime, the longer the period), as previously stated [6]. Both findings have also been shown by the SCU China 2017 iGEM team [5]. However, a known challenge is the fact that a change of period usually implies a change of amplitude and vice versa, as some studies have addressed [7].
The observed period of our repressilator (~14 bacterial generations [8]) multiplied by the assumed generation time of E.coli Nissle growing in vivo (100 minutes [9]) would render a period of roughly 24 hours (~23.3h), which is indeed what we aim to obtain. A similar period was also obtained experimentally in the gut of mice with the repressilator system we used [10]. However, we acknowledge that modulating the period of the oscillations would be necessary in order to adapt our system to particular conditions. Even men and women have been shown to have differences in their circadian circle and, according to these models, we could adapt the oscillation period for instance by promoting degradation of the proteins in the repressilator (see Integrated Human Practices for more information).
Introduction of IPTG and aTc into the model
The above-described model [3], has been widely studied and modified with different purposes [11, 12, 13]. We introduced the compounds IPTG and aTc into the equations, which will respectively bind to LacI and aTc, removing them from the system and inhibiting their repressing function. To do so, thanks to the advice from the Imperial College 2020 iGEM team, we used the Hill equation (find a detailed derivation of the function on pages 7-9 of their tutorial):
$$\frac{d\,[mRNA_{LacI}]}{dt} = \alpha_0 + \frac{\alpha}{1+[CI]^n} - [mRNA_{LacI}] $$ $$\frac{d\,[LacI]}{dt} = -\beta * ([LacI] - [mRNA_{LacI}])*(1 - \frac{[IPTG]^{n_I}}{k_{d_I} + [IPTG]^{n_I}})$$ $$\frac{d\,[mRNA_{TetR}]}{dt} = \alpha_0 + \frac{\alpha}{1+[LacI]^n} - [mRNA_{TetR}]$$ $$\frac{d\,[TetR]}{dt} = -\beta * ([TetR]- [mRNA_{TetR}])*(1- \frac{[aTc]^{n_a}}{k_{d_a} + [aTc]^{n_a}})$$ $$\frac{d\,[mRNA_{CI}]}{dt} = \alpha_0 + \frac{\alpha}{1+[TetR]^n} - [mRNA_{CI}]$$ $$\frac{d\,[CI]}{dt} = -\beta * ([CI] - [mRNA_{CI}])$$
Assumptions in this model (on top of the above-mentioned ones):
- 2.1. IPTG or aTc are added in excess so its consumption can be negligible.
- 2.2. IPTG or aTc rapidly diffuse into the cells so that the internal and external concentration is the same.
- 2.3. Quasi-equilibrium in the formation of the complexes (LacI-IPTG and TetR-aTc), and well-mixed system.
Assumptions 2.1 and 2.2 allow us to not need to track the abundance of IPTG inside the cell in an extra equation (which will be helpful to simplify our model as we discuss later on). Assumption 2.3 is required to derive the Hill equation we are using.
Parameter values:
Parameter | Meaning | Value |
---|---|---|
KdI | Apparent dissociation constant IPTG-LacI | 1.4e-6 M[14] |
Kda | Apparent dissociation constant aTc-TetR | 1.5e-8M [15] |
nI | Hill coefficient (cooperativity of binding) of IPTG | 1 [14] |
na | Hill coefficient (cooperativity of binding) of aTc | 2 [15] |
Increasing the amount of IPTG or aTc leads to a steady state without oscillations (Figure 2). In practice, this allows for “freezing” all the bacteria in the same phase and for letting them oscillate synchronously once the IPTG or the aTc are removed. These simulations allowed us to predict the state of the cells after such synchronization and to corroborate that the results from wet lab start in the expected phase after the overnight incubation with IPTG or aTc. Also, they helped to set the starting values in some simulations with our IBM (Animation 2)
.Individual-based model (IBM) of the repressilator
The oscillations rendered by the repressilator typically have a longer period than the cell division cycle, so the state of the oscillator must be transmitted through generations [3]. Hence, we were intrigued by how the variability in cell division, both in growth rate and in protein partition between daughter cells could influence the emergent signal of the population. To analyse the repressilator at the single cell level we built an IBM that defines variables characterizing each bacterium at each time step as well as the rules governing them. In the above ODE model, we explained how to coordinate the oscillations in all the cells using IPTG or aTc; now, using this IBM approach, we will assess for how long cells can remain in phase after such synchronization.
To validate the model, we explored the parameter space to adapt the results both to what has been previously modelled [3] and to experimental findings [8]. For instance, we checked whether the observed macroscopic ring-like structures of protein expression (Figure 3) emerge in simulated colonies. This is due to the fact that the “entry into the stationary phase causes the repressilator to halt” [3]. Since nutrient limitation ensures that only cells in the edge of the colony grow, it will be only in this active external part where oscillations will continue, while the inner and stationary part of the colony will remain arrested in different phases of the oscillations [8].
In this IBM we consider a bacterium as the individual unit, which consists of a list of three numbers representing the amounts of the proteins of the repressilator within that bacterial cell ([LacI, TetR, CI]) plus a counter to track the number of undergone divisions . At each time step, each bacterium will update its protein values, and it will have a chance to divide (i.e. generate a new the list) or die (i.e. erase the list) as Figure 4 shows. Division implies that the proteins inside the simulated bacterium will be split between the two daughter cells, with some noise involved (i.e. both cells do not necessarily need to receive exactly the same number of proteins, but the total will be maintained).
Time is discretised, and whether each of the three proteins increases or reduces in number inside a bacterium in a time step, will be determined by an ODE model of the repressilator. We simply solve the ODEs for the given starting values of the bacterium and for a particular time interval, hereafter referred to as “integration time”. As a result, we obtain three new values for the number of proteins that will define the bacterium at the beginning of the next time step (Animation 1). However, we have made important changes to the above-mentioned equations from [3].
Simplifying the ODEs within the IBM
Initially we were using the model from [3] to update the protein list of a cell. Since the ODEs had to be solved for each cell at time step, the simulation soon became quite heavy. In order to reduce the computational cost and to be able to run our model on a regular computer (e.g. 8 Gb RAM, 4 CPUs, 8 threads) we decided to simplify the equations. To that end, we have taken one of the simplest models for the repressilator based on Hill functions [4] and modified it according to the assumptions 1.1, 1.3 described above.
Generic equations from [4]:
$$\frac{d\,[Protein_1]}{dt} = \alpha_{0,1} + \frac{\alpha_1}{1+(\frac{[Protein_3]}{K_3})^n} - d_1*[Protein_1]$$ $$\frac{d\,[Protein_2]}{dt} = \alpha_{0,2} + \frac{\alpha_2}{1+(\frac{[Protein_1]}{K_1})^n} - d_2*[Protein_2]$$ $$\frac{d\,[Protein_3]}{dt} = \alpha_{0,3} + \frac{\alpha_3}{1+(\frac{[Protein_2]}{K_2})^n} - d_3*[Protein_3]$$Where 1,2,3 are LacI, TetR and CI; α is the protein production rate when promoters are not inhibited, α0 is the leakage, K is equivalent to the above-mentioned KM and d is the protein degradation rate. If we apply assumptions 1.1 and 1.3 and consider that there is no leakage (α0 = 0) we obtain the following simplified equations:
$$\frac{d\,[LacI]}{dt} = \frac{\alpha}{1+[CI]^n} - d_1*[LacI]$$ $$\frac{d\,[TetR]}{dt} = \frac{\alpha}{1+[LacI]^n} - d_2*[TetR]$$ $$\frac{d\,[CI]}{dt} = \frac{\alpha}{1+[TetR]^n} - d_3*[CI]$$The same kind of equations has been successfully used with other gene circuits such as the toggle-switch [16]. Importantly, this model does not consider the RNA levels, as neither do previous simulations of the repressilator [8]. It has been reported that removing the RNA levels from the equations will reduce the chance to obtain an oscillating system [6]. However, the same authors show that it can be counteracted by increasing the cooperativity of the repressors. For this reason, we increased the value of the Hill coefficient (n) from 2 to 2.4; the closest value rounded to one decimal place that promotes oscillations (Figure 6). Adapting the Hill coefficient to non-integer values is commonly done when modelling, even if it does not make sense biologically [17]. Experimental evidence shows that LacI acts as a tetramer [18] while CI and TetR act as dimers [19, 20]. So, at least, the value of 2.4 assumed for all of them falls within the biological range of cooperativity.
In addition, we have removed the protein degradation term from the equations, assuming that protein degradation will be exclusively due to dilution through cell division: when a cell divides, roughly half of the proteins will be distributed between the two daughter cells (Figure 5). This is a reasonable assumption in rapidly growing bacteria such as ours [17]. Moreover, we are using a repressilator system where protein degradation tags were removed to obtain more stable oscillations [8]. In addition, in silico experiments [21] recently showed that spatio-temporal patterns observed experimentally in growing colonies, such as static ring-like patterns (Figure 3), can be achieved when the protein degradation is exclusively due to dilution through cell division.
Finally, we will keep the term for synchronization either with IPTG or aTc, so we sould include the above mentioned assupmtions 2.1 and 2.2 an 2.3. The resulting equations to determine protein change within each bacteria at each time step are:
$$\frac{d\,[LacI]}{dt} = \frac{\alpha}{1+[CI]^n}*(1- \frac{[IPTG]^{n_I}}{k_{dI}*[IPTG]^{n_I}})$$ $$\frac{d\,[TetR]}{dt} = \frac{\alpha}{1+[LacI]^n}*(1- \frac{[aTc]^{n_a}}{k_{da}*[aTc]^{n_a]}})$$ $$\frac{d\,[CI]}{dt} = \frac{\alpha}{1+[TetR]^n}$$Note that, as in the prior section for the ODE model, due to the assumption 1.3, we will have to rescale back the protein units resulted from the model (multiplying by KM=40). This will render the number of proteins per cell that we show in all our plots.
The simulated bacterial cells can be placed in two spatial settings: one dimension (cells are placed in a list), or two dimensions (cells are placed in a grid). We used the former to assess which configuration can lead to a fixed stable state or continued oscillations (limit cycle oscillations) and to see how the noise in cell division can desynchronize the oscillations of individual cells. Using the latter, we tested different division algorithms and sought for the emergence of the ring-like macroscopic structure of protein expression in colonies, to later show how long the cells could remain coordinated in a gut-like environment.
One dimension IBM
Here we placed the simulated cells in a list that can be extended without limit. Although we have referred to this configuration as one dimensional (1D), the position that cells occupy in the list does not really matter; then, it is rather adimensional and represents a homogeneous environment. With this model we aim to simulate the experiments we performed in the lab, where bacteria are grown in shaken liquid medium, are also frequently diluted to maintain the exponential growth, and where their overall fluorescence is measured regularly. Thus, in our simulation we assumed limitless resources (as the medium is replenished) and performed dilutions at the end of a time step whenever the population exceeded 1000 cells by randomly selecting 200 cells to continue with the experiment in the next time step (otherwise the simulation would soon get computationally costly). Then, we calculated the overall signal for each of the proteins as the average amount of proteins per cell. The advantages over the wet lab experiments are that we have more control over dilutions and we can directly measure the amount of the proteins of the repressilator (with no need of reporters) as frequently as we wish.
With this configuration we explored a critical feature of repressilator systems: some parameter values will promote oscillations in the system (Figure 6a) while others will lead to a stable state without oscillations (Figure 6b)[6].
We ran simulations of 100 time steps with different parameter values (focusing on α, integration time and division probability), and tracked the amplitude of the oscillations in the last 20 time steps as a proxy of whether that configuration can sustain continued oscillations (non-zero amplitude; colour) or leads to a non-oscillatory state ( ~ 0 amplitude; black) (Figure 7).
As a general trend, both higher α and longer integration time increase the amplitude of the oscillations (Figure 7 a). This helped us to infer some parameter values for our simulation; for instance, for the widely used α=216 we might need an integration time of around 0.7 time units to obtain an amplitude in the order of 2500 proteins per cell as previously simulated [3]. Note that here we are referring to time in arbitrary time units, it would correspond to 0.7/(ln(2)/2) minutes in the original ODE model [3], however, since we removed the protein degradation and mRNA level in the equations, it would not be accurate to give a time unit. Yet, it should not be an issue, as this time just indicates the required integration interval in the ODE model to update the proteins in each cell at every time step in order to adapt our IBM to previous results with continuous models [3].
On the other hand, the probability of division does not have a big effect on the final amplitude of the oscillations (Figure 6 b). However, lower probabilities of division tend to promote oscillations, as they increase the average number of time steps that a bacterium undergoes before dividing (which is equivalent to increasing the integration time). Probabilities really close to 1 can also slightly promote oscillations, but would be an effect of decreasing the noise in cell division: if at each time step all the cells divide, the number of proteins in all the cells will be almost equal (depending on the noise on protein partition) and the average signal of all the cells will have less variance and be somehow clearer.
The fact that the noise in cell division can desynchronize the oscillations of individual cells , is an important concern in the field [8]. We took advantage of our model to also evaluate the effect of this noise, which we split up into two potential sources:
- The variability in the time that individual cells spend until they undergo division (growth rate variability). It has been reported to follow a normal distribution [33]. Just by reducing the probability of division (e.g. from 1 to 0.8) we could introduce some noise in the number of divisions that every cell undergoes. This noise can be slightly increased by sampling the probability of division from a normal distribution (e.g. N (0.8, 0.1); see Figure 8 b,c)
- The size of the daughter cells or the protein partitioning in cell division has also been reported to show some variability [22, 23]. To introduce such variability in our model, if the total amount of proteins in the mother cell is considered as 1 unit, we can sample the number of proteins for one of the daughters from a beta distribution centred around 0.5 (e.g. Beta(50,50)), and the remaining proteins will be attributed to the second daughter cell.
As the cell division noise increases (either by growth rate or protein partition), shifts in the phase of individual cells start to occur (Figure 8). The autocorrelation indicates how the data correlates with itself at a particular time step. For the noiseless case, the autocorrelation is 1 every so many time steps, due to the exact pattern repetition (Figure 8 a). The noise reduces the amplitude of the oscillations of the autocorrelation. Thus, the autocorrelation has been the choice to represent the coordination of the oscillations over time in previous studies [3, 8, 24]. Our results show that both the variation in growth rate (Figure 8 b,c) and, to a lesser extent, in protein partition (Figure 8 d) can desynchronize the oscillations, increasing the variation of protein expression between cells, and reducing the amplitude of the oscillations in the overall signal. The combination of both noises (Figure 8 e) will increase the dephasing as the autocorrelation shows (lower amplitude of oscillations than in Figure 8 b or 8 d). As we have seen, the robustness of the oscillator can soon be lost due to the noise in cell division, and it has been reported to particularly affect oscillators with a greater period than cell division [13].
Importantly, without any noise we can observe 7 peaks of protein expression per 100 time steps, which corresponds to a period of ~14 time steps (Figure 8 a). In such conditions, where the probability of division is 1 (i.e. all the cells will divide each time step), the number of time steps will be equivalent to the number of generations. Thus, we obtain a period of roughly 14 generations as has been reported for our repressilator system [8]. Note that reducing the division probability (and thus growth rate) would decrease the number of peaks in the same time frame (Figure 7 b), hence increasing the period of the oscillations. We could then target the bacterial growth rate to modulate the period of the oscillations as we discuss in our integrated human practices.
Two dimensions IBM
Moving one step further, we assessed how a 2D spatial setting would affect our model. Here, cells will be located in a specific position of a grid, and we will loop through all those positions in random order to check if there is a cell and, if so, update its protein values, let it divide or die. After each time step, we can colour each position of the grid to represent the number of proteins of the bacterium that occupies that position. If the ring-like structure in colonies emerges from this IBM, we could use it to validate the algorithm and the previously chosen parameter values. Then, we aim to use those parameters, adapted to a gut-like environment, to see for how long bacteria could release azurin in a reasonably coordinated manner. However, due to the lack of accessibility of the gut and debated validity of e.g. mouse models, some aspects of the microbial growth in the colon remain unclear [25]. The chosen conditions and assumptions thus, might not be perfect, but could at least “reduce the system complexity [for the model] to be useful” [26].
We wanted to keep our model simple and discrete, so we created a grid of predefined size (usually 100 x 100) where each position could be occupied by only one bacterium. Then, we have to define how a bacterial cell can divide and spread to adjacent positions; something that could resemble the cellular automaton [27] or the Eden algorithm [28], which have been studied since the early stages of computer science and theoretical biology. Inspired by these theories we started by defining two simple models of division:
- Mode A : Cells could be copied to their four adjacent positions (von Neumann neighbourhood, Figure 9 a) if any of those is available.
- Mode B : Same as mode A, but with eight adjacent positions (Moore neighbourhood, Figure 9 b).
However, those configurations rendered shapes of colonies that did not look natural (Figure 10 a, b). In consequence, we gave a try to some modifications:
- Mode C: If a bacterium cannot divide it can shove the adjacent cells to create an available position [29]. We found this configuration increases the speed of colony growth, however the colony remains to have squared shape (Figure 10 c).
- Bacterial cells can randomly choose a position to divide and if it is occupied, they will not divide, rather than continue looping until they find an empty position [30]. This configuration not only reduces the computational cost, but also renders shapes more reliable to simulated colonies (Figure 10 d, e). However, it slightly reduced the speed of growth of the colony, as more time steps are needed to obtain a colony of more or less the same size. Within this configuration we have explored two modes:
- Mode D: Cells will randomly choose one position in their Neumann neighborhood and will divide (i.e. be copied there) if it is empty. In addition cells can move to any position in their Neumann neighbourhood even if they do not divide [31].
- Mode E: The position cells choose to divide does not have to be directly adjacent to the position of the cells, so they can be copied to positions in the grid which are not only 1 but also 2 or more layers away (See Figure 11 to understand the concept of layer).
It is noteworthy that other sources of variation, such as reducing the probability for a cell to divide in a time step (e.g. 0.5 instead of 1) or increasing its probability of death (e.g. 0.3 instead of 0) would turn colony shapes like those in Figure 10 a,b,c into others resembling Figure 10 c,d. However, we wanted to select an algorithm that already provides a realistic colony shape without taking advantage of those tricks. For that reason, we continue exploring division modes D and E.
To further validate the algorithm of cell division we tested whether macroscopic static ring-like structures appeared when bacterial cells grew in a colony using the parameters set before (α = 216, integration time = 0.7 arbitrary units).
Importantly we could obtain such patterns with the 5 modes above described, and we found an interesting behaviour with division mode E: by increasing the number of layers where a cell can “travel” when dividing we can control the thickness of the obtained rings (Figure 12). In the simulation, however, the values of proteins per cell are higher than expected. The reason is that we have removed the protein degradation term from ODE equations; so, if cells do not divide, the number of proteins will continue to increase. This problem will be partially solved when we simulate a gut-like environment as cells will divide more evenly under those conditions.
This fact allows us to approximate the pattern of rings in the simulations to the observed in experiments. The thickness of the obtained rings is ∼200 μm [10]. Considering the size of E.coli is ∼1 × 3 μm [32], we would need rings of around 100 bacteria (i.e. 100 grid positions) to reproduce the real thickness of the rings. This is almost obtained with mode E and ∼20 layers (Figure 12). However, increasing the number of layers also increases the spacing between rings of expression of shame protein. In reality, the spacing is also ∼200 μm [10], that will be equivalent to 100 grid positions and is obtained with mode E and 10-15 layers (Figure 12). Considering this, and that the simulations we run use relatively small grids (no bigger than 500 x 500), we decided to choose the moderate number of 10 layers for the mode of division E. Here we also assessed whether adding the capacity of movement without division to the Neumann neighbourhood would alter the spatial structures of protein expression, but no significant change was observed.
So far, we have chosen some parameter values (α = 216, integration_time = 0.7, and division mode E with 10 layers). This reproduced previous experiments and generates the ring-like structure of protein expression when bacteria grow in colony configuration (this happens for the three proteins in the repressilator (Figure 13 a) as previously observed [10]). As explained, these patterns are due to the fact that cell division occurs mainly in the border in the colony. Thus, the closer to the border a cell is in the colony, the higher number of divisions it has undergone (Figure 13 b). This would cause an uneven distribution of growth rates that applies to bacterial colony growth in a lab setting, but does not necessarily apply to most real environments. The distribution of bacterial growth has been approximated to a normal distribution before [33]. Since we have not found in literature the growth rate distributions in the human colon, we assume that the distribution of growth rates for bacteria in a localized region of the colon (e.g. the tumour microenvironment) can be approximated to a Gaussian as well. Our aim now is to modulate parameters such as division and death probability within the set mode of division to obtain a Gaussian distribution of growth rates.
According to the central limit theorem, if all the cells have the same probability of division, after enough time steps the distribution of the number of divisions that every cell has undergone would follow a Gaussian. The probability of division for each cell in our 2D model will depend on its neighbourhood, so we cannot obtain exactly the same probability for all the cells at each time step. However, having a relatively high death probability would release positions in the grid, allowing for a not equal but more similar probability of division for the different cells in the grid. Figure 14 shows both in the q-q plot and the shape of the histogram, that increasing the death rate leads to a distribution of growth rates closer to a Gaussian. However, a death rate higher than 0.3 does not show any improvement. The best options are death probabilities 0.2 and 0.3. In this case, we select 0.3 as it looks slightly better and provides a higher growth rate, so we can simulate more bacterial divisions in fewer time steps. Changing the division probability did not seem to affect the final distribution, as neither did allowing cells to move when they do not divide nor looping 5 times (instead of 1) to find a free position; in consequence and for the sake of simplicity we do not include those properties in our algorithm for cell division.
Having a death probability of 0.3 at every time step is quite a high value and probably does not match the reality. We interpret it as an assumption that helps to adapt the model to a gut-like environment due to the following facts:
- We have simplified the 3D space in the gut to a 2D layer; thus, when a bacterium moves to the layer above or below in the real world and leaves a space to divide will be interpreted as death in our model.
- The reported competition taking place in the gut (e.g. toxin injection through T6SS)[34]
- Bacteria, while growing in the mostly anaerobic environment of the gut will ferment nutrients and release acids that will reduce pH and can increase mortality [35].
Another important assumption we are making is that nutrients are equally distributed in space and always replenished. To support this, we first have to consider that most bacterial growth in the colon takes place in the proximal part where carbohydrate content is much higher, while almost no bacterial growth can be expected in the rest of the colon [35]. Thus, the strategy with our repressilator would best target proximal colon cancer; whereas distal colon cancers should be treated with bacteria engineered with an oscillator whose functioning is not dependent on bacterial division. Then, the isotropic distribution is supported by the fact that “mixing is continuously occurring in the proximal colon and leads to well mixed behaviour on local scales” [35]. This, however, might not always be true, as the same authors document that there is a layer of mucus of ~400 μm around the gut wall where diffusion is slower. To be more precise, we could have implemented a nutrient diffusion from the lumen to the gut wall [36].
With all the tested conditions and given the assumptions, we were able to find the best growth algorithm and parameter values for our situation. Then, we used those conditions to show how long cells can stay in phase once synchronized with IPTG or aTc. Azurin will be expressed under PLtetO1 in the sponge plasmid, which is the same promoter controlling the expression of CI (see Design). In consequence, we should synchronize with IPTG rather than with aTc to not have an expression of azurin until bacteria are properly localised in the tumour area (Figure 2).
Animations 2 and 3 show how bacteria colonise a niche in the colon while releasing azurin (we are assuming immediate and total leakage of azurin in the Animations 2 and 3 c). Bacterial growth starts from the bottom of the grid that represents the gut wall where bacteria can attach and spreads towards the gut lumen as previously done [36]. The simulations allowed us to realise the importance of providing the synchronising compound while bacteria are invading the gut (Animation 3). Otherwise, the oscillatory state of individual cells will soon desynchronise and the overall amount azurin will be constantly around average values (Animation 2). Such desynchronization would be due to the higher noise in cell division while invading a niche. On the other hand, if we provide the synchronising compound (IPTG in this case, but could be lactose or adapted to other compounds) and once they have occupied the niche we remove it and let them oscillate, we could obtain 3 defined peaks of azuring expression according to our model (Animation 3 c). The third peak, however, has a lower amplitude and is noisier, which is a sign of the lack of coordination between the cells, as we can see in Animation 3 a; so we cannot extend the system further.
The repressilator has been reported to be functioning for up to 16 days in the gut of mice [10], which is the periodicity we consider for the intake of bacterial pills. However, as we have shown with our simulations, bacteria are not going to be perfectly synchronised for that whole period of time. As a safety measure and to guarantee that a higher dose of azurin is released when expected, we should provide a dose of the synchronising compound at least every 3 days.
Our model, particularly in the two-dimensional setting, relies on some vague assumptions due to the lack of experimental evidence about gut microbiota growth patterns. However, a recent publication started to decipher some properties about such “dark matter” in the gut of mice, including characteristics of bacterial growth at the individual level [37]. These insights show a dispersed model of growth, and qualitatively different growth rates between the daughter cells, which could support our algorithm and assumptions. Still, in the near future, we expect our approach to be perfectioned with less reliance on “gut feelings” and more on gut facts.
Kill switch modelling
In order to avoid the spread of our bacteria to undesired parts of our body or the environment we designed a kill switch as a form of biocontainment. It relies on two toxin-antitoxin pairs: ccdB-ccdA and miniColicin-IM2. The antitoxin would be expressed under the repressible promoter PphoB; addition of phosphate (e.g. phosphate concentrations in the blood) would inhibit the expression leading to bacterial death. In addition, an RNA-thermosensor allows translation of the antitoxin at 37ºC but would inhibit it at lower temperatures, causing bacterial death outside of the body.
The kill switch we developed experimentally is a nice proof of concept, but we aimed to take advantage of simulations to try to predict improved configurations of the system. For instance, including the toxin in the genome would increase the reliability of the system regarding plasmid maintenance: if the toxin is constitutively produced from the chromosome, any bacterium losing the plasmid would die due to the toxic effect. In consequence, the antibiotic of resistance could be removed from both our plasmids, contributing to fight the antibiotic resistance crisis [38].
Our goal is that in the conditions where bacteria should die (e.g. presence phosphate) the toxin is not counteracted by the antitoxin and the population dies; while in the gut-like conditions, there is enough antitoxin to neutralize the toxin and bacteria survive. To achieve this, we took into account the transcription and translation rate of the involved proteins and the copy number of our plasmids (~5 for repressilator and ~20 for sponge). We would like to have at least two copies of the toxin in the genome to reduce the problems with spontaneous mutations, but were wondering if the toxin in the plasmid should have only one or more copies to counteract the effect of the toxins. To solve this kind of issue, we built an ODE model.
Guided by the iGEM 2020 measurement tutorials and thanks to the help of Alejandro Vignoni we could derive the Hill function for a repressible promoter. Applied to the production of antitoxin under the PphoB promoter, the resulted equation would be:
$$\frac{d[AT]}{dt}=CN_{AT}*\frac{k_{1AT}*k_{2AT}}{d_{1AT}}*\frac{K_{dAT}}{K_{dAT}+[P]^n}$$Where AT is the antitoxin, CN the copy number of the gene, Kd is the koff/kon and P is phosphate. Note that for simplicity we are assuming that phophate is direclty the transcription factor (TF), although biologically there are intermediate proteins that turn a change in phosphate concentration into an effect in gene expression.
Using this function and otherwise just Mass Action kinetics we defined how the toxin, antitoxin and complex of both vary over time (see Table 4 and Figure 16 for parameter description):
$$\frac{d[T]}{dt}=CN_T*\frac{k_{1T}*k_{2T}}{d_{1T}} - d_{2T}*[T] - \gamma_f*[T]*[AT] + \gamma_d*[C] $$ $$\frac{d[AT]}{dt}=CN_{AT}*\frac{k_{1AT}*k_{2AT}}{d_{1AT}}*\frac{K_{dAT}}{k_{dAT}+[P]^n} - d_{2AT}*[AT] - \gamma_f*[T]*[AT] + \gamma_d*[C]$$ $$\frac{d[C]}{dt}= \gamma_f*[T]*[AT]-\gamma_d-*[C]-d_C*[C]$$In addition, we added an equation to track bacterial population changes assuming logistic growth. The effect of the toxin in the population was added as referenced [39]. Then, we also accounted for the "degradation" of the above compounds (T, AT and C) due to dilution in growth rate (see Figure 5 for clarification).
$$\frac{d[T]}{dt}=CN_T*\frac{k_{1T}*k_{2T}}{d_{1T}} - d_{2T}*[T] - \gamma_f*[T]*[AT] + \gamma_d*[C] - \mu*[T] $$ $$\frac{d[AT]}{dt}=CN_{AT}*\frac{k_{1AT}*k_{2AT}}{d_{1AT}}*\frac{K_{dAT}}{K_{dAT}+[P]^n} - d_{2AT}*[AT] - \gamma_f*[T]*[AT] + \gamma_d*[C] - \mu*[AT]$$ $$\frac{d[C]}{dt}= \gamma_f*[T]*[AT]-\gamma_d-*[C]-d_C*[C] - \mu*[C]$$ $$\frac{dN}{dt}=\mu*N*(1-\frac{N}{N_{max}})-N*\frac{T_{max}*[T]^n}{T_0+[T]^n}$$Assumptions during the derivation [40]:
- The binding of promoter and repressor occurs really fast and we can assume a quasi-equilibrium in the equation that describes how the complex (promoter + TF) evolves over time.
- Transcription is fast enough compared to translation, so we assume another quasi-equilibrium(dmRNA/dt≈0)
- Dilution due to cell division (μ) is considered separately to degradation (d) for all compounds.
Symbol | Meaning |
---|---|
T | Toxin |
AT | Antitoxin |
C | Complex (T + AT) |
CN | Copy number of the gene (including multiple copies in the genome and plasmid copy number) |
k1 | Transcription rate (depends on the promoter) |
k2 | Translation rate (depends on the RBS) |
d1 | mRNA degradation rate (not due to bacterial growth) |
d2 | Protein degradation rate (not due to bacterial growth) |
Kd | koff/kon(see Figure 15) |
μ | Growth rate |
γ_f | Formation rate of C |
γ_d | Dissociation rate of C |
dC | Degradation rate of C |
N | Bacterial population |
Nmax | Carrying capacity |
Tmax | Maximum rate of cell lysis |
T0 | Concentration of lysis gene resulting in half maximum lysis |
n | Hill coefficient of lysis |
Case study
Unfortunately, we could not find all the required parameter values to adapt the model to our particular situation with accuracy. Nonetheless, we would like to demonstrate the utility of the model in an approximated case study with the ccdB-ccdA system.
Two copies of the toxin (ccdB, BBa_K3482016) will be expressed in the chromosome with the promoter BBa_J23106 and using the RBS (B0030). The antitoxin (ccdA, BBa_K3482020 ) will be expressed with the promoter PphoB (BBa_K3482013) and the thermosensor F2 (BBa_K3482001; that includes its own RBS) in the plasmid of our repressilator system (BBa_K3482025) or in the sponge plasmid (BBa_K3482026). We want to know the number of copies of the antitoxin that we should clone in the plasmid (Figure 16).
Parameter | Value |
---|---|
k1T | 0.78 min-1 [40] |
k2T | 1.7 * |
k1AT | 0.36 min-1 [41] |
k2AT | 0.7* |
kd_AT | 1000 molec (assumption) |
d1T,d1AT | 0.231 min-1[40] |
d2T,d2AT | 3e-4 min-1[40] |
μ | 8.5e-3 min-1[40] |
γ_f | 1.67 M-1 min-1** |
γ_d | 0.0291 M-1 min-1** |
dC | 0 (assumption) |
Nmax | 100 |
T_max | 10*** |
T_0 | 2*** |
n_P | 1 (assumption) |
n_T | 2 [44] |
CN repressilator (BBa_K3482025) | 5 (addgene) |
CN sponge (BBa_K3482026) | 20 (addgene) |
* k2 rates were predicted using the Salis lab RBS calculator [42]. As the result is obtained in arbitrary units, it was rescaled to the expected range [40] by arbitrarily dividing by 10000. See extra information (here)
**We could not find the desired rates for the complex ccdB-ccdA. The described rates for the complex ccdB-GyrA (the union that causes the toxicity of ccdB) were 0.0167 M-1 * min-1 for the association and 0.0291 M-1 * min-1 for the dissociation [43]. As the antitoxin inhibits the formation of the toxic complex we arbitrarily assumed that the formation rate of the complex ccdB-ccdA is two fold greater than the one for ccdB-GyrA, while we kept the same dissociation rate.
*** As the values could not be determined, we arbitrarily chose the same values as in [39].
Figure 17 a shows that including 3 copies of antitoxin in the repressilator plasmid (so 15 copies in total per cell) would not be enough (as neither was 1 or 2 copies) and the population would go extinct. Adding 4 copies of the antitoxin per plasmid would solve the problem, allowing the bacterial population to grow (Figure 17 b). Interestingly, initially, the population starts to die, but soon the production of antitoxin can counteract the expression of the toxin. If we increase the concentration of phosphate, the expression of the antitoxin will be inhibited and the system will collapse (Figure 17 c).
If we instead add the antitoxin in the sponge plasmid, one copy per plasmid (20 copies in the cell) would be enough to allow the survival of the population (Figure 17 b). If we inserted two copies of antitoxin in the sponge plasmid (Figure 18), we would need one order more of phosphate concentration to successfully get rid of the bacteria (Figure 18 b,c).
This case study was just an example, we could have found different optimal configurations by changing the promoter, the RBS or the toxin-antitoxin system for instance. Nevertheless, overall it shows how our model could guide the experimental design of a kill switch.
References (click to see)
- [1] G. E. P. Box, “Robustness in the Strategy of Scientific Model Building,” in Robustness in Statistics, Elsevier, 1979, pp. 201–236.
- [2] D. E. Cameron, C. J. Bashor, and J. J. Collins, “A brief history of synthetic biology,” Nature Reviews Microbiology, vol. 12, no. 5. Nature Publishing Group, pp. 381–390, 01-Apr-2014.
- [3] M. B. Elowitz and S. Leibier, “A synthetic oscillatory network of transcriptional regulators,” Nature, vol. 403, no. 6767, pp. 335–338, Jan. 2000.
- [4] E. L. O. Brien, E. Van Itallie, and M. R. Bennett, “ Modeling synthetic gene oscillators,” Math. Biosci., vol. 236, no. 1, pp. 1–15, 2012.
- [5] “Team:SCU China/Model/Repressilator - 2017.igem.org.” [Online]. Available: https://2017.igem.org/Team:SCU_China/Model/Repressilator. [Accessed: 21-Oct-2020].
- [6] A. Loinger and O. Biham, “Stochastic Simulations of the Repressilator Circuit,” Phys. Rev. E, vol. 76, no. 5, Oct. 2007.
- [7] M. Tomazou, M. Barahona, K. M. Polizzi, and G. B. Stan, “Computational Re-design of Synthetic Genetic Oscillators for Independent Amplitude and Frequency Modulation,” Cell Syst., vol. 6, no. 4, pp. 508-520.e5, Apr. 2018.
- [8] L. Potvin-Trottier, N. D. Lord, G. Vinnicombe, and J. Paulsson, “Synchronous long-term oscillations in a synthetic gene circuit,” Nature, vol. 538, no. 7626, pp. 514–517, Oct. 2016.
- [9] N. Crook et al., “Adaptive Strategies of the Candidate Probiotic E. coli Nissle in the Mammalian Gut,” Cell Host Microbe, vol. 25, no. 4, pp. 499-512.e8, Apr. 2019.
- [10] D. T. Riglar et al., “Bacterial variability in the mammalian gut captured by a single-cell synthetic oscillator,” Nat. Commun., vol. 10, no. 1, pp. 1–12, Dec. 2019.
- [11] J. Garcia-Ojalvo, M. B. Elowitz, and S. H. Strogatz, “Modeling a synthetic multicellular clock: Repressilators coupled by quorum sensing,” Proc. Natl. Acad. Sci. U. S. A., vol. 101, no. 30, pp. 10955–10960, Jul. 2004.
- [12] O. Buse, R. Pérez, and A. Kuznetsov, “Dynamical properties of the repressilator model,” Phys. Rev. E - Stat. Nonlinear, Soft Matter Phys., vol. 81, no. 6, p. 066206, Jun. 2010.
- [13] D. Gonze, “Modeling the effect of cell division on genetic oscillators,” J. Theor. Biol., vol. 325, pp. 22–33, 2013.
- [14] J. Xu and K. S. Matthews, “Flexibility in the inducer binding region is crucial for allostery in the Escherichia coli lactose repressor,” Biochemistry, vol. 48, no. 22, pp. 4988–4998, Jun. 2009.
- [15] “Team:HSiTAIWAN/Model - 2016.igem.org.” [Online]. Available: https://2016.igem.org/Team:HSiTAIWAN/Model. [Accessed: 22-Oct-2020].
- [16] T. S. Gardner, C. R. Cantor, and J. J. Collins, “Construction of a genetic toggle switch in Escherichia coli,” Nature, vol. 403, no. 6767, pp. 339–342, Jan. 2000.
- [17] B. Ingalls, “Mathematical Modelling in Systems Biology : An Introduction,” 2012. https://mitpress.mit.edu/books/mathematical-modeling-systems-biology
- [18] J. Chens, S. Albertil, and K. S. Matthewsm, “THE JOURNAL OF BIOLWICAL CHEMISTRY Wild-type Operator Binding and Altered Cooperativity for Inducer Binding of lac Repressor Dimer Mutant R3*,” 1994.
- [19] A. C. Babić and J. W. Little, “Cooperative DNA binding by CI repressor is dispensable in a phage λ variant,” Proc. Natl. Acad. Sci. U. S. A., vol. 104, no. 45, pp. 17741–17746, Nov. 2007.
- [20] G. J. Palm et al., “Thermodynamics, cooperativity and stability of the tetracycline repressor (TetR) upon tetracycline binding,” Biochim. Biophys. Acta - Proteins Proteomics, vol. 1868, no. 6, p. 140404, Jun. 2020.
- [21] T. Yañez Feliú, Guillermo; Vidal, Gonzalo; Muñoz, Macarena; Rudge, “Novel tunable spatio-temporal patterns from a simple genetic oscillator circuit,”Front. Bioeng. Biotechnol. pp. 1–24, 2020.
- [22] J. M. Guberman, A. Fay, J. Dworkin, N. S. Wingreen, and Z. Gitai, “PSICIC : Noise and Asymmetry in Bacterial Division Revealed by Computational Image Analysis at Sub-Pixel Resolution,” vol. 4, no. 11, 2008.
- [23] M. Soltani, C. A. Vargas-Garcia, D. Antunes, and A. Singh, “Intercellular Variability in Protein Levels from Stochastic Expression and Noisy Cell Cycle Processes,” PLoS Comput. Biol., vol. 12, no. 8, p. 1004972, Aug. 2016.
- [24] J. Kuo, R. Yuan, C. Sánchez, J. Paulsson, and P. A. Silver, “Toward a translationally independent RNA-based synthetic oscillator using deactivated CRISPR-Cas,” Nucleic Acids Res., vol. 48, no. 14, pp. 8165–8177, 2020.
- [25] F. L. H.Tytgat, Nobrega, J. Van Der Oost, and W. M. De Vos, “Bowel Biofilms : Tipping Points between a Healthy and Compromised Gut ?,” Trends Microbiol., vol. 27, no. 1, pp. 17–25, 2018.
- [26] G. Vrancken, A. C. Gregory, G. R. B. Huys, K. Faust, and J. Raes, “Synthetic ecology of the human gut microbiota,” Nat. Rev. Microbiol., vol. 17, no. 12, pp. 754–763, 2019.
- [27] J. Von Neumann, Theory of self-reproducing automata. Urbana: University of Illinois Press, 1966.
- [28] M. Eden, “A Two-dimensional Growth Process”. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 4: Contributions to Biology and Problems of Medicine, 223--239, University of California Press, Berkeley, Calif., 1961.
- [29] P. Nan, D. M. Walsh, K. A. Landman, and B. D. Hughes, “Distinguishing cell shoving mechanisms,” PLoS ONE, pp. 1–21, 2018.
- [30] S. Maccracken, E. Curtis, Z. Sun, and C. A. Klausmeier, “How spatial structure and neighbor uncertainty promote mutualists and weaken black queen effects,” J. Theor. Biol., vol. 446, pp. 33–60, 2018.
- [31] C. A. Yates, A. Parker, and R. E. Baker, “Incorporating pushing in exclusion-process models of cell migration,” Phys. Rev. E - Stat. Nonlinear, Soft Matter Phys., vol. 91, no. 5, p. 052711, May 2015.
- [32] G. Reshes, S. Vanounou, I. Fishov, and M. Feingold, “Cell shape dynamics in Escherichia coli,” Biophys. J., vol. 94, no. 1, pp. 251–264, Jan. 2008.
- [33] J. H. Van Heerden, H. Kempe, A. Doerr, T. Maarleveld, N. Nordholt, and F. J. Bruggeman, “Statistics and simulation of growth of single bacterial cells: Illustrations with B. subtilis and E. coli,” Sci. Rep., vol. 7, no. 1, pp. 1–11, 2017.
- [34] T. G. Sana, K. A. Lugo, and D. M. Monack, “T6SS: The bacterial ‘fight club’ in the host gut,” PLoS Pathogens, vol. 13, no. 6. Public Library of Science, 2017.
- [35] J. Cremer, M. Arnoldini, and T. Hwa, “Effect of water flow and chemical environment on microbiota growth and composition in the human colon,” Proc. Natl. Acad. Sci. U. S. A., vol. 114, no. 25, pp. 6438–6443, 2017.
- [36] J. Schluter and K. R. Foster, “The Evolution of Mutualism in Gut Microbiota Via Host Epithelial Selection,” PLoS Biol., vol. 10, no. 11, p. e1001424, Nov. 2012.
- [37] L. Lin et al., “Revealing the in vivo growth and division patterns of mouse gut bacteria,” Sci. Adv., vol. 6, no. 36, pp. 1–9, 2020.
- [38] C. L. Ventola, “The antibiotic resistance crisis: causes and threats.,” P T J., vol. 40, no. 4, pp. 277–83, 2015.
- [39] M. O. Din, T. Danino, A. Prindle, M. Skalak, J. Selimkhanov, K. Allen, E. Julio, E. Atolia, L. S. Tsimring, S. N. Bhatia “Synchronized cycles of bacterial lysis for in vivo delivery,” Nature, vol. 536, no. 7614, pp. 81–85, Jul. 2016.
- [40] Y. Boada, A. Vignoni, J. Picó, and P. Carbonell, “Extended Metabolic Biosensor Design for Dynamic Pathway Regulation of Cell Factories,” iScience, vol. 23, no. 7, 2020.
- [41] C. U. Ulus¸eker, J. Torres, J. L. García, M. M. Hanczyc, J. Nogales, and O. Kahramano˘ Gulları, “A Dynamic Model of the Phosphate Response System with Synthetic Promoters in Escherichia coli.”
- [42] H. M. Salis, E. A. Mirsky, and C. A. Voigt, “Automated design of synthetic ribosome binding sites to control protein expression,” Nat. Biotechnol., vol. 27, no. 10, pp. 946–950, Oct. 2009.
- [43] M. H. Dao-Thi et al., “Crystallization of CcdB in complex with a GyrA fragment,” Acta Crystallogr. Sect. D Biol. Crystallogr., vol. 60, no. 6, pp. 1132–1134, Jun. 2004.
- [44] H. Afif, N. Allali, M. Couturier, and L. Van Melderen, “The ratio between CcdA and CcdB modulates the transcriptional repression of the ccd poison-antidote system,” Mol. Microbiol., vol. 41, no. 1, pp. 73–82, 2001.
Extra information: