Model
The three genes studied in this project, cpxR, rpoE, and degP, are denoted Gx, Gy, and Gz, respectively. Gx inhibits the expression of Gy, and Gx and Gy together activate the expression of Gz. The transcription and translation process is shown in Figure 1. When a gene meets its activation conditions, it is transcribed at a constant rate α, the same for all three genes, producing the corresponding mRNA (Mx, My, or Mz). The three mRNAs are then translated at a common constant rate β to produce the corresponding proteins Px, Py, and Pz. Each translated protein is converted to its activated form with a certain probability and diffuses through the cell. When the concentration of the activated protein at the position of a gene it regulates reaches the threshold, transcription of that gene is turned on or inhibited.
Figure 1. Simple gene expression model.
The model tracks nine kinds of intracellular substances: Gx, Gy, Gz, Mx, My, Mz, Px, Py, and Pz. Mx, My, and Mz are produced by transcription of the corresponding genes Gx, Gy, and Gz, and they diffuse and degrade in the cell over time. Px, Py, and Pz are produced by translation of Mx, My, and Mz, and likewise diffuse and degrade in the cell over time.
Figure 2. Gene expression process in the feedforward loop network motif.
To improve the thermal adaptation of E. coli, we plan to change the distance between the thermal adaptation-related gene degP and its regulatory genes cpxR and rpoE, thereby changing the expression level of thermal adaptation-related proteins. The purpose of this model is to simulate the relevant dynamic processes in the cell and predict how the final expression concentration of the Z protein and its expression start time change with the spatial distance between the genes of the feedforward loop network motif. Later experimental verification is then used to evaluate the quality of the model, so that it can predict changes in E. coli thermal adaptability under different genetic distances.
Figure 3. XYZ gene plane rectangular coordinate system.
To facilitate quantitative research, we chose a representation that makes the spatial distance between the three genes explicit. Because any three points in space are coplanar, we place the three genes Gx, Gy, and Gz in a plane rectangular coordinate system, as shown in Figure 3. We established a 50 × 50 plane rectangular coordinate system and use plane coordinates (j, k) to represent the position of each substance. The concentration of every substance at the edge of the coordinate system is taken to be 0, which corresponds to an infinite distance. Since the state of the substances in the cell is constantly changing, the concentration of each substance at a given point depends on both time t and the plane coordinates (j, k).
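The grid described above can be sketched in a few lines of Python. This is only an illustration: the helper names `make_field`, `apply_boundary`, and `place_gene` are hypothetical, and the gene position in the usage note is made up, since the text does not give the coordinates used in the model.

```python
GRID_SIZE = 50  # the 50 x 50 plane described in the text


def make_field(size=GRID_SIZE, value=0.0):
    """Return a size x size concentration field, indexed as field[j][k]."""
    return [[value for _ in range(size)] for _ in range(size)]


def apply_boundary(field):
    """Set the concentration on the outer edge of the grid to 0 (in place)."""
    n = len(field)
    for i in range(n):
        field[0][i] = 0.0
        field[n - 1][i] = 0.0
        field[i][0] = 0.0
        field[i][n - 1] = 0.0
    return field


def place_gene(size, position, copies=1):
    """Return a gene copy-number field that is 0 everywhere except `position`."""
    genes = make_field(size)
    j, k = position
    genes[j][k] = float(copies)
    return genes
```

For example, `place_gene(GRID_SIZE, (10, 20))` builds a copy-number field with a single gene copy at the illustrative position (10, 20), and `apply_boundary` would be called on each concentration field after every update to keep the edge at 0.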
Figure 4. Dynamic simulation of the partial differential interaction of Mx, My, Mz, Px, Py, Pz.
The changes in the concentrations of the six substances Mx, My, Mz, Px, Py, and Pz at time t are shown in Figure 4. α represents the transcription rate, β represents the translation rate, τm represents the lifetime of mRNA, and τp represents the lifetime of the protein. Gx(j,k,t), Gy(j,k,t), and Gz(j,k,t) represent the copy number of the gene at position (j,k); the copy number at a position without a gene is 0. Mx(j,k,t), My(j,k,t), Mz(j,k,t), Px(j,k,t), Py(j,k,t), and Pz(j,k,t) represent the concentration of the corresponding substance at position (j,k) at time t.
We assume that Mx, My, Mz, Px, Py, and Pz diffuse at the same speed in space and that the only driving force for diffusion is the concentration difference between positions in space. We then define the change in molecular concentration at (j,k) at time t caused by differences in mRNA or protein concentration as diffusion(M(j,k,t)) or diffusion(P(j,k,t)), respectively. diffusion(M(j,k,t)) or diffusion(P(j,k,t)) can be expressed by the following formula:
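The equations of Figure 4 are not reproduced in this text. As a rough sketch of how the quantities defined above could enter a simulation, the following assumes a common form for the local terms: mRNA is produced at rate α per active gene copy and decays as M/τm, and protein is produced at rate β per unit of mRNA and decays as P/τp. Both this form and the function name `reaction_step` are assumptions, and the diffusion term is left out here.

```python
def reaction_step(m, p, gene_copies, active, alpha, beta, tau_m, tau_p, dt):
    """One explicit Euler step of the local (non-diffusive) terms at one grid point.

    m, p         mRNA and protein concentration at this point
    gene_copies  copy number of the gene at this point (0 if no gene)
    active       True if the gene's activation condition is currently met
    """
    production = alpha * gene_copies if active else 0.0
    dm = production - m / tau_m   # transcription minus mRNA degradation
    dp = beta * m - p / tau_p     # translation minus protein degradation
    return m + dt * dm, p + dt * dp
```

Under these assumptions, a permanently active single-copy gene drives the local concentrations toward the steady state M = α·τm and P = β·τp·M when diffusion is ignored. The actual parameter values belong in Table 1 and are not assumed here.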
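The formula itself is not reproduced in this text. One common way to express diffusion driven purely by concentration differences on a square grid is the five-point discrete Laplacian, sketched below. The stencil, the diffusion coefficient `d`, unit grid spacing, and the function names are all assumptions made for illustration, not necessarily the formula used in the model.

```python
def diffusion(field, j, k, d=1.0):
    """Net rate of change at interior point (j, k) caused by concentration
    differences with its four nearest neighbors (five-point Laplacian)."""
    neighbors = (field[j + 1][k] + field[j - 1][k]
                 + field[j][k + 1] + field[j][k - 1])
    return d * (neighbors - 4.0 * field[j][k])


def diffuse_step(field, d, dt):
    """Return a new field after one explicit Euler diffusion step.

    Edge cells are left at 0, matching the boundary condition in the text.
    """
    n = len(field)
    new = [[0.0] * n for _ in range(n)]
    for j in range(1, n - 1):
        for k in range(1, n - 1):
            new[j][k] = field[j][k] + dt * diffusion(field, j, k, d)
    return new
```

With this explicit scheme and unit grid spacing, the step size should satisfy d·dt ≤ 0.25, otherwise the update becomes numerically unstable.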
When the activated X protein concentration near the Y gene reaches the threshold, the transcription of the Y gene is inhibited; when the activated X protein and the activated Y protein concentration near the Z gene reach the threshold at the same time, the transcription of the Z gene can be initiated.
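The regulation rule above can be written as two small predicates. This sketch assumes that "near the gene" means the grid point where the gene sits and that "reaches the threshold" means greater than or equal to it; the function and parameter names are hypothetical, and the threshold values themselves are not given in this text.

```python
def y_is_active(px_at_y, threshold_x):
    """Gy is transcribed only while activated X at Gy is below its threshold,
    because X inhibits Y."""
    return px_at_y < threshold_x


def z_is_active(px_at_z, py_at_z, threshold_x, threshold_y):
    """Gz is transcribed only when activated X and activated Y at Gz have
    both reached their thresholds at the same time."""
    return px_at_z >= threshold_x and py_at_z >= threshold_y
```

In a simulation loop these predicates would be evaluated at every time step, using the activated protein concentrations at the positions of Gy and Gz, to decide whether the transcription term of each gene is switched on.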
The values of the parameters in the formula of Figure 4 are shown in Table 1.
Figure 5. Relationship between the XZ gene distance and the maximum concentration of Z protein (left), and the experimental verification results (right).
The left panel of Figure 5 shows a negative correlation between the distance of the XZ signal pathway and the maximum value that the Z protein concentration can reach: the smaller the distance, the higher the concentration that the Z protein can accumulate. This suggests that the smaller the distance of the signal pathway between X and Z, the greater the heat adaptability of E. coli. Based on these model results, we designed experiments in which the degP gene, acting as the Z gene, was moved to different positions in the genome. As shown in the right panel of Figure 5, the experimental results show that the closer the X and Z genes are, the higher the heat adaptability of E. coli, which matches the model results.
Figure 6. Variation of substance concentration in feedforward loop network.
The kinetic process of the concentration of each substance changing with time is shown in Figure 6. In the three-dimensional coordinate system, the X and Y axes represent the spatial position of each substance, and the Z axis represents the concentration of the corresponding substance at that position; blue represents X protein, red represents Y protein, and green represents Z protein. The figure shows that each substance spreads outward from the location of its corresponding gene, and the further away from that location, the lower its concentration.
Figure 7. The relationship between the distance of XZ gene and the start time of Z protein expression.
Subsequently, a dynamic simulation was carried out to study the relationship between the start time of Z protein expression and the distance of the XZ signal pathway. Figure 7 illustrates the delay in Z gene expression in the feedforward loop network: there is a positive correlation between the distance of the XZ signal pathway and the time at which Z gene expression starts. The greater the distance of the XZ signal pathway, the later Z gene expression starts.
The aim of our modeling and analysis is to provide a theoretical basis for studying how changes in genetic distance within this specific feedforward loop network motif of E. coli affect its heat resistance. Our model and the experimental data we have obtained show that increasing the distance between the regulatory genes lowers the Z protein expression concentration and delays the start of expression, thereby reducing the heat resistance of the bacteria; conversely, decreasing the distance can improve their heat resistance. Our work can offer a new idea for future iGEM teams, who can carry out more complex operations based on our existing work and ideas to achieve their experimental goals.
We hope that future iGEM teams will not only use the components we have created in their experiments, but also approach problems from the perspective of the spatial distance between genes, adding new vitality to research in the field of synthetic biology.
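One simple way to read an expression start time off a simulated time course is to take the first time at which the Z protein concentration reaches a chosen detection level. The sketch below assumes that definition; the name `expression_start_time` and the detection level are hypothetical, since the text does not state how the start time was measured.

```python
def expression_start_time(times, pz_series, detection_level):
    """Return the first time at which Pz reaches `detection_level`,
    or None if it never does within the simulated time span."""
    for t, pz in zip(times, pz_series):
        if pz >= detection_level:
            return t
    return None
```

Repeating the simulation for several XZ distances and recording this value for each run would give a distance versus start time curve of the kind described for Figure 7.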