Team:Austin UTexas/Results

Results

Over the course of our project we collected expression data from hundreds of PineTree simulations. Here we display and discuss some of our most important findings.

Average Protein Expression of T7 Bacteriophage at Lysis Time

We began by finding the average protein expression level of each gene in the wild-type T7 genome. This gave us a baseline against which to compare our engineered phage. According to our results, gene 10A, which codes for the phage's major capsid protein, has the highest expression level in the genome. We believe this is due to the combination of overlapping polycistronic transcripts and strong promoters directly upstream of it.

Figure 1. Simulated protein abundances expressed by the wild-type T7 bacteriophage genome at lysis. Error bars represent 5th and 95th quantiles for each gene. Note: Class 1 genes (shown in pink) are present in this simulation but may be difficult to see due to their low expression levels.
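As a rough illustration of how a per-gene mean and the 5th and 95th quantiles could be computed from replicate simulations, here is a minimal Python sketch. The input layout (one dictionary of protein counts per simulation) and the function name `summarize_expression` are assumptions made for this example; they are not PineTree's output format or the analysis code used for the figure.

```python
from statistics import mean, quantiles


def summarize_expression(runs):
    """Summarize per-gene protein counts across replicate simulations.

    `runs` is a list of dicts, one per simulation, each mapping a gene
    name to its protein count at lysis (a hypothetical input layout).
    Returns {gene: (mean, 5th percentile, 95th percentile)}.
    """
    summary = {}
    for gene in runs[0]:
        counts = [run[gene] for run in runs]
        # quantiles(n=20) returns 19 cut points at 5% steps; the first
        # and last are the 5th and 95th percentiles.
        cuts = quantiles(counts, n=20, method="inclusive")
        summary[gene] = (mean(counts), cuts[0], cuts[-1])
    return summary


if __name__ == "__main__":
    # Made-up counts for illustration only, not simulation results.
    example_runs = [{"10A": 900 + 10 * i, "1": 5 + i} for i in range(21)]
    for gene, (avg, low, high) in summarize_expression(example_runs).items():
        print(gene, avg, low, high)
```

The quantile range is used here instead of a standard deviation because it makes no assumption about the shape of the distribution of counts across simulations.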

Testing Specific Gene Deletions

Our first investigation involved deleting each gene in the T7 bacteriophage genome, one at a time, to determine how its absence affected gene expression. We found that none of the average expression levels in the deletion genomes were significantly higher than in the wild type. Here we display the results for a deletion of gene 2, which codes for a small protein (gp2) that binds and deactivates the native E. coli RNA polymerase (RNAP). We expected the deletion of gene 2 to upregulate the expression of Class 1 genes, which are transcribed by the E. coli RNAP; however, the increase we saw was not significant. It is possible that because the E. coli RNAP is so much slower than the T7 RNAP, the stochastic effect of reducing its transcription is not very noticeable. This result does, however, illustrate one of the major limitations of our model: it is only programmed to recognize certain interactions (such as gp2 binding E. coli RNAP) that directly affect translation and transcription. The effects of other interactions, such as the host's restriction enzymes degrading the genome, are not modeled. Thus, it is highly likely that the lack of deactivation of the host RNAP by gp2 would have a much more negative impact on T7's growth in a biological setting, something that we can investigate once we begin lab testing in phase 2.
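One simple way to screen a deletion mutant for upregulated genes is to check whether any gene's mutant mean falls above the wild-type 95th quantile. The Python sketch below shows that idea. The criterion, the function name `flag_upregulated`, and the input layout are all assumptions for illustration; the text does not state which significance test was actually applied.

```python
def flag_upregulated(wild_type, mutant):
    """Return genes whose mutant mean exceeds the wild-type 95th quantile.

    `wild_type` maps gene -> (mean, q05, q95) and `mutant` maps
    gene -> mean protein count (both hypothetical layouts).
    Genes absent from `mutant`, such as the deleted gene, are skipped.
    This threshold is an illustrative criterion, not a formal test.
    """
    flagged = []
    for gene, (_, _, q95) in wild_type.items():
        if gene in mutant and mutant[gene] > q95:
            flagged.append(gene)
    return flagged


if __name__ == "__main__":
    # Made-up numbers for illustration only, not simulation results.
    wt = {"1": (100, 80, 120), "2": (50, 40, 60), "10A": (1000, 900, 1100)}
    gene2_deletion = {"1": 110, "10A": 1050}
    print(flag_upregulated(wt, gene2_deletion))
```

An empty list from this check would correspond to the case described above, where no gene in a deletion genome is expressed at a level clearly above the wild type.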

Figure 2. Simulated protein abundances expressed by T7 bacteriophage genome at lysis with gene 2 deleted. Error bars represent 5th and 95th quantiles for each gene. Note: Class 1 genes (shown in pink) are present in this simulation but may be difficult to see due to their low expression levels.

Inserting GFP into the T7 Genome

In addition to deletions, we tested how GFP would be expressed when inserted into various locations within the T7 genome. The mutants include the replacement of nonessential genes 1.4, 1.5, and 1.6 as well as 4.3, 4.5, and 4.7 with GFP, and the insertion of GFP before and after gene 10A. Among these mutants, the highest GFP expression came from inserting GFP just before gene 10A. This made sense, as it allowed the GFP gene to take advantage of the multiple strong promoters that regulate 10A without having to wait for 10A (a relatively large gene) to be transcribed before it. During the course of these experiments, we found that the expression level of our inserted gene correlated very strongly with those of nearby genes. This reinforced the idea that T7's genomic structure, especially the location and multiplicity of promoters, heavily dictates gene expression levels within our model. Furthermore, we saw that replacing individual nonessential genes with GFP did not have any significant effect on expression in the rest of the genome, suggesting that other genes could be engineered into those locations in the future. In addition, we saw no major changes in burst size or lysis time for the GFP mutants compared to the wild type, suggesting that the mutations do not have a significant effect on phage fitness.
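The relationship between an inserted gene's expression and that of a neighboring gene can be quantified with a Pearson correlation coefficient across insertion sites. The sketch below is one way to do this in Python. The choice of Pearson correlation and the function name `pearson` are assumptions for this example; the text does not say how the correlation was measured.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Here `xs` could hold GFP abundance at each insertion site and `ys`
    the abundance of the neighboring native gene at the same site
    (a hypothetical pairing chosen for illustration).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up abundances for illustration only, not simulation results.
    gfp = [120, 450, 3000, 9000]
    neighbor = [100, 500, 2800, 9500]
    print(round(pearson(gfp, neighbor), 3))
```

A value close to 1 would indicate that GFP abundance tracks the abundance of its neighbors, as described above.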

Figure 3. Simulated protein abundances expressed by T7 bacteriophage genome at lysis with sfGFP inserted before gene 10A. Error bars represent 5th and 95th quantiles for each gene. Note: Class 1 genes (shown in pink) are present in this simulation but may be difficult to see due to their low expression levels.

Moving the Holin Gene Upstream to Reduce Lysis Time

One of the primary goals of our project was to reduce the lysis time of T7 bacteriophage in order to make it more useful as a reporter for contaminated water sources. We decided to test how moving gene 17.5 affects the time point at which our simulated cells lyse. Gene 17.5 codes for holin, a protein that is responsible for determining how fast the cell lyses during an infection. We developed a calculator that determines lysis time from our simulation data by finding the time point at which holin abundance reaches the amount present when lysis occurs in the wild-type genome (12341 proteins). We found that moving the holin gene before gene 10A (the point of highest expression) reduced lysis time by 40% compared to the wild type, the greatest decrease among all the lysis mutants tested. One consequence of reducing the lysis time so drastically was that the abundances of all protein products at lysis were about a fourth of what they were for the wild type. This is also reflected in a 62% decrease in burst size. We had to take these consequences of decreasing lysis time into account when designing our final genome.
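The threshold-crossing idea behind the lysis time calculator can be sketched in a few lines of Python. This is a minimal illustration, not the calculator itself: the function names, the trajectory layout (parallel lists of time points and holin counts), and the handling of runs that never reach the threshold are assumptions. Only the threshold of 12341 holin proteins comes from the text.

```python
HOLIN_LYSIS_THRESHOLD = 12341  # holin count at wild-type lysis (from the text)


def lysis_time(times, holin_counts, threshold=HOLIN_LYSIS_THRESHOLD):
    """Return the first time point at which holin reaches the threshold.

    Returns None if the threshold is not reached before the simulation ends.
    """
    for t, count in zip(times, holin_counts):
        if count >= threshold:
            return t
    return None


def summarize_lysis(trajectories, threshold=HOLIN_LYSIS_THRESHOLD):
    """Collect lysis times over many simulations.

    `trajectories` is a list of (times, holin_counts) pairs, one per
    simulation (a hypothetical layout). Returns the list of lysis times
    and the number of simulations that did not lyse.
    """
    lysed = []
    did_not_lyse = 0
    for times, counts in trajectories:
        t = lysis_time(times, counts, threshold)
        if t is None:
            did_not_lyse += 1
        else:
            lysed.append(t)
    return lysed, did_not_lyse


def percent_reduction(wild_type_value, mutant_value):
    """Percent decrease of a mutant value relative to the wild type."""
    return 100.0 * (wild_type_value - mutant_value) / wild_type_value


if __name__ == "__main__":
    # Made-up trajectories for illustration only; time units are arbitrary.
    runs = [
        ([0, 10, 20, 30], [0, 4000, 13000, 20000]),
        ([0, 10, 20, 30], [0, 1000, 3000, 9000]),
    ]
    print(summarize_lysis(runs))
    print(percent_reduction(1000, 600))
```

Returning None for runs that never reach the threshold keeps them out of any average lysis time while still allowing them to be counted separately.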

Figure 4. Simulated protein abundances expressed by T7 bacteriophage genome at lysis with the holin gene moved before gene 10A. Error bars represent 5th and 95th quantiles for each gene. Note: Class 1 genes (shown in pink) are present in this simulation but may be difficult to see due to their low expression levels.
Figure 5. Histogram of lysis times calculated from the wild-type T7 genome. The number of cells that "did not lyse" represents the number of simulations where holin abundance did not reach the lysis point before the simulation ended.
Figure 6. Histogram of lysis times calculated from the T7 genome with the holin gene moved before gene 10A. The number of cells that "did not lyse" represents the number of simulations where holin abundance did not reach the lysis point before the simulation ended.

Testing Final Genome Design

After testing multiple constructs, we determined that the optimal design for our PhastPhage genome consisted of inserting GFP before gene 10A and moving the holin gene between the two, such that the mutated region consists of GFP -> Holin -> 10A. We found that placing the holin gene before GFP in this sequence only marginally decreased lysis time while greatly reducing the abundance of GFP at lysis. Likewise, putting 10A before both genes reduced their respective effects. This design yielded the best results for reducing lysis time and increasing GFP expression, making it the prime candidate for building and testing in the lab during phase 2 of our project. We also tested this design with the nonessential genes 1.4, 1.5, 1.6, 4.3, 4.5, and 4.7 all deleted, and found no significant difference in GFP expression or lysis time compared to our final genome.

Figure 7. Simulated protein abundances expressed by T7 bacteriophage genome at lysis with sfGFP inserted before gene 10A and holin gene moved before 10A and after GFP. Error bars represent 5th and 95th quantiles for each gene. Note: Class 1 genes (shown in pink) are present in this simulation but may be difficult to see due to their low expression levels.