Difference between revisions of "Team:AUC-EGYPT/Model"

Line 25: Line 25:
 
   padding: 50px;
 
   padding: 50px;
 
   width: 980px;
 
   width: 980px;
   height: 13200px;
+
   height: 7000px;
 
   margin-left: 50px;
 
   margin-left: 50px;
 
}
 
}

Revision as of 02:52, 27 October 2020

Model


Modelling

  • RNA Folding :
  • In silico Thermodynamic Modeling
  • Reflections from the structural model
  • References

    We improved the performance of Ulaval’s Toeholder software and dedicated it to design a class of mammalian toehold switches. However, designing toehold switches is not enough. Structural modeling is essential to predict the stability and the favorability of binding between the toehold switch and the trigger. Structural predictions based on free energy calculations do not accurately simulate the folding of mRNAs. There is a lot more to the story: for instance, free energy models do not take into account chaperone-aided folding of RNAs (Tijerina & Russell, 2013). Nonetheless, free energy models are utilized extensively in studies to predict the secondary structure of mRNA(Huston et al., 2020; Rangan et al., 2020).

    RNA Folding

    RNAs are single stranded, yet they are not always free to bind a complementary target. They fold into all sorts of secondary structures (Fig 1), rendering the search for triggers for toehold switch design more difficult. Pardee et al (2016) designed toehold switches to detect zika virus. They scanned the genome of zika virus for regions of low secondary structures, i.e. regions with minimal base pairing. They reasoned that the trigger needs to be accessible by the toehold switch in order to bind to and activate it. In our design algorithm, we searched for candidate triggers of lengths 30 nucleotides that display a minimum of 15 unpaired region. The NUPACK suite was utilized to model the minimal free energy structure(Zadeh, Steenberg, et al., 2011; Zadeh, Wolfe, et al., 2011).



    Figure 1 Structural Model of the S2 subunit of the Spike mRNA generated by the NUPACK Suite



    In silico Thermodynamic Modeling



    The generated toehold switches were ranked based on five criteria:

    1) Minimal Free Energy (MFE) Difference: The binding of the toehold switch to the trigger has to be favorable over the unbound state of both the toehold switch and the trigger. The change in Gibbs free energy (ΔG) is a metric for measuring the favorability and the spontaneity of binding. Since a more negative ΔG is indicative of more favorability, ΔG of the bound state has to be lower than the sum of both ΔG of the toehold and the trigger in the unbound state.

    MFE = ΔGbound - (ΔGtoehold + ΔGtrigger)

    MFE < 0 2) Free Energy of the toehold switch: In order to fold simultaneously into the desired secondary structure and to maintain that structure, toehold switches need to display a negative ΔG. The more negative the ΔG, the more stable the toehold switch structure.

    ΔGtoehold <0

    3) ΔGRBS-Linker: The RBS-Linker is the sequence of nucleotides starting from the RBS loop to the end of the 21-nucleotide linker. Green et a(Green et al., 2014)l. (2014) inferred a strong correlation between the ΔGRBS_Linker and the dynamic range, a measure of efficacy, of the toehold switches. The closer the value of ΔGRBS_Linker to zero, the more efficient the toehold switch is.

    ΔGRBS-Linker ~ 0

    4) Perfect Matching: from the structural model of the bound state predicted by NUPACK, we can draw conclusions about the base matching between the toehold and the trigger. Sense activation and binding of toehold switches is sequence specific, we preferred toehold switches that are predicted to bind to the exact location in the trigger sequence.

    5) Orthogonality to the human genome: orthogonality is a mean to minimize false positives. Since our circuit is going to be delivered to human hosts, it was necessary to ensure minimal cross talk inside the host cell. Accordingly, we cross referenced the s2 subunit of the spike mRNA against the human genome (Camacho et al., 2009) and found out that the S2 domain is unique and orthogonal the human genome.

    Upon running the tool on the S2 mRNA sequence (~ 1800 bp), the tool generated a set of 142 switches. Of them, 68% are predicted to bind with 100% matching in the correct positions (Fig 2). This matching test is similar to the results generate by Ulaval https://2019.igem.org/Team:ULaval/Model. After eliminating toeholds that showed imperfect matching, we selected the top 30 toeholds that displayed the lowest MFE difference. Worth mentioning is all toehold switches are predicted to bind to the triggers spontaneously (Fig 3). Out of this pool, top 20 that showed minimal ΔGtoehold were shortlisted. The top 15 switches that displayed a ΔGRBS-Linker close to zero were finalized to be characterized in Phase II next season (See Parts).



    Figure 2: Percentage of accurate matching between toeholds and corresponding triggers. 68% are predicted to bind accurately.







    Figure 3 Distribution of the MFE differences the MFE difference is below zero for all designed all the 143 toehold switches. This means that all are predicted to bind spontaneously to the trigger.



    Ideally, Toehold switches feature a stable RBS loop flanked by a stable stem. They also feature a toehold domain, the free domain which is supposed to start the strand displacement between the toehold and the trigger(Green et al., 2014). All the 15 selected toeholds featured a stable RBS loop flanked by a stable stem. Only three of them had a toehold domain that is completely unpaired (Fig 4). Others had the toehold domain interact with the 21 nt domain.

    In the original publication, Pardee et al. (2016) hypothesized that the more accessible the trigger region is, the better the toehold is going to bind. Team Ulaval created a model to disprove this hypothesis. They modeled the relationship between the number of accessible positions and the free energy of the bound state (ΔGbound). Remarkably, there was no correlation between the two variables. We performed the same linear regression model on our set of 143 toehold switches (roughly equal sample size), and reached the same conclusion as the correlation coefficient R2 was 0.0147 (Fig 5).



    Figure 5: Correlation between availability of bases and favorability of binding In this sample of 143 toehold switches, there seems to be no correlation between the number of accessible (unpaired) nucleotides and the binding energy of the toehold to the trigger.



    Despite the prediction that there exists no correlation between accessibility of nucleotides and spontaneity of binding, there are two limitations to the way the model was constructed. First, the toehold switches fed to the model were originally designed with one critical constraint: candidate triggers had to display a minimal number of free bases: 10 in the case of Ulaval and 15 in our case. To construct a more valid model, we generated a new set of toehold switches with no constraint on the number available bases. Second, the sample size in both models is small. Since our improved version of Toeholder can process larger files in a significantly shorter amount time without crashing, we wanted to utilize this feature to generate a larger sample of data. Accordingly, we fed Toeholder a Fasta file containing the full genome of SARS-CoV-2 (~ 30,000 kb) and set the minimal number of accessible bases to zero. The tool generated ~11,000 toehold switch upon which we reperformed the linear regression analysis (Fig 6). Remarkably, we reached the same conclusion that there seems to be no correlation between the number of accessible positions and the free energy of binding.



    Figure 6 Correlation between accessible positions and free energy of bindind In this sample of 10827 toehold switches, there seems to be no correlation between the number of accessible (unpaired) nucleotides and the binding energy of the toehold to the trigger. There was no restriction on the minimum number of unpaired positions when designing the toehold switches



    Reflections from the structural model

    The linear regression model predicted no correlation between the number of unpaired bases and the free energy of binding between the toehold switch and the trigger. Accordingly, we redesigned our toehold switches by eliminating the criteria for unpaired bases. We fed the S2 sequence to our software and set the minimum number of unpaired regions to zero. The generated toeholds (see Parts) were ranked based on the aforementioned five criteria.

    References

    Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., & Madden, T. L. (2009). BLAST+: Architecture and applications. BMC Bioinformatics, 10, 421. https://doi.org/10.1186/1471-2105-10-421

    Green, A. A., Silver, P. A., Collins, J. J., & Yin, P. (2014). Toehold Switches: De-Novo-Designed Regulators of Gene Expression. Cell, 159(4), 925–939. https://doi.org/10.1016/j.cell.2014.10.002

    Huston, N. C., Wan, H., de Cesaris Araujo Tavares, R., Wilen, C., & Pyle, A. M. (2020). Comprehensive in-vivo secondary structure of the SARS-CoV-2 genome reveals novel regulatory motifs and mechanisms. BioRxiv. https://doi.org/10.1101/2020.07.10.197079

    Pardee, K., Green, A. A., Takahashi, M. K., Braff, D., Lambert, G., Lee, J. W., Ferrante, T., Ma, D., Donghia, N., Fan, M., Daringer, N. M., Bosch, I., Dudley, D. M., O’Connor, D. H., Gehrke, L., & Collins, J. J. (2016). Rapid, Low-Cost Detection of Zika Virus Using Programmable Biomolecular Components. Cell, 165(5), 1255–1266. https://doi.org/10.1016/j.cell.2016.04.059

    Rangan, R., Zheludev, I. N., & Das, R. (2020). RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses. BioRxiv. https://doi.org/10.1101/2020.03.27.012906

    Tijerina, P., & Russell, R. (2013). The Roles of Chaperones in RNA Folding. In R. Russell (Ed.), Biophysics of RNA Folding (pp. 205–230). Springer. https://doi.org/10.1007/978-1-4614-4954-6_11

    Zadeh, J. N., Steenberg, C. D., Bois, J. S., Wolfe, B. R., Pierce, M. B., Khan, A. R., Dirks, R. M., & Pierce, N. A. (2011). NUPACK: Analysis and design of nucleic acid systems. Journal of Computational Chemistry, 32(1), 170–173. https://doi.org/10.1002/jcc.21596

    Zadeh, J. N., Wolfe, B. R., & Pierce, N. A. (2011). Nucleic acid sequence design via efficient ensemble defect optimization. Journal of Computational Chemistry, 32(3), 439–452. https://doi.org/10.1002/jcc.21633