Team:AUC-EGYPT/Model



We improved the performance of Ulaval’s Toeholder software and dedicated it to design a class of mammalian toehold switches. However, designing toehold switches is not enough. Structural modeling is essential to predict the stability and the favorability of binding between the toehold switch and the trigger. Structural predictions based on free energy calculations do not accurately simulate the folding of mRNAs. There is a lot more to the story: for instance, free energy models do not take into account chaperone-aided folding of RNAs (Tijerina & Russell, 2013). Nonetheless, free energy models are utilized extensively in studies to predict the secondary structure of mRNA(Huston et al., 2020; Rangan et al., 2020).





RNA Folding

RNAs are single stranded, yet they are not always free to bind a complementary target. They fold into all sorts of secondary structures (Fig 1), rendering the search for triggers for toehold switch design more difficult. Pardee et al (2016) designed toehold switches to detect zika virus. They scanned the genome of zika virus for regions of low secondary structures, i.e. regions with minimal base pairing. They reasoned that the trigger needs to be accessible by the toehold switch in order to bind to and activate it. In our design algorithm, we searched for candidate triggers of lengths 30 nucleotides that display a minimum of 15 unpaired region. The NUPACK suite was utilized to model the minimal free energy structure(Zadeh, Steenberg, et al., 2011; Zadeh, Wolfe, et al., 2011).



Figure 1 Structural Model of the S2 subunit of the Spike mRNA generated by the NUPACK Suite







In silico Thermodynamic Modeling



The generated toehold switches were ranked based on five criteria:

1) Minimal Free Energy (MFE) Difference: The binding of the toehold switch to the trigger has to be favorable over the unbound state of both the toehold switch and the trigger. The change in Gibbs free energy (ΔG) is a metric for measuring the favorability and the spontaneity of binding. Since a more negative ΔG is indicative of more favorability, ΔG of the bound state has to be lower than the sum of both ΔG of the toehold and the trigger in the unbound state.

MFE = ΔGbound - (ΔGtoehold + ΔGtrigger)

MFE < 0 2) Free Energy of the toehold switch: In order to fold simultaneously into the desired secondary structure and to maintain that structure, toehold switches need to display a negative ΔG. The more negative the ΔG, the more stable the toehold switch structure.

ΔGtoehold <0

3) ΔGRBS-Linker: The RBS-Linker is the sequence of nucleotides starting from the RBS loop to the end of the 21-nucleotide linker. Green et a(Green et al., 2014)l. (2014) inferred a strong correlation between the ΔGRBS_Linker and the dynamic range, a measure of efficacy, of the toehold switches. The closer the value of ΔGRBS_Linker to zero, the more efficient the toehold switch is.

ΔGRBS-Linker ~ 0

4) Perfect Matching: from the structural model of the bound state predicted by NUPACK, we can draw conclusions about the base matching between the toehold and the trigger. Sense activation and binding of toehold switches is sequence specific, we preferred toehold switches that are predicted to bind to the exact location in the trigger sequence.

5) Orthogonality to the human genome: orthogonality is a mean to minimize false positives. Since our circuit is going to be delivered to human hosts, it was necessary to ensure minimal cross talk inside the host cell. Accordingly, we cross referenced the s2 subunit of the spike mRNA against the human genome (Camacho et al., 2009) and found out that the S2 domain is unique and orthogonal the human genome.

Upon running the tool on the S2 mRNA sequence (~ 1800 bp), the tool generated a set of 142 switches. Of them, 68% are predicted to bind with 100% matching in the correct positions (Fig 2). This matching test is similar to the results generate by Ulaval https://2019.igem.org/Team:ULaval/Model. After eliminating toeholds that showed imperfect matching, we selected the top 30 toeholds that displayed the lowest MFE difference. Worth mentioning is all toehold switches are predicted to bind to the triggers spontaneously (Fig 3). Out of this pool, top 20 that showed minimal ΔGtoehold were shortlisted. The top 15 switches that displayed a ΔGRBS-Linker close to zero were finalized to be characterized in Phase II next season Parts



Figure 2: Percentage of accurate matching between toeholds and corresponding triggers. 68% are predicted to bind accurately.







Figure 3 Distribution of the MFE differences the MFE difference is below zero for all designed all the 143 toehold switches. This means that all are predicted to bind spontaneously to the trigger.



Ideally, Toehold switches feature a stable RBS loop flanked by a stable stem. They also feature a toehold domain, the free domain which is supposed to start the strand displacement between the toehold and the trigger(Green et al., 2014). All the 15 selected toeholds featured a stable RBS loop flanked by a stable stem. Only three of them had a toehold domain that is completely unpaired (Fig 4). Others had the toehold domain interact with the 21 nt domain.

In the original publication, Pardee et al. (2016) hypothesized that the more accessible the trigger region is, the better the toehold is going to bind. Team Ulaval created a model to disprove this hypothesis. They modeled the relationship between the number of accessible positions and the free energy of the bound state (ΔGbound). Remarkably, there was no correlation between the two variables. We performed the same linear regression model on our set of 143 toehold switches (roughly equal sample size), and reached the same conclusion as the correlation coefficient R2 was 0.0147 (Fig 5).



Figure 5: Correlation between availability of bases and favorability of binding In this sample of 143 toehold switches, there seems to be no correlation between the number of accessible (unpaired) nucleotides and the binding energy of the toehold to the trigger.



Despite the prediction that there exists no correlation between accessibility of nucleotides and spontaneity of binding, there are two limitations to the way the model was constructed. First, the toehold switches fed to the model were originally designed with one critical constraint: candidate triggers had to display a minimal number of free bases: 10 in the case of Ulaval and 15 in our case. To construct a more valid model, we generated a new set of toehold switches with no constraint on the number available bases. Second, the sample size in both models is small. Since our improved version of Toeholder can process larger files in a significantly shorter amount time without crashing, we wanted to utilize this feature to generate a larger sample of data. Accordingly, we fed Toeholder a Fasta file containing the full genome of SARS-CoV-2 (~ 30,000 kb) and set the minimal number of accessible bases to zero. The tool generated ~11,000 toehold switch upon which we reperformed the linear regression analysis (Fig 6). Remarkably, we reached the same conclusion that there seems to be no correlation between the number of accessible positions and the free energy of binding.





Figure 6 Correlation between accessible positions and free energy of bindind In this sample of 10827 toehold switches, there seems to be no correlation between the number of accessible (unpaired) nucleotides and the binding energy of the toehold to the trigger. There was no restriction on the minimum number of unpaired positions when designing the toehold switches







Reflections from the structural model

The linear regression model predicted no correlation between the number of unpaired bases and the free energy of binding between the toehold switch and the trigger. Accordingly, we redesigned our toehold switches by eliminating the criteria for unpaired bases. We fed the S2 sequence to our software and set the minimum number of unpaired regions to zero. The generated toeholds Parts were ranked based on the aforementioned five criteria.

Model description

In our designed circuit, the CMV works as the input that gets transcribed (r1TX) into the ToeholdmRNA. In the presence of SARSCoV2mRNA. The Toehold gets activated, unfolds and bind to the SARSCoV2mRNA in a reversible reaction.

The result is a Toehold::SARSCoV2mRNAcomplex that gets translated to produce the GAL4BD_VB16 that activates the UAS. When the Toehold binds to the UAS, it initiates a transcription process that produces siRNA. The siRNA binds to the SARSCoV2mRNA and initiates the degradation process at a rate higher than the natural degradation rate of the virus. The schematic equations of the whole reaction process is shown below:






The circuit DNA transcription and translational reactions were assigned a Michaelis-Menten kinetics to account for the saturation of the production rates. The constants Ktx and ktx are the Michaelis-Menten constant and the reaction rate respectively for the transcription reaction were obtained from [1],[2].

Similarly, translation constants Ktl and ktl which are Michaelis-Menten constant and the reaction rate respectively could also be obtained from [1]. However, these values are a proxy to the real values we should be using as they are based on a cell free system, and our cell is a Eukaryotic cell.


We chose a mass action kinetics to describe the binding reaction of the Toehold with the SARSCoV2 and the GAL4BD_VB16 which is the activator with the UAS as these are reversable reactions where the rate of the reaction depends on the concentration of the reactants and the products. For reaction (4) we could not get the individual values for the kf and kr. However, we could use the ratio between the two constants (Kd) which is also a proxy to constant we should be using [1] based on the assumption that this reaction reaches equilibrium state


SARSCoV2 is produced at a rate P which is the viral production rate per cell [3], and it is degraded either naturally or when combined to the siRNA. The natural degradation rate is also a quired from [3].

On the other hand, the degradation rate of SARSCoV2 is accelerated when combined with the siRNA. We could not find a study in literature that leads us to exact value of the degradation constant kcordeg. However, we are expecting this rate to be much higher than the natural degradation rate of the virus.





References

Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., & Madden, T. L. (2009). BLAST+: Architecture and applications. BMC Bioinformatics, 10, 421. https://doi.org/10.1186/1471-2105-10-421

Green, A. A., Silver, P. A., Collins, J. J., & Yin, P. (2014). Toehold Switches: De-Novo-Designed Regulators of Gene Expression. Cell, 159(4), 925–939. https://doi.org/10.1016/j.cell.2014.10.002

Huston, N. C., Wan, H., de Cesaris Araujo Tavares, R., Wilen, C., & Pyle, A. M. (2020). Comprehensive in-vivo secondary structure of the SARS-CoV-2 genome reveals novel regulatory motifs and mechanisms. BioRxiv. https://doi.org/10.1101/2020.07.10.197079

Pardee, K., Green, A. A., Takahashi, M. K., Braff, D., Lambert, G., Lee, J. W., Ferrante, T., Ma, D., Donghia, N., Fan, M., Daringer, N. M., Bosch, I., Dudley, D. M., O’Connor, D. H., Gehrke, L., & Collins, J. J. (2016). Rapid, Low-Cost Detection of Zika Virus Using Programmable Biomolecular Components. Cell, 165(5), 1255–1266. https://doi.org/10.1016/j.cell.2016.04.059

Rangan, R., Zheludev, I. N., & Das, R. (2020). RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses. BioRxiv. https://doi.org/10.1101/2020.03.27.012906

Tijerina, P., & Russell, R. (2013). The Roles of Chaperones in RNA Folding. In R. Russell (Ed.), Biophysics of RNA Folding (pp. 205–230). Springer. https://doi.org/10.1007/978-1-4614-4954-6_11

Zadeh, J. N., Steenberg, C. D., Bois, J. S., Wolfe, B. R., Pierce, M. B., Khan, A. R., Dirks, R. M., & Pierce, N. A. (2011). NUPACK: Analysis and design of nucleic acid systems. Journal of Computational Chemistry, 32(1), 170–173. https://doi.org/10.1002/jcc.21596

Zadeh, J. N., Wolfe, B. R., & Pierce, N. A. (2011). Nucleic acid sequence design via efficient ensemble defect optimization. Journal of Computational Chemistry, 32(3), 439–452. https://doi.org/10.1002/jcc.21633

Senoussi, A. et al. Quantitative characterization of translational riboregulators using an in vitro transcription-translation system. bioRxiv 290403 (2018). doi:10.1101/290403.

Stögbauer, T., Windhager, L., Zimmer, R. & Rädler, J. O. Experiment and mathematical modeling of gene expression dynamics in a cell-free system. Integr. Biol. 4, 494 (2012).

Wang, Sunpeng, et al. "Modeling the viral dynamics of SARS-CoV-2 infection." Mathematical biosciences 328 (2020): 108438.