Team:IISER Berhampur/Model

IISER-BPR IGEM






1. Mutational Analysis


Before going into this section, let’s first look at the types of DENV proteins.




The RNA genome of DENV encodes 3 structural proteins (C, prM and E) and 7 non-structural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, NS5). The structural proteins are the components of the virus particle itself, such as the proteins that make up its capsid or envelope. The non-structural proteins are also encoded by the viral genome but are expressed in the infected host cells, where they play important roles in processes such as replication of the virus.


Now let’s come back to the topic of Mutational analysis.


To answer the question ‘Which proteins to select as the target of the peptide inhibitors: Non-Structural or Structural proteins?’, we studied the mutational landscape of the DENV genome, to infer which proteins are relatively stable in terms of their mutational entropy so that the inhibitor is not quickly rendered ineffective by mutations in the genome.


To study the genome data of dengue virus, we used the data available on the website of the open-source Nextstrain project (https://nextstrain.org/dengue). This website provides data on the genetic diversity of the DENV genome over several years. For each codon of the structural and non-structural proteins of the dengue virus, it reports the number of mutation events and the Shannon entropy (a measure of how variable that position is across the sampled genomes). It draws on results from numerous studies from all over the world and provides a timeline for the evolution of the DENV genome and the geographical distribution of the different genotypes.
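As a rough illustration of the entropy measure mentioned above, the sketch below computes the Shannon entropy of a single alignment column. The function name and the toy residues are hypothetical, and the natural logarithm is an assumption; Nextstrain computes its own values, so this is only meant to show what the quantity captures.

```python
import math
from collections import Counter


def shannon_entropy(residues):
    """Shannon entropy (in nats) of the residues observed at one position."""
    counts = Counter(residues)
    total = sum(counts.values())
    # Sum p * ln(1/p) over every residue type seen at this position.
    return sum((n / total) * math.log(total / n) for n in counts.values())


# Toy alignment columns (hypothetical data, one character per sampled genome).
conserved = "KKKKKKKK"
variable = "KKRRTTKK"

print(round(shannon_entropy(conserved), 3))
print(round(shannon_entropy(variable), 3))
```

A fully conserved position gives an entropy of zero, and the value grows as more residue types appear at similar frequencies.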


To understand the difference between the rate of mutation in NS proteins and that in structural proteins, we took into account the number of mutation events that have occurred in each codon of all the genes of DENV. Out of these, we kept only the positions with 10 or more mutation events. Some mutations are expected in most genes due to background mutational events, but a markedly higher number at a position suggests that some kind of selection pressure is acting there. We therefore set 10 mutation events as the threshold to isolate positions that mutate more often than the background, which may indicate a greater tendency to mutate. These positions would be more prone to change, and an inhibitor targeted against them could be rendered ineffective.


After this, the selected codon positions in each gene were plotted against the number of mutations and compared.
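The selection step described above can be sketched as follows. This is a minimal illustration, assuming the data have been exported as (gene, codon, number of events) records; the record layout, the function name and the sample values are hypothetical and do not come from the Nextstrain dataset.

```python
from collections import defaultdict

THRESHOLD = 10  # minimum number of mutation events, as described in the text


def high_mutation_positions(events, threshold=THRESHOLD):
    """Group codon positions with at least `threshold` events by gene."""
    selected = defaultdict(list)
    for gene, codon, n_events in events:
        if n_events >= threshold:
            selected[gene].append((codon, n_events))
    return dict(selected)


# Hypothetical records: (gene, codon position, number of mutation events).
records = [
    ("E", 52, 14),
    ("E", 203, 9),
    ("NS1", 178, 11),
    ("NS3", 350, 4),
    ("C", 26, 10),
]

selected = high_mutation_positions(records)
for gene, hits in selected.items():
    print(gene, hits)
```

Genes with no codon at or above the threshold simply do not appear in the result, which mirrors how they would be absent from the plots.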


The results obtained were as follows:


  • In each of the four serotypes, the C and E structural proteins, along with the NS1 protein, contained at least one codon that had undergone 10 or more mutations, implying that these proteins have evolved the most over time. Each of the other NS proteins had no codon above this threshold in at least one serotype.
  • The comparatively small number of high-mutation positions in the structural proteins can be accounted for by the fact that structural proteins are shorter and contain fewer amino acids. This also supports the view that the rate of mutation is high in structural proteins: despite their shorter length, the C and E proteins have a comparably high number of mutations in all 4 serotypes.

Hence, we concluded that non-structural proteins would be a better target for our peptide inhibitor, since structural proteins were seen to be more prone to mutations. Choosing a more mutationally stable protein as the target would help the peptide inhibitor remain viable for a longer time.
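Because protein length affects how many high-mutation positions a protein can accumulate, one way to compare proteins is to normalize the count by length. The sketch below is only an illustration of that idea: the hit counts are hypothetical, the lengths are approximate illustrative values that should be checked against the reference genome, and this normalization is an assumption rather than the analysis we actually performed.

```python
def hits_per_100_codons(hit_counts, lengths):
    """Number of high-mutation codons per 100 codons of each protein."""
    return {
        gene: 100.0 * hit_counts[gene] / lengths[gene]
        for gene in hit_counts
    }


# Approximate protein lengths in amino acids (illustrative values only).
lengths = {"C": 114, "E": 495, "NS5": 900}
# Hypothetical counts of codons with 10 or more mutation events.
hit_counts = {"C": 3, "E": 12, "NS5": 15}

densities = hits_per_100_codons(hit_counts, lengths)
for gene, density in sorted(densities.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{gene}: {density:.2f} high-mutation codons per 100 codons")
```

With such a normalization, a long protein with many hits can still rank below a short protein with only a few.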


  • Among the non-structural proteins, our target protein NS5 (which was selected later on) did have a significantly high number of mutations, which could mean that NS5 also faces high selection pressure. This could again be partly due to the long length of NS5. Moreover, on closer inspection, we found a high number of mutations in the region where STAT2 interacts with NS5 (in DENV1 and especially DENV2, the serotypes in which this interaction has been shown). This suggests that our chosen interaction is important in DENV pathogenesis, since selection pressure is high in that region. For this reason, we stayed with our choice of targeting the NS5-STAT2 interaction (more on this in the ‘Protein-Protein Interaction Studies’ section) in spite of NS5 having a greater number of mutations.







2. Graphs


1) DENV-1



2) DENV-2



3) DENV-3



4) DENV-4










3. References


Hadfield et al., Nextstrain: real-time tracking of pathogen evolution, Bioinformatics (2018)









1. Protein-Protein Interaction Studies and Validations


To validate the working of FRaPPe, we decided to select a particular DENV-associated protein-protein interaction (PPI), design peptide inhibitors against it and finally validate the efficiency of the designed peptide inhibitors in inhibiting the target PPI using our reporter system. This also serves as an exemplar pipeline which can be followed when using our reporter system to build peptide inhibitors against various PPIs.


As we had decided to go ahead with non-structural proteins (NSPs), we searched the available literature on DENV interactomes to find PPIs involving DENV NSPs (with the interacting partners being either DENV proteins or human proteins) and their specific details (e.g. interacting domains, amino acids involved in the interactions etc.). Based on this, we prepared diagrams representing the interactome of each NSP.




Based on all the data we had collected, we wanted to select a particular PPI as our target. For this selection, we applied several criteria, such as the amount of data available, whether structures are available and how significant the role of the PPI is in DENV pathogenesis. Using these criteria, we narrowed our choice down to the interaction between Dengue Virus Non-structural Protein 5 (DENV NS5) and Human Signal Transducer and Activator of Transcription-2 (hSTAT2).


Once the target PPI was selected, we returned to the literature to mine more data on the DENV NS5-hSTAT2 interaction.









Brief


Here we will give a brief overview (mostly structural details) of what we obtained through the data-mining (the molecular mechanisms and pathway through which NS5-STAT2 interaction helps in DENV pathogenesis and the detailed biology relating it to our project will be discussed separately):


  • hSTAT2 consists of 5 domains, which are as follows (from N to C terminal): Amino-terminal domain, Coiled-coil domain (139-316), DNA-binding domain (317-458), SH2 domain (568-686, 581-700) and Carboxy-terminal domain.
  • In contrast, DENV NS5 consists of 2 domains: an N-terminal methyltransferase domain and a C-terminal RNA-dependent RNA polymerase domain.
  • There are four serotypes of DENV (1-4), between which the structures, functions and mechanisms vary, although there is a lot of overlap too. The NS5-STAT2 interaction has mostly been studied in DENV2 and, to a lesser extent, in DENV1. This PPI is nevertheless thought to apply to all serotypes because of the relatively high percentage of NS5 sequence identity between DENV2 and the other 3 serotypes.
  • It is hypothesized that NS5 and STAT2 first bind each other through specific regions present on each of them, and that NS5 then mediates STAT2 degradation, for which specific regions on NS5 and STAT2 are also responsible. In this project, we are interested only in inhibiting the binding between NS5 and STAT2 (which will eventually inhibit the degradation of STAT2). Though several studies support this hypothesis, evidence of a direct interaction between NS5 and STAT2 is still lacking (some speculate that NS5 and STAT2 may actually be binding to a third protein instead of to each other).
  • In spite of the lack of concrete evidence for a direct NS5-STAT2 interaction, the ability of NS5 to bind STAT2 has been mapped to a region within the hSTAT2 coiled-coil domain. Studies show that amino acids 181-200 of STAT2 are indispensable for this interaction. Similarly, residues 202 to 306 in the N-terminal region of NS5 from the DENV2 serotype have been reported to interact directly with STAT2.
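The sequence identity argument used above for extending the PPI to all serotypes can be illustrated with a simple percent identity calculation over a pairwise alignment. This is a minimal sketch: the function name and the toy fragments are hypothetical (they are not real NS5 sequences), and ignoring gapped columns is an assumption, since alignment tools differ in how they define identity.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments (hypothetical, not real NS5 sequences).
frag_a = "MKT-LLVAGR"
frag_b = "MKTALIVA-R"

print(percent_identity(frag_a, frag_b))
```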

The next job was to create a model of the NS5-STAT2 complex (as no models based on experimentally solved structures were available in the literature). For that, first, we went through the RCSB Protein Data Bank website (http://www.rcsb.org/) and searched for the available structures of DENV NS5 and hSTAT2, based on which we finally selected these two PDB structures for modelling:


  • PDB ID 5ZQK: Dengue Virus Non Structural Protein 5 (Dengue Virus 2). This was used to extract the DENV NS5 (monomeric) structure.
  • PDB ID 6WCZ: CryoEM Structure of full-length ZIKV NS5-hSTAT2 complex. This was used to extract the STAT2 structure.
  • Then, using the HawkDock server (http://cadd.zju.edu.cn/hawkdock/), we performed global docking of the extracted NS5 and STAT2 structures and filtered the output models by how well they satisfied the experimental results mentioned earlier. The rank 7 model satisfied the experimental results very well, with a score of -4037.07. The binding free energy was predicted to be -15.24 kcal/mol by MM/GBSA analysis. We named it ‘Model 1’.
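The filtering of docked models against the experimental results can be sketched as below. This is a hedged illustration, not the procedure we ran: the residue ranges are the ones reported in the Brief above, while the scoring function, the model names and the interface residue numbers are all hypothetical.

```python
STAT2_REGION = range(181, 201)  # STAT2 residues 181-200 (from the Brief above)
NS5_REGION = range(202, 307)    # NS5 residues 202-306 (from the Brief above)


def restraint_score(ns5_interface, stat2_interface):
    """Fraction of interface residues that fall inside the reported regions."""
    ns5_hits = sum(1 for r in ns5_interface if r in NS5_REGION)
    stat2_hits = sum(1 for r in stat2_interface if r in STAT2_REGION)
    total = len(ns5_interface) + len(stat2_interface)
    return (ns5_hits + stat2_hits) / total if total else 0.0


# Hypothetical interface residue numbers for two docked models:
# (NS5 interface residues, STAT2 interface residues).
models = {
    "model_a": ([15, 20, 33, 410], [500, 502, 640]),
    "model_b": ([210, 225, 260, 301], [183, 190, 199, 240]),
}

ranked = sorted(models, key=lambda m: restraint_score(*models[m]), reverse=True)
for name in ranked:
    print(name, round(restraint_score(*models[name]), 3))
```

A model whose interface lies mostly within the reported regions scores close to 1, so it would be preferred even if its docking rank is not the highest.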

The problem with the available PDB structures for DENV NS5 and hSTAT2 was that none of them covered the complete proteins, so the docked structure was also built from partial proteins. To overcome this, we later used homology modeling to build the complete structures. First, we obtained the sequences for full-length (FL) DENV NS5 (Accession: 5ZQK_A) and full-length hSTAT2 (Accession: 6WCZ_A) from NCBI Protein (https://www.ncbi.nlm.nih.gov/protein/). Then, using SWISS-MODEL (https://swissmodel.expasy.org/), we created structures of the complete DENV NS5 (Template: 5zqk.1.A) and hSTAT2 (Template: 6ux2.1.A) proteins by homology modeling. Using these two structures, we prepared the model of the FL DENV NS5-FL hSTAT2 complex (Model 2) with a procedure very similar to the one adopted for the earlier complex model (Model 1).


Based on this complex structure (Model 2), after analysing the positions of the N and C terminals, it seemed better to place heavy protein tags only at the N terminals of both proteins, so that the tags would not greatly constrain or affect the protein structures or the NS5-STAT2 interaction. This insight was taken into consideration while designing the constructs, so as to avoid false results as much as possible and to make our reporter more effective.
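One simple way to reason about tag placement is to measure how far each terminus lies from the interface atoms of the complex. The sketch below illustrates this with toy coordinates; the distance criterion, the function name and all coordinates are assumptions made for illustration, not values taken from Model 2.

```python
import math


def min_distance(point, interface_coords):
    """Smallest Euclidean distance (in angstroms) from a point to the interface."""
    return min(math.dist(point, coord) for coord in interface_coords)


# Toy 3D coordinates in angstroms (hypothetical values).
interface = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
n_terminus = (30.0, 40.0, 0.0)
c_terminus = (6.0, 8.0, 0.0)

n_dist = min_distance(n_terminus, interface)
c_dist = min_distance(c_terminus, interface)
print(f"N-terminus to interface: {n_dist:.1f}")
print(f"C-terminus to interface: {c_dist:.1f}")
```

In this toy case the N-terminus is far from the interface, so a tag placed there would be less likely to interfere with binding than one at the C-terminus.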









3. References:


1. Ashour, Joseph, et al. "Mouse STAT2 restricts early dengue virus replication." Cell host & microbe 8.5 (2010): 410-421.


2. Ashour, Joseph, et al. "NS5 of dengue virus mediates STAT2 binding and degradation." Journal of virology 83.11 (2009): 5408-5418.


3. Aslam, B., et al. "Structural modeling and analysis of dengue-mediated inhibition of interferon signaling pathway." Genetics and molecular research: GMR 14.2 (2015): 4215-37.


4. Boxiao, W., et al. (2020) CryoEM structure of full-length ZIKV NS5-hSTAT2 complex doi: http://doi.org/10.2210/pdb6WCZ/pdb


5. Chen F, Liu H, Sun HY, Pan PC, Li YY, Li D, Hou TJ. Assessing the performance of the MM/PBSA and MM/GBSA methods. 6. Capability to predict protein-protein binding free energies and re-rank binding poses generated by protein-protein docking. Physical Chemistry Chemical Physics, 2016, 18(18):22129-22139.


6. Chew, Miaw-Fang, Keat-Seong Poh, and Chit-Laa Poh. "Peptides as therapeutic agents for dengue virus." International journal of medical sciences 14.13 (2017): 1342.


7. Choubey, Sanjay Kumar, et al. "Structural and functional insights of STAT2-NS5 interaction for the identification of NS5 antagonist–An approach for restoring interferon signaling." Computational Biology and Chemistry 88 (2020): 10733


8. El Sahili, Abbas, and Julien Lescar. "Dengue virus non-structural protein 5." Viruses 9.4 (2017): 91.


9. El Sahili, Abbas, et al. "NS5 from dengue virus serotype 2 can adopt a conformation analogous to that of its Zika virus and Japanese encephalitis virus homologues." Journal of virology 94.1 (2019).


10. Feng T, Chen F, Kang Y, Sun HY, Liu H, Li D, Zhu F, Hou TJ. HawkRank: a new scoring function for protein-protein docking based on weighted energy terms. Journal of Cheminformatics, 2017, 9(1):66.


11. Guex, N., Peitsch, M.C., Schwede, T. Automated comparative protein structure modeling with SWISS-MODEL and Swiss-PdbViewer: A historical perspective. Electrophoresis 30, S162-S173 (2009).


12. H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne. (2000) The Protein Data Bank Nucleic Acids Research, 28: 235-242


13. H.M. Berman, K. Henrick, H. Nakamura (2003) Announcing the worldwide Protein Data Bank Nature Structural Biology 10 (12): 980


14. Hou TJ, Wang JM, Li YY, Wang W. Assessing the performance of the MM/PBSA and MM/GBSA methods: I. The accuracy of binding free energy calculations based on molecular dynamics simulations. Journal of Chemical Information & Modeling, 2011, 51(1):69-82.


15. Hou TJ, Qiao XB, Zhang W, Xu XJ. Empirical Aqueous Solvation Models Based on Accessible Surface Areas with Implicit Electrostatics. Journal of Physical Chemistry B, 2002, 106(43):11295-11304.


17. Mazzon, Michela, et al. "Dengue virus NS5 inhibits interferon-α signaling by blocking signal transducer and activator of transcription 2 phosphorylation." The Journal of infectious diseases 200.8 (2009): 1261-1270.


18. Morrison, Juliet, and Adolfo García-Sastre. "STAT2 signaling and dengue virus infection." Jak-stat 3.1 (2014): e27715.


19. Protein [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; [1988] - [cited 2020 Oct 02]. Available from: https://www.ncbi.nlm.nih.gov/protein/


20. Sun HY, Li YY, Tian S, Xu L, Hou TJ. Assessing the performance of MM/PBSA and MM/GBSA methods. 4. Accuracies of MM/PBSA and MM/GBSA methodologies evaluated by various simulation protocols using PDBbind data set. Physical Chemistry Chemical Physics, 2014, 16(31):16719-16729.


21. Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., Heer, F.T., de Beer, T.A.P., Rempfer, C., Bordoli, L., Lepore, R., Schwede, T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 46, W296-W303 (2018).


22. Weng GQ, Wang EC, Wang Z, Liu H, Li D, Zhu F, Hou TJ. HawkDock: a web server to predict and analyze the structures of protein-protein complexes based on computational docking and MM/GBSA. Nucleic Acids Research, 2019, 47(W1): W322-W330.


23. Zacharias M. Protein–protein docking with a reduced protein model accounting for side-chain flexibility. Protein Science, 2003, 12(6):1271-1282.


24. The PyMOL Molecular Graphics System, Version 2.3.2 Schrödinger, LLC.












©iGEM IISER Berhampur