Virtual Screening | SugarGain | iGEM BITS Goa

Virtual Screening

Motivation

The crux of our biosensor is the invertase inhibitor protein. The experiments we designed help us characterise this inhibitor, but before designing them it was crucial to get a sense of the different invertase inhibitors that exist and how they interact with sugarcane invertase. A sugarcane inhibitor would have been the first choice; however, our literature survey found that the inhibitor from sugarcane has not been extensively characterised, which would make cloning it in a prokaryotic vector difficult. Furthermore, many plant genomes harbor large, expanded gene families. These families probably originated from gene duplication events and have often evolved to fulfill diverse functions (Hothorn et al., 2010). Hence, we screened potential inhibitors that have previously been cloned extensively in E. coli. Prokaryotic inhibitors were not suitable, as their folding and three-dimensional structure were found to have very little homology with sugarcane invertase inhibitors.

Procedure

In order to perform protein-protein docking, we used the ClusPro docking server. We performed docking studies of the following plant inhibitors with invertase:

  1. Sugarcane inhibitor
  2. Arabidopsis inhibitor
  3. Tobacco inhibitor
  4. Potato inhibitor

The ClusPro server [2, 3, 4] is a widely used tool for protein–protein docking. It performs three computational steps: first, rigid-body docking; next, RMSD-based clustering of the 1000 lowest-energy structures; and lastly, removal of steric clashes by energy minimization. Docking with each energy parameter set results in ten models defined by the centers of highly populated clusters of low-energy docked structures.
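The clustering step can be illustrated with a small sketch. This is not the ClusPro implementation: it is a generic greedy clustering over a precomputed pairwise distance matrix, in which the pose with the most neighbours inside a chosen radius becomes a cluster center. The function name `greedy_cluster`, the toy distances and the radius of 2.0 are all assumptions made for illustration.

```python
def greedy_cluster(dist, radius):
    """Greedily cluster poses given a symmetric pairwise distance matrix.

    dist   : list of lists, dist[i][j] is the distance (e.g. an RMSD) between poses i and j
    radius : poses within this distance of a center join its cluster
    Returns a list of (center_index, member_indices), largest neighbourhood first.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best_center, best_members = None, []
        for i in sorted(remaining):
            # Neighbours of pose i among the poses not yet assigned (includes i itself).
            members = [j for j in sorted(remaining) if dist[i][j] <= radius]
            if len(members) > len(best_members):
                best_center, best_members = i, members
        clusters.append((best_center, best_members))
        remaining -= set(best_members)
    return clusters


# Toy example: five "poses" placed on a line, so distance is just |x_i - x_j|.
positions = [0.0, 1.5, 3.0, 10.0, 11.0]  # hypothetical values, not docking output
dist = [[abs(a - b) for b in positions] for a in positions]
clusters = greedy_cluster(dist, radius=2.0)

for center, members in clusters:
    print(f"center pose {center}: members {members}")
```

In this toy case pose 1 has the most neighbours and becomes the first center; the remaining two poses form a second, smaller cluster. Highly populated clusters are the ones a docking server would report as models.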

Figure 1: The ClusPro algorithm

Docking methods can be classified as direct or template-based. Based on thermodynamics, direct methods attempt to find the structure of the target complex located at the minimum of Gibbs free energy in the conformational space. Template-based docking, on the other hand, is based on the observation that interacting pairs sharing above 30% sequence identity often interact in the same way, and hence the structure of the target complex can be obtained by homology modeling tools if an appropriate template complex of a known structure is available.
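As a rough illustration of the 30% sequence identity criterion, the sketch below computes percent identity for two sequences that have already been aligned. Several definitions of identity exist; the one used here (matches divided by the number of columns where neither sequence has a gap) is an assumption, as are the function name `percent_identity` and the example sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are ignored (one of several
    possible conventions; chosen here only for illustration).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments, not real invertase or inhibitor sequences.
identity = percent_identity("MKT-LLV", "MRTALLV")
usable_as_template = identity > 30.0
print(f"identity = {identity:.1f}%, above 30% threshold: {usable_as_template}")
```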

Although the applicability of template-based docking has been extended based on the observation that partial structures representing the interface region are capable of providing templates, the coverage of the template space at present is still limited and hence direct methods are generally more useful in many applications.

The interaction energy between the two proteins is given by an expression of the form

$$ E = w_1E_{rep} + w_2E_{attr} + w_3E_{elec} + w_4E_{DARS} $$

where \( E_{rep} \) and \( E_{attr} \) denote the repulsive and attractive contributions to the van der Waals interaction energy, and \( E_{elec} \) is an electrostatic energy term. \( E_{DARS} \) is the DARS (Decoys As the Reference State) potential, which primarily represents desolvation contributions; it is a pairwise structure-based potential constructed using decoys as the reference state. The coefficients \( w_1\) through \(w_4\) denote the weights of the corresponding terms.
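The weighted sum above is simple to evaluate once the four terms are known. The sketch below shows the arithmetic only; the term values and the weight set are placeholders chosen for illustration and are not ClusPro coefficients or results from our docking runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnergyTerms:
    rep: float   # repulsive van der Waals contribution, E_rep
    attr: float  # attractive van der Waals contribution, E_attr
    elec: float  # electrostatic term, E_elec
    dars: float  # DARS desolvation term, E_DARS


def interaction_energy(terms, weights):
    """Return E = w1*E_rep + w2*E_attr + w3*E_elec + w4*E_DARS."""
    w1, w2, w3, w4 = weights
    return w1 * terms.rep + w2 * terms.attr + w3 * terms.elec + w4 * terms.dars


# Placeholder numbers (assumptions), used only to show how the weights act.
example_terms = EnergyTerms(rep=10.0, attr=-50.0, elec=-20.0, dars=-30.0)
example_weights = (0.5, 1.0, 2.0, 1.0)
energy = interaction_energy(example_terms, example_weights)
print(energy)
```

Changing the weight tuple corresponds to choosing a different energy parameter set, which is why each parameter set yields its own list of models.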

Initially, homology modelling was used to create the structure of sugarcane invertase and sugarcane invertase inhibitor.

Structure of Sugarcane Invertase

Figure 2: Structure of Sugarcane invertase using homology modelling

A Ramachandran plot visualizes the energetically favoured regions of backbone dihedral angles for the amino acid residues in a protein structure. Histograms with a bin width of 4 degrees are used to count Φ (Phi; C-N-CA-C) / Ψ (Psi; N-CA-C-N) occurrences for all displayed categories. The number of observed Φ / Ψ pairs determines the contour lines (SWISS-MODEL) [5, 6, 7, 8, 9, 10].
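To make the binning concrete, the following sketch computes a dihedral angle from four atom positions and counts Φ / Ψ pairs in 4-degree bins. It is a simplified stand-in, not the SWISS-MODEL code: the helper names, the coordinates and the Φ / Ψ values are assumptions for illustration.

```python
import math
from collections import Counter


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees, in (-180, 180], defined by four points.

    For phi the points would be C(i-1), N, CA, C; for psi they would be
    N, CA, C, N(i+1).
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def bin_index(angle_deg, bin_width=4.0):
    """Map an angle in [-180, 180] to a bin index; 180 falls in the last bin."""
    n_bins = int(360 / bin_width)
    return min(int((angle_deg + 180.0) // bin_width), n_bins - 1)


def ramachandran_histogram(phi_psi_pairs, bin_width=4.0):
    """Count (phi, psi) pairs per 2D bin."""
    counts = Counter()
    for phi, psi in phi_psi_pairs:
        counts[(bin_index(phi, bin_width), bin_index(psi, bin_width))] += 1
    return counts


# Four made-up points with a quarter-turn twist around the central bond.
angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))

# Hypothetical phi/psi values in degrees, not taken from our model.
hist = ramachandran_histogram([(-60.0, -45.0), (-58.0, -47.0), (-120.0, 130.0)])
print(angle, dict(hist))
```

Bins with many counts correspond to the densely populated regions that the contour lines enclose; residues falling in empty or sparse bins are candidates for outliers.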

Figure 3: Ramachandran Plot for the modelled sugarcane invertase

From the above Ramachandran plot, it was observed that the Ramachandran outliers include SER169, PRO166, ASP239, SER459 and ARG387.

Next, we compared the structure with a non-redundant set of PDB structures. This comparison gives us a quality estimate for our model via QMEAN4.

Figure 4: Quality comparison of the modelled sugarcane invertase

GMQE (Global Model Quality Estimation) is a quality estimation procedure that combines properties from the target-template alignment and the template structure. These properties are combined using a multilayer perceptron. The resulting GMQE score is expressed as a number between 0 and 1, reflecting the expected accuracy of a model built with that alignment and template, normalized by the coverage of the target sequence. Higher numbers indicate higher reliability (SWISS-MODEL) [5, 6, 7, 8, 9, 10].

Our model obtained a GMQE of 0.83.

Protein - Protein Docking

Finally, we docked the sugarcane invertase with each of the plant invertase inhibitors listed above.

The following images illustrate the results we obtained. The final binding energy is reported with each image.

Figure 5: Sugarcane invertase and sugarcane inhibitor. E = -677.5 J

Figure 6: Sugarcane invertase and Arabidopsis inhibitor. E = -1058.1 J

Figure 7: Sugarcane invertase and Tobacco inhibitor. E = -1058 J

Figure 8: Sugarcane invertase and Potato inhibitor. E = -767.9 J

Insights

The catalytic site of invertases is a conserved RDP domain, which is disrupted by the inhibitor. The RDP domain lies near the Asp-309 residue and is rich in Asp and Glu residues (Hothorn et al., 2010). All the inhibitors above bind to the same region. Apart from the sugarcane inhibitor itself, the Arabidopsis inhibitor shows the most disruption of the protein structure, and its interaction energy is the most negative of all the inhibitors screened. Hence, it would be feasible for us to use the inhibitor from Arabidopsis thaliana to complex with the sugarcane invertase in our downstream experiments, as an alternative to the sugarcane inhibitor.
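The energy comparison can be reproduced with a few lines. The sketch below ranks the binding energies reported in Figures 5 to 8 from most to least negative; the dictionary layout and variable names are assumptions, and the numbers are copied from the figure captions.

```python
# Binding energies as reported in the figure captions (units as given there).
reported_energy = {
    "Sugarcane": -677.5,
    "Arabidopsis": -1058.1,
    "Tobacco": -1058.0,
    "Potato": -767.9,
}

# Most negative (most favourable) energy first.
ranked = sorted(reported_energy.items(), key=lambda item: item[1])
best_inhibitor, best_energy = ranked[0]

# Margin between the best and the second-best candidate.
margin = ranked[1][1] - best_energy

for name, energy in ranked:
    print(f"{name:12s} {energy:9.1f}")
print(f"best: {best_inhibitor}, margin over next: {margin:.1f}")
```

The Arabidopsis inhibitor ranks first, with the tobacco inhibitor a close second, so the structural observation about disruption of the binding region carries weight alongside the energy ranking.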

References

  1. Hothorn, M., Van den Ende, W., Lammens, W., Rybin, V., & Scheffzek, K. (2010). Structural insights into the pH-controlled targeting of plant cell-wall invertase by a specific inhibitor protein. Proceedings of the National Academy of Sciences, 107(40), 17427-17432.
  2. Vajda, S., Yueh, C., Beglov, D., Bohnuud, T., Mottarella, S. E., Xia, B., ... & Kozakov, D. (2017). New additions to the ClusPro server motivated by CAPRI. Proteins: Structure, Function, and Bioinformatics, 85(3), 435-444.
  3. Kozakov, D., Hall, D. R., Xia, B., Porter, K. A., Padhorny, D., Yueh, C., ... & Vajda, S. (2017). The ClusPro web server for protein-protein docking. Nature Protocols, 12(2), 255-278.
  4. Kozakov, D., Beglov, D., Bohnuud, T., Mottarella, S. E., Xia, B., Hall, D. R., & Vajda, S. (2013). How good is automated protein docking? Proteins: Structure, Function, and Bioinformatics, 81(12), 2159-2166.
  5. Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., ... & Schwede, T. (2018). SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Research, 46(W1), W296-W303.
  6. Bienert, S., Waterhouse, A., de Beer, T. A. P., Tauriello, G., Studer, G., Bordoli, L., & Schwede, T. (2016). The SWISS-MODEL Repository-new features and functionality. Nucleic Acids Research, 45(D1), D313-D319.
  7. Guex, N., Peitsch, M. C., & Schwede, T. (2009). Automated comparative protein structure modeling with SWISS-MODEL and Swiss-PdbViewer: A historical perspective. Electrophoresis, 30(S1), S162-S173.
  8. Studer, G., Rempfer, C., Waterhouse, A. M., Gumienny, R., Haas, J., & Schwede, T. (2019). QMEANDisCo-distance constraints applied on model quality estimation. Bioinformatics, 36(6), 1765-1771.
  9. Benkert, P., Biasini, M., & Schwede, T. (2010). Toward the estimation of the absolute quality of individual protein structure models. Bioinformatics, 27(3), 343-350.
  10. Bertoni, M., Kiefer, F., Biasini, M., Bordoli, L., & Schwede, T. (2017). Modeling protein quaternary structure of homo- and hetero-oligomers beyond binary interactions by homology. Scientific Reports, 7(1).