Team:IISER-Pune-India/Model Results

Pre-processing



The detailed explanation of this step is given in the Model Overview page.


PfEMP1 (Plasmodium falciparum Erythrocyte Membrane Protein) are a class of membrane proteins exhibited by P. falciparum-infected erythrocytes and coded by ∼60 var genes. Although the sequence of var genes expressed during an infection cycle varies for reasons still unknown, studies have found specific motifs responsible for Human ICAM-1 (Intercellular Adhesion Molecule 1) binding, and are conserved across all var genes. The ICAM-1 binding is mediated mainly by a subset of Duffy binding-like (DBL) domains called the DBL-β domains in PfEMP1.[1]

Lennartz et al found that a region of A-type PfEMP1 is associated with an increased risk of developing symptoms of cerebral malaria and developed the crystallographic structures of the DBL-β Domain of PfEMP1 interacting with the human ICAM-1 endothelial receptor. We used this interaction to design peptide inhibitors against the DBL-β Domain of PfEMP1A protein.[1]


The binding of A-type PfEMP1 with ICAM-1 is possible due to 3 main binding regions:

  • Region 1: A Hydrophobic interaction

  • Region 2: A motif rich in Glycine-Proline forms Hydrogen bonds with ICAM-1

  • I[V/L]x3N[E]GG[P/A]xYx27GPPx3H

  • An intrinsically disordered region of PfEMP1A responsible for variable binding.


We decided to design inhibitors against those motifs that form hydrogen bonds with ICAM-1, since this is the strongest of the three interactions and we hypothesize that inhibiting this interaction will prevent the binding of infected erythrocytes with Human-receptors.


The PDB file obtained from the RCSB database contained some missing regions and residues, namely residues no: 1027-1037, 1126-1156, 1188-1193 of Chain A. We modeled these missing regions and loops based on the principles of homology modeling. Inositol 6-phosphate and other non-standard residues that were used for crystallization were removed to aid in protein analysis.

Based on methods mentioned in our overview page, we identified five motifs of the human protein (Chain B), and these were used as models for further analysis.

The five models were:

1. Residues 10-18 : ILPRGGSVL

2. Residues 38-45 : PKKELLLP

3. Residues 49-58 : RKVYELSNVQ

4. Residues 83-88 : YWTPER

5. Residues 168-173 : QGLELF



The Invasion of pristine erythrocytes by P. falciparum-parasitic proteins is facilitated mainly by two kinds of protein families, namely:

1. Reticulocyte-binding protein homologue (RH).

2. Erythrocyte-binding like proteins.


Previous studies have confirmed that targeting PfRH-5 (via anti-PfRH-5 monoclonal antibodies) prevent its interaction with Basigin (a human protein responsible for the Ok blood group ) and hence resulted in halting of parasitic growth in-vitro.[2].

The interaction we studied was the Plasmodium falciparum reticulocyte-binding protein homologue 5 (PfRH5) bound to Basigin. We used homology modelling to complete any missing residues from the native crystal structure obtained from RCSB.

We then identified three motifs of the human protein that strongly mediated this interaction. Based on the general description of how we build inhibitory models, we obtained three peptide sequences, namely:

1. Residues 23-30 of Chain B : AGTVFTTV

2. Residues 93-102 of Chain B : PMGTANIQLN

3. Residues 128-134 of Chain B : SESVPPV



PfEMP1 has the capacity to bind to a variety of human cell surface proteins such as cluster of differentiation 36 (CD36), endothelial protein C receptor (EPCR), and intercellular adhesion molecule 1 (ICAM-1). The CIDRα domain of PfEMP1 expressed on the membrane of infected RBCs binds to the Human CD36 receptor (a scavenger receptor involved in fatty acid metabolism, innate immunity and angiogenesis) and helps the infected erythrocytes to attach on to new blood vessels and tissues. This removes the infected cells from circulation and hence protects the parasite from splenic clearance and allows increased parasitemia. Approximately 84% of PfEMP1 proteins contain domains predicted to bind to CD36, making this the most common adhesion phenotype and an excellent interaction to design peptide inhibitors against.[3]

From the crystallographic structure, we observed that the interaction is mediated by the insertion of a phenylalanine residue at the 153rd position (F153) from CD36 into a hydrophobic pocket on the CIDRα domain. It was further confirmed by mutation studies.[4]

Homology modelling was done on the complex to model the missing residues in the structure obtained from the PDB website. Other non-standard residues were removed for easy scoring and simulation. Few amino acids from the ends of both chain A and B in the PDB were missing after homology modelling. This may not affect our study because we know from the literature that these regions are not involved in the interaction.

Based on inhibitor model selection criteria discussed in our overview page, one sequence of length 10 amino acids was identified from residue number 151 to 160 (NQFVQMILNS) of the Human protein as a peptide model.



The interaction between Endothelial Protein C-Receptor (EPCR) and CIDRa1 domains of the PFEMP1 was discovered to be associated with severe childhood malaria. These CIDRa1 domains that bind to EPCR are found to be conserved in shape and binding potential across diverse PfEMP1 variants. As these domains mimic the natural EPCR ligands, this can be considered as a viable interaction for potential drug candidates for malaria. This interaction also is associated with the adherence of infected RBCs to the blood vessels and tissues protecting the parasites inside the RBCs from spleen mediated destruction.[5]

For designing peptide inhibitors against this interaction the amino acid sequences of the protein C receptor that is in close proximity to the PfEMP1 protein were selected. The complex existed as a dimer, with A/C chains being the parasite protein and chains B/D being the human endothelial protein C receptor. Two peptide sequences from the EPCR that existed in close proximity with CIDRa1 were identified and used for further analysis:

1. Residues 138 - 151 : TFTLQQLNAYNRTR

2. Residues 63 - 80 : QSYLLQFHGLVRLVHQER



Majority of the PfEMP1 proteins can be categorised into the three main groups: A, B and C. The complex we used in this analysis is the DBL beta domain of the B type IT4var13 PfEMP1 bound to the N-terminal domains of ICAM-1. Although the degree of sequence conservation of the ICAM-1 binding site is poor in B and C type PfEMP1, they share the overall chemical nature of the site. This includes the hydrophobic cluster at the core of the binding site, which is complemented by hydrogen bonds contributed by the side chains from a central loop (residues 973-976) and helices (residues 1098-1121) and the backbone groups. These are the three main interactions mediating the binding.[6]

After homology modeling to complete the structures, we identified 27 templates matching the target complex but only the original PDB of the complex, 6S8U was used for homology modelling. Smaller molecules like NAG and other oligosaccharides present in the complex were removed for ease in scoring and simulation.

Peptide sequences of ICAM-1 that are in close proximity to PfEMP1-B were selected as a basis for designing the peptide inhibitors. Four different peptide sequences were found, namely:

1. Residues 2-12 : TSVSPSKVILP

2. Residues 16-23 : SVLVTCST

3. Residues 39-55 : KKELLLPGNNNRKVYELS

4. Residues 165-171 : LRPQGLE



Computational Saturated Mutagenesis and Scoring



The detailed explanation of this step is given in the Model Overview page.

For easy reference, communication and data handling, all the peptides we used in the study were named after a convention. The common format used for naming the mutants was ‘Index number assigned to the interaction (1-5)_The peptide model used for designing this particular inhibitor_the specific mutation performed (Ex: A21C means that at residue 21, Alanine (A) was mutated to Cysteine (C) )’ and for hybrids it was ‘Index number assigned to the interaction (1-5)_The peptide model used for designing this particular inhibitor_letter ‘H’ representing hybrids’

Example :

  • 2_1_A21C refers to the model where in the 1st Inhibitory Model of 2nd Interaction (4U0Q), the 21st residue was mutated from Alanine (A) to Cysteine (C)

  • 1_1_H refers to the hybrid peptide inhibitor obtained for the 1st Inhibitory Model of 1st Interaction (5MZA)


We performed Computational Saturation Mutagenesis on all the five peptide models. After repairing all models to satisfy spatial constraints with the Foldx empirical force field, we scored all the unique 680 models and sorted them by Interaction energy values. We found that mutants of Model 1 (residues 10-18) had the Least Interaction Energy and Highest Stability. The scatter plots of the Interaction Energy and the stability of the peptide against each mutation illustrate the fact that Inhibitory models of Model 1 are most stable.

The heatmap of interaction energy scores of the mutants from the first model are shown below. Here the X-axis gives the position of mutation and the Y-axis shows the corresponding amino acid. We see that mutating residue 16 Ile (Isoleucine) and Phe (Phenylalanine) causes the highest decrease in Interaction energy between the Parasite protein and the peptide. Furthermore mutating residue 16 (wild type - Serine) seemed to influence the stability of binding the highest while mutation of residue 15 (wild type - Glycine) seems to destabilize the interaction.


Scatter plot showing the interaction energy and stability of the mutants


5MZA CSM PLOT

Interaction Energy Heatmap for Inhibitor model 1


5MZA-I.E-HEATMAP

We used the top five inhibitors models for further analysis. We found that when the serine residue at position 16 of the first model was mutated to Isoleucine (1_1_S16I), the peptide formed an extra H-bond with the falciparum protein with the N and O atoms of Ile, contributing to two hydrogen bonds. This implied that the acquired mutant was able to bind more effectively to the Parasite protein. This mutant with primary structure ILPRGGIVL had the least interaction energy score of -9.69 kcal/mol among the mutants.

We created a hybrid peptide, 1_1_H, of length 9 based on Model 1 where each amino-acid contained the least-scoring amino-acids for all positions from the Saturation Mutagenesis study. This hybrid model (GVPRGGDF) was found to have the least interaction energy than the best mutant (-11.04 kcal/mol) was higher than any of the mutants.

The top two Inhibitors with best scores were chosen for Molecular Dynamics Simulations:

1. 1_1_H : GVPRGGIDF

2. 1_1_S16I : ILPRGGIVL



We found that the majority of the mutants were better than their respective wild-type models. The third peptide model with Glutamic Acid at 129th position replaced by Arginine (2_3_E129R) had the least Interaction Energy (-7.29 kcal/mol) among all the unique mutants and wild-type peptide models. The scatter plots and the heat maps showing the Interaction energy scores and stability of the peptides after each mutation are shown below:



Scatter plot showing the interaction energy and stability of the mutants



Interaction Energy Heat Map for Inhibitor model 1


4U0Q HEATMAP-INHIBITOR-1

Interaction Energy Heat Map for Inhibitor model 2


4U0Q HEATMAP-INHIBITOR-2

Interaction Energy Heat Map for Inhibitor model 3


4U0Q HEATMAP-INHIBITOR-3

Using these plots we created hybrid peptides for all peptide models by replacing each residue by the amino acid that had the best interaction energy score at that position.

The three hybrid peptides designed from the three inhibitor models and their interaction energy scores are as follows:

1. Hybrid 1 : -9.15353 kcal/mol

2. Hybrid 2 : -13.1215 kcal/mol

3. Hybrid 3 : -5.21111 kcal/mol

Based on the above results, the top three Inhibitor models with the best score were for selected further studies. The amino acid sequences of the three models are shown below:

1. 2_1_H : YYYWMWMF

2. 2_2_H : NPMPMFDMEY

3. 2_3_E129R : SRSVPPV



The interaction energy of the wild type was found to be -16.5112 kcal/mol. Interaction energy scores of the mutants are shown in the heat map below.



Interaction energy heatmap of the inhibitor model


5LGD HEATMAP INHIBITOR

The heat map shows the interaction energy between all the mutant peptides and the CIDRα domain of the PfEMP1. Here the X -axis gives the position of mutation and the Y-axis shows the corresponding amino acid. The interaction energy of the hybrid peptide was found to be -21.17 kcal/mol. This was higher than any of the mutants.

The 3 best scoring inhibitors and the hybrid was chosen for the Molecular Dynamics simulations.

1. Hybrid 3_1_H : NKFWRWMWRM

2. Mutant 1 : 3_1_S160M : NQFVQMILNM

3. Mutant 2 : 3_1_S160F : NQFVQMILNF

4. Mutant 3 : 3_1_S160W : NQFVQMILNW



We could find point mutations which resulted in interaction energies, both higher and lower than the wild type sequence



Heatmap for inhibitor model 1


4V3D Heatmap inhibitor-1

Along the Y-axis, as the mutated amino acid changes, the interaction energy for a particular position seems to be approximately constant. A high degree of variance seems to be in the 143rd position where the energy values range from positive value(+2.82 kcal/mol) to least the negative value(-5.96 kcal/mol). This shows that changing the position and the type of amino acid affects the binding affinity of the inhibitory peptide. The sequence 4_1_H is made from this data by selecting and combining the amino acids that give lowest energy in each position.



Heatmap for inhibitor model 2


4V3D Heatmap inhibitor-2

The heatmap shows the distribution of interaction energy values of the mutants with the Y-axis showing the substituted amino acid and the position of the mutation on the X-axis, and the resulting amino acid residue. For each position, the amino acid type determines its interaction energy. In the plot 71st position shows the maximum variation along the Y-axis, while for all other positions, the value seems to be more or less constant. The least scored mutant was 4_2_H70N (-12.16 kcal/mol).

The two peptides used for further studies were:

1. 4_1_H : YMRDKLMIGYWRLR

2. 4_2_H70N : QSYLLQFNGLVRLVHQER


The Mutants from the First peptide Model yielded the lowest scores. These peptide sequences interact mainly with the loops of the binding-sites (residues 833-840 and 1107-1117). Glutamine mutation at position 5 (5_1_S5Q) yielded the lowest score of -9.33 kcal/mol while that of its wild type was found out to be -7.46 kcal/mol. The interaction energy scores of complexes from the first model after mutations could be found out from the heatmap.


Interaction energy heatmap for inhibitor model 1


Heatmap plot for 6S8U

Another trend that could be observed in the scores was that as the position of the peptide sequences selected as inhibitors moved farther from the binding site the interaction energy scores lowered. Three mutants with the lowest interaction energy scores selected for Molecular Dynamic (MD) simulations are:

1. 5_1_S5Q : TSVQPSKVILP

2. 5_1_S5W : TSVWPSKVILP

3. 5_1_S5I : TSVIPSKVILP



Molecular Dynamics Simulations



The detailed explanation of this step is given in the Model Overview page.


The inhibitor models which were finalized for MD simulations can be visualised in 3-D here.

All trajectories files can be found here.


We performed 3 MD runs for models: 1_1_S16I and 1_1_H to determine how the peptide moved relative to the Protein and if it was stable, we wanted to know which atoms contributed to the overall stability.


Inhibitor 1 : 1_1_S16I


The first inhibitor we considered for MD is obtained by mutating the 16th residue of the wild type human protein from Serine (S) to Isoleucine (I) in the 1st Inhibitor model of the 1st Interaction (5MZA). Overall, the trajectory of the peptide over 100ns seemed stable around 30 Å from the core of the P. falciparum protein. However, in the second run it was noticed that when the peptide was pushed into the core, it rebounded and started to move away.


Centroid Profile for 1_1_S16I


Centroid plot of s16i(5MZA)

The exponentially weighted moving averages suggest that the protein initially bounces away from the protein and then maintains a constant distance from the core, indicative of strong binding. The mean distance of the peptide from the protein was around 31 Å and the standard deviation varied for the three runs with the second run having the highest among the three. (0.466, 9.78, 0.578 Å)


Hydrogen Bond Profile for 1_1_S16I


H-bond plot s16i (5MZA)

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 ILE:16.B.O - ASN:356.A.ND2 97.51 84.55 99.50
2 ASN:356.A.OD1 - ILE:16.B.N 93.53 42.28 92.39
3 ARG:13.B.O - THR:394.A.OG1 15.94 <5.00 82.58
4 THR:363.A.OD1 - GLY:10.B.N 5.47 <5.00 26.86
5 GLY:10.B.O - ASN:363.A.N 52.23 <5.00 16.41
6 PHE:18.B.OC2 - LYS:349.A.NZ <5.00 8.47 18.41


Hydrogen Bond analysis suggested that the number of intermolecular hydrogen bonds between the parasite protein and the peptide did change over time but many were conserved throughout the simulation period. The two hydrogen bonds formed by the N-atom of the mutated Isoleucine residue at position 16 of the peptide had the strongest retention over time, thus reinstating the initial hypothesis of efficient binding due to higher number of hydrogen bonds.


Inhibitor 2 : 1_1_H


Model 1_1_H refers to the Hybrid inhibitor of 5MZA. The Hybrid peptide was created by mutating each amino acid in the wildtype to the one that yielded least scores on scoring. It was the model with the least Interaction energy, thereby suggesting highest stability.


Centroid Profile for 1_1_H


Centroid plot of H(5MZA)

The distance between the centers of the protein and the peptide varied over time. We used Exponentially Weighted Moving Average (EWMA) to understand the average behavior and discerned that the distance between the centroids increased in two runs while it remained constant in another run. The peptide initially tends to move away from the protein but the distance stabilizes over time. However, in runs two and three, it was found that the peptide moves too quickly and then fails to stabilize over time. The mean distance was found to be 31.14, 32.94, 48.07 Å and with standard deviations 0.811, 11.72, 14.97 Å respectively for run 1, 2 and 3.


Hydrogen Bond Profile for 1_1_H


H-bond plot hybrid (5MZA)

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 ILE:16.B.O - ASN:356.A.ND2 99.05 87.56 <5.00
2 ASN:356.A.OD1 - ILE:16.B.N 83.08 71.14 <5.00
3 ASP:393.A.OD2 - ARG:13.B.NE2 29.85 32.83 <5.00
4 ASP:393.A.OD1 - ARG:13.B.NH2 29.85 31.84 <5.00
5 ASP:393.A.OD2 - ARG:13.B.NH2 13.45 26.86 <5.00
6 ASP:17.B.OD2 -TYR:362.A.OH 27.86 11.94 <5.00


After computing the Hydrogen Bond profile for the three runs, we found that in the 3rd MD run, the peptide effectively flew away, over time and was not stabilized. However, we obtained conclusive evidence from Run1 and Run2 that Isoleucine at residue 16, formed two new hydrogen bonds with the protein and was conserved for almost all of the simulation period.

Based on these promising results, we directed the wet lab team to build primers and bio bricks for this Peptide Inhibitor.


All trajectories files can be found here.


Inhibitor 1: 2_1_H


This is a hybrid peptide designed from the first peptide model. Its interaction energy is lower than the best mutant due to which it was considered for MD simulation.


Centroid Profile for 2_1_H


centroid plot 2_1_H

A = angstrom unit


Centroid analysis of 2_1_H showed the distance between the peptide and protein to be fluctuating wildly during the 100ns run. Even then the values of the mean distance in the three runs were close (all approximately 43 Å) with only small standard deviations. Studying the Exponentially Weighted Moving Average (EWMA) of the plot we find that the centroids remain at a constant distance most of the time in the three MD runs.


Hydrogen Bond profile for 2_1_H


Hbond plot of 2_1_H

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 ILE:446.A.O - TRP:28.B.N 0 32.83 0
2 TRP:447.A.O - TRP:28.B.N 95.02 51.24 97.51
3 PHE:30.B.OC1 - ARG:357.A.NH1 0 32.34 0
4 TRP:26.B.O - THR:449.A.N 92.04 41.29 98.01
5 THR:449.A.OG1 - TRP:26.B.N 72.14 70.15 95.52
6 ASN:349.A.O - TYR:23.B.OH 0 0 42.79


Due to the stochastic nature of the molecular dynamic simulations, the number and nature of the hydrogen bonds varied over time. But we could observe 6 hydrogen bonds conserved for at least 25% in at least one of the triplicate MD runs. Out of the six hydrogen bonds observed, only three hydrogen bonds appeared in all MD runs. These hydrogen bonds can be attributed to be the main mediators of the interaction.


Inhibitor 2 : 2_2_H


This inhibitory model was created by taking all those amino acids which gave the least interaction energy at different positions of the second inhibitor model. Because of its low interaction energy, MD simulations were done on using this peptide.


Centroid Profile for 2_2_H


centroid plot 2_2_H

A = angstrom unit


Centroid analysis of 2_2_H gave us a highly fluctuating graph for this complex. From the plot one could easily see that the distance between the centroids increased initially but after 50 ns it remained stable in MD run 1 and 2 but in MD run 3 the distance between the centroid decreases. The mean and standard deviation of the distance for the three runs are as follows:

MD Run 1: Mean= 45.9 Å; standard deviation: 2.89 Å

MD Run 2: Mean= 46.3 Å; standard deviation: 1.81 Å

MD Run 3: Mean= 43.0 Å; standard deviation: 2.04 Å


Hydrogen Bond Profile for 2_2_H


H-bond plot of 2_2_H

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 MET:97.B.O - THR:449.A.N 1.49 48.76 0
2 TYR:194.A.O - TYR:102.B.OH 0 0 54.73
3 MET:95.BO - TYR:346.A.OH 0 0 24.88
4 PRO:96.B.O - THR:449.A.OG1 0 27.86 25.37
5 ASP:99.B.OD2 - ARG:357.A.NH1 0.50 43.28 10.94
6 PHE:98.B.O - ASN:354.A.ND2 2.49 9.95 70.65
7 TRP:447.A.O - MET:97.B.N 42.79 0 0
8 ASP:99.B.OD1 - ARG:357.A.NH2 0.50 46.27 7.46
9 ASN:352.A.OD1 - PHE:98.B.N 2.98 9.95 55.22
10 ASP:99.B.OD1 - ARG:357.A.NH1 0.50 53.73 8.96


Hydrogen bond conservation was found out to be poor in this interaction. No hydrogen bonds could be found crossing 75% abundance. This can be an indication that the hydrogen bond interactions do not play a major role in this interaction.


Inhibitor 3 : 2_3_E129R


This peptide formed by an arginine mutation at the 129th residue position of the third inhibitor model, had the least interaction energy value compared to the wild type, all the mutants and hybrids of this model. Hence it was selected for MD simulations.


Centroid Profile for 2_3_E129R


centroid plot 2_3_E129R

A = angstrom unit


2_3_E129R centroid distance analysis shows interesting features. The distance between the centroids remains stable in MD run 1, decreases in MD run 2 and in MD run 3 the peptide flies away from the protein, showing an increasing graph. This trend could also be observed in the mean and standard deviations of the graphs.

MD Run 1: Mean= 31.5 Å; standard deviation: 2.32 Å

MD Run 2: Mean= 27.6 Å; standard deviation: 1.45 Å

MD Run 3: Mean= 83.6 Å; standard deviation: 26.19 Å


Hydrogen Bond Profile for 2_3_E129R


H-bond plot for 2_3_E129R

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 GLU:362.A.OE2 - ARG:129.B.NH2 14.92 27.86 0
2 GLU:362.A.OE1 - ARG:129.B.NH2 15.42 25.87 0
3 SER:130.B.O - TYR:203.A.OH 93.53 35.82 0


We could observe only three hydrogen bonds conserved for at least 25% time in at least one of the triplicate MD runs. Out of the 3 hydrogen bonds observed, none of them were conserved in all MD runs. The third MD run had no hydrogen bonds above 25% abundance indicating poor binding.

All trajectories files can be found here.


Inhibitor 1 : 3_1_H


This is the Hybrid inhibitor made from the computational saturated mutagenesis results. Eight out of the 10 amino acids in the wild type inhibitor was mutated to obtain this. Due to the high number of mutated residues, we have to check if this peptide causes any unwanted immunological responses inside the body.


Centroid Profile for 3_1_H


3_1_H centroid plot(5LGD)

A = angstrom unit


Distance between the centroids of protein and the peptide was computed using snapshots taken at an interval of 0.5 ns for the entire simulation time. We used EWMA to understand average behaviour. We discerned that the distance between the protein and peptide complex was relatively constant, which implies that the inhibitor has less tendency to move away from the parasite protein. The distance between the centroids fluctuated with a standard deviation of 0.49 Å around the average distance of 22.29 Å.


Hydrogen Bond Profile for 3_1_H


3_1_H H-bond plot(5LGD)

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 GLU:97.B.OE2 - ARG:159.A.NH2 22.39 43.28 30.35
2 ASP:75.B.OD1 - ASN:151.A.ND2 5.97 8.46 56.22
3 ASP:101.B.OD1 - ARG:155.A.NH1 24.88 61.19 37.81
4 ASP:101.B.OD2 - ARG:155.A.NH1 32.84 41.29 35.82
5 ASP:101.BOD1 - ARG:155.ANH22 33.83 27.86 37.31
6 GLU:97.B.OE1 - ARG:159.A.NH2 21.89 63.18 28.86
7 GLU:97.BOE2 - ARG:159.ANH1 23.88 60.20 22.39
8 ASP:75.B.OD2 - ASN:151.A.ND2 5.97 6.47 45.77
9 ASP:75.B.OD1 - LYS:152.A.N 1.00 26.37 2.49
10 ASP:75.B.O - ASN:151.A.ND2 63.68 41.29 7.46
11 ASP:101.B.OD2 - ARG:155.A.NH2 23.88 53.73 40.80
12 GLU:97.B.OE1 - ARG:159.A.NH1 22.39 52.24 29.85
13 GLN:73.B.OE1 - ASN:151.A.ND2 72.14 11.94 83.08


In this interaction, 13 hydrogen bonds between the parasite protein and the inhibitor were conserved for at least 25% in at least one of the three MD runs. We observed that the hydrogen bond formed by the 160th residue of the inhibitor was lost. But other residues of the inhibitor formed hydrogen bonds with the parasite protein. This large number of hydrogen bonds may explain the high interaction energy of -21.17 kcal/mol.


Inhibtor 2 : 3_1_S160M


This mutant was obtained by replacing Serine at position 160th with Methionine. It was the best scoring mutant with respect to the interaction energy given by FoldX.


Centroid Profile for 3_1_S160M


3_1_S160M

A = angstrom unit


From the centroid plot it could be discerned that the distance between the protein and peptide complex is relatively constant, but in one of the triplicate runs it had a slight positive slope indicating that the inhibitor tended to move away from the parasite protein. The distance between the centroids fluctuated with a standard deviation of 0.64 Å around the average distance of 22.01 Å.


Hydrogen Bond Profile for 3_1_S160M


H-bond plot for 3_1_S160M

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 GLN:73.B.OE1 - ASN:151.A.ND2 92.04 88.06 59.70
2 GLN:152.A.OE1 - ASN:8.B.N 39.30 0.50 6.96
3 ASP:75.B.O - ASN:151.A.ND2 33.33 61.19 44.28
4 ASP:75.B.OD1 - ASN:151.A.ND2 36.32 5.97 11.44
5 ASP:75.B.OD2 - ASN:151.A.ND2 30.85 9.45 16.42


Studying the hydrogen bonds that are conserved in at least 25% of the simulation time, it was observed that the hydrogen bond formed by the 160th residue of the inhibitor was lost, but a new hydrogen bond was formed by the residue 152 of the inhibitor acting as the acceptor. This new hydrogen bond may explain the high interaction energy value of -18.82 kcal/mol.


Inhibitor 3 : 3_1_S160F


This mutant was obtained by replacing Serine at position 160th with Phenylalanine. This was the second-best scoring mutant. The variation of the distance between the centroids of the protein and the peptide was analysed along with the hydrogen bond.


Centroid Profile for 3_1_S160F


Centroid plot for 3_1_S160F

A = angstrom unit


The distance between the protein and peptide complex had a slight positive slope indicating that the inhibitor tended to move away from the parasite protein. The distance between the centroids fluctuated with a standard deviation of 0.61 Å around the average distance of 22.04 Å.


Hydrogen Bond Profile for 3_1_S160F


H-bond plot for 3_1_S160F

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 ASP:75.B.O - ASN:151.A.ND2 40.80 45.77 24.38
2 ASP:75.B.OD1 - ASN:151.A.ND2 20.90 12.44 39.80
3 GLN:152.A.OE1 - TRP:12.B.NE1 0 34.83 0.50
4 GLN:73.B.OE1 - ASN:151.A.ND2 90.05 50.75 98.51
5 ASP:75.B.OD2 - ASN:151.A.ND2 22.89 17.91 38.81


In this interaction, five hydrogen bonds between the parasite protein and the inhibitor were conserved for at least 25% in at least one of the triplicate MD runs as seen in 3_1_S160M. It was observed that the hydrogen bond formed by the 160th residue of the inhibitor was lost, but a new hydrogen bond was formed by the residue 152 of the inhibitor acting as the acceptor. This may explain the high interaction energy value of -18.82 kcal/mol, which is very close to 3_1_S160M.


Inhibtor 4: 3_1_S160W


This mutant was obtained by replacing Serine at position 160th with Tryptophan. This was the third-best scoring mutant. Similar analysis was done for this interaction also and the results are as follows:


Centroid Profile for 3_1_S160W


Centroid plot for 3_1_S160W

A = angstrom unit


It was found that the distance between the protein and peptide complex had a more significant positive slope indicating that the inhibitor tended to move away from the parasite protein. The distance between the centroids fluctuated with a standard deviation of 0.50 Å around the average distance of 22.41 Å.


Hydrogen Bond Profile for 3_1_S160W


H-bond plot for 3_1_S160W

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 ASP:75.B.OD2 - ASN:151.A.ND2 22.89 36.82 27.36
2 GLN:73.B.OE1 - ASN:151.A.ND2 90.05 89.06 82.09
3 GLN:152.A.O - ASN:8.B.ND2 7.96 4.48 50.75
4 ASP:75.B.OD1 - ASN:151.A.ND2 20.90 42.29 30.85
5 GLN:152.A.OE1 - ASN:8.B.N 15.92 6.96 40.80
6 ASP:75.B.O - ASN:151.A.ND2 40.80 15.42 36.32


It was observed that the hydrogen bond formed by the 160th residue of the inhibitor was lost as observed in the previous two mutants. However, new hydrogen bonds were formed by the residues at positions 152 and 151 of the inhibitor, acting as the acceptor and donor respectively with the parasite protein. Even though we observed a new hydrogen bond, the interaction energy was found to be lower than the previous two mutants which had one hydrogen bond pair less than this mutant. The interaction energy observed was -18.47 kcal/mol.

All trajectories files can be found here.


Inhibitor 1 : 4_1_H


This hybrid sequence was created by replacing the primary sequence with the amino acids that gave the lowest energy at each position. The resulting 14 amino acids long sequence is YMRDKLMIGYWRLR. The interaction energy of the hybrid sequence was -7.29862 kcal/mol which is lower than all the mutants obtained from saturated mutagenesis. The binding affinity of the hybrid is higher than the natural binding affinity of the CIDRa1 domain of the PFEMP1 protein with EPCR. As a single CIDRa1 domain binds with EPCR with the same affinity as a full-length PfEMP1 protein, the higher affinity of the hybrid can potentially block the CIDRa1 domain which prevents the whole interaction.


Centroid Profile for 4_1_H


centroid plot for 4_1_H

A = angstrom unit


The distance between the centroid of the protein and the sequence was determined from the simulation and visualized. The plot maintains a constant value for the centroid distance and after a while increases in peaks of varying values. These variations, after visualizing in pymol appeared to be the result of dynamic motion and changing of the shape of the inhibitory sequence during MD simulation. The sequence, which was part of an alpha helix structure, does not retain its shape over the course of the simulation. It curled and unwound back and forth. Despite sudden peaks, the standard deviation of the centroid distance is found to be 2.263 Å with an average distance of 28.444 Å.


Hydrogen Bond Profile for 4_1_H


In this analysis the H-bonds that were conserved for more than 25% of the total simulation in any of the three MD runs is plotted. The table below shows the H-bonds and their percentage of abundance.



H-bond plot for 4_1_H

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 GLU:199.A.OE1 - ARG:151.B.NH2 0 25.37 39.80
2 GLU:199.A.OE2 - TRP:148.B.N 0.50 20.40 46.77
3 ASP:185.A.OD2 - ARG:149.B.NH2 8.46 32.34 8.96
4 GLU:199.A.OE1 - TRP:148.B.N 0 14.43 31.34
5 GLU:199.A.OE2 - ARG:151.B.NH2 0 25.37 38.81
6 GLU:199.A.OE1 - ARG:151.B.NH1 0 12.44 64.18
7 ASP:185.A.OD2 - ARG:149.B.NH1 6.96 33.83 10.45
8 GLU:199.A.OE2 - ARG:151.B.NH1 0 20.90 69.15


From the above data, the simulations 2 and 3 show limited conservation of H-bonds. In the first simulation, the H-bonds are very low or non-existent. This could be due to the lack of stability around the protein as the inhibitory sequence turns out to be more dynamic and energetic.


Inhibtor 2 : 4_2_H70N


This sequence of 18 amino acids was obtained from the saturated mutagenesis of the second primary inhibitor chosen from the interaction. The sequence is part of a larger alpha-helix structure and the sequence maintains this helical structure even during the simulation. The interaction energy of this sequence turned out to be -12.16 kcal/mol. In this sequence the amino acid at the 70th position, Histamine is mutated with Asparagine and the resulting sequence is QSYLLQFNGLVRLVHQER.


Centroid Profile for 4_2_H70N


centroid plot for 4_2_H70N

A = angstrom unit


The Distance between the sequence and centroids of the protein was again computed by taking a snapshot of the MD simulation at equal intervals for the entire simulation. The distance appeared to be constant with minimal fluctuation. The distance fluctuated between 20 Å and 22.5 Å with a standard deviation of 0.84 Å


Hydrogen Bond Profile for 4_2_H70N


Those H-bonds which are conserved for more than 50% (100 snapshots) of the total simulation time in any of the MD runs are plotted. The table below shows the H-bonds that have a percentage of abundance greater than zero in all three MD simulations and their percentage of abundance.



H-bond plot for 4_2_H70N

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 ASP:109.A.OD1 - ARG:74.B.NH1 5.97 14.43 62.69
2 ASP:185.A.OD1 - SER:64.B.N 9.95 46.27 54.23
3 ASP:109.A.OD1 - ARG:74.B.NH2 2.98 9.95 63.18
4 GLN:78.B.OE1 - GLN:190.A.NE2 15.42 32.34 6.47
5 SER:186.A.O - ARG:74.B.NH1 3.98 45.77 3.48
6 GLU:79.B.OE2 - TYR:193.A.OH 4.48 0.50 27.36
7 ASP:185.A.OD1 - GLN:68.B.NE2 17.91 55.22 3.98
8 ASP:109.A.OD2 - ARG:74.B.NH1 2.49 8.46 58.21
9 SER:186.A.O - ARG:74.B.NH2 1.49 46.27 4.48
10 ASP:109.A.OD2 - ARG:74.B.NH2 4.48 3.48 70.15
11 ASP:185.A.OD2 - SER:64.B.N 8.46 51.24 1.99


Again no H-bonds are retained in all the 3 MD runs equally at abundance more than 50%. We can conclude that the H-bonds does not influence the lowering of the interaction energy of the inhibitory sequence to a large extent.


All trajectories files can be found here.


Inhibtor 1 : 5_1_S5Q


This peptide of length 11 was obtained by mutating the serine residue at the fifth position of ICAM-1 to glutamine in the first peptide sequence among the four peptide models. Thus the amino acid sequence of this peptide is TSVQPSKVILP. The first beta strand from the wild type was lost due to the mutation but the mutation has led to an increase in the number of hydrogen bonds between the protein and the peptide from five to seven in the static structure. Glutamine has provided a new H-bond acceptor (OE1) as well as a new donor (NE2). This could be the reason for the best interaction energy score of -9.33 kcal/mol.


Centroid Profile for 5_1_S5Q


centroid plot 5_1_S5Q

A = angstrom unit


The distance between the centroids of the protein and the peptide were analysed along with EWMA to understand the average behavior. Through this we could learn that the distance between the protein and peptide complex is relatively constant. The distance between the centers fluctuated with a standard deviation of 0.756 Å around the average distance of 26.27 Å.


Hydrogen Bond Profile for 5_1_S5Q


Hydrogen bonds between the protein and peptide were analyzed using the snapshots for all the three MD runs. Only those hydrogen bonds which had a percentage of abundance above 25 were plotted. The table showing the percentage abundance of the H-bonds from the three MD runs and the plot showing their profile can be seen below:



H-bond plot for 5_1_S5Q

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 GLN:5.B.OE1 - LYS:364.A.NZ 2.49 28.86 8.46
2 GLN:5.B.OE1 - LYS:364.A.NZ 2.49 30.35 6.96
3 ILE:10.B.O - SER:380.A.N 96.52 89.55 99.00
4 GLY:382.A.O - LYS:8.B.N 91.04 78.61 63.68
5 GLN:5.B.OE1 - GLY:384.A.N 28.86 0 0
6 SER:380.A.O - ILE:10.B.N 99.50 100 99.00
7 LYS:8.B.O - GLY:382.A.N 94.53 88.56 85.57
8 GLN:5.B.OE1 - LYS:364.A.NZ 3.48 24.88 7.96
9 SER:7.B.OG - THR:105.A.OG1 0 0 34.33
10 GLY:382.A.O - SER:7.B.N 51.24 59.70 27.86
11 THR:105.A.OG1 - SER:7.B.N 0 0 28.86


Four among the eleven H-bonds were conserved in all the three MD runs with an abundance percentage above 50%. The amino acids from the peptide associated with these H-bonds were all the native amino acids from the wildtype peptide. The H-bonds newly formed due to the glutamine mutation at the fifth position were not conserved equally in all the three runs indicating that the mutation has only a limited effect on the H-bond interaction of the complex.

So far the results for 5_1_S5Q seem promising that it can be selected as a candidate for further experimental validation as an inhibitory peptide of B-type PfEMP1. Further validation can include docking studies on other B-type PfEMP1 proteins using this peptide.


Inhibtor 2 : 5_1_S5W


This 11 amino acid long mutant is also from the first peptide model by mutating the same serine residue at the fifth position to tryptophan while CSM. The primary structure of the peptide is TSVWPSKVILP. Tryptophan being a non-polar residue has helped improve the hydrophobic interaction between the DBL- beta domain of PfEMP1 and ICAM1. This is evident from the large increase in the solvent excluded surface from 21993.4 Å2 of the wild type to 24031.8 Å2 of the mutant. All five H-bonds from the wild type were also conserved in the mutant. The interaction energy score of the complex was thus found out to be -8.42 kcal/mol.


Centroid Profile for 5_1_S5W


5_2_S5W centroid plot

A = angstrom unit


Variation of the distance between the centers of the protein and the peptide with time was computed and visualized. The graph of the distance shows a standard deviation of 0.69 Å with an average distance of 25.11 Å. The exponential moving average at 89 ns forms a rough straight line showing a stable complex.


Hydrogen Bond Profile for 5_1_S5W


The profile of the hydrogen bonds between the PfEMP1 protein and the mutant peptide was analyzed. Again only H-bonds that were conserved in more than 50 snapshots out of the 201 snapshots in at least one MD run were plotted. The table and the plot are given below:



H-bond plot for 5_2_S5W

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 THR:2.B.O - THR:105.A.N 0 0 25.87
2 GLY:382.A.O - SER:7.B.N 9.95 39.80 70.15
3 GLU:367.A.OE2 - VAL:4.B.N 0 28.36 0
4 ILE:10.B.O - SER:380.A.N 81.59 82.59 93.03
5 LYS:8.B.O - GLY:382.A.N 94.53 93.03 91.54
6 ASP:385.A.OD2 - TRP:5.B.NE1 0 0 28.36
7 SER:380.A.O - ILE:10.B.N 99.50 99.00 99.00
8 THR:105.A.OG1 - VAL:4.B.N 0 0 24.88
9 GLU:367.A.OE 1 - SER:3.B.N 0 26.37 0
10 TRP:5.B.O - GLN:371.A.NE2 43.78 0 0
11 GLY:382.A.O - LYS:8.B.N 77.11 87.56 18.90


All the three hydrogen bonds, retained in all the three runs at abundance more than 50%, are between the parasite protein and the non-mutated residues from the peptide. This was expected as hydrophobic interactions are thought to be the main driving force of this inhibition. Also, two hydrogen bonds from the mutated tryptophan residue which could not be found in the static initial structure of the complex could be observed in the 1st and the 3rd MD run. These factors may have led to a better interaction energy score of the complex.

These analyses were done on 5_1_S5W to check the competence of the mutant to act as peptide inhibitor against PfEMP1 and the results imply that further studies could be performed on this mutant.


Inhibtor 3 : 5_1_S5I


This again is a mutant of the first peptide sequence model. Here the serine was replaced by isoleucine (TSVIPSKVILP). Although isoleucine is a highly nonpolar residue, not much difference could be found in the solvent excluded surface area. No new hydrogen bonds were formed nonetheless the five hydrogen bonds in the wildtype peptide were conserved. The interaction energy score of this mutant was -8.26 kcal/mol.


Centroid Profile for 5_1_S5I


5_3_S5I centroid plot

A = angstrom unit


Distance between the peptide and protein analyzed as a function of time was plotted. The standard deviation of the plot is 0.92 Å and the mean distance between the protein and the peptide is 25.74 Å. Like the distance graphs in the other two complexes, the exponentially weighted moving average becomes a straight line as time goes from 8 to 34 to finally 89 ns indicating stable binding.


Hydrogen Bond Profile for 5_1_S5I


Hydrogen bond analysis of the complex was performed in a manner similar to the other two complexes. The three hydrogen bonds formed by the native amino acids were found to be conserved in three MD runs with an abundance of more than 25%. The plot of the hydrogen bond profile and the table showing their abundance in percentage are given below:



H-bond plot for 5_3_S5I

Index Hydrogen bond Acceptor-Donor MD run 1 (%) MD run 2 (%) MD run 3 (%)
1 GLY:382.A.O - LYS:8.B.N 83.08 20.40 86.57
2 LYS:8.B.O - GLY:382.A.N 89.05 94.53 91.54
3 SER:380.A.O - ILE:10.B.N 99.00 100 99.50
4 ILE:10.B.O - SER:380.A.N 87.06 92.04 86.07
5 ASP:106.A.O - SER:3.B.N 0 0 37.31
6 ASP:106.A.O - SER:3.B.OG 0.50 0.50 35.82
7 GLY:382.A.O - SER:7.B.N 11.94 12.94 28.86


The three H-bonds which are preserved in all the three MD runs with abundance more than 50% are the native H-bonds same as in the case of mutant peptide 5_1_S5W. No important H-bonds are formed with the mutated residue.

Based on these results, studies can be done on this mutant peptide 5_1_S5I for further characterization of its inhibiting capacity.


Grafting Results



We have explained the tools and methods used for Grafting the peptide inhibitor in our Model Overview page.


The best peptide sequence selected for each interaction based on the results of Molecular dynamics simulations and scoring were grafted in the kalata-B1 backbone. The engineered peptide sequence was grafted in loop 6 while a Strep tag was grafted in loop 3 for the purification of the cyclotide. In-silico grafting and circularization was performed by a custom made script built on top of MODELLER. Using the five best peptide inhibitors we build five cyclotide models that act as a Model for our final peptide Inhibitor that is to be administered to patients diagnosed with Malaria. The five models can be found below. The region coloured in green is the grafted peptide while the sulfide bonds are also shown in the picture.


1_1_S16I grafted into loop-6 of kalata-B1


2_2_H grafted into loop-6 of kalata-B1




3_1_S160M grafted into loop-6 of kalata-B1


4_2_H70N grafted into loop-6 of kalata-B1





5_1_S5Q grafted into loop-6 of kalata-B1




We performed Molecular Dynamics simulations again to study the stability of these grafted cyclotides. The simulation was performed for 100ns using GROMACS. Here we used the same parameters as the protein-peptide MD simulations performed earlier. We calculated the Atomic Root Mean Square Deviations of these simulations with respect to the protein backbone to understand how the average distance between each atom in our grafted peptide evolves with time.


RMSD plot of 1_1_S16I grafted kalata-B1



RMSD plot of 2_2_H grafted kalata-B1



RMSD plot of 3_1_S160M grafted kalata-B1



RMSD plot of 4_2_H70N grafted kalata-B1



RMSD plot of 5_1_S5Q grafted kalata-B1




The RMSD calculation was performed with respect to the minimized, equilibrated system of the grafted cyclotide. We noticed that in some instances (1_1_S16I, 3_1_S160M) the peptide remained stable and in this folded state for a long time before starting to unfold. However, the unfolding and linearization of the cyclotide was more pronounced for the others. We believe that unfolding and other unwanted conformational changes could be minimised by using better modelling techniques.

We also noticed a correlation between the length of the grafted peptide and larger RMSD values. For example, 4_2_H70N which is 18 amino acid long shows the highest RMSD value ~ 1nm. A good solution to this would be to shorten the length of the peptides; cutting the ends of the peptide and giving more priority to those residues which have been identified to be crucial for the interaction, like the ones contributing to the H-bonding etc.


References:


1. Lennartz, F et al. Structure-Guided Identification of a Family of Dual Receptor-Binding PfEMP1 that Is Associated with Cerebral Malaria; 2017; Cell Host Microbe 21: 403-414; DOI: 10.1016/j.chom.2017.02.009

2. Douglas, A. D. et al. Neutralization of Plasmodium falciparum merozoites by antibodies against PfRH5. J. Immunol. 192, 245–258 (2014); DOI: 10.4049/jimmunol.1302045

3. Hsieh, F.-L., Turner, L., Bolla, J. R., Robinson, C. V., Lavstsen, T., & Higgins, M. K. (2016). The structural basis for CD36 binding by the malaria parasite. Nature Communications; DOI: 10.1038/ncomms12837

4. Smith, J. D., Rowe, J. A., Higgins, M. K., & Lavstsen, T. (2013). Malaria’s deadly grip: cytoadhesion of Plasmodium falciparum-infected erythrocytes. Cellular Microbiology, 12, 1976–1983; DOI: 10.1111/cmi.12183

5. Lau et al; Structure conservation despite huge sequence diversity allows EPCR binding by the PfEMP1 family implicated in severe childhood malaria; 2014; DOI: 10.1016/j.chom.2014.11.007

6. Lennartz, F. et al. Structural insights into diverse modes of ICAM-1 binding by Plasmodium falciparum-infected erythrocytes; 2019; Proc Natl Acad Sci U S A 116: 20124-20134; DOI: 10.1073/pnas.1911900116