Team:KCL UK/Structural Modelling

Structural Modelling of Pvfp-5β:


To understand how our chosen Mussel Foot Protein (MFP), Pvfp-5β, adheres to a variety of surfaces, we needed a structural model. First, using the Phyre-2 web tool (Kelley et al., 2015), we generated an initial structural model from our protein sequence (UniProt ID: U5Y6U9) and known templates. Phyre-2 is an automatic fold recognition server that builds a homology model from a target sequence and known template sequences. Figure 1 shows the homology model we generated through Phyre-2: our target sequence was that of Pvfp-5β (U5Y6U9), and our template was the 'Human Notch-1 EGFs 11-13' protein (PDB ID: 2VJ3). We generated this homology model because the structure of our target sequence is currently unresolved, and its entry on the UniProtKB database is unreviewed. Although no NMR or X-ray structure has been determined, we remain confident in our model because of its high query coverage (84%) and identity confidence (99.7%) with our template, 2VJ3, which was resolved by X-ray diffraction at a resolution of 2.60Å.

Figure 1: 'Cartoon' depiction of the Phyre-2 generated model of Pvfp-5β (blue). Three disulphide bonds are shown as 'spheres'. From left to right; C25-C36, C48-C57, C126-C135.

To further develop our Phyre-2 generated homology model, we conducted Molecular Dynamics (MD) simulations. However, we were not able to gain access to a computing cluster, so extensive simulations were not possible. Our motivation for conducting MD simulations was simple: as we could not gain access to a laboratory due to COVID-19, we could not perform in vitro tests on our protein. We therefore set out to simulate our structurally modelled protein in order to understand it better. MD simulations are widely used to examine the behaviour of proteins and their conformational landscapes, and they provide an important link between structure and function by allowing exploration of the complete protein system in a given (simulated) environment.

Issues Encountered:

In 2019, our supervisors Professor Annalisa Pastore and Dr Caterina Alfano published a paper exploring recombinant Pvfp-5β as a potential bioadhesive. In this paper, using a Mass Spectrometry based approach, they identified four disulphide bonds: C51-C60, C65-C76, C70-C87, and C89-C98 (Santonocito et al., 2019). Our Phyre-2 generated Pvfp-5β model contained only three disulphide bonds (refer to Figure 1 for the residues involved). This was inconsistent with the literature we were using for guidance, and meant we were missing one final disulphide bond. The disulphide bonds in our own model were present at similar residues to those in the model from Prof Pastore and Dr Alfano (when accounting for differences in residue numbering between the two sequences). However, notable discrepancies remained between our model and the one from Santonocito et al. (2019). This could be because we used a different target sequence for our homology model. We were confident that our missing bridge lay between residues C86 and C95 for two reasons. Firstly, these residues are labelled as a disulphide bond on the U5Y6U9 UniProtKB entry. Secondly, when superimposing our template (2VJ3) on our generated model, there is a disulphide bond on 2VJ3 (C478-C487) exactly where residues C86 and C95 lie on our model, when adjusting for slight differences in residue numbering (Figure 2).

Figure 2: Zoomed-in 'cartoon' depiction of the superimposition of the Phyre-2 generated homology model of Pvfp-5β on the template model, the 'Human Notch-1 EGFs 11-13' protein (PDB ID: 2VJ3). 2VJ3 disulphide bonds are labelled in grey. From left to right; C456-C467, C461-C476, C478-C487, C494-C505. The Pvfp-5β disulphide bond C48-C57 is labelled in black. Residues C86 and C95 are implicated in forming a disulphide bond, as per the 2VJ3 template and other data (from UniProtKB). All disulphide bonds are depicted as 'sticks'.

To resolve this issue, our supervisors put us in contact with Dr Erika Földesné Dudás, a postdoctoral researcher at King’s College London, and Dr Ladislav Hovan, a postdoctoral researcher in the Protein Dynamics Group of the Francesco L. Gervasio Lab at University College London. Initially, Dr Földesné Dudás advised us to perform an energy minimisation using the GROMACS MD package (Berendsen, van der Spoel and van Drunen, 1995) to bring our desired residues closer together, and then to manually impose the bond using the PDB Reader (Jo et al., 2014) on the CHARMM-GUI web tool (Jo, Kim, Iyer and Im, 2008). We ran the energy minimisation using the OPLS-AA/L all-atom force field (Jorgensen and Tirado-Rives, 1988) and the SPC water model, with 10 Cl- counter-ions to neutralise our system. This method failed to generate our bond, as the residues remained too far apart (12.7Å), as seen in Figure 3A. Dr Hovan provided us with technical assistance with GROMACS, as our team members had no previous experience of running MD simulations. He advised us to link our desired cysteine residues manually through GROMACS by invoking the -ss option of the pdb2gmx tool, which lists all cysteine-cysteine pairings that have the potential to form a disulphide bond. Because our desired cysteine residues were too far apart to be recognised by the software, we had to alter the distance at which GROMACS recognises a disulphide bond. We changed this distance from 2.0Å to 13.0Å by editing the respective value in the specbond.dat file; this file (found in the GROMACS directory) outlines the criteria that must be met by certain atoms on specific residues for a particular bond to be created.
Although this did create our desired disulphide bond between residues C86 and C95 (after running another energy minimisation), we failed to realise that our original disulphide bonds at residues C25-C36, C48-C57 and C126-C135 would no longer be recognised: following our changes to the specbond.dat file, they were now too close together to be considered bonds by the -ss option. We reversed the changes we had made to the specbond.dat file and ran a third minimisation using the same parameters. Our original disulphide bonds (C25-C36, C48-C57, C126-C135) were re-formed, and once again we were left without the disulphide bond at residues C86-C95. Although we no longer had our desired bond, these cycles of energy minimisation had reduced the distance between the two cysteine residues from 12.7Å to 4.5Å, as can be seen in Figure 3B. The residues were now close enough for us to consider a different approach to imposing the bond, as we lacked access to a computing cluster on which we could run a full MD simulation using GROMACS.
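The effect of editing specbond.dat can be illustrated with a short Python sketch. It assumes that GROMACS accepts a cysteine pair only when the sulphur-sulphur distance lies within roughly 10% of the reference length stored in specbond.dat (in nm); this tolerance is our understanding of the behaviour rather than something we verified, and the function name and the 2.05Å length used for an existing bond are illustrative choices.

```python
def is_recognised(distance_angstrom, reference_nm, tolerance=0.10):
    """Return True if an S-S distance falls within +/- tolerance of the reference length.

    The 10% tolerance is an assumption about how the reference length is applied.
    """
    distance_nm = distance_angstrom / 10.0  # specbond.dat lengths are given in nm
    return abs(distance_nm - reference_nm) <= tolerance * reference_nm


# Distances in angstrom: two reported in the text, one assumed typical value.
pairs = {
    "C86-C95 before minimisation": 12.7,
    "C86-C95 after minimisation": 4.5,
    "existing bond (assumed 2.05 A)": 2.05,
}

# 0.2 nm is the original 2.0 A setting; 1.3 nm is the edited 13.0 A setting.
for reference_nm in (0.2, 1.3):
    for label, distance in pairs.items():
        print(reference_nm, label, is_recognised(distance, reference_nm))
```

Under this assumption, raising the reference length to 13.0Å admits the 12.7Å pair but excludes bonds that are already formed, which matches what we observed.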

Figure 3: A) Zoomed-in 'cartoon' representation of Pvfp-5β (blue) showing the distance between residues C86 and C95 (12.7Å). B) Zoomed-in 'cartoon' representation of Pvfp-5β after successive energy minimisation simulations using GROMACS. The distance between residues C86 and C95 is shown (4.5Å).

Resolution of Structural Modelling Problems:

We got in touch with Dr Andrew Beavil from King’s College London for a number of reasons, including structural modelling. He advised us that, as our cysteine residues were not too far apart (4.5Å), we could use the YASARA graphical modelling software (Krieger and Vriend, 2014) to create a ‘spring to atom’ force field constant that would bring the bonding sulphurs closer together. After creating this constant, we ran a very short MD simulation through YASARA, which generated our desired bond. We then ran a final energy minimisation through YASARA to settle the dynamics of our protein system, giving our finalised model of Pvfp-5β with all four disulphide bonds (Figure 4 and Figure 5). Additionally, Pvfp-5β contains tyrosine residues that undergo post-translational modification to form 3,4-dihydroxyphenylalanine (DOPA). These residues underpin the adhesion mechanism of Pvfp-5β. Our structural model allows us to visualise their spatial arrangement, providing context for which regions of the protein are essential for adhesion. These can be visualised in Figures 6 and 7.
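The idea behind the spring restraint on the two sulphurs can be sketched in one dimension. This is not YASARA's implementation: the force constant, step size and 2.0Å target separation are assumed values chosen for illustration, and the update is a simple steepest-descent step rather than a real MD integrator.

```python
def spring_energy(d, d0, k):
    """Harmonic restraint energy for a separation d with rest length d0."""
    return 0.5 * k * (d - d0) ** 2


def pull_together(d_start, d0=2.0, k=1.0, step=0.1, n_steps=50):
    """Follow the spring force downhill and return the separation at every step."""
    d = d_start
    path = [d]
    for _ in range(n_steps):
        force = -k * (d - d0)  # negative gradient of the spring energy
        d += step * force      # steepest-descent update (not an MD integrator)
        path.append(d)
    return path


# Start from the 4.5 angstrom separation reached after the GROMACS minimisations.
path = pull_together(4.5)
print(f"start {path[0]:.2f} A, end {path[-1]:.2f} A")
```

The separation shrinks steadily towards the rest length, which is the behaviour the restraint is meant to produce before the bond is imposed.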

Figure 4: 'Cartoon' depiction of the Pvfp-5β (blue) model resulting from successive simulations using GROMACS and YASARA. All four disulphide bonds are depicted as 'spheres'. From left to right; C25-C36, C48-C57, C86-C95, C126-C135.
Figure 5: Video of our finalised structural model of Pvfp-5β (blue) from Perna viridis, depicted as 'cartoon' and visualised using PyMOL. All four disulphide bonds are depicted as 'spheres'. From left to right; C25-C36, C48-C57, C86-C95, C126-C135.
Figure 6: 'Cartoon' depiction of Pvfp-5β (blue). All Tyrosine (Y) residues are shown as 'sticks' (purple).
Figure 7: Zoomed-in 'cartoon' depiction of Pvfp-5β (blue). Tyrosine (Y) residues Y88, Y90, Y91, Y101, Y102, Y119 and Y122 (purple) are shown more clearly as 'sticks'.

iGAM & Future Perspectives:

Now that we have generated a final structural model using GROMACS and YASARA, our aim is to carry out physicochemical analysis of the protein using the YASARA software, at the suggestion of Dr Andrew Beavil. Although we were unable to complete this work this summer, we will do so next year during Phase II of our project. This research will supplement our wet lab studies and give us a deeper understanding of how our protein works and how it can be manipulated for our purposes. We will therefore continue our in silico work alongside our experiments.

Another element of our project that has played a central role in the mussel foot protein subgroup is research into protein engineering and how we can improve our protein for use as a bioadhesive. Due to limited literature regarding our particular isoform, it was difficult to understand how we could properly carry out in silico protein design. However, a recent publication has greatly improved our understanding of Pvfp-5β, and we will use this research as we design our future experiments. In August, we attended the After iGEM seminar hosted by Andrew Symes on the iGAM software and how it can be used to predict protein mutations. iGAM is R-based software that allows one to create a genetic algorithm to predict which mutations suit a particular need. After watching this seminar, we were very interested in how we could use iGAM to predict which mutations could increase the adhesiveness of our protein. After reaching out to Andrew, it became clear that this was possible. For the remainder of the summer, Andrew worked closely with us and prepared the iGAM software for our protein. However, we were unable to determine a fitness function for our particular need because of the aforementioned lack of literature regarding our specific protein isoform. We will therefore work further with this algorithm next year and use it to carry out mutagenesis studies as we begin to construct our protein polymer in vitro. We are extremely grateful for all of this help this year and are glad that we were able to contribute to the development of this software. Additionally, we have made the iGAM script pertaining to Pvfp-5β open source so other teams can use it to fulfil their needs. This code can be found on our GitHub.
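To show what a genetic algorithm over protein mutations involves, here is a minimal Python sketch. It is not the iGAM code or its interface (iGAM is written in R). The starting peptide is a made-up example rather than the Pvfp-5β sequence, and the fitness function (a tyrosine count) is only a placeholder for the adhesion-related score we have not yet been able to define. Cysteine positions are locked so that the disulphide pattern is left alone.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_fitness(seq):
    """Placeholder score: number of tyrosines. Not a validated adhesion measure."""
    return seq.count("Y")


def mutate(seq, rng, rate=0.05, locked=frozenset()):
    """Point-mutate each unlocked position with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if i not in locked and rng.random() < rate else aa
        for i, aa in enumerate(seq)
    )


def crossover(a, b, rng):
    """Single-point crossover between two equal-length sequences."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(start, fitness, generations=30, pop_size=20, seed=0, locked=frozenset()):
    """Evolve variants of `start` and return the fittest sequence found."""
    rng = random.Random(seed)
    population = [start] + [mutate(start, rng, locked=locked) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # the top half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng, locked=locked)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


start = "GCYKAGCSYNPCG"  # hypothetical peptide for illustration only
locked = frozenset(i for i, aa in enumerate(start) if aa == "C")
best = evolve(start, toy_fitness, locked=locked)
print(start, toy_fitness(start))
print(best, toy_fitness(best))
```

Because the best sequences are carried over unchanged each generation, the returned sequence never scores worse than the starting one. The real difficulty lies in replacing toy_fitness with a meaningful measure.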

We have further aims for the engineering of our protein and will build upon our structural model in the months to come. On the engineering side, we aim to support our model and semi-rationally design a more thermostable protein through consensus design. There is a wealth of knowledge regarding mussel foot proteins derived from other mussel species (Bivalvia). Our aim is to support our model by constructing phylogenetic trees and researching further into the structure-function relationships of Mfp-5 (the fifth protein to be secreted by the mussel). Through our evolutionary analyses, we aim to carry out consensus sequence design to improve the biological activity (adhesion) and stability of our protein. We were led towards this form of protein design and engineering by Dr Mark Pfuhl, who also provided us with further ideas as to how we could improve the adhesion of our protein. Although typically used for antibody design, phage display assays are well suited to designing proteins with maximised interactions with other proteins. Phage display assays are based on physical connections (Hammers and Stanley, 2014), and our aim is to adopt this technology for our protein. Again, we will design these experiments next year once we are able to access the lab.
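The core step of the consensus design mentioned above, taking the most common residue at each aligned position, can be sketched in Python as follows. The aligned sequences are invented placeholders rather than real Mfp-5 homologues, and a real workflow would start from a proper multiple sequence alignment and would need a rule for tied columns.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Return the most common non-gap residue in each column of an alignment."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)
        # Keep a gap only when every sequence has a gap at this position.
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


# Hypothetical aligned fragments, for illustration only.
homologues = ["CYKGYC", "CYRGYC", "CFKGYC", "CYK-YC"]
print(consensus(homologues))
```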

Beyond our structural modelling, we have carried out extensive research into how our protein will be implemented in our system. We have developed protocols for several experiments as part of our engineering cycle; these can be found on our Engineering Success page. We have also designed a genetic system that will allow us to polymerise the protein in vitro, and determined how we will treat our protein polymer and use it to coat our scaffold. More information regarding this can be found on our Composite Parts page. Our experimental design is extensive, and we are excited to work further with this protein in the future. To learn more about mussel foot proteins in general, please see our Mussel Foot Proteins page.


  • Berendsen, H., van der Spoel, D. and van Drunen, R., 1995. GROMACS: A message-passing parallel molecular dynamics implementation. Computer Physics Communications, 91(1-3), pp.43-56.
  • Hammers, C. and Stanley, J., 2014. Antibody Phage Display: Technique and Applications. Journal of Investigative Dermatology, 134(2), pp.1-5.
  • Jo, S., Cheng, X., Islam, S., Huang, L., Rui, H., Zhu, A., Lee, H., Qi, Y., Han, W., Vanommeslaeghe, K., MacKerell, A., Roux, B. and Im, W., 2014. CHARMM-GUI PDB Manipulator for Advanced Modeling and Simulations of Proteins Containing Nonstandard Residues. Advances in Protein Chemistry and Structural Biology, pp.235-265.
  • Jo, S., Kim, T., Iyer, V. and Im, W., 2008. CHARMM-GUI: A web-based graphical user interface for CHARMM. Journal of Computational Chemistry, 29(11), pp.1859-1865.
  • Jorgensen, W. and Tirado-Rives, J., 1988. The OPLS [optimized potentials for liquid simulations] potential functions for proteins, energy minimizations for crystals of cyclic peptides and crambin. Journal of the American Chemical Society, 110(6), pp.1657-1666.
  • Kelley, L., Mezulis, S., Yates, C., Wass, M. and Sternberg, M., 2015. The Phyre2 web portal for protein modeling, prediction and analysis. Nature Protocols, 10(6), pp.845-858.
  • Krieger, E. and Vriend, G., 2014. YASARA View - molecular graphics for all devices - from smartphones to workstations. Bioinformatics, 30, pp.2981-2982.
  • Ou, X., Xue, B., Lao, Y., Wutthinitikornkit, Y., Tian, R., Zou, A., Yang, L., Wang, W., Cao, Y. and Li, J., 2020. Structure and sequence features of mussel adhesive protein lead to its salt-tolerant adhesion ability. Science Advances, 6(39).
  • Santonocito, R., Venturella, F., Dal Piaz, F., Morando, M., Provenzano, A., Rao, E., Costa, M., Bulone, D., San Biagio, P., Giacomazza, D., Sicorello, A., Alfano, C., Passantino, R. and Pastore, A., 2019. Recombinant mussel protein Pvfp-5β: A potential tissue bioadhesive. Journal of Biological Chemistry, 294(34), pp.12826-12835.