Team:UofUppsala/Model



Dry lab


On this page we present the results of our modelling approaches. This includes our systems in silico, molecular dynamics, stochastic modelling, and DIY models.

Introduction To Modelling in NANOFLEX


Modelling is, in essence, taking a real-world object or process and creating a computer environment whose parameters represent the natural forces that govern the system in question. During development of NANOFLEX we relied mainly on two kinds of modelling: one to develop and investigate the system in its proof-of-concept form, and a second, more advanced model to predict how the NANOFLEX platform could be improved and extended to detect different molecular targets.

The general strategy in modelling is to predict the behaviour of the reactions and the results of the various components of the project. Doing this before the project is completed makes it easier to change the concept without wasting materials and time on an unsuccessful prototype. In our project, modelling has been used conservatively: to test feasibility and to envisage the molecular and physical features of the system before it is built.

Figure 1. The dry lab and wet lab are integrated, as the workflow scheme shows. First, the modular parts (a nanobody and its corresponding target protein) are chosen and docked using docking software (e.g. ZDOCK, HADDOCK, ROSIE). Molecular dynamics then provides the wet lab with sequences of optimized nanobodies and provides affinity data to the stochastic modelling. The affinity data, together with experimental data, are used to build stochastic models, which in turn inform the final kit with the modified sensory cells.

Future Applications of NANOFLEX

Tuberculosis (also known as TB) is a contagious disease that targets the lungs. If left untreated, the disease kills the host, on average, five years after detection. TB claims 2-3% of lives every year, and the disease is only detected in the third and fourth stages, when treatments are expensive and offer low cure rates. Our goal is to detect TB in its early stages without the need for repeated visits to clinics and hospitals for blood tests.

Our modelling included an investigation into whether NANOFLEX could be used to detect TB in an arbitrary sample. To test protein detection, we applied parts of our system to a heat shock protein, HSP16.3, a biomarker secreted by Mycobacterium tuberculosis.

A quick search through the Protein Data Bank (PDB) yields a variety of nanobodies which could be excellent candidates for detecting a range of viral, bacterial, and other protein targets. The possibilities for NANOFLEX are limited only by our imaginations!

Our systems in silico


Molecular Dynamics

Molecular dynamics (MD) is a computational method that can be used to create a deterministic model of large macromolecules, especially proteins, to investigate their properties. In our project, to investigate the properties of small, single-domain antibodies called nanobodies, we used a steered molecular dynamics simulation combined with umbrella sampling to determine the binding energy between a nanobody of interest and our intended target.

To the right we see a short MD simulation of the equilibration of the HSP16.3 dimer in water. Equilibration relaxes the protein towards a low-energy (most natural) state before it is used in another simulation.

Docking

To make sure our model represents reality, we first had to determine a plausible docking pose to use as a starting point. Using experimental data from a study on HSP16.3 (1), we derived a ranking of the nanobodies of interest and compared it with the relative energies produced by a docking screen. The docking pose whose relative energies most closely matched the experimental ranking was taken as the correct pose and used for further investigation.
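The comparison between an experimental ranking and the docking energies of each pose can be sketched as a rank correlation. The snippet below is only an illustration of that idea, assuming Spearman correlation as the measure of agreement; the pose names, the four nanobodies, and all energy values are hypothetical and are not our data.

```python
from typing import Dict, List


def ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks (smallest value gets rank 1), averaging ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks.

    Assumes neither list is constant (otherwise the variance is zero).
    """
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def best_pose(experimental: List[float], pose_scores: Dict[str, List[float]]) -> str:
    """Return the pose whose scores rank the nanobodies most like experiment."""
    return max(pose_scores, key=lambda pose: spearman(experimental, pose_scores[pose]))


# Hypothetical values for four nanobodies (more negative = tighter binding).
experimental_energy = [-11.0, -9.5, -8.0, -7.2]
docking_scores = {
    "pose_A": [-40.1, -52.3, -38.7, -45.0],
    "pose_B": [-55.2, -50.4, -44.9, -41.3],
    "pose_C": [-47.0, -46.1, -51.8, -39.5],
}
print(best_pose(experimental_energy, docking_scores))
```

A rank-based measure is used here because docking scores and experimental energies are on different scales, so only the ordering of the nanobodies is compared.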

Figure 2. The docking produces a large array of possible configurations. The symmetrical structure is the HSP16.3 protein from Mycobacterium tuberculosis, shown with a variety of possible positions for the BF-10 nanobody to bind to a potential site on HSP16.3.
Figure 3. Three representations of the BF-10 nanobody in a prospective binding pose. A shows the protein tertiary structure as ribbons, B shows a wireframe surface with the approximate volume and points of contact for the nanobody and the HSP16.3 protein, and C shows a map of electrostatic potential on the surface of both proteins.

Steered MD

The procedure for the steered MD was adapted from the umbrella sampling tutorial by Justin A. Lemkul, Ph.D. (2,3). In this procedure, the complex of nanobody and HSP16.3 was solvated in water, equilibrated, and then slowly pulled apart in a steered MD simulation. The benefit of steered MD is that it lets you observe a dynamic process that could not be simulated spontaneously, because such processes occur on time scales that are too long. The pulling force ensures that the two proteins separate, and it highlights flexible parts of each protein that might not be seen without it. The video above shows the 5 ns steered MD simulation.
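The pulling in a steered MD run of this kind can be pictured as a harmonic spring whose anchor point moves away at a constant rate. The sketch below only illustrates that idea; the function names are our own, and the force constant, pull rate, and distances are hypothetical values, not the settings of our simulation.

```python
def pull_reference(initial_distance_nm: float, rate_nm_per_ps: float, time_ps: float) -> float:
    """Position of the moving spring anchor (in nm) at a given time."""
    return initial_distance_nm + rate_nm_per_ps * time_ps


def pull_force(
    distance_nm: float,
    initial_distance_nm: float,
    rate_nm_per_ps: float,
    time_ps: float,
    force_constant: float,
) -> float:
    """Harmonic pulling force in kJ mol^-1 nm^-1.

    The force is proportional to how far the current centre-to-centre
    distance lags behind the moving anchor. force_constant is given in
    kJ mol^-1 nm^-2.
    """
    reference = pull_reference(initial_distance_nm, rate_nm_per_ps, time_ps)
    return force_constant * (reference - distance_nm)


# Hypothetical settings: proteins start 4 nm apart, anchor moves at 0.001 nm/ps.
K = 1000.0       # kJ mol^-1 nm^-2 (assumed)
RATE = 0.001     # nm per ps (assumed)
START = 4.0      # nm

# After 1 ns (1000 ps) the anchor sits at 5 nm; a complex still at 4.5 nm
# feels a force pulling it further apart.
print(pull_force(4.5, START, RATE, 1000.0, K))
```

A stiffer spring or a faster pull rate drags the proteins apart more quickly but perturbs the system more, which is why the pull is kept slow.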


Umbrella Sampling

To determine the binding energy of the nanobody to HSP16.3, configurations were extracted from the pulling trajectory at 0.2 nm intervals; each was analyzed individually and its potential energy determined. Once all the energies are collected into a graph, a slope and a plateau can be identified, and the binding energy is approximated as the energy at the point where the plateau begins.
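Picking configurations at 0.2 nm intervals can be sketched as follows: for each window centre along the pulling coordinate, take the trajectory frame whose centre-to-centre distance is closest. This is a minimal illustration under our own assumptions; the function name and the list of distances are hypothetical, and in practice the distances would come from the steered MD trajectory.

```python
from typing import List


def select_windows(distances_nm: List[float], spacing_nm: float = 0.2) -> List[int]:
    """Pick one frame index per umbrella window.

    Window centres start at the smallest distance in the trajectory and are
    spaced spacing_nm apart. For each centre, the frame with the closest
    distance is chosen; a frame is never selected twice.
    """
    start, end = min(distances_nm), max(distances_nm)
    # Small tolerance so that an exact multiple of the spacing is not lost
    # to floating-point rounding.
    n_windows = int((end - start) / spacing_nm + 1e-9) + 1
    selected: List[int] = []
    for window in range(n_windows):
        centre = start + window * spacing_nm
        frame = min(
            range(len(distances_nm)),
            key=lambda i: abs(distances_nm[i] - centre),
        )
        if frame not in selected:
            selected.append(frame)
    return selected


# Hypothetical centre-to-centre distances (nm), one per saved frame.
distances = [4.00, 4.01, 3.99, 4.05, 4.12, 4.21, 4.33, 4.41, 4.52, 4.60, 4.79, 5.02]
frames = select_windows(distances)
print(frames)
print([distances[i] for i in frames])
```

Early frames can share almost the same distance because the proteins have not yet started to separate, so only one of them is kept for the first window.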

Graph 1. Potential energy derived from frames spread over the total distance pulled during the simulation. The plateau around 5 nm indicates that the binding energy of this complex is approximately 10 kcal mol⁻¹.

Because the distance is measured between the center atoms of the two proteins, they start around 4 nm apart. This also causes the artifact seen before the initial slope, as different time points can correspond to the same distance coordinate until the two proteins start moving apart. The plateau around 5 nm tells us that the binding energy of this complex is approximately 10-11 kcal mol⁻¹, which differs from the value we expected. Further docking studies would be needed to find a docking pose that better represents the actual interaction between these two proteins.
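Reading a binding energy off such a curve can be sketched as the difference between the plateau level and the lowest point of the profile. The snippet below is an illustration only: the function name, the choice of where the plateau starts, and every data point are hypothetical and are not the values behind Graph 1.

```python
from typing import List, Tuple


def binding_energy_from_profile(
    profile: List[Tuple[float, float]], plateau_start_nm: float
) -> float:
    """Estimate the binding energy from (distance_nm, energy) pairs.

    The estimate is the mean energy of all points at or beyond
    plateau_start_nm minus the minimum energy of the whole profile.
    """
    plateau = [energy for distance, energy in profile if distance >= plateau_start_nm]
    if not plateau:
        raise ValueError("no points at or beyond the plateau start")
    minimum = min(energy for _, energy in profile)
    return sum(plateau) / len(plateau) - minimum


# Hypothetical profile in kcal/mol, sampled every 0.2 nm.
example_profile = [
    (4.0, 0.0), (4.2, 2.5), (4.4, 5.5), (4.6, 8.0),
    (4.8, 9.8), (5.0, 10.4), (5.2, 10.6), (5.4, 10.5),
]
print(binding_energy_from_profile(example_profile, plateau_start_nm=5.0))
```

Averaging over the plateau rather than taking a single point makes the estimate less sensitive to noise in any one window.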

Tools

Here is a list of the tools, online resources, software, and computing resources we used during this project. By taking advantage of as many different tools as possible, we were able to compare different approaches and verify results between software packages.

  • ZDOCK webserver (4,5) used to generate a variety of starting poses.
  • HADDOCK webserver (6) used to refine poses for determination of the correct pose.
  • SPOTON webserver (7,8) used to determine interacting residues and hotspots for use in residue scanning during nanobody optimization.
  • ROSIE webserver (9-11) used for examination of docking poses to evaluate the most likely docking pose.
  • Schrödinger Maestro (12) used to manipulate and modify structure files, as well as to visualize structures used during the project.
  • Schrödinger Bioluminate (13-15) used for residue scanning and protein-protein docking.
  • VMD, Visual Molecular Dynamics (16) used for analysis of trajectory files from GROMACS.
  • GROMACS (17) suite of software used to perform MD simulations.

The computations were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) under Project SNIC g2020015.

Stochastic Modelling


Stochastic modelling is a tool for studying the distribution of probable outcomes under given conditions with variable input. In computational biology, it can be used to describe and predict biological processes, e.g. predicting the potential changes in certain cellular pathways in response to a change in environment or stimulus. In our project, we used a stochastic model to predict synthesis of our reporter protein and so simulate our biosensor in action (18).

We used a stochastic model of protein synthesis to predict the production of our reporter molecule, red fluorescent protein. With caffeine as the trigger in our proof-of-concept system, the model shows how every component in the pathway changes over time and predicts the final concentration of fluorescent protein.

StochPy

StochPy is a Python package designed for stochastic simulation. It uses Python's scientific libraries and PySCeS (Python Simulator for Cellular Systems), which are helpful tools for simulating the reactions happening inside a cell. Its functions can be used to produce time series that display how a system changes through probabilistically occurring events. A comprehensive user guide on the StochPy web page provides instructions and demonstrations on how to use the software (19).

Figure 4. Schematic of the equations in the stochastic model

StochPy model of the DBD system

The model contains parameters that describe the kinetics and propensities of the system. To ensure reliable predictions, we took initial parameter values from the literature and then fitted them to our experimental data (20-25). First, the dimerization rate and dimer degradation rate were estimated from laboratory data. Because dimer formation and activation are reversible, the reverse and activation rates were also estimated from our experimental data. The mRNA synthesis rate, the rate of translation into protein, and the timeframe of protein maturation were taken from the literature. The model predicted concentrations of dimer, mRNA, protein, and mature protein that closely matched the experimental data. This stochastic simulation of the DBD process could be used for further simulation and investigation into ways to improve NANOFLEX.
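The kind of simulation described above can be sketched as a plain-Python Gillespie (direct method) loop, used here in place of StochPy so that the example is self-contained. The reaction set is a simplified reading of the description (promoter binding and activation are folded into a single transcription step), and every species name, rate constant, and initial copy number is a hypothetical placeholder, not one of our fitted values.

```python
import random
from typing import Dict, List, Tuple

State = Dict[str, int]

# Hypothetical rate constants (per second); NOT the fitted parameters.
RATES = {
    "dimerize": 0.001,    # 2 M -> D
    "dissociate": 0.01,   # D -> 2 M
    "transcribe": 0.01,   # D -> D + mRNA
    "mrna_decay": 0.005,  # mRNA -> nothing
    "translate": 0.05,    # mRNA -> mRNA + P
    "mature": 0.001,      # P -> F (mature fluorescent protein)
    "dilute": 0.0005,     # F -> nothing
}

# Copy-number change caused by each reaction, in the same order as RATES.
CHANGES: List[State] = [
    {"M": -2, "D": +1},
    {"M": +2, "D": -1},
    {"mRNA": +1},
    {"mRNA": -1},
    {"P": +1},
    {"P": -1, "F": +1},
    {"F": -1},
]

INITIAL: State = {"M": 50, "D": 0, "mRNA": 0, "P": 0, "F": 0}


def propensities(s: State, k: Dict[str, float]) -> List[float]:
    """Reaction propensities for the current copy numbers."""
    return [
        k["dimerize"] * s["M"] * (s["M"] - 1) / 2,  # number of monomer pairs
        k["dissociate"] * s["D"],
        k["transcribe"] * s["D"],
        k["mrna_decay"] * s["mRNA"],
        k["translate"] * s["mRNA"],
        k["mature"] * s["P"],
        k["dilute"] * s["F"],
    ]


def gillespie(initial: State, k: Dict[str, float], t_end: float, seed: int = 0) -> List[Tuple[float, State]]:
    """Run one stochastic trajectory up to t_end seconds."""
    rng = random.Random(seed)
    state = dict(initial)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        a = propensities(state, k)
        total = sum(a)
        if total == 0:
            break  # nothing can react any more
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            break
        # Choose which reaction fires, with probability a[i] / total.
        threshold = rng.random() * total
        cumulative = 0.0
        for index, value in enumerate(a):
            cumulative += value
            if threshold < cumulative:
                break
        for species, delta in CHANGES[index].items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


final_time, final_state = gillespie(INITIAL, RATES, t_end=500.0, seed=1)[-1]
print(final_time, final_state)
```

Running the function with several seeds gives the spread of outcomes, which is the information a deterministic model would not provide.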

Graph 2. A set of stochastic simulations with different parameters. The caffeine concentration was varied, as was the background noise caused by dimerization of the nanobodies in the absence of caffeine: 0 on the first row, 25 on row 2, and 50 on row 3. Values for k1, the rate constant for binding of the DBD dimer to the promoter, were also varied. Amounts are given as copy numbers (molecules per cell). Time is measured in seconds for all simulations.

We ran a range of simulations with a variety of parameters and amounts of caffeine and DBD dimers. For most combinations of factors, the simulation equilibrated at around 3000 copies per cell of the mature fluorescent protein, which corresponds to a visible signal. The values of 4, 472, and 4720 copies per cell were chosen because they correspond to the concentrations tested in the lab: 1 µM, 10 µM, and 100 µM. The stochastic models match the wet-lab data, with very low signal at 1 µM of caffeine and a strong signal at all concentrations higher than 1 µM. Further investigations could examine each parameter individually to refine the fit and find the exact levels that produce the optimal response.

DIY models


If you’d like to check out the model and its inner workings, go to this link, then open the python_scripts folder. All our StochPy models are there, with all the data you need to make your own. Don’t be shy! Get your hands dirty!

To access the competition submission of our code, go to this link.

References


  1. Trilling, A. K. et al. A broad set of different llama antibodies specific for a 16 kDa heat shock protein of Mycobacterium tuberculosis. PLoS One 6, (2011).
  2. Umbrella Sampling. http://www.mdtutorials.com/gmx/umbrella/index.html.
  3. Lemkul, J. A. & Bevan, D. R. Assessing the stability of Alzheimer’s amyloid protofibrils using molecular dynamics. J. Phys. Chem. B 114, 1652–1660 (2010).
  4. Pierce, B. G. et al. ZDOCK server: Interactive docking prediction of protein-protein complexes and symmetric multimers. Bioinformatics 30, 1771–1773 (2014).
  5. Pierce, B. G., Hourai, Y. & Weng, Z. Accelerating protein docking in ZDOCK using an advanced 3D convolution library. PLoS One 6, (2011).
  6. Van Zundert, G. C. P. et al. The HADDOCK2.2 Web Server: User-Friendly Integrative Modeling of Biomolecular Complexes. J. Mol. Biol. 428, 720–725 (2016).
  7. Moreira, I. S. et al. SpotOn: High Accuracy Identification of Protein-Protein Interface Hot-Spots. Sci. Rep. 7, 1–11 (2017).
  8. Melo, R. et al. A Machine Learning Approach for Hot-Spot Detection at Protein-Protein Interfaces. Int. J. Mol. Sci. 17, 1215 (2016).
  9. Lyskov, S. et al. Serverification of Molecular Modeling Applications: The Rosetta Online Server That Includes Everyone (ROSIE). PLoS One 8, e63906 (2013).
  10. Chaudhury, S. et al. Benchmarking and analysis of protein docking performance in Rosetta v3.2. PLoS One 6, 22477 (2011).
  11. Lyskov, S. & Gray, J. J. The RosettaDock server for local protein-protein docking. Nucleic Acids Res. 36, 233–238 (2008).
  12. Schrödinger Release 2020-3: Maestro, Schrödinger, LLC, New York, NY, 2020.
  13. Zhu, K. et al. Antibody structure determination using a combination of homology modeling, energy-based refinement, and loop prediction. Proteins Struct. Funct. Bioinforma. 82, 1646–1655 (2014).
  14. Salam, N. K., Adzhigirey, M., Sherman, W., Pearlman, D. A. & Thirumalai, D. Structure-based approach to the prediction of disulfide bonds in proteins. in Protein Engineering, Design and Selection vol. 27 365–374 (Oxford University Press, 2014).
  15. Beard, H., Cholleti, A., Pearlman, D., Sherman, W. & Loving, K. A. Applying Physics-Based Scoring to Calculate Free Energies of Binding for Single Amino Acid Mutations in Protein-Protein Complexes. PLoS One 8, e82849 (2013).
  16. Humphrey, W., Dalke, A. & Schulten, K. VMD: Visual molecular dynamics. J. Mol. Graph. 14, 33–38 (1996).
  17. Van Der Spoel, D. et al. GROMACS: Fast, flexible, and free. J. Comput. Chem. 26, 1701–1718 (2005).
  18. Chang, H. J., Mayonove, P., Zavala, A. et al. A Modular Receptor Platform To Expand the Sensing Repertoire of Bacteria. ACS Synth. Biol. 7, 166–175 (2018).
  19. StochPy. [online] http://stochpy.sourceforge.net/ (Accessed October 25, 2020)
  20. Edelstein, A. D. et al. Advanced methods of microscope control using muManager software. J. Biol. Methods 1, (2014). doi: 10.14440/jbm.2014.36.
  21. Balleza, E., Kim, J. M. & Cluzel, P. Systematic characterization of maturation time of fluorescent proteins in living cells. Nat. Methods 15, 47–51 (2018).
  22. Ballegeer, M. et al. Glucocorticoid receptor dimers control intestinal STAT1 and TNF-induced inflammation in mice. J. Clin. Invest. 128, 3265–3279 (2018). doi: 10.1172/JCI96636.
  23. Titolo, S., Brault, K., Majewski, J., White, P. W. & Archambault, J. Characterization of the minimal DNA binding domain of the human papillomavirus e1 helicase: fluorescence anisotropy studies and characterization of a dimerization-defective mutant protein. J. Virol. 77, 5178–5191 (2003).
  24. Schwanhäusser, B., Busse, D., Li, N. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).
  25. Taniguchi, Y. et al. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329, 533–538 (2010).