Team:TU Kaiserslautern/Model

Model

In our digital research, we focused on the laccase from Botrytis aclada.

In the lab we used the mutant laccase L499F, which has an phenylalanine as 499th sidechain instead the leucine in the original. This mutant has no crystallographic structure. So we used the mutant L499M, which has as 499th sidechain an methionine.

In the end we make some suggestions how viable our simulation is with the alternative amino acid regarding our project.
Introduction
The simulation is divided into four main parts. First, the structure of our laccase was found. The MD simulation showed the behavior of the laccase in a NaCl waterbox for about 50 ns. In a pKa determination, the charge of the sidechains in the active centrum was determined. With docking simulation, a good beginning position for diclofenac to the laccase for further QM simulation were discovered. The best results are pictured above.
Structure
In this work, the mutant of the laccase from Botrytis aclada should be analyzed. But there was no X-ray crystallography for L499F. We modeled it with Phyre2(Protein Homology/analogy Recognition Engine V 2.0) which was introduced by Prof. Dr. Frankenberg-Dinkel. In order to visualize the entire structure, there are four steps. First, it compares the sequence using multiple sequence alignment with other proteins and makes an initial secondary structure prediction. In the second step, the algorithm compares this result with a database of known structures and reviews if there is another protein which has equal parts as the queried sequence. For any missing parts, it gets the information for folding of similar proteins from the database. In the third step, it models the loops for the missing part with the information from step 2. In the final step, the side chains are added.1 However, this program fails to incorporate copper ions directly. It was suggested to us the pdb-file from the mutant L499M as a very similar Laccase. This Laccase has on sidechain 499 a methionine instead of the phenylalanine. The structure was determined using X-ray crystallography at a resolution of 1.7 Å by Osipov et al..2 It can be found on the website of the NCBI (National Center for Biotechnology Information) with the PDB ID 3V9E.3 The problem with this file is that the first 37 and 405-408 amino acids in the structure are missing,4 so we decided to complete the missing ones to obtain the complete version of the 3D-structure. To obtain the structure with copper ions we made use of www.charmm-gui.org.5 In this program, the pdb-file and CHARMM FF - file should be uploaded. The CHARMM FF - file consists of separate information about topology of the system and the interaction parameters. In our case, the parameters are known for the used types of atoms, so it just has to assign the standard parameters for our protein. For proteins like the laccase the CHARMM FF c36 will be used. For an unknown small molecule like Diclofenac, however, the Charmm General Force Field may be used. To identify the unrecognized small molecule there are two options. The first option is Antechamber, which reduces hydration energies. The other option is ParamChem which compares the penalties between similar known connections of atom types and chooses the lowest. After both options, it generates a new pdb and psf file where the pdb file has the coordinate information of the atoms and the psf-file has the entries which atoms are connection in which way (bond, angle, dihedral, impropers).6,7 With this program we modeled the laccase with the missing amino acids and copper ions and used it in the remainder of this work.

Figure 1: Comparison of the modeled structure with Phyre2 and CHARMM-GUI by overlapping the two using Swiss-PdbViewer


To be sure that the files from Phyre2 of L499F and CHARMM-GUI of L499M are similar, we compared it with Swiss-PdbViewer from the SIB Swiss Institute of Bioinformatics (Guex1997). We uploaded both pdb-files and showed just the histidines for reasons of clarity. Those are the most important to coordinate the copper ions in the protein accurately. After that, we chose three atoms and put them on top of each other in order to show the well alignment. In Figure 1 a strong correlation between these models can be seen, therefore, the structure can be assumed to be realistic. In comparison the new amino acids are organized different. In Figure 2 the structure on the top is from CHARMM-GUI and the structure below from Phyre2 are shown. In the further work the model of CHARM-GUI was used.

Figure 2: Comparison of the modeled new amino acids with Phyre2 and CHARMM-GUI using Swiss-PdbViewer


MD simulation
In a molecular dynamic simulation, proteins can be analyzed with a great spatial and temporal resolution. For that, it is important to know which kind of bonds exist inside a molecule. The distance of the atoms determines if it is a bonded or non-bonded interaction while the latter can be described with the Lennard-Jones-Potential (LJ). The potential seen in Figure 3 is given by the equation:



Figure 3: Lennard-Jones-Potential made with Python8


with:
r: Distance between the atoms
ε: negative value of the minimum of the potential (showed by horizontal line), so if r = r0, here ε = 1
For r < r0 the repulsive interaction dominates. It originates from the Pauli-exclusion principle and is included in the equation as: (σ/r )12
For r > r0 the attractive Van-der-Waals interaction dominates and is included as:-(σ/r )^6
The angles between the bonds important for the simulation are also important. The angle bending is given by:




with:
c: material constant
ν: angle between the bonds
If ν= ν0, the system is in equilibrium.
These are two examples which are defined in the parameter files for the MD simulation which are specific for each used software.9, 10
The following simulations are prepared with VMD plugin QwikMD.11 This is a software which combines the graphical 3D viewer VMD and the simulation software NAMD. In order to run a MD Simulation with NAMD four files are required. The pdbfile has the information about the coordinates of each atom. The psf-file contains the structure information, for example the type of bonding interactions. In the force field parameter file, the values for the force constant have to be defined for each occurring atom type and combination respectively. In the last file which is called configuration file the steps of the simulations are described.12
We created a setup with with consecutive standard steps of minimization, annealing to 300 K, equilibration, and MD in separate configfiles respectively. The first three steps prepare the MD simulation by finding a adequate position. Usually the backbone of the protein are approximately in the right conformation so in the minimization step just the side chains rotates around the backbone and change their angles between their bonds to obtain a minimized potential energy while the backbone is fixed. The temperature in this step is 0 K. To be able to escape from a local minimum to a global minimum, though to cross steric obstacles the system will be heated at intervals in the annealing step and energy will be minimized again in the equilibration step. The amount of steps in each protocol part ensure the protein folding is practically the folding in the global minimum. After those preparations step, the production MD simulation begins.




Figure 4: First running protocol of the laccase in the salt water box, taken from the GUI of VMD plugin QwikMD




a)
b)
Figure 5: Copper ions (orange) outside the laccase in simulation step 654 a) from the one side and b) the opposite side


In the first MD simulation, the protein was solvated in a water box with 0,15molL-1 NaCl. The intention of this simulation is to gain protein stabilization information in this conformation of the CHARMM-GUI model and understand the behavior of the laccase in water. Of key importance is the organization of the copper ions within the laccase and the missing amino acids (0-37).
The protocol was used from figure 4.
As seen in the protocol, the protein has less mobility in the preparation step than in the MD simulation. This is because it is expected that the backbone is already near to the global minimum and otherwise the preparation would take too much time. In the MD simulation, it should be shown whether or not this assumption is justified, though there are no limitations. The result of the simulation shows there was a mistake in the parameter for the copper ions. After few simulation steps, the copper ions were outside of the protein as you can see in Figure 5.
So this simulation is not expedient for obtaining information about the behavior of the copper ions inside the laccase. The mistake was in the psf-file so the copper ions was not on the right beginning position. Instead they were located on (0,0,0). On the other hand, the simulation shows the organization for the missing amino acids.


a)
b)
Figure 6: Analyzing the missing amino acids (green) a) in the first simulation step and b) in the last simulation step with less loops




Figure 7: Second running protocol of the laccase in the salt water box


The amino acids are stick out of the laccase in the first step and have less loops and becomes higher in the final step (step 1146), as seen in Figure 6.
So, this simulation is not expedient to obtain information about the behavior of the copper ions inside the laccase.
To gain more information about the role of the copper ions, the simulation was repeated with altered running protocol. In this simulation, we had the protocol from Figure 7. The copper ions has the information which was normally used for QM simulation because there were none available for classical MD simulation. Therefore, the result of this simulation is just qualitative.
As you can see in the protocol from Figure 7, in addition to the backbone, the copper ions and the proximal atoms are fixed in the first three steps. Afterwards, a second equilibration step takes place where just the backbone is fixed. The MD simulation has no limitation like the simulation before. There are about 50 ns simulated:
25512 * 1000(dcdfreq) * 2fs = 25512 * 2ps = 50,1ns
It is clear the copper ions are located in the laccase in Figure 8.


a)
b)
Figure 8: Copper ions (orange) inside the laccase in simulation a) in the first simulation step and b) in the last simulation step




a)
b)
Figure 9: Analyzing the missing amino acids (green) a) in the first simulation step and b) at the end of simulation (step 25512)


The missing amino acids also had a different structure as seen in Figure 6 and 9. To summarize, the MD simulation shows that the model of CHARMM- GUI can be used for further work. Though the copper ions are organized inside the laccase, the previously missing amino acids changes their structure but does not inhibit the active centrum sterically.
PKa Determination
The pH of the system has a huge effect of the activity of the protein.13 To explain why the activity of the laccase from Botrytis aclada works better at a lower pH, the pKa value of laccase and the charge of the side chains by pH 5 and 7 were analyzed. Most activity of the laccase can be seen at pH 4, however the proteins in lab would denatured in live experiments.14 So we decided to analyse pH5, which should show a higher activity than at pH7 and can be compared in lab. The pKa value is the negative logarithm of the acidity constant Ka, which can be measured by the following equation:

Ka = c(H3O+)*c(Ac-)/c(HAs)
It provides information about how much the sample dissociate in water (the strength of the acidity). The pKa is given as:
pKa = -log(Ka)

If pH = pKa the system is in the area of buffer. That means if there were added a little amount of acid or buffer the pH will not change a lot.15
The changes in charge states of laccase and diclofenac upon binding are related to binding-induced changes in pKa value of ionizable groups. The effect of it is compatible with partial protein unfolding or destabilizing mutations.
So every ionizable side chain in a protein has an effect on the electrostatic interactions between protein and ligand and between protein and solvent. This is the reason why it is important to study the pKa of the laccase and the charge of the side chains in the binding pocket with the ligand.14
For this investigation, the software DelPhiPKa was used.16,17 To obtain the information about the pKa, the software uses the equation:

pKai(protein)= pKai,ref(solvent)+ δ pKai (solvent -> protein)



Figure 10: Titration curve with marked pKa 18


The first term is the pKa value of the reference solvent and the second describes the shift of the residue solvent reference pKa. The other option to obtain the pKa value of a titratable residue is using a titration curve. For both, the electrostatic free energy of the titratable residue in its deprotonated and protonated states are needed, which is calculated. In a accuracy test, 90% of the analyzed proteins had less than 1.0 pH unit error.

The effect of the pH should be considered. So the titration curve of each residue was calculated without copper ions in Figure 10. The calculation with copper ions does not work and the issues could not be fixed in the data before the end of the competition. The charge of the histidine residues of pH = 5 is compared with pH = 7, where the laccase has a residual activity which is mentioned below.14


Figure 11: Charge of the histidine residues as a function of pH


Most important are the proximal residues near the copper ions. They are shown in the Figure 12.

Figure 12: Charge of the histidine residues near the copper ions as a function of pH


Table 1: pKa and charge calculation of the histidine residue by pH 5 and 7
ResiduepKaCharge at pH 5Charge at pH 7
T1 Copper    
 His 4636.820.99590.5098
 His 5316.940.99640.6100
T3 Copper    
 His 1267.220.96660.6678
 His 1685.050.53380.0932
 His 5275.190.54880.1396
T3 Copper Adjacent    
 His 1245.320.55980.1100
 His 1704.890.51350.1654
 His 4666.800.87740.4865
 His 5254.850.48980.1778
The results for pH 5 and 7 are compiled in Tab. 1. The section labeled T3 Copper Adjacent lists the four histidine residues near the T3 Copper which could possibly impact the resulting charge. These results demonstrate there is a higher overall charge at pH 5. It can be assume that the histidine on the T3 coppers pulls the electron from the cysteine much more quickly, so the reactivity is higher.
Docking
In a docking simulation, the most energy-efficient position of a ligand to a protein is investigated. For this task, the docking tool AutoDock was used.22,23 It consists of two subprograms, autodock and autogrid. Autodock is used to perform the docking of the ligand to a set of grids describing the target protein. The other subprogram precalculates these grids.
The estimated position was used as beginning situation for an QM/MM simulation in further works.
In this work, the docking for pH = 5 and pH = 7 was performed with and without copper ions for comparison. For the running step, autodock4 and autodock vina were used.23 Vina and AutoDock use slightly different models and should deliver an optimal position with respect to the underlying energy evaluation functional.24 To identify the global minimum, it can take a lot of time. So the process is limited by the time provided in this years iGEM competition. For accuracy, the time can be increased by expanding the exhaustiveness parameter manually.24 The effect is clarified by a docking simulation with an exhaustiveness of 20 and 80, with a pH value of 5 and accounting the copper in Figure 13. For this, Autodock vina was used.
These provide the most energy-efficient position for each docking simulation and fit perfectly onto each other. The next measurements are compared with an exhaustiveness of 20.
The influence of the copper ions is analyzed in Figure 14. For this, Autodock vina was also used.
The most energy-efficient position, both with and without copper, lay on top of each other.
As mentioned before, the model could be incorrect. Thus, it is comparable with the results from Autodock4. But for Autodock4, not enough parameters were given for the copper ions. It can be correlated without copper and exhaustiveness 20 such as in the comparison above.


Figure 13: Comparison exhaustiveness of 20 and 80 with Autodock vina




Figure 14: Comparison with and without copper from two different perspectives




Figure 15: Most energy-efficient docking from two different views


First the most energy-efficient position with Autodock vina is shown in Figure 15. Next, the three best energy-efficient docking position are shown in Figure 16 through 18. They have just small differences, such as the direction of the COOH in Figure 16a and Figure18a. These positions are compared with the most energy-efficient docking position with Autodock vina in Figure 15.
No position matches the position of Autodock vina in Figure 18. So the other position was tried and the best fitting was the 6th most energy-efficient position of Autodock4 in Figure 19.
However, the match is not perfect.
For a further QM/MM simulation, both options can be used and compared. When comparing both softwares, the tests of Vieira shows Autodock vina has better results by polar ligands.35 Diclofenac and the copper ions are charged in pH = 5 and pH = 7. The result of Autodock vina is more likely than Autodock4, which better supported hydrophobic ligands.


Figure 16: Most energy-efficient docking position from two different perspectives




Figure 17: Second most energy-efficient docking position from two different perspectives




Figure 18: Third most energy-efficient docking position from two different perspectives




Figure 19: Comparison of the running software Autodock vina in green and Autodock4 in blue a) best position of Autodock4 b) second best position of Autodock4 and c) third best position of Autodock4




Figure 20: Comparison with 6th most energy-efficient docking position in green a) overview and b) closeup


Discussion
Because there is a lack of X-Ray crystallographic structure of L499F, we used one provided from L499M as a substitute. In the comparison, the similarity between both can be shown, but their models have a different structure in the initial missing amino acids.

In the MD simulation in a NaCl-Water box, the missing amino acids were demonstrated on the other side of the active centrum. Thus it could be assumed that they have no effect on the reaction itself. Here the difference between L499M and L499F should have no effect, however the stability of the complete protein can be influenced by the different arrangement of the first 30th amino acids.

The pKs of the sidechains for the histidine on the T1 copper should be different due to the altered sidechain on position 499. But in the middle of the protein, on the two T3 coppers, the pKs of the histidine should be about the same. The mutant L499M demonstrated a higher activity at pH5 as compared to pH7, presuming a similar explanation. However, the results in literature show a slightly different activity between these two mutants.25

Also, the docking simulation is just representing how it would work. The change of the sidechain is just next to the diclofenac, therefore it's presumed the most energy efficient docking position is different to the result of the simulation. But it indicates when the parameters for copper ions are elucidated and the X-Ray crystallographic structure of L499F is measured, the beginning position of the Diclofenac for QM simulation could be investigated with this software. A caveat to this is that this software could simply assume a good position, so there could be a change of conformation during the QM simulation.

Conclusion
To sum it up, the active centrum of the laccases are free, so no inhibition by its own sidechain occur. Also the main structure of the histidine, which coordinates the coppers, are very similar between the mutants L499M and L499F, but the stability of these proteins could be different. Moreover, the impact of the pH on activity levels can be explained by the titration curve. The charge of the histidine, which coordinates the copper ions, shifts with the pH. But how intensive this relation is could depend on the side chains, however the histidine on T1 copper likely plays a larger role. For future studies, it would be key to repeat this work when the structure for L499F is investigated. For the docking, the effect of the side chain change should be the most significant. For an accurate QM simulation, this should be also repeated. Unfortunately time was too limited this year with iGEM, but we hope that it can be expanded on in the future. Even obtaining the useful parameters for copper ions would take more than one year, according to the advise of our supervisor René Hamburge. This work supported our findings in our wet lab research and discussions with experts that the copper ions are a huge issue and are lacking given parameters for simulations.
Outlook
When all missing parameters and structures are investigated, these steps could be repeated. Then results could be compared and the effect of changing the side chain better understand. Once this is solved, there is a better chance to create a high efficient mutant with the best fitting sidechain on position 499.
References
[1] Lawrence A Kelley, Stefans Mezulis, Christopher M Yates, Mark N Wass, and Michael JE Sternberg: Trabajo práctico Nº 13 . Varianzas en función de variable independiente categórica. Nature Protocols, 10(6):845–858, 2016. http: //dx.doi.org/10.1038/nprot.2015-053.

[2] Evgeny Osipov, Roman Kittl, Pavel Dorovatovsky, Tamara Tikhonova, Roland Ludwig, and Vladimir Popov: research papers Effect of the L499Mmutation of the ascomycetous Botrytis aclada laccase on redox potential and catalytic properties research papers. pages 2913–2923, 2014.

[3] NCBI: 3V9E: Structure of the L499M mutant of the laccase from B.aclada, 2014. https://www.ncbi.nlm.nih.gov/Structure/pdb/3V9E, visited on 2020-08-05.

[4] NCBI: Sequences and Annotations, 2014. https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html?{&}mmdbid=106841{&}bu=1{&}showanno=1,visited on 2020-08-05.

[5] Sunhwan Jo, Taehoon Kim, Vidyashankara G. Iyer, and Wonpil Im: Software News and Updates CHARMM-GUI: A Web-Based Graphical User Interface for CHARMM. Wiley InterScience, 2008. https://onlinelibrary.wiley.com/ doi/epdf/10.1002/jcc.20945.

[6] Sunhwan Jo, Xi Cheng, ShahidulM. Islam, Lei Huang, Huan Rui, Allen Zhu, Hui Sun Lee, Yifei Qi, Wei Han, Kenno Vanommeslaeghe, Alexander D. MacKerell, Benoît Roux, and Wonpil Im: CHARMM-GUI PDB manipulator for advanced modeling and simulations of proteins containing nonstandard residues. Advances in Protein Chemistry and Structural Biology, 96(September 2017):235–265, 2014.

[7] CGenFF:Welcome to CGenFF, 2018. https://cgenff.umaryland.edu/, visitedon 2020-08-07.

[8] Fred L Van Rossum, Guido and Drake Jr: Python reference manual. Centrum Voor Wiskunde en Informatica Amsterdam, 1995.

[9] Michael P. Allen: Introduction to Molecular Dynamics Simulation. Computational SoftMatter: From Synthetic Polymers to Proteins, 23:1–28, 2004.

[10] Herbert (TU Kaiserslautern/FB Physik) Urbassek: 2.Potentiale. In Biophysik VI,Kaiserslautern, 2019.

[11] João V Ribeiro, Rafael C Bernardi, Till Rudack, John E Stone, James C Phillips, Peter L Freddolino, and Klaus Schulten: QwikMD — Integrative Molecular Dynamics Toolkit for Novices and Experts. Nature Publishing Group, (May):1–14, 2016.

[12] Rafael C Bernardi and Till Rudack: QwikMD - Easy Molecular Dynamics with NAMD and VMD. 2017. http://www.ks.uiuc.edu/Training/Tutorials/qwikmd/qwikmd-tutorial.pdf.

[13] Alexey V Onufriev and Emil Alexov: Protonation and pK changes in protein ligand Binding. Quarterly Reviews of Biophysics, 46(2):181–209, 2013.

[14] Muhammad Bilal, Muhammad Adeel, Tahir Rasheed, Yuping Zhao, and Hafiz M.N. Iqbal: Emerging contaminants of high concern and their enzyme assisted biodegradation – A review. Environment International, 124(December 2018):336–353, 2019.

[15] Helmut (TU Kaiserslautern/FB Chemie) Sitzmann: Allgemeine Chemie für Ingenieure und Biologen. Kaiserslautern, 2016.

[16] Lin Wang, Lin Wang, Min Zhang, and Emil Alexov: DelPhiPKa web server : predicting pKa of proteins , RNAs and DNAs Structural bioinformatics DelPhiPKa web server : predicting p K a of proteins , RNAs and DNAs. Oxford University Press, (October):1–2, 2015.

[17] LinWang, Lin Li, and Emil Alexov: pKa predictions for proteins, RNAs and DNAs with the Gaussian dielectric function using DelPhiPKa. proteins, 83(12):2186–2197, 2015. https://onlinelibrary.wiley.com/doi/epdf/10.1002/prot.24935.

[18] Carlo (StudyHelp GmbH) Oberkönig: Titration. https://www.studyhelp.de/online-lernen/chemie/titration/, visited on 11.08.2020.

[19] Swagata Pahari, Lexuan Sun, Sankar Basu, and Emil Alexov: DelPhiPKa: Including salt in the calculations and enabling polar residues to titrate. proteins,86(12):1277–1283, 2018.

[20] Garrett M Morris, David S Goodsell, Robert S Halliday, Ruth Huey, William E Hart, Richard K Belew, Arthur J Olson, and Morris E T Al: Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function. Journal of Computational Chemistry, 19(14):1639–1662, 1998. https://doi.org/10.1002/(SICI)1096-987X(19981115)19:14{%}3C1639::AID-JCC10{%}3E3.0.CO;2-B.

[21] Michel F. Sanner: Python: a programming language for software integration and Development. Journal of molecular graphics & modelling, 17(1):57–61, 1999.

[21] Michel F. Sanner: Python: programming language for software integration and Development. Journal of molecular graphics & modelling, 17(1):57–61, 1999.

[22] O. Trott and A. J. Olson: AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading. Journal of Computational Chemistry, 31:455–461, 2010.

[23] Vina: AutoDock Vina Manual. http://vina.scripps.edu/manual.html{#}contents, visited on 12.08.2020.

[24] Tatiana F. Vieira and Sérgio F. Sousa: Comparing AutoDock and Vina in ligand/decoy discrimination for virtual screening. Applied Sciences (Switzerland),9(21), 2019.

[25] Scheiblbrandner, S.; Breslmayr, E.; Csarman, F.; Paukner, R.; Führer, J.; Herzog, P. L.; Shleev, S. V.; Osipov, E. M.; Tikhonova, T. V.; Popov, V. O.; Haltrich, D.; Ludwig, R.; Kittl, R. Evolving Stability and PH-Dependent activity of the High Redox Potential Botrytis Aclada Laccase for Enzymatic Fuel Cells. Sci Rep 2017, 7 (1), 13688. https://doi.org/10.1038/s41598-017-13734-0.