What is this measurement?
When working with molecular dynamics data, our team needed a way to characterize a protein's dynamics. We generated dynamics data for our proteins and then characterized it with a Gaussian Process Dynamics Model. We have simplified this modelling scheme into a deployable R script that lets teams measure their proteins' dynamics across a simulation.
As seen in the diagram above, GausHaus works by fitting the protein's motion along a single axis to a Gaussian process. Once the GausHaus measurement has been completed on all three principal axes (x, y, and z), we have three variances, one per axis. These can be placed into a vector and compared with the dynamics of other proteins.
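As a rough illustration of that last step, the sketch below places three per-axis variances into a vector and compares two proteins. The numbers are made up, the function name `variance_distance` is hypothetical, and the use of Euclidean distance is our assumption; the GausHaus script may compare vectors differently.

```r
# Hypothetical per-axis variances for two proteins.
# These are made-up numbers for illustration, not GausHaus output.
protein_a <- c(x = 0.42, y = 0.31, z = 0.18)
protein_b <- c(x = 0.40, y = 0.55, z = 0.20)

# Euclidean distance between two variance vectors.
# The choice of metric is an assumption; other distances would also work.
variance_distance <- function(u, v) {
  stopifnot(length(u) == length(v))
  sqrt(sum((u - v)^2))
}

d_ab <- variance_distance(protein_a, protein_b)
print(d_ab)
```

A smaller distance would suggest that the two proteins move similarly along the three axes, while a larger one would flag a difference worth a closer look.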
Keeping Aberrance at Bay
To ensure the integrity and reproducibility of the generated data, we put several safeguards in place. The first was setting the seed before the measurement begins. Calling set.seed() in R fixes the measurement's random components, so the same data will always give the same measurement.
As a further check, the measurement uses only a subset of the data rather than the complete set. The GausHaus results can then be checked against the obtained data to confirm that they are representative.
Finally, we made sure that every piece of software used to generate data anywhere in the workflow uses the same units and scales, so the findings are not skewed by a rogue x10^15 multiplier.
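The effect of seeding can be seen in a minimal sketch like the one below. The function `noisy_measurement` is a toy stand-in with a random step, not the GausHaus script, and the seed value and synthetic trace are arbitrary choices for illustration.

```r
# Toy stand-in for a measurement with a random component.
# This is NOT the GausHaus script, only an illustration of seeding.
noisy_measurement <- function(values, seed = 42) {
  set.seed(seed)                             # fix the random number stream
  idx <- sample(length(values), size = 50)   # random draw, now repeatable
  var(values[idx])
}

# Synthetic coordinate trace standing in for one axis of a simulation.
trajectory_x <- sin(seq(0, 20, length.out = 500))

first_run  <- noisy_measurement(trajectory_x)
second_run <- noisy_measurement(trajectory_x)

# Same data and same seed give exactly the same result.
print(identical(first_run, second_run))
```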
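One way such a subset check could look is sketched below. It uses synthetic data and the plain sample variance as a placeholder for the GausHaus measurement; the 20% subset size and the column names are assumptions, not values taken from the script.

```r
set.seed(1)

# Synthetic per-frame coordinate along one axis (placeholder for real data).
frames <- data.frame(frame = 1:1000, x = rnorm(1000, mean = 0, sd = 2))

# Take a random one-fifth of the frames; the fraction is an arbitrary choice.
subset_idx <- sample(nrow(frames), size = nrow(frames) %/% 5)

# Sample variance stands in for the real measurement here.
subset_var <- var(frames$x[subset_idx])
full_var   <- var(frames$x)

# Relative difference between the subset result and the full-data result.
rel_diff <- abs(subset_var - full_var) / full_var
print(rel_diff)
```

A small relative difference would indicate that the subset is representative of the full data set along that axis.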
GausHaus is built to be as reliable as the simulation it measures.
How to Use GausHaus For Your Proteins
To generate your own measurements with GausHaus, you will need the following:
-A molecular dynamics simulation file
-PyMOL installed on your computer
-The GausHaus script, available on the iGEM Calgary GitHub
Load your molecular dynamics simulation file into PyMOL and generate Excel representations of the simulation.
Read the generated Excel files into R and store them as one large data frame.
Run the GausHaus GitHub script on the imported data to get results.
Interrogate the results and turn them into a graphic or numeric value that suits your needs.
Boast to your wet lab about how easy it was to complete the measurement.
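The import and measurement steps above might be sketched as follows. The file names, the column names (frame, x, y, z), and the per-axis sample variance are all assumptions made for illustration; the variance is only a placeholder for the GausHaus script itself. The sketch writes small CSV files so it can run on its own, and for real .xlsx exports readxl::read_excel() could take the place of read.csv().

```r
# Write two small synthetic exports so the sketch runs on its own.
# In practice these would be the files generated from PyMOL.
export_dir <- tempfile("gaushaus_demo_")
dir.create(export_dir)
for (i in 1:2) {
  demo <- data.frame(
    frame = 1:5,
    x = i + (1:5) / 10,
    y = (1:5) / 20,
    z = (1:5) / 30
  )
  write.csv(demo, file.path(export_dir, sprintf("export_%d.csv", i)),
            row.names = FALSE)
}

# Read every export and stack them into one large data frame.
files <- list.files(export_dir, pattern = "\\.csv$", full.names = TRUE)
trajectory <- do.call(rbind, lapply(files, read.csv))

# Placeholder for the GausHaus script: per-axis sample variance.
axis_summary <- sapply(trajectory[c("x", "y", "z")], var)
print(axis_summary)
```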
We see GausHaus initially being used as a substitute for backbone RMSD measurements, which gives it an easy avenue to serve iGEM teams in the years to come. Another area where this algorithm could have an impact is in computationally intensive genetic algorithms in need of a brutal fitness function. It can also be used to compare and contrast simulations quantitatively.
Now this was all fun and dandy, but for the math and guts of this measurement, please go to the GausHaus page.