Modelling
Overview
Tensile properties of hagfish slime intermediate filament
Tensile properties like elasticity and stiffness are import criteria in the quality assessment of textile materials. It has been reported that native hagfish slime threads exhibit intriguing mechanical performance, which makes it appealing as a potential textile material. In the hydrated state, they exhibit an initial elastic modulus of ~6 MPa and a yield stress of ~3 MPa. After 70% extension, the threads experience a strain hardening process, resulting in an UTS of ~200 MPa and strain failure up to 220% [1]. Another notable feature of these threads is that their mechanical response can be tailored by a draw-processing mechanism as follows: The drawn threads are processed by subjecting hydrated native threads to a certain strain under wet conditions, and then dried afterwards [1]. The process results in a remarkable increase in both stiffness and ultimate tensile strength (UTS).
The molecular basis of the enhanced property during draw-processing
The molecular basis of the enhanced property during draw-processing is thought to be due to strain induced conformational transition from alpha-helical coiled-coil to aligned beta-sheet rich structures [2]. This process is also known as α→β transition.
The α→β transition is a universal deformation mechanism in alpha-helix rich protein materials such as wool, hair, hoof, and cellular proteins. This transition in hagfish slime threads has been confirmed by wide-angle X-ray scattering from previous research [2].
However, the small number of strains examined do not allow previous researchers to provide a detailed account of the α→β transition from the x-ray data.
Therefore, we’d like to employ molecular modeling to carry out a real-time simulation of the α→β transition as a function of strain and estimate the tensile properties of our IF product simultaneously.
Molecular Modeling
Molecular modelling encompasses all theoretical methods and computational techniques used to mimic and study the structure and behaviour of molecules [3]. To explore the structural and functional properties of hagfish slime intermediate filament fibre, we started from studying its building block - the coiled coil IF protein dimer. First, we predicted the three-dimensional structure of IF dimer and then performed molecular dynamics (MD) simulations to reveal the real-time conformational changes during draw-processing and estimate the tensile properties at the same time.
Objectives
- To predict the molecular structure of hagfish slime IF subunits and dimer
- To demonstrate the real-time conformational changes during draw-processing at atomic level
- To estimate the tensile properties of our product as a versatile material
Protein Structure Prediction
Predict protein structures based on sequences
In the iGEM part registry, there are only the amino acid sequences of alpha & gamma subunits but no 3D structures available. And from our interview with Dr. Douglas Fudge, we learned that the terminal regions of IF are inherently elastomeric domains, which may make it difficult to experimentally determine the structure at the atomic-resolution level. Therefore, we turned to the power of computation and algorithms to predict the tertiary structure of IF subunits and dimer.
There are many free software tools to choose from for creating a predicted model of a protein's tertiary structure. The methodologies employed by these software tools can be categorised as either template-based or ab initio protein structure prediction. Two web-servers were used in our prediction. SWISS-MODEL is a structural bioinformatics web-server dedicated to homology modeling, while I-TASSER (Iterative Threading ASSEmbly Refinement) server utilises both template-based threading and ab initio simulation methods to produce its final model [4,5].
Our original plan was to first predict the structure of individual subunits and the next step is to assemble two subunits into one dimer with protein docking. Herein, with the I-TASSER, we input two subunit sequences individually and created two individual jobs.<\p>
However, we later found that the SWISS-MODEL server allows us to input two query sequences together by adding hetero target, and the output is inherently a dimer model, which saves the effort of protein docking. So, we took advantage of this function and got the dimer model in only one step.
Quality assessment of predicted models
Without knowing the protein's native structure we cannot determine the accuracy of a predicted model. There are, however, tools available to assess the hypothetical quality of a protein model. First, we selected those top models scoring the highest in the in-built analysis values of two prediction web-servers. Then, we employed two external quality assessment tools which were ProSA and Molprobity, respectively. ProSA calculates an overall quality score for a specific input structure, and if this score is outside a range characteristic for native proteins, the structure probably contains errors [6]. Molprobity, developed by the Richardson group at Duke university, provides a summary of scoring statistics as shown below, and the green cell means that the model passes one test while the red cell means that it fails [7].
ProSA Evaluations
We can see from the figures above that the z-score of the dimer structure is within the range of scores typically found for native proteins of similar size, while the z-scores of two subunits are beyond the range, which indicates that the dimer model is a more ideal prediction than the other two subunits.
Molprobity Evaluations
The figures above obviously show that the dimer model satisfied much more test statistics than the other two subunits although it’s still not perfect owing to the limitations of in silico prediction.
Conclusion
According to the evaluation results of both tools, we may conclude that the dimer model possesses a better structural quality than the other two subunit models and thus we eventually chose the dimer structure as our prior model for the following molecular dynamics simulations, which also saved the effort of applying protein docking to two individual subunits.
Molecular Dynamics Simulations
Overview
Molecular Dynamics (MD) is a numerical method of simulation, in which the atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamic "evolution" of the system. The trajectories of atoms and molecules are determined by numerically solving Newton's equations of motion for a system of interacting particles, where forces between the particles and their potential energies are often calculated using interatomic potentials or molecular mechanics force fields [8].
With the predicted dimer model, we planned to perform molecular dynamics (MD) simulations to reveal the real-time conformational changes during draw-processing and estimate the tensile properties at the same time. Before the actual simulation takes place, we need to first achieve the solvation, energy minimization, and equilibration in water for our dimer molecule. After getting our molecule prepared, steered molecular dynamics (SMD) is performed to simulate the draw processing. By analysing the trajectories, we are able to visualize the real-time conformational changes and extract the tensile properties of the dimer molecule as an ideal rigid rod model.
For molecular dynamics simulation, our team used the Visual Molecular Dynamics (VMD) software. VMD is designed for modeling, visualization, and analysis of biological systems such as proteins, nucleic acids, lipid bilayer assemblies, etc [9]. In particular, VMD provides embedded scripting languages (Python and Tcl) for the purpose of user extensibility. Here we have collated a detailed explanation of the methods our team used for MD simulation with VMD and uploaded all scripting files onto our GitHub repository. Our aim is that these instructions will provide a reference point, helping future iGEM teams which wish to use VMD software to do molecular dynamics simulation.
Solvation, energy minimization and equilibration in water
Objectives of solvation, minimization and equilibration before SMD
- Approach in vitro aqueous incubation environment
- Many protein structures come from x-ray crystallography at extremely low temperature, so the determined conformation may not be realistic at room temperature [10]. We’d better do the minimization under desired conditions like room temperature.
- Prepare a nice starting structure for the following simulations by releasing constraints
As suggested by VMD user guide, before the actual molecular dynamics simulation, we need to first perform the following three steps to reach the stable conformation of the molecule:
- Minimize the whole system for 6000 steps
- Equilibrate the system in the NPT ensemble (constant atom number N, constant pressure P, and constant temperature T) for 50000 steps (i.e. 100 ps) with protein fixed
- Free all atoms, minimize for another 2000 steps, and run NPT dynamics for 100 ps again
VMD includes many analysis tools and we use these as criteria to determine whether the molecule has reached the equilibrium. One of these, the NAMD Plot plugin, may be used to plot the TOTAL v.s. Timestep curve to show the variation of total free energy of the system during the minimization. Besides, the NAMD log files during minimization record a parameter called gradient tolerance, and normally when the gradient value drops to 1, it indicates the convergence to the minimum [11]. You can also analyze the extent to which your system has equilibrated using what is known as the Root Mean Square Deviation, or RMSD. The RMSD characterizes the amount by which a given selection of your molecule deviates from a defined position in space, and a flattening RMSD curve usually means the system is equilibrated [12].
Step 1: Minimize the whole system for 6000 steps
From the plot (fig.2) we can see that the total energy dropped to the negative range and stayed at the stable negative level afterwards. The gradient value at ts 5997 (fig.3) was close to 1, which also indicates the minimization was done.
Step 2: Equilibrate the system in the NPT ensemble for 50000 steps (i.e. 100 ps) with protein fixed
From the plot (fig.4) we can see that the total energy raised a little bit due to the movement of water molecules and soon reached a stable negative level again.
Step 3: Free all atoms, minimize for another 2000 steps, and run NPT dynamics for 100 ps again
From two plots above (fig.6 & 7), we can see that the total energy of the system reached a stable negative level and the RMSD curve also became flattening at ~1.7 +/-0.2 Angstroms.
Conclusion
We may conclude that the molecular system has reached equilibrium after these three steps and are ready for the following steered molecular dynamics simulation.
Steered Molecular Dynamics
Objectives
- To demonstrate the real-time conformational changes during draw-processing
- To estimate the tensile properties of our product
A tensile test, also known as a tension test, is one of the most fundamental and common types of mechanical testing. A tensile test applies tensile (pulling) force to a material and measures the specimen's response to the stress. By doing this, tensile tests are able to determine many tensile properties such as ultimate tensile strength, Young's modulus, strain-hardening characteristics, etc [13].
However, the tensile test can not account for the conformational changes like the α→β transition of the molecules from a micro point of view. Therefore, we introduced steered molecular dynamics (SMD) simulations into our model. SMD simulations apply forces to a protein in order to manipulate its structure by pulling it along the desired direction [14]. These in silico experiments can not only be used to reveal the real-time structural changes in a protein at the atomic level but also mimic the tensile test to estimate the tensile properties.
Real-time conformational changes at atomic level
In our simulation, we employed the SMD simulations with constant velocity pulling on the dimer molecule. Upon fixing the Cαs of the N-terminus amino acid, the SMD simulation pulled the Cαs of the C-terminus amino acid of two subunits at a constant velocity, whose direction is determined by the mass centers of fixed and pulled atoms, for 80000 timesteps (i.e. 160 ps). So, the dimer molecule would undergo structural deformation, and the VMD would create the trajectories recording the states of all atoms at each timestep.
Results
The animation above shows the elongation and deformation of the dimer molecule. Unfortunately, the VMD software cannot update secondary structure automatically in the animation, so we have to use the console to recalculate the secondary structure manually at each frames.
According to the simulation outcome, we may demonstrate the conformational changes during the draw-processing from a micro point of view:
In the native state, two α-helical proteins wind into coiled-coils. Under stress, the coiled-coils start to unravel into random coils at specific stress values. When loading increases, the random coils are extended further and begin to form extended β-sheet structures with inter-chain hydrogen bonds. Eventually, the majority of α-helices unravel, and β-sheets extend and align, and this process exactly elucidates the α→β transition.
However, due to the limitations of the quality of our dimer model and the in silico force field parameters, we didn't see as much beta-sheet alignment as expected. We will try to refine the simulation quality in the future and give a more accurate and detailed molecular basis of the strain-hardening during draw-processing.
Tensile properties of hagfish slime IF fibres
Apart from explaining the α→β transition at atomic level, we can also regard the dimer molecule as an ideal rigid rod model to estimate the tensile properties of our product.
Hypothesis
The following model is given under the assumption that coiled coil IF protein dimers are the functional units of IFs and possess the same properties as all higher order structures up to the IFs themselves. By regarding the dimer molecule as a rigid rod, the diameter of the cross-section area of the molecule can be approximated to the distance between two fixed atoms while the length of the molecule can be approximated to the distance between centers of mass of fixed and pulled atoms, respectively.
Formulas and Parameters
In this type of simulation, we define the position of the fixed atom and the SMD (pulled) atom, and the pulled atom is attached to a virtual atom via a virtual spring. This virtual atom is moved at constant velocity and then the force between both is measured using:
\begin{equation}U = \frac{1}{2}k\left[vt - \left(\vec{r} - \vec{r_0}\right) * \vec{n}\right]^2\end{equation}
\begin{equation}\vec{f} = -\nabla U = \left( -\frac{\partial U}{\partial x}, -\frac{\partial U}{\partial y}, -\frac{\partial U}{\partial z} \right)\end{equation}
The stress-strain curve is universally used to give the relationship between stress and strain and demonstrate the tensile properties of textile materials. Therefore, we applied the formulas below to extract useful information from the SMD simulations for the stress-strain curve.
\begin{equation}\vec{F} = \vec{f} * \vec{n} = f_x * n_x + f_y * n_y + f_z * n_z\end{equation}
\begin{equation}strain = \frac{\Delta L}{L_0} = \frac{L_i - L_0}{L_0} = \frac{\sqrt{(x_i - x_0)^2 + (y_i - y_0)^2 + (z_i - z_0)^2}}{\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2}}\end{equation}
\begin{equation}\sigma = \frac{F}{A} = \frac{F}{\pi * \left(\frac{d}{2}\right)^2}\end{equation}
Results
With the data we obtained, we could plot the unique stress-strain curve of the dimer molecule (figure 8).
We can see that the curve is approximately J-shape, which means that the molecule exhibits the low initial stiffness and later extreme strain hardening. Region I is a low-stiffness initial region that terminates at a strain of ɛ ≈ 0.7. In region II, stiffness rises dramatically until a strain of ɛ ≈ 2.0.
In comparison with the stress-strain curves of other textile materials (figure 9), we may conclude that the model of our IF fibres initially possess remarkable elasticity as nylon and also exhibit high stiffness similar to spider silk after the draw-processing, which supports the outstanding tensile properties of the hagfish slime intermediate filament as a versatile material.
In addition, if we check the stress-frame curve combined with secondary structures at different frames (figure 10), it could be found that the rise of stiffness is accompanied by the appearance of β-sheet, which is consistent with previous research result that the α→β transition harden the IF fibres during draw-processing.
Colour Mixing
Objectives
- To predict the percentage of chromoproteins required to make a certain colour: so that we can control the resulting colour
- To generate a color chart, so that we can predict the theoretical range of colours possible
Colour Mixing Models
The colour that we see is a result of a combination of different wavelengths of light with different power levels. When a light source shines on an non-luminous object (an object that does not glow), certain wavelengths of the light will be absorbed and the remaining wavelengths of light will be reflected and enter our eyes, which will be interpreted as colours in our brain. Because our eyes only contain three different types of colour-sensitive cells for red, green, and blue light respectively, we can use the three primary colours of light (red, green, and blue light) to produce a wide range of colours. This is the RGB colour model, a subset of additive colour mixing models, which produces colour using a mixture of different wavelengths of light.
In contrast to luminous objects, non-luminous objects do not produce light directly but instead only reflect them. Thus, we apply the principles of subtractive colour mixing to predict the colour of this mixture of reflected light. In our project, we will apply one of the models of subtractive colour mixing : the RYB colour model. In this model, red, yellow and blue are the three primary colours. The RYB model was traditionally used by artists, due to the limitation of the colours of pigments. By combining different amounts of these three primary pigments, we can reproduce a wide range of colours. For example, combining Red and Blue will create Purple. (Figure. 1)
Hypothesis
The chromoproteins used in our project are non-fluorescent, which means it does not emit light. Thus, we try to apply the RYB color model to predict and reproduce different colors by mixing different amounts of chromoproteins with different colors.
To simplify the color mixing model, we have made some assumptions. The chromoproteins are treated as pigments that attach to our IF at one terminus. And IF threads are assumed to be white. In total, we have the three primary colors plus white. We also assume that chromoproteins of different colors will evenly distribute within the hagfish fiber after the polymerization of the fiber and the interference among the chromoproteins are negligible.
Assumptions
- IF thread is treated as white
- Chromoproteins of different colors will evenly distribute within the hagfish fiber after the polymerization
Major Targets in our color mixing system
This table demonstrates how we planned to do colour mixing of Intermediate filaments.
Factors Affecting our results
There are three factors that may influence the color of the fiber:
- The type of chromoprotein attached
- Relative percentage of each chromoprotein attached
- Absolute percentage of each chromoprotein in the fibre
The relative percentage here means the relative ratio of different chromoproteins in the fiber. The absolute percentage of a chromoprotein means the percentage of IF that has been attached with a specific type of chromoprotein.
To help illustrate, the HSL (hue, saturation, lightness) color system is used. In terms of the three parameters in the HSL system, we can say that the relative percentage determines the hue while the absolute percentage determines the color intensity or saturation. Hue here means a general color like red while the color intensity or saturation here means how intense or saturated the color is. For example, red is more saturated than pink. The hue will change when the relative percentage of chromoprotein changes. The color intensity will increase when the absolute percentage of each chromoprotein increases but the relative percentage remains unchanged. Lightness depends on the amount of light in the environment so it is not modeled.
Refining Our Model
To determine the color of an object precisely, we can use an instrument called a spectrophotometer to measure the absorption and emission spectra of the object. And using specific computer programs, we can render the color that humans can understand from this spectra. These experimental data will help us refine the color mixing model, and better estimate the color of the fiber. However, we do not have that data now. So we only have a rough prediction for the color based on subtractive color mixing.
References
[1] D. S. Fudge, K. H. Gardner, V. T. Forsyth, C. Riekel, and J. M. Gosline, “The Mechanical Properties of Hydrated Intermediate Filaments: Insights from Hagfish Slime Threads,” Biophysical Journal, vol. 85, no. 3, pp. 2015–2027, 2003.
[2] J. Fu, P. A. Guerette, A. Pavesi, N. Horbelt, C. T. Lim, M. J. Harrington, and A. Miserez, “Artificial hagfish protein fibers with ultra-high and tunable stiffness,” Nanoscale, vol. 9, no. 35, pp. 12908–12915, 2017.
[3] “Molecular modelling,” Wikipedia, 20-Oct-2020. [Online]. Available: https://en.wikipedia.org/wiki/Molecular_modelling. [Accessed: 24-Oct-2020].
[4] A. Waterhouse, M. Bertoni, S. Bienert, G. Studer, et al., “SWISS-MODEL: homology modelling of protein structures and complexes,” Nucleic Acids Research, vol. 46, no. W1, 2018.
[5] J. Yang, R. Yan, A. Roy, D. Xu, J. Poisson, and Y. Zhang, “The I-TASSER Suite: protein structure and function prediction,” Nature Methods, vol. 12, no. 1, pp. 7–8, 2014.
[6] M. Wiederstein and M. J. Sippl, “ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins,” Nucleic Acids Research, vol. 35, no. Web Server, 2007.
[7] V. B. Chen, W. B. Arendall, J. J. Headd,et al., “MolProbity: all-atom structure validation for macromolecular crystallography,” International Tables for Crystallography, pp. 694–701, 2012.
[8] J. Gelpi, A. Hospital, R. Goñi, and M. Orozco, “Molecular dynamics simulations: advances and applications,” Advances and Applications in Bioinformatics and Chemistry, p. 37, 2015.
[9] W. Humphrey, A. Dalke, and K. Schulten, “VMD: Visual molecular dynamics,” Journal of Molecular Graphics, vol. 14, no. 1, pp. 33–38, 1996.
[10] B. Halle, “Biomolecular cryocrystallography: Structural changes during flash-cooling,” Proceedings of the National Academy of Sciences, vol. 101, no. 14, pp. 4793–4798, 2004.
[11] “Gradient descent,” Wikipedia, 16-Oct-2020. [Online]. Available: https://en.wikipedia.org/wiki/Gradient_descent. [Accessed: 24-Oct-2020].
[12] M. I. Sadowski, “Protein Structure Comparison Methods,” Encyclopedia of Biophysics, pp. 2055–2060, 2013.
[13] “Tensile testing,” Wikipedia, 10-Oct-2020. [Online]. Available: https://en.wikipedia.org/wiki/Tensile_testing. [Accessed: 24-Oct-2020].
[14] D. Ma, “A Steered Molecular Dynamics Study On Elastic Behavior Of Polyethylene Chains,” Acta Polymerica Sinica, vol. 008, no. 5, pp. 448–453, 2008.
Bibliography
I-TASSER
- Roy, A., Kucukural, A., and Zhang, Y. (2010). I-TASSER: a unified platform for automated protein structure and function prediction. Nature Protocols, 5, pp.725-738
- Yang, J. Yan, R., Roy, A., Xu, D., Poisson, J., and Zhang Y. (2015). The I-TASSER Suite: Protein structure and function prediction. Nature Methods, 12, pp.7-8
- Yang, J., and Zhang, Y. (2015). I-TASSER server: new development for protein structure and function predictions. Nucleic Acids Research, 43, pp.174-181
SWISS-MODEL
- Arnold, K., Bordoli, L., Kopp, J. and Schwede, T. (2005). The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling. Bioinformatics, 22(2), pp.195-201.
- Benkert, P., Biasini, M. and Schwede, T. (2011). Toward the estimation of the absolute quality of individual protein structure models. Bioinformatics, 27(3), pp.343-350.
- Bienert, S., Waterhouse, A., de Beer, T., Tauriello, G., Studer, G., Bordoli, L. and Schwede, T. (2016). The SWISS-MODEL Repository—new features and functionality. Nucleic Acids Research, 45(D1), pp.D313-D319.
- Waterhouse, A., Bertoni, M., Bienert, S.,et al. (2018). SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Research.
ProSA
- Wiederstein & Sippl (2007) ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Research 35, W407-W410.
- Sippl, M.J. (1993) Recognition of Errors in Three-Dimensional Structures of Proteins.Proteins 17, 355-362.
Molprobity
- Christopher J. Williams, Jeffrey J. Headd, Nigel W. Moriarty, et al.(2018) MolProbity: More and better reference data for improved all-atom structure validation. Protein Science 27: 293-315.