<!DOCTYPE html>

### Overview

Modeling is an extremely important part of synthetic biology. A successful model can get and promote the implementation of the project, as well as the summary and feedback of the work.

#### The goal of our modelling:

·Exploring a model to predict the Tm of PETase mutants.

·Achieves high prediction of changes of dynamic stability of proteins at a high temperature

### Calculation of Tm based on FoldX

After amino acids for PETase mutation based on FoldX, we get the results of the thermal stability, where one of the most important parameters is ΔΔG. But role of ΔΔG on the improvement of thermal stability (Tm) is unknown. Therefore, we quantified this effect by building a mathematical model.

Before modeling, we made the following assumptions:

1.The folding free energy of the protein is determined only by the interaction predicted by FoldX.

2.Tm predicted by FoldX is positively correlated with experimental Tm.

3.On the premise that the prediction of FoldX is correct, ΔΔG and ΔTm are negatively correlated.

#### Fitting Tm predicted by FoldX with experimental Tm

ΔGfoldx is the predictive folding free energy，each subterm represents the interaction the FoldX considers. N represents the folding state of the protein, U represents the unfolding state of the protein. It can be concluded that:

When T = Tm, K = 1, Δ Gfoldx = 0. Based on it, we calculate Tm based on FoldX by make Δ Gfoldx = 0, and get corresponding Tm.

First, we collected data on variants of PETase from published literature (data see Table2). Then the unfolding free energy (ΔG) of protein was calculated with software FoldX. Making Δ Gfoldx = 0, then we got the corresponding Tm. Finally, the correlation coefficient between the Tm values predicted by FoldX and the Tm values experimentally measured were analyzed. As shown in Fig. 1 (B), there is a positive correlation between the experimental Tm values and calculated Tm values with relatively high accuracy (correlation coefficients R^{2}~0.9399). The regression analysis indicates that the confidence is over 99% and the regression equation is significant (due to 0.9399>0.8745, confidence test table is shown in Table 3).

This indicates that the prediction based on FoldX has great accuracy and our model is right. And this model based FoldX can guide our work.

Then we considered whether such a model is scalable. Therefore, we collected data on variants of Lipase A from published literature (data see Table1) and applied this method to Lipase A and found that it can also construct a regression line (see Fig. 1 (A), R^{2}~0.9209>0.7603, confidence greater than 99.9%). This suggests that our method may be applied to more enzymes.

Figure 1. Linear fitting between prediction Tm and experimental Tm.

(A)Variants of Lipase A from Bacillus subtilis

(B)Variants of PETase

#### Fitting ΔΔG calculated by FoldX with experimental ΔTm

Since there is a good linear relationship between Tm values predicted by FoldX and Tm values experimentally measured, we speculate that there is also a linear relationship between the unfolding energy change (ΔΔG) and ΔTm.

Then we used the energy function of FoldX for calculating the unfolding energy change (298K) in accurate manner between the wildtype and a variant of PETase.

After that, the correlation coefficient between the ΔTm values of the mutant and the calculated ΔΔG values of the mutants were analyzed. As shown in Fig. 2 (B), there is a negative correlation between the ΔTm values and ΔΔG values with relatively high accuracy (correlation coefficients R^{2}~0.9143>0.8745, confidence greater than 99%). Again, we extended this method to Lipase A and found that it can also construct a regression line (see Fig. 2 (A), R^{2}~0.8822>0.7603, confidence greater than 99.9%). And then we fit ΔΔG calculated at 343K by FoldX with experimental ΔTm of PETase (see Fig. 2 (C), R^{2}~0.917>0.8745, confidence greater than 99%).

This indicates that the prediction based on FoldX has great accuracy and linear fitting with PETase is different from Lipase A. And the straight line of mutation at different temperatures is slightly different (Fig 2.B and C).

Figure 2. Linear fitting between ΔTm and ΔΔG .

(A) Mutation for Lipase A’ from Bacillus subtilis at 298K

(B) Mutation for PETase at 298K

(C) Mutation for PETase at 343K

All these results guide us to maintain a uniform temperature when using FoldX for mutation and estimate Tm. And it indicates that our method may have a broader meaning and can be applied to more enzymes.

Table 1. Variants of Lipase A from Bacillus subtilis

Table 2. Variants of PETase

Table 3. Critical table for testing correlation coefficients

References:

[1]Hyeoncheol Francis Son, In Jin Cho, Seongjoon Joo, Hogyun Seo, Hye-Young Sagong, So Young Choi, Sang Yup Lee, and Kyung-Jin Kim. Rational Protein Engineering of Thermo-Stable PETase from Ideonella sakaiensis for Highly Efficient PET Degradation. ACS Catalysis 2019 9 (4), 3519-3526

[2]Q. Li, Y. Yan, X. Liu, Z. Zhang, J. Tian, N. Wu, Enhancing thermostability of a psychrophilic alpha-amylase by the structural energy optimization in the trajectories of molecular dynamics simulations, International Journal of Biological Macromolecules (2019), doi: https://doi.org/10.1016/j.ijbiomac.2019.10.004

[3]Rathi PC, Jaeger K-E, Gohlke H (2015) Structural Rigidity and Protein Thermostability in Variants of Lipase A from Bacillus subtilis. PLoS ONE 10(7): e0130289. doi:10.1371/journal.pone.0130289

[4]Andreas S. Bommarius , Mariétou F. Paye. Chem. Soc. Rev.,2013,42, 6534

[5]http://foldxsuite.crg.eu/products#foldx

### Molecular Dynamics Simulation

#### The Significance of Molecular Dynamics Simulation

Molecular dynamics (MD) simulation is demonstrated to do better in predicting the thermostability changes of the mutated enzymes than using static-state information of the enzymes，offering insights into molecular details underlying conformational changes of biomolecules that is inaccessible to experiments. Therefore, we use MD simulation to achieve high prediction of changes of dynamic stability of proteins at a high temperature.

#### Basic principles of molecular dynamics simulation

（1）Extracting the Dynamics of a Molecular System: The Time Evolution Algorithm

In a classical MD framework, Newton’s second law of motion rules the dynamics, stating that the acceleration a that a particle is subject to at time t depends on the total force F acting on the particle itself and on its mass m :

As the acceleration a(t) is the second derivative of the position r(t) with respect to time, given the initial position and velocity of the particle (r(t0), v(t0)), their temporal evolution can be computed by integrating a(t) =F(t)/m as follows:

In the case of complex biomolecular systems with many particles and multiple interactions acting between them, it is impossible to integrate analytically Eq. 2, while a different and feasible approach consists in discretizing Eq. 1. The idea is to consider very short time steps of length Δt so that in such intervals the forces are (almost) constant, and the integration of Eq. 2 becomes trivial.A careful choice of the values to integrate allows reducing the approximations derived from such an approach. For example, choosing the velocity value at time t0+ Δt/2 (and not at t0) reduces the error to orders of (Δt)4(rather than (Δt)2). This framework is at the basis of the leap-frog algorithm, which is used in the vast majority of MD simulations engines:

This algorithm can thus “solve” every possible Newton equation at the expense of precision.

（2）Force filed

The total force F acting on the ith particle is equal to the resultant force of all the particles around it：

The interaction force F between particles can be obtained by deriving the distance between particles by potential function：

Force fields for classical MD simulations provide the expression of the potential energy of a system. Thus, they determine the forces employed in Newton’s law ruling the dynamics. They usually rely on the breakdown of interactions into several independent, additive, and derivable terms, identified on an empirical physical basis.

Covalent (Bonded) Interactions

Covalent interactions are modeled with potential energy terms representing bond stretching, angle bending, and improper and proper dihedral angle torsion. The functional forms of potential energy functions aim a simplified, classical description of the atomic motion of molecules.

In the GROMOS force field, this translates in the equations displayed in Table 1, where for proper dihedrals, the convention states that φijkl is the angle between the (i, j, k) and (j, k, l) planes; with i, j, k, and l being four subsequent atoms, for example, along a protein backbone. A value of zero for φijkl corresponds to a cis configuration and π to a trans. The integer n denotes the number of equally spaced energy minima available in a 360°turn. The same conventions hold for improper dihedrals ξijkl, which are used to ensure ring planarity and control the chirality of tetrahedric centers. It must be noticed that these types of potentials cannot model bond breaking: for this, more sophisticated descriptions are needed.

Table 1．Equations used for covalent bonds simulated in GROMOS force field

Nonbonded Interactions

Nonbonded interactions include the short-range Pauli repulsion, the “mid”-range van der Waals attraction, and the long-range electrostatic term.

The first two terms can be modeled together by a Lennard–Jones potential. Its functional form, describing the interaction between two neutral atoms at distance r, models the long-range dispersion with an r-6behavior typical of the dipole–dipole interactions found in noble gases (London dispersion forces), while the Pauli term is represented by an r-12behavior to ease the computation in relation with the previous one:

Two parameters, ε and σ, tune the interaction strength and the equilibrium distance, respectively. They are parametrized by fit to experimental data and are specific of each pair of atom species.

The Coulomb energy between two charges q1 and q2 at distance r is represented by the Coulomb law:

with ε0 being the dielectric constant of vacuum and εr the relative dielectric constant, introduced to properly take into account the screening provided by the material surrounding the object.

Indeed, as electrons are not present in classical force fields, the screening effect due to the atom polarization must be modeled with this mean-field approach.

Finally, force fields designed to describe biomolecules are parametrized to describe our PETase^{MUT}-solution system and PETase^{WT}-solution system at high temperature (343K). The evaluation results and the analysis of it can be seen on our Results page.

Reference:

[1]Marzuoli I., Fraternali F. (2021) Molecular Simulations Guidelines for Biological Nanomaterials: From Peptides to Membranes. In: Ryadnov M. (eds) Polypeptide Materials. Methods in Molecular Biology, vol 2208. Humana, New York, NY.https://doi.org/10.1007/978-1-0716-0928-6_6

[2]Rapaport D C , Blumberg R L , Mckay S R , et al. The Art of Molecular Dynamics Simulation[J]. Computing in ence & Engineering, 2002, 1(1):70-71.

[3]Dror RO et al (2012) Biomolecular simulation: a computational microscope for molecular biology. Annu Rev Biophys 41(1):429

[4]Andersen, Hans C . Molecular dynamics simulations at constant pressure and/or temperature[J]. Journal of Chemical Physics, 1980, 72(4):2384-2393.

[5]Chen J , Iii C L B . Can molecular dynamics simulations provide high-resolution refinement