INTRODUCTION
A scientific model is a simplification of reality used to predict the behaviour of a real-life
system. It is an educated guess that scientists use to make a prediction about real life. With these
models or simulations, insights into the modelled system are gained. In the past years, models have
become an important part of scientific research since they save researchers’ time and costs. After
studying, the model can predict how a system will behave when working with it and changing some
parameters.
During our project, we have performed different kinds of modelling in order to predict and
optimize the performance of our SINISENS biosensor. In order to achieve this, we have used
different
softwares: MATLAB, Rosetta and SWISS-MODEL. Our genetic circuit
consists of a promoter that controls the expression of our MphR transcription factor and ermC
gene, which gives resistance to macrolide antibiotics. When there are no macrolides present, MphR
binds to the pMphR and represses the expression of the gene controlled by this promoter,
egfp. Consequently, if there are no macrolides present there is no output signal, this is, no
fluorescence. On the other hand, if there are macrolides present in the sample, these bind to the
MphR, forming a complex incapable to bind to the pMphR. Thus, if macrolides are present, egfp
gene is expressed, so the biosensor emits fluorescence (Fig. 1).
MATLAB: MATHEMATICAL MODELLING
Mathematical modelling consists of describing a physical system with mathematical functions and
properties. It usually starts with an examination of a physical system to determine how it works.
The end goal of such process is to find a set of functions that describe the physical system as
precisely as possible. This set of functions is called a mathematical model. Mathematical
models can be as simple as just one function, but they can contain tens or even hundreds of
functions that all affect the output. A simple example of such a model could be the trajectory of a
thrown ball which can be simplified to one quadratic function. Mathematical models can be useful
during the design of a new physical system. For example, a mathematical model of a flying ball can
tell at what angle the ball needs to be thrown to reach a required distance.
In synthetic biology, mathematical models can be very beneficial. For example, they can describe how
output measured from biosensor changes when parameters of a system are tuned. These parameters might
include biological constants, such as promoter strength, dissociation constant, and protein
degradation rate.
Our motivation behind mathematical modelling was to achieve three goals we had set to ourselves:
- Estimate the range of macrolide concentration at which our biosensor could be used.
- Get insights on how we could improve the sensitivity of the biosensor.
- Calibrate our biosensor.
To achieve these goals, we built two different mathematical models. We decided that mathematical modelling was a proper way to achieve these goals due to the fact that biological systems such as pathways can be described with mathematical functions and the output is easily comparable as it is numerical.
Estimating the Range of the Biosensor
We built our first model to determine the range of the biosensor. To achieve this, we used
experimental data from literature and applied it to the Hill equation. In biochemistry, the Hill
equation describes the relationship between ligand and receptor. We used this equation to plot the
percentage of receptors bound to macrolides as a function of macrolide concentration. We used data
from wildtype as well as mutated Escherichia coli [1], since we wanted to get insights on how
mutations affect the MphR range. In addition, we plotted macrolide concentrations in wastewater to
see if these were inside the range of the biosensor.
The maximum concentration of macrolides in effluent wastewaters is not inside the range of the
biosensor. In fact, our biosensor would need to be 10-100 times more sensitive than the best
performing circuit found from literature to detect the low concentrations of macrolides found in
wastewaters (Fig. 2).
These results were crucial to move one step further in our modelling: we decided that we were
planning to use Rosetta to predict mutations in the binding site of MphR that increase the binding
affinity of MphR to macrolide antibiotics. Furthermore, since the sensitivity had to be heavily
increased, we arrived at the conclusion that we would need to tune our system even more and we
settled the third main goal of our project, to concentrate macrolide antibiotics inside our
biosensor cells.
To conclude, this first model highlighted the importance of modelling as it gave us insight to
prepare for
this problem well before obtaining any experimental data from the lab.
Biosensor Functionality
Introduction and Motivation
The goal of the second model was to understand how the biosensor worked. This would help us to gain
insights on how the sensitivity could be improved by discovering possible bottlenecks in the
pathway. In addition, the mathematical model could help us during the calibration of the final
device by estimating the output in different scenarios.
To build this model, we used MATLAB’s add-on application, SimBiology.
SimBiology is a tool designed for systems biology; this is
mathematical analysis of biological systems. This app allows building models interactively with a
graphical interface in addition to the traditional method of programming with MATLAB-language. The
graphical interface of SimBiology, referred to as block diagram editor, allows building
complicated biological systems visually. Thus, it requires little to no programming experience to be
able to build a mathematical model.
System and Reactions
In our modelled genetic system (Fig. 3) mphR is transcribed and translated. Once there is MphR
protein in the cell, it binds to the pMphR promoter, repressing the expression of egfp. If
there are macrolide antibiotics present, they will enter the cell and bind to the MphR. This results
in MphR unbinding from the promoter and the egfp can be expressed.
This model consists of two parts: a sample of wastewater and an E. coli cell in it. In SimBiology, these are referred to as compartments. Our pathway starts
from outside of the cell; wastewater contains some concentration of macrolide antibiotics. We
assumed that macrolides get inside of the cell by diffusion: passive transport of molecules through
the cell membrane. The rate of such a transportation method depends on the concentration difference
between the two sides of the membrane. It is important to notice that as the movement of molecules
is more or less random, in the end the system settles down in an equilibrium state. In this state,
the concentration on both sides of the membrane is equal. This reaction can be modelled with
reversible mass action kinetics, which states that the rate of the reaction is proportional to the
concentration of the reactants.
After the macrolide molecules have entered the cell, they can bind to the repressor MphR, which is
controlled by a constitutive promoter about which we will discuss later. This reaction results in
the derepression of egfp expression:
- Two macrolide molecules bind to the MphR transcription factor and this causes MphR to unbind from the plasmid.
- Transcription of egfp begins.
- The end products of the reaction are the plasmid and the MphR-macrolide complex.
We decided to model the derepression, as all other reactions, using mass action kinetics. We made
this decision due to the fact that there is very limited amount of information and experimental data
on these systems. Thus, finding different constants such as the promoter strength or the
dissociation
constant might be impossible. As a consequence, we might need to estimate these constants, which is
discussed later.
The transcription of egfp can then start once the promoter is derepressed. At the end, this
results
in the last step of the pathway: the EGFP protein translation. Noteworthy, in our system there are
two transcription-translation pathways: transcription-translation of MphR and
transcription-translation of EGFP. These two pathways are very similar and only the constants are
different. However, there is a small difference in the regulation of the transcription: mphr
transcription is controlled by a constitutive promoter, which means that the translation is
continuous. On the other hand, egfp transcription is controlled by the previously mentioned
transcription factor, the MphR repressor. This means that the egfp transcription stops as
soon as the MphR binds to the plasmid.
Opposite to the previously discussed reactions (derepression and diffusion), neither transcription
nor translation are reversible reactions. In addition, the reactants in these reactions are not used
in the process. Hence, if the model doesn’t take that into account, there would be a reaction
product overflow at the end. In real biological systems, proteins are not permanent; they degrade
and
their concentration decreases when the cell grows. Thus, we added reactions for mRNA degradation as
well as protein degradation. In addition, we assumed dilution not to be significant due to the fact
that the volume of immobilized cells stays constant in the sensing time frame of 1-2 hours, which is
the time of use of our biosensor.
Two reactions we have yet to discuss are repression of the egfp gene in the plasmid by MphR
and binding of two
macrolide molecules with the unbound MphR. These reactions are similar: both are reversible as well
as both have two types of reactants binding to one complex. The only difference is that MphR binds
with two molecules, which makes the reaction rate function a bit different.
Finding Constants
The last part of modelling the SINISENS pathway was to set initial concentrations (Table 1) as well as finding proper constants for each reaction (Table 2). Due to the limited amount of experimental data and information available, some of the reaction constants had to be estimated by either fitting data or using numerical averages of a similar part.
Initial Values | Amount | Units | Notes | Source |
---|---|---|---|---|
Macrolides outside the cell | 1.41E-09 | 1 mol/L | --- | [2] |
MphR:plasmid | 200 | Molecules | High Copy Number plasmid | [3] |
Average E. coli cell volume | 0.47 | μm^3 | Used to convert molecules to concentration | [4] |
Wastewater Sample Size | 1 | L | --- | --- |
Others | --- | --- | Other initial values are assumed to be zero | --- |
Reaction Constants | Amount | Units | Notes | Source |
---|---|---|---|---|
Average Translation Rate (E. coli) | 17-21 | Amino acids / second | Used 17AA/s to calculate MphR and EGFP translation rate | [5] |
Length of MphR (WT) | 194 | Amino acids | Used to calculate MphR translation rate | [6] |
Length of EGFP | 240 | Amino acids | Used to calculate EGFP translation rate | [7] |
Absolute Promoter Strength of BBa_J23101 | 0.03 | Polymerase / second | Used to calculate promoter strenght of BBa_J23106 | [8] |
Relative Promoter Strenght of BBa_J23106 | 0.67 | Relative | Compared to BBa_J23101, used to calculate promoter strenght of BBa_J23106: 0.03 · 0.67 = 0.02 Polymerase / second | [9] |
Typical mRNA Degradation Rate | 3-8 | Minutes | Used 5 minutes | [10] |
EGFP Degradation Rate | 0.021 | 1/hour | --- | [11] |
Dissociation Constant of MphR Binding to its Operator (DNA chain) | 574 +/- 29 | nM | K_d: Ratio of reverse reaction rate and forward reaction rate. Used in repression. | [12] |
Concentration of Ligand at Half Maximum Normalized GFP Fluorescence | 0.19 +/- 0.02 | μM | Used to calculate dissociation constant of macrolide molecule binding to MphR | [1] |
Hill Coefficient | 1.7 +/- 0.3 | --- | Used to calculate dissociation constant of macrolide molecule binding to MphR | [1] |
Assumptions
When modelling physical systems, it is impossible to take every little detail into account to add them in mathematical equations. For example, in the throwing ball example we did not take air resistance into account to simplify the problem. The same applies for modelling biological systems: we have to make assumptions to simplify the problem. Below is a list of assumptions that we made in order to simplify our system in the mathematical model.
- The volume of immobilized cells stays constant, therefore dilution is insignificant in the measuring time-frame.
- At the beginning, egfp is repressed by MphR in all the plasmids.
- At the beginning, the cell only contains plasmids with MphR.
- Macrolides get inside the cell by passive diffusion.
- Leakiness in the model is explained as a reversible reaction.
Results
Using the values from tables 1 and 2, we run a SimBiology
simulation with stop time of 10 hours. Resulting plots can be seen in figures 4 and 5, which show
concentration in molarity on y-axis and time on x-axis. We used ode45 as a solver in the simulation
as
it was recommended by MATLAB documentation [13].
EGFP concentration is the most important measure due to the fact that it is the output of our
optical biosensor. As expected, at the beginning EGFP concentration grows exponentially due to mass
action kinetics: the rate of the reaction is proportional to concentrations of reactants. The
production of EGFP slows down after 4 hours.
The goal of our final product was not only to detect macrolide antibiotics but also quantify them. However, the above figures give us little to no information on how our sensor would behave in different macrolide concentrations. Therefore, we run a scan with 10 different initial concentrations of macrolides in wastewater starting from 1E-10 M to 1E-5 M by evenly spaced intervals. The result of this scan can be seen in figure 6. As expected, a higher concentration of macrolides in wastewater produces higher output of EGFP. However, the difference between selected concentrations is very little, especially in the working time-frame of the final biosensor: first 2 hours. This might become a problem if leakiness and noise of the output is high enough to hide the low differences in the EGFP output.
Later on in the project, we had a suspicion that we could improve our biosensor output by choosing a weaker promoter for MphR transcription. Due to the fact that MphR works as a repressor, it is evident that a cell with less MphR protein will produce more fluorescence. Thus, using a weaker promoter will result in higher EGFP output. To confirm this hypothesis we decided to run a MATLAB scan with five different promoter strengths with constant intervals between 0 and 0.02 polymerase/second (constitutive promoter BBa_J23106 [9]) in a constant concentration. The results of the scan confirmed the hypothesis: output is higher when less repressor is present. This result inspired us to try a different promoter for our biosensor; we switched from our initial constitutive promoter to an inducible promoter, pBAD.
ROSETTA: PROTEIN MODELLING
As we previously said, after realizing we needed to increase the range of our biosensor, we decided
to use Rosetta to perform binding site modelling in our MphR protein. Rosetta is a software that
includes
algorithms for computational modelling and analysis of protein structures. It allows enzyme design,
de novo protein design, ligand docking and structure prediction of biological macromolecules and
complexes among others. We used this software to try to predict modifications in the binding sites
of MphR that would increase the binding affinity that this transcription factor has for macrolides.
Concretely, we modelled this increase focusing on two macrolides, erythromycin and clarithromycin,
which have been placed on the ‘watch list’ of pharmaceuticals for EU-wide monitoring in aquatic
environments in 2015 and are now considered for prioritization [14]. For more information, refer to
the official
website of Rosetta.
In order to achieve our objective, we based our modelling in the paper “Rosetta and the Design of
Ligand Binding Sites” by Moretti et al. [15]. We found this paper as a source from a Rosetta Guide
iGEM Technion 2016 and iGEM TU
Eindhoven 2016 wrote during the iGEM 2016 Competition. For our modelling we needed to follow several
steps:
- Preparation of the MphR pdb file.
- Preparation of the ligand files.
- Docking of the ligands in the binding sites of the protein.
- Design of the binding site.
- Filtering.
- Re-run the results.
Noteworthy mentioning, we took part in a journal initiative
organized by the Maastricht 2020 iGEM team, in which we wrote a step-by-step guide for
designing ligand binding sites. This guide is based on the Moretti and colleagues paper above
mentioned and
the previous guide made by the 2016 iGEM teams, but it is thought to be used by non-computational
biologists, so it has more explanations and more detailed steps. We hope our Guide for Using
Rosetta when Designing Ligand Binding Sites helps future iGEM participants to use this
software even if they lack computational biologists in their teams.
In the following sections, we present an overview of our whole modelling process, but if the reader
is interested
in using the Rosetta software, they can download our step-by-step guide.
1. Preparation of the MphR pdb file
First, we downloaded the MphR pdb file from Protein Data Bank (PDB) and we used the molecular visualization software PyMOL to gain insights on its structure. As seen in the video, this transcription factor is an homodimer formed by two chains with two ligand binding sites, one binding site per chain. In the video, we can visualize the macrolide antibiotic erythromycin (orange) attached to the MphR. After visualizing the structure of MphR, we prepared the PDB file by cleaning it (removing the waters and other molecules) and relaxing it, so it could be used for the designing of the ligand binding sites. The module for relaxing the protein is the following: Khatib F, Cooper S, Tyka MD, Xu K, Makedon I, Popovic Z, Baker D, and Players F. (2011). Algorithm discovery by protein folding game players. Proc Natl Acad Sci USA 108(47):18949-53. doi: 10.1073/pnas.1115898108. More information about the concrete commands applied can be found in our guide.
2. Preparation of the ligand files
In the second step, we downloaded the ligand files (erythromycin and clarithromycin) from ZINC and studied their structures in PyMOL (Figure 8). We also did some preparations to be able to use the files in Rosetta. To work in line with the experiments performed in wet-lab, we added hydrogens to both ligands according to the pH of wastewater samples we were working with (pH=6). This pH was measured from sterilized wastewater samples from one of HSY wastewater treatment plants that we used during the lab experiments. After the addition of hydrogens, we prepared a library of conformers to be used during the modelling.
3. Docking of ligands to the binding site
Molecular docking predicts the orientation and binding of a molecule to another molecule. There are
many different software that can be used for molecular docking, such as Autodock 4.2, AutoDock Vina and SwissDock.
In our study, the ligands were macrolide antibiotics (erythromycin and clarithromycin) and the
receptor the transcription factor MphR. The binding site of erythromycin to MphR has been determined
with crystallography [12] and is therefore known. Because of this, we did not have to perform blind
docking. However, some other problems were caused by the size of the ligand and its multiple torsion
angles. In general, it is harder to perform accurate docking with a bigger ligand, because of the
amount of orientations the molecule can have. A second challenge in the docking process was that the
MphR protein is an homodimer and therefore it binds to two ligands. Nonetheless, docking can be done
only with one ligand at a time and at the end, the docked ligands have to be combined in one file.
However, the additional challenge arose from the difference between the two chains (chain A and
chain B) of the MphR. For the chain A the docking was easily successful but for the chain B the
docking was more challenging. The estimations of the binding affinities were much lower and to
achieve the correct visual orientation many runs with different parameters and software were needed.
We used Autodock
4.2 to perform our first docking of erythromycin with the prepared and relaxed MphR. We
achieved good results with chain A, but not chain B. After multiple runs with different parameters
we tried to do the docking with the AutoDock Vina software. With slight adjustments to the parameters,
the docking was successful for both erythromycin and clarithromycin. For comparison we also used the
SwissDock software to
dock erythromycin to MphR. However, we still achieved the best results with AutoDock Vina, therefore, it is the
software we finally used.
For the analysis of the docking results we use the molecular visualization software PyMOL. Since the binding of erythromycin to MphR has been experimentally determined, we compared the docked conformation with the crystallized structure (Figures 9 and 10). We chose the conformation based on the best visual alignment of the ligand in combination with the estimated binding affinities, which were between –9 and –13 kcal/mol.
4. Design of the binding site
The following step is the designing itself, in which modifications to the ligand binding sites are made and assessed. Noteworthy, this step is computationally heavy and users who want to perform it should have access to a computer cluster if they want to obtain results in a reasonable time-frame. To perform this step, we needed to create a file in which we specified the amino acid residues we wanted to mutate. During our research, we came up with a paper by Feng et al. (2015) [16], which states that the main energy contributors for the binding between MphR and the macrolides are Met59, Val66, Ser92, Met93, Tyr103, Val126, His147, Thr154 and Met155. These are the MphR residues we modelled during our project. It is important to mention that during our modelling we did not allow mutations to cysteine because this amino acid tends to form sulphur bonds. Because of that, if allowing mutations for this amino acid almost all mutations obtained were cysteines. For more concrete information on this step, check our guide.
5. Filtering
In this step, we filtered two times the results obtained during the designing part in order to get the results with the best scores, which are indicative of a probable better mutation. Here, the user can choose which metrics they use for the filtering. In our case, we used the metrics specified in tables 3 and 4. For all the metrics, we used the average values as the cut-off to obtain our final results. For more information on filtering and metrics, our guide can be checked.
Metric | Explanation |
---|---|
total_score | Measure of protein stability |
if_X_fa_atr | Measure of attractive potential between atoms in different residues in the interface protein-ligand |
fa_rep | Measure of ligand clashes in the whole protein |
if_X_fa_rep | Measure of ligand clashes in the interface protein-ligand |
ligand_is_touching_X | Measure of the distance between protein and ligand |
interface_delta_X | Binding energy |
Metric | Explanation |
---|---|
packstat | Packing metric |
sc_value | Shape complementarity |
delta_unsatHbonds | Unsatisfied hydrogen bonds |
dG_separated/dSASAx100 | Binding energy per contact area |
dG_separated | Binding energy |
6. Re-Run
The final step of our modelling process consisted of re-running the design of the ligand binding sites using the best files obtained in the previous round. It is recommended to re-run the design from 3 to 5 times in order to obtain better results in each round. We performed the re-running 3 times for each ligand and we selected the 5 files with the best binding energies as our final files for each ligand. Figure 11 shows a summary of all the designing processes above explained.
Final Analyses and Results
Tables 5 and 6 show the design scores for the final files we selected for erythromycin and clarithromycin. To ensure these files had better scores than the PDB file with erythromycin bound, we evaluated the file from PDB and the final files with the same Rosetta scoring function, ref2015. Table 7 shows the scoring of the proteins for this function. Focusing in the total_score metric, it can be clearly seen that the proteins with mutations obtained with Rosetta have more stability than the file retrieved from PDB, since they have lower scores. This indicates that a protein with the modelled mutations is expected to work in real life.
File | total_score | fa_rep | if_X_ fa_atr |
if_X_ fa_rep |
ligand_is_ touching_X |
interface_ delta_X |
---|---|---|---|---|---|---|
Erythromycin 1 | -1658,168 | 185,078 | -49,016 | 7,503 | 1 | -28,208 |
Erythromycin 2 | -1668,171 | 183,354 | -48,433 | 8,016 | 1 | -26,936 |
Erythromycin 3 | -1670,627 | 184,126 | -51,915 | 8,126 | 1 | -27,151 |
Erythromycin 4 | -1659,102 | 182,991 | -48,626 | 6,068 | 1 | -30,052 |
Erythromycin 5 | -1660,972 | 185,565 | -47,716 | 6,736 | 1 | -25,527 |
Clarithromycin 1 | -1678,894 | 189,084 | -56,086 | 6,633 | 1 | -35,033 |
Clarithromycin 2 | -1675,459 | 187,563 | -55,160 | 5,954 | 1 | -33,033 |
Clarithromycin 3 | -1679,833 | 185,432 | -55,558 | 5,842 | 1 | -34,199 |
Clarithromycin 4 | -1672,542 | 187,834 | -54,593 | 6,168 | 1 | -31,958 |
Clarithromycin 5 | -1675,155 | 183,774 | -53,198 | 6,856 | 1 | -33,903 |
File | packstat | sc_value | delta_ unsatHbonds |
dG_separated/ dSASAx100 |
dG_separated |
---|---|---|---|---|---|
Erythromycin 1 | 0,657 | 0,535 | 12 | -1,918 | -52,311 |
Erythromycin 2 | 0,625 | 0,533 | 13 | -1,880 | -52,008 |
Erythromycin 3 | 0,613 | 0,542 | 15 | -1,848 | -50,556 |
Erythromycin 4 | 0,593 | 0,539 | 17 | -1,802 | -50,519 |
Erythromycin 5 | 0,603 | 0,528 | 14 | -1,845 | -50,516 |
Clarithromycin 1 | 0,610 | 0,584 | 17 | -2,236 | -62,079 |
Clarithromycin 2 | 0,614 | 0,581 | 16 | -2,252 | -61,642 |
Clarithromycin 3 | 0,642 | 0,579 | 15 | -2,237 | -61,031 |
Clarithromycin 4 | 0,630 | 0,580 | 16 | -2,188 | -60,103 |
Clarithromycin 5 | 0,614 | 0,559 | 15 | -2,171 | -59,758 |
File | total_score | dG_separated |
---|---|---|
MphR Protein Data Bank | -527,085 | --- |
Erythromycin 1 | -944,062 | -52,311 |
Erythromycin 2 | -960,434 | -52,008 |
Erythromycin 3 | -962,823 | -50,556 |
Erythromycin 4 | -952,452 | -50,519 |
Erythromycin 5 | -948,739 | -50,516 |
Clarithromycin 1 | -955,655 | -62,079 |
Clarithromycin 2 | -963,694 | -61,642 |
Clarithromycin 3 | -967,505 | -61,031 |
Clarithromycin 4 | -957,522 | -60,103 |
Clarithromycin 5 | -971,376 | -59,758 |
After selecting these results, we downloaded the FASTA protein sequences and compared them. Worth mentioning, for erythromycin two out of the five selected sequences had the same mutations and for clarithromycin this happened with 3 out of the 5 selected sequences. We interpreted this as a good sign: if several of the best sequences have the same mutations, we expect these mutations to likely increase the binding affinity of MphR to macrolides. An important issue to point out is that even if our protein was an homodimer, we obtained different predicted mutations for the two chains. We talked with Maria Sammalkorpi and Bart Rooijakkers, who have experience in modelling and they recommended us to work with the mutations obtained in chain A, since we already had problems with the docking of the ligand with chain B, as above mentioned. Therefore, we agreed with our team that we would order the MphR with the mutations obtained in chain A to test them. Below, we show the ordered sequences (Fig. 12): (i) one sequence with the common mutations for erythromycin and clarithromycin, (ii) two sequences of the mutations obtained for erythromycin, concretely, the sequence with the best score and the sequences that had the same mutations and (iii) two sequences of the mutations obtained for clarithromycin, the sequence with the best score and the sequences that had the same mutations.
SWISS-MODEL
Since the concentration of macrolide antibiotics is very low in the wastewaters (Fig. 2), we
investigated ways to further increase the sensitivity of the biosensor. One way to do this, would be
to increase the concentration of macrolides within the cells. After researching, we found a
paper by Mohammad and colleagues [17] in which it was shown that the modification of the
transmembrane protein FhuA could potentially increase the amount of some molecules such as
antibiotics inside the cells. In the article they had replaced some of the loops of the protein with
a single polypeptide. However, we wanted to estimate the protein stability of FhuA with only removed
loops. In order to investigate this, we used the SWISS-MODEL server, a fully
automated protein structure homology-modelling server thought to make protein modelling accessible
to all life science researchers worldwide.
With SWISS-MODEL
we compared the wildtype FhuA (video: green) with a modified FhuA with the loops removed (video:
blue) and estimated the stability of the proteins. The sequence for the FhuA protein can be found in
Figure 13, the highlighted sequences are the ones removed. Based on the paper by Mohammad and the
colleagues [17] we chose to remove L3 (Tyr243–Asn273), L5 (Asp394–Asn419), L11 (Asn682–Arg704), L4
(Cys318–His339) and the cork domain (Met1–Pro160).
The results from the SWISS-MODEL scoring, which were very promising, can be seen in tables 8 and 9. Notable is that all the values of the modified FhuA protein lie close to the ones of the unmodified. Worth mentioning, although we studied the stability of FhuA computationally, at the end we did not run the experiments in the lab.
Modified Fhu(A) | Scores | Definitions or residues |
---|---|---|
GMQE | 0,99 | The resulting GMQE score is expressed as a number between 0 and 1, reflecting the expected accuracy of a model built with that alignment and template and the coverage of the target. Higher numbers indicate higher reliability [18]. |
MolProbity Score | 1,692 | --- |
Clash Score | 1,83 | --- |
Ramachandran Favoured | 93,76% | --- |
Ramachandran Outliers | 1,25% | A459 ALA, A235 ASP, A278 PHE, A394 LYS, A463 MET, A347 GLY |
Rotamer Outliers | 2,71% | A279 ASP, A298 VAL, A10 THR, A267 GLN, A401 VAL, A170 LEU, A45 GLN, A480 THR, A175 VAL, A171 GLN, A236 LYS |
C-β Deviations | 1 | A187 ASP |
Bad Bonds | 3/3943 | _1A GP1-_1B GP4, _1D GMH-_1E GPH, _1F GLC-_1H GLA |
Bad Angles | 34/5342 | A384 ASP, A235 ASP, A278 PHE, A478 THR, A440 ASN, A265 ASP, A172 ASN, A451 ASP, A390 ASP, A264 ASP, (A397 THR-A398 PRO), A480 THR, (A472 VAL-A473 ASN), A230 ASP, A473 ASN, A191 THR, A404 HIS, (A62 ARG-A63 PRO), A187 ASP, (A293 GLU-A294 PRO), A144 GLU, A437 ASP, A344 ASP, A139 ASN, A197 ASP, (A246 ASP-A247 TRP), (A346 GLU-A347 GLY), A92 THR, A471 HIS, A114 THR, A6 PHE, A246 ASP |
QMEAN | -2,07 | The QMEAN Z-score provides an estimate of the "degree of nativeness" of the structural features observed in the model on a global scale and is described in Benkert et al,. It indicates whether the QMEAN score of the model is comparable to what one would expect from experimental structures of similar size. QMEAN Z-scores around zero indicate good agreement between the model structure and experimental structures of similar size. Scores of -4,0 or below are an indication of models with low quality. The four individual terms of the global QMEAN quality scores are Cβ, all atom, solvation and torsion. Positive values indicate that the model scores higher than experimental structures on average. Negative numbers indicate that the model scores lower than experimental structures on average [18]. |
Cβ | -1,95 | --- |
All Atom | 1,27 | --- |
Solvation | 1,30 | --- |
Torsion | -2,27 | --- |
Unmodified Fhu(A) | Scores | Definitions or residues |
---|---|---|
GMQE | 0,99 | The resulting GMQE score is expressed as a number between 0 and 1, reflecting the expected accuracy of a model built with that alignment and template and the coverage of the target. Higher numbers indicate higher reliability [18]. |
MolProbity Score | 1,84 | --- |
Clash Score | 2,22 | (_1F GLC-_1H GLA), (A266 PHE-A403 TYR), (A384 ARG-_1C KDO) |
Ramachandran Favoured | 93,37% | --- |
Ramachandran Outliers | 1,01% | A128 ARG, A555 GLY, A667 ALA, A602 LYS, A486 PHE, A443 ASP, A20 SER |
Rotamer Outliers | 3,45%% | A268 GLU, A475 GLN, A609 VAL, A132 MET, A71 SER, A352 LEU, A457 ASP, A118 ASP, A487 ASP, A419 ASN, A711 THR, A357 VAL, A205 GLN, A66 VAL, A397 VAL, A112 GLN, A93 ARG, A76 VAL, A353 GLN, A444 LYS |
C-β Deviations | 2 | A403 TYR, A592 ASP |
Bad Bonds | 4/5600 | _1A GP1-_1B GP4, _1D GMH-_1E GPH, A403 TYR, _1F GLC-_1H GLA |
Bad Angles | 35/7602 | A266 PHE, A592 ASP, A413 ASN, A171 ASP, A699 PHE, A403 TYR, A623 PHE, A323 ASN, A20 SER, A486 PHE, (A73 THR-A74 PRO), A354 ASN, A557 PHE, A404 ASN, (A605 THR-A606 PRO), A339 HIS, A89 HIS, A443 ASP, A132 MET, A308 SER, A304 GLU, A679 HIS, A419 ASN, A88 ASP, A186 ASP, A61 HIS, A402 LEU, (A666 LEU-A667) |
QMEAN | -1,60 | The QMEAN Z-score provides an estimate of the "degree of nativeness" of the structural features observed in the model on a global scale and is described in Benkert et al,. It indicates whether the QMEAN score of the model is comparable to what one would expect from experimental structures of similar size. QMEAN Z-scores around zero indicate good agreement between the model structure and experimental structures of similar size. Scores of -4,0 or below are an indication of models with low quality. The four individual terms of the global QMEAN quality scores are Cβ, all atom, solvation and torsion. Positive values indicate that the model scores higher than experimental structures on average. Negative numbers indicate that the model scores lower than experimental structures on average [18]. |
Cβ | -1,99 | --- |
All Atom | -0,28 | --- |
Solvation | -0,31 | --- |
Torsion | -1,15 | --- |
REFERENCES
1. Kasey, C. M., Zerrad, M., Li, Y., Cropp, T. A., & Williams, G. J. (2018). Development of
Transcription Factor-Based Designer Macrolide Biosensors for Metabolic Engineering and Synthetic
Biology. ACS Synthetic Biology, 7(1), 227–239. https://doi.org/10.1021/acssynbio.7b00287
2. Schafhauser, Bruno Henrique, Lauren A. Kristofco, Cíntia Mara Ribas de Oliveira, and Bryan
W. Brooks. “Global Review and Analysis of Erythromycin in the Environment: Occurrence,
Bioaccumulation and Antibiotic Resistance Hazards.” Environmental Pollution 238 (July 1, 2018):
440–51. https://doi.org/10.1016/j.envpol.2018.03.052.
3. iGEM Registry: http://parts.igem.org/Part:pSB1K3
4. B10NUMB3R5: https:// bionumbers.hms.harvard.edu/ bionumber.aspx?id=114925&ver= 1&trm=
cell+ volume+ Bacteria+ Escherichia+ coli&org=
5. Ross, J. F., and M. Orlowski. “Growth-Rate-Dependent Adjustment of Ribosome Function in
Chemostat-Grown Cells of the Fungus Mucor Racemosus.” Journal of Bacteriology 149, no. 2 (February
1982): 650.
6. https:// www.ebi.ac.uk/ ena/ data/ view/ BAB12241
7. https:// www.fpbase.org/ protein/ egfp/
8. iGEM Warsaw: http:// 2010.igem.org/ Team:Warsaw/ Stage1/ PromMeas
9. iGEM registry: http:// parts.igem.org/ Promoters/ Catalog/ Anderson
10. B10NUMB3R5: https:// bionumbers.hms.harvard.edu/ bionumber.aspx?id=108598&ver=
0&trm=mrna+ degradation&org=
11. iGEM registry: http:// parts.igem.org/ Part:BBa_E0040
12. Zheng, Jianting, Vatsala Sagar, Adam Smolinsky, Chase Bourke, Nicole LaRonde-LeBlanc, and
T. Ashton Cropp. “Structure and Function of the Macrolide Biosensor Protein, MphR(A), with and
without Erythromycin.” Journal of Molecular Biology 387, no. 5 (April 17, 2009): 1250–60.
https://doi.org/10.1016/j.jmb.2009.02.058.
13. https:// www.mathworks.com/ help/ matlab/ math/ choose-an-ode-solver.html#bu3n5rf-1
14. Decision (EU) 2015/495 of 20 March 2015 establishing a watch list of substances for
Union-wide monitoring in the field of water policy pursuant to Directive 2008/105/EC of the European
Parliament and of the Council ' (2015) Official Journal L78, p. 40.
15. Moretti, R., Bender, B. J. & Allison, B. (2016). Methods in Molecular Biology: Design and
Creation of Ligand Binding Proteins, 1414, 47–62. https://doi.org/10.1007/978-1-4939-3569-7
16. Feng, T., Zhang, Y., Ding, J. N., Fan, S., & Han, J. G. (2015). Insights into resistance
mechanism of the macrolide biosensor protein MphR(A) binding to macrolide antibiotic erythromycin by
molecular dynamics simulation. Journal of Computer-Aided Molecular Design, 29(12), 1123–1136.
https://doi.org/10.1007/s10822-015-9881-0
17. Mohammad, M. M., Howard, K. R., & Movileanu, L. (2011). Redesign of a plugged β-barrel
membrane protein. Journal of Biological Chemistry, 286(10), 8000–8013.
https://doi.org/10.1074/jbc.M110.197723
18. https://swissmodel.expasy.org/docs/help