MODEL
Several distinct modelling approaches were used as part of our rational protein design.
Rosetta
Introduction to Rosetta
To analyze the proposed design changes from the rational design process, software tools were used to provide initial analysis and modelling of the structure. Among various computational protein design platforms, Rosetta was chosen for redesigning the structure and conducting energy analyses. The Rosetta software suite offers algorithms and protocols for macromolecular structure prediction and analysis of protein structures (Leaver-Fay et al., 2011). As a unified software package, it provides a wide range of tools suited for the project: to redesign the protein structure that has an increased binding affinity for Cu2+ (Kaufmann, Lemmon, DeLuca, Sheehan & Meiler, 2010).
Within the Rosetta software suite, the tasks and operations within the libraries are combined as algorithms called protocols. Some of protocols’ major functionalities include protein structure prediction, protein docking, protein design and energy scoring analysis. In this project, the structure of the original protein is known, therefore only energy scoring functions and the protein design protocols are used to assess how the residue-specific mutations increase stability and binding affinity (Leaver-Fay et al., 2011). Some specific functions such as Relax, a protocol for simple all-atom refinement of input structure, and Point Mutant Scan Application, a protocol for mutating the protein structure at specific residues and analyzing its stability, are used in the rational design process ("Applications for Macromolecule Design", 2020).
Protein Mutation Procedure
Using the Rosetta official documentation and the Rosetta Guide for the iGEM beginner made by Team iGEM Technion and iGEM TU Eindhoven 2016, a structure analysis procedure was developed. The protocols mentioned in the following steps are written in a shell script.
- Obtain the original Protein Data Bank (PDB) file of the Mst-CopC structure. (It is accessible from RCSB PDB.)
-
Prepare the structure.
- Clean the PDB file while keeping the ligand. (This removes the information in PDB that Rosetta doesn’t use. ("How to prepare structures for use in Rosetta", 2020))
- Run Relax protocol. (This reduces minor steric clashes in the protein-ligand complex. Running this protocol ensures the structure is at its most stable form before making mutations ("Relax application", 2020). )
- Point mutant scan application. (Though the algorithm can scan for point mutants for given residues and determine the change in Gibbs free energy a mutation causes, a mutant file can be used to specify the mutants on specific residues determined in the rational design phase. )
- Energy scoring analysis for the overall structure and residue-specific energy breakdown. (Residue specific energy and overall energy information will be produced as outputs along with mutated PDB files in step 3. )
At step 2.2 Relax and step 3 Point Mutation, relaxed and mutated protein structures will be produced as PDB files along with the energy score of each output. The scoring function in Rosetta is an optimized energy function that calculates the energy of all atomic interactions in a globular protein ("Scoring Tutorial", 2020). The score is a weighted sum of energy terms, including some representing physical forces like electrostatics and Van der Waal’s interactions and others statistical terms that indicate if the protein structure looks like any known structures ("Scoring Tutorial", 2020). A complete list of energy terms and weights used in the scoring function can be found on the Rosetta Scoring Tutorial Documentation. Therefore, instead of presenting the scores in physical energy units like kcal/mol, a custom Rosetta Energy Units (REU) is used due to the presence of the statistical terms. This score allows us to compare stability between scored structures. The lower the score, the more stable a structure is, which is sufficient to evaluate the relative stability of the mutated CopC-Cu2+ candidates.
Results
Single Residue Mutation
The table below shows the overall energy scores in REU and energy term breakdown for all protein candidates mutated on a single residue. The first column of the table indicates the mutation applied to the structure: 5icu is the PDB ID for Mst-CopC, and mutation code <original amino acid><residue number><new amino acid> follows the ID.
As seen below, the original score of the protein is -320.377. The mutated proteins are sorted in ascending total scores, meaning that they are ordered in decreasing stability. Among all candidates, only the mutation from histidine to aspartic acid at residue 85 (H85D) produced a lower energy score than the original structure. However, most candidate’s scores are between -320 to -290, which means most candidates are relatively stable after mutation and molecular dynamic analysis can be done to further analyze the candidates.
description | total_score | description | total_score |
---|---|---|---|
5icu | -320.377 | 5icu.L80D | -310.078 |
5icu.H85D | -327.949 | 5icu.L80V | -308.632 |
5icu.Y34F | -319.826 | 5icu.V79D | -308.15 |
5icu.F3H | -319.02 | 5icu.H85I | -304.996 |
5icu.S82E | -318.491 | 5icu.G29D | -299.658 |
5icu.V5K | -318.478 | 5icu.D83E | -298.052 |
5icu.I86E | -318.333 | 5icu.V79E | -297.659 |
5icu.I86K | -317.939 | 5icu.V79L | -295.35 |
5icu.I86H | -317.353 | 5icu.D83H | -291.52 |
5icu.H85E | -317.217 | 5icu.D83G | -291.469 |
5icu.I86D | -316.911 | 5icu.V79H | -273.441 |
5icu.V87T | -314.902 | 5icu.S2D | -270.366 |
5icu.G28D | -313.721 | 5icu.S81H | -264.303 |
5icu.L80E | -313.472 | 5icu.G29H | -261.843 |
5icu.S81D | -312.271 | 5icu.S81E | -259.4 |
5icu.R78H | -312.167 | 5icu.G29E | -256.144 |
5icu.L80H | -311.984 | 5icu.S81V | -246.804 |
5icu.G27S | -311.112 | 5icu.G29V | -216.733 |
5icu.G28H | -311.068 | 5icu.S2H | -137.613 |
5icu.G28E | -311.022 | 5icu.S2E | -99.242 |
The complete table of results is found here
These PDB files also contain per-residue energy scores, which can be used to compare the effect each mutation has on each residue. We assembled these scores into heatmaps, where each row corresponds to either a mutation or the original protein, and each column corresponds to a certain residue. This data was difficult to visualize effectively due to the presence of outliers, so certain data transformations were applied to better analyze some aspect of our energy score results. (The heatmap can be found in our github as heatmap_single_relative.png.)
One such transformation is the above heatmap, which has subtracted the unmutated proteins energy scores from each mutant (and itself). The resulting scores are therefore relative to the unmutated protein, and show increases or decreases in score, aiding one in finding whether a mutation is beneficial to the energy score of each residue.
Seen above is another heatmap, except that each column has been scaled to the range [0, 1], highlighting changes in energy score. Unlike the previous heatmap, it does not indicate the scale of these changes.
Note that in both heatmaps the rows and columns have the same labels, and represent the same things, only the values contained within the map itself have been modified.
Using these heatmaps, it is possible to rationally select mutations to simultaneously apply in order to further reduce the overall energy scores. Possible next steps include applying a logarithmic scale to better differentiate between the values in the heatmap.
Double Residue Mutations
To determine a suitable combination of two mutations, it is computationally feasible to simply run all possible pairs of mutations from our set of rationally selected single mutations. The overall energy scores of each of these “double mutants” was also assembled into a heatmap, as an efficient method of comparing all of these mutants at once.
Each row and column respond to a particular mutation, with each intersection representing both mutations applied together. The symmetry appears to be inexact, a phenomenon which was likely due to how Rosetta applies multiple mutations.
Based on the results from Rosetta and analyzing heatmaps, a list of optimal candidates is compiled. These mutated candidates will be further assessed using more detailed molecular dynamic analysis.
The list of mutations passed onto MD:
- D83E
- F3H
- G28E
- H85D
- I86K
- S82E
- S82E, F3H
- F3H, V87T
- H85D, S82E
- Y34F, F3H, S81D
- H85D, V87T, I86K, G28E
- Y34F, F3H, S81D, H85D
Molecular Dynamics
Molecular dynamics was used to test the relative binding affinities of these candidate proteins. We used GROMACS (Berendsen et al., 1999) along with the Charmm36 (Vanommeslaeghe et al., 2010) force field parameters which include parameters for copper ions in solution. This allows us to estimate the total interaction energy between the copper ion and the protein, and using the relative magnitudes of these interaction energies allows us to compare the strength of binding. Molecular dynamics provides a more rigorous method for comparing binding strength due to the fact that a molecular dynamics simulation takes into account the motion of every individual atom in the system and their interactions.
All GROMACS simulations were run on the Graham computing cluster located at the University of Waterloo and hosted by ComputeCanada.
In order to implement a molecular mechanics simulation, the protein pdb files were converted to the gromacs format using Charmm-GUI, an online tool developed in order to construct topology files and parameter files for proteins using the Charmm force field. Once the required files were attained, the system was placed in solution in a process known as solvation, and the charge in the system was balanced by introducing faraway sodium or chloride ions. The system then undergoes an energy minimization process in which the protein and copper ion are restrained while the solvent molecules settle into place. This avoids extreme behaviour that could occur due to collisions between molecules. The thermodynamics of the system is then set up using standard NVT and NPT equilibration procedures in GROMACS, and then the final simulation is run.
A typical ten nanosecond simulation was performed for each candidate protein. The results are found in the table below.
Name | Total Interaction Energy |
---|---|
Mst-CopC | -540.6373 kJ/mol |
H85D | -1278.448 kJ/mol |
F3H | -563.9996 kJ/mol |
G28E | -607.4388 kJ/mol |
I86K | -586.829 kJ/mol |
S82E | -588.4093 kJ/mol |
D83E-V87T | -853.725 kJ/mol |
H85D-S82E | -1392.7281 kJ/mol |
H85D-V87T-I86K-G28E | -1246.7286 kJ/mol |
S82E-F3H | -557.0123 kJ/mol |
Y34F-F3H-S81D | -1145.2119 kJ/mol |
Y34F-F3H-S81D-H85D | -1698.6439 kJ/mol |
From the table of results we can see that candidate Y34F-F3H-S81D-H85D has the highest relative binding affinity when compared to Mst-CopC, with a 3.14 improvement in total interaction energy over the original Mst-CopC. Figure 4 shows the copper ion in the H85D mutant’s binding site. This animation was produced using VMD, a free software tool for visualizing molecular dynamics simulations.
Motivation For Our Treatment Process
Our metal-removal technology will compete with a variety of other aqueous metal waste treatment processes. Some current methods for treating aqueous metal wastes include chemical precipitation, ion exchange, adsorption, and membrane filtration (Fu et al., 2011). Chemical precipitation is particularly widespread because of its relatively simplicity and low cost (Fu et al., 2011). In this method, the pH of the metal waste is increased by the addition of a base such as lime, causing the metal ions to precipitate as insoluble hydroxides that can be recovered as solid sludge. However, removing water from this sludge can be costly, and the resulting sludge is typically disposed of, so valuable metals cannot be recovered or reused. Ion exchange, membrane filtration, and membrane filtration are all relatively new technologies (Fu et al., 2011), so high costs often prohibit large scale applications. Furthermore, there are few established methods that recover metal ions from industrial wastewaters to allow recycling. We therefore aim to fill in this gap in metal waste treatment by using metal binding proteins to create a novel metal removal and recovery process. This process should quickly, selectively, and cheaply remove metal ions from wastewater, allowing metal ions to be recovered and recycled, providing economic and environmental benefits.
In order for our designed metal and cellulose binding fusion protein to have merit in the real world, it is imperative to design a suitable industrial process that can use our biotechnology to treat metal waste.
We will design a reactor to treat copper-containing metal finishing waste, at a scale relevant for a typical semiconductor manufacturing plant. Metal finishing wastes result from processes including electroplating, in which dissolved metals (copper, in this case) are deposited in thin layers onto solid surfaces using electrochemical reactions (Arán, 2017). After the process is complete, the remaining solutions must be treated before disposal. However, since dissolved metal wastes are produced by a large variety of industrial processes, the design and modelling process applied here could be used for any other waste stream and dissolve metal(s).
Choosing Our Reactor Type
In order to effectively treat large volumes of aqueous waste, we immobilised our cellulose and metal binding fusion protein inside an industrial bioreactor. Some advantages granted by using a bioreactor include: reactor size control and dimension tunability, continuous operation, and a wide variety of sub-types of reactors to base our design off of (Shuler & Kargi, 2002). Furthermore, the bioreactors have long been used for industrial scale processes that operate based on enzymatic or cellular function, so research in this area is well developed.
Of the many varieties of bioreactors available, we chose to use a packed column reactor. A packed column reactor consists of a large container filled with some sort of porous packing material, and effluent is fed through the bed, allowing a reactor to occur (Shuler & Kargi 2002). For our reactor, we will use cellulose beads as a packing material. Packed column reactors enable large volumes of waste to be treated in a continuous manner, providing much greater efficiency than batch processes that treat a fixed volume of waste at one time (Shuler & Kargi 2002). For our metal removal application, all removed metal will stay trapped inside the column as waste is fed through. Thus, the metal can be removed at a convenient time, such as during off-peak waste production hours.
Since our fusion proteins include a cellulose binding domain that binds cellulose very strongly, the fusion proteins can be immobilised on the cellulose packing by simply feeding them into the column. By immobilising our fusion proteins on the surface of the cellulose packing, we are able to maximize the surface area of reactive sites, thus increasing the treatment rate of the wastewater system.
With this general design in mind, we set out to use engineering and modelling tools to decide how to build a suitable reactor to treat our waste stream of interest.
We envision that our reactor will operate in two stages:
- First, the metal containing wastewater will be flowed through the reactor, allowing the metal in the wastewater to bind to the proteins within. This will produce a stream of clean water that can be discharged to the environment. This is the “binding” stage.
- Secondly, once the reactor is saturated with metal, we will rinse the reactor with another solution of a different pH, which will cause the metal ions to unbind from the reactor. This will result in a concentrated and pure solution of dissolved metal. This resulting solution could then be recycled for various applications. This is the “elution” stage.
Below is a 3D model that provides an example of our reactor’s general construction. Wastewater will be pumped into the reactor from the bottom, passing through the cellulose packing before exiting through the top of the reactor.
Developing Equations For Modelling
In order to model a system undergoing chemical reactions, we must first develop a kinetic rate law for the reactions. For our system, we will use a straightforward dimensional analysis approach similar to the Langmuir isotherm. This is a commonly applied model for adsorption of a reactant onto a protein.
To determine the net rate of adsorption of Cu2+ onto CopC, we add the forward binding rate to the reverse binding rate. A key nuance of this model is that the concentration of CopC is given as a surface concentration. In order to relate the surface concentration to the rate of change of a volumetric concentration, a conversion factor is computed by considering the surface area per volume of BMCC cellulose that the protein is attached to.
Suppose the surface area per volume of BMCC is given by a parameter \( a \) and that the fraction of the reactor occupied by cellulose is \( f \). Let \( m = [Cu^{2+}] \), let \( Q \) be the initial protein surface concentration, and let \( q = Q - [CopC] \).
Free protein + Free metal \( \underset{k_d}{\stackrel{k_a}{\rightleftharpoons}} \)
- Protein-metal complex
- \( k_a \): adsorption rate constant, \( k_d \): desorption rate constant
Adsorption (forward) rate:
\( v_a = k_a m N \)
- Where \( N = (Q - q) \)
- So, \( v_a = k_a m (Q - q) \)
Desorption (reverse) rate:
\( v_d = k_d q \)
- Variables:
- \( v \): rate (\( \frac{mol}{m^2 s} \))
- \( m \): free metal concentration (\( \frac{mol}{L} \))
- \( k_a \): adsorption rate constant (\( \frac{L}{mol s} \))
- \( k_d \): desorption rate constant (\( \frac{1}{s} \))
- \( N \): free sites (\( \frac{mol}{m s} \))
- \( Q \): total sites (\( \frac{mol}{m s} \))
- \( q \): occupied sites (\( \frac{mol}{m s} \))
Net rate:
\( \frac{dq}{dt} = v_a - v_d = k_a m (Q - q) - k_d q \)
\( \frac{dq}{dt} \) is a rate in \(\frac{mol}{m^2 s} \) (this is how much metal adsorbs to a given surface area each second)
But, we’re interested in the change \(\frac{dm}{dt} \), i.e. the change in metal concentration per second
For a given volume in the packed bed \( V \), a portion is liquid (\( V_{aq} \)) and a portion is packing material (\( V_p \))
There is also a specific surface area, i.e., the packing surface area (\(s \)), per total volume (\( V \))
Define specific surface area \(\rho = \frac{s}{V} \left( \frac{m^2}{m^3} = \frac{1}{m} \right) \)
- And void fraction \( f = \frac{V_{aq}}{V} \left( \frac{m^3}{m^3} = 1 \right) \)
- To convert \( \frac{dq}{dt} \) to \( \frac{dm}{dt} \),
\( \frac{dm}{dt} = -\frac{dq}{dt} \left( \frac{s}{V} \right) \left( \frac{V}{V_{eq}} \right) = -\frac{dq}{dt} \frac{\rho}{f} \)
So, \( \frac{dm}{dt} = -\frac{\rho}{f} \left( k_a m (Q - q) - k_d q \right) \)
Then the net adsorption rate is given by
\( \frac{dm}{dt} = -(a/f) (k_a m (Q-q) - k_d q) \)
Using this model allows computation of important properties of well-mixed reactors, such as a batch reactor where there is no inflowing or outflowing material. For a nominal input concentration of Cu2+ solution the results and a list of relevant parameters can be found below.
However, we aim to design a packed column reactor with a constant flow of new material, requiring us to consider transport phenomena. In order to incorporate transport phenomena this model needs to be extended to a partial differential equation. This can be done by employing the conservation of mass law as well as a constitutive equation called Fick’s law which governs the rate of diffusion of solute in the reactor. These transport-reaction models are described in the Encyclopaedia of Systems Biology published by Springer (Clairambault, 2013).
To begin deriving the transport equation, one must first write down the governing conservation and constitutive relations. The conservation of mass equation for a three dimensional reactor of fixed volume \( V \) is given by the following equation, where \( \partial V_{\pm} \) denotes the inflow (+) and outflow (-) boundaries of the reactor.
\( \frac{d}{dt} \iiint_V m(x,y,z) dxdydz \)
\( = \Phi(x,y)|_{\partial V_+} - \Phi(x,y)|_{\partial V_-} + \iiint_V R(x,y,z) dxdydz \)
Here \( R \) is the reaction rate described by the kinetic model. The flux terms \( \Phi \) can be computed by combining the advective flux and the diffusive flux. The advective term describes mass travelling through the reactor at a velocity \( \vec{v} \) (which can be a function of position), and can be written as \( \Phi = \vec{v} m \). The diffusive term describes mass entering and exiting the reactor due to the concentration gradient, which is described by Fick’s Law:
\( \Phi_{diffusion} = D \frac{dm}{dz} \)
This yields the total mass conservation equation:
\( \frac{d}{dt} \iiint_V m(x,y,z) dxdydz \)
\( = -\vec{v}(m(x,y,z_{+}) - m(x,y,z_{-})) + D(\frac{dm}{dz}(x,y,z_+) \)
\( - \frac{dm}{dz}(x,y,z_-)) + \iiint_V R(x,y,z) dxdydz \)
Here we may apply the fundamental theorem of calculus to convert the flux terms into an integral over \( \partial V \).
\( \frac{d}{dt} \iiint_V m(x,y,z) dxdydz - \iiint_V R(x,y,z) dxdydz \)
\( = \iint_{\partial V} -m \vec{v}\cdot \hat{n} + D\nabla m \cdot \hat{n} dxdy \)
We can apply Gauss’s Divergence Theorem in order to convert the double integrals into triple integrals.
\( \frac{d}{dt} \iiint_V m(x,y,z) dxdydz \)
\( = \iiint_{V} R(x,y,z) -\nabla(m) \cdot \vec{v} + D\nabla^2 m dxdydz \)
And finally, by applying the Reymond-Dubois lemma we will arrive at the following system of partial differential equations.
\( \frac{dm}{dt} + \vec{v}\cdot \nabla m - D\nabla^2 m = -(\frac{a}{f}) (k_a m (Q-q) - k_d q) \)
\( \frac{dq}{dt} = k_a m (Q-q) - k_d q \)
Unfortunately due to numerical problems, a solution to this system has not been completely developed. More work on improving numerical stability must be done before this will produce significant results. The main issue faced in solving this model using the finite element method is the timescale discrepancy between the transport and the reaction rate. From our review of the literature it appears that a typical reaction rate for our system will be much faster than the speed at which solution is transported by the advective term in the above equation. This implies that a very high resolution simulation must be performed, and advanced computational techniques may be required. However, we found that transport effects may be neglected locally in the reactor by a quasi-steady state approximation which is explained below. This allows us to still compute meaningful results from the model despite not having an exact solution.
Once a reactor has been designed, we can use the Ergun equation (Ergun, 1949), a standard tool in chemical engineering, to determine the pressure drop across the packed column. Pressure drop corresponds to the positive pressure required at the input end of the reactor to drive fluid flow through the reactor. This positive pressure would need to be applied using a pump. An approximate version of the Ergun equation suitable for a liquid packed column is listed below (Wu, 2007). All parameters necessary for computation of this pressure drop can be found in the parameter list above and the description of the reactor.
\( \frac{\Delta P}{L} = \frac{72 \mu \tau (1 - \epsilon)^2 v_s}{D^2_P \epsilon^3} + \frac{3 \tau (1 - \epsilon) \rho v^2_s \left(\frac{3}{2} + \frac{1}{\beta^4} - \frac{5}{2 \beta^2} \right)}{4 \epsilon^3 D_p} \)
Where L is the height of the reactor, and tau and beta are the tortuosity and pore-to-throat ratio which are defined as follows.
\( \tau = \frac{1}{2}\left[1 +\frac{1}{2} \sqrt{1 - \epsilon} +\sqrt{1 - \epsilon}\frac{\sqrt{\left(\frac{1}{\sqrt{1 - \epsilon}} - 1\right)^2 + \frac{1}{4}}}{1 - \sqrt{1 - \epsilon}}\right] \)
\( \beta = \frac{1}{1 - \sqrt{1 - \epsilon}} \)
Computational Approaches
To simulate the partial differential equation model, we decided to use FEniCS, a computing platform which solves partial differential equations and has an easy-to-use Python interface. To use FEniCS in a computational environment on Windows (used by most of us), it needs to be installed on the Windows Subsystem for Linux (WSL). To use FEniCS on macOS, the macOS Anaconda package can be used. For editing while using WSL, we found VS Code with the Remote - WSL extension to be most useful. Last, for source code management, we used a Git repository for the team. This environment set-up allowed us to work efficiently and collaborate easily. Our main FEniCS simulation code can be found here.
Getting Parameters
Once model equations had been developed, it was necessary to find values for the parameters so that useful results could be computed. These parameters were found directly in literature, or computed from combinations of other parameters.
Parameters are required to describe the scale of the problem that we are trying to address using this reactor. We need to know the volume of waste to be treated per day and the concentration of dissolved copper in that waste. We also need to know the acceptable concentration of dissolved metal in the treatment outflow, so that it can be safely released to the environment.
To estimate the amount of copper-containing waste to treat each day, we looked to existing electronics manufacturers that produce metal-finishing wastes. One semiconductor manufacturer in Taiwan reported producing 4.44 million metric tonnes of dissolved metal waste over three manufacturing facilities each year (“TSMC Corporate Social Responsibility Report”, 2019). This would correspond to 1.48 million metric tonnes per year per facility, or about 46.9 L/s at each facility. For this reason, we will design a reactor that can treat an average of 46.9 L/s of waste over the course of each day.
Chiu et al. (1987) states that typical copper concentrations in metal finishing wastewaters can range from 1-30 mg/L. For this reason, we will design our reactor to be able to treat 30 mg/L dissolved copper.
The United States Environmental Protection Agency (USEPA) states that the permissible limit for dissolved copper in industrial wastes discharged to the environment is 1.3 mg/L (Al-Saydeh et al., 2017). For this reason, our reactor must be able to reduce the concentration of dissolved copper from 30 mg/L to below 1.3 mg/L.
The parameters relevant to the cellulose binding domain and the cellulose packing material are the specific surface area and the surface protein concentration. These were found in McLean (2000) and are listed in the table below. The fraction of volume occupied by cellulose for a typical 3 millimeter diameter packing material is found in (Sen, 2015).
Similarly, an equilibrium constant for a variant of CopC was found in Young et al (2015) and a forwards rate constant for a structurally and functionally similar copper binding protein was found in Du, 2013. The equilibrium constant is necessary to determine whether the reaction happens at an appreciable rate relative to the fluid transport speed in the reactor. This rate constant provides an estimate of the reaction timescale which will inform how large our reactor must be.
Description | Variable | Value | Source |
---|---|---|---|
Input Concentration (Cu) | \(m\) | 30 \( \frac{mg}{L}\) | Chiu, 1984 |
Output Concentration (Cu) | 1.3 \( \frac{mg}{L}\) | Saydeh et al, 2017 | |
Surface Protein Concentration | \(Q\) | 0.35E-6 \( \frac{mol}{m^2}\) | McLean, 2000 |
Specific Surface Area | \(a\) | 72600000 \(\frac{1}{m}\) | McLean, 2000 |
Forwards Rate Constant | \(k_a\) | 0.015 \(\frac{1}{s}\) | Du, 2013 |
Reverse Rate Constant | \(k_d\) | 3e-16 \(\frac{1}{s}\) | Du, 2013 and Young et al, 2015 |
Equilibrium Constant | \(K_D\) | 1E-13.7 | Young et al, 2015 |
Void Fraction | \(\rho\) | 0.31 | Sen, 2016 |
Density of Cellulose | 1.5 \(\frac{g}{cm^3}\) | National Center for Biotechnology Information, 2020 | |
Porosity | \(\epsilon\) | 0.69 | Sen, 2016 |
Superficial Reactor Velocity | \(v_s\) | 0.2552 | Computed based on reactor size. See below |
Results & Discussion
The following plots are taken from the aforementioned model with the approximation that advective transport phenomena can be neglected when compared to the reaction timescale. This is reasonable because the copper binding protein chosen has a very small equilibrium constant, which implies that the reaction time scale is very fast. This model assumes that the reactor is well mixed, since the cellulose in the reactor will produce very strong eddy diffusion.
From these two plots we can see that within 5 seconds a vast majority of the copper present in solution has adsorbed onto the CopC binding sites, while only a small fraction of the CopC has actually been used up. This implies that a packed column reactor assuming similar initial inflow concentrations will not saturate too quickly to be of use.
It is also visible from the model that the timescale of the reaction is much smaller than the time a copper ion spends in a typical large packed column reactor. This led to numerical problems when solving the partial differential equation model, which we did not have time to fix. Instead, due to the rapid equilibrium approximation we may use the batch reactor to estimate a typical length scale for the reactor.
Through some dimensional analysis, we can find that 99.5% of copper ions which spend longer than 5 seconds in the reactor will be adsorbed. This means that we can potentially treat 200 times the volume of the reactor in copper solution before the reactor is saturated. This approximation allows us to choose a scale for our reactor.
For the metal finishing waste stream chosen, we will need to treat an average of 46.9 L/s of waste over the course of each day (see “Getting Parameters”), or 4 052 160 L/day. We choose to treat this waste using four 5-hour reactor cycles, allowing for 1-hour breaks in between for elution and other routine maintenance. This means that 1 013 040 L of waste must be treated in each of the four reactor cycles. Since we can treat 200 times the volume of the reactor in each reactor cycle, our reactor must be able to accommodate 1/200th of the waste per cycle, or 5065 L. However, due to the packing of 3 mm cellulose particles,, only 31% of our total reactor volume is actually free space. This means that we need the total volume of our packed column to be 16 339 L to accommodate 5065 L of liquid. Based on a 1:10 height-diameter ratio, this gives a packed column 1.28 m in diameter and 12.8 m in length.I n order to fill the reactor, we would require approximately 17 000kg of cellulose packing material, assuming a 3 mm particle diameter and 31 % void space.
In our discussion with stakeholders from Aevitas Hazardous Management, they told us that they used an approximately 30 000 L batch reactor to treat their wastes. Although we are using a continuous process with some differences from Aevitas’ process, this confirms that our proposed reactor scale of about 16 000 L is reasonable and within the size range of existing waste treatment processes.
With the above assumptions for our reactor, according to the estimate of 99.5% of all copper ions being depleted after a reactor cycle the effluent concentration should be below 1 µmol/L. This falls well below our goal of an output concentration of 1.3mg/L of copper.
Now that the reactor has been designed, we can use the Ergun equation to compute pressure drop. The velocity of fluid in the reactor should then be chosen to be the height divided by 50 seconds, in order to allow the copper ions to stay in the reactor for longer than 5 seconds at a time. This means that the superficial flow velocity in the reactor is 0.2556 m/s. We apply the Ergun equation (Ergun, 1949), a standard tool in chemical engineering, to calculate the resulting pressure drop across our packed column reactor.
For our reactor of diameter 1.28 m and length 12.8 m, the pressure drop is 0.126 bar. This means that in order for wastewater to flow through the reactor, we must use a pump to pressurize the reactor inlet to at least 0.126 bar greater than the outlet. This is very reasonable by industrial standards.
Next Steps & Implementation
Refer to the Implementation page for a more detailed discussion of our envisioned implementation.
The results of our modelling allow us to propose a reactor design that is suitable for treating a copper containing waste stream from a typical semiconductor manufacturing facility. Although this model provides the reactor scale and dimensions needed, there are several other parameters required in order to operate the reactor.
In particular, we must determine the effect of pH on copper binding in the reactor. pH must be determined for both the initial binding phase and the subsequent elution phase. This would ideally be determined using experiments to characterize our protein’s copper binding at a range of pH values. However, information from literature can be used to make rough estimates of the pH for both the initial binding phase and the elution phase.
In the binding stage, we want the copper to bind to the copper binding proteins in the reactor as strongly as possible. In CopC, the copper ion is coordinated by histidine, aspartic acid, and glutamic acid (Lawton, 2016). In order for copper binding to occur, these amino acids must all exist in their deprotonated forms so that their lone electron pairs can bind to copper. These deprotonated forms are also negatively or neutrally charged, attracting the positively charged copper ion. The deprotonated state can be achieved for all of these amino acids at a pH of 7.2 or higher (Včeláková et al., 2004). This means that the dissolved copper waste fed into the reactor must be at pH 7.2 or higher. Metal finishing waste is generally near neutral pH (van Dijken et al., 1999) so only small adjustments would be needed. The pH could be adjusted by adding small amounts of concentrated acid or base to the dissolved copper waste before feeding it to the reactor.
In the elution stage, we want copper ions to bind to the copper binding proteins in the reactor as weakly as possible, so that the copper ions can be removed from the reactor. To weaken copper binding, we can protonate the histidine, aspartic acid, and glutamic acid residues involved in copper binding. Protonation prevents these amino acids’ lone pairs from bonding with the copper ion, and also causes the side chains to become neutrally or positively charged, further repelling the copper ion. Protonation of these amino acids can be achieved at pH 4.8 or lower (Včeláková et al., 2004). This means that metal ions can be removed from the reactor by flowing through a solution with a pH of 4.8 or lower. This would result in a concentrated and pure copper solution. If necessary, the pH of the recovered metal solution could be further adjusted for recycling by using small volumes of concentrated acid or base.
Determining the pH required for the elution and binding stages of our process therefore moves us one step closer to building a functioning industrial process. The pH values determined herein are specific to copper and the CopC protein, but required pH values could be calculated for any metal and metal binding protein. In general, amino acids must be deprotonated to bind metals, so the metal binding step will always require a higher pH than the elution step.
Below is a proposed schematic of our packed column reactor’s proposed implementation, explained further on the Implementation page.
References
Clairambault, J. (2013). Reaction-Diffusion-Advection Equation. In W. Dubitzky, O. Wolkenhauer, K.-H. Cho, & H. Yokota (Eds.), Encyclopedia of Systems Biology (p. 1817). Springer New York. https://doi.org/10.1007/978-1-4419-9863-7_697
Fu, Fenglian, and Qi Wang. "Removal of heavy metal ions from wastewaters: a review." Journal of environmental management 92.3 (2011): 407-418.
Malik, et. al. "Detection and removal of heavy metal ions: a review." Environmental Chemistry Letters 17.4 (2019): 1495-1521. https://doi.org/10.1007/s10311-019-00891-z
Zoroddu, M. A., Medici, S., & Peana, M. (2009). Copper and nickel binding in multi-histidinic peptide fragments. Journal of Inorganic Biochemistry, 103(9), 1214-1220. doi:10.1016/J.jinorgbio.2009.06.008
Včeláková, K., Zusková, I., Kenndler, E., & Gaš, B. (2004). Determination of cationic mobilities and pKa values of 22 amino acids by capillary zone electrophoresis. Electrophoresis, 25(2), 309-317. doi:10.1002/elps.200305751
Wijekoon, C. J., Young, T. R., Wedd, A. G., & Xiao, Z. (2015). CopC protein from Pseudomonas fluorescens SBW25 features a conserved novel high-affinity Cu2+ binding site. Inorganic chemistry, 54(6), 2950–2959. https://doi.org/10.1021/acs.inorgchem.5b00031
Shuler, M.L., and Kargi, F. 2002. Bioprocess Engineering Basic Concepts. In 2nd edition. Prentice Hall PTR, Upper Saddle River, NJ.
Arán, D., Antelo, J., Lodeiro, P., Macías, F., and Fiol, S. 2017. Use of Waste-Derived Biochar to Remove Copper from Aqueous Solution in a Continuous-Flow System. Ind. Eng. Chem. Res. 56(44): 12755–12762. doi:10.1021/acs.iecr.7b03056.
Al-Saydeh, S.A., El-Naas, M.H., and Zaidi, S.J. 2017. Copper removal from industrial wastewater: A comprehensive review. J. Ind. Eng. Chem. 56: 35–44. The Korean Society of Industrial and Engineering Chemistry. doi:10.1016/j.jiec.2017.07.026.
Chiu, H.S.S., Tsang, K.L., and Lee, R.M.L. 1984. Treatment of electroplating wastes. Water Sanit. Asia Pacific Proc. 10th WEDC Conf.: 115–119.
Leaver-Fay, A., Tyka, M., Lewis, S. M., Lange, O. F., Thompson, J., Jacak, R., Kaufman, K., Renfrew, P. D., Smith, C. A., Sheffler, W., Davis, I. W., Cooper, S., Treuille, A., Mandell, D. J., Richter, F., Ban, Y. E. A., Fleishman, S. J., Corn, J. E., Kim, D. E., … Bradley, P. (2011). Rosetta3: An object-oriented software suite for the simulation and design of macromolecules. Methods in Enzymology, 487(C), 545–574. https://doi.org/10.1016/B978-0-12-381270-4.00019-6
Kaufmann, K. W., Lemmon, G. H., Deluca, S. L., Sheehan, J. H., & Meiler, J. (2010). Practically useful: What the Rosetta protein modeling suite can do for you. Biochemistry, 49(14), 2987–2998. https://doi.org/10.1021/bi902153g
How to prepare structures for use in Rosetta. (2020). Retrieved 14 October 2020, from https://www.rosettacommons.org/docs/latest/rosetta_basics/preparation/preparing-structures
Applications for Macromolecule Design. (2020). Retrieved 14 October 2020, from https://www.rosettacommons.org/docs/latest/application_documentation/design/design-applications
Relax application. (2020). Retrieved 14 October 2020, from https://www.rosettacommons.org/docs/latest/application_documentation/structure_prediction/relax
Scoring Tutorial. (2020). Retrieved 14 October 2020, from https://www.rosettacommons.org/demos/latest/tutorials/scoring/scoring
iGEM Technion 2016, & iGEM TU Eindhoven 2016. (2016). Retrieved 14 October 2020, from https://static.igem.org/mediawiki/2016/5/59/Rosetta_Guide_for_the_iGEM_Beginner.pdf
Lawton, T. J., Kenney, G. E., Hurley, J. D., & Rosenzweig, A. C. (2016). The CopC Family: Structural and Bioinformatic Insights into a Diverse Group of Periplasmic Copper Binding Proteins. Biochemistry, 55(15), 2278-2290. doi:10.1021/acs.biochem.6b00175
van Dijken, K., Prince, Y., Wolters, T., Frey, M., Mussati, G., Kalff, P., Hansen, O., Kerndrup, S., Søndergârd, B., Rodrigues, E.L., and Meredith, S. 1999. Electroplating industry. : 97–128. doi:10.1007/978-94-007-0854-9_9.
TSMC Corporate Social Responsibility Report 2019. 2019. Retrieved from https://csr.tsmc.com/download/csr/2019-csr-report/english/pdf/e-all.pdf
Du, X., Li, H., Wang, X., Liu, Q., Ni, J., & Sun, H. (2013). Kinetics and thermodynamics of metal binding to the N-terminus of a human copper transporter, hCTR1. Chemical Communications, 49(80), 9134. doi:10.1039/c3cc45360j
McLean, B. W., Bray, M. R., Boraston, A. B., Gilkes, N. R., Haynes, C. A., & Kilburn, D. G. (2000). Analysis of binding of the family 2a carbohydrate-binding module from Celludomonas fimi xylanase 10a to cellulose: Specificity and identification of functionally important amino acid residues. Protein Engineering, 13(11), 801–809. https://doi.org/10.1093/protein/13.11.801
Young, T. R., Wijekoon, C. J. K., Spyrou, B., Donnelly, P. S., Wedd, A. G., & Xiao, Z. (2015). A set of robust fluorescent peptide probes for quantification of Cu2+ binding affinities in the micromolar to femtomolar range. Metallomics, 7(3), 567–578. doi:10.1039/c4mt00301b
Sen, P., Bhattacharjee, C., & Bhattacharya, P. (2016). Experimental studies and two-dimensional modelling of a packed bed bioreactor used for production of galacto-oligosaccharides (GOS) from milk whey. Bioprocess and Biosystems Engineering, 39(3), 361–380. https://doi.org/10.1007/s00449-015-1516-2
Berendsen, et al. 1995. GROMACS: A message-passing parallel molecular dynamics implementation. Comp. Phys. Comm. 91: 43-56.. doi:10.1016/0010-4655(95)00042-E
National Center for Biotechnology Information (2020). PubChem Compound Summary for CID 16211032, Cellulose. Retrieved October 22, 2020 from https://pubchem.ncbi.nlm.nih.gov/compound/Cellulose.
Ergun, S., & Orning, A. A. (1949). Fluid Flow through Randomly Packed Columns and Fluidized Beds. Industrial & Engineering Chemistry, 41(6), 1179-1184. doi:10.1021/ie50474a011
Wu, J., Yu, B., & Yun, M. (2007). A resistance model for flow through porous media. Transport in Porous Media, 71(3), 331-343. doi:10.1007/s11242-007-9129-0