Team:Waterloo/Engineering

We have used the engineering design cycle for several key parts of our project. First, we used it to improve the copper affinity of the protein CopC. We then used it to design copper- and cellulose-binding fusion proteins to place inside our bioreactor. Lastly, we used the cycle to design an industrial-scale packed column reactor that allows our fusion proteins to treat large volumes of wastewater containing dissolved copper.
PROTEIN DESIGN
The steps of the engineering cycle.

We applied the engineering design cycle to rationally improve the copper affinity of the protein Mst-CopC. Increasing Mst-CopC’s copper affinity allows our industrial process to remove copper from wastewaters more efficiently. Our industrial process will use Mst-CopC fused to the cellulose binding module CBM2a. However, since Mst-CopC and CBM2a are connected by a long flexible linker, we can consider the Mst-CopC protein in isolation for the purposes of improving copper affinity. Although we were not able to access the lab, molecular dynamics simulations demonstrate that our engineered Mst-CopC has increased copper affinity.

Research

For the “Research” stage, we investigate existing copper binding proteins and evaluate their viability for our industrial process. This eventually leads us to select the protein Mst-CopC.

Criteria For Suitable Metal Binding Proteins

Metals are required for many important cellular processes, giving rise to a diverse range of metal-binding proteins capable of binding a multitude of metals (Major et al., 2016). For the purposes of a cellulose packed column bioreactor (this reactor is discussed in the process design section of this page), metal-binding proteins must satisfy several criteria: proteins should preferably have a high binding affinity for the chosen metal, a compact binding domain, and be monomeric as well as periplasmic (Urbina et al., 2019). High binding affinity is preferred to increase the metal recovery from wastewater effluent (Urbina et al., 2019). Selecting monomeric and periplasmic proteins is also critical to ensure proper folding of the selected domain outside of the cell and as part of a recombinant protein (Urbina et al., 2019).

Once we had identified copper as an industrially-relevant metal to target, we needed to identify proteins that could bind copper and remove it from industrial wastewaters. We decided on Mst-CopC after considering a range of other copper-binding proteins as described below.

Copper-Binding Proteins

Copper is a cofactor for a myriad of key biological processes, so maintaining balanced copper levels within the cell is critical. To properly regulate copper levels, biology has evolved many systems of copper-binding proteins. These protein families include the Cue, Cus and Cop proteins (Lawton, Kenney, Hurley, & Rosenzweig, 2016).

The copper efflux (Cue) proteins prevent cellular copper concentrations from reaching dangerously high levels under aerobic conditions (Lawton et al., 2016). The copper sensing (Cus) proteins function similarly, preventing dangerous copper concentrations under anaerobic conditions (Lawton et al., 2016). Although there exist a number of different Cop proteins, designated A, B, C, D, E, R and S, this group is less understood (Udagedara, Wijekoon, Xiao, Wedd, & Maher, 2019). From this group, CopC is a periplasmic copper binding protein believed to play a role in copper uptake along with the inner membrane CopD proteins (Lawton, 2016), but the exact roles of Cop proteins are still an active research area.

Selecting Mst-CopC

CopC proteins are periplasmic copper-binding proteins that are believed to play a role in the regulation of copper concentrations within bacteria (Lawton, 2016). Different versions of CopC exist in many bacteria yet they all seem to serve similar roles (Lawton, 2016). CopCs are good targets for protein engineering for several reasons.

For one, CopCs are monomeric, unlike other more complex oligomeric alternatives such as CSS, COX17, or COX11 (Heaton, George, Garrison, & Winge, 2001; Palumaa, 2013; Timón-Gómez et al., 2018). Oligomeric proteins consist of several non-covalently bound subunits making them impractical for our packed column process as the subunits could become detached. Oligomeric proteins are also more difficult to express, purify, and engineer. CopC is also a relatively small protein with a modest size of 13 042 Da (UniProt). A smaller protein is more desirable, since the expression of small proteins typically proves more successful, especially for fusion proteins. Small proteins are also easier to model and engineer.

Additionally, many copper-binding proteins are membrane proteins. Membrane proteins are not suitable for our application since they contain exposed hydrophobic regions, meaning that they cannot be folded in the aqueous environment of our reactor. Expression and purification of membrane proteins are also difficult. Since CopC is a soluble periplasmic protein (Lawton, 2016), it is a good candidate for our technology.

CopC has multiple variants, isolated from different bacterial strains such as Methylosinus trichosporium (Mst-CopC), Pseudomonas syringae (Pss-CopC) and Pseudomonas fluorescens (Pf-CopC) (Lawton et al., 2016; Wijekoon, Young, Wedd, & Xiao, 2015). Many of these CopC variants include separate binding sites for Cu+ and Cu2+. However, Mst-CopC only has one Cu2+ binding site. In comparison to its analogous CopC variants, Mst-CopC lacks the critical residues required to bind Cu+ (Lawton et al., 2016). Cu+ binding sites in CopC proteins typically exhibit multiple conserved methionine residues, whereas the Cu2+ binding site tends to appear more histidine-rich (Lawton et al., 2016). Mst-CopC only coordinates single Cu2+ ions in a distorted square pyramidal coordination geometry with H23, H107, D105 and an axial positioned water molecule (Lawton et al., 2016). Mst-CopC also possesses a seven stranded 𝛽-barrel, similarly observed in other variants such as Pss-CopC but with slight deviations in the arrangement of the strands (Lawton et al., 2016). For this reason, we chose Mst-CopC as our starting point for improving copper affinity. The single copper binding site makes this protein much simpler to engineer and model compared to other copper binding proteins.

Figure 1. Cartoon structure of Mst-CopC, including the bound Cu2+ ion.

Imagine

For the “Imagine” section, we consider useful improvements that we could make to CopC and develop the general approaches that could be used to achieve these improvements.

Motivation for Improving Mst-CopC

Mst-CopC will be implemented in a packed column reactor, which may then be used in industry to recover copper from effluent. From an industrial and environmental standpoint, we want this reactor to be as efficient and robust as possible. To achieve this, we need to ask the following:

  • How can we improve our fusion protein to bind copper as efficiently as possible?
  • How can we reduce the operating costs of our system, making it appealing for use in industry?
  • How can we ensure that our system binds copper as selectively as possible, thereby maximizing the purity of the recovered metal?

From a biochemical perspective, these questions can be answered by improving the affinity and specificity of our copper-binding protein, Mst-CopC. By modifying Mst-CopC such that its binding affinity to copper increases, we can allow our column to recover copper as efficiently as possible, which in effect reduces operating costs. Additionally, by improving Mst-CopC’s specificity to copper, we can increase the proportion of copper recovered and minimize the proportion of other undesired metals that bind to Mst-CopC, thereby improving the purity of the recovered copper.

To improve affinity, key residues involved in binding may be mutated such that the binding site is better suited to copper or has higher affinity for it (Koay et al., 2013). To do so, residues that are predicted to have a marked effect on affinity (either through literature or analysis of the protein structure of Mst-CopC) are mutated, and the effect of the mutation on binding affinity/energy is assessed using molecular dynamics simulations and other software tools. To improve specificity, Mst-CopC’s affinity for metals other than copper can be decreased: by definition, increasing Mst-CopC’s affinity for copper while decreasing its affinity for other metals improves its specificity for copper (Koay et al., 2013).

Below we present research into improving CopC’s Cu2+ affinity, and several other ideas we had for modulating Mst-CopC’s specificity. However, research into each of these specificity ideas showed that they were not feasible, leading us to focus exclusively on improving CopC’s copper affinity.

Investigating Copper Binding Using Multiple Sequence Alignments

A multiple sequence alignment (MSA) is a bioinformatics tool that utilizes an optimization algorithm to identify regions of homology/similarity between different amino acid sequences (Chatzou et al., 2016). As a result, MSA is useful for the study of homology and evolutionary relationships between proteins (Chatzou et al., 2016). With respect to protein binding, MSAs may help to predict key binding residues of an uncharacterized protein by inferring these residues from a highly similar protein. MSAs can also identify candidate residues whose mutation may result in binding with higher or lower affinity; this is the purpose for which the MSAs were used in this case. Consider two proteins, protein A and protein B, which both bind the same ligand and have known, characterized binding site residues. By comparing differences in binding site residues between these two proteins, one can infer that mutating protein A to more closely match protein B’s binding site residue(s) is likely to increase or decrease the binding affinity of protein A.
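The protein A versus protein B comparison can be sketched in a few lines of Python. This is only an illustration, not the analysis performed in Geneious: the two aligned fragments, the binding-site column indices, and the function name `binding_site_differences` are all hypothetical.

```python
def binding_site_differences(aligned_a, aligned_b, site_columns):
    """Return (column, residue_a, residue_b) for binding-site columns that differ.

    aligned_a, aligned_b: rows from the same alignment (equal length, '-' = gap).
    site_columns: 0-based alignment columns annotated as binding-site positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    differences = []
    for col in site_columns:
        res_a, res_b = aligned_a[col], aligned_b[col]
        if res_a != res_b:
            differences.append((col, res_a, res_b))
    return differences


# Hypothetical aligned fragments for protein A and protein B.
protein_a = "HF-GSDAH"
protein_b = "HHKGSDAH"
# Hypothetical binding-site columns in the alignment.
candidates = binding_site_differences(protein_a, protein_b, [0, 1, 5, 7])
print(candidates)
```

Each returned tuple is a position where protein A could be mutated to match protein B; whether the change raises or lowers affinity still has to be assessed separately.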

An MSA might therefore be useful to identify residues on Mst-CopC that, when mutated, might increase its affinity towards its ligand, Cu2+. The following MSA aligned the sequence of Mst-CopC with 9 other copper-binding proteins in an attempt to identify homologous binding site residues. The copper-binding proteins involved in this MSA, preceded by their NCBI accession numbers, were the following:

  • Template: Accession #; Protein name (native species)
  • AAH10933.1; COX17 (Homo sapiens), binds Cu+, copper ligand bound
  • AAH05895.1; COX11 (Homo sapiens, yeast homolog), binds Cu+
  • AAI05017.1; CSS (Homo sapiens), binds Cu+ and Cu2+
  • AAC41458.1; SlyD (E. coli), binds Cu+
  • AAH15504.1; SCO1 (Homo sapiens), binds Cu+
  • Q94BT9.2; Atx1 (Arabidopsis thaliana), binds Cu+
  • O68481; NosL (A. cycloclastes), binds Cu+
  • DAA06748.1; Cup1 (Saccharomyces cerevisiae), binds Cu+ and Cu2+
  • VWQ03775.1; NikR (E. coli), binds Cu2+
  • 5ICU_A; CopC (Methylosinus trichosporium), binds Cu2+

If homologous binding site residues were to be found between Mst-CopC and another protein, then the other protein could be further analyzed to see if any other residues were involved in Cu2+-binding, which would be mutation candidates for Mst-CopC. The following image is the MSA for Mst-CopC with the 9 other Cu2+-binding proteins:

Figure 2. Multiple sequence alignment of Mst-CopC and several other Cu2+ binding proteins.

(This MSA was performed on Geneious using the Clustal Omega algorithm. Residues were coloured based on hydrophobicity, where a scale of red to blue = most hydrophobic to least hydrophobic. Binding residues were annotated in brown under each amino acid sequence.)

Unfortunately, this MSA did not help to identify mutation candidates, as there were no copper-binding proteins with binding sites/residues similar to Mst-CopC. In fact, cysteine-based binding sites (especially C-X-C motifs, where C = cysteine and X = another amino acid) were common to all copper-binding proteins except Mst-CopC. The main takeaway from this MSA was that Mst-CopC was more anomalous in terms of binding mechanism than was originally expected, and that a more useful MSA would align Mst-CopC with other CopC proteins, which would be most likely to utilize similar binding sites that could be readily compared with Mst-CopC. As well, this MSA showed that cysteine was a common copper-binding residue, suggesting that mutating a candidate residue to cysteine may help to facilitate copper-binding and improve affinity.

From here, another MSA was done with Mst-CopC and 12 other CopC proteins from other species (henceforth termed CopC variants). As a follow-up to the previous MSA analysis, this was done in the hopes of aligning proteins with overall homologous binding sites to Mst-CopC, while also identifying some binding residue differences that would give rise to mutation candidates. Other than Mst-CopC (accession number 5ICU_A), the following were the CopC variants included in the MSA (preceded by their NCBI accession number):

  • WP_030140788.1; PfCopC variant (Pseudomonas fluorescens)
  • WP_027898553.1; Pseudomonas syringae multispecies CopC variant (Pseudomonas syringae)
  • WP_026612268.1; YobA (Klebsiella aerogenes)
  • WP_019845025.1; DzCopC variant (Dickeya zeae)
  • WP_015884740.1; Pf-SBW25-CopC variant (Pseudomonas fluorescens)
  • WP_011514843.1; CmCopC variant (Cupriavidus metallidurans)
  • WP_002445015.1; SbCopC variant (Shimwellia blattae)
  • Q56797.1; CopC variant
  • Q47454.1; PcoC variant (Escherichia coli)
  • AAP88297.1; PpCopC variant (Pseudomonas putida)
  • AAN80710.1; YobA variant (Escherichia coli CFT073 strain)
  • AAA25808.1; PsCopC variant (Pseudomonas syringae)

The following is the MSA of Mst-CopC with the 12 other CopC variants:

Figure 3. Multiple sequence alignment of Mst-CopC and several other CopC variants.

(This MSA was performed on Geneious using the Clustal Omega algorithm. Residues were coloured based on hydrophobicity, where a scale of red to blue = most hydrophobic to least hydrophobic. Binding residues were annotated in brown under each amino acid sequence.)

Previous literature on Mst-CopC has suggested that mutating residue 3 of the cleaved protein (where “cleaved” refers to the removal of a signal peptide before residue 33 on the “consensus identity” sequence) from phenylalanine to histidine would likely result in a ~100x increase in binding affinity for Cu2+ (Lawton et al., 2016). The MSA is consistent with this: the histidine appears at this position in other CopC variants but not in Mst-CopC, making a His-3 mutation a prime candidate for improving Mst-CopC binding affinity.

Idea: Decreasing Hg2+ Affinity

Literature shows that some CopC variants can bind Hg2+ (Song et al., 2016). This would be undesirable for our industrial process, since the copper we recovered from our column would be contaminated with mercury. For this reason, we considered rational design approaches to decrease Hg2+ affinity.

Pss-CopC consists of a Cu+ binding site situated at the C-terminal, and a Cu2+ binding site at the N-terminal (Song et al., 2016). Its Cu+ binding site is defined by His48 and 2-3 conserved Met residues (Song et al., 2016). The Cu2+ binding site consists of H1, E27, D89 and H91 (Song et al., 2016). This differs from the conserved residues observed in Mst-CopC for binding Cu2+, previously mentioned as H23, H107, D105 and an axial positioned water molecule (Lawton et al., 2016).

Given the possibility of Hg2+ competing for the Pss-CopC Cu2+ binding site, it was thought to be worth investigating whether Hg2+ would behave similarly with regard to Mst-CopC. Hg2+ has an ionic radius of 102 pm while Cu2+ has an ionic radius of 73 pm (Petrucci, Herring, Madura, & Bissonnette, 2017). Therefore, one method to improve selectivity was to shrink the Cu2+ binding site so that it excludes ions larger than Cu2+. Mst-CopC and Pss-CopC were modelled in PyMOL to visualize the size of the Cu2+ sites and to inform possible mutations that would decrease the size of the site in Mst-CopC.

Figure 4. Cartoon drawing of Mst-CopC Cu2+ binding site rendered using PyMOL.
Figure 5. Cartoon drawing of Pss-CopC Cu2+ binding site rendered using PyMOL.

When analyzed in PyMOL, it was evident that the Cu2+ binding site in Pss-CopC was considerably larger than the Cu2+ binding site in Mst-CopC. In Mst-CopC, the Cu2+ ion appears to fit tightly, coordinated by three critical residues (and an axial water molecule). In Pss-CopC, however, the critical Cu2+ binding residues are far apart, which may explain why Pss-CopC is reported to bind other, larger ions in this site.

Due to the significant size difference in binding sites, it was concluded that the reported observation of competing ions in Pss-CopC was an unlikely concern for Mst-CopC. Therefore, modifications to shrink the Cu2+ binding site were deemed an ineffective way to significantly change the binding specificity of Mst-CopC.

Idea: Decreasing Ni2+ Affinity

Literature shows that the active sites of many copper and nickel binding proteins are very similar in structure and composition (Sudan, et al., 2015). This means that many copper binding proteins also show affinity for nickel. This would be undesirable since the copper we recovered from our reactor column would be contaminated with nickel. For this reason, we considered rational design approaches to decrease Ni2+ affinity.

The first approach used to decrease Ni2+ affinity was to perform multiple sequence alignments of both copper- and nickel-binding proteins. Comparing these two MSAs gave the team a better understanding of the active sites of both copper- and nickel-binding proteins, as well as more tools to further increase CopC's copper selectivity.

Comparing the structure and layout of copper- and nickel- binding proteins gave us more insights into the structure and functionality of each type of protein. First, most copper-binding proteins had coordinating residues centered around amino acids 175 to 200 of the consensus sequence. On the other hand, nickel-binding proteins had coordinating residues centered around amino acids 500 to 550 of the consensus sequence. This means that nickel-binding coordinating residues tend to be closer to the C-terminus of the protein in comparison to copper-binding proteins. Additionally, Ni-binding active sites tended to consist of 3 or 4 amino acids, while most Cu-binding active sites consisted of 5-7 amino acids.

Figure 6. Multiple sequence alignment of nickel-binding proteins.
Figure 7. Multiple sequence alignment of copper binding proteins.

(MSAs were performed on Geneious using the Clustal Omega algorithm. Residues were coloured based on hydrophobicity, where a scale of red to blue = most hydrophobic to least hydrophobic. Binding residues were annotated in brown under each amino acid sequence.)

Additionally, the nickel affinity of a protein can be regulated through the manipulation of the coordination geometry formed by the coordinating residues and the Ni2+ ion (Sudan et al., 2015). For example, after extensive PyMOL modelling, Sudan et al. determined that practically all tripeptide nickel-coordinating residues exhibited trigonal pyramidal geometry when bound to a Ni2+ ion. By having a tripeptide coordinating residue that exhibits any other coordination geometry, it is possible to significantly reduce the nickel affinity of the desired protein.

With these concepts in mind, two changes were proposed to decrease CopC's Ni2+ affinity.

  1. Changing D105 to E105 (residue 83)
  • Glutamic acid has a similar structure to aspartic acid, but the R chain is one carbon longer.
  • This makes sense since Ni has a slightly larger atomic radius compared to copper (1.49 Å vs 1.45 Å, respectively).
  2. Changing S103 to H103 (residue 81)
  • Creates square pyramidal geometry.
  • According to Sudan et al. (2015), practically all Ni2+-binding tripeptide coordinating residues exhibited trigonal pyramidal geometry.
  • Since it has a different coordination geometry to the one needed to bind Ni, it is likely to drastically decrease nickel affinity.
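The two numbering schemes used in this list differ by a constant offset. The sketch below assumes an offset of 22 residues, which is inferred only from the pairs quoted above (D105 is residue 83, S103 is residue 81) and is not taken from the structure file; the function names are hypothetical.

```python
# Offset between full-length and cleaved numbering, inferred from the pairs
# in the text (105 -> 83 and 103 -> 81). This is an assumption, not a value
# read from the PDB entry.
SIGNAL_PEPTIDE_OFFSET = 22


def to_cleaved(full_length_number):
    """Convert a full-length residue number to cleaved-protein numbering."""
    cleaved = full_length_number - SIGNAL_PEPTIDE_OFFSET
    if cleaved < 1:
        raise ValueError("residue lies within the removed signal peptide")
    return cleaved


def to_full_length(cleaved_number):
    """Convert a cleaved-protein residue number to full-length numbering."""
    return cleaved_number + SIGNAL_PEPTIDE_OFFSET


for name, number in [("H23", 23), ("S103", 103), ("D105", 105), ("H107", 107)]:
    print(name, "->", to_cleaved(number))
```

Under this assumption the binding residues H23, D105 and H107 correspond to positions 1, 83 and 85 of the cleaved protein.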

Idea: Increasing Nickel Affinity

The past several approaches discuss strategies for decreasing CopC’s affinity toward other metals and increasing the affinity toward copper. This would serve to increase copper specificity, resulting in greater purity of the recovered copper. However, we also thought about changing CopC’s specificity preference to other metals. This would be useful for building an industrial process to treat wastes containing these other metals. Here, we describe research into changing CopC’s specificity preference to nickel.

Cu2+ bonding distances range from 2.0 to 2.6 Å and Ni2+ bonding distances from 1.8 to 2.4 Å, which leaves a very narrow range in which to optimize for one ion over the other (Balakrishnan et al., 1997). Interchanging copper and nickel binding may also change the entire architecture of the binding site: Balakrishnan et al. showed that it can induce changes to the protein structure in which residues flip out of the active site and are replaced by water molecules, potentially to make a smaller pocket for binding (Balakrishnan et al., 1997). Because nickel has a very similar active site size and can entirely change the active site conformation in a protein, increasing nickel affinity was deemed an unrealistic goal for rational protein design (Balakrishnan et al., 1997). It would be extremely difficult to predict the outcome of binding a new metal in a mutated active site without experimental data.

Design

For the “Design” phase, we use rational protein design and the knowledge developed during the “Imagine” phase to propose specific changes to Mst-CopC that might improve copper binding.

Rational Protein Design

Rational protein design was used to improve copper affinity in the Methylosinus trichosporium (Mst) CopC protein. The three-dimensional structure of Mst-CopC determined by Lawton et al. was used as a starting point for selecting key residues involved in copper coordination and binding. This was done by analyzing residues adjacent to the copper ion in the crystal structure that were close enough to aid in direct coordination of the ion. Further insight came from Lawton et al., who showed that two histidines, one aspartic acid, the N-terminus and a water molecule were the primary ligands involved in Cu2+ binding (Lawton et al., 2016). These residues were visualized in PyMOL and adjustments were made using the mutagenesis and measurement tools to determine possible amino acid substitutions that could enhance the binding site affinity for Cu2+ ions.

Key Residues Involved in Cu2+ Binding

It was made clear early in designing this protein that histidine, aspartic acid and glutamic acid are key binding residues, owing to histidine’s imidazole group and the two acids’ negatively charged, deprotonated carboxyl groups at a pH above 7 (Včeláková et al., 2004). Mutations to histidine, aspartic acid and glutamic acid were made and analyzed in PyMOL using the measurement tool to ensure that the most plausible variants of the mutation would not jeopardize the size of the Cu2+ binding pocket. Recommendations for single mutations included:

  • Changing S81 to H81 or E81 to provide additional binding in the lower portion of the binding pocket, build upon the square planar geometry of the original protein and provide an additional point of attraction (Lawton et al., 2016).
Figure 8. S81 to H81 or E81.
  • Changing D83 to H83 to provide another imidazole group to the square planar binding residues since histidine seems to be absolutely essential in CopC Cu2+ binding domains (Lawton et al., 2016).
  • Changing D83 to E83 in an attempt to decrease the size of the binding pocket to potentially push the Cu2+ ion further into the binding pocket and coordinate more binding residues (Cordero et al., 2008).
Figure 9. D83 to H83 or E83.
  • Changing G28 to H28 to provide additional binding in the lower portion of the binding pocket, build upon the square planar geometry of the original protein and provide an additional point of attraction.
Figure 10. G28 to H28.
  • Changing S82 to E82 in an attempt to pull the binding site down towards the protein more and increase the points of attraction.
Figure 11. S82 to E82.
  • Changing H85 to D85 in an attempt to increase the size of the binding pocket to decrease nickel affinity and allow for deeper copper binding in the pocket, which would be stabilized by mutations deep in the pocket (Balakrishnan et al., 1997).
Figure 12. H85 to D85.

A sequence comparison between Psf-CopC and Mst-CopC was done, since Psf-CopC has a higher Cu2+ affinity than Mst-CopC but unfortunately no published three-dimensional structure (Lawton et al., 2016). Most mutation suggestions were based on residues within the binding domain or residues that were highly prevalent across all or the majority of Psf-CopC variants. These mutations included: I86 to K86, V87 to T87, G28 to E28, Y34 to F34 and F3 to H3.

Figure 13. I86 to K86.
Figure 14. V87 to T87.
Figure 15. G28E.
Figure 16. Y34F.
Figure 17. F3H.

Analysis of Single Mutations and Recommendations For Further Improvements

Results from Rosetta (see Build below) gave an energy score for each single mutation made to the protein. These scores were used to build a heat map (see Build below) to help determine which mutations could be combined to achieve a relaxed, low energy protein while still keeping the key amino acid residues that coordinate Cu2+ binding. Suggestions from the heat map analysis were reviewed in PyMOL, and combinations of mutations were recommended.

Recommendations for Multiple Mutations Included

  • H85 to D85 and S82 to E82
  • S82 to E82 and F3 to H3
  • V87 to T87 and F3 to H3
  • Y34 to F34, F3 to H3, S81 to D81 and H85 to D85
  • Y34 to F34, F3 to H3 and S81 to D81
  • H85 to D85, V87 to T87, I86 to K86 and G28 to E28

These recommendations were developed on the same basis as the single mutations, except that Rosetta results were also used to determine which single mutations gave a more energetically stable protein and could be combined into energetically favourable multiple mutation proteins.

Build

For the “Build” stage, we use the protein modelling software Rosetta to create the mutated proteins chosen in the “Design” stage, and perform a preliminary check for copper affinity.

Rosetta - A Protein Design Tool

To analyze the proposed design changes from the rational design process, software tools were used to provide initial analysis and modelling of the structure. Among various computational protein design platforms, Rosetta was chosen for redesigning the structure and conducting energy analyses. The Rosetta software suite offers algorithms and protocols for macromolecular structure prediction and analysis of protein structures (Leaver-Fay et al., 2011). As a unified software package, it provides a wide range of tools suited to the goal of this project: redesigning the protein structure to have an increased binding affinity for Cu2+ (Kaufmann, Lemmon, DeLuca, Sheehan & Meiler, 2010).

Within the Rosetta software suite, the tasks and operations within the libraries are combined into algorithms called protocols. Some of the protocols’ major functionalities include protein structure prediction, protein docking, protein design and energy scoring analysis. In this project, the structure of the original protein is known, so only the energy scoring functions and the protein design protocols are used to assess how the residue-specific mutations increase stability and binding affinity (Leaver-Fay et al., 2011). Some specific functions are used in the rational design process, such as Relax, a protocol for simple all-atom refinement of an input structure, and the Point Mutant Scan Application, a protocol for mutating the protein structure at specific residues and analyzing its stability ("Applications for Macromolecule Design", 2020).

Protein Mutation Procedure

Using the Rosetta official documentation and the Rosetta Guide for the iGEM beginner made by Team iGEM Technion and iGEM TU Eindhoven 2016, a structure analysis procedure was developed. The protocols mentioned in the following steps are written in a shell script.

  1. Obtain the original Protein Data Bank (PDB) file of the Mst-CopC structure. (It is accessible from RCSB PDB.)
  2. Prepare the structure.
    1. Clean the PDB file while keeping the ligand. (This removes the information in PDB that Rosetta doesn’t use. ("How to prepare structures for use in Rosetta", 2020))
    2. Run Relax protocol. (This reduces minor steric clashes in the protein-ligand complex. Running this protocol ensures the structure is at its most stable form before making mutations ("Relax application", 2020).)
  3. Point mutant scan application. (Though the algorithm can scan for point mutants for given residues and determine the change in Gibbs free energy a mutation causes, a mutant file can be used to specify the mutants on specific residues determined in the rational design phase.)
  4. Energy scoring analysis for the overall structure and residue-specific energy breakdown. (Residue specific energy and overall energy information will be produced as outputs along with mutated PDB files in step 3.)

At step 2.2 (Relax) and step 3 (Point Mutation), relaxed and mutated protein structures are produced as PDB files along with the energy score of each output. The scoring function in Rosetta is an optimized energy function that calculates the energy of all atomic interactions in a globular protein ("Scoring Tutorial", 2020). The score is a weighted sum of energy terms: some represent physical forces such as electrostatics and van der Waals interactions, while others are statistical terms that indicate whether the protein structure resembles known structures ("Scoring Tutorial", 2020). A complete list of energy terms and weights used in the scoring function can be found in the Rosetta Scoring Tutorial documentation. Because of the statistical terms, scores are reported in Rosetta Energy Units (REU) rather than in physical energy units such as kcal/mol. This score allows us to compare stability between scored structures: the lower the score, the more stable the structure, which is sufficient to evaluate the relative stability of the mutated CopC-Cu2+ candidates.
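As a rough illustration of what a weighted sum of energy terms means, the sketch below computes a total score for two toy structures. The term names, weights and values are hypothetical and are not those of any actual Rosetta score function; the example only shows how the comparison between scored structures works.

```python
def total_score(term_values, weights):
    """Weighted sum of energy terms; terms without a weight are rejected."""
    missing = set(term_values) - set(weights)
    if missing:
        raise KeyError(f"no weight for terms: {sorted(missing)}")
    return sum(weights[name] * value for name, value in term_values.items())


# Hypothetical terms and weights, for illustration only; they are not the
# weights of any actual Rosetta score function.
weights = {"attractive": 1.0, "repulsive": 0.5, "statistical": 0.25}
structure_a = {"attractive": -100.0, "repulsive": 20.0, "statistical": -8.0}
structure_b = {"attractive": -90.0, "repulsive": 40.0, "statistical": -8.0}

score_a = total_score(structure_a, weights)
score_b = total_score(structure_b, weights)
print(score_a, score_b)
# The structure with the lower total score is treated as the more stable one.
print("more stable:", "A" if score_a < score_b else "B")
```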

Results

Single residue mutation

The table below shows the overall energy scores in REU and energy term breakdown for all protein candidates mutated on a single residue. The first column of the table indicates the mutation applied to the structure: 5icu is the PDB ID for Mst-CopC, and mutation code <original amino acid><residue number><new amino acid> follows the ID.
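The naming convention described above can be parsed mechanically. The following is a minimal sketch that assumes every entry is either a bare structure ID or an ID followed by a single one-letter substitution; the pattern and the function name `parse_description` are ours, not part of Rosetta.

```python
import re

# Pattern for "<structure id>.<original><residue number><new>", e.g. "5icu.H85D".
# A bare structure id such as "5icu" denotes the unmutated protein.
MUTATION_PATTERN = re.compile(
    r"^(?P<pdb>[^.]+)(?:\.(?P<wt>[A-Z])(?P<pos>\d+)(?P<mut>[A-Z]))?$"
)


def parse_description(description):
    """Split a table entry into (pdb_id, original, residue_number, new)."""
    match = MUTATION_PATTERN.match(description)
    if match is None:
        raise ValueError(f"unrecognized description: {description!r}")
    if match.group("wt") is None:
        return (match.group("pdb"), None, None, None)
    return (
        match.group("pdb"),
        match.group("wt"),
        int(match.group("pos")),
        match.group("mut"),
    )


print(parse_description("5icu.H85D"))
print(parse_description("5icu"))
```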

As seen below, the original score of the protein is -320.377. The mutated proteins are sorted by ascending total score, meaning that they are ordered by decreasing stability. Among all candidates, only the mutation from histidine to aspartic acid at residue 85 (H85D) produced a lower energy score than the original structure. However, most candidates’ scores are between -320 and -290, which means most candidates are relatively stable after mutation and molecular dynamics analysis can be done to further analyze the candidates.

description total_score description total_score
5icu -320.377 5icu.L80D -310.078
5icu.H85D -327.949 5icu.L80V -308.632
5icu.Y34F -319.826 5icu.V79D -308.15
5icu.F3H -319.02 5icu.H85I -304.996
5icu.S82E -318.491 5icu.G29D -299.658
5icu.V5K -318.478 5icu.D83E -298.052
5icu.I86E -318.333 5icu.V79E -297.659
5icu.I86K -317.939 5icu.V79L -295.35
5icu.I86H -317.353 5icu.D83H -291.52
5icu.H85E -317.217 5icu.D83G -291.469
5icu.I86D -316.911 5icu.V79H -273.441
5icu.V87T -314.902 5icu.S2D -270.366
5icu.G28D -313.721 5icu.S81H -264.303
5icu.L80E -313.472 5icu.G29H -261.843
5icu.S81D -312.271 5icu.S81E -259.4
5icu.R78H -312.167 5icu.G29E -256.144
5icu.L80H -311.984 5icu.S81V -246.804
5icu.G27S -311.112 5icu.G29V -216.733
5icu.G28H -311.068 5icu.S2H -137.613
5icu.G28E -311.022 5icu.S2E -99.242
Table 1. Rosetta energy scores for the original Mst-CopC “5icu” and mutated candidates.

The complete table of results is found here

These PDB files also contain per-residue energy scores, which can be used to compare the effect each mutation has on each residue. We assembled these scores into heatmaps, where each row corresponds to either a mutation or the original protein, and each column corresponds to a residue. Outliers made the raw data difficult to visualize effectively, so we applied data transformations that each highlight a different aspect of the energy score results. (The heatmap can be found in our GitHub as heatmap_single_relative.png.)

Figure 18. Heat map with unmutated protein energy scores subtracted.

One such transformation is shown in the heatmap above, in which the unmutated protein's energy scores have been subtracted from each mutant (and from the unmutated protein itself). The resulting scores are therefore relative to the unmutated protein and show increases or decreases in score, which helps determine whether a mutation improves the energy score of each residue.

Figure 19. Heat map with each column scaled to itself.

The heatmap above shows the same data with each column scaled to the range [0, 1], which highlights changes in energy score. Unlike the previous heatmap, it does not indicate the magnitude of these changes.

Note that both heatmaps use the same row and column labels, which represent the same things; only the values within the map have been transformed.
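The two transformations described above can be sketched in a few lines. This is a minimal illustration rather than the script used to produce the figures: the function names are our own, and the per-residue scores in the example are invented values for three residues.

```python
def subtract_baseline(scores, baseline="5icu"):
    """Subtract the baseline row from every row (including the baseline itself)."""
    base = scores[baseline]
    return {name: [v - b for v, b in zip(row, base)] for name, row in scores.items()}


def scale_columns(scores):
    """Min-max scale each column (one column per residue) to the range [0, 1]."""
    names = list(scores)
    n_cols = len(scores[names[0]])
    scaled = {name: [0.0] * n_cols for name in names}
    for j in range(n_cols):
        column = [scores[name][j] for name in names]
        low, high = min(column), max(column)
        span = high - low
        for name in names:
            # A constant column carries no contrast, so it is mapped to 0.
            scaled[name][j] = 0.0 if span == 0 else (scores[name][j] - low) / span
    return scaled


# Hypothetical per-residue scores (REU) for three residues; not real data.
per_residue = {
    "5icu": [-2.0, -1.0, 0.5],
    "5icu.H85D": [-2.5, -1.0, 0.0],
    "5icu.S2E": [-1.0, 3.0, 0.5],
}
relative = subtract_baseline(per_residue)
scaled = scale_columns(per_residue)
print(relative["5icu.H85D"], scaled["5icu.S2E"])
```

The baseline subtraction keeps the size of each change in REU, while the column scaling discards it, which matches the trade-off between the two heatmaps noted above.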

Using these heatmaps, it is possible to rationally select mutations to simultaneously apply in order to further reduce the overall energy scores. Possible next steps include applying a logarithmic scale to better differentiate between the values in the heatmap.

Double residue mutations

To determine a suitable combination of two mutations, it was computationally feasible to simply run all possible pairs of mutations from our set of rationally selected single mutations. The overall energy scores of these “double mutants” were also assembled into a heatmap, which allowed us to compare all of them at once.

Figure 20. Heat map of double residue mutations.

Each row and column corresponds to a particular mutation, with each intersection representing both mutations applied together. The heatmap is not exactly symmetric, which is likely due to how Rosetta applies multiple mutations.
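Enumerating candidate pairs from a list of single mutations can be sketched as follows. The helper names are hypothetical, and skipping pairs that target the same residue is our assumption (two substitutions at one position cannot coexist); the text above only states that all pairs were run.

```python
import re
from itertools import combinations

MUTATION_CODE = re.compile(r"([A-Z])(\d+)([A-Z])")


def residue_position(code):
    """Return the residue number from a code such as 'H85D'."""
    match = MUTATION_CODE.fullmatch(code)
    if match is None:
        raise ValueError(f"not a mutation code: {code!r}")
    return int(match.group(2))


def double_mutants(singles):
    """List unordered pairs of single mutations that hit different residues."""
    return [
        (first, second)
        for first, second in combinations(singles, 2)
        if residue_position(first) != residue_position(second)
    ]


# A small illustrative subset of the single mutations from Table 1.
pairs = double_mutants(["F3H", "H85D", "H85E", "S82E"])
print(pairs)
```

Unordered pairs fill only one triangle of the heatmap. Scoring both orders of each pair, as the full square heatmap suggests was done, is what makes any asymmetry visible.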

Based on the Rosetta results and the heatmap analysis, we compiled a list of optimal candidates. These mutated candidates were then assessed in more detail using molecular dynamics.

The list of mutations passed onto MD included:

  • D83E
  • F3H
  • G28E
  • H85D
  • I86K
  • S82E
  • S82E, F3H
  • F3H, V87T
  • H85D, S82E
  • Y34F, F3H, S81D
  • H85D, V87T, I86K, G28E
  • Y34F, F3H, S81D, H85D

Test

For the “Test” phase, we use molecular dynamics to further evaluate the copper affinity of the most promising Mst-CopC mutants developed in the “Build” stage.

Molecular dynamics was used to test the relative binding affinities of these candidate proteins. We used GROMACS (Berendsen et al., 1999) with the CHARMM36 force field (Vanommeslaeghe et al., 2010), which includes parameters for copper ions in solution. This allows us to estimate the total interaction energy between the copper ion and the protein; comparing the relative magnitudes of these interaction energies allows us to compare the strength of binding. Molecular dynamics provides a more rigorous method for comparing binding strength because a simulation takes into account the motion of every individual atom in the system and their interactions.

All GROMACS simulations were run on the Graham computing cluster hosted by ComputeCanada.

To set up the molecular dynamics simulations, the protein PDB files were converted to GROMACS format using CHARMM-GUI, an online tool that constructs topology and parameter files for proteins using the CHARMM force field. Once the required files were obtained, the system was placed in solution in a process known as solvation, and the charge of the system was balanced by introducing sodium or chloride ions far from the protein. The system then underwent energy minimization, in which the protein and copper ion were restrained while the solvent molecules settled into place. This avoids extreme behaviour that could occur due to collisions between molecules. The system was then equilibrated using standard NVT and NPT equilibration procedures in GROMACS before the final simulation was run.

A typical ten nanosecond simulation was performed for each candidate protein. The results are found in the table below.

Name Total Interaction Energy
Mst-CopC -540.6373 kJ/mol
H85D -1278.448 kJ/mol
F3H -563.9996 kJ/mol
G28E -607.4388 kJ/mol
I86K -586.829 kJ/mol
S82E -588.4093 kJ/mol
D83E-V87T -853.725 kJ/mol
H85D-S82E -1392.7281 kJ/mol
H85D-V87T-I86K-G28E -1246.7286 kJ/mol
S82E-F3H -557.0123 kJ/mol
Y34F-F3H-S81D -1145.2119 kJ/mol
Y34F-F3H-S81D-H85D -1698.6439 kJ/mol
Table 2. Molecular dynamics interaction energies for the original Mst-CopC and selected mutation candidates.

From the table of results, we can see that candidate Y34F-F3H-S81D-H85D has the strongest (most negative) total interaction energy, which suggests the highest relative binding affinity: a 3.14-fold improvement in total interaction energy over the original Mst-CopC.
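The fold improvement for each candidate can be recomputed directly from Table 2. The sketch below copies the table values and ranks the candidates; the function and variable names are our own, and the ratio is only a relative comparison of interaction energies, not a measured affinity.

```python
# Total interaction energies (kJ/mol), copied from Table 2.
INTERACTION_ENERGY = {
    "Mst-CopC": -540.6373,
    "H85D": -1278.448,
    "F3H": -563.9996,
    "G28E": -607.4388,
    "I86K": -586.829,
    "S82E": -588.4093,
    "D83E-V87T": -853.725,
    "H85D-S82E": -1392.7281,
    "H85D-V87T-I86K-G28E": -1246.7286,
    "S82E-F3H": -557.0123,
    "Y34F-F3H-S81D-H85D": -1698.6439,
    "Y34F-F3H-S81D": -1145.2119,
}


def fold_change(candidate, reference="Mst-CopC", table=INTERACTION_ENERGY):
    """Ratio of a candidate's interaction energy to the reference protein's."""
    return table[candidate] / table[reference]


# Most negative (strongest) interaction energy first.
ranked = sorted(
    (name for name in INTERACTION_ENERGY if name != "Mst-CopC"),
    key=INTERACTION_ENERGY.get,
)
for name in ranked:
    print(f"{name}: {fold_change(name):.2f}x")
```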

Animation of copper in H85D binding site

The above animation shows the copper ion in the H85D binding site. This animation was produced using VMD, a free software tool for visualizing molecular dynamics simulations.

Learn

For the “Learn” stage, we discuss the results of our Rosetta and molecular dynamics modelling and what they mean for the success of our engineered proteins.

The Rosetta energy score of the Y34F, F3H and S81D mutant was slightly higher than that of the original Mst-CopC, which was surprising considering the significant decrease in total interaction energy seen in Table 2 above. A similar pattern occurred for the H85D, V87T, I86K and G28E mutant: the mutated protein had a higher energy score in Rosetta, but its total interaction energy in MD was again significantly reduced. The H85D and S82E double mutant was consistent across Rosetta and MD, with significant reductions in both the energy score and the total interaction energy. The mutant with the lowest total interaction energy (Y34F, F3H, S81D and H85D) had a Rosetta energy score very comparable to the original, yet its total interaction energy in MD was significantly reduced. This appears to be a common trend: a mutant's Rosetta energy score falls within ±20 REU of the original Mst-CopC, while MD shows a significantly reduced total interaction energy.

There is no significant trend between the Rosetta scores and the interaction energies, as shown in the plot below. All mutated proteins analyzed in GROMACS have Rosetta scores ranging from -290 REU to -327 REU, and some are slightly higher than that of the original Mst-CopC (-320 REU). However, as shown in the interaction energy table, all the analyzed proteins have lower (more negative) interaction energies than the original Mst-CopC, which indicates that copper is bound more stably in the mutated structures than in the original. This suggests that Rosetta, although its scores do not always correlate with interaction energy, is a useful tool for narrowing down the list of rational design candidates and for providing an initial estimate of the stability of each mutated protein. Molecular dynamics analysis, which is more time-consuming than the Rosetta analysis, is then needed to further analyze a subset of selected candidates.

Figure 21. Plot of Rosetta energy scores and molecular dynamics total interaction energies for each protein, showing no correlation.

Single and double mutations that included I86K, V87T, G28E, Y34F and F3H were inspired by Psf-CopC, which has a higher binding affinity than Mst-CopC (Lawton et al., 2016). These residues are not key binding residues, but are more likely to be involved in positioning the key residues and arranging the protein’s backbone (Lawton et al., 2016). These mutations were used in combination with other mutations that were designed to improve Cu2+ binding. In Table 2 above, it can be observed that these single mutations by themselves reduce the total interaction energy only modestly (by roughly 4% to 12% for F3H, I86K and G28E), but are more effective when used in combination.

G28E, S81D and S82E were designed to specifically target residues that are key in Cu2+ binding. These residues were mutated to residues with a more negative charge at neutral pH to aid in coordination of the positive copper ion. Adding points of attraction beyond the original four, which are arranged in a square planar geometry, would allow the coordination geometry to be redesigned to include more coordinating ligands (Lawton et al., 2016).

H85D proved to be an advantageous mutation for Cu2+ binding. H85D was originally selected to try to increase the diameter of the outer rim of the binding pocket, since histidine is significantly larger than aspartic acid. This increase in rim diameter could allow the Cu2+ ion to penetrate deeper into the protein instead of remaining mostly exterior. This semi-internalization of the Cu2+ ion would increase the coordination of the ion and increase binding affinity. We were hesitant to select H85D because H85 is highly conserved across CopC proteins from many different organisms. However, the Rosetta scores indicated that this mutation could not be overlooked, and it was analyzed further. The total interaction energy for H85D was significantly reduced compared to the original CopC protein. This mutation was therefore combined with others in subsequent multi-mutation proteins, and the results continued to be promising.

Inspiration drawn from Psf-CopC seems to have been useful for improving Mst-CopC’s binding affinity. The specific mutations included Y34F, F3H, S81D and H85D. In combination, these mutations produced a protein with a total interaction energy significantly lower than that of any other protein designed (see the molecular dynamics results for data). We suspect that the Cu2+ ion now interacts with more ligands because of the additional negatively charged aspartic acid at residue 81. This residue may be even more critical to the stability of the Cu2+ ion after the mutation from histidine to aspartic acid at residue 85, since that mutation may increase the size of the binding pocket and allow the Cu2+ ion to penetrate deeper into it.

Figure 22. Pink shows the original Mst-CopC and green shows the mutated Y34F, F3H, S81D and H85D Mst-CopC.

Improve

For the “Improve” stage, we reflect on the successes and failures of the approach used in the previous stages and suggest further work that could be done to develop even better Mst-CopC mutants.

Time constraints limited the number of protein candidates that we were able to investigate with Rosetta and molecular dynamics, so we would like to investigate more candidates in future work. We would specifically like to run single-mutation proteins with V87T, D83E, S81D, and Y34F in molecular dynamics. We did run candidates that included these mutations along with other mutations, but running single-mutation proteins would allow us to better understand the effects of these mutations on copper binding. We would also like to run Rosetta and molecular dynamics on Psf-CopC. Since many of our suggested mutations were inspired by Psf-CopC, it would be interesting to determine the interaction energy of the Psf-CopC protein itself. These experiments would provide further insight into CopC and its copper binding, allowing us to make more informed design choices.

More generally, we can continue to explore the functionalities that Rosetta provides. In this project, the Rosetta scores calculated indicate overall stability of the protein-ligand complex. An additional protocol that we could have investigated is LigandDocking, which takes a receptor structure (typically a protein) and a small molecule, and tries to find a conformation that minimizes the Rosetta score function ("Ligand_Dock Application"). This protocol can be more useful in analyzing the binding affinity of the ligand-protein complex. We can also research additional software tools that analyze metal protein docking. Moreover, we only scored Mst-CopC in Rosetta and GROMACS in our project, but it may be useful to compare Psf-CopC energy scores with Mst-CopC.

Lastly, the only way that we can truly determine the success of our designs is to test them experimentally. Each CopC mutant should be heterologously expressed in E. coli and its copper affinity evaluated using the strategies discussed on the measurement page. The results of these experiments would provide useful validation of the predictions made by Rosetta and molecular dynamics, allowing us to further refine these predictions for future work.

FUSION PROTEIN

Research

Cellulose Binding Module Selection

To enable our fusion protein to bind cellulose in the column for metal ion capture, we screened common cellulose binding module candidates for desirable characteristics that would serve this function. Notably, CBM2a, CBDct, Cel7a and CelK were considered in the design process of our engineered protein.

CBDct is a cellulose binding module that has been used by other iGEM teams and is found on the iGEM Parts Registry. While this supports its functionality, CBDct is not as well characterized in the literature as CBM2a outside of its documented use in iGEM projects.

Cel7a is a Type I cellulose binding module and is considered an attractive candidate to serve as an anchoring moiety as it boasts a high binding affinity for cellulose (Griffo et al., 2019). There are, however, conflicting reports of whether this module binds reversibly or irreversibly (Jalak & Väljamäe, 2014; Maurer, 2012).

CelK, another candidate, is a Type III cellulose binding module that was considered but ultimately rejected as it dimerizes under 4°C conditions (Kataeva, Seidel, Li, & Ljungdahl, 2001). Dimerization of the CBM tags would risk aggregation of the fusion protein in the column and therefore lead to unsuccessful capture of the metal ions.

CBM2a was selected as the cellulose binding module for our fusion protein because of a number of advantages it offers over the alternatives described above. CBM2a is a Type II, irreversible cellulose binding module derived from Cellulomonas fimi, and its use as a fusion tag in recombinant proteins is commonly reported in the literature (McLean et al., 2000; Rodriguez et al., 2004). To fix the fusion protein onto the column while metal ions are filtered, the CBM must bind irreversibly so that the protein does not dissociate from the column during this process. This module is well characterized: it is 110 amino acids long and consists of two β-sheets in a barrel-type backbone (McLean et al., 2000). Its binding to cellulose is mediated by multiple residues on its binding face, notably three highly conserved solvent-exposed Trp residues (McLean et al., 2000).

Reviewing the Stability of Cellulose

For a successful cellulose packed column, the cellulose must be stable in a variety of conditions such that it does not break down to the point that CBM2a cannot bind it. As a result, we searched literature regarding the stability of cellulose in various conditions.

Acidic Stability

At the microcrystalline level, cellulose contains amorphous and crystalline regions (Ioelovich, 2017). In acidic conditions, acid hydrolysis can occur such that the amorphous regions of microcrystalline cellulose are degraded, leaving the smaller crystalline cellulose units intact as nanocrystalline cellulose (Ioelovich, 2017). CBM2a binds crystalline regions of cellulose, so we expect that CBM2a will successfully bind cellulose even if it has been degraded to its nanocrystalline form by acid hydrolysis (McLean et al., 2000).

Alkaline Stability

In a long-term study of cellulose stability in cementitious radioactive waste at pH 13.3, it was shown that complete cellulose degradation in alkaline conditions would take well over 100 years (Glaus & Van Loon, 1970). Another study showed that the amorphous regions of cellulose were more reactive and prone to degradation in alkaline conditions (Li et al., 2017). However, since CBM2a only binds to cellulose’s crystalline regions, alkaline degradation of the amorphous regions should not impede CBM2a binding (McLean et al., 2000). Considering these studies, we can conclude that alkaline degradation of cellulose poses no problems for CBM2a binding in our column reactor.

Thermal Stability

Thermal degradation of cellulose only begins between 200˚C-300˚C, depending on the source of cellulose (Trache et al., 2014). Given that the column processes aqueous effluent, it’s safe to assume that our column will not be dealing with temperatures within this range, so thermal degradation of cellulose will not be an issue.

From this research, we concluded that cellulose should remain stable within the process conditions expected for a column treating metal-finishing effluent. Any degradation that occurs would not be enough to compromise the functionality of the column.

Design

E. coli Expression Strain

E. coli BL21 (DE3) was chosen as the expression strain for the designed proteins since it is widely used as a general-purpose protein expression strain. It also lacks the Lon and OmpT proteases, which helps limit degradation of recombinant proteins. Additionally, it expresses T7 RNA polymerase, for use with an IPTG-inducible T7 promoter.

Assembly of Fusion Protein

The three parts we submitted to the registry for the Engineering silver medal are:

  1. Mst-CopC (copper-binding), BBa_K3381002
  2. Mst-CopC-CBM2a, BBa_K3381005
  3. Improved Mst-CopC-CBM2a, BBa_K3381006

For more information on the design of the fusion protein sequences, please visit the design subpage of their parts registry entries.

TEV Protease Site Specificity

TEV protease is a commonly used enzyme in protein purification protocols to cleave affinity tags from proteins as it recognizes the amino acid sequence ENLYFQ/X, where X is typically G or S (Kapust et al., 2002). The extra amino acid X left behind after TEV cleavage can be problematic, as this introduces an extra amino acid into the native protein, which may alter the protein’s biological activity (Kapust et al., 2002). This is especially the case for Mst-CopC, in which the amino group of the N-terminus is directly involved in binding copper (Lawton et al., 2016). Luckily, it has been shown that although X is usually G or S, it can actually be replaced by any amino acid except for proline and TEV protease will still recognize and cleave the site (Kapust et al., 2002). Although this replacement could somewhat compromise cleavage efficiency, replacing X with H (the first amino acid in biologically active Mst-CopC) does not decrease the efficiency tremendously (Kapust et al., 2002). Thus, the recognition sequence in our Mst-CopC fusion protein was changed to ENLYFQ/H so Mst-CopC functions properly after cleavage while maintaining cleavage efficiency (Lawton et al., 2016).
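The cleavage rule described above (cut after the Q of ENLYFQ, provided the next residue is not proline) can be checked on a sequence string with a short sketch. The function name and the example sequences are hypothetical placeholders, not the actual Mst-CopC fusion sequence, and the sketch ignores cleavage efficiency entirely.

```python
import re

# ENLYFQ followed by any residue other than proline at the X position.
# The lookaheads inspect the next residue without consuming it.
TEV_SITE = re.compile(r"ENLYFQ(?!P)(?=[A-Z])")


def tev_cleave(sequence):
    """Split a one-letter protein sequence at the first TEV site.

    Returns (upstream_fragment, released_protein), or None if no site is found.
    The released protein starts with the residue at position X.
    """
    match = TEV_SITE.search(sequence)
    if match is None:
        return None
    cut = match.end()  # index just after the Q of ENLYFQ
    return sequence[:cut], sequence[cut:]


# Placeholder construct: tag + TEV site + a protein that must start with H.
result = tev_cleave("MHHHHHHENLYFQHAAK")
print(result)
```

With X set to H, the released fragment begins with histidine, which is the property the modified ENLYFQ/H site is meant to preserve for copper binding at the N-terminus.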

Build

Construction Plan

The protein expression vector pET-11a from Novagen was chosen because we did not require a vector-encoded cleavable affinity tag; one was already incorporated into the metal binding-CBM fusion parts. Additionally, the vector contains an IPTG-inducible promoter to control expression of the fusion proteins. The parts would be synthesized in their entirety and inserted using standard assembly protocols at the NdeI and BamHI restriction sites, which removes the vector's T7 tag that was not needed.

Selection of Affinity Tag

After we express our fusion protein, we need to detect and purify it using an affinity tag (Lichty, Malecki, Agnew, Michelson-Horowitz & Tan, 2005). Three affinity tags were considered for our protein: HIS, MBP and GST. Research was conducted on each option.

HIS

  • Small size, low cost, easy to use (Zhao, Li, & Liang, 2013)
  • Little effect on protein's native structure (Carson et al., 2007)
  • Purification by immobilized metal-ion affinity chromatography (IMAC) (Lichty, Malecki, Agnew, Michelson-Horowitz, & Tan, 2005)
  • His-tag binds to immobilised nickel or cobalt ions (Scheich, Sievert, & Büssow, 2003)

MBP

  • Larger tag
  • Greater influence on protein
  • Positively influence solubility & expression efficiency
  • Size & immunogenicity can complicate downstream application
  • Purified via affinity chromatography on cross-linked amylose resin (Kimple, Brill & Pasker, 2013)

GST

  • Larger tag
  • Often increases solubility of fusion protein
  • Purified by affinity chromatography on glutathione Sepharose
  • Very expensive (Kimple, Brill & Pasker, 2013)

GST tags were ruled out due to their high cost and our lack of experience with them. HIS tags were ultimately selected over MBP tags because they are smaller. Given that the metal binding and CBM components of our protein are large, we wanted to avoid further increasing the size of our recombinant protein. Since larger recombinant proteins are at a greater risk of experiencing folding issues, we tried to reduce our protein size where possible (Puetz et al., 2019). IMAC columns for HIS-tag affinity purification would be sourced from Cytiva Life Sciences, which provides methods for their use.

Test

Although we were unable to access the lab this year due to COVID-19, we researched ways to measure the binding affinity of CopC, other metal-binding proteins, and cellulose-binding modules in the lab, as described on our Measurement page. We also quantified the binding affinity in silico, as described above.

PROCESS DESIGN

Finally, we applied the engineering cycle to the design of our packed column process. We completed the Research, Imagine, and Design steps of the cycle, and planned experiments to build and test our design if we were able to access our lab.

Research

For our “Research” stage, we investigated current metal treatment methods. We selected copper-containing waste from a typical semiconductor manufacturing plant as an example waste stream to treat.

Our metal-removal technology will compete with a variety of other aqueous metal waste treatment processes. Current methods for treating aqueous metal wastes include chemical precipitation, ion exchange, adsorption, and membrane filtration (Fu et al., 2011). Chemical precipitation is particularly widespread because of its relative simplicity and low cost (Fu et al., 2011). In this method, the pH of the metal waste is increased by the addition of a base such as lime, causing the metal ions to precipitate as insoluble hydroxides that can be recovered as solid sludge. Removing water from this sludge can be costly, and the resulting sludge is typically disposed of, which means that the metals and their value cannot be recovered for other uses. Ion exchange and membrane filtration are relatively new technologies that require further development (Fu et al., 2011), so their costs are often prohibitive for large-scale applications. Furthermore, there are few established methods that recover metal ions from industrial wastewaters. For these reasons, we aim to use metal binding proteins to create a novel metal removal process that can quickly, selectively, and cheaply remove metal ions from dissolved wastes. Our process will also allow metal ions to be recovered and recycled, providing economic and environmental benefits.

Other parts of our project discuss the design of our metal- and cellulose-binding fusion protein and our efforts to increase the binding affinity of Mst-CopC, an example copper-binding protein. However, it is also important to design a suitable industrial process that can use our designed proteins to treat relevant metal wastes.

As an example, we will design a reactor to treat copper-containing metal finishing waste at a scale determined to be relevant for a semiconductor manufacturing plant. Metal finishing wastes result from processes including electroplating. In electroplating, metal solutions are used to deposit thin layers of the dissolved metal onto solid surfaces (Arán, 2017). After the process is complete, the remaining solutions must be treated before disposal. Copper is a metal commonly found in metal finishing wastewater, so it is fitting to consider it in our case here (Wei et al., 2013).

However, since dissolved metal wastes are produced by a large variety of industrial processes, the design and modelling process applied here could be used for any other waste stream and dissolved metal(s).

Imagine

For our “Imagine” stage, we brainstormed ways that we could use our metal-binding fusion proteins in large scale industrial applications. We propose a reactor type and the operation stages that would be used.

In order to effectively treat large volumes of aqueous waste, we propose immobilising our cellulose- and metal-binding fusion protein inside an industrial bioreactor. Some advantages of using a bioreactor include control over reactor size and dimensions, continuous operation, and a wide variety of reactor sub-types on which to base our design (Shuler & Kargi, 2002). Furthermore, bioreactors have long been used for industrial-scale processes that rely on enzymatic or cellular function, so research in this area is well developed.

Of the many varieties of bioreactors available, we chose to use a packed column reactor. A packed column reactor consists of a large container filled with a porous packing material; effluent is fed through the bed, allowing a reaction to occur (Shuler & Kargi, 2002). For our reactor, we will use cellulose beads as the packing material. Packed column reactors enable large volumes of waste to be treated in a continuous manner, providing much greater efficiency than batch processes that treat a fixed volume of waste at one time (Shuler & Kargi, 2002). For our metal removal application, all removed metal will stay trapped inside the column as waste is fed through. Thus, the metal can be removed at a convenient time, such as during off-peak waste production hours.

Since our fusion proteins include a cellulose binding domain that binds cellulose very strongly, the fusion proteins can be immobilised on the cellulose packing by simply feeding them into the column. By immobilising our fusion proteins on the surface of the cellulose packing, we are able to maximize the surface area of reactive sites, thus increasing the treatment rate of the wastewater system.

With this general design in mind, we set out to use engineering and modelling tools to decide how to build a suitable reactor to treat our waste stream of interest.

We envision that our reactor will operate in two stages:

  1. First, the metal containing wastewater will be flowed through the reactor, allowing the metal in the wastewater to bind to the proteins within. This will produce a stream of clean water that can be discharged to the environment. This is the “binding” stage.
  2. Secondly, once the reactor is saturated with metal, we will rinse the reactor with another solution of a different pH, which will cause the metal ions to unbind from the reactor. This will result in a concentrated and pure solution of dissolved metal. This resulting solution could then be recycled for various applications. This is the “elution” stage.

Below is a 3D model that provides an example of our reactor’s general construction. Wastewater will be pumped into the reactor from the bottom, passing through the cellulose packing before exiting through the top of the reactor.

Figure 23. 3D model of our packed column reactor created using CAD.

Design

For our design stage, we applied computational approaches to model and design our packed column reactor, optimising it to the best of our ability.

Developing our model

We used a variety of mathematical approaches to derive equations that would describe our packed column bioreactor. We eventually found that our reactor would be governed by the following system of differential equations. The solution of this system of differential equations provides useful information for designing our reactor.

\( \frac{dm}{dt} + \vec{v}\cdot \nabla m - D\nabla^2 m = -(\frac{a}{f}) (k_a m (Q-q) - k_d q) \)

\( \frac{dq}{dt} = k_a m (Q-q) - k_d q \)

Detailed derivation of these equations and the computational approaches used are described on the model page.
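To give a feel for how the binding kinetics in these equations behave, the sketch below integrates a simplified version in which the advection and diffusion terms are dropped, leaving two ordinary differential equations for m and q. This well-mixed simplification, the fixed-step fourth-order Runge-Kutta integrator, and all parameter values (dimensionless placeholders, not the values from our model) are assumptions made for illustration; the ratio a/f is treated as a single number.

```python
def rates(m, q, a_over_f, k_a, k_d, Q):
    """Right-hand sides of the simplified (no transport) model equations."""
    binding = k_a * m * (Q - q) - k_d * q  # net binding rate, dq/dt
    return -a_over_f * binding, binding    # (dm/dt, dq/dt)


def integrate(m0, q0, a_over_f, k_a, k_d, Q, dt, steps):
    """Integrate with classical RK4; returns a list of (t, m, q) tuples."""
    p = (a_over_f, k_a, k_d, Q)
    m, q = m0, q0
    history = [(0.0, m, q)]
    for i in range(1, steps + 1):
        k1m, k1q = rates(m, q, *p)
        k2m, k2q = rates(m + 0.5 * dt * k1m, q + 0.5 * dt * k1q, *p)
        k3m, k3q = rates(m + 0.5 * dt * k2m, q + 0.5 * dt * k2q, *p)
        k4m, k4q = rates(m + dt * k3m, q + dt * k3q, *p)
        m += dt * (k1m + 2 * k2m + 2 * k3m + k4m) / 6
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        history.append((i * dt, m, q))
    return history


# Illustrative, dimensionless placeholder parameters (NOT the model's values).
A_OVER_F = 0.5
history = integrate(m0=1.0, q0=0.0, a_over_f=A_OVER_F,
                    k_a=1.0, k_d=0.01, Q=1.0, dt=0.01, steps=2000)
t_end, m_end, q_end = history[-1]
print(f"t={t_end:.1f}  dissolved m={m_end:.3f}  bound q={q_end:.3f}")
```

In this reduced form the quantity m + (a/f) q stays constant, since every unit of metal lost from solution appears on the surface; this is a convenient sanity check on any numerical solution of the binding terms.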

Once model equations had been developed, it was necessary to find values for the parameters so that useful results could be computed. These parameters were found directly in literature, or computed from combinations of other parameters.

Parameters are required to describe the scale of the problem that we are trying to address using this reactor. We need to know the volume of waste to be treated per day and the concentration of dissolved copper in that waste. We also need to know the acceptable concentration of dissolved metal in the treatment outflow, so that it can be safely released to the environment.

To estimate the amount of copper-containing waste to treat each day, we looked to existing electronics manufacturers that produce metal-finishing wastes. One semiconductor manufacturer in Taiwan reported producing 4.44 million metric tonnes of dissolved metal waste over three manufacturing facilities each year (“TSMC Corporate Social Responsibility Report”, 2019). This would correspond to 1.48 million metric tonnes per year per facility, or about 46.9 L/s at each facility. For this reason, we will design a reactor that can treat an average of 46.9 L/s of waste over the course of each day.
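The conversion from annual tonnage to a volumetric flow rate can be reproduced as follows. The sketch assumes that the waste has roughly the density of water (1 kg/L) and that it is produced evenly over a 365-day year; both are our simplifying assumptions rather than reported figures.

```python
TOTAL_TONNES_PER_YEAR = 4.44e6   # reported total across all facilities
FACILITIES = 3
KG_PER_TONNE = 1000.0
DENSITY_KG_PER_L = 1.0           # assumption: dilute aqueous waste, like water
SECONDS_PER_YEAR = 365 * 24 * 3600


def flow_rate_l_per_s(total_tonnes, facilities):
    """Average volumetric flow (L/s) of waste at a single facility."""
    kg_per_facility = total_tonnes / facilities * KG_PER_TONNE
    litres_per_year = kg_per_facility / DENSITY_KG_PER_L
    return litres_per_year / SECONDS_PER_YEAR


flow = flow_rate_l_per_s(TOTAL_TONNES_PER_YEAR, FACILITIES)
print(f"{flow:.1f} L/s per facility")
```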

Chiu et al. (1987) state that typical copper concentrations in metal finishing wastewaters range from 1-30 mg/L. For this reason, we will design our reactor to treat wastewater containing up to 30 mg/L dissolved copper.

The United States Environmental Protection Agency (USEPA) states that the permissible limit for dissolved copper in industrial wastes discharged to the environment is 1.3 mg/L (Al-Saydeh et al., 2017). For this reason, our reactor must be able to reduce the concentration of dissolved copper from 30 mg/L to below 1.3 mg/L.
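These design targets imply a minimum removal fraction and a daily copper load, which can be worked out from the numbers above. The sketch assumes the worst case of a constant 30 mg/L inlet at the full 46.9 L/s design flow; the variable names are our own.

```python
FLOW_L_PER_S = 46.9        # design flow rate per facility
INLET_MG_PER_L = 30.0      # worst-case inlet copper concentration
LIMIT_MG_PER_L = 1.3       # permissible outlet concentration
SECONDS_PER_DAY = 24 * 3600
MG_PER_KG = 1.0e6

# Fraction of the dissolved copper that must be removed to reach the limit.
required_removal = 1.0 - LIMIT_MG_PER_L / INLET_MG_PER_L

# Copper that the column must capture each day at exactly the limit.
captured_kg_per_day = (
    FLOW_L_PER_S * (INLET_MG_PER_L - LIMIT_MG_PER_L) * SECONDS_PER_DAY / MG_PER_KG
)

print(f"required removal: {required_removal:.1%}")
print(f"copper captured: {captured_kg_per_day:.1f} kg/day")
```

Under these assumptions the column must remove at least about 95.7% of the dissolved copper, which corresponds to roughly 116 kg of copper captured per day.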

The parameters relevant to the cellulose binding domain and the cellulose packing material are the specific surface area and the surface protein concentration. These were found in McLean (2000) and are listed in the table below. The fraction of volume occupied by cellulose for a typical 3 mm diameter packing material is taken from Sen (2015).

Similarly, an equilibrium constant for a variant of CopC was found in Young et al. (2015), and a forwards rate constant for a structurally and functionally similar copper binding protein was found in Du (2013). The equilibrium constant indicates how strongly copper remains bound once equilibrium is reached. The rate constant provides an estimate of the reaction timescale, which we compare against the fluid transport speed in the reactor and which informs how large our reactor must be.

| Description | Variable | Value | Source |
| --- | --- | --- | --- |
| Input Concentration (Cu) | \(m\) | 30 \( \frac{mg}{L} \) | Chiu, 1984 |
| Output Concentration (Cu) | | 1.3 \( \frac{mg}{L} \) | Al-Saydeh et al., 2017 |
| Surface Protein Concentration | \( Q \) | \(0.35 \times 10^{-6}\) \( \frac{mol}{m^2} \) | McLean, 2000 |
| Specific Surface Area | \( a \) | \(7.26 \times 10^{7}\) \( \frac{1}{m} \) | McLean, 2000 |
| Forwards Rate Constant | \( k_a \) | 0.015 \( \frac{1}{s} \) | Du, 2013 |
| Reverse Rate Constant | \( k_d \) | \(3 \times 10^{-16}\) \( \frac{1}{s} \) | Du, 2013 and Young et al., 2015 |
| Equilibrium Constant | \( K_D \) | \(10^{-13.7}\) | Young et al., 2015 |
| Void Fraction | \( \rho \) | 0.31 | Sen, 2016 |
| Density of Cellulose | | 1.5 \( \frac{g}{cm^3}\) | National Center for Biotechnology Information, 2020 |
| Porosity | \(\epsilon\) | 0.69 | Sen, 2016 |
| Superficial Reactor Velocity | \(v_s\) | 0.2552 \( \frac{m}{s} \) | Computed based on reactor size. See below |
Table 3. Parameters, variables, values, and literature sources for each of the parameters used in our model’s equations.
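The reverse rate constant in Table 3 is attributed to both sources, which suggests it was computed rather than measured. A minimal sketch of one way to obtain it, assuming the relation \(k_d = k_a K_D\) (our reading of the table, not something the source states explicitly):

```python
k_a = 0.015        # forwards rate constant from Table 3
K_D = 10 ** -13.7  # equilibrium constant from Table 3

# Assumed relation: dissociation constant K_D = k_d / k_a, so k_d = k_a * K_D.
k_d = k_a * K_D
print(f"k_d = {k_d:.1e}")
```

Under that assumption the result rounds to the tabulated value of 3e-16.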

Modelling Results

The following plots are taken from the aforementioned model under the approximation that advective transport can be neglected, because binding is assumed to be much faster than transport through the column. We consider this reasonable because the chosen copper binding protein has a very small equilibrium constant, which we take to imply a very short reaction timescale. The model also assumes that the reactor is well mixed, since the cellulose packing in the reactor will produce very strong eddy diffusion.

Batch Reactor Copper Concentration
Figure 24: Concentration of copper in a batch reactor versus time. Note how the majority of the copper in the solution has been bound within 5 seconds.
Batch Reactor Occupied Protein
Figure 25: Percentage of occupied protein active sites in batch reactor versus time, assuming one gram of cellulose packing material uniformly distributed throughout the batch reactor.

From these two plots we can see that within 5 seconds the vast majority of the copper present in solution has adsorbed onto the CopC binding sites, while only a small fraction of the CopC has actually been used up. This implies that a packed column reactor with similar inflow concentrations will not saturate too quickly to be of use.

It is also visible from the model that the timescale of the reaction is much smaller than the time a copper ion spends in a typical large packed column reactor. This led to numerical problems when solving the partial differential equation model, which we did not have time to fix. Instead, due to the rapid equilibrium approximation we may use the batch reactor to estimate a typical length scale for the reactor.

Through some dimensional analysis, we can find that 99.5% of copper ions which spend longer than 5 seconds in the reactor will be adsorbed. This means that we can potentially treat 200 times the volume of the reactor in copper solution before the reactor is saturated. This approximation allows us to choose a scale for our reactor.

For the metal finishing waste stream chosen, we will need to treat an average of 46.9 L/s of waste over the course of each day (see “Getting Parameters”), or 4 052 160 L/day. We choose to treat this waste using four 5-hour reactor cycles, allowing for 1-hour breaks in between for elution and other routine maintenance. This means that 1 013 040 L of waste must be treated in each of the four reactor cycles. Since we can treat 200 times the volume of the reactor in each reactor cycle, our reactor must be able to accommodate 1/200th of the waste per cycle, or 5065 L. However, due to the packing of 3 mm cellulose particles, only 31% of our total reactor volume is actually free space. This means that we need the total volume of our packed column to be 16 339 L to accommodate 5065 L of liquid. Based on a 10:1 height-to-diameter ratio, this gives a packed column 1.28 m in diameter and 12.8 m in length. In order to fill the reactor, we would require approximately 17 000 kg of cellulose packing material, assuming a 3 mm particle diameter and 31% void space.
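The sizing arithmetic in this paragraph can be reproduced with the sketch below. It assumes a cylindrical column and uses the cellulose density of 1.5 g/cm³ from Table 3 for the packing mass; the function and parameter names are hypothetical.

```python
import math


def size_reactor(flow_l_per_s=46.9, cycles_per_day=4, bed_volumes_per_cycle=200,
                 void_fraction=0.31, height_to_diameter=10.0,
                 cellulose_density_kg_per_l=1.5):
    """Return the main reactor dimensions implied by the design assumptions."""
    daily_l = flow_l_per_s * 86400                    # waste per day
    per_cycle_l = daily_l / cycles_per_day            # waste per reactor cycle
    liquid_l = per_cycle_l / bed_volumes_per_cycle    # free liquid volume needed
    total_l = liquid_l / void_fraction                # total packed column volume
    total_m3 = total_l / 1000.0
    # Cylinder: V = (pi / 4) * D^2 * H with H = height_to_diameter * D.
    diameter_m = (4.0 * total_m3 / (math.pi * height_to_diameter)) ** (1.0 / 3.0)
    height_m = height_to_diameter * diameter_m
    cellulose_kg = total_l * (1.0 - void_fraction) * cellulose_density_kg_per_l
    return {
        "per_cycle_l": per_cycle_l,
        "liquid_l": liquid_l,
        "total_l": total_l,
        "diameter_m": diameter_m,
        "height_m": height_m,
        "cellulose_kg": cellulose_kg,
    }


design = size_reactor()
for name, value in design.items():
    print(f"{name}: {value:,.2f}")
```

Changing `cycles_per_day` or `bed_volumes_per_cycle` shows how sensitive the column size is to the operating schedule and to the 200-bed-volume estimate.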

In our discussion with Aevitas, they told us that they used an approximately 30 000 L batch reactor to treat their wastes (citation). Although we are using a continuous process with some differences from Aevitas’ process, this confirms that our proposed reactor scale of about 16 000 L is reasonable and within the size range of existing waste treatment processes.

With the above assumptions for our reactor, and taking the estimate that 99.5% of all copper ions are removed during a reactor cycle, the effluent concentration should be about 0.15 mg/L (roughly 2.4 µmol/L) for a 30 mg/L feed. This falls well below our goal of an output concentration of 1.3 mg/L of copper.
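As a quick check against the discharge limit, the sketch below takes the 99.5% removal estimate and the 30 mg/L feed at face value. The molar mass of copper (63.546 g/mol) is a standard value that is not given in the text, and the function name is hypothetical.

```python
CU_MOLAR_MASS_G_PER_MOL = 63.546  # standard atomic weight of copper
DISCHARGE_LIMIT_MG_PER_L = 1.3    # limit quoted from Al-Saydeh et al. (2017)


def effluent_mg_per_l(inlet_mg_per_l, removal_fraction):
    """Copper concentration left in solution after the given fractional removal."""
    return inlet_mg_per_l * (1.0 - removal_fraction)


out_mg = effluent_mg_per_l(30.0, 0.995)
# mg/L divided by g/mol gives mmol/L; multiply by 1000 for umol/L.
out_umol = out_mg / CU_MOLAR_MASS_G_PER_MOL * 1000.0
print(f"effluent: {out_mg:.2f} mg/L = {out_umol:.2f} umol/L")
print("meets limit:", out_mg < DISCHARGE_LIMIT_MG_PER_L)
```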

Now that the reactor has been sized, we can compute the pressure drop across it. We choose the fluid velocity in the reactor to be the column height divided by 50 seconds, so that copper ions remain in the reactor for well over the 5 seconds needed for binding. This gives a superficial flow velocity of approximately 0.255 m/s (see Table 3). We then apply the Ergun equation (Ergun, 1949), a standard tool in chemical engineering, to calculate the resulting pressure drop across our packed column reactor.
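For reference, the Ergun equation in its common textbook form is shown below. The symbols are chosen here for illustration and are not taken from the model page: \(\Delta P\) is the pressure drop, \(L\) the bed length, \(\mu\) and \(\rho_f\) the fluid viscosity and density, \(d_p\) the particle diameter, \(v_s\) the superficial velocity, and \(\varepsilon\) the bed void fraction in the usual Ergun convention (the fraction of bed volume open to flow). The fluid properties used for the calculation are not stated in this section.

```latex
\[
\frac{\Delta P}{L}
  = \frac{150\,\mu\,(1-\varepsilon)^{2}}{\varepsilon^{3}\,d_p^{2}}\,v_s
  + \frac{1.75\,\rho_f\,(1-\varepsilon)}{\varepsilon^{3}\,d_p}\,v_s^{2}
\]
```

The first term is the viscous contribution and dominates at low flow rates; the second is the inertial contribution and grows with the square of the velocity.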

For our reactor of diameter 1.28 m and length 12.8 m, the pressure drop is 0.126 bar. This means that in order for wastewater to flow through the reactor, we must use a pump to pressurize the reactor inlet to at least 0.126 bar greater than the outlet. This is very reasonable by industrial standards.

Additional Reactor Design Considerations

Refer to the Implementation page for a more detailed discussion of our envisioned implementation.

The results of our modelling allow us to propose a reactor design that is suitable for treating an example copper containing waste stream from a typical semiconductor manufacturing facility. Although this model provides the reactor scale and dimensions that should be used, several other parameters are required in order to operate the reactor.

In particular, we must determine the effect of pH on copper binding in the reactor. pH must be determined for both the initial binding phase and the subsequent elution phase. This would ideally be determined using experiments to characterise our protein’s copper binding at a range of pHs. However, literature provides some information that can be used to make rough estimates.

In the binding stage, we want the copper to bind to the copper binding proteins in the reactor as strongly as possible. In CopC, the copper ion is coordinated by histidine, aspartic acid, and glutamic acid (Lawton, 2016). In order for copper binding to occur, these amino acids must all exist in their deprotonated forms so that they are able to use their lone electron pairs to bond to copper. These deprotonated forms are also negatively or neutrally charged, attracting the positively charged copper ion. This deprotonated state can be achieved for all of these amino acids at a pH of 7.2 or higher (Včeláková et al., 2004). This means that the dissolved copper waste fed into the reactor must be at pH 7.2 or higher. Metal finishing waste is generally near neutral pH (van Dijken et al., 1999), so only small adjustments would be needed. This adjustment could be achieved by adding small amounts of concentrated acid or base to the dissolved copper waste before feeding it to the reactor.

In the elution stage, we want copper ions to bind to the copper binding proteins in the reactor as weakly as possible, allowing them to be removed from the reactor. To weaken copper binding, we can protonate the histidine, aspartic acid, and glutamic acid involved in copper binding. Protonation prevents these amino acids’ lone pairs from bonding with the copper ion, and also causes the side chains to become neutrally or positively charged, further repelling the copper ion. Protonation of these amino acids can be achieved at pH 4.8 or lower (Včeláková et al., 2004). This means that metal ions can be removed from the reactor by flowing through a solution with a pH of 4.8 or lower. This would result in a concentrated and pure copper solution. If necessary, the pH of the recovered metal solution could be further adjusted for recycling by using small volumes of concentrated acid or base.

Determining the pH required for the elution and binding stages of our process therefore moves us one step closer to building a functioning industrial process. The pH values determined herein are specific to copper and the CopC protein, but required pH values could be calculated for any metal and metal binding protein. In general, amino acids must be deprotonated to bind metals, so the metal binding step will always require a higher pH than the elution step.
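One way to estimate such pH values is the Henderson-Hasselbalch relation, which gives the fraction of a side chain in its deprotonated form at a given pH. The sketch below uses histidine with an assumed textbook side-chain pKa of about 6.0; this value is an illustration only and is not taken from Včeláková et al. (2004). The function name is hypothetical.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch: fraction of an acid group in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


HIS_SIDE_CHAIN_PKA = 6.0  # assumed textbook value, not from the cited source

binding = fraction_deprotonated(7.2, HIS_SIDE_CHAIN_PKA)
elution = fraction_deprotonated(4.8, HIS_SIDE_CHAIN_PKA)
print(f"pH 7.2: {binding:.1%} of histidine side chains deprotonated")
print(f"pH 4.8: {elution:.1%} of histidine side chains deprotonated")
```

The same function can be applied to other residues or other proteins by substituting the relevant pKa, ideally a value measured in the folded protein rather than a free amino acid value.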

Below is a schematic of our packed column reactor’s proposed implementation, explained further on the Implementation page.

Figure 26. A schematic of our packed column reactor’s proposed implementation and operation schedule.

Build, Test & Learn

Although we were not able to build or test a real prototype of our reactor, we have laid out the steps we would have followed if lab access were possible.

This procedure provides a method for constructing a small scale packed column similar to our large scale process. This prototype would evaluate several aspects of our design:

  1. It would demonstrate that our fusion protein is able to bind and stay bound to the cellulose packing of the column.
  2. It would evaluate our protein’s ability to capture dissolved copper from an example waste stream, and to elute that copper when desired.
  3. It would allow us to measure the copper binding and elution rates and calculate better parameters for our model. This would allow us to improve our model and create a better industrial scale design.

The following materials and methods were selected so that we could use an off-the-shelf pump-based fast protein liquid chromatography column, since a gravity column chromatography system would be too divergent from the process model.

Since our real industrial process must treat 46.9 L/s of waste and will do so with a ~16 000 L reactor, we will scale the process accordingly and treat 1 mL/min of waste with our 5 mL column. Since the column has a volume of 5 mL, we expect a void volume of 1.55 mL based on a calculated void fraction of 0.31.
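The scale-down can be sketched as follows, assuming the flow is scaled in proportion to column volume and that the 200-void-volume capacity estimate carries over to the bench column. The function and parameter names are hypothetical.

```python
def bench_scale(column_ml=5.0, void_fraction=0.31, bed_volumes=200,
                full_flow_l_per_s=46.9, full_volume_l=16339.0):
    """Return (void volume in mL, treatable waste in mL, scaled flow in mL/min)."""
    void_ml = column_ml * void_fraction
    treatable_ml = bed_volumes * void_ml
    # Keep the flow per unit of column volume the same as at full scale.
    full_flow_ml_per_min = full_flow_l_per_s * 1000.0 * 60.0
    scaled_flow_ml_per_min = full_flow_ml_per_min * column_ml / (full_volume_l * 1000.0)
    return void_ml, treatable_ml, scaled_flow_ml_per_min


void_ml, treatable_ml, flow_ml_per_min = bench_scale()
print(f"void volume: {void_ml:.2f} mL")
print(f"treatable waste: {treatable_ml:.0f} mL")
print(f"scaled flow: {flow_ml_per_min:.2f} mL/min")
```

The scaled flow comes out slightly under 1 mL/min, so rounding up to 1 mL/min is a mildly conservative test of the column.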

Materials

  • 8.3 g 51 µm diameter crystalline cellulose powder (Sigma Aldrich)
  • 5 mL XK 50/60 chromatography column (Cytiva Life Sciences)
  • Peristaltic pump
  • Copper and cellulose fusion protein (purified); the molar amount of protein required must be calculated using the specific surface area of the cellulose powder
  • 1 mM tetraphenylporphine solute (Itoh et al., 1975)
  • Binding buffer: 50 mM phosphate buffer (adjusted to pH 7.2)
  • Elution buffer: 50 mM citrate buffer (adjusted to pH 4.8)
  • Example waste: 30 mg Cu/L CuCl2 dissolved in binding buffer

Method

  1. In a beaker, combine the purified fusion protein with 8.3 g crystalline cellulose powder hydrated with 15 mL of binding buffer. Allow to incubate for an hour at 4°C, swirling occasionally, to allow the fusion protein to thoroughly bind to the cellulose.
  2. Centrifuge at 15 000 rpm for 15 minutes to recover the cellulose pellet before washing with 15 mL of binding buffer. Repeat twice for a total of three washes.
  3. Transfer the suspension to the XK 50/60 chromatography column. Open the outlet of the column to allow the cellulose bed to settle by gravity. Refill the column with the binding buffer.
  4. We hypothesise that the column can treat approximately 200 times its void volume of our example waste, which is about 300 mL. In order to test this, we will feed 300 mL of example waste through the column at a rate of 1 mL/min using a peristaltic pump. While feeding the example waste, collect the column’s eluate in 5 mL aliquots.
  5. Wash the column with another 10 mL of binding buffer at a flow rate of 1 mL/min.
  6. To elute, feed the column with 10 mL of elution buffer at a rate of 1 mL/min. Collect 1 mL aliquots of the flowthrough.
  7. Measure the copper concentration of each aliquot using the spectrophotometric method described by Itoh et al. (1975). Also measure the absorbance at 280 nm of each aliquot to determine the amount of dissolved protein.

Data Analysis & Expected Results

The concentrations of protein in the aliquots taken throughout the process are representative of how well our fusion protein stays bound to the cellulose packing throughout the process. If protein concentrations in the reactor outflow are more than negligible, this means that our fusion protein’s cellulose-binding domain might be failing to bind the cellulose packing. Further investigation would be required to improve cellulose binding.

The concentrations of copper in the aliquots taken while feeding the example waste are representative of the copper concentration that would be released to the environment in our industrial scale design. Therefore, these copper concentrations must all be below 1.3 mg/L for the design to have worked successfully. If all outflow copper concentrations are higher than this, we could consider a longer column or slower flow rate, giving the copper more time and available proteins to bind to. If copper concentration is initially low but rises later on, this means that we have saturated all available copper binding sites, and that the column will need to be eluted and reset after less than 200 times the void volume of waste.
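The breakthrough check described above could be automated along these lines. The aliquot concentrations in the example are hypothetical placeholder values, not measurements, and the function name is an assumption.

```python
def breakthrough_volume_ml(concentrations_mg_per_l, aliquot_ml=5.0, limit_mg_per_l=1.3):
    """Volume of waste treated before the first aliquot that exceeds the limit.

    Returns None if every aliquot stays at or below the limit.
    """
    for index, concentration in enumerate(concentrations_mg_per_l):
        if concentration > limit_mg_per_l:
            return index * aliquot_ml
    return None


# Hypothetical readings for consecutive 5 mL aliquots (mg Cu/L).
example_readings = [0.1, 0.1, 0.2, 0.4, 1.1, 2.5, 9.0]
volume = breakthrough_volume_ml(example_readings)
if volume is None:
    print("No breakthrough: all aliquots are within the limit.")
else:
    print(f"Breakthrough after {volume:.0f} mL of waste.")
```

Dividing the returned volume by the column's void volume gives the number of void volumes treated before breakthrough, which can be compared with the predicted value of 200.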

The concentrations of copper in the aliquots taken during elution are representative of the copper concentration that would be recovered when bound copper is eluted from our column. This is the concentration of the purified copper that could be recycled for other applications. Further research can be conducted to determine whether this eluted copper concentration is suitable for other applications.

References

"Ligand_Dock Application". Rosettacommons.Org, 2020, https://www.rosettacommons.org/docs/latest/application_documentation/docking/ligand-dock. Accessed 22 Oct 2020.

Al-Saydeh, S.A., El-Naas, M.H., and Zaidi, S.J. 2017. Copper removal from industrial wastewater: A comprehensive review. J. Ind. Eng. Chem. 56: 35–44. The Korean Society of Industrial and Engineering Chemistry. doi:10.1016/j.jiec.2017.07.026.

Applications for Macromolecule Design. (2020). Retrieved 14 October 2020, from https://www.rosettacommons.org/docs/latest/application_documentation/design/design-applications

Arán, D., Antelo, J., Lodeiro, P., Macías, F., and Fiol, S. 2017. Use of Waste-Derived Biochar to Remove Copper from Aqueous Solution in a Continuous-Flow System. Ind. Eng. Chem. Res. 56(44): 12755–12762. doi:10.1021/acs.iecr.7b03056.

Balakrishnan, R., Ramasubbu, N., Varughese, K. I., & Parthasarathy, R. (1997). Crystal structures of the copper and nickel complexes of RNase A: Metal-induced interprotein interactions and identification of a novel copper binding motif. Proceedings of the National Academy of Sciences, 94(18), 9620-9625. doi:10.1073/pnas.94.18.9620

Berendsen, et al. 1995. GROMACS: A message-passing parallel molecular dynamics implementation. Comp. Phys. Comm. 91: 43-56. doi:10.1016/0010-4655(95)00042-E

Carson, M., Johnson, D. H., McDonald, H., Brouillette, C. & DeLucas, L. J. (2007). Acta Cryst. D63, 295-301.

Chatzou, M., Magis, C., Chang, J., Kemena, C., Bussotti, G., Erb, I. & Notredame, C. (2015, November 27). Multiple sequence alignment modeling: Methods and applications. Retrieved October 16, 2020, from https://doi.org/10.1093/bib/bbv099

Chiu, H.S.S., Tsang, K.L., and Lee, R.M.L. 1984. Treatment of electroplating wastes. Water Sanit. Asia Pacific Proc. 10th WEDC Conf.: 115–119.

Cordero, B., Gómez, V., Platero-Prats, A. E., Revés, M., Echeverría, J., Cremades, E., . . . Alvarez, S. (2008). Covalent radii revisited. Dalton Transactions, (21), 2832. doi:10.1039/b801115j

Glaus, M. A. & Van Loon, L. R. (1970, January 01). Cellulose Degradation at Alkaline Conditions: Long-Term Experiments at Elevated Temperatures. Retrieved October 16, 2020, from https://inis.iaea.org/search/search.aspx?orig_q=RN%3A36011143

Griffo, A., Rooijakkers, B. J. M., Hähl, H., Jacobs, K., Linder, M. B., & Laaksonen, P. (2019). Binding forces of cellulose binding modules on cellulosic nanomaterials. Biomacromolecules, 20(2), 769-777. doi:10.1021/acs.biomac.8b01346

Harper, S., & Speicher, D. W. (2011). Purification of proteins fused to glutathione S-transferase. Methods in molecular biology (Clifton, N.J.), 681, 259–280. https://doi.org/10.1007/978-1-60761-913-0_14

Heaton, D. N., George, G. N., Garrison, G., & Winge, D. R. (2001). The mitochondrial copper metallochaperone Cox17 exists as an oligomeric, polycopper complex. Biochemistry, 40(3), 743-751. doi:10.1021/bi002315x

How to prepare structures for use in Rosetta. (2020). Retrieved 14 October 2020, from https://www.rosettacommons.org/docs/latest/rosetta_basics/preparation/preparing-structures

iGEM Technion 2016, & iGEM TU Eindhoven 2016. (2016). Retrieved 14 October 2020, from https://static.igem.org/mediawiki/2016/5/59/Rosetta_Guide_for_the_iGEM_Beginner.pdf

Ioelovich, M. (2017, March 31). Study on Acidic Degradation of Cellulose. Retrieved October 16, 2020, from https://www.eurekaselect.com/148252/article

Itoh, J.-I., Yotsuyanagi, T., and Aomura, K. 1975. Spectrophotometric determination of copper with tetraphenylporphine trisulfonate. Anal. Chim. Acta 74: 53–60.

J. Puetz and F. M. Wurm, “Recombinant Proteins for Industrial versus Pharmaceutical Purposes: A Review of Process and Pricing,” Processes, vol. 7, no. 8, p. 476, Jul. 2019, doi: 10.3390/pr7080476.

Jalak, J., & Väljamäe, P. (2014). Multi-mode binding of cellobiohydrolase Cel7A from trichoderma reesei to cellulose. Plos One, 9(9), e108181. Retrieved from https://doi.org/10.1371/journal.pone.0108181

Jesu Jaya Sudan, R., Lesitha Jeeva Kumari, J., & Sudandiradoss, C. (2015). Ab Initio Coordination Chemistry for Nickel Chelation Motifs. PLoS ONE, 10(5). https://link.gale.com/apps/doc/A432243972/AONE?u=uniwater&sid=AONE&xid=4d4767c3

Kapust, R. B., Tözsér, J., Copeland, T. D. & Waugh, D. S. (2002, June 28). The P1' specificity of tobacco etch virus protease. Retrieved October 15, 2020, from https://pubmed.ncbi.nlm.nih.gov/12074568/

Kataeva, I. A., Seidel, R. D., Li, X. L., & Ljungdahl, L. G. (2001). Properties and mutation analysis of the CelK cellulose-binding domain from the clostridium thermocellum cellulosome. Journal of Bacteriology, 183(5), 1552-1559. doi:10.1128/jb.183.5.1552-1559.2001

Kaufmann, K. W., Lemmon, G. H., Deluca, S. L., Sheehan, J. H., & Meiler, J. (2010). Practically useful: What the Rosetta protein modeling suite can do for you. Biochemistry, 49(14), 2987–2998. https://doi.org/10.1021/bi902153g

Kimple, M. E., Brill, A. L., & Pasker, R. L. (2013). Overview of affinity tags for protein purification. Current protocols in protein science, 73, 9.9.1–9.9.23. https://doi.org/10.1002/0471140864.ps0909s73

Koay, M. S., Janssen, B. M., & Merkx, M. (2013). Tuning the metal binding site specificity of a fluorescent sensor protein: From copper to zinc and back. Dalton Trans., 42(9), 3230-3232. doi:10.1039/c2dt32082g

Lawton, T. J., Kenney, G. E., Hurley, J. D., & Rosenzweig, A. C. (2016). The CopC Family: Structural and Bioinformatic Insights into a Diverse Group of Periplasmic Copper Binding Proteins. Biochemistry, 55(15), 2278-2290. doi:10.1021/acs.biochem.6b00175

Leaver-Fay, A., Tyka, M., Lewis, S. M., Lange, O. F., Thompson, J., Jacak, R., Kaufman, K., Renfrew, P. D., Smith, C. A., Sheffler, W., Davis, I. W., Cooper, S., Treuille, A., Mandell, D. J., Richter, F., Ban, Y. E. A., Fleishman, S. J., Corn, J. E., Kim, D. E., … Bradley, P. (2011). Rosetta3: An object-oriented software suite for the simulation and design of macromolecules. Methods in Enzymology, 487(C), 545–574. https://doi.org/10.1016/B978-0-12-381270-4.00019-6

Li, Q., Wang, A., Ding, W. & Zhang, Y. (2017). Influencing factors for alkaline degradation of cellulose. Retrieved October 16, 2020, from https://bioresources.cnr.ncsu.edu/resources/influencing-factors-for-alkaline-degradation-of-cellulose/

Lichty, J. J., Malecki, J. L., Agnew, H. D., Michelson-Horowitz, D. J., & Tan, S. (2005). Comparison of affinity tags for protein purification. Protein Expression and Purification, 41(1), 98-105. doi:10.1016/j.pep.2005.01.019

Major, B., Copper, F. S., & Ccoa, I. (2016). Uncovering the Transmembrane Metal Binding Site of the Novel. MBio, 7(1), 1–11. https://doi.org/10.1128/mBio.01981-15

Maurer, S. A. (2012). Surface-based assays for enzyme adsorption and activity on model cellulose films Retrieved from http://www.riss.kr/pdu/ddodLink.do?id=T13297598

McLean, B. W., Bray, M. R., Boraston, A. B., Gilkes, N. R., Haynes, C. A., & Kilburn, D. G. (2000). Analysis of binding of the family 2a carbohydrate-binding module from cellulomonas fimi xylanase 10A to cellulose: Specificity and identification of functionally important amino acid residues. Protein Engineering, Design and Selection, 13(11), 801-809. doi:10.1093/protein/13.11.801

Palumaa, P. (2013). Copper chaperones. the concept of conformational control in the metabolism of copper. FEBS Letters, 587(13), 1902-1910. doi:S0014-5793(13)00361-X [pii]

Petrucci, R., Herring, G., Madura, J., & Bissonnette, C. (2017). In O'Donnell C. (Ed.), General chemistry principles and modern applications (11th ed.). Toronto: Pearson.

Relax application. (2020). Retrieved 14 October 2020, from https://www.rosettacommons.org/docs/latest/application_documentation/structure_prediction/relax

Rodriguez, B., Kavoosi, M., Koska, J., Creagh, A. L., Kilburn, D. G., & Haynes, C. A. (2004). Inexpensive and generic affinity purification of recombinant proteins using a family 2a CBM fusion tag. Biotechnology Progress, 20(5), 1479-1489. doi:10.1021/bp0341904

Scheich, C., Sievert, V., & Büssow, K. (2003). An automated method for high-throughput protein purification applied to a comparison of His-tag and GST-tag affinity chromatography. BMC Biotechnology, 3(1), 12. doi:10.1186/1472-6750-3-12

Scoring Tutorial. (2020). Retrieved 14 October 2020, from https://www.rosettacommons.org/demos/latest/tutorials/scoring/scoring

Shuler, M.L., and Kargi, F. 2002. Bioprocess Engineering Basic Concepts. In 2nd edition. Prentice Hall PTR, Upper Saddle River, NJ.

Song, Z., Dong, J., Yuan, W., Zhang, C., Ren, Y., & Yang, B. (2016). The stability and the metal ions binding properties of mutant A85M of CopC. Journal of Photochemistry and Photobiology B: Biology, 161, 387-395. doi:https://doi.org/10.1016/j.jphotobiol.2016.06.006

Timón-Gómez, A., Nývltová, E., Abriata, L. A., Vila, A. J., Hosler, J., & Barrientos, A. (2018). Mitochondrial cytochrome c oxidase biogenesis: Recent developments. Seminars in Cell & Developmental Biology, 76, 163-178. doi:10.1016/j.semcdb.2017.08.055

Trache, D., Donnot, A., Khimeche, K., Benelmir, R. & Brosse, N. (2014, January 27). Physico-chemical properties and thermal stability of microcrystalline cellulose isolated from Alfa fibres. Retrieved October 16, 2020, from https://www.sciencedirect.com/science/article/pii/S0144861714000678

Udagedara, S. R., Wijekoon, C. J. K., Xiao, Z., Wedd, A. G., & Maher, M. J. (2019). The crystal structure of the CopC protein from pseudomonas fluorescens reveals amended classifications for the CopC protein family. Journal of Inorganic Biochemistry, 195, 194-200. doi:https://doi.org/10.1016/j.jinorgbio.2019.03.007

UniProt. UniProtKB - P12376 (COPC_PSEUB). Retrieved from https://www.uniprot.org/uniprot/P12376

Urbina, J., Patil, A., Fujishima, K., Paulino-Lima, I., Saltikov, C., & Rothschild, L. (2019, November 11). A new approach to biomining: Bioengineering surfaces for metal recovery from aqueous solutions. Retrieved October 17, 2020, from https://www.ncbi.nlm.nih.gov/pubmed/31712654

van Dijken, K., Prince, Y., Wolters, T., Frey, M., Mussati, G., Kalff, P., Hansen, O., Kerndrup, S., Søndergârd, B., Rodrigues, E.L., and Meredith, S. 1999. Electroplating industry. : 97–128. doi:10.1007/978-94-007-0854-9_9.

Vanommeslaeghe, K. Hatcher, E. Acharya, C. Kundu, S. Zhong, S. Shim, J. E. Darian, E. Guvench, O. Lopes, P. Vorobyov, I. and MacKerell, Jr. A.D. "CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields," Journal of Computational Chemistry 31: 671-90, 2010, PMC2888302

Včeláková, K., Zusková, I., Kenndler, E., & Gaš, B. (2004). Determination of cationic mobilities and pKa values of 22 amino acids by capillary zone electrophoresis. Electrophoresis, 25(2), 309-317. doi:10.1002/elps.200305751

Wei, X., Kong, X., Wang, S., Xiang, H., Wang, J., & Chen, J. (2013). Removal of Heavy Metals from Electroplating Wastewater by Thin-Film Composite Nanofiltration Hollow-Fiber Membranes. Ind. Eng. Chem. Res., 52(49), 17583–17590. https://doi.org/10.1021/ie402387u

Wijekoon, C. J., Young, T. R., Wedd, A. G., & Xiao, Z. (2015). CopC protein from Pseudomonas fluorescens SBW25 features a conserved novel high-affinity Cu2+binding site. Inorganic chemistry, 54(6), 2950–2959. https://doi.org/10.1021/acs.inorgchem.5b00031

Woestenenk, E. A., Hammarström, M., Berg, S. V., Härd, T., & Berglund, H. (2004). His tag effect on solubility of human proteins produced in Escherichia coli: A comparison between four expression vectors. Journal of Structural and Functional Genomics, 5(3), 217-229. doi:10.1023/b:jsfg.0000031965.37625.0e

Zhao, X., Li, G., & Liang, S. (2013). Several Affinity Tags Commonly Used in Chromatographic Purification. Journal of Analytical Methods in Chemistry, 2013, 1-8. doi:10.1155/2013/581093

Zoroddu, M. A., Medici, S., & Peana, M. (2009). Copper and nickel binding in multi-histidinic peptide fragments. Journal of Inorganic Biochemistry, 103(9), 1214-1220. doi:10.1016/j.jinorgbio.2009.06.008