The engineering process was long and spanned multiple phases. We also had to adapt to ever-changing conditions brought on by the COVID-19 pandemic: all communication took place through online media and, given how international our team is, scheduling meetings was difficult because of time zone differences. Nonetheless, we are happy to say that our project was a partial success, all conditions considered.
Initial project ideas
How can we help?
We conducted an extensive search for problems we could solve, both in our local area and on a global level. What caught our attention was the impact of the COVID-19 pandemic upon cancer screening, particularly for colorectal cancer, as well as a novel idea for detection...
In order to decide on a problem to approach, we turned to issues both in our local area and in the whole of the UK, finding many in the process. These included the world's most microplastic-ridden river, the River Tame (which runs through the Greater Manchester area), with roughly 517,000 microplastic particles/m²[1], as well as the increasing levels of water consumption by chicken farms, which led us to consider egg substitutes.
Despite all this, what caught our attention the most was a problem that affected many even in normal times, and that has only been exacerbated by the COVID-19 pandemic: bowel cancer. Despite being, in many cases, a treatable if not preventable disease, it is one of the leading causes of cancer death in the UK and a significant one worldwide. While researching the issue, we came across a number of papers published in recent years which stated that as many as two-thirds of colon cancer patients harboured a colibactin-producing bacterial strain in their gut (usually E. coli, although other Enterobacteriaceae have been found to contain the gene cluster responsible). The opportunity to make a difference for as many people as possible by creating a non-invasive, cheap and reliable testing kit for the presence of such strains was too good to pass up. Instrumental to this would be a biosensor which could detect the markers of such strains' presence.
In-depth research
Know thy enemy
After deciding to create a colibactin biosensor, we found that the substance is far too reactive to detect. With help from our Chemistry department, we decided to try to detect a byproduct instead: N-myristoyl-D-asparagine. The large size of this molecule meant that we had to plan a two-step approach, however...
After deciding to try to create this colibactin biosensor, we entered a more detailed research phase, during which we found that, because colibactin is highly reactive and large, we would not be able to create a sensor for it directly.
Fortunately, we were able to find an alternative, with the help of the University of Warwick's Chemistry department, in the form of N-myristoyl-D-asparagine, a molecule which is cleaved off a precolibactin by colibactin peptidase (ClbP) in the final stages of biosynthesis[2].
Several predicted or elucidated structures of precolibactins from various research groups[2]: notice the common element, circled in red: an N-myristoyl-D-asparagine residue.
However, that molecule was still far larger than the MMFs (methylenomycin furans, a family of bacterial hormones), the protein's native ligands. Hence, we decided on a two-step approach. By first shortening the alkyl chain to two carbon atoms, we would be able to prepare the binding pocket for a molecule with the features of N-Myr-D-Asn. We would then try to use the resulting N-acetyl-D-asparagine (N-Ace-D-Asn) biosensor as a design template for the N-Myr-D-Asn sensor.
3D model of one of the more stable conformers of the N-myristoyl-D-asparagine molecule.
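To make the size difference between the ligands in this series concrete, the following minimal sketch derives the molecular formula and approximate molar mass of an N-acyl-D-asparagine from the number of carbons in its acyl chain. It assumes a saturated, unbranched acyl group and uses rounded standard atomic masses; the function names are illustrative and are not taken from the project's files.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def n_acyl_asparagine_formula(chain_carbons: int) -> dict:
    """Element counts for asparagine N-acylated with a saturated,
    unbranched acyl group of `chain_carbons` carbons
    (2 = acetyl, 4 = butyryl, 14 = myristoyl).

    Amide formation: asparagine (C4H8N2O3) + fatty acid (CnH2nO2) - H2O
    gives C(4+n) H(6+2n) N2 O4.
    """
    if chain_carbons < 1:
        raise ValueError("the acyl chain needs at least one carbon")
    return {"C": 4 + chain_carbons, "H": 6 + 2 * chain_carbons, "N": 2, "O": 4}


def formula_string(chain_carbons: int) -> str:
    """Formula written in C, H, N, O order, e.g. 'C6H10N2O4'."""
    counts = n_acyl_asparagine_formula(chain_carbons)
    return "".join(f"{element}{count}" for element, count in counts.items())


def molar_mass(chain_carbons: int) -> float:
    """Approximate molar mass in g/mol from the rounded atomic masses."""
    counts = n_acyl_asparagine_formula(chain_carbons)
    return sum(ATOMIC_MASS[element] * count for element, count in counts.items())


if __name__ == "__main__":
    for name, n in [("N-Ace-D-Asn", 2), ("N-butyryl-D-Asn", 4), ("N-Myr-D-Asn", 14)]:
        print(f"{name}: {formula_string(n)}, {molar_mass(n):.2f} g/mol")
```

Under these assumptions the myristoyl compound comes out at roughly twice the mass of the acetyl one, which is consistent with the need for an intermediate, smaller target.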
Dry lab: part 1
Laying the foundations
The first phase of our design process involved designing an N-acetyl-D-asparagine biosensor, which, due to our lack of wet lab access, had to be done completely in silico. The apparent success of the initial design procedure allowed us to move on to the second phase of our project...
In order to design the ligand binding pocket, we opted to use Rosetta, a piece of software for computationally designing proteins, predicting structures and docking ligands, which was also recently used to predict the 3D structures of SARS-CoV-2 spike proteins. We used paper [3] as a source for methodology; however, the actual structuring and file preparation were ours to do. As a control for the design, we tried docking one of the native ligands, MMF2, into the wild-type protein. The results are shown below in blue, with the elucidated crystal structure shown in green.
The results of the docking algorithm were very similar to the pre-determined position of the ligand, so we judged the docking process accurate enough for what we set out to do and started preparations for the docking and design protocol for the N-Ace-D-Asn receptor. We used a piece of software named Avogadro for conformer searches and energy optimisation of the N-Ace-D-Asn molecule in order to prepare it for docking, after which we followed the docking and design procedure outlined in the paper above. One of the best results is shown in the figure below.
The high degree of hydrogen bonding (bonds shown in yellow) as well as the fact that the ligand (shown in pink) fits neatly within the binding pocket made us confident that this receptor would indeed be functional.
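The control comparison between a docked pose and the crystal-structure pose can also be expressed as a number. The sketch below computes the root-mean-square deviation (RMSD) between two sets of ligand atom coordinates that already sit in the same receptor frame, so no superposition is applied. It is an illustration only: the function name, the assumption that both poses list the same atoms in the same order, and the toy coordinates are all hypothetical, and this is not the procedure taken from the project's files.

```python
import math


def ligand_rmsd(pose_a, pose_b):
    """RMSD in the same units as the input (typically angstroms).

    Both poses are sequences of (x, y, z) tuples for the same ligand atoms,
    listed in the same order and expressed in the same receptor frame.
    """
    if len(pose_a) != len(pose_b) or len(pose_a) == 0:
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(pose_a))


if __name__ == "__main__":
    # Toy coordinates, not real MMF2 atom positions.
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    docked = [(0.2, 0.1, 0.0), (1.6, -0.1, 0.1), (1.4, 1.6, -0.1)]
    print(f"RMSD = {ligand_rmsd(crystal, docked):.2f}")
```

A threshold of about 2 Å is a widely used rule of thumb for calling a docked pose close to the reference, although the appropriate cut-off depends on the ligand and the question being asked.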
Contacting specialists
Seeking sage advice
Our attempts to contact real-world specialists in bowel cancer screening have borne fruit, as we were able to interview a number of professionals who gave us feedback on our idea. Even better, they told us how the system currently works, and where it could use some more work...
Early on in the timeline of our project, we made an effort to reach out to specialists in the field of CRC early detection, in order to assess how our detector could work in the real world and integrate their advice into our work. Among others, an interview with Cancer Research UK, a cancer research and awareness charity, was particularly enlightening, as we were informed of the problems currently present in the system.
We learned that current methods of testing, the most important of which are faecal immunochemical tests (FIT), are rather reliable but have problems with acceptability as, understandably, patients are reluctant to handle faecal matter. Moreover, the problem of the large workloads of medical professionals was brought to our attention, especially in the case of false positives, when blood in faeces comes from non-cancerous sources. In contrast, our test would detect the presence of colibactin-producing bacterial strains, so one obvious advantage would be a reduction in false positive cases in which treatment is administered erroneously. Pre-emptive treatment would also be possible in cases in which polyps have not yet formed.
Our efforts have also led to the discovery that academics at the University of Warwick have been working on a method to train neural networks to recognise cancerous growths based on biopsies, which would eliminate the subjectivity of visual interpretation. Interviewing Dr. Nasir Rajpoot on his team's work has led us to consider the possibility of making our proposed method of detection and their work complement each other[4].
Dry lab: part 2
An early grave?
We attempted to use the N-acetyl-D-asparagine biosensor as a design template for our final product: the N-myristoyl-D-asparagine biosensor. However, we had overestimated the power of the computational tools at our disposal...
With the apparent success of the N-Ace-D-Asn receptor design, we attempted to repeat the process for N-Myr-D-Asn. However, the docking results were as follows:
The N-Myr-D-Asn (shown in pink) docking results into our best N-Ace-D-Asn design (shown in green).
The asparagine residue is sticking out of the binding pocket while the alkyl chain is situated within it; this makes hydrogen bonding impossible, which is why this ligand pose is highly unlikely.
Here is where we realised what the problem with our idea was: the binding pocket was far too small for the 14C alkyl chain of the N-Myr-D-Asn molecule. Enlarging the pocket computationally was sadly impossible with the tools at our disposal, so we decided to look for alternative ways to progress.
Outreach effort - conclusions
Integrating new ideas
The professionals we had the privilege of interviewing expanded our perspective on how our kit could fit into the screening system, and gave us ideas on how it could interact with newly emerging, very exciting components...
The knowledge we gained by interviewing these specialists gave us a great number of ideas on how to improve our project, specifically an attempt to assess how costly our kit would be to produce compared to FIT tests. Synergy between our kit and other, more in-depth methods of following up positive results, such as the neural network-driven sample interpretation method, could greatly reduce the workload of trained professionals. This possibility drove us to at least lay the groundwork for a future continuation of our project by other iGEM teams.
The information we gathered, as well as the results which we had obtained thus far led us to carry out the docking and design procedures with N-butyryl-D-asparagine, in order to accommodate as large a molecule as possible...
Dry lab - change of plans
We decided to at least try and see what the maximum alkyl chain length that could be accommodated in the binding pocket was, and therefore repeated the design procedure with a new ligand - N-butyryl-D-asparagine. The best result is shown below:
The ligand (shown in pink) fits fully within the binding pocket. A total of six hydrogen bonds can be formed (shown in yellow), while the alkyl chain (on the left) is accommodated in a hydrophobic pocket between the side chain of an isoleucine residue (not visible in this image) and that of a leucine residue (circled in orange).
The relatively high number of hydrogen bonds established between the ligand and the key residues, as well as the hydrophobic region around the 4C alkyl chain, lend credence to the idea that this receptor would indeed be functional; however, this remains to be tested in the lab. Interestingly, the best ligand pose according to the software has the alkyl chain facing outwards, in contrast to the best pose in the template N-Ace-D-Asn receptor, where the alkyl chain faces inward. This is likely because, facing inward, the larger chain would clash with crucial hydrophilic residues in that part of the protein and greatly disrupt its function.
Wet lab experiment plans
What could not be done
Under normal circumstances, our dry lab work would have been complemented by wet lab work, like many other iGEM projects. This year, however, matters are different, but that has not prevented us from planning out what we would have done...
The results of our in silico modelling process look promising enough that, under normal circumstances, we would have attempted to test them in a wet lab environment. The amino acid sequences would have been converted into codon-optimised DNA sequences, which would have been delivered to E. coli cultures in the lab by means of a plasmid cloning vector. Production of our biosensors would have been induced and their activity tested by adding different concentrations of ligand to different cultures to check for affinity, alongside controls to see whether false positives occur.
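As a rough illustration of the back-translation step, the sketch below converts an amino acid sequence into DNA by picking one codon per residue. The codon table is an assumption: it lists one codon per amino acid that is commonly cited as frequent in E. coli and should be checked against a current codon usage table before any real use. Genuine codon optimisation also weighs factors such as GC content, repeats and restriction sites, none of which are handled here. The example sequence is a made-up placeholder, not one of our designed sensors.

```python
# One codon per amino acid, assumed to be among the frequently used codons
# in E. coli. Illustrative only; verify against a codon usage table.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Return a DNA coding sequence for a one-letter amino acid sequence."""
    codons = []
    for position, residue in enumerate(protein.upper(), start=1):
        if residue not in PREFERRED_CODON:
            raise ValueError(f"unknown residue {residue!r} at position {position}")
        codons.append(PREFERRED_CODON[residue])
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


if __name__ == "__main__":
    placeholder = "MKLV"  # hypothetical peptide, not a real sensor sequence
    print(back_translate(placeholder))
```

Choosing a single codon per residue keeps the sketch short; a fuller approach would sample codons in proportion to their usage frequency to avoid depleting any one tRNA.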
Map of a pET TOPO 151 plasmid, our proposed vector, containing the gene encoding the N-acetyl-D-asparagine sensor which we used as a design template for N-butyryl- and N-myristoyl-D-asparagine biosensors.
By examining binding affinity, we would have been able to select the most sensitive biosensor and perhaps use it as a template for another design run in Rosetta in order to refine it further. This would have allowed us to create a biosensor which could detect the slightest amount of ligand. Moreover, we had planned to avoid the problem of including GMOs in our kit by attempting to make the machinery work in a cell-free system, but sadly this was not possible.
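One simple way to compare candidate biosensors from such a titration would be to fit each response curve to a one-site binding model, response = R_max · [L] / (K_d + [L]), and prefer the design with the lowest apparent K_d. The sketch below does this with a plain grid search over K_d, solving for R_max in closed form at each step. The model choice, the function names, the concentration units and the data are all assumptions for illustration; no such measurements were made in this project.

```python
def occupancy(ligand_conc: float, kd: float) -> float:
    """Fraction of sensor bound under a one-site binding model."""
    return ligand_conc / (kd + ligand_conc)


def fit_one_site(concs, responses, kd_min=1e-3, kd_max=1e3, steps=600):
    """Least-squares fit of response = r_max * L / (kd + L).

    Scans kd on a logarithmic grid; for each kd the best r_max is obtained
    in closed form. Returns (kd, r_max) in the units of the inputs.
    """
    if len(concs) != len(responses) or not any(c > 0 for c in concs):
        raise ValueError("need matching data with at least one positive concentration")
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        bound = [occupancy(c, kd) for c in concs]
        r_max = sum(y * f for y, f in zip(responses, bound)) / sum(f * f for f in bound)
        sse = sum((y - r_max * f) ** 2 for y, f in zip(responses, bound))
        if best is None or sse < best[0]:
            best = (sse, kd, r_max)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic, noise-free placeholder data generated from the model itself.
    concs = [0.5, 1.0, 2.0, 5.0, 10.0, 50.0]
    responses = [100.0 * occupancy(c, 5.0) for c in concs]
    kd, r_max = fit_one_site(concs, responses)
    print(f"apparent Kd ~ {kd:.2f}, R_max ~ {r_max:.1f}")
```

A grid search is used here only to keep the example dependency-free; a non-linear least-squares routine from a numerical library would be the more usual choice for real data.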
Future work?
Hope for the future
As our project neared its close, it became apparent that our work need not be in vain. Despite our failure in creating the biosensor, we laid the groundwork for a future team to complete what we set out to do, using the tools of the future - directed evolution being chief among them.
Our end result was not quite what we set out to achieve. However, we came across a number of papers on directed evolution, which gave us hope that our project could be continued by taking the best resulting biosensor and gradually evolving it up to the point that it could accommodate the 14C alkyl chain. The work of 2018 Nobel Prize laureate Frances Arnold[5] suggests that such a thing could be done given enough time and proper methodology, but this is sadly beyond our capabilities to realistically carry out: as outlined in her paper, Design by Directed Evolution (1998), such a thing takes weeks to months to accomplish.
Nonetheless, we maintain hope that the first steps we have taken on this path will be continued by future iGEM teams who wish to develop and build upon our work, and that the improvement of the lives of many through this rather simple innovation is not too far in the future.
References
[1] Hurley, R., Woodward, J., Rothwell, J. J., Microplastic contamination of river beds significantly reduced by catchment-wide flooding, Nat. Geosci. 2018, 11, 251-257, doi: https://doi.org/10.1038/s41561-018-0080-1
[2] Wernke, K. M., Xue, M., Tirla, A., Kim, C. S., Crawford, J. M., Herzon, S. B., Structure and bioactivity of colibactin, Bioorg. Med. Chem. Lett. 2020, 30, 127280, doi: https://doi.org/10.1016/j.bmcl.2020.127280
[3] Moretti, R., Bender, B. J., Allison, B., Meiler, J., Rosetta and the Design of Ligand Binding Sites, in Computational Design of Ligand Binding Proteins, 2016, pp 47-62
[4] Awan, R., Sirinukunwattana, K., Epstein, D., Jefferyes, S., Qidwai, U., Aftab, Z., Mujeeb, I., Snead, D., Rajpoot, N., Glandular Morphometrics for Objective Grading of Colorectal Adenocarcinoma Histology Images, Sci. Rep. 2017, 7, 16852, doi: https://doi.org/10.1038/s41598-017-16516-w
[5] Arnold, F. H., Design by Directed Evolution, Acc. Chem. Res. 1998, 31, 125-131