We set out to develop a tool that would be comprehensible to first-time users. We expected this to be an uphill battle within bioinformatics, a field that involves far more than writing R scripts and consulting Google when something breaks. We therefore reasoned that a good tool should explain the processes within the pipeline well enough that the user feels willing to try new things. This would hopefully encourage new users to try different options and learn along the way, while giving experienced users familiar and customizable tools.
The logic behind the tool builds on simplicity and user-friendliness. The only thing required from the user is curiosity. An experienced user would know what some of the packages within MODifieR offer, recognize an enrichment method or two, and have probably written a script by hand before. Encountering new, already integrated options to experiment with, together with a user guide built into the tool itself, will hopefully make ClusteRsy the go-to option even for experienced bioinformaticians. Or at least that is our vision.
Still, anyone can use ClusteRsy to generate figures from data. You can randomly select an inference method and a type of enrichment, and voilà. Being able to play around within ClusteRsy was important to us, especially since curiosity is said to be the driving force behind science in general. By letting users change most, if not all, default parameters, the tool becomes highly customizable. This leads to another engineering principle: an abundance of options.
ClusteRsy features eight different disease module generators, gene and disease ontology analysis, pathway analysis, the ability to upload your own protein-protein interaction (PPI) network, the ability to save and delete results, and much more. This will hopefully make it the go-to tool for finding overrepresented genes within disease modules, thereby helping to understand complex diseases by pointing out new potential biomarkers.
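The idea of "overrepresentation" behind enrichment analysis can be illustrated with a one-sided hypergeometric test. The sketch below is only an illustration of the general statistic, not the code that ClusteRsy or its underlying packages run; the function name and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, term_size, module_size, overlap):
    """Probability of seeing at least `overlap` annotated genes in a module.

    universe_size: number of genes in the background (hypothetical value)
    term_size:     background genes annotated with the term or pathway
    module_size:   genes in the disease module
    overlap:       module genes that carry the annotation
    """
    upper = min(term_size, module_size)
    if not 0 <= overlap <= upper:
        raise ValueError("overlap must lie between 0 and min(term_size, module_size)")

    total = comb(universe_size, module_size)
    # Sum the upper tail of the hypergeometric distribution: P(X >= overlap).
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, 200 of them annotated with
# a pathway, and a 100-gene module that contains 8 of the annotated genes.
p_value = hypergeom_enrichment_p(20000, 200, 100, 8)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value suggests that the module contains more annotated genes than expected by chance. In practice, many terms are tested at once, so the p-values would also need a multiple-testing correction.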
To summarize, our most important engineering principle while building the tool has been to combine user-friendliness with complexity: offering both new and experienced users clear insight into the methods they try out, while keeping the tool simple to navigate and understand.
The experimental group spent several weeks researching known asthma biomarkers, especially in T2 asthma, and came up with many potential candidates. The biomarkers of highest interest were then chosen, with the criteria that they should be proteins and preferably enzymes, since that could aid in the production of a biosensor. Three proteins were chosen: eosinophil-derived neurotoxin (EDN), eosinophil cationic protein (ECP), and histamine N-methyltransferase (HNMT). The constructs for these proteins were designed in Benchling, where a T7 RNA polymerase promoter and a 5'-UTR region were added, and were then ordered from IDT (Figure 2). The T7 promoter was used to enable high expression of the chosen proteins.
In the wet lab, these constructs were transformed into Escherichia coli (E. coli) and expression levels were measured. The bacteria carrying the HNMT construct did not grow as well as the others, and we concluded that E. coli could not express this protein sufficiently, probably because it is toxic to the bacteria. The EDN and ECP constructs were successfully transformed into bacteria, but EDN was expressed at very low levels. ECP was expressed successfully, and the nanoDSF measurement performed to check the folding of the proteins showed that ECP had folded correctly, at least partially.
After this experiment we did more research on these constructs and learned that ECP and EDN each contain four disulfide bonds, which makes them hard to express with the construct we chose for this wet-lab work. We therefore concluded that we need another method to express these proteins (Figure 3). Using a signal peptide to transfer the proteins to the periplasm of the bacteria could ensure better folding, because the environment in the periplasm is better suited for disulfide bond formation. We also learned that there are enzymes, such as DsbA, that help the formation of disulfide bonds in the periplasm [1]. Transport requires not only the signal peptide but also a certain fold of the protein [2]. The safest way to make sure that transport happens is to fuse the target to a protein that is natively transported. For example, Ecotin, a highly expressed protease inhibitor in some strains of E. coli, can be used as a fusion partner to transport another protein to the periplasm [3,4]. With a His-tag and a TEV site added between Ecotin and ECP/EDN, one could separate the proteins after purification (building upon the Sydney Australia iGEM 2017 construct BBa_K2417002). We believe that this could be a good way to express proteins with many disulfide bonds. Since some of these proteins may be toxic to the host, we also learned that T7 lysozyme can promote the survival of the host. It is usually supplied on a second plasmid, where it inhibits the small amount of T7 RNA polymerase produced by leaky expression [5]. Without leaky expression of toxic products, the cells will hopefully grow normally until the point of induction.
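The layout of the proposed fusion, and what TEV cleavage would leave behind, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the part order (carrier, then His-tag, then TEV site, then target) is our reading of the design, the carrier and target strings are dummy placeholders rather than real Ecotin or ECP/EDN sequences, and the function names are hypothetical. The only real sequences used are a six-histidine tag and the canonical TEV recognition site ENLYFQ followed by G or S, which the protease cuts after the Q.

```python
import re

HIS_TAG = "HHHHHH"      # six-histidine affinity tag
TEV_SITE = "ENLYFQG"    # canonical TEV site; cleavage occurs between Q and G

# Dummy placeholders, NOT real protein sequences.
ECOTIN_PLACEHOLDER = "X" * 12
TARGET_PLACEHOLDER = "Z" * 10


def build_fusion(carrier, target):
    """Assemble the assumed N-to-C order: carrier, His-tag, TEV site, target."""
    return carrier + HIS_TAG + TEV_SITE + target


def tev_cleave(protein):
    """Split a protein after the Q of the first ENLYFQ(G/S) site, if present."""
    match = re.search(r"ENLYFQ(?=[GS])", protein)
    if match is None:
        return [protein]
    cut = match.end()
    return [protein[:cut], protein[cut:]]


fusion = build_fusion(ECOTIN_PLACEHOLDER, TARGET_PLACEHOLDER)
carrier_fragment, target_fragment = tev_cleave(fusion)

print("carrier fragment:", carrier_fragment)
print("target fragment: ", target_fragment)
```

Under this assumed order, the His-tag stays on the carrier fragment after cleavage, while the released target keeps only a single extra N-terminal glycine from the TEV site. If the tag were instead placed on the target side, the separation step would have to be planned differently.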
[1] Denoncin K, Collet JF. Disulfide bond formation in the bacterial periplasm: major achievements and challenges ahead. Antioxid Redox Signal. 2013;19(1):63-71.
[2] Holland IB. Translocation of bacterial proteins - an overview. Biochim Biophys Acta Mol Cell Res. 2004;1694(1-3):5-16.
[3] McGrath ME, Erpel T, Browner MF, Fletterick RJ. Expression of the protease inhibitor ecotin and its co-crystallization with trypsin. J Mol Biol. 1991;222(2):139-42.
[4] Paal M, Heel T, Schneider R, et al. A novel Ecotin-Ubiquitin-Tag (ECUT) for efficient, soluble peptide production in the periplasm of Escherichia coli. Microb Cell Fact. 2009;8:7.
[5] Moffatt BA, Studier FW. T7 lysozyme inhibits transcription by T7 RNA polymerase. Cell. 1987;49:221-227.