From the moment we chose to fight the oak processionary caterpillar (OPC) and consulted the OPC expert about the shortcomings of the current control methods, we knew we had to utilise recent scientific advances to make an effective and specific pesticide. While looking for potential mechanisms to achieve this, we considered many different options, such as nanoparticles, but in the end we settled on RNA interference (RNAi). There has been growing interest in the application of RNAi in pest control and in the development of biological pesticides (Niu et al., 2018). RNAi is already widely applied to control gene expression. It refers to a mechanism of gene silencing in which translation of a target mRNA is inhibited by homologous, double-stranded siRNAs (Petrova, Zenkova & Chernolovskaya, 2013). There has been plenty of promising research on RNAi as a new pest control approach for protecting food crops (Fishilevich et al., 2016; Palli, 2014). It appeared to us that another significant potential application of RNAi is the control of invasive species and nature conservation.


Our idea was to create a bacterium that produces siRNA molecules targeting specific and essential genes of the OPC. This would inhibit the growth of the caterpillars and consequently reduce the out-of-control populations, providing an improved control method. The first crucial step in our approach, and specifically in designing our siRNAs, was the identification of unique target genes. Since the OPC genome has not yet been sequenced completely, we searched the literature for possible targets. We then ran BLAST searches of barcode genes against NCBI sequence data to find unique regions.

Unfortunately, due to the ongoing Covid-19 pandemic and the strict regulations to stop its spread, we had no access to the lab and no way to validate our targets. Hence, we started looking for models that could be used to determine the lethality of silencing the identified target genes. The challenge we faced was the limited data on the OPC itself and the lack of models suited to determining the effects of these genes in the OPC. To overcome this, we took a different approach to building a model.


After extensive research, we found several tools that allow one to build a metabolic model of an organism from either protein or RNA expression data by linking this data to the corresponding reaction fluxes in a chosen framework. To determine the effect of silencing a gene, a metabolic model for the OPC was created using the iMAT algorithm (Zur, Ruppin & Shlomi, 2010). This model was built with the COBRA toolbox and works with expression data of one cell in an organism, as the expression of a gene or protein is directly linked to the flux of the associated reactions. The model uses the Gurobi solver to perform flux balance analysis and single gene deletion analysis, which help identify the effects of silencing a gene. To check whether silencing the chosen gene has any effect, an artificial BIOMASS reaction was added to the model; it consumes the materials the cell needs to survive and replicate. The model was then set to optimize for the BIOMASS reaction. A significantly lowered flux through this reaction indicates that the cell is harmed. We then encountered another problem: no arthropod frameworks existed, so we had to take a step back and modify our approach.
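The deletion screen described above can be illustrated without a solver. The sketch below is a qualitative stand-in, not flux balance analysis and not the COBRA toolbox: it knocks out one gene at a time in a small, entirely hypothetical network and asks only whether the BIOMASS reaction can still be reached. All reaction names, metabolites and genes (`g1` to `g5`) are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy network. Each reaction lists substrates, products and a
# gene rule written as OR-of-ANDs: any one inner set of genes is sufficient.
# An empty rule means the reaction needs no gene (e.g. nutrient uptake).
REACTIONS: Dict[str, dict] = {
    "EX_nutrient": {"subs": set(), "prods": {"nutrient"}, "rule": []},
    "R1": {"subs": {"nutrient"}, "prods": {"A"}, "rule": [{"g1"}]},
    "R2": {"subs": {"A"}, "prods": {"B"}, "rule": [{"g2"}, {"g3"}]},  # isozymes
    "R3": {"subs": {"B"}, "prods": {"C"}, "rule": [{"g4", "g5"}]},    # complex
    "BIOMASS": {"subs": {"A", "C"}, "prods": {"biomass"}, "rule": []},
}


def reaction_active(rule: List[Set[str]], knocked_out: Set[str]) -> bool:
    """A reaction stays active if at least one gene set is fully intact."""
    if not rule:
        return True
    return any(not (genes & knocked_out) for genes in rule)


def biomass_reachable(knocked_out: Set[str]) -> bool:
    """Expand the set of producible metabolites until nothing new appears."""
    available: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for rxn in REACTIONS.values():
            if not reaction_active(rxn["rule"], knocked_out):
                continue
            if rxn["subs"] <= available and not rxn["prods"] <= available:
                available |= rxn["prods"]
                changed = True
    return "biomass" in available


def single_gene_deletion() -> Dict[str, bool]:
    """Map each gene to True if knocking it out alone blocks biomass."""
    genes = sorted(
        {g for rxn in REACTIONS.values() for alt in rxn["rule"] for g in alt}
    )
    return {g: not biomass_reachable({g}) for g in genes}


if __name__ == "__main__":
    for gene, lethal in single_gene_deletion().items():
        print(gene, "lethal" if lethal else "tolerated")
```

In this toy network, knocking out either isozyme gene alone is tolerated, while losing any member of the enzyme complex is not. A real flux balance analysis additionally quantifies how much the biomass flux drops, which this reachability check cannot do.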

When re-examining our targets, we noticed that they were highly evolutionarily conserved in terms of their functions and pathways (Nomiyama & Yoshie, 2014; Teo, Möhrlen, Plickert, Müller, & Frank, 2006). This gave us the idea to instead use gene analogues with an existing framework. We reasoned that, since our targets are well conserved, their functions should be preserved when mammalian analogues are used with a Drosophila melanogaster gene expression data set. For this we chose the human framework Recon3D, as it was the most complete and detailed one available (Brunk et al., 2018). Next, we translated our OPC RNA expression data into their respective human analogues before mapping them to the framework. This produced a metabolic model that was as close an approximation of the OPC as the available data allowed. The model did not directly represent the interaction of the siRNA with its target but rather the expected effect of a gene knockout. Using this model, we were able to determine the lethality of all but one of the possible target genes.
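The translation step described above can be sketched as a lookup followed by a coarse binning of expression levels, similar in spirit to how iMAT separates highly and lowly expressed genes. Everything in this sketch is an assumption made for illustration: the ortholog table, the expression values, the thresholds, and the choice to keep the highest value when several OPC transcripts share one human analogue.

```python
from typing import Dict

# Hypothetical ortholog table: OPC transcript label -> human gene symbol.
# "HUMAN_X" and the opc_gene_* labels are placeholders, not real identifiers.
OPC_TO_HUMAN: Dict[str, str] = {
    "Wg": "WNT1",
    "EF1a": "EEF1A1",
    "opc_gene_x": "HUMAN_X",
    "opc_gene_y": "HUMAN_X",  # two OPC transcripts sharing one analogue
}

# Made-up expression values in arbitrary units.
OPC_EXPRESSION: Dict[str, float] = {
    "Wg": 2.0,
    "EF1a": 95.0,
    "opc_gene_x": 10.0,
    "opc_gene_y": 40.0,
    "opc_orphan": 7.0,  # no analogue in the table, so it is dropped
}


def translate_expression(
    expr: Dict[str, float], mapping: Dict[str, str]
) -> Dict[str, float]:
    """Re-key expression values by human analogue, keeping the maximum
    when several source genes map to the same analogue (an assumption)."""
    human: Dict[str, float] = {}
    for gene, value in expr.items():
        target = mapping.get(gene)
        if target is None:
            continue  # genes without an analogue cannot be mapped
        human[target] = max(value, human.get(target, value))
    return human


def discretize(expr: Dict[str, float], low: float, high: float) -> Dict[str, int]:
    """Bin expression into -1 (low), 0 (moderate) or 1 (high)."""
    levels: Dict[str, int] = {}
    for gene, value in expr.items():
        if value < low:
            levels[gene] = -1
        elif value > high:
            levels[gene] = 1
        else:
            levels[gene] = 0
    return levels


if __name__ == "__main__":
    human_expr = translate_expression(OPC_EXPRESSION, OPC_TO_HUMAN)
    print(discretize(human_expr, low=5.0, high=50.0))
```

Genes without an entry in the table simply fall out of the model input, which is one reason such an approximation can only be as good as the analogue mapping behind it.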


With the help of our models, we focused our research and prepared to start work in the lab. We learned which genes would have the desired effect if knocked out. In total, 4 genes with the most promising effects were identified, and these became our priority in planning the future lab experiments. They are the gene for subunit 2 of the allergenic protein Tha p2, which is found in the urticating setae of the caterpillars (Berardi, Battisti & Negrisolo, 2015); the Pro2 photolyase gene, which is involved in proline metabolism; the EF1a gene for elongation factor-1 alpha, which takes part in the elongation phase of protein synthesis; and the Wg gene for the wingless protein, which is thought to be involved in numerous processes throughout development (Simonato et al., 2013).

To design the siRNAs, 21-nucleotide sequences were identified within the genes according to siRNA design guidelines (Thermo Fisher Scientific, n.d.), which include a GC content of 30-50%, the absence of stretches of 4 or more A's or T's in a row, and a TT dinucleotide at the beginning. For each gene, siRNA sequences were selected at several different target sites to potentially increase the effectiveness of RNA interference. These short sequences were then run through the NCBI BLAST algorithm again to assess their specificity. Finally, only the siRNAs that were most specific to the OPC were chosen.
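The design criteria listed above can be expressed as a small filter. This is a minimal sketch that takes the criteria exactly as stated in the text (21 nucleotides, 30-50% GC, no run of four or more A's or T's, TT at the start) and scans a DNA sequence for windows that satisfy them. The function names and the example sequences are hypothetical, and the BLAST specificity check is not covered.

```python
import re
from typing import List, Tuple

SIRNA_LENGTH = 21


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_guidelines(seq: str) -> bool:
    """Check one candidate against the criteria listed in the text."""
    seq = seq.upper()
    if len(seq) != SIRNA_LENGTH:
        return False
    if not seq.startswith("TT"):
        return False
    if re.search(r"A{4,}|T{4,}", seq):
        return False  # stretch of 4 or more A's or T's
    return 0.30 <= gc_fraction(seq) <= 0.50


def candidate_sites(gene: str) -> List[Tuple[int, str]]:
    """Slide a 21-nt window over the gene and keep passing windows,
    returned as (0-based start position, sequence)."""
    gene = gene.upper()
    hits = []
    for start in range(len(gene) - SIRNA_LENGTH + 1):
        window = gene[start:start + SIRNA_LENGTH]
        if passes_guidelines(window):
            hits.append((start, window))
    return hits


if __name__ == "__main__":
    # Made-up sequence, not taken from any OPC gene.
    example_gene = "CC" + "TTGCATCGATGCATAGCTAGA" + "GG"
    for start, site in candidate_sites(example_gene):
        print(start, site, round(gc_fraction(site), 2))
```

Candidates that pass such a filter would still need the specificity screen against other species before being selected.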

Figure 1. BLAST-N search for one of our target genes (potential specific targets marked as features)


After identifying the promising target genes and designing siRNAs for them, the next step was to design the bacteria that would produce and deliver these siRNAs. To design our "Oak Shield" strain of bacteria, we first reviewed papers that use bacteria-mediated RNAi specifically for pest control. Since several studies reported success using the HT115(DE3) E. coli strain and the L4440 plasmid, we decided to use this strain and plasmid to express our siRNAs (Ganbaatar et al., 2017; Giesbrecht et al., 2020; Zhang et al., 2019). The E. coli strain HT115(DE3) was selected because it is RNase III deficient, so our RNAs are not degraded by the bacteria. Furthermore, HT115(DE3) expresses T7 polymerase, which allows for efficient expression of dsRNA products.
We wanted the bacteria to express precisely the siRNAs we had designed and selected. However, since the plasmid does not contain a transcription termination signal, we decided to insert a terminator sequence into our expression cassette. We feared that prolonged transcription beyond the siRNA coding region would lead to nonspecific and ineffective products. Literature research confirmed that a termination signal significantly increases the yield of correctly sized and effective shRNAs. We designed our final expression cassette using the L4440 plasmid as a template (Sturm, Saskoi, Tibor, Weinhardt, & Vellai, 2018). We kept the T7 promoters of the plasmid but replaced the MCS with EcoRI and BamHI restriction sites (for cloning in the shRNA-coding DNA oligo) followed by a T7 terminator sequence. This fragment was ordered as a gBlock and cloned into L4440. The resulting plasmid can be used universally to clone in any desired shRNA sequence, so we do not need to order each siRNA as a separate gBlock containing the whole expression cassette. We created 14 different siRNAs for 4 different genes with adapters for EcoRI and BamHI cloning; each siRNA oligo is inserted into this L4440 expression cassette plasmid.
The inserts were ordered as single-stranded oligonucleotides that, after annealing, have the necessary overhangs for EcoRI and BamHI. The shRNA DNA constructs can therefore be inserted into the vector once it has been cut with the same restriction enzymes (Figure 2).
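The annealed-oligo strategy described above can be sketched as follows. The sketch assumes that the EcoRI site lies upstream of the BamHI site in the cassette and that both sites should be regenerated after ligation; neither detail is stated explicitly in the text. EcoRI (G^AATTC) leaves a 5' AATT overhang and BamHI (G^GATCC) leaves a 5' GATC overhang, so each oligo starts with the matching four bases. The function names and the example insert are hypothetical.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_oligos(insert: str) -> Tuple[str, str]:
    """Return (top, bottom) oligos, both written 5'->3'.

    After annealing, the top strand carries a 5' AATT overhang (EcoRI end)
    and the bottom strand a 5' GATC overhang (BamHI end). The extra C and G
    flanking the insert restore both recognition sites (an assumption).
    """
    insert = insert.upper()
    top = "AATTC" + insert + "G"
    bottom = "GATCC" + reverse_complement(insert) + "G"
    return top, bottom


def duplex_is_consistent(top: str, bottom: str) -> bool:
    """Apart from the 4-nt overhangs, the two strands must pair fully."""
    return top[4:] == reverse_complement(bottom[4:])


if __name__ == "__main__":
    top, bottom = design_oligos("GATTACA")  # made-up insert
    print("top:   ", top)
    print("bottom:", bottom)
    print("pairs correctly:", duplex_is_consistent(top, bottom))
```

Because the overhangs are built into the oligos, the annealed duplex does not itself need to be digested before ligation into the cut vector.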

Figure 2. The expression cassette is cloned into L4440 linearized with PciI and NgoMIV using Gibson assembly. An shRNA-coding DNA sequence can then be inserted into the vector using BamHI and EcoRI.


In total, we designed 14 shRNAs to target essential genes in the OPC. In addition, we designed the bacterial expression cassette for these shRNAs. Learn more about our designed parts here.


When we finally got permission to enter the lab, we had plenty of work to do in an extremely limited timeframe. The first task was to validate the target sequences, for which a series of experiments was conducted. DNA was first extracted from early-stage OPCs, followed by a PCR using primers designed for the target genes. This let us determine whether the targeted regions were indeed present. Gel electrophoresis was used to separate the DNA fragments from the PCR by size and so check whether the target fragments had been amplified successfully. Afterwards, these fragments were isolated and further amplified to obtain a concentration sufficient for sequencing by Macrogen. This was necessary to ensure that the target sequences are exactly the same as those found in the databases. Only in this way can we ensure that the siRNAs bind specifically and efficiently to their selected targets. In total, 11 siRNA target regions were confirmed by the Macrogen sequencing with 100% identity, 1 region was confirmed with 86% identity to the reference sequence, and 2 were not found. The 2 missing regions turned out to be due to one of the primer sets for the Tha p2 gene not producing any results.
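Percent identity figures like the ones above come from comparing each sequenced region with its reference, position by position. The sketch below assumes the two sequences are already aligned without gaps and have the same length, which is a simplification of what an alignment tool reports. The sequences are made up and do not correspond to any of the sequenced regions.

```python
def percent_identity(reference: str, read: str) -> float:
    """Percentage of matching positions between two equal-length,
    ungapped sequences (a simplification of alignment-based identity)."""
    if not reference or len(reference) != len(read):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference.upper(), read.upper()))
    return 100.0 * matches / len(reference)


if __name__ == "__main__":
    reference = "TTGCATCGATGCATAGCTAGA"  # made-up 21-nt target site
    read = "TTACATCGACGCATAGCTATA"       # same site with three mismatches
    print(f"{percent_identity(reference, read):.1f}% identity")
```

For a 21-nucleotide site, each mismatch lowers the identity by a little under five percentage points, so even a few differences are clearly visible in the score.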

The next step was to assemble the plasmids and test shRNA production by the bacteria. The assembly was done in several steps of digestion and annealing. In the end, 6 different plasmids with inserts were successfully introduced into DH5α bacteria, as shown by visible colonies on the plates. As the others did not grow, we would have needed more time to repeat the vector and insert assembly in order to obtain all 14 inserts.

Figure 3. Antibiotic selection plate of our final Oak Shield strain

Afterwards, the successful plasmids were transformed into HT115(DE3) E. coli. Of these, 4 were successful, as colonies could be seen on the plates. For these 4, IPTG induction was carried out to produce the siRNAs.


Unfortunately, due to the pandemic we were not able to do all the lab work we wanted. One and a half months is not enough time to properly test whether all our designed parts were successfully engineered and implemented. However, despite the limited time in which we had access to the lab, we managed to construct the basic expression cassette vector and to insert 6 shRNA sequences. These final constructs still need to be confirmed by sequencing. As the next step, we would test whether the shRNAs are produced upon IPTG induction in the HT115(DE3) cells. This would be done by extracting the RNA content of the bacteria and running a gel to see whether the desired shRNAs are present. If this test yielded the expected results, we would move on to tests with the caterpillars. In case of unexpected results, we would need to investigate further, for example whether there are problems with our expression cassette. Another subsequent step would be to test how well the HT115(DE3) E. coli carrying the inserts survive when sprayed on oak branches. This would give us key information about whether we need to modify the bacteria further so that they survive long enough on the oak branches.