Introduction
The first idea we had when thinking about how to create parts that extend cell regulation beyond the DNA level was to create fusion proteins in which protein subunits are tethered together with RNA. Tools that enable this would pave the way for many interesting applications (also see our Proposed Implementation site). Proteins would no longer be defined only by the information stored in the DNA sequence; new combinations of functional subunits could also be assembled post-translationally. This could, for example, be useful when working with constructs like trifunctional CRISPR-AID systems, where a dCas is combined with different transcription factors to activate and inhibit the expression of genes in parallel (trifunctional). With RNA linking of the transcription factors to the dCas, there would be no need to create multiple bulky gene expression cassettes, each with a large dCas. Instead, one could create a single expression cassette for the dCas and tether the transcription factors to it post-translationally to create the full protein complex. RNA linking between a transcription factor and a dCas has already been demonstrated using small viral RNA-binding proteins that bind to a hairpin added to the sgRNA (SoxS). But apart from dCas systems, any system in which the recombination of functional protein subunits is desired could potentially benefit from tools that allow the assembly of protein subunits after their translation.
No palace has been built without proper tools, so we first concentrated on developing parts that give researchers planning such systems as much design space as possible. Namely, we focused on the characterization and creation of RNA-binding proteins (RBPs), which mediate the contact between a protein subunit and the RNA, and of RNA linkers, which connect the protein subunits. To support this, we created PRISM and CONCORDE, which facilitate the artificial creation of these parts.
But especially for RBPs, we do not necessarily need to create proteins artificially; we can also draw on a vast repertoire of naturally occurring RBPs, such as viral RNA-binding proteins or the modular Pumilio homology domains and pentatricopeptide repeat proteins. Here we provide an overview of the design process for these parts when planning the post-translational assembly of functional subunits, and a concept for validating RBPs and RNA linkers in vivo.
Biology and Design
RNA Binding Proteins
As set out above, we started by looking for naturally occurring RBPs that would be suitable mediators between RNA linkers and protein subunits. We identified the antitermination protein N from phage lambda (Lambda N), the MS2 coat protein from the bacteriophage MS2 (MCP), Pumilio homology domains (Pumby), and pentatricopeptide repeat proteins (PPRs).
Phage lambda is an Escherichia coli-infecting phage that can either integrate into the bacterium's genome (lysogenic pathway) or produce many copies of itself immediately after infection, which finally leads to cell lysis (lytic pathway) (lambdaphage). The Lambda N protein is involved in the activation of the lytic phase in phage development. It suppresses the activity of transcriptional terminators that interfere with the transcription of phage proteins by binding tightly to the RNA motif Box B in neighboring genes of the phage genome.
Because of its small size (60 amino acids), its simple RNA binding motif (15 nucleotides), and its strong affinity for this motif (Kd = 1.3 nM) (lambdan), we chose Lambda N to be part of our toolbox. We hypothesize that such a small peptide can connect protein subunits through RNA without sterically interfering with their activity.
For similar reasons we wanted to use the RNA binding coat protein of the bacteriophage MS2 for our toolbox (Figure 1).
In the bacteriophage MS2, the MCP binds a stem-loop structure in the viral RNA. This complex is thought to be involved in suppressing the synthesis of the replicase and in providing the first trigger for encapsulation of the virus (ms2).
Figure 1: Protein Crystal Structure of MCP
MCP consists of five beta sheets and two alpha helices. The alpha helices extend over the beta sheets and stabilize the monomer. The RNA is bound by the beta sheets, where inward-facing lysines, asparagines, threonines, and tyrosines bind to the MCP RNA hairpin loop (binding).
Similarly to Lambda N, MCP has a fairly small size (129 amino acids), a small and well-understood RNA binding motif (21 nucleotides), and a high affinity for its motif (Kd = 10 nM) (binding). With Lambda N and MCP, we now have RBPs to create a simple protein-RNA complex between two subunits.
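To give a feeling for what the two dissociation constants quoted above mean in practice, the following minimal sketch computes the expected occupancy of an RNA motif. It assumes a simple 1:1 equilibrium binding model in which the free RBP concentration is not noticeably depleted by binding; this model and the example concentrations are our illustrative assumptions, not measured values.

```python
def fraction_bound(free_rbp_nM: float, kd_nM: float) -> float:
    """Fraction of RNA motifs occupied by an RBP under a simple 1:1 binding model.

    Assumes equilibrium and that the free RBP concentration is not depleted
    by binding (an idealization).
    """
    if free_rbp_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_rbp_nM / (free_rbp_nM + kd_nM)


# Kd values as quoted in the text (in nM)
KD_NM = {"Lambda N": 1.3, "MCP": 10.0}

# Hypothetical free RBP concentrations (in nM), chosen only for illustration
for name, kd in KD_NM.items():
    for conc in (1.0, 10.0, 100.0):
        print(f"{name}: {conc:6.1f} nM RBP -> {fraction_bound(conc, kd):.2f} of motifs bound")
```

At a free RBP concentration equal to the Kd, half of the motifs are occupied in this model, so the lower Kd of Lambda N corresponds to higher occupancy at the same protein concentration.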
For the design of Pumby and PPR proteins, see our page dedicated solely to them. The following design principles for fusing RBPs to protein subunits and for designing RNA linkers apply to them just as much as to MCP and Lambda N.
Finally, the RNA-binding proteins have to be connected to the functional protein subunits. To avoid steric inhibition between the two, the RBP should be separated from the subunit by a peptide linker. A large repertoire of peptide linkers with different properties is already available in the registry. For example, there are rigid alpha-helix-forming connectors, proline-rich peptide linkers, which interact minimally with the proteins they connect, and GS linkers, which are flexible linkers with good solubility (fusion). Because they are the simplest and most commonly used, we used GS linkers in all subsequent designs of RBP-protein complexes.
RNA Linker Design
The last part type missing in our design is the molecule that actually tethers the protein subunits together: the RNA linker. We devised a simple RNA linker structure (Figure 2) and applied it to design several RNA linkers with different sequences for the same RBPs, to guard against unforeseeable problems in the experimental validation, such as rapid degradation of certain RNA motifs. We used the ViennaRNA web service RNAfold to assess the correct folding of the hammerhead ribozymes and the structure of the RNA binding motifs (forna).
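A folding check of this kind can be partly automated once a predicted structure is available in dot-bracket notation, the format RNAfold reports. The sketch below is our own illustration and not part of the ViennaRNA package: it tests whether a binding motif forms a self-contained hairpin, meaning that none of its bases pair with flanking sequence. The function names and the toy structure are hypothetical.

```python
def base_pairs(dot_bracket: str) -> dict[int, int]:
    """Map each paired position to its partner in a dot-bracket string."""
    stack: list[int] = []
    pairs: dict[int, int] = {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unexpected ')'")
            j = stack.pop()
            pairs[i] = j
            pairs[j] = i
        elif ch != ".":
            raise ValueError(f"unsupported character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: unclosed '('")
    return pairs


def motif_is_self_contained(dot_bracket: str, start: int, end: int) -> bool:
    """True if the window [start, end) contains at least one base pair and
    every paired base in it has its partner inside the same window."""
    pairs = base_pairs(dot_bracket)
    paired = [i for i in range(start, end) if i in pairs]
    return bool(paired) and all(start <= pairs[i] < end for i in paired)


# Toy structure (not a real linker): a hairpin at positions 4-13 with
# unpaired flanks on both sides.
toy_structure = "....(((....)))...."
print(motif_is_self_contained(toy_structure, 4, 14))  # whole hairpin in window
print(motif_is_self_contained(toy_structure, 4, 10))  # window cuts the stem
```

A motif that fails this check in the predicted structure is a candidate for redesigning the neighboring variable sequence.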
Figure 2: Structure of RNA linkers
At the top is the schematic structure of a typical RNA linker, which contains a 5' hammerhead ribozyme, the binding motif of RBP A, a variable sequence of nucleotides, the binding motif of RBP B, and a 3' hammerhead ribozyme. At the bottom is the secondary structure of a linker we designed for MCP and Lambda N for the experimental validation described later on this page. In this case, we designed a hairpin as the variable sequence between the binding motifs of Lambda N and MCP.
We used hammerhead ribozymes flanking the RNA target motifs of the RBPs to ensure clean processing of the RNA linkers. With the ribozymes, no additional RNA originating from the promoter or RBS remains after transcription of the sequence. Because RNA secondary structures are hard to predict, we theorized that such artifacts could interfere with the structure of the RNA target motifs, and we therefore minimized this risk through precise cleavage of the RNA linkers at the target sites of the hammerhead ribozymes. We also created DesignSimple, a script that designs hammerhead ribozymes that fold with high stability, to facilitate the process of RNA linker design.
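The linker layout described in this section (5' ribozyme, binding motif of RBP A, variable sequence, binding motif of RBP B, 3' ribozyme) can be sketched as a small assembly helper. This is a minimal illustration and not the DesignSimple script; the class and function names are hypothetical, and the ribozyme and spacer sequences in the example are short placeholders, not real ribozymes. Only the two Pumby target motifs are taken from this page.

```python
from dataclasses import dataclass

RNA_ALPHABET = set("ACGU")


@dataclass(frozen=True)
class LinkerParts:
    ribozyme_5p: str  # 5' hammerhead ribozyme
    motif_a: str      # binding motif of RBP A
    spacer: str       # variable sequence between the motifs (may be empty)
    motif_b: str      # binding motif of RBP B
    ribozyme_3p: str  # 3' hammerhead ribozyme


def assemble_linker(parts: LinkerParts) -> str:
    """Concatenate the linker segments in 5'-to-3' order after basic checks."""
    segments = {
        "ribozyme_5p": parts.ribozyme_5p,
        "motif_a": parts.motif_a,
        "spacer": parts.spacer,
        "motif_b": parts.motif_b,
        "ribozyme_3p": parts.ribozyme_3p,
    }
    for name, seq in segments.items():
        if not seq and name != "spacer":
            raise ValueError(f"{name} must not be empty")
        if set(seq) - RNA_ALPHABET:
            raise ValueError(f"{name} contains non-RNA characters")
    # Two identical motifs would recruit the same RBP fusion twice.
    if parts.motif_a == parts.motif_b:
        raise ValueError("motif_a and motif_b must differ")
    return "".join(segments.values())


# Example: Pumby motifs from this page; all other sequences are placeholders.
example = LinkerParts(
    ribozyme_5p="ACGU",  # placeholder, NOT a real ribozyme
    motif_a="AUAGAU",
    spacer="AAAA",       # placeholder spacer
    motif_b="AUGGUU",
    ribozyme_3p="UGCA",  # placeholder, NOT a real ribozyme
)
print(assemble_linker(example))
```

Rejecting identical motifs up front encodes the requirement that the two protein subunits be recruited by two different RBPs.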
Experimental Validation
To test the assembly potential of the designed RNA linkers and RBPs, we designed a simple reporter assay that would give us a measure of the strength of association between the protein subunits we are tethering together. We designed a split-GFP assay based on a published design (splitgfp). In this assay, GFP is split into two units: an N-terminal domain (NGFP) and a C-terminal domain (CGFP). Alone, neither unit fluoresces, but when brought close to each other, they reconstitute the GFP complex, which emits green light when excited (Figure 3). We designed fusions of NGFP with Lambda N and of MCP with CGFP (BBa_K3657030, BBa_K3657031).
Additionally, we created several RNA linkers, such as BBa_K3657036, containing the RNA binding motifs of Lambda N and MCP. Note that it is important to use two different RNA-binding proteins with distinct RNA target motifs to avoid unproductive assemblies such as NGFP-Lambda N + NGFP-Lambda N. The same experimental setup was designed for two Pumby proteins binding the sequences AUAGAU and AUGGUU (BBa_K3657034, BBa_K3657033).
Figure 3: Scheme of the Split-GFP RBP Assay
Without the expression of an RNA linker the split-GFP subunits do not form a complex and no light is emitted. When the RNA Linker is expressed the RBPs can bind to their target motifs on the RNA Linker and the split-GFP subunits are brought together, forming an active fluorescent protein unit and emitting green light.
Cloning
All constructs were created using the Golden Gate/MoClo assembly-based Marburg Collection and protocols from Marburg 2018 or adapted versions of them (for exact descriptions of the cloning process, see our protocols and labbook). We had to create many plasmids to conduct proper measurements, including positive and negative controls and different versions of RNA linkers. We provide a short overview of these constructs here (Table 1 and Table 2), but if you do not want to click through endless BioBrick links, you can also download the annotated GenBank files of our plasmids here and view them in the bioinformatics software of your choice.
Table 1: Overview of level 1 plasmid elements for the Split-GFP RBP Assay
Overview of the elements of the designed Level 1 plasmids. All constructs were cloned with BBa_K2560017 as the promoter, BBa_K2560010 as the RBS, and BBa_K2560034 as the terminator. Connectors are not included in this list but were used as needed when constructing Level 2 plasmids (Table 2). We did not concatenate gene expression cassettes in any specific order in the Level 2 plasmid construction.
Table 2: Overview of level 2 plasmid elements for the Split-GFP Assay
After successful Level 1 cloning, the full gene expression cassettes of the RNA linkers are combined with those of CGFP Probe and GFP positive, so that no more than two plasmids have to be transformed into one sample when conducting the experiment.
After successful isolation of the plasmids, we would have co-transformed NGFP negative & CGFP negative, CGFP-Linker & NGFP Probe, and GFP-Linker & Empty positive. NGFP negative & CGFP negative is the negative control in our experiment: it lacks the RNA linker and should therefore not fluoresce. GFP-Linker & Empty positive is the positive control, in which a full GFP is bound to MCP instead of a CGFP and Lambda N is not bound to CGFP. With the positive control, we wanted to make sure that the interaction of the RBPs would not interfere with the fluorescence of GFP if an assembly occurs. Finally, CGFP-Linker & NGFP Probe constitute the actual sample, in which the RBPs are bound to the split-GFP parts and the RNA linker is expressed. As can be seen in the cloning tables (Tables 1 and 2), the same experimental setup was used when designing the split-GFP assay for the Pumby proteins, where a Pumby binding an AUGGUU RNA motif replaced Lambda N, a Pumby binding an AUAGAU RNA motif replaced MCP, and the RNA linker was adapted to contain the Pumby RNA target motifs.
Measurement
For our measurement, we would read out the fluorescence with a plate reader. The excitation maximum of the assembled split-GFP is at 468 nm and the emission maximum at 505 nm. We would measure multiple replicates, compare the performance of the different linkers with each other, and compare the fluorescence intensities with those of our negative and positive controls (Figure 4).
Figure 4: Measurement layout in a black 96 well plate for Split-GFP RBP assay
After co-transformation, three colonies of every construct would be picked to account for variance between bacterial colonies. The seed cultures of the positive control, the negative control, and all samples with different RNA linkers would each be measured in four replicates per colony. Additionally, another negative control, with wells containing only crude culture and empty medium, would be pipetted at the bottom of the plate to correct the measurement for background fluorescence.
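A layout with three colonies and four replicates per construct fills exactly one 12-well row of a 96-well plate. The sketch below generates such a well map. It assumes one construct per row and reserves the bottom row for the background wells; this arrangement, the function name, and the construct labels are our illustrative assumptions and may differ from the layout in Figure 4.

```python
import string

ROWS = string.ascii_uppercase[:8]  # plate rows A-H
N_COLS = 12                        # plate columns 1-12
BACKGROUND_ROW = "H"               # bottom row reserved for background wells


def plate_layout(constructs, colonies=3, replicates=4):
    """Return {well: (label, colony, replicate)} with one construct per row."""
    if colonies * replicates > N_COLS:
        raise ValueError("colonies x replicates must fit into one row")
    if len(constructs) > len(ROWS) - 1:
        raise ValueError("too many constructs for one plate")
    layout = {}
    for row, construct in zip(ROWS, constructs):
        for colony in range(colonies):
            for rep in range(replicates):
                col = colony * replicates + rep + 1
                layout[f"{row}{col}"] = (construct, colony + 1, rep + 1)
    for col in range(1, N_COLS + 1):
        layout[f"{BACKGROUND_ROW}{col}"] = ("background", None, None)
    return layout


# Hypothetical construct labels, for illustration only
layout = plate_layout(["positive", "negative", "linker_1", "linker_2"])
for well in ("A1", "A5", "D12", "H1"):
    print(well, layout[well])
```

With this arrangement, up to seven constructs fit on one plate alongside the background row.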
Because we would measure the fluorescence in black well plates, we could not measure the OD600 of the samples in the same plate directly before the measurement. Therefore, we would measure the OD600 separately by pipetting some volume from every well of the black measurement plate into a transparent 96-well plate and measuring the absorbance at 600 nm there. The fluorescence of the samples could thereby be normalized to the cell density in each well (for more details on our measurement and data analysis process, you can also view our protocols).
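The normalization described above can be sketched as follows: subtract the background reading from both signals, then divide fluorescence by OD600. This is a minimal illustration under the assumption that one background value per signal is used for the whole plate; the function names and all numbers are hypothetical, and the exact procedure in our protocols may differ.

```python
from statistics import mean


def normalized_fluorescence(fluorescence, od600, blank_fluorescence, blank_od600):
    """Background-corrected fluorescence per unit of background-corrected OD600."""
    corrected_od = od600 - blank_od600
    if corrected_od <= 0:
        raise ValueError("OD600 must exceed the blank OD600")
    return (fluorescence - blank_fluorescence) / corrected_od


def colony_mean(fluorescence_values, od600_values, blank_fluorescence, blank_od600):
    """Mean normalized fluorescence over the replicate wells of one colony."""
    if len(fluorescence_values) != len(od600_values):
        raise ValueError("need one OD600 value per fluorescence value")
    return mean(
        normalized_fluorescence(f, od, blank_fluorescence, blank_od600)
        for f, od in zip(fluorescence_values, od600_values)
    )


# Hypothetical readings for four replicate wells of one colony
fluorescence = [1200.0, 1150.0, 1250.0, 1180.0]
od600 = [0.62, 0.60, 0.65, 0.61]
print(colony_mean(fluorescence, od600, blank_fluorescence=100.0, blank_od600=0.05))
```

Comparing these per-colony means between samples and controls keeps differences in cell density from being mistaken for differences in assembly.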
Our progress
Due to Covid-19, we unfortunately could not start working in the wet lab until the beginning of September. Because of this, we were not able to create all plasmids shown in Tables 1 and 2 in time to conduct proper assays. We created all Level 1 plasmids of the Lambda N and MCP split-GFP assay, and all Level 1 plasmids of the Pumby split-GFP assay except GFP positive, NGFP negative, and CGFP negative. Because the Level 2 plasmids shown in Table 2 are necessary to conduct the assays, and we could not create a Level 2 plasmid in our limited time, we cannot present experimental data here that would prove or disprove the concept of RNA-protein linking. We still hope that the parts we designed, based on intensive literature research, will benefit future iGEM teams or researchers who want to make use of the potentially great power of RBPs.