DNA:RNA Triple Helices
Tethering RNA to DNA
Introduction
We all learn at school that two nucleic acid strands can interact by forming hydrogen bonds between adenine and thymine or between guanine and cytosine bases, producing a double helix. This is known as Watson-Crick interaction. Nucleic acids can also bind through Hoogsteen interaction. Here the bases are oriented differently, so different atoms form the hydrogen bonds (see Figure 1). Hoogsteen interactions can give rise to alternative DNA and RNA structures, such as G-quadruplexes and triple helices.
In our experiment we focused on the RNA•DNA–DNA triple helix (• = Hoogsteen interaction, – = Watson-Crick interaction). It forms when RNA is inserted into the major groove of the DNA double helix. The bonds between different bases vary in stability, which results in sequence-specific binding. Both specificity and bond strength are considerably lower than for Watson-Crick interactions.
Our goal is to design an RNA that can form a triple helix with a specific DNA sequence and simultaneously attract transcription factors. To attract transcription factors, the RNA is designed to form a hairpin structure that can be bound by RNA-binding proteins. If a transcription factor is coupled to such an RNA-binding protein, it can be recruited to a specific DNA sequence.
Motivation
In recent years, considerable research effort has gone into creating logic gates in living cells at the level of transcription. One promising approach uses dCas fused to a specific activator together with multiple different guide RNAs that bind to the regions flanking the promoter of a reporter gene. Alternatively, a dCas fused to a repressor may be used.
Here, a more lightweight system that is entirely RNA-based could greatly improve efficiency. A possible solution is to use an RNA that can form a triple helix. It would greatly reduce the size of the construct in base pairs and would be much cheaper for the cell to produce. Additionally, advances made in translational regulation, such as riboswitches, could also be implemented at the transcriptional level.
Design
Design of the triple helix
Kunkler et al. recently showed that the shortest RNA that forms a triple helix with DNA in vitro is 19 nucleotides long.
Table 1: DNA and RNA nucleotide pairs used for triple helix design.
DNA | RNA |
---|---|
Adenine | Uracil |
Cytosine | Uracil |
Guanine | Cytosine |
Additionally, we used the following sequences, which Kunkler et al. showed to form a triple helix in vitro: RNA (5’-UUUUUCUUUUCUUUUCUUUCUU-3’), DNA (5’-AAAAAGAAAAGAAAAGAAAGAA-3’).
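The pairing rules in Table 1 can be applied mechanically to a DNA target. The following is a minimal sketch, not the procedure we actually used: the function name and the error handling are illustrative, and it assumes that the RNA is written in the same 5’ to 3’ direction as the DNA strand it is listed against, as in the published pair quoted above. It rejects any base that Table 1 does not cover (for example thymine).

```python
# Pairing rules copied from Table 1 (DNA base -> RNA base of the third strand).
TABLE_1 = {"A": "U", "C": "U", "G": "C"}

# Minimum RNA length reported for triple helix formation in vitro (Kunkler et al.).
MIN_LENGTH = 19


def design_third_strand(dna_5to3: str) -> str:
    """Return the RNA third strand (5'->3') for a DNA target strand (5'->3').

    Assumption: both sequences are read in the same direction, position by
    position, as in the published RNA/DNA pair.
    """
    dna = dna_5to3.upper()
    unsupported = sorted(set(dna) - set(TABLE_1))
    if unsupported:
        raise ValueError(f"no Table 1 rule for base(s): {', '.join(unsupported)}")
    if len(dna) < MIN_LENGTH:
        raise ValueError(f"target is {len(dna)} nt; at least {MIN_LENGTH} expected")
    # Translate each DNA base into its third-strand partner.
    return "".join(TABLE_1[base] for base in dna)


if __name__ == "__main__":
    print(design_third_strand("AAAAAGAAAAGAAAAGAAAGAA"))
```

Applied to the published DNA sequence, the sketch reproduces the published RNA sequence, which is a quick consistency check on the rules for adenine and guanine.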
Background
The ability of RNA and DNA to form triple helices may be fascinating on its own, but how exactly can it be used to regulate gene expression? In recent years there has been much research on the CRISPRa/i system. It is based on designed guide RNAs that bind through complementary base pairing upstream of the promoter region.
Chen Dong et al. screened multiple transcription activators and showed that SoxS was the most effective. SoxS is part of the regulation of the superoxide response.
Design of the Experiment
Proof of concept - Recruitment of activator
For the first experiment, we wanted to find out whether the triple-helix-forming RNA sequence could replace dCas as the specific DNA-binding factor and whether it could bring a transcription activator into proximity of a reporter gene. Our approach was similar to the experiment that Chen Dong et al. conducted with a dCas.
The design comprises three expression cassettes. The first encodes SoxS fused to the MS2 coat protein. The second expresses the RNA, which consists of a region that can form the triple helix and an MS2 hairpin loop. The RNA sequence is flanked by hammerhead ribozymes. These ensure that the part of the RNA needed to form the triple helix and the MS2 hairpin is cut out, which prevents disruptive overhangs that could form unwanted secondary structures. The third expression cassette is a superfold GFP cassette with a weak constitutive promoter (P_J23109). The DNA sequence that can form a triple helix is inserted upstream of the promoter. The first and second “level 1” plasmids are combined into one regulatory “level 2” plasmid. The reporter plasmid is used directly as a “level 1” plasmid. The two plasmids were co-transformed into one cell.
The hypothesis is that, when expressed, the MS2 coat protein binds the MS2 hairpin. This leads to the assembly of the activator. The assembled construct can then bind to the sequence upstream of the superfold GFP promoter and bring SoxS into close proximity to it, which activates expression of the fluorescent protein (see Figure 2).
With this basic design, multiple assays are possible. Firstly, we wanted to test whether three different DNA and RNA pairs form a triple helix in vivo. One pair was shown by Kunkler et al. to fold into a triple helix in vitro.[1] The other two were designed as described above. Secondly, we wanted to determine the distance between the RNA binding site and the promoter that gives optimal activation. As suggested by the results of Bikard et al., we designed one promoter with the binding site 91 base pairs upstream of the transcription start site.
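To compare several distances, reporter variants can be generated in silico before cloning. The sketch below is only an illustration: the function name is hypothetical, and it assumes that the distance is counted from the 3’ end of the binding site to the transcription start site, a convention the text does not specify.

```python
def insert_binding_site(reporter: str, tss_index: int, site: str, distance: int):
    """Insert `site` so that its 3' end lies `distance` bp upstream of the TSS.

    `reporter` is the top strand (5'->3') and `tss_index` is the 0-based
    index of the transcription start site within it.
    Returns the new sequence and the shifted TSS index.
    Assumption: distance is measured from the 3' end of the site to the TSS.
    """
    insert_at = tss_index - distance
    if distance < 0 or insert_at < 0:
        raise ValueError("distance does not fit upstream of the TSS")
    new_sequence = reporter[:insert_at] + site + reporter[insert_at:]
    # Everything downstream of the insertion point moves by len(site).
    return new_sequence, tss_index + len(site)


if __name__ == "__main__":
    # Placeholder reporter sequence (not a real construct), TSS at index 100.
    placeholder = "C" * 100 + "G" * 20
    site = "AAAAAGAAAAGAAAAGAAAGAA"
    variant, tss = insert_binding_site(placeholder, 100, site, 91)
    print(len(variant), tss)
```

Calling the function with a list of distances yields one reporter variant per distance, which keeps the promoter and coding sequence identical across the panel.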
Regulation system
We designed our second experiment to further explore the possibilities that the modularity of this approach provides. We wanted a reporter gene (superfold GFP) to be activated or repressed by the addition of L-arabinose or lactose, respectively (see Figure 3). SoxS was used as the activator. We chose the KRAB protein as the repressor. Although KRAB has so far only been described as a transcription factor in mammals, we thought it was worth trying, since no repressors are known in E. coli that influence a promoter in the same manner as SoxS. KRAB is a large protein that should be able to block RNA polymerase from reaching the promoter.
KRAB was fused to the LambdaN RNA-binding protein and cloned into a “level 1” plasmid together with the SoxS-MS2 fusion protein. Because they were cloned into the same cassette, they were expressed under the same strong promoter. To separate the two transcription factors, a stop codon was placed after SoxS and an RBS was placed in front of LambdaN. In the reporter plasmid, a degradation tag was added after the superfold GFP and a medium-strength Anderson promoter (P_J23106) was used. These modifications were made to obtain a fast visible response when the superfold GFP is activated or repressed.
For this experiment, two RNAs that can form triple helices are needed. One includes the MS2 hairpin, the other includes the BoxB hairpin, which LambdaN can bind. The RNA with the MS2 hairpin is under control of the araC pBad promoter (BBa_K808000) and depends on L-arabinose. Expression of the RNA with BoxB is regulated by the lac promoter (BBa_R0010) and can be induced by allolactose. Both RNAs include identical triple-helix-forming sequences. As in the first experiment, the RNA coding sequence is flanked by hammerhead ribozymes. The expression cassette with the transcription factors and the two cassettes with the triple helices are cloned into one “level 2” plasmid. Cells are co-transformed with this “level 2” plasmid and the reporter plasmid. This experiment can be performed with different triple helices and with different distances between the RNA binding site and the transcription start site.
Due to COVID-19, we had access to a laboratory for only a very limited time. We were therefore unfortunately unable to obtain any results from these experiments.
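The intended behaviour of the regulation system can be summarised as a small truth table. This is a sketch of the hypothesis only, not of measured behaviour; the names `Prediction` and `predict` are illustrative. It follows the promoter assignment in the paragraph above (MS2 RNA under pBad, BoxB RNA under the lac promoter) and leaves the case with both inducers open, because both RNAs target the same DNA site and the outcome of that competition is not known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    ms2_rna: bool      # RNA with MS2 hairpin expressed (recruits SoxS-MS2)
    boxb_rna: bool     # RNA with BoxB hairpin expressed (recruits LambdaN-KRAB)
    expected_gfp: str  # hypothesised state of the superfold GFP reporter


def predict(arabinose: bool, lactose: bool) -> Prediction:
    """Hypothesised reporter state for a given combination of inducers."""
    ms2_rna = arabinose   # araC pBad promoter is induced by L-arabinose
    boxb_rna = lactose    # lac promoter is induced via allolactose
    if ms2_rna and boxb_rna:
        # Identical triple-helix sequences compete for one DNA site.
        outcome = "undetermined"
    elif ms2_rna:
        outcome = "activated"
    elif boxb_rna:
        outcome = "repressed"
    else:
        outcome = "basal"
    return Prediction(ms2_rna, boxb_rna, outcome)


if __name__ == "__main__":
    for ara in (False, True):
        for lac in (False, True):
            print(ara, lac, predict(ara, lac).expected_gfp)
```

The "undetermined" row corresponds to the sequential-exposure assay, where the cells see first one inducer and then the other.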
Measurement
Fluorescence intensity can be measured in a plate reader and indicates how strongly the superfold GFP is expressed. Changes in fluorescence allow activation and repression to be detected and their efficacies compared. By comparing the measurements with a positive control (superfold GFP expressed under a constitutive promoter) and a negative control (untransformed cells), we could determine whether the RNA forms a triple helix with the DNA. We could also identify which distance from the transcription start site is optimal. Furthermore, for the second experiment with two promoters, we could measure how quickly the system responds to the addition of allolactose or L-arabinose. Finally, we could check how the cells react when exposed first to one of the substances and then to the other.
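One simple way to put the controls to use is to subtract the negative control as background and express each sample relative to the positive control. The sketch below is one possible analysis, not a protocol we ran: the function name is hypothetical, and it assumes that all wells were read under the same settings and at comparable cell densities, since no density normalisation is included.

```python
from statistics import mean


def relative_expression(sample, negative, positive):
    """Background-corrected fluorescence relative to the positive control.

    Each argument is a list of replicate plate-reader readings.
    0 corresponds to untransformed cells, 1 to the constitutive control.
    Assumption: readings are directly comparable between wells.
    """
    background = mean(negative)
    span = mean(positive) - background
    if span <= 0:
        raise ValueError("positive control is not above background")
    return (mean(sample) - background) / span


if __name__ == "__main__":
    # Made-up readings for illustration only.
    print(relative_expression([300, 500], [100, 100], [900, 1100]))
```

Repeating the calculation for each time point after induction would give a response curve for the second experiment.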