Software
To design siRNAs more efficiently, we developed a set of Python scripts and assembled them into a program called ‘siRNA Designer’.
‘siRNA Designer’ begins with a preprocessing step. This step converts the raw data from the sequencing company (in .fasta format) into a more standardized form in which each record occupies two lines: one for the name and one for the sequence. We then follow a protocol to choose a 23-nt sequence from an mRNA sequence in the processed file. The mRNA sequences come from RNA-seq.
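The preprocessing step can be pictured with a minimal sketch like the one below. It is not the actual ‘siRNA Designer’ code; the function names `flatten_fasta` and `to_two_line` are hypothetical, and the sketch assumes the input is ordinary FASTA in which a sequence may be wrapped over several lines.

```python
def flatten_fasta(lines):
    """Yield (name, sequence) pairs, joining wrapped sequence lines.

    Hypothetical helper: assumes standard FASTA where each record
    starts with a '>' header line.
    """
    name, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def to_two_line(lines):
    """Return the records as text with one name line and one sequence line each."""
    out = []
    for name, seq in flatten_fasta(lines):
        out.append(">" + name)
        out.append(seq)
    return "\n".join(out) + "\n" if out else ""
```

For example, a record whose sequence is wrapped as `ATGC` and `GGTA` on two lines would come out as a single sequence line `ATGCGGTA` under its header.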
These are the principles we use to design the siRNA[1]:
1. The first base of the sense strand must be G or C.
2. The 19th base should be A or U.
3. The energy at the 5' end of the duplex should be higher than at the 3' end. That is, bases 15~19 of the sense strand contain at least three A or U bases, or bases 13~19 contain at least five A or U bases. This principle is the most important[2][3][4].
4. Avoid runs of several consecutive G or C bases.
5. The GC content is preferably between 30% and 52%.
6. The 16th base is preferably C, and the 13th base is preferably not G.
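The six principles above can be sketched as simple checks on a candidate sense strand. This is an illustration under stated assumptions, not the project's actual implementation: the function names are hypothetical, the input is assumed to be the 19-nt sense strand (how it is taken from the 23-nt window is not specified here), a "run" in rule 4 is read as consecutive bases that are each G or C, and the limit `max_gc_run=3` is an assumed threshold because the text gives none.

```python
def longest_gc_run(seq):
    """Length of the longest stretch of consecutive G/C bases."""
    best = run = 0
    for base in seq:
        run = run + 1 if base in "GC" else 0
        best = max(best, run)
    return best


def check_sirna_rules(sense, max_gc_run=3):
    """Return a dict of rule name -> bool for a 19-nt sense strand.

    max_gc_run is an assumed threshold; the text does not give one.
    """
    s = sense.upper().replace("T", "U")  # accept DNA or RNA letters
    if len(s) != 19 or set(s) - set("ACGU"):
        raise ValueError("expected a 19-nt sequence of A/C/G/U(T)")

    def au_count(segment):
        return sum(base in "AU" for base in segment)

    gc_percent = 100.0 * sum(base in "GC" for base in s) / len(s)
    return {
        "rule1_first_base_G_or_C": s[0] in "GC",
        "rule2_base19_A_or_U": s[18] in "AU",
        # bases 15-19 -> s[14:19], bases 13-19 -> s[12:19]
        "rule3_AU_rich_3prime": au_count(s[14:19]) >= 3 or au_count(s[12:19]) >= 5,
        "rule4_no_long_GC_run": longest_gc_run(s) <= max_gc_run,
        "rule5_GC_30_to_52": 30.0 <= gc_percent <= 52.0,
        "rule6_base16_C_base13_not_G": s[15] == "C" and s[12] != "G",
    }
```

Because rules 5 and 6 are stated as preferences, a caller might treat them as a score rather than a hard filter, while requiring rules 1 to 3.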
After all of these steps are done, we use BLAST to check whether each siRNA we designed is specific, that is, whether it also matches sequences from other species.
We select only the most specific sequence for the experiment, so that it does not affect other species.
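As a rough sketch of this selection step, the code below counts hits per candidate in BLAST tabular output and keeps the candidates with the fewest hits. It assumes the candidates were searched against a database of other species' sequences with something like `blastn -task blastn-short -query candidates.fasta -db other_species -outfmt 6` (the file and database names are placeholders), and that the default 12-column tabular layout is used. The function names and the thresholds `min_identity` and `min_length` are hypothetical and not taken from ‘siRNA Designer’.

```python
def count_offtarget_hits(tabular_text, candidates, min_identity=90.0, min_length=16):
    """Count hits per candidate in BLAST tabular (-outfmt 6) output.

    Default columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    The identity and length thresholds are assumed values.
    """
    hits = {name: 0 for name in candidates}  # candidates without hits stay at 0
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid, pident, length = fields[0], float(fields[2]), int(fields[3])
        if qseqid in hits and pident >= min_identity and length >= min_length:
            hits[qseqid] += 1
    return hits


def most_specific(hit_counts):
    """Return the candidate names with the fewest counted hits, sorted."""
    fewest = min(hit_counts.values())
    return sorted(name for name, count in hit_counts.items() if count == fewest)
```

Passing the full candidate list matters because a candidate with no hits at all does not appear in the BLAST output, yet it is exactly the kind of sequence this step is meant to keep.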
The software has been uploaded to GitHub.
[1] Elbashir S M, Harborth J, Weber K, Tuschl T. Analysis of gene function in somatic mammalian cells using small interfering RNAs. Methods, 2002, 26: 199~213.
[2] Hutvagner G. Small RNA asymmetry in RNAi: Function in RISC assembly and gene regulation. FEBS Lett, 2005, 579: 5850~5857.
[3] Amarzguioui M, Rossi J J, Kim D. Approaches for chemically synthesized siRNA and vector-mediated RNAi. FEBS Lett, 2005, 579: 5974~5981.
[4] Pancoska P, Moravek Z, Moll U M. Efficient RNA interference depends on global context of the target sequence: quantitative analysis of silencing efficiency using Eulerian graph representation of siRNA. Nucleic Acids Res, 2004, 32(4): 1469~1479.