Team:SDU-Denmark/Laboratory

Experimental Design Considerations

"Solid knowledge builds the foundation of great products."

Research into current tests for prostate cancer revealed that multiple biomarkers found in aggressive prostate cancer are also present in other types of prostate cancer. This makes it hard to distinguish between the different types and thereby contributes to the problem of overdiagnosis. However, cancer is known to arise from mutations in the genome, so mutated RNA or DNA can be used as biomarkers instead of proteins. This matters because mutations may affect expression so that no target protein is translated from the mRNA transcript. Additionally, these RNA and DNA biomarkers can be found in e.g. urine, as tumors grow uncontrollably and are known to die by necrosis, releasing their contents into the nearby tissue, bloodstream or urine.
To create a non-invasive test, PROSTATUS decided to test for biomarkers found in urine samples. However, to measure the presence of these biomarkers, a precise tool would be needed. The team behind PROSTATUS decided to utilize the properties of the CRISPR-Cas based systems “Specific High-sensitivity Enzymatic Reporter unLOCKing” (SHERLOCK) and “DNA Endonuclease Targeted CRISPR Trans Reporter” (DETECTR), as these are known to be highly specific and able to detect low levels of present biomarkers. The SHERLOCK system was utilized for recognition of RNA biomarkers, while DETECTR was utilized for recognition of DNA biomarkers [1].


SHERLOCK: Urine test – PMT
The Specific High-sensitivity Enzymatic Reporter unLOCKing (SHERLOCK) platform is a method that utilizes Cas13a, a CRISPR protein, to detect nucleic acids. Cas13a is isolated from the organism Leptotrichia wadei and targets single-stranded RNA (ssRNA). Cas13a combined with recombinase polymerase amplification (RPA) is sensitive enough to detect specific RNA down to 5 × 10^-18 mol/L [2].
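
To put this sensitivity in perspective, the concentration can be converted into a number of molecules per microliter using Avogadro's constant. The short calculation below is only an illustration; the function name is our own and is not part of the SHERLOCK protocol.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules_per_microliter(molar_concentration: float) -> float:
    """Convert a concentration in mol/L into molecules per microliter."""
    molecules_per_liter = molar_concentration * AVOGADRO
    return molecules_per_liter / 1e6  # 1 L = 1e6 microliters


detection_limit = 5e-18  # mol/L, the value quoted above
print(f"{molecules_per_microliter(detection_limit):.1f} molecules per microliter")
```

In other words, the quoted limit corresponds to roughly three target molecules in each microliter of sample.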

Cas13a is guided by sgRNA to detect a complementary ssRNA target. Once this occurs, Cas13a is activated and engages in collateral cleavage. This includes the cleavage of a reporter RNA, which releases a signal allowing for real-time detection of the target on a lateral flow strip [3].

Video of Cas protein activation as sgRNA and target sequence bind.

For PMT, we have chosen to combine the rapid detection of the SHERLOCK method with a lateral flow strip, as described in the section "Flow Strip Readout". To do this, we designed three guide RNAs (sgRNAs) that match the sequences of our three chosen PCa biomarkers found in urine: TMPRSS2:ERG, PCA3, and AMACR. Furthermore, a fully functional PROSTATUS test would also make use of the RPA method to amplify the RNA found in urine, as described under "Amplification".

DETECTR: Saliva test – CRAT
The DNA Endonuclease Targeted CRISPR Trans Reporter (DETECTR) system is a method that combines Cas9-like target specificity with an inducible, non-discriminatory ssDNA cleavage activity. The protein responsible is called Cas12a, or Cpf1, and it has gained worldwide attention because this inducible cleavage activity can be exploited as a tool for fast, specific, and flexible diagnosis of a plethora of diseases. Cas12a resembles Cas9 in that it is activated when the spacer of its sgRNA matches a DNA target following a PAM sequence, which is TTTV* in the case of Cas12a. Unlike Cas9, Cas12a does not need help from other ribonucleases to mature its sgRNA, as it has its own ribonuclease activity [4]. Once Cas12a has bound its target, its endonuclease activity towards ssDNA remains switched on. This activity is what we and many other groups use for detecting biomarkers in saliva. It can be coupled with any ssDNA reporter, such as those used in flow strips, to yield a fast and easily readable result.

For CRAT, we have designed sgRNAs to target two SNPs located in chromosome 8q24, rs16901979 and rs6983267, both of which have been shown to significantly increase the risk of developing prostate cancer [5]. The risk genotypes are (A;A) and (A;C) for rs16901979 and (G;G) and (G;T) for rs6983267. We have developed sgRNAs for both the risk and the non-risk allele. However, this process exposed a limitation of the DETECTR system: it depends on the previously mentioned PAM sequence. As the PAM is 4 nucleotides long, it adds to the specificity of target recognition, but it also makes it harder to target SNPs, as the SNP has to lie within ~21 nucleotides of the PAM, since this stretch constitutes the spacer sequence. If the SNP is not within reach of the spacer sequence, it is simply not possible to design an sgRNA that targets the SNP. It is also preferable for the SNP to be close to the PAM, as mismatches in the seed region are more deleterious to the endonuclease activity than mismatches further out in the spacer [6]. Despite these drawbacks, the DETECTR system still has great advantages. The Cas12a enzyme is very sensitive and has been reported to detect attomolar concentrations [7]. This helps ensure that even very small amounts of biomarker are detected, which matters because the amount of cells and genomic DNA in saliva samples can vary.

Cas12a is a powerful and highly sensitive protein. Its sensitivity, however, poses a problem: the read-out is an all-or-nothing response, which does not allow us to distinguish between people who are heterozygous and people who are homozygous for our target mutations. Once Cas12a is activated, it cleaves ssDNA in its surroundings non-specifically, regardless of the number of risk alleles in the sample.
There is, however, a way to get around this issue. For gain-of-function mutations, a strip that detects the risk allele will suffice, since a single risk allele results in a pathogenic phenotype. This is not the case for loss-of-function mutations, where a single risk allele still results in a healthy phenotype. We therefore contemplated creating an additional strip that detects the healthy allele, in order to detect heterozygosity.
A positive result for the healthy or the risk allele alone indicates a homozygous genotype, meaning that the user is either healthy or at risk, respectively. Positive results on both strips indicate a heterozygous genotype, and the user is therefore healthy. In this way, we can avoid diagnosing healthy heterozygous people as being at risk for PCa.
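
The two-strip logic for a loss-of-function mutation can be written out as a small decision function. This is a minimal sketch: the function name and the returned wording are our own, and the case where neither strip is positive is not covered in the text, so it is treated here as inconclusive by assumption.

```python
def interpret_two_strips(risk_strip_positive: bool, healthy_strip_positive: bool) -> str:
    """Map the results of the two strips to a genotype call (loss-of-function case)."""
    if risk_strip_positive and healthy_strip_positive:
        return "heterozygous: healthy"
    if risk_strip_positive:
        return "homozygous for the risk allele: at risk"
    if healthy_strip_positive:
        return "homozygous for the healthy allele: healthy"
    # Assumption: this case is not described in the text; treated as inconclusive.
    return "no allele detected: inconclusive"


for risk, healthy in [(True, True), (True, False), (False, True), (False, False)]:
    print(risk, healthy, "->", interpret_two_strips(risk, healthy))
```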

* V can be A, C or G


Single guide RNA (sgRNA) for the Cas12a and Cas13a systems consists of two parts: the scaffold and the spacer. The scaffold forms a complex with Cas and enables the spacer sequence to guide the Cas protein to the target biomarker by complementary binding. The sgRNAs of these two Cas enzymes are very similar, but small and important differences exist in both the scaffold sequence and their target.

General sgRNA considerations
Before designing sgRNAs for one's biomarkers, multiple factors should be taken into consideration. For our biomarkers, it was important to investigate sensitivity and specificity, as we wanted to develop a more precise test than those already found on the market. Regarding this, data on driver-mutations found in malignant prostate cancer were investigated.
To synthesize sgRNA in vitro, the DNA template should follow the layout 5’ - T7 promoter + scaffold + spacer - 3’. The RNA scaffold and spacer should be complementary to the DNA template. sgRNA transcribed by T7 RNA polymerase will then have the correct composition: 5’ - scaffold + spacer - 3’.
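
The template layout can be illustrated with a few lines of Python. The sequences are copied from the T7-Scaffold-ERG part (BBa_K3602002) listed on this page; the function names are hypothetical, and the transcript is a simplification that follows the composition stated above and ignores any promoter-derived nucleotides at the 5’ end of a real transcript.

```python
# Sequences copied from the T7-Scaffold-ERG part (BBa_K3602002) on this page.
T7_PROMOTER = "AAGCTAATACGACTCACTATAG"
CAS13A_SCAFFOLD = "GATTTAGACTACCCCAAAAACGAAGGGGACTAAAAC"
ERG_SPACER = "GCACAGTTCCTTCCCATCGATGTTCTGG"


def build_template(spacer: str) -> str:
    """Assemble the DNA template as 5' - T7 promoter + scaffold + spacer - 3'."""
    return T7_PROMOTER + CAS13A_SCAFFOLD + spacer.upper()


def expected_sgrna(template: str) -> str:
    """Return scaffold + spacer written as RNA (T replaced by U).

    Simplification: only the promoter is removed; promoter-derived
    nucleotides at the 5' end of a real transcript are ignored.
    """
    if not template.startswith(T7_PROMOTER):
        raise ValueError("template does not start with the T7 promoter")
    return template[len(T7_PROMOTER):].replace("T", "U")


template = build_template(ERG_SPACER)
sgrna = expected_sgrna(template)
print(len(template), "nt template")
print(sgrna)
```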

sgRNA for Cas12a
Many Cas proteins require a protospacer adjacent motif (PAM) sequence near their target site. These sequences allow the Cas proteins to recognize the site of interest in the nucleic acid target. If the spacer is complementary to a PAM-adjacent sequence, the Cas protein becomes activated. For Cas12a, the PAM must be TTTV, where V is any nucleotide except T, and must lie within 21 bp of the target sequence. Once activated, Cas12a gains unspecific deoxyribonuclease activity towards ssDNA and specific nuclease activity towards dsDNA. The spacer does not, however, always need to be completely complementary for Cas12a to be activated. Depending on the position of the mismatches, the protein can still become activated upon binding [16]. The first few nucleotides of the spacer are called the seed sequence. This sequence does not tolerate mismatches, so precise complementary binding there is important for highly selective enzymatic activation. The spacer should be around 21-24 nucleotides long [16].

An example of a sgRNA for a sequence of interest could be designed like this:

sgRNA for Cas12a
This sgRNA was designed for Cas12a by IDT [17]. Yellow: DNA target; Red: Spacer; Blue: PAM; Black: Scaffold; Green: SNP
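
The PAM constraint can be checked programmatically. The sketch below scans one strand for TTTV motifs and reports every 21-nucleotide spacer, starting directly after the PAM, that covers a given SNP position. The toy sequence, the function name, and the fixed spacer length of 21 are assumptions made for illustration; the opposite strand would have to be scanned separately via its reverse complement.

```python
import re

# Lookahead so that overlapping PAM candidates are all found.
PAM_PATTERN = re.compile(r"(?=(TTT[ACG]))")
SPACER_LENGTH = 21  # assumed spacer length, taken from the text above


def spacers_covering_snp(sequence: str, snp_index: int,
                         spacer_length: int = SPACER_LENGTH):
    """Return (pam_start, spacer, snp_offset) for each TTTV PAM whose spacer
    covers the SNP. Indices are 0-based; snp_offset is 1 for the nucleotide
    directly next to the PAM."""
    sequence = sequence.upper()
    hits = []
    for match in PAM_PATTERN.finditer(sequence):
        pam_start = match.start()
        spacer_start = pam_start + 4
        spacer_end = spacer_start + spacer_length
        if spacer_end > len(sequence):
            continue  # not enough sequence left for a full spacer
        if spacer_start <= snp_index < spacer_end:
            snp_offset = snp_index - spacer_start + 1
            hits.append((pam_start, sequence[spacer_start:spacer_end], snp_offset))
    return hits


# Hypothetical toy sequence, not a real genomic region.
toy_sequence = "GGCATTTACCGTAGCTAGGCTAACGTTAGCCATG"
print(spacers_covering_snp(toy_sequence, snp_index=12))
```

A small `snp_offset` means the SNP sits near the PAM, which is the preferred situation because mismatches close to the PAM are tolerated least.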

sgRNA for Cas13a
Cas13a does not rely on the presence of a PAM sequence to bind its complementary target. Therefore, the sgRNA design does not need to take a PAM into account. For Cas13a, the following scaffold sequence is utilized for sgRNA design [17]:

5’-GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC-3’

When designing the spacer sequences of our sgRNAs, we took into consideration that the optimal length is 28-31 nucleotides. Additionally, the rules regarding the seed sequence and mismatch tolerance for the Cas12a spacer also apply to the Cas13a spacer. Therefore, the only differences between the sgRNAs are the scaffold sequence and the fact that Cas13a gains unspecific ribonuclease activity towards ssRNA instead of deoxyribonuclease activity towards ssDNA [17]. For the scaffold sequence and further design of sgRNA for Cas13a, visit our contribution.

Multiple sgRNAs were designed for Cas13a:
T7-Scaffold-ERG (BBa_K3602002)
T7 promoter: AAGCTAATACGACTCACTATAG
Scaffold: GATTTAGACTACCCCAAAAACGAAGGGGACTAAAAC
Spacer: GCACAGTTCCTTCCCATCGATGTTCTGG

T7-Scaffold-rs6983267-G (BBa_K3602003)
T7 promoter: AAGCTAATACGACTCACTATAG
Scaffold: GATTTAGACTACCCCAAAAACGAAGGGGACTAAAAC
Spacer: TGCCATTCATCTGCTGAGCTCAAAGGAC

T7-Scaffold-rs6983267-T (BBa_K3602004)
T7 promoter: AAGCTAATACGACTCACTATAG
Scaffold: GATTTAGACTACCCCAAAAACGAAGGGGACTAAAAC
Spacer: TGACATTCATCTGCTGAGCTCAAAGGAC

For PROSTATUS to be able to isothermally detect very small amounts of biomarkers in a sample, the biomarkers are amplified through a process called recombinase polymerase amplification (RPA).

RPA is a rapid and specific process that can amplify input samples at the attomole level and can proceed at 25-42°C. The process works through utilization of a recombinase protein, a single-stranded DNA binding protein (SSB), and a strand-displacing DNA polymerase; for RNA targets, a reverse transcriptase is included as well [2]. In order to amplify not only cDNA but also the RNA of interest, a T7 RNA polymerase would be used as well.

The reverse transcriptase generates complementary DNA from the RNA biomarkers, while the recombinase pairs the primers with their matching sequences. From here, the SSBs bind and keep the DNA strands apart while the DNA polymerase extends the primers and the T7 RNA polymerase simultaneously transcribes the amplified cDNA to RNA, thus exponentially amplifying the RNA [8]. This not only increases the sensitivity of PROSTATUS, but also makes it accessible outside of the laboratory.

For designing the primers for the biomarkers PROSTATUS can target, the primer lengths should lie at around 30 nucleotides [2]. Moreover, the amplicon size must be 100-200 bp, containing the target sequence of the sgRNA and about 50-80 extra nucleotides before and after the target sequence for optimal primer design.

For future use, the primers listed in Table 1 can be used with PMT. To allow for T7 transcription, a T7 RNA polymerase promoter was added to the 5’ end of the forward primers.

Primer Name Sequence
ERG_F GAAATTAATACGACTCACTATAGGGACAGACCATGTGCGGCAGTGGCTGGAGTGG
ERG_R CTGGGGGTGAGCCTCTGG
AMACR_F GAAATTAATACGACTCACTATAGGGTTCTGTGCTATGGTCCTGGCTGACTTCGGGGC
AMACR_R CCTATGAATCAGGAGGCAGG
PCA3_F GAAATTAATACGACTCACTATAGGGGCTAAGGCAGGAGAATCTTGAACCCAGGAG
PCA3_R CCTATGAATCAGGAGGCAGG
rs6983267_F GAAATTAATACGACTCACTATAGGGGAGGGACGAATAAACTCTCCTCCTACCACT
rs6983267_R CCCCCACATAAAATAAAATAAAGT

Table 1: RPA primers designed specifically for Cas13a target biomarkers.
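
As a quick sanity check, the target-binding part of each primer can be separated from the T7 promoter and measured, so that the lengths can be compared with the guideline of about 30 nucleotides. The sketch below uses two primer pairs copied from Table 1; the helper name and the assumption that the shared 5’ stretch of the forward primers is the complete promoter addition are ours.

```python
# Shared 5' stretch of the forward primers in Table 1 (assumed to be the T7 addition).
T7_PREFIX = "GAAATTAATACGACTCACTATAGGG"

# Primer sequences copied from Table 1.
PRIMERS = {
    "ERG_F": "GAAATTAATACGACTCACTATAGGGACAGACCATGTGCGGCAGTGGCTGGAGTGG",
    "ERG_R": "CTGGGGGTGAGCCTCTGG",
    "rs6983267_F": "GAAATTAATACGACTCACTATAGGGGAGGGACGAATAAACTCTCCTCCTACCACT",
    "rs6983267_R": "CCCCCACATAAAATAAAATAAAGT",
}


def binding_region(primer: str) -> str:
    """Return the primer without the T7 prefix (unchanged if no prefix is present)."""
    if primer.startswith(T7_PREFIX):
        return primer[len(T7_PREFIX):]
    return primer


for name, sequence in PRIMERS.items():
    region = binding_region(sequence)
    has_t7 = sequence.startswith(T7_PREFIX)
    print(f"{name}: {len(region)} nt target-binding region, T7 promoter: {has_t7}")
```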


Urine and saliva have a lot of natural RNase activity as a part of the human immune system and from the resident bacteria. As our prostate malignancy test (PMT) is using induced RNase activity to detect our signal, any naturally occurring unspecific RNases would produce a false positive signal.

Therefore, we have to inactivate or remove any naturally occurring RNases before introducing Cas13a. When searching for a suitable method, we kept the use as a home test in mind. An RNase inhibitor seems like a great solution, but it is very expensive for a home kit, and it would need to be a specific inhibitor that does not inhibit Cas13a. Another solution is boiling the samples. However, a literature search for RNases in humans showed that saliva [9] and urine [10] contain RNase A, which is denatured at high temperatures but unfortunately refolds and resumes its activity after the samples cool down.

Therefore, complete hydrolysis of the natural RNases was the goal. For this, we wanted to see if the unspecific proteinase K could cleave the naturally occurring RNases. This seemed possible based on a BLAST search for human RNases, which showed a great number of serine residues. In fact, all the RNases contained serine residues, which are targeted by proteinase K. One of the most abundant RNases in urine is RNase 1 [11]. RNase 1 can be targeted with proteinase K as it has many serine residues (Figure 1), and proteinase K was therefore chosen for the PMT.

Figure 1. Canonical FASTA sequence of human RNase 1. Yellow highlight indicates cleavage sites for proteinase K.

A significant benefit of using proteinase K is that its activity can be inhibited cheaply and effectively, making it a reasonable choice for an at-home consumer kit. Proteinase K can be inhibited either by heat treatment or by the use of inhibitors. The heat treatment requires a temperature of about 95°C for 10 min [10]. This would not be a good customer experience, and it could also denature a significant amount of the mRNA biomarkers. Therefore, PROSTATUS uses phenylmethylsulphonyl fluoride (PMSF), which has an inhibitory capability of around 98%. PMSF is an irreversible and non-specific inhibitor of multiple proteinases and other enzymes, as it sulfonylates the hydroxyl groups of serine residues [12].


Pregnancy test strips
For the read-out of the test results, several possible methods were investigated. Among others, the possibility of using a commercial pregnancy test strip together with loop-mediated isothermal amplification (LAMP) was examined [13]. This method utilizes the binding of human chorionic gonadotropin (hCG) to the pregnancy test strip and the ability to block this binding with LAMP products in the case of a positive test [13].

Lateral flow strips
However, an even simpler method was found during research: the utilization of a Universal Lateral Flow Assay Kit from Milenia [14]. These lateral flow strips work by using the visible color of gold nanoparticles bound to an anti-fluorescein isothiocyanate (FITC) antibody, combined with a specialized reporter. The gold-particle conjugates stick to the reporter and make its position on the flow strip visible. The reporters are collected at two detection lines: a control line with biotin receptors and a test line with antibodies that bind the anti-FITC antibody on the gold particle (Figure 1) [14].

The reporter can be designed to detect different analytes. This would only require a biotin at one end, making it possible to bind the control line, and a FITC or fluorescein amidite (FAM) at the other end binding the gold-particle-conjugates thus visualizing the line [14].

Figure 1. Overview of the flow strip with the two detection lines, the reporter sequence and the gold-particle bound to anti-FAM antibody. Created with BioRender.com

When combining flow strips with CRISPR/Cas detection the reporter often consists of a biotin linked to a FAM by ssDNA or ssRNA. When the CRISPR/Cas system is activated the reporter will be cleaved and visualize the test line while non-activated CRISPR/Cas will result in intact reporters visualizing the control line [15].

Our reporter sequence links the biotin and FAM by ssRNA in PMT and by ssDNA in CRAT. Hereby the flow strips allow us to see if the biomarker was detected by the CRISPR/Cas system by visualising cleavage of the reporter.

In a negative test (Figure 2, left), the absence of biomarker leaves the CRISPR/Cas detector inactive and thus the reporter intact. The intact reporter travels to the biotin receptors and gives a visual read-out at the control line.

Figure 2. A negative read-out (left). When no biomarker is present, the Cas proteins are not activated, and the reporter stays intact. The reporter then only binds to the control line by the biotin receptor making the line visible. A positive read-out (right). When biomarker is present, the Cas proteins are activated, and the reporter cleaved. The biotin part of the reporter then binds to the control line, while the visible FAM-gold-conjugate binds to the test line making it visible. Created with BioRender.com.

In a positive test (Figure 2, right), the CRISPR/Cas detector recognizes the biomarkers and cleaves the reporters, releasing the biotin. The detached biotins and any uncleaved reporters bind to the control line, while the detached gold-particle conjugates pass through and bind to the test line. Depending on how much of the reporter was cleaved, the result is either a line at both positions or, with complete cleavage, a single line at the test line.


References

[1] Cecchetelli, A (2020), Finding Nucleic Acids with SHERLOCK and DETECTR, Addgene blog, visit: 20/10/20
[2] Kellner MJ, Koob JG, Gootenberg JS, Abudayyeh OO, Zhang F. SHERLOCK: nucleic acid detection with CRISPR nucleases. Nat Protoc. 2019;14(10):2986-3012.
[3] Gootenberg JS, Abudayyeh OO, Kellner MJ, Joung J, Collins JJ, Zhang F. Multiplexed and portable nucleic acid detection platform with Cas13, Cas12a, and Csm6. Science. 2018;360(6387):439-44.
[4] Chen JS, Ma E, Harrington LB, Da Costa M, Tian X, Palefsky JM, et al. CRISPR-Cas12a target binding unleashes indiscriminate single-stranded DNase activity. Science. 2018;360(6387):436
[5] Liang M, Li Z, Wang W, Liu J, Liu L, Zhu G, et al. A CRISPR-Cas12a-derived biosensing platform for the highly sensitive detection of diverse small molecules. Nat Commun. 2019;10(1):3672.
[6] Safari F, Zare K, Negahdaripour M, Barekati-Mowahed M, Ghasemi Y. CRISPR Cpf1 proteins: structure, function and implications for genome editing. Cell Biosci. 2019;9:36.
[7] Zheng SL, Sun J, Wiklund F, Smith S, Stattin P, Li G, et al. Cumulative Association of Five Genetic Variants with Prostate Cancer. New England Journal of Medicine. 2008;358(9):910-9.
[8] TwistDx™. Recombinase Polymerase Amplification, or RPA, is the breakthrough, isothermal replacement to PCR 2020. Available from: https://www.twistdx.co.uk/en/rpa.
[9] Sugiyama RH, Blank A, Dekker CA. Multiple ribonucleases of human urine. Biochemistry. 1981 Apr 14;20(8):2268-74. doi: 10.1021/bi00511a031. PMID: 7236598.
[10] Koczera, P., Martin, L., Marx, G., & Schuerholz, T. (2016). The Ribonuclease A Superfamily in Humans: Canonical RNases as the Buttress of Innate Immunity. International journal of molecular sciences, 17(8), 1278. https://doi.org/10.3390/ijms17081278
[11] The UniProt Consortium; UniProt: a worldwide hub of protein knowledge; Nucleic Acids Res. 47: D506-515 (2019); Accession number: P07998
[12] Sigma-Aldrich (2020). PMSF. From https://www.sigmaaldrich.com/catalog/product/roche/pmsfro?lang=en&region=DK&gclid=CjwKCAjwoc_8BRAcEiwAzJevtXZxp6d59_H3NvrjzoL6gcYDgztltSSrH-SE9eY_vf6jR6Jd9UW6gBoCxLsQAvD_BwE
[13] Du, Y., Pothukuchy, A., Gollihar, J. D., Nourani, A., Li, B., & Ellington, A. D. (2017). Coupling Sensitive Nucleic Acid Amplification with Commercial Pregnancy Test Strips. Angewandte Chemie (International ed. in English), 56(4), 992–996. https://doi.org/10.1002/anie.201609108
[14] Milenia Biotech (2019). HybriDetect - Universal Lateral Flow Assay Kit. From https://www.milenia-biotec.com/en/product/hybridetect/
[15] Breitbach, A (2020). Lateral Flow Readout for CRISPR/Cas-based detection strategies. From https://www.milenia-biotec.com/en/tips-lateral-flow-readouts-crispr-cas-strategies/
[16] Creutzburg, S. C. A., Wu, W. Y., Mohanraju, P., Swartjes, T., Alkan, F., Gorodkin, J., . . . van der Oost, J. (2020). Good guide, bad guide: spacer sequence-dependent cleavage efficiency of Cas12a. Nucleic Acids Res, 48(6), 3228-3243. doi:10.1093/nar/gkz1240
[17] Gootenberg JS, Abudayyeh OO, Lee JW, et al. Nucleic acid detection with CRISPR-Cas13a/C2c2. Science. 2017;356(6336):438-442. doi:10.1126/science.aam9321