Rational Design
Promare
Protein and Macromolecular Remodeler (PROMARE) is a protein design software package originally developed by the SJTU-BioX-Shanghai team. Promare integrates three interrelated modules: a feature extraction module, a mutation module and a design module.
Feature Extraction
We compared the structure of wild-type SpCas9 with that of xCas9, a variant obtained by the David R. Liu group through directed evolution with PACE [3]. Both protein structures were taken from ref. [4]. Compared with the commonly used wild-type SpCas9, xCas9 has higher specificity and broader PAM compatibility [3]. We therefore used Promare to compare the two structures and to find the key features related to the difference in their off-target effects.
For the Cas9 on/off-target effect, one of the most important features is the local structure around the DNA-sgRNA double helix and the protein residues close to it. The recognition area in Cas9 consists of the first 20 nucleotides of the sgRNA and the 20 DNA nucleotides that are followed by the NGG PAM.
We first selected all protein residues in close contact with these nucleotides, using a close-contact cutoff of 15 Å. The selected residues define the recognition region.
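The 15 Å selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the Promare implementation: the function name `residues_near`, the input layout (residue id plus coordinates) and the toy coordinates are assumptions made for the example, and a residue is counted as "close" if any of its atoms lies within the cutoff of any nucleotide atom.

```python
import math

CUTOFF = 15.0  # Å, the close-contact threshold quoted in the text


def residues_near(protein_atoms, nucleotide_atoms, cutoff=CUTOFF):
    """Return sorted residue ids having at least one atom within `cutoff`
    of any nucleotide atom.

    protein_atoms:    list of (residue_id, (x, y, z)) tuples
    nucleotide_atoms: list of (x, y, z) tuples
    """
    selected = set()
    for res_id, xyz in protein_atoms:
        if res_id in selected:
            continue  # residue already selected through another atom
        if any(math.dist(xyz, n) <= cutoff for n in nucleotide_atoms):
            selected.add(res_id)
    return sorted(selected)


# Toy coordinates (hypothetical, not taken from the SpCas9 or xCas9 structures)
protein = [
    (10, (0.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (30.0, 0.0, 0.0)),
    (12, (0.0, 14.0, 0.0)),
]
nucleotides = [(0.0, 0.0, 5.0)]

near = residues_near(protein, nucleotides)
print(near)
```

In practice the coordinates would be read from the structure files, and a spatial index would be faster than this brute-force loop for a protein of Cas9's size.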
The figure shows the location of the recognition area in SpCas9 and xCas9.
Comparing the distance maps and contact maps of SpCas9 and xCas9, we found that the contacts changed substantially between the two proteins, and that the recognition area differs more than other regions.
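For readers unfamiliar with these maps, the following sketch shows how a distance map and a contact map can be built and compared. It is an illustration under stated assumptions: the 8 Å contact cutoff, the function names and the toy coordinates are hypothetical and are not taken from Promare, and both structures are assumed to list the same residues in the same order.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distances between representative atoms."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(dmap, cutoff=8.0):
    """Boolean map: True where two residues are within `cutoff` Å.
    The 8 Å default is an assumed value for this example."""
    return [[d <= cutoff for d in row] for row in dmap]


def changed_contacts(cmap_a, cmap_b):
    """Residue index pairs (i < j) whose contact state differs."""
    n = len(cmap_a)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if cmap_a[i][j] != cmap_b[i][j]
    ]


# Toy coordinates, one representative atom per residue (hypothetical)
wild_type = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
variant = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (16.0, 0.0, 0.0)]

changed = changed_contacts(
    contact_map(distance_map(wild_type)),
    contact_map(distance_map(variant)),
)
print(changed)
```

Counting the changed pairs inside and outside the recognition region gives a simple way to check whether one region differs more than another.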
Interestingly, we found that the 8 mutation sites in xCas9 are widely distributed. They are not clustered near the recognition area: only 4 of them lie within the 15 Å range, and 1 of them is on the outer surface of the protein, far from the recognition area.
After analyzing the contacts and mutation sites, we focused on the cavity that the Cas9 protein provides for the base pairs in the recognition area. From molecular dynamics simulations, we know that off-target mismatches occupy more space, which means that a bigger cavity has a higher tolerance for off-target binding. To quantify the cavity size, we examined the atomic chemical environment around the phosphorus atom on the outer part of each nucleotide. By extracting the 10 protein atoms closest to each nucleotide, we can estimate the spacing between the nucleotide and the protein. The figure shows the difference between SpCas9 (blue) and xCas9 (orange) in the top 10 closest protein-nucleotide distances.
In this figure, xCas9 shows a clear improvement in the recognition-area cavity. The top 10 distances can therefore serve as a key feature for studying the off-target effect.
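The top 10 distance feature can be sketched as follows. This is a minimal example rather than Promare's own code: the function names, the dictionary layout keyed by nucleotide label and the toy coordinates are assumptions, and the smaller `k` in the usage example is used only to keep the output short.

```python
import heapq
import math


def closest_distances(phosphorus, protein_atoms, k=10):
    """Return the k smallest phosphorus-to-protein-atom distances, ascending."""
    return heapq.nsmallest(k, (math.dist(phosphorus, p) for p in protein_atoms))


def cavity_profile(phosphorus_atoms, protein_atoms, k=10):
    """Map each nucleotide label to its k closest protein-atom distances."""
    return {
        label: closest_distances(xyz, protein_atoms, k)
        for label, xyz in phosphorus_atoms.items()
    }


# Toy data (hypothetical): protein atoms spaced 1 Å apart along the x axis
protein_atoms = [(float(x), 0.0, 0.0) for x in range(1, 13)]
phosphorus_atoms = {"nt1": (0.0, 0.0, 0.0), "nt2": (13.0, 0.0, 0.0)}

profile = cavity_profile(phosphorus_atoms, protein_atoms, k=3)
print(profile)
```

Computing the same profile for two structures and comparing the values nucleotide by nucleotide gives the kind of comparison described above.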
Mutation and Design
The mutation step in Promare is based on Python and PyMOL [5]. Our Python script builds a PyMOL batch (.pml) file. After running the .pml file, users obtain a large set of mutant candidates for selection.
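As an illustration of this approach, the sketch below writes a .pml script that mutates one position to each of the 19 alternative amino acids. It is not the Promare script: the function name `build_mutation_pml`, the object and file names, and the example residue are hypothetical. The emitted commands follow the commonly used PyMOL mutagenesis-wizard pattern and should be checked against the installed PyMOL version before use.

```python
AMINO_ACIDS = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]


def build_mutation_pml(pdb_path, chain, resi, wild_type, prefix="mutant"):
    """Return the text of a PyMOL batch script that saves one PDB file
    per substitution at the given position (19 files in total)."""
    lines = [f"load {pdb_path}, wt"]
    for target in AMINO_ACIDS:
        if target == wild_type:
            continue  # skip the residue already present
        name = f"{prefix}_{chain}{resi}{target}"
        lines += [
            f"create {name}, wt",            # work on a copy of the wild type
            "wizard mutagenesis",
            "refresh_wizard",
            f'cmd.get_wizard().set_mode("{target}")',
            f'cmd.get_wizard().do_select("/{name}//{chain}/{resi}/")',
            "cmd.get_wizard().apply()",
            "set_wizard",                    # close the wizard
            f"save {name}.pdb, {name}",
            f"delete {name}",
        ]
    return "\n".join(lines) + "\n"


# Hypothetical input: chain A, residue 10, lysine in the wild type
script = build_mutation_pml("cas9.pdb", "A", 10, "LYS")
with open("mutate_A10.pml", "w") as handle:
    handle.write(script)
```

The resulting file can then be run in batch mode, for example with `pymol -cq mutate_A10.pml`. Looping the generator over every selected site produces the full candidate set.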
We built a mutation process that mutates every residue appearing in the top 10 distances. Within 2 hours of computation on a PC, we generated and evaluated 19 × 129 = 2451 mutants. The scoring function is as follows:
$$ \text{Score} = \Delta a + 0.3\times \Delta b + 0.1\times \Delta c $$
where a, b and c are the distances from the nucleotide to its first, second and third closest protein atoms, and Δ denotes the change in each distance upon mutation. Comparing all 2451 mutants, we found that only a small fraction of them differ from wild-type SpCas9.
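The scoring function can be written out as a short Python sketch. The weights come from the formula above. The sign convention (mutant minus wild type), the function names and the toy distances are assumptions made for illustration.

```python
WEIGHTS = (1.0, 0.3, 0.1)  # weights for the 1st, 2nd and 3rd closest atoms


def three_closest(distances):
    """Return the three smallest distances in ascending order."""
    return sorted(distances)[:3]


def score(wt_distances, mutant_distances):
    """Score = Δa + 0.3·Δb + 0.1·Δc, with Δ taken as mutant minus wild type
    (an assumed sign convention)."""
    wt = three_closest(wt_distances)
    mut = three_closest(mutant_distances)
    return sum(w * (m - o) for w, m, o in zip(WEIGHTS, mut, wt))


# Toy distances in Å (hypothetical, not measured values)
wt_distances = [5.0, 3.0, 4.0, 7.5]
mutant_distances = [6.0, 3.5, 4.0, 7.5]

example_score = score(wt_distances, mutant_distances)
n_mutants = 19 * 129  # 19 substitutions at each of 129 sites

print(round(example_score, 3), n_mutants)
```

Because the first-closest distance carries the largest weight, a mutation that moves the nearest protein atom dominates the score, while changes in the second and third distances act as smaller corrections.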
The following figure shows the scores of the 129 mutated sites; each site has 19 possible mutations.
From these mutated sites, we selected several for expression and tested them with our characterization and off-target detection system.
References
[1] Dutta, S., & Berman, H. M. (2005). Large macromolecular complexes in the Protein Data Bank: a status report. Structure, 13(3), 381-388.
[2] Neuman, N. (2016). The complex macromolecular complex. Trends in biochemical sciences, 41(1), 1-3.
[3] Hu, J. H., Miller, S. M., Geurts, M. H., Tang, W., Chen, L., Sun, N., ... & Liu, D. R. (2018). Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature, 556(7699), 57-63.
[4] Chen, W., Zhang, H., Zhang, Y., Wang, Y., Gan, J., & Ji, Q. (2019). Molecular basis for the PAM expansion and fidelity enhancement of an evolved Cas9 nuclease. PLoS biology, 17(10), e3000496.
[5] DeLano, W. L. (2002). PyMOL.