Team:SJTU-BioX-Shanghai/Model

Overview

Background

The CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 system delivers the Cas9 endonuclease, complexed with a synthetic guide RNA (gRNA)[1], into a cell, where it cuts the genome at the desired location. This gene-editing process[2] has a wide variety of applications, including basic biological research, development of biotechnology products, and treatment of diseases. However, off-target activity, and the resulting demand for higher target specificity, remains the largest obstacle to the wider adoption of gene-editing techniques based on CRISPR/Cas9.

Fortunately, researchers have continued to improve DNA specificity by mutating specific amino acids in Cas9. Because prior knowledge about the protein's structure and function is limited, directed evolution[3] has become the mainstream way to search for better Cas9 variants. Directed evolution is widely applied in enzyme engineering[4], and xCas9[5], Sniper-Cas9[6], high-fidelity SaCas9[7] and other evolved variants have achieved preliminary success. Furthermore, computational approaches are being combined with directed evolution[8] to give a form of semi-rational design, since high-performance computing can provide a deeper understanding of protein structure and function.

Questions

We put forward four principal questions on CRISPR/Cas9 and directed evolution. For each, we either offer an explanation or propose our own strategy for solving the problem.

Q1: How to search for off-target sites?

Several sites in the genome are highly similar to the target site. Can we build a robust classifier that picks out the actual off-target sites, that is, those that can be detected in experiments?
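As an illustration of what "highly similar to the target site" can mean in practice, here is a minimal sketch that lists candidate sites before any classifier is applied. It is not our screening tool: the function names, the NGG PAM requirement, the mismatch threshold and the forward-strand-only scan are all assumptions made for this example.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def candidate_sites(genome: str, target: str, max_mismatches: int = 3):
    """Yield (position, site, mismatches) for each window that is followed
    by an NGG PAM and differs from `target` in at most `max_mismatches` bases.

    Assumption: only the forward strand is scanned, to keep the sketch short.
    """
    n = len(target)
    # The last valid window must leave room for the 3-base PAM after it.
    for i in range(len(genome) - n - 2):
        site = genome[i:i + n]
        pam = genome[i + n:i + n + 3]
        if pam[1:] == "GG":  # "N" can be any base, so only check the "GG"
            d = hamming(site, target)
            if d <= max_mismatches:
                yield i, site, d


# Toy example with a shortened 10-nt target (real protospacers are ~20 nt).
toy_target = "ACGTACGTAC"
toy_genome = "TT" + "ACGTACGTAC" + "TGG" + "TT" + "ACGTACCTAC" + "AGG" + "TT"
hits = list(candidate_sites(toy_genome, toy_target, max_mismatches=2))
```

A classifier such as the one asked for in Q1 would then take these candidates as input and decide which of them are likely to be real, experimentally detectable off-target sites.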

Q2: How to obtain the off-target rate?

The off-target rate can be quantified in experiments, but few models simulate the off-target kinetics[9] and predict the off-target rate. Can we focus on the kinetic process of dCas9 binding and regulation?
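To make the idea of a kinetic description concrete, the sketch below treats guide-target hybridization as a one-dimensional Markov chain (a birth-death process): the complex either advances by one base pair or retreats by one, and we ask how likely it is to reach the fully paired state before falling off. This is a generic textbook construction and not necessarily the model on our Kinetics Model page; the function name and the two energy parameters (in units of k_BT) are hypothetical values chosen only for illustration.

```python
import math


def bound_probability(target: str, site: str,
                      match_gain: float = 0.5,
                      mismatch_penalty: float = 4.0) -> float:
    """Probability that hybridization completes before the complex unbinds.

    Assumptions (illustrative only):
    - sequences are written 5'->3' with the PAM at the 3' end, so pairing
      starts at the last base and proceeds towards the first;
    - at each step the backward/forward rate ratio is exp(-match_gain) for
      a match and exp(+mismatch_penalty) for a mismatch.
    """
    if len(target) != len(site):
        raise ValueError("target and site must have the same length")
    total = 1.0    # empty product, the j = 0 term of the sum
    product = 1.0  # running product of backward/forward ratios
    for t, s in zip(reversed(target), reversed(site)):
        ratio = math.exp(-match_gain) if t == s else math.exp(mismatch_penalty)
        product *= ratio
        total += product
    # Standard absorption probability of a birth-death chain started in
    # the first transient state.
    return 1.0 / total


on_target = "ACGTACGTACGTACGTACGT"
pam_distal_mm = "T" + on_target[1:]     # mismatch far from the PAM
pam_proximal_mm = on_target[:-1] + "A"  # mismatch next to the PAM

p_on = bound_probability(on_target, on_target)
p_distal = bound_probability(on_target, pam_distal_mm)
p_proximal = bound_probability(on_target, pam_proximal_mm)
```

Under these assumed parameters a PAM-proximal mismatch lowers the completion probability far more than a PAM-distal one, and the ratio `p_proximal / p_on` can be read as a relative off-target rate for that site.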

Q3: How to find key residues?

The interactions between amino acids and the target DNA strand or the sgRNA indicate how important each residue is to the binding function of the dCas9 protein[10]. Can we find these key residues with molecular dynamics simulation?

Q4: How to measure mutation distance?

Directed evolution involves DNA mutation, but we have observed that different single-base mutations occur at different frequencies. Can we provide a visual interpretation of mutation distance?
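One simple way to picture a mutation distance is a codon network in which two codons are connected when they differ by a single base, and the distance between two amino acids is the shortest path between any of their codons. The sketch below implements that unweighted version using the standard genetic code. It is an assumption-laden illustration: the function names are hypothetical, every single-base change is treated as equally likely (which ignores the frequency differences noted above), and paths are not allowed to pass through a stop codon.

```python
from collections import deque
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def neighbors(codon: str):
    """Yield the nine codons that differ from `codon` by exactly one base."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def mutation_distance(aa_from: str, aa_to: str):
    """Minimum number of single-base substitutions needed to turn some codon
    of `aa_from` into some codon of `aa_to` (breadth-first search)."""
    sources = [c for c, aa in CODON_TABLE.items() if aa == aa_from]
    dist = {c: 0 for c in sources}
    queue = deque(sources)
    while queue:
        codon = queue.popleft()
        if CODON_TABLE[codon] == aa_to:
            return dist[codon]
        for nxt in neighbors(codon):
            if nxt in dist:
                continue
            # Assumption: a stop codon may be the destination but never an
            # intermediate step, since it would truncate the protein.
            if CODON_TABLE[nxt] == "*" and aa_to != "*":
                continue
            dist[nxt] = dist[codon] + 1
            queue.append(nxt)
    return None  # unreachable under the constraints above


d_met_ile = mutation_distance("M", "I")  # ATG -> ATT/ATC/ATA
d_trp_phe = mutation_distance("W", "F")  # TGG -> TTG -> TTT
d_asp_trp = mutation_distance("D", "W")  # all three bases must change
```

Replacing the unit edge length with a weight derived from observed substitution frequencies would turn this into a weighted shortest-path problem, which is closer in spirit to the divergence described in Q4.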

Model logic

Directed evolution and the CRISPR/Cas9 system earned Frances H. Arnold (2018) and Emmanuelle Charpentier and Jennifer A. Doudna (2020), respectively, the Nobel Prize in Chemistry. Our experiments and model combine these two high-profile research areas and develop an unconventional logic for producing evolved Cas9 variants for a specific target and evaluating their off-target effects.

We collected cases of directed evolution to explore the feasibility of molecular dynamics and machine learning methods. We also summarized the CRISPR/Cas9-related models of all iGEM teams from 2017 to 2019. Meanwhile, machine learning (ML) is becoming an increasingly accepted and effective approach in biology, so we gathered representative applications of ML in synthetic biology from iGEM team wikis (see the model collection).

In short, we propose a pipeline that integrates machine learning and a Markov process for off-target prediction. Molecular dynamics and graph theory are applied to guide and interpret directed evolution and rational design.

The flow chart that depicts our design in modeling

We first improve a bioinformatics tool that screens potential off-target sites by machine learning, and feed the predicted sequences into the kinetic model to estimate the off-target rates. Mutations chosen according to the molecular dynamics results then drive directed evolution or rational design to obtain an evolved protein with higher DNA specificity. In addition, our graph model, a codon network, helps interpret the process of randomized mutation. Finally, we refit the kinetic model with the new off-target dataset and compare the energy parameters to reveal the improvement in performance of the evolved Cas9/dCas9.

List of our model pages

Prediction Model, Kinetics Model, Molecular Dynamics, Rational Design, Graphic Interpretation, Software Tools

References

[1] Jinek, M., Chyliński, K., Fonfara, I., Hauer, M., Doudna, J., & Charpentier, E. (2012). A Programmable Dual-RNA–Guided DNA Endonuclease in Adaptive Bacterial Immunity. Science, 337, 816-821.
[2] Cong, L., Ran, F., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P.D., Wu, X., Jiang, W., Marraffini, L.A., & Zhang, F. (2013). Multiplex Genome Engineering Using CRISPR/Cas Systems. Science, 339, 819-823.
[3] Arnold, F. (2018). Directed Evolution: Bringing New Chemistry to Life. Angewandte Chemie (International Ed. in English), 57, 4143-4148.
[4] Renata, H., Wang, Z., & Arnold, F. (2015). Expanding the enzyme universe: accessing non-natural reactions by mechanism-guided directed evolution. Angewandte Chemie, 54(11), 3351-3367.
[5] Hu, J., Miller, S.M., Geurts, M.H., Tang, W., Chen, L., Sun, N., Zeina, C.M., Gao, X., Rees, H., Lin, Z., & Liu, D. (2018). Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature, 556, 57-63.
[6] Lee, J.K., Jeong, E., Lee, J., Jung, M., Shin, E., Kim, Y., Lee, K., Jung, I., Kim, D., Kim, S., & Kim, J. (2018). Directed evolution of CRISPR-Cas9 to increase its specificity. Nature Communications, 9.
[7] Xie, H., Ge, X., Yang, F., Wang, B., Li, S., Duan, J., Lv, X., Cheng, C., Song, Z., Liu, C., Zhao, J., Zhang, Y., Wu, J., Gao, C., Zhang, J., & Gu, F. (2020). High-fidelity SaCas9 identified by directional screening in human cells. PLoS Biology, 18.
[8] Yang, K.K., Wu, Z., & Arnold, F. (2019). Machine-learning-guided directed evolution for protein engineering. Nature Methods, 1-8.
[9] Boyle, E., Andreasson, J.O., Chircus, L.M., Sternberg, S.H., Wu, M.J., Guegler, C.K., Doudna, J., & Greenleaf, W. (2017). High-throughput biochemical profiling reveals sequence determinants of dCas9 off-target binding and unbinding. Proceedings of the National Academy of Sciences, 114, 5461-5466.
[10] Ricci, C.G., Chen, J., Miao, Y., Jinek, M., Doudna, J., McCammon, J., & Palermo, G. (2019). Deciphering Off-Target Effects in CRISPR-Cas9 through Accelerated Molecular Dynamics. ACS Central Science, 5, 651-662.