Team:Peking/Description

Description

DNA storage has become a thriving research field in recent years. It uses synthetic DNA to store text documents, pictures and sound files, which can be read back when needed. DNA is an incredibly dense storage medium and therefore has a remarkably large capacity: one gram of DNA can store about two petabytes, equivalent to about three million CDs. DNA is also stable enough to preserve information for thousands of years and, unlike traditional storage media such as hard disks and magnetic tape, it does not need frequent maintenance. Moreover, as far as reading is concerned, DNA storage does not raise compatibility issues. At present, the main challenges of DNA storage technology are the cost and speed of reading and writing.

The study of DNA storage and coding will help diversify DNA coding schemes so that they can be used in more fields. Various coding methods already exist, such as the simple coding scheme proposed by Church, the Huffman coding scheme, and coding schemes based on fountain codes. Researchers who developed new coding rules to store images have also achieved fruitful results.
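To make the idea of a coding scheme concrete, here is a minimal sketch in the spirit of the simple one-bit-per-base scheme, in which a 0 bit is written as A or C and a 1 bit as G or T. The function names and the random choice between the two candidate bases are assumptions made for illustration. A real scheme would use this freedom to avoid problematic sequences such as long homopolymer runs, which this sketch does not enforce.

```python
import random

# Assumed mapping for illustration: bit 0 -> A or C, bit 1 -> G or T.
ZERO_BASES = "AC"
ONE_BASES = "GT"

def text_to_bits(text):
    """Turn a string into a bit string, 8 bits per UTF-8 byte."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))

def bits_to_text(bits):
    """Inverse of text_to_bits; expects a multiple of 8 bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")

def encode(text, rng=None):
    """Write each bit as one base, picking one of its two allowed bases."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    return "".join(
        rng.choice(ONE_BASES if bit == "1" else ZERO_BASES)
        for bit in text_to_bits(text)
    )

def decode(dna):
    """Read each base back as one bit and rebuild the text."""
    bits = "".join("1" if base in ONE_BASES else "0" for base in dna)
    return bits_to_text(bits)

if __name__ == "__main__":
    strand = encode("iGEM")
    print(strand)
    print(decode(strand))
```

Because two bases map to the same bit, many different strands decode to the same message, so a substitution within a pair (A to C, or G to T) leaves the decoded message unchanged.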

DNA storage sparks new ideas, and the prospect of combining genes and art is ready to take off. With this in mind, we are curious about the following questions:

Can we create a set of coding rules with robustness, universality, simplicity and error correction to store music and decode it?

Let us stretch our imagination a little. Is it possible to "grow" the music itself through DNA mutation and select the most moving music among the "descendants"?

How do we select appropriate DNA mutation scenarios? How do we judge the "beauty" of music?

We don't want to give up the idea that DNA can encode pictures and videos, but how can we relate it to music coding?

After months of effort, we now give positive answers to all these questions.

The goal of our project is to realize the storage, evolution and visualization of music based on DNA storage and gene editing technology. The project is divided into the following five steps:

1. Coding: Existing coding methods are easy to operate, resist frameshift mutations and have good degeneracy. However, they need many gRNA sequences and have poor directivity. Our new coding rule is based on an instruction set (see Design). Compared with the previous solutions, it requires a smaller gRNA pool and provides flexibility in combining anchor/guide sequences with coding sequences.

2. Mutation: There are two ways of creating mutations. One is in silico, simulating mutations in DNA sequences on a computer; the other is in vivo, introducing mutations directly into the E. coli genome. For the latter, we use two mutation systems: a base editor system based on dCas9 for directed mutation and the EvolvR system based on nCas9 for random mutation.

3. Decoding: We decode the mutated sequences according to the coding rules.

4. Scoring: We use a fitness function to judge the "beauty" of the music. It involves transforming the tonality, dividing the piece into bars and phrases, scoring each single bar, scoring the connections between bars, scoring the connections between phrases, and computing the final score.

5. Visualization: We draw inspiration from the Mandelbrot set and Julia sets, optimize images based on the Koch curve, and combine music rules with visual rules to produce artistic images and videos.
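The in silico route through steps 2 to 4 (mutate, decode, score, select) can be sketched as a small evolutionary loop. Everything in this sketch is a stand-in chosen for illustration and is not the project's actual method: `decode_toy` maps each pair of bases to a pitch instead of using the instruction-set coding rule, and `fitness_toy` only rewards notes in the C major scale and penalizes large leaps, instead of the bar and phrase scoring described above. Function names, parameters and default values are hypothetical.

```python
import random

BASES = "ACGT"
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}  # pitch classes of the C major scale

def mutate(dna, rate, rng):
    """Replace each base by a different base with probability `rate`."""
    out = []
    for base in dna:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def decode_toy(dna):
    """Hypothetical decoding: each pair of bases becomes one MIDI pitch (60..75)."""
    notes = []
    for i in range(0, len(dna) - 1, 2):
        value = BASES.index(dna[i]) * 4 + BASES.index(dna[i + 1])
        notes.append(60 + value)
    return notes

def fitness_toy(notes):
    """Toy score: share of in-scale notes minus a penalty for leaps above a fifth."""
    if not notes:
        return 0.0
    in_scale = sum((n % 12) in C_MAJOR for n in notes) / len(notes)
    leaps = sum(abs(b - a) > 7 for a, b in zip(notes, notes[1:]))
    leap_penalty = leaps / max(len(notes) - 1, 1)
    return in_scale - 0.5 * leap_penalty

def evolve(seed_dna, generations=20, offspring=30, rate=0.05, rng=None):
    """Keep the best-scoring sequence seen so far across mutated generations."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    best = seed_dna
    best_score = fitness_toy(decode_toy(best))
    for _ in range(generations):
        children = [mutate(best, rate, rng) for _ in range(offspring)]
        for child in children:
            score = fitness_toy(decode_toy(child))
            if score > best_score:
                best, best_score = child, score
    return best, best_score

if __name__ == "__main__":
    seed = "ACGTTGCAAGCTTCGAACGTTGCA"
    winner, score = evolve(seed)
    print(seed, fitness_toy(decode_toy(seed)))
    print(winner, score)
```

Since the loop only ever replaces the current best with a strictly better child, the score cannot decrease from one generation to the next. Swapping in a different decoder or fitness function changes what counts as "better" without touching the loop itself.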