Team:Tsinghua-A/Background


Background

Digital Storage

Modern computers use semiconductor units (mainly made of silicon) to store information. The basic principle is binary coding, which uses a sequence of 0s and 1s to represent specific things. For example, a piece of text can be written as such a sequence using a coding standard called ASCII.

There are also other standards for coding texts, images, audio, and everything else we have on computers.
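As an illustration of ASCII coding, the short Python sketch below writes each character of a text as an 8-bit binary code. The function name `text_to_bits` is made up for this example, and the sample text “iGem 2020” is borrowed from the DNA example later on this page.

```python
def text_to_bits(text: str) -> str:
    """Return the 8-bit ASCII code of each character, separated by spaces."""
    # str.encode("ascii") yields one byte (0-127) per character.
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))


if __name__ == "__main__":
    print(text_to_bits("iGem 2020"))
    # 01101001 01000111 01100101 01101101 00100000 00110010 00110000 00110010 00110000
```

Each group of eight digits stands for one character, so the nine characters of the sample text take 72 bits.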

DNA storage

DNA is a molecule composed of two polynucleotide chains that coil around each other to form a double helix. It carries the genetic instructions for the development, functioning, growth, and reproduction of all known organisms. The basic component of DNA is the nucleotide. There are four types of nucleotides, distinguished by the nucleobase attached to the molecule. The four types are named after the abbreviations of their nucleobases: A, T, C and G.

Just as we do with 0 and 1, we can arrange these nucleotides according to a specific rule so that a sequence of nucleotides represents certain information. The text “iGem 2020”, for example, could be encoded as a DNA sequence in this way.
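One possible rule for the “iGem 2020” example can be sketched in Python. The mapping used here (00 to A, 01 to C, 10 to G, 11 to T, applied to the ASCII bits of the text) is only an assumption chosen for illustration; it is not necessarily the rule used in this project, and practical schemes add further constraints. The names `BITS_TO_BASE`, `text_to_dna` and `dna_to_text` are hypothetical.

```python
# Assumed illustrative mapping: every pair of bits becomes one nucleotide.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def text_to_dna(text: str) -> str:
    """Encode ASCII text as a nucleotide sequence, two bits per base."""
    bits = "".join(format(byte, "08b") for byte in text.encode("ascii"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna: str) -> str:
    """Decode a nucleotide sequence produced by text_to_dna back into text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


if __name__ == "__main__":
    sequence = text_to_dna("iGem 2020")
    print(sequence)               # CGGCCACTCGCCCGTCAGAAATAGATAAATAGATAA
    print(dna_to_text(sequence))  # iGem 2020
```

Under this assumed mapping each character takes four nucleotides, so the nine characters become a sequence of 36 nucleotides.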

Advantages of DNA storage

Compared to digital storage, DNA storage has several advantages:

  • Higher information density

    Compared with coding using only 0 and 1, having four possible components at each position lets a single nucleotide carry up to two bits, which provides a much higher information density.

  • Longer retention period

    Due to the physical structure of semiconductors, the data are likely to become unrecoverable after a few decades. This problem is not likely to be solved soon. DNA provides much better stability if treated properly. It can be used to store information for centuries without losing data.

  • Much less space and weight

    Compared to semiconductor storage units, DNA molecules are much lighter. It has been estimated that only 1 kg of DNA is needed to store all the information on Earth.

About Medical Data

Medical data is data generated in the process of diagnosis or examination. It is growing explosively every day. What's more, it is closely tied to a person’s health and privacy. So, we urgently need to establish a large medical database in order to realize the scientific management of hospitals and improve the level of medical services.

Why use DNA to store medical data?


  •   As is known, DNA storage significantly outperforms its conventional counterparts in terms of density and stability. Nowadays, medical data is generated at an unprecedented speed. The problem of how to store all this information has already appeared and will become more serious in the near future. In addition, the storage of medical data requires high stability, as a person’s previous records could benefit not only his/her treatment even decades later, but also his/her offspring due to heredity.

  •   After the treatment, the chances that the medical data will be used again are small overall. This makes it cold data, for which DNA storage is more commonly used because of its relatively high cost during restoration.

  •   With the rapid development of technology, medical history records have begun to show their unique value and potential, revolutionizing both medical research and clinical treatment. Data analysis, modeling, and machine learning methods, which depend heavily on the dataset, have proven to be powerful tools in many research areas. With more detailed information, a patient can be treated more effectively, which is exactly the goal of precision medicine.

  •   Given these features of medical data and of DNA storage, DNA sequences offer a promising paradigm for future medical data storage.