Team:SYSU-Software/Description


What is Maloadis?

Maloadis stands for Machine Learning based Optimization and Automated Design Platform with Image Search.

It is an integrated design platform that addresses various black-box problems encountered from designing to building a genetic circuit.

Features


Automated Design

• Automatically design genetic circuit structures based on desired functions

• Autofill devices and parts for the circuit structure

• Regulate gene expression with transcription factors and promoters

• TF-Binding sites affinity prediction


Image Search

• Recognize genetic parts' information

• Extract circuit structure information

• Match with related genetic circuits


Parameter Optimization

• Applicable to many kinds of systems and easy to operate

• Shortens the experimental cycle

• Makes full use of experimental results


Simulation

• One-click simulation of gene circuits constructed by users

• Circuit design optimization based on evolutionary algorithms

Why Maloadis?

Project background

Designing and implementing a genetic circuit top-down can be a long process. The design challenge lies in the complexity and variety of genetic circuit structures, which require synthetic biologists to spend a lot of time on research and on considering all the possibilities. In addition, experiments may not yield ideal results the first time, and repeated trial and error is often unavoidable in the lab.

Motivation

Of course, the traditional way of designing a genetic circuit works, but we want to do it faster and better. The complexity of biological systems and the mounting volume of data may be overwhelming for the human brain to process, but both can be handled with computer algorithms.
When combined with computer science, synthetic biology design can be taken to a new level. Computer programs can exploit the massive amount of existing data and reduce redundancy in the engineering procedure, moving genetic circuit design towards automation. We present our software, Maloadis, as a de novo approach to facilitating synthetic biology design automation.

Workflow

An overview of Maloadis's workflow

The process of genetic circuit construction can be summarized as Design-Build-Test-Learn (DBTL)[1]. During the design stage, users can use the automated design platform and image search on Maloadis. Maloadis can then simulate the genetic circuit to help users predict its expression. After users build the designed circuit in the lab, the simulation result can also be used to test whether the lab result reaches the theoretical value. To further improve the circuit design, users can get experiment suggestions from the parameter optimization function on Maloadis, which helps them learn from and optimize lab results.

Automated Design

Automated design can be divided into three procedures in Maloadis.

1. GeneNet: designing the structure

We use the GeneNet algorithm[2] to design the genetic circuit structure as follows:

First, users choose a target function describing how the desired gene is expressed over time (for example, the oscillating expression in the left picture below).

Then, the GeneNet algorithm provides the topological structure that regulates the target gene (marked in red in the picture to the right below) and controls its expression so that it matches the target function.

For more details of how this algorithm works, check our Modelling page of GeneNet.
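
As a rough illustration of the idea of matching a target function, the sketch below simulates a small network whose interactions are given by a weight matrix and measures how closely one gene follows a desired time course. The model form (sigmoidal regulation with linear decay), the names `simulate` and `mismatch`, and all numbers are assumptions chosen for illustration; they are not taken from our implementation, and the search over weights that the algorithm performs is not shown.

```python
import math

def simulate(weights, y0, steps=200, dt=0.05, decay=1.0):
    """Euler-integrate dy_i/dt = sigmoid(sum_j W[i][j] * y_j) - decay * y_i."""
    y = list(y0)
    trajectory = [list(y)]
    for _ in range(steps):
        # Net regulatory input to each gene from all genes in the network.
        drive = [sum(w * yj for w, yj in zip(row, y)) for row in weights]
        y = [yi + dt * (1.0 / (1.0 + math.exp(-d)) - decay * yi)
             for yi, d in zip(y, drive)]
        trajectory.append(list(y))
    return trajectory

def mismatch(trajectory, target, gene=0):
    """Mean squared error between one gene's time course and the target."""
    values = [state[gene] for state in trajectory]
    return sum((v - t) ** 2 for v, t in zip(values, target)) / len(target)

# Hypothetical target: the gene of interest should stay at a level of 0.5.
target = [0.5] * 201

# Two candidate topologies (assumed values): no regulation vs. self-activation.
flat = simulate([[0.0, 0.0], [0.0, 0.0]], [0.5, 0.5])
self_activating = simulate([[5.0, 0.0], [0.0, 0.0]], [0.5, 0.5])
```

A lower `mismatch` value means the candidate topology follows the target more closely, so here the unregulated network would be preferred over the self-activating one.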

2. Autofill: fill in the genetic circuit structure with genetic devices

After the GeneNet algorithm designs the structure of the genetic circuit, the next step is to find suitable promoters and transcription factors that can implement this design.

The Autofill algorithm automatically searches the database for a set of transcription factors and promoters according to their interaction types (promoting or inhibiting), and rates these genetic devices with our scoring mechanism.

Scoring mechanism of Autofill: Devices with the following characteristics will be scored higher:
  • Was used in a larger number of published PubMed articles
  • Is shorter in length
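
The two criteria above can be sketched as a simple ranking function. The exact formula is not given here, so the weighted sum below, the `Device` fields, the weights, and the example devices are all hypothetical; the sketch only shows that more PubMed usage and a shorter length both raise the score.

```python
import math
from dataclasses import dataclass

@dataclass
class Device:
    name: str          # hypothetical part identifier
    pubmed_count: int  # number of published PubMed articles using the device
    length_bp: int     # sequence length in base pairs

def score(device, w_citations=1.0, w_length=1.0):
    """Higher for well-cited, shorter devices (weights are illustrative)."""
    citation_term = math.log1p(device.pubmed_count)  # dampen very large counts
    length_term = 1000.0 / device.length_bp          # shorter gives a larger term
    return w_citations * citation_term + w_length * length_term

def rank(devices):
    """Return devices sorted from best to worst score."""
    return sorted(devices, key=score, reverse=True)

candidates = [
    Device("tf_a", pubmed_count=120, length_bp=600),
    Device("tf_b", pubmed_count=5, length_bp=600),
    Device("tf_c", pubmed_count=120, length_bp=1500),
]
best = rank(candidates)[0]
```

With these made-up numbers, `tf_a` ranks first because it is both widely used and short.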

Finally, users get a complete design of the genetic circuit, which can then be built in the lab.

Learn more about our Autofill Algorithm on our modelling page Autofill.

3. TF-Binding sites affinity prediction

Even after users construct functional circuits with the GeneNet and Autofill algorithms, something is still missing. The topological structure calculated by GeneNet gives not only the qualitative regulatory relationship of each gene part but, more importantly, quantitative activation parameters as well. We therefore designed a deep learning algorithm trained on ChIP-seq data sets to predict the affinity between transcription factors and binding sites, and we attempt to use this affinity information to give users a reference for the activation parameters.
See the algorithm on TF-Binding.
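
The network architecture is not described on this page, so the following is only a sketch of a common first step in sequence-based models: one-hot encoding a DNA sequence and sliding a single filter over it, keeping the best match. The filter here is a hand-written, hypothetical "TATA" pattern rather than a learned one, and the function names are assumptions.

```python
BASES = "ACGT"

def one_hot(sequence):
    """Encode a DNA string as a list of 4-element 0/1 vectors (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES]
            for base in sequence.upper()]

def scan(sequence, motif_filter):
    """Slide a (width x 4) filter along the sequence; return the best score.

    Assumes the sequence is at least as long as the filter.
    """
    encoded = one_hot(sequence)
    width = len(motif_filter)
    scores = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        scores.append(sum(x * w
                          for pos, wpos in zip(window, motif_filter)
                          for x, w in zip(pos, wpos)))
    return max(scores)

# Hypothetical filter that rewards the 4-base pattern "TATA".
tata_filter = one_hot("TATA")

strong_site = scan("GGTATAGG", tata_filter)  # contains the full pattern
weak_site = scan("GGGGGGGG", tata_filter)    # contains none of it
```

In a trained model, many such filters would be learned from ChIP-seq data and combined by later layers into an affinity prediction.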

Image Search

The quickest way to understand a synthetic biology design is to look at its genetic circuit diagram. However, traditional text-based retrieval requires many keywords, which is tedious. Meanwhile, current Google image recognition and biomedical image search engines rely on content-based image retrieval (matching an image's color or shape), which ignores the parts' information and their relationships in a genetic circuit.

Thus, we designed an image-centered search engine to help synthetic biologists search for genetic circuits efficiently and accurately. We select circuit diagrams from articles with YOLO v4[3,4] (You Only Look Once), and use OpenCV[5] (Computer Vision) and OCR (Optical Character Recognition) to extract parts information. Using this information, we build a database following the SBOL (Synthetic Biology Open Language) standard. To provide users with multi-dimensional, accurate and extensible output, we employ an image search algorithm that matches the component and structure information of the input genetic circuit against designs in the database. To make full use of this algorithm, our image search module obtains thousands of articles from ACS Synthetic Biology. The circuit diagram of every paper is processed, and the extracted circuit information is stored in our database in SBOL format.

Below is a flowchart showing how our Image Search function works.

Further details on image search algorithm.
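
To make the matching step more concrete, here is a minimal sketch that ranks stored circuits by how much their parts and interactions overlap with a query. The Jaccard measure, the equal weighting, the record layout, and the example records are assumptions for illustration and not the matching rule used by Maloadis.

```python
def jaccard(a, b):
    """Overlap of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def similarity(query, record, part_weight=0.5):
    """Combine component overlap and structure (interaction) overlap."""
    parts = jaccard(query["parts"], record["parts"])
    edges = jaccard(query["interactions"], record["interactions"])
    return part_weight * parts + (1 - part_weight) * edges

def search(query, database, top_k=3):
    """Return the top_k records most similar to the query."""
    ranked = sorted(database, key=lambda r: similarity(query, r), reverse=True)
    return ranked[:top_k]

# Hypothetical query extracted from an input image.
query = {
    "parts": ["pLac", "tetR", "pTet", "gfp"],
    "interactions": [("tetR", "pTet", "repress")],
}
# Hypothetical database records.
database = [
    {"id": "rec3", "parts": ["pBad", "araC"],
     "interactions": [("araC", "pBad", "activate")]},
    {"id": "rec1", "parts": ["pLac", "tetR", "pTet", "gfp"],
     "interactions": [("tetR", "pTet", "repress")]},
    {"id": "rec2", "parts": ["pLac", "lacI", "gfp"],
     "interactions": [("lacI", "pLac", "repress")]},
]
results = [record["id"] for record in search(query, database)]
```

Records that share both parts and regulatory relationships with the query rise to the top, while records that share only some parts rank lower.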

Simulation

To help researchers better understand how the gene circuit they created works, we built a series of models to simulate the dynamic behavior of the genetic system. Most biochemical processes consist of activation or repression, so we constructed activation and inhibition formulas based on the Hill equation. After the user constructs a gene circuit in the designer, our software automatically creates the corresponding ODE system.

For more information, see our Simulation Model page.
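
As an illustration of Hill-based formulas, the sketch below defines the standard activation and repression forms of the Hill equation and integrates a one-gene ODE with the Euler method. The single-gene system, the parameter names (`beta`, `gamma`, `k`, `n`), and their values are assumptions for this example, not the models generated by the software.

```python
def hill_activation(x, k, n):
    """Fraction of maximal expression driven by an activator at level x."""
    return x ** n / (k ** n + x ** n)

def hill_repression(x, k, n):
    """Fraction of maximal expression remaining under a repressor at level x."""
    return k ** n / (k ** n + x ** n)

def simulate_repressed_gene(repressor, beta=10.0, gamma=1.0, k=1.0, n=2,
                            y0=0.0, dt=0.01, steps=2000):
    """Euler-integrate dy/dt = beta * hill_repression(repressor) - gamma * y."""
    y = y0
    for _ in range(steps):
        y += dt * (beta * hill_repression(repressor, k, n) - gamma * y)
    return y

# Protein level after 20 time units with and without repressor present.
unrepressed = simulate_repressed_gene(repressor=0.0)
repressed = simulate_repressed_gene(repressor=3.0)
```

At steady state the protein level is `beta * hill_repression(...) / gamma`, so with these assumed values the unrepressed gene settles near 10 and the repressed one near 1.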

Parameter Optimization

After designing a gene circuit, how can we tune it to reach the best performance?

Most of the time, what troubles us in the lab is not knowing which parameters to change to achieve ideal results. In Maloadis, we use a top-down method, the Bayesian Optimization Algorithm[6], so that a lack of detailed mechanistic knowledge will not stop you from achieving good results.

The blue line represents the function of the system we are exploring. The turquoise area represents the sampling area of the Gaussian process; in other words, the region in which the system function may lie. The red points represent points that have been tested.

In every round, the Bayesian Optimization algorithm returns some points for users to test, namely those with the highest probability of achieving the best results. After users conduct the corresponding experiments, the algorithm learns from the results and updates the sampling area. The sampling area gets smaller round by round, and the possible "good result" regions are tested one by one, which means the algorithm is leading us to the optimized solution.
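
One round of this loop can be sketched with a small Gaussian process written in plain Python. The RBF kernel, the zero-mean prior, the upper-confidence-bound rule used to pick the next point, and all numbers are assumptions for illustration; the acquisition rule actually used by Maloadis is not stated here.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row in reversed(range(n)):
        known = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x

def posterior(xs, ys, x_new, noise=1e-6):
    """GP posterior mean and variance at x_new given tested points (xs, ys)."""
    gram = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
            for i, a in enumerate(xs)]
    k_new = [rbf(a, x_new) for a in xs]
    alpha = solve(gram, ys)
    beta = solve(gram, k_new)
    mean = sum(k * a for k, a in zip(k_new, alpha))
    var = max(rbf(x_new, x_new) - sum(k * b for k, b in zip(k_new, beta)), 0.0)
    return mean, var

def suggest(xs, ys, candidates, kappa=2.0):
    """Pick the candidate with the highest upper confidence bound."""
    def ucb(x):
        mean, var = posterior(xs, ys, x)
        return mean + kappa * math.sqrt(var)
    return max(candidates, key=ucb)

# Hypothetical experiments: parameter values tested so far and their results.
tested_x = [0.0, 2.0]
tested_y = [1.0, 3.0]
next_x = suggest(tested_x, tested_y, candidates=[0.0, 2.0, 3.0])
```

The variance returned by `posterior` plays the role of the sampling area: it is close to zero at tested points and grows away from them, so the suggestion balances promising regions against unexplored ones.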

Process of finding the optimized solution

The optimizing process is an iteration of finding the parameters that will yield ideal results.

The first step of optimization is to determine the initial design, the inputs and outputs of the system, and the objective function.

Experiment instruction: Maloadis gives users suggestions on how to change the parameters of the gene circuit to optimize it.

Learn from results: after users carry out the instructions, Maloadis learns from every experiment result and gives new instructions for the next round.

Learn more about our Bayesian Optimization Algorithm on Bayesian algorithm.

References

[1] Carbonell, P. et al. An automated Design-Build-Test-Learn pipeline for enhanced microbial production of fine chemicals. Communications Biology 1, 66, doi:10.1038/s42003-018-0076-9 (2018).
[2] Hiscock, T. W. Adapting machine-learning algorithms to design gene circuits. BMC Bioinformatics 20, 214, doi:10.1186/s12859-019-2788-3 (2019).
[3] https://arxiv.org/abs/2004.10934.
[4] https://github.com/AlexeyAB/darknet.
[5] https://opencv.org.
[6] HamediRad, M. et al. Towards a fully automated algorithm driven platform for biosystems design. Nature Communications 10, 5150, doi:10.1038/s41467-019-13189-z (2019).
