Team:FAFU-CHINA - 2020.igem.org

Modeling design of cellulose complex enzyme

Cellulosome:

We first introduce the concept of the cellulosome, a cellulase system found in anaerobic microorganisms. It is a multi-enzyme complex in which many kinds of cellulases and hemicellulases are assembled through the dockerin-cohesin (anchoring-adhesion) mechanism. Cellulosomes are attached to the bacterial cell wall by cell-adhesion proteins and can degrade natural cellulosic materials efficiently and thoroughly. Their structure and function are important models for understanding protein-protein interactions in prokaryotes and the bacterial degradation of natural cellulose.

Natural cellulosomes have a very large molecular weight, which makes their heterologous expression difficult. By consulting the relevant literature, we found a method for constructing artificial cellulosomes. An artificial cellulosome comprises recombinant enzymes carrying a dockerin domain, a secondary scaffold protein carrying cohesin domains, and a scaffold protein carrying an agglutinin subunit.

Since the enzymes used in artificial cellulosomes are not natural components of cellulosomes, we need to modify them so that they can assemble onto the scaffold proteins through the dockerin-cohesin mechanism. The simplest way is to fuse a dockerin to the C-terminus of the enzyme through a suitable linker, such as a 36-bp glycine-rich linker (encoding 12 amino acids), which reduces steric hindrance between the dockerin and the cellulase. (Many published studies report that natural dockerin domains are generally located at the C-terminus of enzymes.)

The skeleton of a cellulosome is composed of scaffold proteins. A scaffold protein itself carries multiple cohesin and dockerin domains, so scaffold proteins can also bind to each other through the dockerin-cohesin mechanism. This is the basis for our secondary scaffold protein with cohesin domains and our scaffold protein with an agglutinin subunit. The secondary scaffold protein binds through its dockerin to a cohesin on the scaffold protein, and the scaffold protein binds through its agglutinin subunit to the agglutinin anchor subunit on the cell wall, forming a dendritic artificial cellulosome.

Of course, rational protein design involves far more than this. The enzymes needed can be found by querying databases. Then, based on the type, number and distribution of the domains in the protein sequence and on the protein family, several complex enzyme designs are drawn up. Each design is revised repeatedly according to computational evaluation until the best one is obtained. Finally, the actual function of the complex enzyme is verified experimentally. The same applies to scaffold design: deleting repetitive sequences and rationally adding cohesin, dockerin and CBM (carbohydrate-binding module) domains can further reduce the difficulty of heterologous expression and improve the catalytic ability of cellulosomes. We also hope to design more complex enzymes, especially lignin-degrading enzymes and hemicellulases, to further expand the catalytic range of cellulosomes.

Defects:

The system is not resistant to high temperature. On the one hand, the host organism (Saccharomyces cerevisiae) used for cell-surface display cannot tolerate high temperature; on the other hand, the selected enzymes are not heat-resistant.
The function of the cellulosome is limited to degrading cellulose rather than lignocellulose.
Only a single type of cohesin-dockerin pair was used in the system, so the 64 enzyme binding sites of the artificial cellulosome could not be used rationally.

Expectation:

In view of the above, we hope to design a high-temperature-resistant cellulosome system that can also degrade lignin and hemicellulose. Because of limited time, funding and capacity, we chose to evaluate the cellulosomes we designed through modeling.

The first step is the selection of the chassis organism: Kluyveromyces marxianus, a close relative of Saccharomyces cerevisiae, appears to perform its various functions normally in a high-temperature culture environment.

The second step is the selection of monomer enzymes, for which there are two important prerequisites. One is that not all enzymes are suitable for incorporation into cellulosomes; the other is that the selected monomer enzymes should have similar optimum temperature and pH. After consulting a large body of literature and several databases, we found that the cellulases from Clostridium cellulolyticum are very suitable for our system. On the one hand, these enzyme monomers have a natural dockerin domain at the C-terminus. On the other hand, they not only come from the same species but can also tolerate high temperature.

Our idea is to replace the native dockerin domain of these enzymes with one that is compatible with the artificial cellulosome, so that they bind to it specifically.

The general structure of the newly designed chimeric enzyme is: enzyme + linker + compatible dockerin domain + restriction site + 6×His tag.
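
The modular layout above can be sketched in Python as a simple N- to C-terminal concatenation of parts. This is only an illustration: every sequence below is a hypothetical placeholder rather than one of the team's actual constructs. The enzyme and dockerin strings are dummies, the 15-residue linker is assumed to be (GGGGS)x3, and the restriction site is assumed to contribute the two residues Gly-Ser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    seq: str  # amino-acid sequence in one-letter code


def assemble_chimera(parts):
    """Concatenate parts from N- to C-terminus.

    Returns the full sequence and a dict mapping each part name to its
    1-based, inclusive (start, end) residue range in the chimera.
    """
    seq = ""
    boundaries = {}
    for part in parts:
        start = len(seq) + 1
        seq += part.seq
        boundaries[part.name] = (start, len(seq))
    return seq, boundaries


# Hypothetical placeholder parts (NOT real enzyme or dockerin sequences).
parts = [
    Part("enzyme", "MKTAYIAKQR"),      # dummy catalytic domain
    Part("linker", "GGGGS" * 3),       # assumed 15-residue flexible linker
    Part("dockerin", "DVNGDGKVNS"),    # dummy dockerin domain
    Part("restriction_site", "GS"),    # assumed residues encoded by the site
    Part("his_tag", "H" * 6),          # 6xHis tag
]

chimera, boundaries = assemble_chimera(parts)
for name, (start, end) in boundaries.items():
    print(f"{name:>16}: residues {start}-{end}")
```

Keeping the residue boundaries of each part makes it easy to check later which regions of a predicted structure correspond to the linker, the dockerin, or the tag.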

Problems:

With homology modeling algorithms, the "+ linker + compatible dockerin domain + restriction site + 8×His tag" portion is simply ignored, so we cannot directly model the chimeric enzyme we designed.
How can the performance of a chimeric enzyme be evaluated by methods other than experiment?
It is not clear what linker length is appropriate (we chose a linker of 15 amino acids).
It is not clear whether the hemicellulase design can be extended to xylanase.
We therefore hope to master modeling and evaluation methods for recombinant proteins.
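
The first problem can be illustrated with a small Python sketch. It assumes that the homology template covers only the enzyme portion, so every residue after the enzyme's last residue is left unmodeled. The function name, the placeholder sequences and the enzyme length are all hypothetical.

```python
def split_for_modeling(chimera: str, enzyme_len: int):
    """Split a chimeric sequence into a template-covered core and an unmodeled tail.

    enzyme_len is the number of N-terminal residues assumed to be covered
    by the homology template. Returns (core, tail, unmodeled_fraction).
    """
    if not 0 < enzyme_len <= len(chimera):
        raise ValueError("enzyme_len must lie within the chimera")
    core = chimera[:enzyme_len]
    tail = chimera[enzyme_len:]
    unmodeled_fraction = len(tail) / len(chimera)
    return core, tail, unmodeled_fraction


# Hypothetical placeholder chimera: dummy enzyme + linker + dummy dockerin
# + assumed restriction-site residues + 8xHis tag.
enzyme = "MKTAYIAKQR"
tail_parts = "GGGGS" * 3 + "DVNGDGKVNS" + "GS" + "H" * 8
core, tail, fraction = split_for_modeling(enzyme + tail_parts, len(enzyme))

print("modeled core  :", core)
print("unmodeled tail:", tail)
print(f"fraction of residues without template coverage: {fraction:.2f}")
```

A real enzyme is far longer than this dummy, so the unmodeled fraction would be much smaller, but the tail would still be missing from a homology model.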

Solutions:

With existing homology modeling algorithms, if the whole amino acid sequence of the designed complex enzyme is submitted, the "+ linker + compatible dockerin domain + restriction site + 8×His tag" portion is simply ignored. Our first strategy is therefore to use I-TASSER to predict the tertiary structure of the full-length complex enzyme from its sequence. I-TASSER (Iterative Threading ASSEmbly Refinement) is a hierarchical method for protein structure prediction and structure-based function annotation. It first identifies structural templates from the PDB using LOMETS, and then builds full-length atomic models by iterative template-based fragment assembly simulations. The 3D model is then re-threaded through the protein function database BioLiP to infer the function of the target.

Our second strategy is to model the complex enzyme in two parts. The first part is the tertiary structure of the portion of the complex enzyme obtained by homology modeling, and the second part is the neglected "+ linker + compatible dockerin domain + restriction site + 8×His tag". We then use Discovery Studio and ICM-Pro to connect the two parts and obtain the conformation of the complex enzyme we need.

Next, we predicted the basic physicochemical properties of the complex enzyme with ProtParam, including its relative molecular weight, grand average of hydropathicity (GRAVY), isoelectric point (pI), extinction coefficient, half-life and instability index. By analyzing the hydrophobicity of the complex enzymes with DNAMAN and scoring each complex enzyme model one by one, we selected the reasonably designed complex enzyme structures that we needed: TaLPMO-T, MtCDH-T, CelCCE-1, CelCCA-1 and Ccel_2454-1. Finally, we used PyMOL to render and polish the tertiary structure of the complex enzyme to get the final result (as shown in the figure below).
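
As a rough illustration of the hydropathy analysis mentioned above, the following Python sketch computes the grand average of hydropathicity (GRAVY) and a sliding-window hydropathy profile using the Kyte-Doolittle scale. It is not the ProtParam or DNAMAN implementation. The window size of 9 and the example sequence are assumptions chosen for illustration only.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue.

    Raises KeyError for non-standard residues and ValueError for an empty sequence.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq: str, window: int = 9) -> list:
    """Mean Kyte-Doolittle value over each sliding window along the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Hypothetical example: an assumed (GGGGS)x3 linker followed by a 6xHis tag.
example = "GGGGS" * 3 + "H" * 6
print(f"GRAVY: {gravy(example):.3f}")
print("profile:", [round(v, 2) for v in hydropathy_profile(example)])
```

A negative GRAVY indicates an overall hydrophilic sequence and a positive one an overall hydrophobic sequence. The windowed profile shows where along the chimera hydrophobic or hydrophilic stretches are located.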
