Team:Worldshaper-Nanjing/Model

Worldshaper-Nanjing Modeling

Abstract

    Background: Yarrowia lipolytica has been widely used to produce biological lipids, organic acids, and polyols.
    Scientific problem: Traditional metabolic network models predict cellular phenotypes inaccurately. To overcome this, a new-generation enzyme-constrained model of Yarrowia lipolytica was constructed by integrating enzymatic property data, with the aim of predicting the cellular phenotype of Yarrowia lipolytica accurately.
    Research results: Using the GECKO toolbox, the enzymatic parameters of Yarrowia lipolytica, including kcat values and molecular weights, were automatically downloaded from the BRENDA and UniProt databases. Starting from the latest genome-scale metabolic model of Yarrowia lipolytica, iYLI647, and combining it with the latest experimental data, an enzyme-constrained model of Yarrowia lipolytica, named ec_iYLI647, was successfully constructed.
    Research significance: The new-generation metabolic network model predicts results more accurately and provides a theoretical basis for guiding further metabolic engineering of Yarrowia lipolytica.


    Keywords: Yarrowia lipolytica; metabolic model; enzyme-constrained model; metabolic engineering


1. Restatement

    Yarrowia lipolytica, a highly efficient oil-producing yeast, can break down and reassemble fats and proteins to produce biodiesel. Stale grains, also known as substandard aged crops, become inedible after long storage under unsuitable conditions. One of their major components is starch, a resource that currently cannot be used efficiently. With this background in mind, we decided to focus on modifying Yarrowia lipolytica so that it turns the major components of aged crops into biodiesel. We added two genes to the coding region of the yeast genome so that the yeast could directly convert the starch in surplus grain into diesel oil. After successful operation at laboratory scale, realizing industrial mass production requires optimizing the culture conditions and finding the key transformation targets in order to improve the conversion efficiency.

    To improve research efficiency and reduce research cost, we adopted a combined "dry" and "wet" approach, in which wet-lab experiments are guided by computational modeling. By constructing a new-generation enzyme-constrained model, the enzymatic properties of the yeast were predicted and the key targets for efficient starch conversion were identified.



2. Problem analysis

    Because the laboratory environment is controllable, the microorganism can grow under optimal conditions. A factory environment, by comparison, is much harder to control. Even when the nominal conditions match those of the laboratory, growth is inhibited once the process is scaled up. The inhibition is even more severe for the engineered strain, which limits its industrial production. To secure a profit, use the strain more efficiently, and eventually promote its wider use, we need to know the conditions best suited to its growth. Considering the cost of experiments and the need for well-controlled variables, we integrated the previously constructed metabolic model with enzymatic property data to build the enzyme-constrained model ec_iYLI647. Using this model in MATLAB, growth under different environments can be simulated to predict the key nodes in the synthesis of the product and to identify targets for metabolic modification.


3. Construction and application of the enzyme-constrained model of Yarrowia lipolytica

3.1 Data collection

3.1.1 Collecting protein information for Yarrowia lipolytica from UniProt
    The information was downloaded in two forms: the protein sequences in FASTA format and the protein annotations in tab-delimited format.


3.1.2 Collection of protein abundance data
    Using "Yarrowia lipolytica" as a keyword, we searched NCBI and other databases for protein abundance datasets. At the same time, we looked for publications related to Yarrowia lipolytica via Web of Science, Google Scholar, and other databases.


3.2 Pretreatment of model

3.2.1 Comparison and selection among the metabolic models of Yarrowia lipolytica

Image placeholder
    When comparing the data above, we considered comprehensiveness, practicality, and other aspects such as agreement with experimental results, and finally chose the iYLI647 model as the basis for constructing the enzyme-constrained model. The complete metabolic network model was downloaded from the supplementary material of the published paper [5] and then loaded into MATLAB in a compatible format. Because iYLI647 is distributed as a COBRA Toolbox model [6], whereas the GECKO Toolbox requires a RAVEN Toolbox model [7], the following commands are needed to load the model in each format:

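For COBRA model:

model = xls2model('iYLI647_COBRA.xlsx');    % read the Excel version with the COBRA Toolbox

model = readCbModel('iYLI647_COBRA.xml');   % or read the SBML (XML) version

For RAVEN model:

model = importExcelModel('iYLI647_RAVEN.xlsx');   % read the Excel version with the RAVEN Toolbox

model = importModel('iYLI647_RAVEN.xml');         % or read the SBML (XML) version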



3.2.2 Modification of the iYLI647 model
    So that the model can be read and used in calculations directly, it is necessary to remove unnecessary and duplicated data, adjust the data format, and amend the model according to our experimental results.

The specific steps are as follows:

(1) Removing “model.metCharge”, “model.subSystems”, “model.confidenceScores”, “model.rxnReferences”, “model.rxnECNumbers”, “model.Notes”, “model.metChEBIID”, “model.KEGGID”, “model.metPubChemID”, and “model.metInChIString”.

(2) Correcting the gene identifier (YALI0F04)103g in the gene list to YALI0F04103g.

(3) Deleting repeated biomass equations.

(4) Modifying the parameters of the biomass equation.


Image placeholder

(5) Altering the format of the biomass equation by splitting the single reaction into eight reactions.

Before modification:

0.28148 1_3_beta_D_Glucan[c] + 23.09 H2O[c] + 0.029675 L_Aspartate[c] + 23.09 ATP[c] + 0.0482 AMP[c] + 0.002047 Trehalose[c] + 0.038228 L_Glutamate[c] + 0.019239 L_Arginine[c] + 0.038228 L_Glutamine[c] + 0.097847 L_Alanine[c] + 0.092204 Glycine[c] + 0.0069171 L_Methionine[c] + 0.03363 L_Proline[c] + 0.029675 L_Asparagine[c] + 0.0368 CMP[c] + 0.0005865 L_Cysteine[c] + 0.0013858 dAMP[c] + 0.0014355 dCMP[c] + 0.00127 dGMP[c] + 0.0014245 dTMP[c] + 0.035251 Ergosterol[c] + 0.25911 Chitin__monomer_[c] + 0.0593 GMP[c] + 0.008378 L_Histidine[c] + 0.014434 L_Isoleucine[c] + 0.031007 L_Leucine[c] + 0.043083 L_Lysine[c] + 0.07037 Mannan[c] + 5.7748e-05 Phosphatidate__yeast_specific[c] + 0.00034649 Phosphatidylcholine__yeast_specific[c] + 0.00023677 phosphatidylethanolamine__yeast_specific[c] + 0.012739 L_Phenylalanine[c] + 3.4649e-05 phosphatidylserine__yeast_specific[c] + 7.5072e-05 phosphatidyl_1D_myo_inositol__yeast_specific[c] + 0.047217 L_Serine[c] + 0.02 Sulfate[c] + 0.034093 L_Threonine[c] + 0.00023403 triglyceride__yeast_specific[c] + 0.00032478 L_Tryptophan[c] + 0.0063004 L_Tyrosine[c] + 0.0397 UMP[c] + 0.028069 L_Valine[c] + 0.0030293 zymosterol[c] => 23.09 H[c] + 23.09 Phosphate[c] + 23.09 ADP[c]


After modification:

(1) Protein:

0.19796 L_Aspartate[c] + 0.25502 L_Glutamate[c] + 0.12834 L_Arginine[c] + 0.25502 L_Glutamine[c] + 0.65275 L_Alanine[c] + 0.61511 Glycine[c] + 0.04615 L_Methionine[c] + 0.22435 L_Proline[c] + 0.19796 L_Asparagine[c] + 0.05589 L_Histidine[c] + 0.09629 L_Isoleucine[c] + 0.20685 L_Leucine[c] + 0.28741 L_Lysine[c] + 0.31499 L_Serine[c] + 0.22744 L_Threonine[c] + 0.00217 L_Tryptophan[c] + 0.04203 L_Tyrosine[c] + 0.18725 L_Valine[c] + 0.00391 L_Cysteine[c] + 0.08499 L_Phenylalanine[c] => protein[c]

(2) DNA:

0.00924 dAMP[c] + 0.00958 dCMP[c] + 0.00847 dGMP[c] + 0.0095 dTMP[c] => dna[c]

(3) RNA:

0.04997 AMP[c] + 0.0459 GMP[c] + 0.05352 UMP[c] + 0.05158 CMP[c] => rna[c]

(4) ION:

0.02 Sulfate[c] => ion[c]

(5) LIPID:

0.035251 Ergosterol[c] + 5.7748e-05 Phosphatidate__yeast_specific[c] + 0.00034649 Phosphatidylcholine__yeast_specific[c] + 0.00023677 phosphatidylethanolamine__yeast_specific[c] + 3.4649e-05 phosphatidylserine__yeast_specific[c] + 7.5072e-05 phosphatidyl_1D_myo_inositol__yeast_specific[c] + 0.00023403 triglyceride__yeast_specific[c] + 0.0030293 zymosterol[c] => lipid[c]

(6) Cell wall:

0.28148 1_3_beta_D_Glucan[c] + 0.002047 Trehalose[c] + 0.25911 Chitin__monomer_[c] + 0.07037 Mannan[c] => cellwall[c]

(7) Biomass:

23.09 H2O[c] + 23.09 ATP[c] + protein[c] + dna[c] + rna[c] + cellwall[c] + ion[c] => 23.09 H[c] + 23.09 Phosphate[c] + 23.09 ADP[c] + biomass[c]

(8) Growth:

biomass[c] <=>
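
Steps (1), (2) and (5) can be illustrated on a toy structure. The following MATLAB sketch is not the script used for ec_iYLI647: the `model` struct is a hypothetical stand-in with only a few fields, `placeholderGene` is an invented entry, and the coefficient check uses just five of the amino-acid coefficients quoted above. It removes unneeded fields with `rmfield`, corrects the malformed gene identifier, and checks that the protein coefficients after modification are the original biomass coefficients multiplied by a nearly constant factor.

```matlab
% Toy stand-in for the model structure (hypothetical, for illustration only).
model = struct();
model.genes      = {'(YALI0F04)103g'; 'placeholderGene'};
model.metCharge  = [0; -1];
model.subSystems = {'demo'; 'demo'};

% Step (1): drop fields that are not needed downstream, if they are present.
unused  = {'metCharge', 'subSystems', 'confidenceScores', 'rxnReferences'};
present = unused(isfield(model, unused));
model   = rmfield(model, present);

% Step (2): repair the malformed gene identifier.
model.genes = strrep(model.genes, '(YALI0F04)103g', 'YALI0F04103g');

% Step (5): compare coefficients before and after the split for five
% amino acids (Asp, Glu, Arg, Ala, Gly), copied from the equations above.
oldCoef = [0.029675 0.038228 0.019239 0.097847 0.092204];
newCoef = [0.19796  0.25502  0.12834  0.65275  0.61511];
ratio   = newCoef ./ oldCoef;
fprintf('scaling factor: mean %.3f, spread %.4f\n', ...
        mean(ratio), max(ratio) - min(ratio));
```

A small spread between the ratios suggests that the amino-acid coefficients were rescaled uniformly when the protein pseudo-reaction was created, rather than adjusted one by one.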



3.3 Construction of the enzyme-constrained model

(1) Construction of the protein database

[swissprot,kegg] = updateDatabases('yli') (This command standardizes the gene sequence data of Yarrowia lipolytica downloaded from the UniProt and KEGG databases so that they can be used to construct the enzyme-constrained model.)

(2) Construction of the enzyme-constrained model

The original model, iYLI647, was loaded into MATLAB. The results are displayed in the following figures (Fig. 3.1-3.6) and include the numbers of reactions, metabolites, and genes, among other parameters.


Image placeholder

model_data = getEnzymeCodes(model)

This command matches the genes in the model with the data collected from the UniProt and KEGG databases, which include the substrates, products, EC numbers, and molecular masses involved in the biochemical reactions (Fig. 3.2).


Image placeholder

ecModel = readKcatData(model_data, kcats)

This command matches kcat values to the reactions of the model (Fig. 3.5).


Image placeholder

[ecModel,modifications] = manualModifications(ecModel)

Running this command modifies the model; the results show that the duplicated reactions R_HDCAt and R_HCAt were deleted (Fig. 3.6).
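
To show what the matched kcat values do once they are in the model: in the GECKO formulation, the flux through a reaction cannot exceed kcat multiplied by the amount of the corresponding enzyme. The numbers in the following MATLAB sketch are hypothetical and were chosen only to show the unit conversion from s-1 to h-1; they are not taken from ec_iYLI647.

```matlab
% Hypothetical values, for illustration only.
kcat_per_s = 100;     % turnover number [1/s], as a database might report it
enzyme     = 1e-6;    % enzyme amount [mmol/gDW]

kcat_per_h = kcat_per_s * 3600;     % convert to [1/h] to match flux units
vmax       = kcat_per_h * enzyme;   % flux ceiling [mmol/gDW/h]

% Enzyme amount needed to carry a given flux at full saturation.
v_target = 0.18;                    % desired flux [mmol/gDW/h]
e_needed = v_target / kcat_per_h;   % [mmol/gDW]

fprintf('vmax = %.2f mmol/gDW/h, enzyme needed = %.1e mmol/gDW\n', ...
        vmax, e_needed);
```

Because the ceiling scales linearly with kcat, a missing or mismatched kcat value translates directly into a wrong flux bound, which is why the matching step matters.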


Image placeholder

3.4 Matching the enzyme-constrained model with experimental data

    (1) Collection of growth rates under different culture conditions:
    According to the literature, the maximum growth rate of Yarrowia lipolytica on minimal glucose medium was 0.24 h-1, while that on minimal glycerol medium was 0.30 h-1 [10].
    (2) Collection of total protein mass:
    By analyzing the biomass equations of existing metabolic network models of Yarrowia lipolytica, the total protein content was found to be 0.35 g·gDW-1 [5].
    (3) Because no data were available to adjust the parameter that depends on protein content, its default value of 0.5 was adopted.
    (4) Other parameters used for the construction of the enzyme-constrained model:
    sigma = 0.5; % Optimized for glucose
    Ptot = 0.403; % Assumed constant
    gR_exp = 0.24; % [g/gDw h] Max batch gRate on minimal glucose media
    c_source = 'D Glucose exchange (reversible)'; % Rxn name for the glucose uptake reaction
    Finally, these constraints were used for the construction of the enzyme-constrained model.
    [ecModel_batch,OptSigma] = getConstrainedModel(ecModel,c_source,sigma,Ptot,gR_exp,modifications,name)
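
As a rough guide to how these parameters interact: in the GECKO formulation, enzymes without individually measured abundances draw on a shared pool whose upper bound is the product of the total protein content (Ptot), the mass fraction of that protein accounted for by enzymes in the model (f), and the average saturation (sigma). The MATLAB sketch below evaluates this product with the values listed above. Taking f = 0.5 is an assumption made here for illustration, and measured abundances, which would be subtracted from the pool, are ignored.

```matlab
Ptot  = 0.403;   % total protein content [g/gDW], from the parameter list
sigma = 0.5;     % average enzyme saturation, from the parameter list
f     = 0.5;     % enzyme mass fraction covered by the model (assumed here)

% Upper bound on the summed enzyme usage when no abundances are measured.
poolBound = Ptot * f * sigma;   % [g/gDW]
fprintf('enzyme pool bound = %.5f g/gDW\n', poolBound);
```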
    In the steps above, because protein abundance data for Yarrowia lipolytica were lacking, the BLAST program was used to compare its proteins with those of Saccharomyces cerevisiae. Based on the BLAST results, the protein abundance data were substituted accordingly (yli_s288c.xlsx). All of these protein abundance data were stored in prot_abundance.txt and used as the measured abundances during model construction.
    Modifying the growth-associated maintenance (GAM) value:
    Using fitGAM.m, the ATP requirement for the growth of Yarrowia lipolytica was refitted based on the following formula (Equation 1). The modified GAM value was 41.7167 mmol·gDW-1, whereas the original GAM value was 23.09 mmol·gDW-1.
    GAMpol = Ptot*16.7245 + Ctot*4.288 + R*1.768 + D*0.312;    (Equation 1)
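
Equation 1 can be evaluated directly once the biomass fractions are known. In the MATLAB sketch below, Ptot is the value given in the parameter list (0.403), whereas Ctot, R and D (taken here to be the carbohydrate, RNA and DNA fractions) are hypothetical placeholders, because this page does not report them. The printed number is therefore only illustrative and is not the fitted GAM.

```matlab
Ptot = 0.403;   % protein fraction [g/gDW], from the parameter list
Ctot = 0.40;    % carbohydrate fraction [g/gDW] (hypothetical)
R    = 0.06;    % RNA fraction [g/gDW] (hypothetical)
D    = 0.004;   % DNA fraction [g/gDW] (hypothetical)

% Equation 1: polymerization cost of the biomass macromolecules.
GAMpol = Ptot*16.7245 + Ctot*4.288 + R*1.768 + D*0.312;
fprintf('GAMpol = %.4f\n', GAMpol);
```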
    After these steps, the enzyme-constrained model, named ec_iYLI647, was constructed (Fig. 3.7).

Image placeholder

3.5 Analysis of properties of the enzyme-constrained model

3.5.1 Detailed information about ec_iYLI647 is listed in Table 3.2.

Image placeholder

3.5.2 The visualization of enzyme-constrained model
    Cytoscape 3.7.0 was used to visualize ec_iYLI647. The command notShownMets = outputNetworkCytoscape(model_cobra,'ecModel') writes the files ecModel.sif, ecModel_edgeType.noa, ecModel_nodeComp.noa, ecModel_nodeType.noa and ecModel_subSys.noa.
    After importing the ecModel.sif file into Cytoscape, the network of ec_iYLI647 can be visualized, with metabolites in different compartments marked in distinct colors. As shown in Fig. 3.8, red represents the 632 proteins involved in the model; crimson purple represents 14 endoplasmic reticulum metabolites; bottle green represents 10 vacuolar metabolites; light blue represents 16 Golgi metabolites; sapphire blue represents 31 nuclear metabolites; yellow represents 107 peroxisomal metabolites; purple represents 126 extracellular metabolites; light green represents 608 cytoplasmic metabolites; and orange represents 212 mitochondrial metabolites.
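
The color legend can also be kept as a small lookup table, which makes it easy to check totals or to reuse the mapping when styling the network. The MATLAB sketch below only restates the counts given in the text; the variable names are illustrative and are not part of the files written by outputNetworkCytoscape.

```matlab
% Compartment legend as described for Fig. 3.8 (counts taken from the text).
compartments = {'endoplasmic reticulum', 'vacuole', 'Golgi', 'nucleus', ...
                'peroxisome', 'extracellular', 'cytoplasm', 'mitochondria'};
colors       = {'crimson purple', 'bottle green', 'light blue', 'sapphire blue', ...
                'yellow', 'purple', 'light green', 'orange'};
nMets        = [14 10 16 31 107 126 608 212];
nProteins    = 632;   % protein nodes, shown in red

for i = 1:numel(compartments)
    fprintf('%-22s %-14s %4d\n', compartments{i}, colors{i}, nMets(i));
end
totalMets = sum(nMets);
fprintf('metabolite nodes: %d, protein nodes: %d\n', totalMets, nProteins);
```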

Image placeholder


    From the figure above, we can also readily see how precisely the targets are located and how broadly the model covers metabolism, both of which indicate that the model was constructed successfully and accurately.


4. The application of enzyme-constrained model ec_iYLI647

    Based on the model ec_iYLI647, the carbon source utilization capacity of Y. lipolytica was predicted. In total, 36 carbon sources were evaluated one at a time by setting the lower bound of the corresponding exchange reaction to -1000 mmol·gDW-1·h-1. According to the simulation results, 29 of these carbon sources could be used by Y. lipolytica, including 13 amino acids. With glucose as the carbon source, the growth rate of Y. lipolytica was 0.1610 h-1. This was the highest growth rate among the tested carbon sources, which indicates that glucose is the optimal carbon source for Y. lipolytica. In addition, with starch as the carbon source, the growth rate was 0.1508 h-1, only 6.34% lower than on glucose. These results provide a theoretical basis for further modification of Y. lipolytica to utilize starch.
    Detailed data on the rates at which Y. lipolytica can utilize the 36 carbon sources are listed in Table 3.3.
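
The comparison between starch and glucose is a simple relative difference and can be reproduced from the two growth rates quoted above. The MATLAB sketch below uses only numbers stated in this section; the variable names are illustrative.

```matlab
mu_glucose = 0.1610;   % predicted growth rate on glucose [1/h]
mu_starch  = 0.1508;   % predicted growth rate on starch  [1/h]
nTested    = 36;       % carbon sources evaluated
nUsable    = 29;       % carbon sources that supported growth

relDrop    = (mu_glucose - mu_starch) / mu_glucose * 100;   % percent
usableFrac = nUsable / nTested * 100;                       % percent
fprintf('starch vs glucose: %.2f%% lower; usable sources: %.1f%%\n', ...
        relDrop, usableFrac);
```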

Image placeholder

5. Prospects of the model and project

    In conclusion, by constructing a new-generation metabolic network model, we overcame the inaccuracy of traditional models in predicting cellular phenotypes and improved the accuracy of predictions for Yarrowia lipolytica. The results of the model analysis can provide a theoretical basis for further modification of Yarrowia lipolytica. Applying the predicted results also led to success in our experiments. By locating key targets and then modifying the strain accordingly, this technology has good prospects for being advanced and used in mass production.

6. References

    1. Loira N, Dulermo T, Nicaud JM et al: A genome-scale metabolic model of the lipid-accumulating yeast Yarrowia lipolytica. BMC Syst Biol 2012, 6(1):35.
    2. Pan P, Hua Q: Reconstruction and in silico analysis of metabolic network for an oleaginous yeast, Yarrowia lipolytica. PLoS ONE 2012, 7(12):e51535.
    3. Kavscek M, Bhutada G, Madl T et al: Optimization of lipid production with a genome-scale model of Yarrowia lipolytica. BMC Syst Biol 2015, 9.
    4. Kerkhoven EJ, Pomraning KR, Baker SE et al: Regulation of amino-acid metabolism controls flux to lipid accumulation in Yarrowia lipolytica. Npj Systems Biology and Applications 2016, 2:7.
    5. Mishra P, Lee NR, Lakshmanan M et al: Genome-scale model-driven strain design for dicarboxylic acid production in Yarrowia lipolytica. BMC Syst Biol 2018, 12:12.
    6. Heirendt L, Arreckx S, Pfau T et al: Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v.3.0. Nat Protoc 2019, 14(3):639-702.
    7. Wang H, Marcisauskas S, Sanchez BJ et al: RAVEN 2.0: A versatile toolbox for metabolic network reconstruction and a case study on Streptomyces coelicolor. PLoS Comput Biol 2018, 14(10):e1006541.
    8. Zhang HY, Wu C, Wu QY et al: Metabolic Flux Analysis of Lipid Biosynthesis in the Yeast Yarrowia lipolytica Using C-13-Labled Glucose and Gas Chromatography-Mass Spectrometry. PLoS ONE 2016, 11(7):14.
    9. Feist AM, Henry CS, Reed JL et al: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol 2007, 3:121-128.
    10. Workman M, Holt P, Thykaer J: Comparing cellular performance of Yarrowia lipolytica during growth on glucose and glycerol in submerged cultivations. Amb Express 2013, 3.
    11. Sanchez BJ, Zhang C, Nilsson A et al: Improving the phenotype predictions of a yeast genome-scale metabolic model by incorporating enzymatic constraints. Mol Syst Biol.
