Reaction Mechanism: Geniposide is hydrolyzed by β-glucosidase to genipin, which then reacts with glycine to produce gardenia blue.
Figure 1. Reaction Mechanism of Gardenia blue
Based on the reaction mechanism, we used a one-step enzymatic conversion method to prepare gardenia blue: geniposide, β-glucosidase, and glycine were added to the reaction system at the same time. Enzymatic conversion has the advantage of high specificity. However, many factors affect the yield, and these factors interact with one another, so the experimental results are highly scattered and the relationship between each factor and the yield is nonlinear. Mathematical models built with conventional methods often cannot describe this relationship accurately, and their predictions are unsatisfactory.
1.Design of BP-NN Model
To model the relationship between the four factors and the yield, and to predict the yield of gardenia blue, we established a Back Propagation Neural Network (BP-NN) model based on the orthogonal test. A neural network is a computational model of information processing inspired by the biological brain and nervous system. According to the Kolmogorov theorem [1], a sufficiently trained three-layer BP network can approximate any continuous function. The model therefore consists of three layers: the input layer, the hidden layer, and the output layer.
We then used the 49 groups of sample data from the orthogonal test to build a BP neural network with temperature, geniposide content, glycine content, and time as input variables. The output variable was the yield of gardenia blue.
Singular sample data, especially samples that are extremely large or small relative to the other input samples, reduce the learning efficiency of the neural network, so all samples need to be normalized. Normalization converts the original data into comparable values and avoids the prolonged training or failure to converge that singular samples can cause. [4-8]
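The normalization step can be illustrated with a minimal min-max scaling sketch. The source does not state which normalization was used or the target range, so the [-1, 1] range below is an assumption, and the function name `minmax_fit` and the temperature values are hypothetical and for illustration only.

```python
def minmax_fit(column, lo=-1.0, hi=1.0):
    """Return a function that maps [min(column), max(column)] linearly onto [lo, hi].

    The target range [-1, 1] is an assumption; the text does not specify it.
    """
    cmin, cmax = min(column), max(column)
    span = cmax - cmin
    if span == 0:
        # A constant column carries no information; map it to the midpoint.
        return lambda x: (lo + hi) / 2.0
    return lambda x: lo + (hi - lo) * (x - cmin) / span


# Hypothetical temperature levels in degrees C (illustrative, not the study's data).
temperatures = [40.0, 55.0, 70.0, 85.0, 100.0]
scale_temperature = minmax_fit(temperatures)
scaled = [scale_temperature(t) for t in temperatures]
```

Each input variable would be scaled with its own fitted function, so that a factor measured in hundreds of units does not dominate one measured in single digits.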
The BP neural network model was implemented in MATLAB R2019b with a 4-n-1 topology, where n is the number of hidden-layer neurons.
Figure 2. Topology structure of BP network: 4-n-1
2.Establishment and test of BP-NN model
We followed the basic idea of cross-validation: the data set is divided into a training set, used to train the model, and a validation or test set, used to evaluate the performance of the trained model. We selected 43 groups of mixed-level data as training samples and the remaining 6 groups as test samples. The maximum number of training iterations was set to 2000 and the error target to 1×10⁻⁴, and the model was then trained. The transfer function from the input layer to the hidden layer was tansig (hyperbolic tangent sigmoid function), the transfer function from the hidden layer to the output layer was purelin (pure linear function), and the training function was traingdx.
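The 43/6 division of the 49 sample groups can be sketched as follows. The source does not say how the six test groups were chosen, so the random, seeded split here is an assumption; `split_samples` and the integer placeholders standing in for the sample groups are hypothetical.

```python
import random


def split_samples(samples, n_test=6, seed=42):
    """Hold out n_test samples for testing and keep the rest for training.

    A seeded random split is an assumption; the text only gives the 43/6 sizes.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    test_idx = set(indices[:n_test])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


# Placeholder IDs for the 49 orthogonal-test groups (not the real data).
samples = list(range(49))
train, test = split_samples(samples)
```

Fixing the seed keeps the split reproducible, which matters when comparing networks with different hidden-layer sizes on the same test groups.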
The structure of the BP-NN-based yield prediction model for gardenia blue is shown in Figure 3.
Figure 3. Structural diagram of gardenia blue yield prediction model
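To make the 4-n-1 structure concrete, here is a minimal forward-pass sketch with a tanh hidden layer and a linear output. MATLAB's tansig is mathematically equivalent to tanh, and purelin is the identity. The weight initialization range, the seed, and the function names `init_network` and `forward` are assumptions for illustration, and training (done with traingdx in the text) is not shown.

```python
import math
import random


def init_network(n_inputs=4, n_hidden=8, seed=0):
    """Create random weights for a 4-n-1 network (initialization range is an assumption)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = rng.uniform(-0.5, 0.5)
    return w_hidden, b_hidden, w_out, b_out


def forward(x, net):
    """Compute the scalar output for one (normalized) input vector x."""
    w_hidden, b_hidden, w_out, b_out = net
    # Hidden layer: tanh of the weighted sum (equivalent to tansig).
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Output layer: plain weighted sum (equivalent to purelin).
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Inputs: temperature, geniposide content, glycine content, time (already normalized).
net = init_network(n_hidden=8)
prediction = forward([0.2, -0.4, 0.9, 0.1], net)
```

The untrained output is meaningless on its own; the sketch only shows how four inputs pass through n hidden neurons to a single yield value.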
The number of neurons in the hidden layer affects the convergence of training, the training time, and the size of the training error, so choosing it well is important for the model. We compared the mean square error (MSE) obtained with different numbers of hidden nodes to determine the optimal number of neurons. The MSE of the predicted values for each number of hidden-layer neurons is shown in Figure 4.
Figure 4. Prediction mean square error of different hidden nodes
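The selection rule described here (compute the MSE for each candidate hidden-layer size and keep the smallest) can be sketched as below. The MSE values in `mse_by_size` are made-up placeholders, not the values plotted in Figure 4, and both function names are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_hidden_size(mse_by_size):
    """Return the hidden-layer size whose prediction MSE is smallest."""
    return min(mse_by_size, key=mse_by_size.get)


# Placeholder MSE values per hidden-layer size (illustrative only).
mse_by_size = {4: 0.012, 6: 0.007, 8: 0.002, 10: 0.005}
chosen = best_hidden_size(mse_by_size)
```

In practice each entry of `mse_by_size` would come from training a network of that size and evaluating it on the same held-out samples.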
The results show that the mean square error of the predicted values is smallest when the number of hidden-layer nodes is 8. After 7 iterations, the mean square error reaches the target value and the network performance is stable. The error convergence curve and the fitting result are shown in Figure 5 and Figure 6.
Figure 5. Convergence curve of error
Figure 6. Fitting result of BP-NN model
To further assess the simulation performance and generalization ability of the network, we fed the independent variables of the test samples into the model and compared the predicted values with the true values. The result is shown in Figure 7.
Figure 7. Comparison of predicted and true values
The error curve shows that the error between the predicted and true values is very small, and the coefficient of determination is 0.99147, indicating close agreement. The established BP neural network model is therefore stable, generalizes well, and can be used to estimate the yield of gardenia blue under different preparation conditions.
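For reference, a coefficient of determination can be computed from predicted and true values as sketched below. This assumes the common definition R² = 1 − SS_res/SS_tot; the source does not state which formula or tool produced the reported 0.99147, and the function name `r_squared` is hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# A perfect prediction gives 1; always predicting the mean gives 0.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

With only six test samples, a value close to 1 is encouraging but should be read together with the per-sample errors in Figure 7.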
3.Genetic Algorithms and Engineering Optimization
A Genetic Algorithm (GA) is a population-based optimization algorithm built on the theory of natural selection and the principles of genetics. It brings concepts such as reproduction, crossover, mutation, competition, and selection into the search, and it is well suited to finding optimal solutions in combination with a BP neural network [7-8].
After the BP neural network was trained, the input-output mapping it had learned was used as the fitness function of the genetic algorithm. The search space for the four factors was set according to the ranges of the training data, 20 combinations were selected at random from this space as the initial population, and selection, crossover, and mutation were applied repeatedly until the termination condition was met.
The number of variables was 4, with lower and upper bounds of [40 240 2 5] and [100 480 14 180], respectively. The initial population size was 20 and the encoding type was double-precision. Roulette-wheel selection and single-point crossover were used, with a crossover probability of 0.75 and a mutation probability of 0.05. The number of generations was 100, and the other settings were kept to a minimum.
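The settings above can be put together in a small real-coded GA sketch: population 20, 100 generations, roulette-wheel selection, single-point crossover with probability 0.75, and per-gene mutation with probability 0.05 inside the stated bounds. The fitness function `surrogate_yield` is a hypothetical stand-in, not the trained BP network from the study. The mutation rule (resampling a gene uniformly within its bounds) and the seed are likewise assumptions.

```python
import random

LOWER = [40.0, 240.0, 2.0, 5.0]       # lower bounds from the text
UPPER = [100.0, 480.0, 14.0, 180.0]   # upper bounds from the text


def surrogate_yield(x):
    """Hypothetical, strictly positive stand-in for the trained network's output."""
    score = 1.0
    for xi, lo, hi in zip(x, LOWER, UPPER):
        z = (xi - lo) / (hi - lo)            # position within the range, 0..1
        score += 1.0 - (2.0 * z - 1.0) ** 2  # peaks at the middle of each range
    return score


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0.0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]


def run_ga(fitness, pop_size=20, generations=100, p_cross=0.75, p_mut=0.05, seed=1):
    rng = random.Random(seed)
    n = len(LOWER)
    population = [[rng.uniform(lo, hi) for lo, hi in zip(LOWER, UPPER)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        fitnesses = [fitness(ind) for ind in population]
        children = []
        while len(children) < pop_size:
            a = roulette_select(population, fitnesses, rng)[:]
            b = roulette_select(population, fitnesses, rng)[:]
            if rng.random() < p_cross:
                cut = rng.randint(1, n - 1)  # single crossover point
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                for i in range(n):
                    if rng.random() < p_mut:
                        # Assumed mutation: resample the gene within its bounds.
                        child[i] = rng.uniform(LOWER[i], UPPER[i])
                children.append(child)
        population = children[:pop_size]
        best = max(population + [best], key=fitness)  # remember the best seen so far
    return best, fitness(best)


best_x, best_y = run_ga(surrogate_yield)
```

Roulette-wheel selection requires non-negative fitness values, which is why the stand-in function is kept strictly positive. With a real trained network, predicted yields would need the same property or a shift before selection.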
The maximum predicted yield of gardenia blue, 26.2857 μg/L, was obtained when the temperature, geniposide dosage, glycine dosage, and time were 94.2 ℃, 470.3 μg, 13.5 μg, and 156.6 min, respectively.
The BP neural network model simulates the relationship among the four factors well and predicts the yield of gardenia blue. Combined with the genetic algorithm, it allows the best preparation conditions to be found quickly.
[1] He Yubin, Li Xinzhong. Application of Neural Network Control Technology [M]. Beijing: Science Press, 2000.
[2] GONZALEZ-SAIZ J M, PIZARRO C, GARRIDO-VIDAL D. Evaluation of kinetic models for industrial acetic fermentation: proposal of a new model optimized by genetic algorithms [J]. Biotechnology Progress, 2003, 19(2): 599-611.
[4] Zhang S, Wang B, Li X, et al. Research and application of improved gas concentration prediction model based on grey theory and BP neural network in digital mine [J]. Procedia CIRP, 2016, 56: 471.
[5] Jiang J P. Prediction of concrete strength based on BP neural network [J]. Advanced Materials Research, 2012, 341-342: 58.
[6] Yao Hui Qin, Jiang Ye Long. Based on the genetic algorithm to optimize the BP neural network in the degree of concrete creep prediction model [J]. Applied Mechanics and Materials, 2014, 584-586: 1346.
[7] CARRILLO-URETA G E, ROBERTS P D, BECERRA V M. Genetic algorithms for optimal control of beer fermentation [C]// Intelligent Control Proceedings of the IEEE International Symposium, 2001, 17: 391-396.