Team:TAS Taipei/Model

fork me

Modeling

Purpose of Model

The purpose of our mathematical model is to approximate viral load of SARS-CoV-2 within a patient based on the reaction rate of our detection test. Although viral load does not necessarily correlate with disease severity, it is an appropriate measure for the risk of transmissibility (Lescure et al., 2020). This information is especially important for asymptomatic, presymptomatic, and paucisymptomatic individuals, who may be unintentionally transmitting the virus within their communities. RT-PCR is currently the gold standard for nucleic acid amplification tests; however, they require expensive thermal cyclers and can only provide an indirect quantification of viral load through CT values, which are dependent on the fluorescent probe used and specific to PCR-based tests (CDC, 2020). Our model uses data collected by our user-friendly software, VisualpH, to provide a direct quantification of viral load through viral RNA copy numbers. This not only helps the development of non-PCR-based detection tests, but also allows patients to implement preventative measures with less delay.

Quantification of Colormetric Data

Software

The first step to build our mathematical model was to quantify our data from rolling circle amplification (RCA) tests. We ran our reactions at a relatively small volume of 40 μL, and thus our options for pH measurement were limited to the use of computer vision or a micro-pH electrode. Because the electrode was expensive and locally inaccessible, we developed a Python script, VisualpH, that converts pixel hue to pH values. The function used by VisualpH to convert hue to pH is a logarithmic function derived from a logistic fit of pH to hue data. We chose to fit the data with a logistic fit because phenol red, the indicator used in our detection test, only changes color between a pH of 6 and 8, which closely mimics the asymptotic behavior of a sigmoid function. Although we did not have the opportunity to calibrate VisualpH to the readings from a micro-pH electrode, we are confident in the accuracy of the software as it was calibrated to known pH standards containing phenol red concentrations identical to those in our experiments.

Initial Data

To test the viability of our model, we initially conducted experiments at 0.0625 μM, 0.000625 μM, 0.0625 nM, and 0.0277pM of target RNA. Each trial consisted of separate reactions for each concentration and a negative control that had deionized water in place of target RNA. We ran all reactions at 40 μL according to our standard protocol and recorded them in a controlled environment. Data from our first trial suggested that reaction rate is dependent on target RNA concentration. Reactions containing higher concentrations of target RNA showed a faster reaction rate (figure 1).

Figure 1: The scatterplot for a trial with reactions containing 0.0625 μM (red), 0.000625 μM (orange), 0.0625 nM (yellow), and 0.0277 pM (purple) of target RNA. The graph displays the concentration of hydrogen ions in micromolar over time. The separation of the plots for each RNA concentration suggests that concentration affects the overall reaction rate.

When we cross-compared data from our first and second trials, we realized that reaction rate was also dependent on the initial pH of the reaction. Despite containing identical amounts of target RNA (0.0625 μM), the reaction in our second trial had an evidently slower reaction rate (figure 2).

Figure 2: The scatterplots for the reactions containing 0.0625 μM of target RNA in the first and second trials. The graph displays the concentration of hydrogen ions in micromolar over time. The separation of the plots for each trial suggest that the overall reaction rate is dependent on the initial pH of the reaction.

Moreover, we saw that the reaction containing 0.000625 μM of target RNA in the first trial had a faster reaction rate than the reaction containing 0.0625 μM of target RNA in the second trial (figure 3). However, when we compared the data for reactions containing 0.0625 μM, 0.000625 μM, 0.0625 nM, and 0.0277 pM of target RNA in the first and second trials independently, we noticed that the previously established relationship between target RNA concentration and reaction rate still held.

Figure 3: The combined scatterplots and regressions for reactions containing 0.0625 μM (red), 0.000625 μM (orange), 0.0625 nM (yellow), and 0.0277 pM (purple) of target RNA in the first and second trials. Circle markers and square markers represent data from the first and second trials, respectively. The graph displays the concentration of hydrogen ions in micromolar over time.

Analysis of VisualpH Output

Our detection test relies on the production of hydrogen ions by φ29 DNA polymerase during DNA synthesis. The RCA reaction begins after target RNA are hybridized with padlock probes, forming circularized DNA that act as the template for amplification. While reverse primers allow a branching-like amplification, and thus a rapid production of hydrogen ions, a complete strand of DNA must be synthesized first. Therefore, we expected the reaction to be split into a linear phase and an exponential phase. The linear phase would last longer at lower concentrations of target RNA because the probability of hybridization would be lower, delaying the synthesis of the first DNA strands. The exponential phase would be similar across all reactions because branching is only dependent on the amount of φ29 DNA polymerase and reverse primers.

In figures 1-3, the separation of the reaction into a linear and an exponential regime is clearly shown by a slow, constant growth followed by a rapid growth in hydrogen ion concentration. The assumptions about the two regimes are supported by data from the initial trials, which shows similar exponential growth rates across all reactions but longer linear regimes for those containing lower concentrations of target RNA.

Based off these conclusions we decided to fit the time to hydrogen ion concentration data to an exponential function with an additional linear term (eq. 1). In an ideal scenario, where the initial pH of the reaction does not affect its rate, only parameters ‘a’ and ‘d’ of the function would change across reactions containing different concentrations of target RNA. This is because we expect ‘b’, which represents the exponential growth rate, to remain the same. Thus we could build a mathematical model to backtrack and interpolate the target RNA concentration of a reaction by relating functions of ‘a’ and ‘d’ in a three-dimensional function.

Equation 1: Hydrogen ion as function of time (x). In the case where the initial pH of a reaction is controlled, ‘a’ and ‘d’ are the only meaningful parameters.

However, as shown in figure 2, the reaction rate is dependent on initial pH. Moreover, we realized that the value of ‘a’ was changing in addition to that of ‘C’ for reactions containing identical amounts of target RNA but at different starting pH values. This suggests that ‘a’ is both a function of target RNA concentration and initial pH and that the activity of φ29 DNA polymerase is significantly affected by pH, a fact we had not previously considered.

Experimental Adjustments Based on Modeling Output

To resolve the issues presented by the confounding factor of initial pH, our experimental team ran three new trials simultaneously with a master mix, which contained all reagents except target RNA and padlock probes. We recognize that there will be an inherent difference in starting pH due to a difference in nucleic acid concentration, but as we realized later, this difference is negligibly small. The data from our new trials showed that reactions containing the same concentration of target RNA had similar reaction rates (figure 4).

Figure 4: The combined scatterplots for reactions containing 0.0625 μM (red), 0.00625 μM (orange), 0.000625 μM (yellow), and 0.0625 nM (purple) of target RNA in the three new trials. The graph displays the concentration of hydrogen ions in micromolar over time. Though the cluster is weaker for data of reactions containing 0.00625 μM and 0.000625 μM of target RNA, there is still an expected separation in data points for reactions of different target RNA concentrations.

While the cluster is weaker for data of reactions containing 0.00625 μM and 0.000625 μM of target RNA, the values of the regression function parameters support the expected separation in data points for reactions of different target RNA concentrations (table 1). The values of ‘a’ and ‘d’ in table 1 were obtained by fitting data from the new trials to equation 1 with ‘b’ set to the average value of ‘b’ from independent regressions. We made ‘b’ constant because our model is built around the assumption that the exponential regime behaves similarly across all reactions.

Table 1: Values of the regression function parameters. The ‘a’ and ‘d’ values were determined from a regression of the data with a version of equation 1 where ‘b’ is set to the average value of ‘b’. The values of ‘a’ are around 100 times bigger than those of ‘d’, which suggest that ‘a’ has more weight in the determination of target RNA concentration.

Whereas the regression line for the reaction containing 0.000625 μM of target RNA overlapped with that for the reaction containing 0.0625 μM of target RNA in the old trials, the regression lines for the new trials show a clear separation (figure 5). This suggests that it is feasible to determine the target RNA concentration within a reaction based on the values of its regression parameters.

Figure 5: The regression lines for reactions containing 0.0625 μM and 0.000625 μM of target RNA in the three new trials.

Model Construction

To determine the relationship between ‘a’, ‘d’, and the target RNA concentration, we plotted target RNA concentration against values of ‘a’ and values of ‘d’ (figure 6 and 7). The shape of data suggested an exponential relationship between the parameters and target RNA concentration.

Figure 6: Exponential regression for ‘a’ to target RNA concentration data.

Figure 7: Exponential regression for ‘d’ to target RNA concentration data.

From the exponential regressions of the parameter to target RNA concentration data, we realized that ‘a’ and ‘d’ are negatively correlated. Indeed, when we plotted values of ‘d’ against corresponding values of ‘a’, we saw a strong, negative linear relationship with a correlation coefficient of -0.99 (figure 8). This suggests that ‘d’ is a function of ‘a’ and that we can therefore model target RNA concentration of a reaction as a function of ‘a’. Based on our data, this function is f(x) = 0.00226(5.33x1022)x, where 0.00226 and 5.33x1022 are empirically determined from the exponential fit of ‘a’ to target RNA concentration data.

Figure 8: Linear regression relating parameters ‘a’ and ‘d’ of the regression function.

Conclusion and Discussion

To improve our model, we would like to collect data from reactions containing a wider range of target RNA. We expect the relationship between parameter ‘a’ of our regression function and target RNA concentration to remain exponential, but more data points would help increase the accuracy of our model. Assuming that the initial pH may be controlled for all reactions, our model may be easily implemented into our software as a general function. However, we would also have to include functions to fit collected data to our regression function (eq.1). This would allow users of our detection test obtain an approximation of their viral load through a video recording of their test (figure 9).

Figure 9: Flowchart of sample collection to viral load analysis.

References

CDC. (2020, February 11). Information for Laboratories about Coronavirus (COVID-19). Centers for Disease Control and Prevention. https://www.cdc.gov/coronavirus/2019-ncov/lab/grows-virus-cell-culture.html

Lescure, F.-X., Bouadma, L., Nguyen, D., Parisey, M., Wicky, P.-H., Behillil, S., Gaymard, A., Bouscambert-Duchamp, M., Donati, F., Le Hingrat, Q., Enouf, V., Houhou-Fidouh, N., Valette, M., Mailles, A., Lucet, J.-C., Mentre, F., Duval, X., Descamps, D., Malvy, D., … Yazdanpanah, Y. (2020). Clinical and virological data of the first cases of COVID-19 in Europe: A case series. The Lancet Infectious Diseases, 20(6), 697–706. https://doi.org/10.1016/S1473-3099(20)30200-0