Development of Generalized Correlations for Thermophysical Properties of Light Hydrocarbon Solvents (C1–C5)/Bitumen Systems Using Genetic Programming

Accurate modeling of thermophysical properties of solvent/bitumen mixtures is critical for proper design and implementation of thermal- and solvent-based bitumen recovery processes. In this study, three generalized correlations were developed for prediction of solubility, density, and viscosity of light hydrocarbon/bitumen mixtures. The generalized correlations were developed using symbolic regression based on genetic programming and employing a 10-year set of comprehensive phase behavior experimental studies conducted under the SHARP research program on solvent-aided thermal recovery of bitumen. The data set comprised Surmont, JACOS, Mackay River, and Cold Lake bitumen samples and five light hydrocarbon solvents including methane, ethane, propane, n-butane, and n-pentane. The developed correlations are valid for gaseous solvents. Finally, the developed correlations for solubility, density, and viscosity were validated against a large data set of experimental measurements collected from the literature. The validation demonstrates that the developed correlations are able to accurately predict the available experimental data of solubility, density, and viscosity reported in the literature.


INTRODUCTION
Oil will continue to be the largest source of energy in the world with oil demand projected to increase by 12 percent from 2016 to 2040. 1 With the constant depletion of the conventional oil resources, the scarceness of new conventional oil discoveries, and the deficiency to meet the increase in energy demand, global heavy oil and bitumen become increasingly important. For instance, in Canada, oil sands production has exceeded conventional oil production since 2010. In 2017, oil sands production was reported to be 2.7 MMBD compared to 1.5 MMBD of conventional oil production. 2 From a global perspective, it is estimated that the unconventional oil reserves (heavy oil, extra heavy oil, and bitumen) exceed 6 trillion barrels which is equivalent to about 70% of the total fossil fuels energy resources in the world. 3 Also, the global heavy oil and bitumen production is expected to increase from 13 to 18 MMBD by 2035. 4 However, because of the high viscosity of heavy oil and bitumen, traditional processes of conventional oil recovery are not usually applicable, and therefore, enhanced oil recovery methods are developed in the past few decades to economically produce heavy oil and bitumen.
There are primarily two methods involved when it comes to extracting oil sands and bitumen: the open-pit mining method and the in situ recovery. The in situ method can be further subdivided into thermal and nonthermal recovery methods.
The thermal in situ recovery processes that are widely used and economically viable are: steam assisted gravity drainage (SAGD), and cyclic steam stimulation (CSS). The nonthermal recovery methods include cold heavy oil production and vapor extraction.
In Canada, however, as 80% of the oil sands reserves are too deep for mining, it is only accessible via in situ recovery processes. Therefore, the viability of the SAGD process comes from the fact that most of the oil sands deposits that is too deep for mining is also too shallow for high pressure steam injection processes such as the CSS process. Hence, SAGD presents the most economically viable option and is considered the most widely used in situ recovery method in Canada. 5 Despite the commercial success of thermal in situ processes such as SAGD and CSS, they present multiple challenges including considerable supply cost, high-energy intensity, high GHG emissions, and substantial water consumption. 6 In the past two decades, operators have been experimenting on innovative technologies to reduce the high-energy intensity and water consumption of thermal in situ processes. Two processes which have recently gained increased industrial interest are the solvent-assisted process (SA-SAGD) and pure solvents injection process (Nsolv). SAGD with solvent coinjection was developed by Nasr and Isaacs in 2002. The fundamental mechanism of the process is that solvent condenses with steam around the cold formation interface of the steam chamber causing oil dilution and viscosity reduction. The compound effect of the solvent with heat has the potential to provide higher bitumen production rates that is equivalent or even higher than those of steam-only injection processes.
The current development and pilots of solvent-based processes are limited and that could be attributed to the lack of experimental data and thermophysical modeling of solvent/ bitumen systems. Therefore, fundamental understanding of the phase behavior and the effects of the solvent on bitumen thermophysical properties are not fully understood, and such knowledge is essential for proper modeling, design, and implementation of solvent-based recovery processes.
Several experimental studies have been carried out to study the phase behavior of solvent/bitumen mixtures, but most of these studies are limited to low temperature conditions, and therefore not quite applicable at in situ conditions. For the past 10 years, SHARP research consortium aimed at studying phase behavior of solvent/bitumen mixtures and experimentally determining the thermophysical properties of several solvents/bitumen mixtures for a wide range of temperature and pressure conditions. These properties include solubility, viscosity, and density of several solvent/bitumen systems.
Artificial intelligence has been applied to multiple aspects in the oil and gas industry and has further gained increased interest lately with the development of smart computational techniques. Below, we provide few applications of artificial intelligence with a focus on genetic algorithm (GA) used on thermophysical properties of heavy oil/solvent mixtures. Emera and Sarma 7 developed empirical correlations using a GA-based technique to predict CO 2 solubility, oil swilling factor, CO 2 /oil density, and CO 2 /oil viscosity for both dead and live oils. They found that the GA-based correlations yielded more accurate predictions when compared with other published correlations. Tatar et al. 8 applied the intelligent model named radial basis function network optimized by GA to estimate diluted heavy oil viscosity mixed with kerosene. The proposed model was found accurate in estimation of viscosity of the mixture for their target data. Recently, Daryasafar and Shahbazi 9 used an adaptive neurofuzzy interference system to predict the effect of methane, ethane, propane, butane, carbon dioxide, and n-hexane on the density of undersaturated Athabasca bitumen on wide ranges of operating conditions. Their proposed model showed superiority in predicting bitumen density at different conditions. Baghban et al. 10 modeled the viscosity of Athabasca bitumen with n-tetradecane using the least square support vector machine method. Finally, Rostami et al. 11 modeled bitumen/ntetradecane mixture density using gene expression programming as a function of solvent composition, pressure, and temperature. Their developed model predicts the mixture density with an AARD % of 0.3%.
The aim of this study is to model and develop generalized correlations for a 10 year set of experimental data of the SHARP research program, namely, solubility, density, and viscosity of several solvent/bitumen systems. These newly developed correlations are valid for the entire range of conditions and parameters investigated with high degree of accuracy. There have been no universal correlations developed in the literature that can accurately predict thermophysical properties of several solvent/bitumen mixtures, and therefore, these correlations can be utilized to understand the phase behavior and determine the thermodynamic properties of solvent-based recovery processes. Furthermore, these correlations can be readily used to develop fluid models for commercial numerical reservoir simulators and properly design solvent-based thermal recovery processes. This paper is structured as follows: first, the range of applicability and parameters used in the newly developed correlations is presented. Next, details of the methodology utilized to develop the correlations are discussed. Then, the three developed correlations for solubility, density, and viscosity is presented along with their accuracy metrics. Lastly, the correlations are further validated and compared with available experimental data from the literature.

METHODOLOGY
We use the software Eureqa 12 for the development of the empirical correlations in this study. Eureqa is a symbolic regression (SR) software using genetic programming, created by Cornell's Creative Machines Lab and later commercialized by Nutonian. SR is the technique of finding a symbolic function that describes a data set. The developed symbolic function/model is then used to predict response behaviors and facilitate the understanding of the interconnection between the data set-independent variables and the desired response variable. The SR technique of Eureqa software uses genetic programming (GP) in which a set of functions are allowed to breed and mutate with the genetic propagation into subsequent generations based upon a survival-of-the-fittest criteria. 13 The generic implementation of a GP is to first feed the program with training and validation data sets. The data splitting between training and validation can be customized as required. In this case study, the validation data set was selected randomly based on an internal algorithm of Eureqa software. Then, a random symbolic function generator creates solutions by combining operands and arithmetic operators (constant, variables, addition, subtraction, multiplication, and division). Other advanced arithmetic operators such as power, logarithms, and exponential functions can be included as well in the search of the optimal solutions. 14 The operands used in this study are: constants, variables, addition, subtraction, multiplication, division, logarithms, exponentials, power, and square root. The obtained solutions are then compared with the validation data set by using a chosen error metric such as mean absolute error, mean squared error, R 2 goodness of fit, correlation coefficient, and so forth. The error metric used in this case study is the mean absolute error. Unfit solutions are abandoned, and good solutions are retained and further combined/mutated with new subexpressions until further satisfactory solutions emerge. Several satisfactory solutions are provided in the GP process with different accuracy and complexity levels. The best solutions are found based on the Pareto front which describes accuracy versus complexity. 14 Figure 1 illustrates the concept of the SR process using genetic programming.
Even though the predictive ability of the developed symbolic model overall increases with increasing complexity, conventionally, simple solutions with generally reasonable accuracy and sound forms are strongly desired. In this study, conventional mathematical expressions known in the literature

ACS Omega
Article for solubility, density, and viscosity models were imposed into Eureqa software in order to develop physically sound predictive models with the least complex forms and the highest prediction accuracy achievable (>97%).

SHARP EXPERIMENTAL DATA
This study is implemented utilizing a 10-year set of comprehensive experimental data conducted by the SHARP research program at the University of Calgary which pertains to measuring thermodynamic properties of several solvent/ bitumen mixtures at a wide range of temperature and pressure setting. This research focuses on modeling the experimental measurements of solubility, density, and viscosity of solvent/ bitumen systems. Several experimental studies have been conducted on phase behavior of solvent/bitumen mixtures. 15−29 The experimental data set used in this study conducted on four bitumen samples, Surmont, JACOS, Mackay River, and Cold Lake, which covers a wide range of the diverse Athabasca bitumen characteristics. The solvents used with each of the bitumen samples are C 1 to C 5 , which represents the most commonly used vapor solvents for typical solvent-aided thermal recovery processes. A statistical summary of the temperature/pressure settings covered in the SHARP experimental studies along with the measured solubilities, densities, and viscosities are presented in Table 1 below.
The temperature and pressure ranges of SHARP experiments are shown in Figure 2 as a function of bitumen samples and solvents used. As noted from the figure, the temperature and pressure settings cover a wide range that is applicable for most in situ conditions of solvent-aided thermal recovery processes.

MODELING THE SOLUBILITY OF LIGHT
HYDROCARBON SOLVENTS/BITUMEN MIXTURES Conventionally, solubility is expressed as a function of temperature and pressure for each solvent independent of bitumen sample characteristics. This may not be a valid assumption especially in the case of heavy hydrocarbon solvents such as liquid propane, butane, and pentane, and nonhydrocarbon solvents such as dimethyl ether (DME). However, our intention was to develop a generalized correlation that predicts the solubility of gaseous hydrocarbon solvents (C 1 −C 3 , n-C 4 , and n-C 5 ) mixed with a typical Athabasca-type bitumen. Therefore, it is necessary to incorporate other parameters that relates solvents and bitumen characteristics into the developed correlation. The additional parameters selected to characterize the solvent type are acentric factor ω and critical temperature T c . For the bitumen sample, molecular weight of bitumen M b was chosen. Despite having many other parameters that can be utilized to characterize the solvent used and bitumen type, these specific parameters were selected as they are widely used and readily available in the literature. Therefore, the correlation developed can be easily implemented to predict the solubility of any light hydrocarbon solvents mixed with typical Athabasca bitumen with a known molecular weight.
4.1. Development of Solubility Correlation. Initially, to examine the influence of the independent parameters on solubility, all parameters were correlated individually against solubility. Figure 3 illustrates the correlation matrix plot for all the independent variables impacting solubility. The independent variables are basically the temperature, pressure, molecular weight of bitumen, and solvent type (characterized by the critical temperature or acentric factor of the solvent). As shown from Figure 3, the most influential parameter affecting solubility is the solvent type (defined by critical temperature of solvent) with a correlation coefficient of 0.62. The pressure parameter comes next in impacting the solubility of the solvent into bitumen with a correlation coefficient of 0.41. Temperature is slightly influential where it negatively impacts the solubility variable. The molecular weight of the Athabasca bitumen does not seem to influence the solubility and that could be attributed to the relatively low variability of the bitumen molecular weight data. The four bitumen samples used in the analysis are Surmont, JACOS, MacKay River, and Cold Lake with the molecular weight ranging between 500 and 550 g/g·mol.
Next, nonlinear regression was performed on the experimental solubility data set using genetic programming to find a generalized mathematical expression correlating the abovementioned independent variables with solubility. The following correlation was found to best describe solubility x s in terms of temperature and pressure settings and solvent characteristics (T c and ω). where x s is the mole fraction of the solvent in bitumen, ω is the solvent acentric factor, T is the temperature in K, T c is the solvent critical temperature in K, and P is the pressure in MPa.

Article
The total number of experimental data points used in the development of the solubility correlation (1) is 148. As noted from eq 1, it is a simple correlation yet accurate for predicting the solubility of light hydrocarbons (C 1 to C 5 ) in any typical Athabasca bitumen. The correlation match along with the error metrics associated is shown in Figure 4. As noted from the figure, R 2 goodness of fit is 0.968. The pressure and temperature ranges used for the developed correlation are within 322−463 K and 0.5−8.2 MPa. It is worth noting that the correlation is only valid for the temperature and pressure at which the solvent is in the gaseous state (i.e., T > T sat and P < P sat ).

Validation
Analysis of the Solubility Correlation. As mentioned previously, the GP software (Eureqa) splits the data set randomly into training and testing sets using an internal algorithm to avoid over fitting of the developed model. Therefore, as observed from Figure 4, the solubility correlation presents an R 2 accuracy of 0.968 which is also representative of the predictive goodness of fit. However, to further test the robustness of the developed solubility correlation and validate its accuracy, a comprehensive experimental data available in the literature were collected and compared against the developed solubility correlation. It is worth emphasizing here that none of the solubility data collected from the literature were used in the development of the solubility model.
The total number of data points collected from the literature and used in the validation analysis which are presented in Figure 5 is 185 points. 30−37 As shown from Figure 5, the R 2 goodness of fit is relatively lower than the R 2 obtained previously and that can be attributed to the fact that several of the data points found from the literature are outside the range of applicability of the solubility model as shown in Table 2. Also, another important reason for the discrepancy in R 2 goodness of fit is due to the inherent error associated with the measured solubility results owing to the difference in the experimental setups and/or human interpretation of the measured data. For example, Mehrotra and Svrcek 31 measured

ACS Omega
Article the solubility of methane in Athabasca bitumen at a temperature of 373 K and pressure of 7.82 MPa to be 0.2344, whereas the respective solubility measured by the SHARP experiments for methane at the same temperature and pressure is 0.27. Despite having a lower R 2 goodness of fit of 0.86, the model is still fairly accurate and robust to be used in the prediction of solubility data in the absence of experimental measurements with only ∼13% margin of error. A summary of the validation data set collected from the literature is presented in Table 3 along with the associated R 2 goodness of fit for each study.
It is worth noting from Table 2 that the study conducted by Mehrotra and Svrcek 36 is based on the Wabasca bitumen sample which has a reported molecular weight of 446.5 g/g· mol that is lower than the range of applicability of the developed solubility correlation. Nonetheless, the developed solubility correlation provides fairly accurate predictions with less than 12% margin of error. The same can be said about the studies conducted by Mehrotra and Svrcek 35 and Badmachi-Zadeh et al.; 37 the range of temperature and pressure settings for most of the data points in these studies is outside the range of applicability of the developed correlation, yet the solubility correlation provides accurate predictions with less than 13% margin of error. This demonstrates the versatility of the developed correlation in predicting solubilities even for data points outside applicable ranges of temperature, pressure, and bitumen sample types.
Furthermore, it is possible to use the solubility correlation presented to model new experimental solubility measurements for other solvents or bitumen samples. A parametric correlation suggested to model any solubility data with parameters a to j to be regressed on the new experimental data.
For example, Mehrotra and Svrcek 33 measured the solubility of CO 2 in bitumen and to show the adaptability of correlation given by eq 2, their measured data were modeled using the suggested correlation. Parameters a to j were regressed and optimized to model their experimental data. Figure 6 shows the results of the regression in which the R 2 goodness of fit obtained in this case is 0.976. Although the solubility correlation was developed for hydrocarbon solvents, the correlation is capable of modeling nonhydrocarbon solvents as well. It should be mentioned that the proposed correlation (eq 2) are applicable to predict solubility of vapor and gaseous CO 2 in bitumen. It cannot be implemented in the TP regions where CO 2 is in the liquid phase and forms liquid−liquid equilibrium systems with bitumen.
To further demonstrate the adaptability of the solubility correlation (2) in modeling the solubility of other solvents in bitumen, Haddadnia et al. 39 measured the solubility of DME in Athabasca bitumen for a wide range of temperatures and pressures. Their experimental data were modeled using correlation (2) in which parameters a to j were regressed and optimized. Figure 7 shows the results of the regression for which the R 2 goodness of fit obtained in this case is 0.984.

MODELING THE DENSITY OF LIGHT
HYDROCARBON SOLVENTS/BITUMEN MIXTURES The experimental density measurements were modeled using the following equation   Figure 6. Modeling CO 2 solubility in Athabasca bitumen 38 using the correlation given by eq 2.

ACS Omega
Article where ρ m is the mixture density in kg/m 3 , w s is the mass fraction of the solvent in bitumen, and ρ s and ρ b are the densities of the solvent and bitumen components in kg/m 3 , respectively. Equation 3 is developed based on the assumption of no volume change upon mixing. In this case, it is assumed that the volumes of the individual components are additive, and the mixture is a regular solution. Even though solvent/ bitumen mixtures are not considered regular solutions and exhibit volume change upon mixing, eq 3 provides reasonable density predictions based on the work of several researchers. Saryazdi, 40 for example, reported 1% AARD in predicting mixture densities for ethane, propane, and n-butane as dissolved gas and n-decane, toluene, and cyclooctane as the heavier liquid component through using the no volume change upon mixing rule (eq 3). Also, Kariznovi 29 used eq 3 in predicting n-heptane with the Surmont bitumen mixture and reported that the results are in agreement with the measured data within 1.12% AARD.
First, to study the effect of the independent parameters on density variables, all parameters were correlated individually against density. Figure 8 illustrates the correlation matrix plot for all the independent variables that may affect density. The independent variables are the temperature, pressure, molecular weight of bitumen, solvent mass fraction, and solvent characteristics (critical temperature/acentric factor of the solvent). As shown in Figure 8, the most influential parameter affecting density is the mass fraction of the solvent with a correlation coefficient of −0.87. The solvent type (critical temperature) is the second highest influential parameter with a correlation coefficient of −0.7. The temperature variable comes third. As noted from Figure 7, all influential parameters are negatively correlated with density except for the molecular weight of bitumen.
5.1. Modeling Liquid Density of Light Hydrocarbon Pure Components (C 1 −C 5 ). One of the challenges encountered when modeling mixture densities is handling light/gaseous solvents. At the pure state, the gas (solvent) densities is well defined, but when mixed with heavier components, they are more like a liquid. 40 This usually results in underprediction of mixture densities. Therefore, several approaches were suggested by researchers such as accounting for the excess volume in the prediction of the densities 40 and/ or using the concept of "effective" liquid density 41 to predict the densities of light/gaseous solvents (ρ s ) when mixed with heavier liquid components. The "effective" liquid density is a hypothetical liquid density for the gas solvent component when it is part of the liquid mixture.
The approach used to handle the prediction of density of gaseous solvents (ρ s ) in this study is different from the methods available in the literature and discussed above. Actual liquid densities of pure light hydrocarbon components (C 1 to n-C 5 ) were taken from the NIST database. 42 A correlation was developed for the obtained liquid densities of light hydrocarbon components using genetic programing. The developed Figure 7. Modeling DME solubility in Athabasca bitumen 39 using the correlation given by eq 2. Figure 8. Correlation matrixindependent parameters impacting density variableblue = C 1 , red = C 2 , green = C 3 , purple = n-C 4 , and black = n-C 5 .

ACS Omega
Article correlation can be utilized to predict pseudo liquid densities of light hydrocarbon components beyond the temperature and pressure conditions at which the component is in the liquid state. For example, propane at a pressure of 1 MPa is in the liquid state at temperatures between 90.5 and 300 K. The developed correlation predicts the liquid density of propane at this pressure and temperature range with an AARD of 2%. The same correlation is applied to temperatures beyond 300 K when the gaseous solvent is dissolved in a liquid mixture to predict the pseudo liquid density. Figure 9 illustrates the actual propane density obtained from NIST at a pressure of 1 MPa. The developed correlation is used to predict the actual liquid density of propane as shown in the solid black curve. The red curve shows the pseudo liquid densities of propane for temperatures beyond 300 K using the same developed correlation.
The generalized correlation developed to model the liquid densities of light hydrocarbon components (C 1 to n-C 5 ) to be used in prediction of mixture density is given by where T and T c are the temperature of the system and critical temperature of the solvent in K, respectively, and ρ s is the density of the pure solvent in kg/m 3 . This correlation is valid in predicting the liquid densities of pure C 1 to n-C 5 with an AARD of 3.29% and R 2 goodness of fit of 0.944. Figure 10 below shows the actual liquid densities of C 1 to n-C 5 versus the predicted densities along with the associated error metrics.

Modeling Density of Pure Bitumen.
A generalized correlation was developed to predict the density of pure bitumen (ρ b ) using genetic programming, Eureqa software. The density of four bitumen samples measured in the SHARP research program were used in the development of the bitumen density correlation: Surmont, MacKay River, JACOS, and Cold Lake. The range of temperature and pressure of the experimental measurement of pure bitumen samples are 308.15−451.15 K and 1.12−10.43 MPa, respectively. The reported molecular weights of the bitumen samples are 540, 513, 530, and 534 for Surmont, MacKay River, JACOS, and Cold Lake, respectively. It should be noted that these values have been measured using the freezing point depression method with benzene solvent. The obtained correlation is given by where M b is the molecular weight of the bitumen sample in g/ g·mol, T and P are temperatures in K and pressure in MPa, respectively, and ρ b is bitumen density in kg/m 3 . As noted from the developed correlation (5), it is quite simple with only three parameters: T, P, and M b . However, this correlation is highly accurate with an R 2 goodness of fit of 0.986 and an AARD of 0.228%. Figure 11 shows the accuracy plot of all bitumen density measurements versus predicted ones using eq 5, along with the error metrics associated.

Modeling Mixture Densities of Solvent/Bitumen Binary Systems.
Having modeled pure solvent densities and pure bitumen densities (eqs 4 and 5), now it is possible to predict experimental densities of solvent/bitumen mixtures using eq 3 presented previously. By combining eqs 4 and 5 with eq 3, the complete generalized correlation to predict mixture densities of light hydrocarbon solvents/bitumen binary systems is given by The developed correlation (6) was found to best describe density in terms of temperature and pressure, solvent mass fraction, molecular weight of bitumen, and critical temperature of solvent. As noted from correlation 6, it is a simple correlation with two separate terms for calculation of the pseudo liquid density of solvent and pure bitumen densities.

Article
The generalized correlation (6) is accurate for predicting the mixture densities of light hydrocarbons (C 1 to n-C 5 ) dissolved in any typical Athabasca bitumen. The correlation match along with the error metrics associated is shown in Figure 12. Again, the pressure and temperature ranges used for the developed correlation are within 322−463 K and 0.5−8.2 MPa, respectively. It is worth noting that the correlation is only valid for the temperature and pressure at which the solvent is in the gaseous state (i.e., T > T sat and P < P sat ).
The total number of experimental data points used in the development of the mixture density correlation (6) and illustrated by Figure 12 is 186 which comprised four bitumen samples with molecular weight ranging from 500 to 550 g/g· mol and five hydrocarbon solvents, C 1 to n-C 5 . As noted from Figure 12, the R 2 goodness of fit obtained is 0.97 and the associated AARD is 0.92%. 5.4. Validation Analysis of the Density Correlation. As noted earlier, the GP software (Eureqa) splits the data set randomly into training and testing sets using an internal algorithm to avoid over fitting of the developed model. Accordingly, as noted from Figure 12, the density correlation (6) gives an R 2 accuracy of 0.975 that is for both sets of data: training and testing sets. However, to further assess the robustness of the developed density correlation and confirm its accuracy in prediction, a comprehensive experimental data available in literature were collected and compared against the developed density correlation. It is worth emphasizing here that none of the density data collected from the literature were used in the development of the density correlation.
The total number of data points collected from the literature and used in the validation analysis for density which are presented in Figure 13 is 169 data points. As shown in Figure   13, the R 2 accuracy for prediction is 0.894 which is lower than the R 2 obtained previously in Figure 12. This can be attributed to the fact that several of the data points found from literature are outside the range of applicability of the density model as shown in Table 3. Also, another significant reason for the discrepancy in R 2 prediction accuracy is the inherent error associated with the experimental density measurements due to the difference in either the experimental setups and/or human interpretation when measuring the data. Despite having a lower R 2 prediction accuracy of 0.894, the model is still fairly robust to predict mixture densities in the absence of experimental measurements with only ∼10% margin of error. A summary of the validation data set collected from the literature is presented in Table 2 below along with the associated AARD for each study.
It is worth noting from Table 3 that the study conducted by Mehrotra and Svrcek 36 is based on the Wabasca bitumen sample which has a reported molecular weight of 446.5 g/g· mol that is lower than the range of applicability of the developed solubility correlation. Nonetheless, the developed density correlation provides very accurate predictions with only 0.76% AARD. The only data set presented a poor prediction is Mehrotra and Svrcek 35 where the obtained AARD is 3%. This could be attributed to the difference in bitumen sample composition and characteristics as they used the Peace River bitumen sample which was not part of the data sets used in the development of the density correlation. Overall, the validation data sets demonstrate the versatility of the developed correlation in predicting densities even for data points outside applicable ranges of temperature, pressure, and bitumen sample types.
In addition, it is possible to use the density correlation presented to model new experimental density measurements for other solvents or bitumen samples. Equation 7 below presents the parametric correlation suggested to model any density data with parameters a 0 to a 4 and b 0 to b 4 to be regressed on the new experimental data.
For example, Haddadnia et al. 39 measured the density of DME in Athabasca bitumen for a wide range of temperatures and pressures, and to show the adaptability of correlation given by eq 7, parameters a 0 to a 4 and b 0 to b 4 were regressed and optimized to model their experimental data. The R 2 goodness of fit obtained is 0.99 and the correlation of best fit is given by a 0 = −76.9, a 1 = 0.16, a 2 = −5.14 × 10 3 , a 3 = 12.7, a 4 = 175, b 0 = −24.2, b 1 = 16.3, b 2 = 6.45 × 10 3 , b 3 = 42.1, b 4 = 25.8. Although the parametric density correlation (7) was developed for light hydrocarbon solvents, the correlation is capable of modeling other solvents as well with great accuracy.

MODELING THE VISCOSITY OF LIGHT
HYDROCARBON SOLVENTS/BITUMEN MIXTURES The experimental viscosity measurements were modeled using the fourth root mixing rule, 43 which is extensively used in refinery calculations where μ m is the mixture viscosity in cP, w s is the mass fraction of solvent in bitumen, μ s and μ b are the viscosities of the

ACS Omega
Article solvent and bitumen components in cP, respectively. The fourth root mixing rule, also known by the quarter-law mixing rule, was found to provide superior results when compared to the typically used logarithmic-based mixing rule. The same process was performed on the viscosity experimental data set in which the impact of all considered independent variables on the viscosity was examined. All variables were correlated individually against viscosity. Figure  14 illustrates the correlation matrix plot for all the independent variables that may affect viscosity. The independent variables are the temperature, pressure, molecular weight of bitumen, solvent mass fraction, and solvent characteristics (critical temperature/acentric factor of the solvent). As shown in Figure 14, the most influential parameter affecting viscosity is the temperature of the system with a correlation coefficient of −0.44. Solvent type (critical temperature) is the second highest influential parameter with a correlation coefficient of −0.31. As noted from Figure 14, all independent parameters are negatively correlated with viscosity except for the molecular weight of bitumen.
6.1. Modeling Liquid Viscosity of Light Hydrocarbon Pure Components (C 1 −C 5 ). Similar challenge as density property is present when modeling the viscosities of pure light hydrocarbon solvents. At the pure state, the gaseous solvents viscosity is well defined, but when mixed with heavier components, they are more like a liquid. Therefore, it is mandated to estimate pseudo liquid viscosities of gaseous solvents dissolving in bitumen.
There are several correlations proposed in the literature to estimate pseudo liquid viscosities for solvent/bitumen mixtures which were found not to be applicable for a wide range of temperature and pressure settings. For example, Zirrahi et al. 16 reported a correlation that can be used in estimating the pseudo liquid viscosities of solvents dissolving in bitumen. It assumes that the effective solvent viscosity in the oleic phase is a function of temperature and pressure of the system: μ s = d 1 + d 2 T + d 3 T 2 + (d 4 + d 5 T)P where T and P are the absolute temperature and pressure in K and MPa, respectively. It was found that their correlation is not valid for prediction of the pseudo liquid viscosity of solvents for temperatures above 450 K.
In this study, a new correlation is proposed to predict the pseudo liquid viscosities of light hydrocarbon components (C 1 to C 5 ). The developed correlation is based on actual liquid viscosities which were taken from the NIST database. 42 Through utilizing genetic programming (Eureqa software), the developed correlation is capable of predicting actual liquid viscosities of light hydrocarbon component and can further be extended to predict pseudo liquid viscosities beyond the temperature and pressure conditions of the components' liquid state. The developed correlation is given by where μ s is the viscosity of solvent in cP, and T and T c are the temperatures of the system and critical temperature of solvent in K, respectively. As noted from the developed correlation, it is only a function of temperature and critical temperature of the solvent type. It is worth emphasizing herein that it is not the objective to model liquid viscosities of pure solvent components. However, in order to model and predict mixture viscosities of solvent/bitumen systems, pseudo liquid viscosities of solvents (μ s ) need to be defined. The developed correlation (9) predicts actual viscosities for C 1 to n-C 5 with an R 2 accuracy of 0.90. As noted, the prediction is not highly accurate. However, as mentioned previously, it is not the intent to perfectly match the pure solvent liquid viscosities, but to define a good fit and sound representation of pseudo liquid viscosities that will provide an accurate prediction of mixture viscosities for solvent/bitumen systems.
6.2. Modeling Viscosity of Pure Bitumen. A generalized correlation was developed to predict the viscosity of pure bitumen (μ b ) using SR software Eureqa which utilizes genetic programming. Four bitumen samples experimentally measured in the SHARP research program were used in the development of the bitumen viscosity correlation: Surmont, MacKay River, Figure 14. Correlation matrixindependent variables impacting viscosityblue = C 1 , red = C 2 , green = C 3 , purple = n-C 4 , and black = n-C 5 .

ACS Omega
Article JACOS, and Cold Lake. The ranges of temperature and pressure of the experimental measurement of pure bitumen samples are 308. 15 (10) where μ b is the viscosity of pure bitumen in cP and T is the temperature in K and P is the pressure in MPa. As noted from the developed correlation (10), it is simple with only the temperature and pressure parameters as the independent variables. However, the correlation is highly accurate to predict pure bitumen viscosities with an R 2 goodness of fit of 0.985 and an AARD of 3.5%. Figure 15 shows the accuracy plot of all bitumen viscosity measurements versus predicted ones using eq 10, along with the error metrics associated ( Figure 16).
6.3. Modeling Mixture Viscosities of Solvent/Bitumen Binary Systems. Correlations 9 and 10 that were developed to predict the viscosities of pure solvent and pure bitumen components can be combined using the fourth root mixing role (eq 8) in order to predict experimental mixture viscosities of solvent/bitumen systems. The complete generalized correlation to predict mixture viscosities of light hydrocarbon solvents/bitumen binary systems is given by The generalized correlation (11) was found to best describe viscosity in terms of temperature, pressure, solvent mass fraction, and critical temperature of solvent (T c ). As noted from eq 11, it is a simple correlation where it is composed of two separate terms for predicting solvent pseudo liquid and pure bitumen viscosities. Nonetheless, it is quite accurate for predicting the mixture viscosities of light hydrocarbons (C 1 to n-C 5 ) dissolved in any typical Athabasca bitumen. The correlation match along with the error metrics associated is shown in Figure 17. The pressure and temperature ranges used for the developed correlation are 322−463 K and and 0.5−8.2 MPa, respectively. It is worth noting that the correlation is only valid for the temperature and pressure at which the solvent is in the gaseous state (i.e. T > T sat and P < P sat ).
It is worth noting that the viscosity plots are presented in a logarithmic base for ease of visualization and proper statistical analysis. The total number of experimental data points used in the development of the mixture viscosity correlation (11) and illustrated by Figure 17 is 186 points which covers four bitumen samples ranging from 500 to 550 g/g·mol and five hydrocarbon solvents: C 1 to n-C 5 .
6.4. Validation Analysis of the Viscosity Correlation. To assess the robustness of the developed viscosity correlation and confirm its accuracy in the prediction of solvent/bitumen mixture viscosity, a comprehensive experimental data available in the literature were collected and statistically analyzed against the developed viscosity correlation. It is worth emphasizing here that none of the viscosity data collected from the literature were used in the development of the viscosity correlation.
The total number of data points collected from the literature and used in the validation analysis for viscosity predictions which are presented in Figure 18 is 169 points. The R 2 coefficient is 0.94 which is lower than the R 2 obtained previously in Figure 17. The discrepancy is due to the range of some of the data points collected being outside the range of applicability of the developed correlation. In addition, the inherent error/discrepancy associated with the experimental viscosity measurements of other researchers when compared to the SHARP studies affect the accuracy of the prediction results. Notwithstanding, the model is still robust to predict mixture viscosities and dead bitumen viscosities in the absence of experimental measurements with only ∼6% margin of error. A summary of the validation data set collected from the literature is presented in Table 3 below along with the associated AARD for each study (Table 4).
Overall, the validation data sets demonstrate that the developed correlation can fairly predicts the viscosities of different solvent/bitumen systems even for datasets outside the range of applicability of the developed correlation.
In addition, it is possible to use the viscosity correlation presented to model new experimental viscosity measurements for other solvents or bitumen samples. Equation 12 presents the parametric viscosity correlation suggested to model any viscosity data with parameters a 1 to a 4 and b 1 to b 3 to be regressed on the new experimental data.

SUMMARY AND CONCLUSIONS
This study presented three generalized correlations for modeling solubility, density, and viscosity of light hydrocarbon/bitumen systems. The generalized correlations were developed using SR analysis based on genetic programming (Eureqa software). The generalized correlations for solubility, density, and viscosity were developed utilizing a 10-year set of comprehensive experimental data conducted in the SHARP research program which comprises four bitumen samples ranging from 500 to 550 g/g·mol and five hydrocarbon solvents, C 1 to n-C 5 . The developed correlations are valid for a wide range of temperature and pressure setting which covers the conditions of thermally based solvent-aided recovery processes. Therefore, the generalized correlations can be utilized for proper design, modeling, and implementation of solvent-aided thermal recovery processes. The developed solubility correlation (1) is only a function of four parameters: temperature, pressure, solvent critical temperature, and acentric factor. Yet, it provides accurate predictions of solubilities when validated against a large list of experimental measurements available in the literature. The solubility correlation is versatile enough to predict solubilities for data sets outside the range of applicability of the developed correlation with the 13% margin of error.
The density correlation (6) was developed based on the assumption of no volume change upon mixing and therefore it is essentially composed of two separate and additive parts. The first part of the correlation (4) is for predicting pseudo solvent liquid densities and the second part (5) is for predicting pure bitumen densities. In general equation, (6) provides robust prediction of mixture densities in the absence of experimental measurements with only ∼10% margin of error when compared to a large list of experimental measurements available in literature.
The viscosity correlation was developed using the fourth root mixing rule (8). This rule was found to provide better

ACS Omega
Article prediction results when compared to typical logarithmic based mixing rules used extensively in modeling viscosity measurements. Analogous to the density correlation, the viscosity correlation (11) is composed of two separate parts for predicting pure solvent pseudo liquid and pure bitumen viscosities. The viscosity correlation (11) provides accurate prediction of mixture viscosities with ∼6% margin of error when validated against a large list of experimental viscosity measurements available in literature.