Multi-Variable Multi-Metric Optimization of Self-Assembled Photocatalytic CO2 Reduction Performance Using Machine Learning Algorithms

The sunlight-driven reduction of CO2 into fuels and platform chemicals is a promising approach to enable a circular economy. However, established optimization approaches are poorly suited to multivariable multimetric photocatalytic systems because they aim to optimize one performance metric while sacrificing the others and thereby limit overall system performance. Herein, we address this multimetric challenge by defining a metric for holistic system performance that takes multiple figures of merit into account, and employ a machine learning algorithm to efficiently guide our experiments through the large parameter matrix to make holistic optimization accessible for human experimentalists. As a test platform, we employ a five-component system that self-assembles into photocatalytic micelles for CO2-to-CO reduction, which we experimentally optimized to simultaneously improve yield, quantum yield, turnover number, and frequency while maintaining high selectivity. Leveraging the data set with machine learning algorithms allows quantification of each parameter’s effect on overall system performance. The buffer concentration is unexpectedly revealed as the dominating parameter for optimal photocatalytic activity, and is nearly four times more important than the catalyst concentration. The expanded use and standardization of this methodology to define and optimize holistic performance will accelerate progress in different areas of catalysis by providing unprecedented insights into performance bottlenecks, enhancing comparability, and taking results beyond comparison of subjective figures of merit.

This is attributed to self-quenching being faster for RubpyC17 assembled into micelles (higher local concentration) compared to freely diffusing Rubpy. Adding 3 CMC of C12E6 surfactant increases the Ru* lifetime for RubpyC17 (τ1 50 ns, 62%, τ2 503 ns, 38%; Fig S5, Table S2), with the effect attributed to the surfactant mitigating self-quenching by forming mixed RubpyC17/surfactant micelles that result in longer Ru-Ru distances (decrease in local concentration). The RubpyC17* lifetime shows quencher concentration dependence, thereby excluding the possibility of static quenching by ascorbate electrostatically assembled at the charged micelle surface (Fig S6, Table S3).
In the presence of the reductive quencher, NaAsc, the absorption peak at 510 nm enables quantification of Ru⁻ and determination of its lifetime. The intensity of this peak is unchanged for Rubpy in water, phosphate buffer and C12E6 surfactant (40 mOD), while RubpyC17 shows ~58% less Ru⁻ forming in water and buffered media (17 mOD; Fig S7 and Table S4). This is expected as most Ru* decays through self-quenching before it can be quenched to Ru⁻ by ascorbate. The inclusion of C12E6 surfactant increases the reduced RubpyC17 concentration to 75% of the diffusionally free Rubpy value (30 vs 40 mOD), most probably due to the longer Ru* lifetime enabling more centers to be quenched.
Both Rubpy and RubpyC17 show single exponential decay of Ru⁻ (Fig S8, Table S5), yet the Rubpy lifetime is below 100 µs in water, phosphate buffer and C12E6 solution (72, 62 and 16 µs), whereas RubpyC17 shows a lifetime >200 µs under these conditions (247, 216 and 282 µs). These lifetimes are further increased when catalyst is added into the micelles (without changes to Ru⁻ yield), with τ for RubpyC17/C12E6 micelles increasing 33% (282 to 375 µs) while Rubpy shows no change (Fig S8, Table S5). This leads to a 23-fold longer Ru⁻ lifetime for RubpyC17 in C12E6 micelles with catalyst compared to the standard Rubpy (375 vs 16 µs). The RubpyC17⁻ lifetime at 510 nm was not sensitive to catalyst loading from 1.5 to 5 µM (375 ± 13 vs 370 ± 6 µs; Fig S11A), nor sensitive to switching from Ar to CO2 saturated media (Fig S11B). Charge separated states from direct photoexcitation of the catalyst were not observable on the ns-timescale employed.
Contrasting the RubpyC17/C12E6 system against the catalytically inactive cationic and anionic surfactants shows that the photoexcited Ru* lifetime is higher in cationic CTAC, while SDS prevents self-quenching as effectively as using freely diffusing Rubpy (Fig S5, Table S2). Cationic CTAC (note chloride salt used rather than bromide, so bromide quenching can be excluded) resulted in RubpyC17 showing a double-exponential decay like C12E6, but the weightings for the two processes were inverted to give more Ru* a longer lifetime (τ1 67 ns, 28%, τ2 585 ns, 72% for CTAC compared to τ1 31 ns, 63%, τ2 394 ns, 37% for C12E6). This trend extended with RubpyC17/SDS, where only the slower decay pathway of Ru* was observed and the lifetime was comparable to freely diffusing Rubpy (τ 684 ns for RubpyC17/SDS vs 667 ns for Rubpy in SDS solution). This indicates that SDS very effectively prevents self-quenching of Ru* in the micelles. The RubpyC17* lifetime was dependent on the NaHAsc concentration in CTAC, but independent of it in SDS, as there was no lifetime change compared to freely diffusing Rubpy absent NaAsc (Fig S6).
The yield of RubpyC17⁻ was 17% lower with CTAC than C12E6 (25 vs 30 mOD; Fig S7), whereas negligible RubpyC17⁻ was formed in SDS micelles (4 mOD), presumably due to electrostatic repulsion between the negative charges of surfactant and reductant preventing formation of a 'solvent cage', which would explain the catalytic inactivity of the SDS micelles. The RubpyC17⁻ lifetime in C12E6 micelles is 2.4-fold higher than in CTAC (282 vs 119 µs), which may explain the lack of catalytic turnover in CTAC, with electron transfer to the catalyst being limiting.

Calculation of Number of Points in Parameter Space (Simplex Size).
The concentration of each variable is continuous and must be discretized into a fixed number of points. Testing 10 different concentrations of each variable would give reasonable resolution of performance peaks. With five different solutions, each tested at 10 concentrations in all combinations, this gives 10^5 (100,000) combinations.
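The count above can be checked with a few lines of Python. This is only an illustrative sketch: the constant names are hypothetical, and the levels are represented by indices 0 to 9 rather than by actual concentrations.

```python
from itertools import product

N_VARIABLES = 5   # catalyst, photosensitizer, surfactant, reductant, buffer
N_LEVELS = 10     # discretized concentrations tested per variable

# Closed-form size of the full factorial grid.
n_combinations = N_LEVELS ** N_VARIABLES

# The same grid enumerated lazily as tuples of level indices.
grid = product(range(N_LEVELS), repeat=N_VARIABLES)
n_enumerated = sum(1 for _ in grid)

print(n_combinations, n_enumerated)
```

Both numbers agree at 100,000, which is the size of the parameter space that exhaustive testing would have to cover.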

Cross-validation of Regression Model.
The widely recognized k-fold cross-validation approach was employed to identify the most effective hyperparameters for the regression models, with k = 5. The dataset of 103 samples, each containing 5 features along with the target property to be predicted, was divided into a training set (72 samples) and a testing set (31 samples). The training set was further divided into five subsets, or folds. All possible combinations of (i) the number of estimators, selected from [100, 200, 500, 1000, 2000], and (ii) the number of features to retain for the optimal split, chosen from ['auto', 'sqrt', 'log2'], were examined, resulting in 15 distinct hyperparameter combinations. For each hyperparameter combination, the model was trained on four folds while its performance was assessed on the fifth fold, serving as the validation set. This cross-validation process was repeated five times, with each iteration using a different fold as the validation data and the remaining four as training data. Following this, the average performance across all five validation folds was calculated for each hyperparameter combination, and the best-performing combination was chosen. Using the top-performing hyperparameter combination, the final model was trained on all five folds, and the trained model was then used to predict the target property on the testing set.
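The bookkeeping of this procedure can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names, the random seed, and the `score_fn` callable (which stands in for training and scoring the regression model, with higher scores taken as better) are assumptions introduced here.

```python
import itertools
import random
from statistics import mean

N_SAMPLES, N_TRAIN, K = 103, 72, 5

# Hyperparameter grid as listed in the text (5 x 3 = 15 combinations).
N_ESTIMATORS = [100, 200, 500, 1000, 2000]
MAX_FEATURES = ["auto", "sqrt", "log2"]
GRID = list(itertools.product(N_ESTIMATORS, MAX_FEATURES))


def split_train_test(n_samples, n_train, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]


def make_folds(indices, k):
    """Partition indices into k folds whose sizes differ by at most one."""
    return [indices[i::k] for i in range(k)]


def select_hyperparameters(train_idx, k, grid, score_fn):
    """Return the grid entry with the best mean validation score.

    score_fn(params, fit_idx, val_idx) is a hypothetical callable that
    trains a model on fit_idx and returns its score on val_idx.
    """
    folds = make_folds(train_idx, k)
    mean_scores = {}
    for params in grid:
        scores = []
        for i, val_fold in enumerate(folds):
            # Train on the four folds that are not the validation fold.
            fit_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            scores.append(score_fn(params, fit_idx, val_fold))
        mean_scores[params] = mean(scores)
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores
```

After selection, the model would be refit on all 72 training samples with the chosen pair and evaluated once on the 31 held-out samples. In scikit-learn the same loop is commonly expressed with `GridSearchCV` using `cv=5`; whether the authors used it is not stated, and the 'auto' option may not be accepted by recent scikit-learn releases.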

Control Group Feature Analysis.
The maximum performance lies on a peak that appears particularly sensitive to small changes, but alternative parameter combinations can also be found in more stable regions. For the fourth-best performing combination, for example, an alternative combination of parameters lowers the catalyst loading by 30% while requiring 7% less photosensitizer (Catalyst 2.7 vs 3.8 µM, Photosensitizer 88 vs 95 µM, Surfactant 17 vs 14 CMC, Reductant 162 vs 136 mM, Buffer 262 vs 305 mM). Implementation details are shown below.
To find points with the same performance but different concentrations of the original features x1, …, x5, we solved the following system of two implicit equations: […] S9), the exponents are those used to construct the optimised features for objective function 1 (namely, the first two rows in Table S10), const1 and const2 are the first and second coordinates in the optimised features chart of the point we want to find an alternative combination for (namely, x1,obj1,norm and x2,obj1,norm in Fig. S13), and the maximum and minimum values are those observed over the training set for the non-normalised mixed features of objective function 1 (namely, the first two rows in Table S11).
We aimed to find the values of the five coefficients, one per original feature, such that the two equations were solved within an error of 1e-4 on the value of each constant, imposing a Catalyst coefficient of 0.7 and a Photosensitiser coefficient below 1.
In particular, for the best point they become […]. With the Catalyst coefficient fixed at 0.7, we explored all possible combinations of the remaining four coefficients from 0.7 to 1.2 with a step of 0.005; the two proposed combinations are the ones minimizing the Photosensitiser coefficient.
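The search described above can be sketched as a brute-force grid scan. Because the equations themselves did not survive in this copy, the functional form used here is an assumption inferred from the captions of Tables S10 and S11: each mixed feature is taken to be a product of the concentrations raised to fixed exponents, min-max normalised over the training set, and each coefficient is taken to multiply the corresponding original concentration. All function and argument names are hypothetical.

```python
import itertools
import math


def mixed_feature_norm(conc, exponents, lo, hi):
    """Min-max-normalised power combination of concentrations (assumed form)."""
    raw = math.prod(c ** a for c, a in zip(conc, exponents))
    return (raw - lo) / (hi - lo)


def find_alternatives(conc, exps, ranges, cat_coeff=0.7,
                      start=0.7, stop=1.2, step=0.005, tol=1e-4):
    """Scan coefficient combinations that reproduce both mixed features.

    conc   : the five original concentrations (catalyst first, then
             photosensitiser, surfactant, reductant, buffer)
    exps   : two exponent tuples, one per mixed feature (cf. Table S10)
    ranges : two (min, max) pairs of the raw mixed features (cf. Table S11)
    """
    # Targets are the normalised mixed features of the reference point.
    targets = [mixed_feature_norm(conc, e, *r) for e, r in zip(exps, ranges)]
    n_steps = round((stop - start) / step)
    # Rounding avoids floating-point drift in the grid values.
    grid = [round(start + i * step, 10) for i in range(n_steps + 1)]
    hits = []
    for rest in itertools.product(grid, repeat=len(conc) - 1):
        if rest[0] >= 1.0:  # photosensitiser coefficient must stay below 1
            continue
        coeffs = (cat_coeff, *rest)
        scaled = [k * c for k, c in zip(coeffs, conc)]
        values = [mixed_feature_norm(scaled, e, *r) for e, r in zip(exps, ranges)]
        if all(abs(v - t) <= tol for v, t in zip(values, targets)):
            hits.append(coeffs)
    # Smallest photosensitiser coefficient first.
    return sorted(hits, key=lambda k: k[1])
```

At the stated step of 0.005 the four free coefficients span 101^4 (about 1.04 × 10^8) combinations, so a pure-Python loop of this kind is slow; a coarser step is sufficient for trying the idea out, and a vectorised search would be the practical choice at full resolution.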
Supplementary Tables

Fig S14: Correlation pair-plot matrix for all possible pairs of parameters and performance metrics for all 103 experimentally tested combinations. Clear linear correlations are observed between TOF and TON as well as Yield and QY, as in each pair one is the other divided by a constant. The correlation coefficients for each plot are shown in a matrix in Fig S15.

Fig S15: Correlation matrix with linear correlation coefficients for all possible pairs of parameters and performance metrics, calculated over the 103 experimentally tested combinations. Clear linear correlations are observed between TOF and TON as well as Yield and QY, as in each pair one is the other divided by a constant.

Fig S17: Control Group Feature analysis comparing training sets using the top two features according to the SHAP ranking and using the 'optimized features' that are obtained as power combinations of the catalyst, photosensitizer, surfactant, reductant and buffer concentrations, and exemplifying the predictive power with the testing set. Results are reported for (a) y_obj1, (b) COform, (c) TONCO, (d) TOFCO, (e) QY. When plotted against mixed features, y_obj1 shows optimal values for x1,obj1,norm and x2,obj1,norm of about 0.8 and 0.5, respectively; COform shows optimal values for x1,COform,norm and x2,COform,norm of about 0.3; QY shows optimal values for x1,QY,norm and x2,QY,norm below 0.2. No clear indication appears for TONCO and TOFCO.

Table S7: Tabulated values from learning algorithm optimization.a
a Solutions CO2-sat. (pH 6.3) and illuminated for 15 min with 447 nm illumination with 2.3 W LED at 25 °C with 250 rpm orbital shaking, 1 mL reaction volume. Cat = CoPyPC16, PS = RubpyC17, Surf. = C12E6, Red. = NaHAsc, Buf. = phosphate buffer. *Initial pool of experiments. **Experiments from heuristic optimization that were not included in the initial pool. ***Experiments not suggested by the learning algorithm optimization.

Table S8: Selected self-assembled and homogeneous molecular catalyst systems for photocatalytic CO2-to-CO reduction in aqueous media.

Table S9: Minima and maxima over the training set for Catalyst, Photosensitizer, Surfactant, Reductant, Buffer concentrations.

Table S10: Coefficients (exponents) for power combinations of the original features.

Table S11: Minima and maxima over the training set of the non-normalized optimized mixed features.