Automated de Novo Design of Olefin Metathesis Catalysts: Computational and Experimental Analysis of a Simple Thermodynamic Design Criterion

Methods for computational de novo design of inorganic molecules have paved the way for automated design of homogeneous catalysts. Such studies have so far relied on correlation-based prediction models as fitness functions (figures of merit), but the soundness of these approaches has yet to be tested by experimental verification of de novo-designed catalysts. Here, a previously developed criterion for the optimization of dative ligands L in ruthenium-based olefin metathesis catalysts RuCl2(L)(L′)(=CHAr), where Ar is an aryl group and L′ is a phosphine ligand dissociating to activate the catalyst, was used in de novo design experiments. These experiments predicted catalysts bearing an N-heterocyclic carbene (L = 9) substituted by two N-bound mesityls and two tert-butyl groups at the imidazolidin-2-ylidene backbone to be promising. Whereas the phosphine-stabilized precursor assumed by the prediction model could not be made, a pyridine-stabilized ruthenium alkylidene complex (17) bearing carbene 9 was less active than a known leading pyridine-stabilized Grubbs-type catalyst (18, L = H2IMes). A density functional theory-based analysis showed that the unsubstituted metallacyclobutane (MCB) intermediate generated in the presence of ethylene is the likely resting state of both 17 and 18. Whereas the design criterion via its correlation between the stability of the MCB and the rate-determining barrier indeed seeks to stabilize the MCB, it relies on RuCl2(L)(L′)(=CH2) adducts as resting states. The change in resting state explains the discrepancy between the prediction and the actual performance of catalyst 17. To avoid such discrepancies and better address the multifaceted challenges of predicting catalytic performance, future de novo catalyst design studies should explore and test design criteria incorporating information from more than a single relative energy or intermediate.


S1.3 Sensitivity of Ru=C Bond Length to Computational Model
To assess the sensitivity of the Ru=C bond length descriptor to basis set and solvation model, the geometry of Ru5 for a representative selection of ligands among those in the training set of Ref. 2 (and listed in Table S2) was re-optimized with three variations of the computational protocol here defined as Prediction Model 1 (see Computational Details section).(i) an extended basis set, def2TZVPP, instead of LANL2DZ, (ii) implicit solvation in benzene as from the SMD model, 3 and (iii) the combination of def2TZVPP basis set with SMD solvation in benzene.
Although the values are numerically different (Table S3) the trend is essentially retained.The series obtained from each variation of Prediction Model 1, in fact, show high correlation coefficients with the series obtained from the original Prediction Model 1 (see Figure S3, Figure S4, and Figure S5).Consequently, the correlation between the Ru=C bond length and the computed productivity remains high (R 2 >0.92) for each alternative computational model (Figure S6, Figure S7, and Figure S8).Importantly, the Ru=C bond distance obtained for the de novo designed ligand 9 is consistently the shortest.
Overall, these results suggest that Prediction Model 1 is relatively insensitive to extending the basis set or to inclusion of solvent effects.The higher computational cost that would derive from the application of such refinements are not justified.S3.The red point represents the ligand designed in this work, i.e., ligand 9.

Figure S4.
Correlation plot for Ru=C bond lengths in molecular models of Ru5 optimized with Prediction Model 1 (PM1 for brevity) and its variation using SMD(benzene) solvation model.Data from Table S3.The red point represents the ligand designed in this work, i.e., ligand 9.  S3.The red point represents the ligand designed in this work, i.e., ligand 9. 2 and reported in Table S2) and the length of the Ru=CH2 bond at Ru5 when optimized with the variation of Prediction Model 1 (PM1 for brevity) using def2TZVPP basis set (data from Table S3). 2 and reported in Table S2) and the length of the Ru=CH2 bond at Ru5 when optimized with the variation of Prediction Model 1 (PM1 for brevity) using SMD(benzene) solvation model (data from Table S3). 2 and reported in Table S2) and the length of the Ru=CH2 bond at Ru5 when optimized with the variation of Prediction Model 1 (PM1 for brevity) using def2TZVPP basis set and SMD(benzene) solvation model (data from Table S3).

S2.1.1 Prediction Model 1
The prediction model used in de novo design experiments is based on the findings of ref. 2 (i.e., correlation between Ru=CH2 distance and productivity) and was designed specifically for automated catalyst design using DFT-quality molecular models while using the least possible computational resources.The initial molecular model of each candidate visited during evolutionary experiments was constructed by assembling the 3D structure of its building blocks, i.e., the molecular fragments that define the structure in the genetic algorithm. 4The molecular geometry was then relaxed by performing a conformational search with the potential energy surface smoothing and search method available in Tinker (v6.3). 5 The potential energy used in this calculation included only the van der Waals term of the universal force field (UFF), 6 and operated in the torsional space defined by all rotatable bonds connecting two building blocks.The resulting geometry was analyzed to exclude the presence of atom clashes and submitted to DFT-driven geometry optimization.These DFT calculations were performed with Gaussian 09-B.01 7 or 16-C.01, 8 using one version consistently within a single experiment.For consistency with previous work on which the prediction model is based upon, 2 molecular geometries were optimized using the "pure" DFT functional OLYP (a combination of Handy's OPTX 9,10 exchange functional and the correlation functional by Lee, Yang, and Parr (LYP)). 11,124][15] Integration used the Gaussian's ultrafine grid, and self-consistent field procedure was converged to Gaussian's default criteria (RMS change in density matrix < 1.0•10 −8 , max.change in density matrix = 1.0•10 −6 ).The geometry optimization did not exploit symmetry and was continued until satisfaction of the tight convergence criteria (max.force 1.5•10 −5 , RMS force 1.0•10 −5 , max.displacement 6.0•10 −5 , RMS displacement 4.0•10 −5 ).A complete definition of the DFT protocol leading to the final geometry optimization is available at https://doi.org/10.5281/zenodo.7762776,where the complete machinery for generating input files is also available as open source.
From the DFT-optimized geometry the Ru=C bond length was calculated and, its negated value, collected to be used as fitness value in the genetic algorithm.Geometries were also checked for correctness.In particular, we ensured compatibility (i.e., presence of bonds, but not bond types) between the connectivity of the input to DFT and the one obtained from OpenBabel's 16 connectivity perception from the Gaussian output file.Moreover, to ensure consistency of the DFT-modelled geometry with the expected trans-Cl geometry of Ru5, candidates were accepted only when respecting the following geometrical requirements: i) interatomic distance to Ru greater than 2.7 Å for all atoms external to the fragment CNHC-Ru(=CH2)(Cl)Cl, ii) average Ru-Cl bond distance shorter than 2.5 Å, iii) average Cl-Ru-CNHC angle greater than 90°, iv) minimum dihedral angle H-C=Ru-CNHC lower than 20°.
With the exclusion of third-parties software, all parameters and software tools are available at https://doi.org/10.5281/zenodo.7762776.

S2.1.2 Prediction Model 2
According to the previous work from ref.
2, the catalyst productivity was estimated by the stability of the MCB intermediate Ru9 relative to the averaged energy of three models for the 16-electron species Ru8.The latter was calculated as the average of Ru8 models where L' = PMe3, H2O, or CH2O: Enthalpy differences are given with respect to the 14-electron active complexes Ru5, which cancels out in Equation 1, thus and where the enthalpy of a given molecule,   , is calculated as the sum of the electronic energy,  ':8;0, , and the enthalpy correction,  38;0, <=>> : =  ':8;0, +  38;0, <=>> Equation 5 The molecular models of relevant MCB complexes Ru9 and the 16-electron species Ru8 were constructed in Spartan'08 17 and conformational searches were performed using semiempirical method PM3. 18Low-energy conformers were then taken as input structures for geometry optimization using the same DFT protocol described above for Prediction Model 1, except for Gaussian09's default fine integration grid and default convergence criterion (max.force 4.5•10 −4 , RMS force 3.0•10 −4 , max.displacement 1.8•10 −3 , RMS displacement 1.2•10 −3 ) in accordance with the original work. 2 The nature of the final stationary points was confirmed by the analysis of the eigenvalues of the analytically calculated Hessian matrix.The ideal gas, rigid-rotor and harmonic oscillator approximations were used to compute the thermochemical corrections required to get the enthalpy at 298.15 K used in the productivity calculation ( 38;0, <=>> in Equation 5).
Single-point energies ( ':8;0, in Equation 5) were obtained using the hybrid functional B3LYP 19 and an upgraded basis set from the geometry optimizaton.For ruthenium, the Hay and Wadt primitive basis set (5s,6p,4d) 20 was contracted to [4s,4p,3d].For the other (non-Ru) elements, a diffuse s-function was added throughout, whereas, in addition, a diffuse pfunction was added to all non-hydrogen elements.The diffuse s-functions of the non-hydrogen elements were added in an even-tempered manner, whereas the s-exponent for hydrogen and the p-exponents were modified. 21Polarization functions were added for all atoms; a p-function for hydrogen and a d-function for the other elements.The self-consistent field convergence criterion was set to SCF(Conver=5) (RMS change in density matrix < 1.0•10 −5 , max.change in density matrix = 1.0•10 −3 ) and otherwise default Gaussian 09 settings were used.
Energies ( ':8;0, ) and thermal correction to the enthalpy ( 38;0, <=>> ) for all molecular models involved in the calculation of the productivities are reported in Table S4.The input files and the results of the geometry optimizations and single point calculations are available in the ioChem-BD repository 22,23 at https://doi.org/10.19061/iochem-bd-6-292.

S2.1.3 Reactivity Model
Model Building.The construction of molecular structures, conformational searches, and preliminary strain relaxations of molecular models were performed with Spartan18 24 using its implementation of Merck's force field (MMFF94) 25 and of the semi-empirical method PM6. 26 These calculations were carried out in conjunction with manually-set geometrical constraints in the first coordination sphere of ruthenium, to preserve geometrical features inaccurately described by empirical and semiempirical methods.All density functional theory (DFT) calculations were performed with the Gaussian 16 suite of programs, version 16 C.01. 27 Geometry Optimization.Geometry optimization was performed using the Gaussian 16 implementation of the generalizedgradient approximation (GGA) functional of Perdew, Burke and Ernzerhof (PBE), 28,29 including Grimme's D3 empirical dispersion term 75 in conjunction with revised Becke-Johnson damping (overall labelled PBE-D3M(BJ) for brevity). 76ll atoms except ruthenium were described by Dunning's correlation-consistent valence double-ζ plus polarization basis sets (cc-pVDZ), 30,31 as retrieved from the EMSL basis set exchange database. 32,33Ruthenium was described with the Stuttgart 28-electron relativistic effective core potential (ECP28MDF retrieved from the Stuttgart/Cologne group website) 34 in combination with the correlation-consistent valence double-ζ plus polarization basis set (cc-pVDZ-PP) 34 retrieved from EMSL basis set exchange database. 32,33he Gaussian 16 ultrafine grid was explicitly specified for numerical integration (keyword int=ultrafine), which implies that this grid was used also for the analytical Hessian calculations.Geometries were optimized using tight convergence criteria (max.force 1.5•10 −5 a.u., RMS force 1.0•10 −5 a.u., max.displacement 6.0•10 −5 a.u., RMS displacement 4.0•10 −5 a.u.), without symmetry constraints, using the following convergence criteria for the self-consistent field (SCF) optimization procedure: RMS change in density matrix < 1.0•10 −9 , max.change in density matrix < 1.0•10 −7 .
All stationary points were characterized by the eigenvalues of the analytically calculated Hessian matrix, confirming the absence (for minima) or presence of a single negative eigenvalue (for transition states).The translational, rotational, and vibrational components of the thermal corrections to enthalpies and Gibbs free energies were calculated within the ideal-gas, rigid-rotor, and harmonic oscillator approximations considering a temperature of 298 K, except that all frequencies below 100 cm -1 were shifted to 100 cm −1 when calculating the vibrational component of the entropy (i.e., the quasi-harmonic oscillator approximation) 35 to prevent the asymptotic behavior of the harmonic approximation with modes of very low frequencies.
Energies (E PBE ) and thermal correction (G PBE @6 298.15K ) for all molecular models are reported in Table S5.The input files and the results of the geometry optimizations and Hessian calculations are available in the ioChem-BD repository 22,23 at https://doi.org/10.19061/iochem-bd-6-292.

C6H6
) for all molecular models are reported in Table S5.The input files and the results of the single point energy calculations are available in the ioChem-BD repository 22,23 at https://doi.org/10.19061/iochem-bd-6-292.
Calculation of Gibbs Free Energies.Gibbs free energies were calculated at 298.15 K according to Equation 6: Equation 6 where E PBE-D3M(BJ)

C6H6
is the potential energy resulting from single-point calculation with PBE-D3M(BJ), and include the contributions from the implicit solvation model; G PBE @6 C6H6 298K is the thermal correction to the Gibbs free energy calculated at the geometry optimization level with the quasi-harmonic approximation at 298.15 K; and G 1atm→1M 298.15K is the standard state correction corresponding to 1 M solution (but exhibiting infinite-dilution, ideal-gas-like behavior), which is equal to 1.89 kcal mol −1 (= RT•ln(24.46))at room temperature.Table S5  ) calculated with respect to the pyridine-stabilized precatalyst Ru12.Ru12_9 and Ru12_11 are, thus, the reference points for the species bearing the ligands 9 and 11 respectively.Natural Bond Orbital (NBO) Analyses.The natural bond orbital analyses were performed with the NBO7 software, 47 using the electron density of the single-point energy calculations as input.
The overall electron donation of the carbene ligands in the ruthenium species Ru9, Ru19, and Ru21 was calculated as the natural charge of the fragment Cl2Ru(CH2)3 for Ru9, and that of the fragment Cl2Ru(CH(CH3)CH(CH3)CH2) for Ru19, and Ru21.
The carbene p-back-donation in the ruthenium complex Ru9 has been estimated as the difference between the occupancy on the lone vacancy (LV) at the carbene atom in the complex Ru9 and that in the isolated ligand fragment having the same geometry as in complex Ru9.The carbene s-donation has been calculated as the sum of the overall donation from natural charge analysis (see above) and the p-backdonation.
To ensure a comparable set of orbitals between the complexes and the 'frozen' ligand fragments, the Lewis structures were explicitly required (via the $CHOOSE input section) to have a lone pair (LP) at the carbene carbon atom as well as on the adjacent N-atoms.

S2.2.1 De Novo Design Experiments
The input data, parameters, software tools, and output data pertaining the evolutionary de novo design experiments are available at https://doi.org/10.5281/zenodo.7762776.

S3.1 Experimental Details
All reactions were performed under argon atmosphere inside a glovebox.Toluene and tetrahydrofuran were purified using an MBraun solvent purification system ("Grubbs' column") and stored over activated molecular sieves (4 Å).Anhydrous pentane was purchased from Sigma-Aldrich and used as received.CDCl3 was dried over CaH2 and distilled before use, while anhydrous C6D6 was purchased from Sigma-Aldrich and degassed before use.1-octene (Sigma-Aldrich) was purified by passage through a column of activated basic alumina and stored over activated Selexsorb ® CD in the glovebox freezer for 1 week before use.The imidazolium salt 9•HCl was purchased from Santai Labs, Inc. and used as received.The ruthenium compounds 16 48 and 18 49 were prepared according to literature procedures.All the other chemicals were purchased from Sigma-Aldrich and used as received.
NMR spectra were recorded on Bruker Biospin AV 500, AVANCE NEO 600, and AV III HD 850 spectrometers.The chemical shifts are reported relative to the residual solvent peaks. 50Elemental analyses were performed using an Elementar Vario EL III analyzer.

S3.2 Preparation of Ruthenium Complex 17 Scheme S3. Preparation of Complex 17.
In a glovebox, a 25 mL vial, equipped with a magnetic stirring bar and a screw cap, was charged with the imidazolium salt 9•HCl (25 mg, 0.055 mmol), potassium bis(trimethylsilyl)amide (13.2 mg, 0.066 mmol), and tetrahydrofuran (4 mL).The suspension was heated at 70° C for 5 minutes, then filtered through a glass-fiber filter paper, and the filtrate was transferred to a 2 mL THF solution of 16 (36.6mg, 0.052 mmol).The mixture was stirred at room temperature for 30 minutes.During the first five minutes the color of the solution changed from bright green to brown-orange.The solvent was reduced under reduced pressure to about one third of the initial amount, then pentane (10 mL) was added dropwise under magnetic stirring.addition of pentane caused the separation of the title compound as a green solid, which was isolated by vacuum filtration, washed three times with pentane, transferred in a vial, and dried under reduced pressure.Isolated weight: 19 mg (48 %).In a glove box, a 25 mL vial equipped with a magnetic stirring bar and a screw cap was charged with 2954 mg (13.160 mmol) of 1-octene, 59.5 mg (0.366 mmol) of hexamethylbenzene (internal standard) and the catalyst (0.00132 mmol, 0.01 mol %).The vial was closed, and the reaction mixture was stirred at room temperature (22 °C) for 24 hours.Samples were taken at 5, 10, 15, 30, 60, 120, 240, and 1440 minutes.Each sample was quenched with an excess of ethyl vinyl ether (EVE) and stored in the glovebox freezer at -35 °C before 1 H NMR analysis.Determination of conversions and yields were done by quantitative 1 H NMR according to literature procedures. 51,52.4 Self-Metathesis of Neat 1-Octene with 1 ppm of Catalyst at 60 °C In a glove box, a 4 mL vial equipped with a magnetic stirring bar and a screw cap was charged with 10 mg of stock solution, containing 0.0026 µmol of the catalyst.The solvent was removed by reduced pressure and then (2.6 mmol) of 1-octene were added.The vial was closed, and the reaction mixture was stirred at 60 °C for 1 hour.Determination of conversions and yields were done by quantitative 1 H NMR according to literature procedures. 51,52.5 NMR Spectra of 17

Figure S1 .
Figure S1.Correlation between the relative energy of metallacyclobutane Ru9 (∆  ) and that of the cycloreversion transition state Ru10 ( ‡ ).Values taken from Ref. 1 and reported in TableS1.

S1. 2 Figure S2 .
Figure S2.Correlation plot between the enthalpic components of the stability of metallacyclobutane (MBC) Ru9 and the length of the Ru=CH2 bond at Ru5.Values taken from Ref. 2 and reported in TableS2.

Figure S3 .
Figure S3.Correlation plot for Ru=C bond lengths in molecular models of Ru5 optimized with Prediction Model 1 (PM1 for brevity) and its variation using def2TZVPP basis set.Data from TableS3.The red point represents the ligand designed in this work, i.e., ligand 9.

Figure S5 .
Figure S5.Correlation plot for Ru=C bond lengths in molecular models of Ru5 optimized with Prediction Model 1 (PM1 for brevity) and its variation using def2TZVPP basis set and SMD(benzene) solvation model.Data from TableS3.The red point represents the ligand designed in this work, i.e., ligand 9.

Figure S6 .
Figure S6.Correlation between the enthalpic components of the stability of metallacyclobutane (MBC) Ru9 (values taken from Ref.

Figure S7 .
Figure S7.Correlation between the enthalpic components of the stability of metallacyclobutane (MBC) Ru9 (values taken from Ref.

Figure S8 .
Figure S8.Correlation between the enthalpic components of the stability of metallacyclobutane (MBC) Ru9 (values taken from Ref.

Prediction Model 2 .
Scheme S1.De Novo-Designed Ligands Evaluated with Prediction Model 2. a a The length of the Ru=CH2 bond in the DFT model of Ru5 generated in the evolutionary experiments is reported in parenthesis.

Figure S9 .
Figure S9.Molecular models Ru8_8_H2O_PR2 (a) and Ru8_8_CH2O_PR2 (b) showing effects that are not considered in Prediction Model 2 and thus invalidate its prediction.Namely, the H-bond and metal-coordination resulting from the presence of carbonylcontaining moieties in the ortho position of the aromatic N-substituents on the NHC.All hydrogen atoms excluding those of water and formaldehyde are omitted for clarity.The labels report the distance Hwater-Oester (a) and Ru-Ocarbonyl (b).
Profiles for Precatalyst 17 and 18 Scheme S2.Definition of Species Modelled to Determine the Reaction Energy Profiles for Precatalysts 17 and 18, which bear ligand 11 and 9, respectively.

Figure S12. 1 H
Figure S12. 1 H NMR spectrum of 17 (1 scan), recorded after the first 512 scan of the 13 C NMR spectrum.

Figure S14 .
Figure S14.Stacked 1 H NMR spectrum of 17 (1 scan) showing the formation of a new alkylidene species during the recording of the 13 C NMR spectrum.

Table S1 . Relative Energies (kcal/mol Relative to Ru8) Taken from Table 1 of Ref. 1. a
a Catalyst identifiers are defined in the original publication.See Ref. 1 for details.

Table S2 . Ru=CH2 Bond Lengths at Ru5 and Taken from the Supporting Information of Ref. 2. Refer to the Original Publica- tion for Further Details.
a Catalyst identifiers are defined in the original publication.See Ref. 2 for details.

Table S3 . Sensitivity to Basis Set and Solvation Model of Ru=CH2 Bond Lengths of Ru5.
a Catalyst identifiers are defined in the original publication.See Ref. 2 for details.b Catalyst designed in this work.

Table S4 . Calculated Energies, Enthalpies, and Productivities Computed with Prediction Model 2.
Productivity calculated according to Equation 1. b Not determinable due to effects as the H-bonds and over-coordination of the metal center (FigureS9) that are not accounted for by Predicion Model 2. a

Table S5 . Calculated Energies and Standard-State Gibbs Free Energies.
Unless otherwise specified, values are calculated usen the corresponding model of Ru12 as reference, namely Ru12_9 for all species bearing ligand 9, and Ru12_11 for all species bearing ligand 11.