Unveiling Bifunctional Hydrogen Bonding with the Help of Quantum Chemistry: The Imidazole-Water Adduct as Test Case

The ubiquitous role of water and its amphiprotic nature call for a deeper insight into the physical–chemical properties of hydrogen-bonded complexes formed with building blocks of biomolecules. In this work, the semiexperimental (SE) approach combined with the template model (TM) protocol allowed the accurate determination of the equilibrium structure of two isomeric forms of the imidazole-water complex. In this procedure, the integration of experiment (thanks to a recent rotational spectroscopy investigation) and theory is exploited, also providing the means of assessing the reliability and accuracy of different quantum-chemical approaches. Overall, this study demonstrated the robustness of the combined SE-TM approach, which can provide accurate results using affordable quantum-chemical methods. Finally, the structural and energetic characteristics of these complexes have been examined in detail and compared with those of analogous heterocycle–water adducts, also exploiting energy decomposition analyses.


■ INTRODUCTION
In recent years, the number of joint experimental−computational studies on noncovalent complexes has significantly increased, with the aim of obtaining a deeper knowledge of the underlying interactions. 1−9 The interest is often focused on spectroscopic and structural properties as well as on energetic characterizations, the latter often being coupled with their quantitative interpretation in terms of chemically meaningful concepts (e.g., electrostatics, induction, dispersion, etc.). 2,3,6,7,9 Among different experimental techniques, rotational spectroscopy is the most suitable to derive structural information owing to the direct connection between rotational constants and molecular structure. 10 Rotational spectroscopy is indeed a powerful tool for molecular structure determinations, which are the mandatory prerequisites for investigating structure− activity relationships as well as deriving chemical and physical properties. 11−15 Furthermore, rotational spectroscopy investigations are performed in the gas phase, thus allowing one to focus on intrinsic effects without the perturbations introduced by the environmental effects operative in condensed phases.
Accurate structural determinations can be performed by exploiting the semiexperimental (SE) approach, which relies on extracting from the experimental outcomes the equilibrium structure details using quantum-chemical (QC) computations for providing the missing information. 16 Going more in detail, SE equilibrium structures can be obtained from a least-squares fit (LSF) of the SE equilibrium rotational constants, in turn derived from the experimental ground-state counterparts by subtracting the computed vibrational corrections. 16 Whenever experimental information is available for a sufficient number of isotopologues, a complete structural determination is possible. For small-and medium-sized organic and biological molecules, several studies have shown that vibrational corrections computed using hybrid or, even better, double-hybrid density functionals in conjunction with suitable basis sets have the required accuracy to obtain reliable results. 17,18 The application of such an approach has led to the development of a database of SE equilibrium structures containing information for an increasing number of species, and which already incorporates the most relevant prebiological building blocks. 17,18 Moving to molecular complexes, it becomes particularly challenging to obtain a set of experimental data sufficient for a complete structural determination. In particular, the most difficult information to retrieve is the position of hydrogen atoms because a full set of deuterated species is rarely experimentally available. In all cases where there are missing data, the usual approximation is to fix the intramolecular parameters at those of the isolated fragments and to fit only a limited number of intermolecular parameters. Unfortunately, this approximation is not free from difficulties, especially when flexible fragments are involved, and more advanced approaches are needed. To overcome this issue, we propose an extension of the template model approach (TMA), 17 which consists of correcting the structural parameters of a complex system, obtained at a suitable QC level, by using the corresponding SE values of suitable fragments, referred to as template models (TMs). 19 In the case of noncovalent molecular complexes, the TMs are the monomers, whose SE equilibrium structures are likely available in the above-mentioned database. The intramolecular parameters are greatly improved by the exploitation of the TMA, and they can thus be confidently kept fixed. Then, the number of experimental data becomes sufficient to optimize the intermolecular parameters.
Whenever some SE equilibrium intermolecular parameters cannot be determined, they can be obtained from partial geometry optimizations at an accurate QC level. On the basis of recent works, 20−22 the so-called jun-"cheap" composite scheme (hereafter jun-ChS) provides, for noncovalent complexes, interaction energies on par with the most sophisticated composite methods at a significantly reduced computational cost. It is, therefore, natural to investigate if this computational approach is able to also deliver accurate geometrical parameters to be employed in the interpretation and prediction of equilibrium rotational constants. In the perspective of studying even larger systems, the double-hybrid DSD-PBEP86 functional (in the recent reparametrization by Martin and co-workers, 23 rev-DSD-PBEP86) in conjunction with the jun-cc-pVTZ basis set 24,25 appears particularly promising and will be compared to jun-ChS results.
Once the methodology is defined and the computational level is selected, the next step is the choice of a challenging, yet representative, test case to assess the methodology. Among different noncovalent interactions, hydrogen bonds play a central role, especially for biological systems. Therefore, the interaction between building blocks of biomolecules and water represents a natural starting point for our analysis, also in view of the ubiquitous presence and the amphiprotic character of water. Focusing on the potential partner of the molecular complex, heterocyclic compounds appear interesting in view of the prebiotic character of the species containing the C−N moiety as well as their presence in important biological molecules. In this framework, imidazole is particularly appealing because of the concomitant presence of donor and acceptor sites.
In conclusion, this study will address the characterization (by means of the methodologies mentioned above) of the water−imidazole complex, whose rotational spectrum has recently been reported for a number of different isotopologues. 26 Interestingly, two different adducts have been experimentally observed (see Figure 1) in which water plays the role of either hydrogen bond (HB) donor (HBD) or acceptor (HBA). 26 On the contrary, in a previous experimental investigation with supersonic jet FTIR spectroscopy, only the HBD complex was detected, thus suggesting that it is significantly more stable than the HBA counterpart. 27 However, the study of ref 26 was not able to experimentally determine the relative stability of the two adducts. For this reason, an accurate energetic characterization is also deserved.
This manuscript is organized as follows. In the next section, the computational methodology is presented in all details. The following section will report and discuss the results of structural and energetic investigations. Finally, concluding remarks will be provided.

■ COMPUTATIONAL METHODOLOGY
The equilibrium structure of both isomers of the imidazole− water molecular complex (see Figure 1) has been evaluated by exploiting QC composite schemes as well as the SE approach. Both of them are described with some details in the following.
As briefly mentioned in the Introduction, within the SE methodology, the equilibrium structural parameters of the investigated system are determined from a LSF of the equilibrium moments of inertia of different isotopologues. 16 However, laboratory measurements give access to vibrational ground-state rotational constants (B 0 i , where i refers to the inertial axes a, b, c), thus requiring the vibrational corrections (ΔB vib i ) to be subtracted in order to obtain the corresponding SE equilibrium rotational constants (B e i ). 16 From the latter, the SE equilibrium moments of inertia are straightforwardly derived, these being inversely proportional to rotational constants. 10 To better clarify the statements above, it has to be noted that, in the framework of vibrational perturbation theory to the second order (VPT2), 28 the equilibrium rotational constants can be expressed as where the α r 's are the vibration−rotation interaction constants and the sum runs over all r vibrational modes. Noted is that the evaluation of the α r 's implies anharmonic force field calculations (for details, see, e.g., refs 29−32). To recap, in the SE procedure, the second term on the righthand side of eq 1 (i.e., ΔB vib i ) is derived from QC calculations, while the first term is obtained from the experiment. ΔB vib i being significantly smaller than B e i , the former term can be determined at an affordable level of theory (such as methods rooted in the density functional theory, DFT) without significantly affecting the accuracy of the resulting SE equilibrium rotational constants. 11,18,29,33 The calculation of the α r 's from DFT anharmonic force fields led to the collection of accurate SE equilibrium geometries for several systems, ranging from isolated molecules to clusters. 3,6,9,17,18,34−38 In this work, the global-hybrid B3LYP functional, 39,40 also incorporating the Grimme's DFT-D3 scheme 41 for the treatment of dispersion effects in conjunction with the Becke−Johnson (BJ) damping function, 42 has been used for the calculation of the vibrational contributions. The partially augmented double-ζ jun-cc-pVDZ basis set 24,25 has been employed in combination with B3LYP-D3(BJ). Hereafter, this level of theory is shortly referred to as B3.
The starting point of the LSF procedure is a guess geometry obtained by means of the TMA. 18 In the present context, the TM species are the monomers, imidazole and water, and their structural parameters within the complex have been adjusted using the SE equilibrium geometry of the isolated fragments. a For a generic intramolecular equilibrium structural parameter r, which has been optimized at the x level of theory, the TMA parameter (r TMA ) is calculated as In eqs 2 and 3, TM is the isolated fragment. The rev-DSD-PBEP86-D3(BJ) functional 23 in conjunction with the jun-cc-pVTZ basis set 24,25 (hereafter denoted as rev-DSD) has been considered as the x level of theory in the equations above because it is proven to offer a very good description of noncovalent complexes. 43 In addition to the SE approach, the equilibrium structure of the molecular complex has been evaluated using the so-called "cheap" composite scheme (hereafter ChS). 44 The original version of this accurate yet cost-effective approach for mediumsized systems 45,46 is the following: 44 r r r r r The terms on the right-hand side are presented here below in order of appearance: i. The starting point is the coupled-cluster ansatz including single and double excitations with a perturbative treatment of triples, CCSD(T), 47 within the frozencore (fc-) approximation and in conjunction with the cc-pVTZ basis set. 24 ii. The correction due to the extrapolation to the complete basis set (CBS) limit is evaluated using the n −3 formula 48 and applied to the geometrical parameters optimized using Møller−Plesset perturbation theory to second order 49 (fc-MP2). The cc-pVnZ (with n = T,Q) basis sets are employed. iii. The core−valence (CV) correlation contribution is calculated as the difference between all-MP2/cc-pCVTZ 50 and fc-MP2/cc-pCVTZ optimized parameters, where "all-" denotes the correlation of all electrons. iv. The effect of the inclusion of diffuse functions in the basis set is estimated as the difference between fc-MP2/ aug-cc-pVTZ 24,51 and fc-MP2/cc-pVTZ optimized parameters.
The jun-ChS variant 20 is obtained by employing the partially augmented jun-cc-pVnZ basis sets 24,25 (in the place of cc-pVnZ). Since these sets already incorporate diffuse functions, the iv term is not considered in conjunction with them. Instead, the iii term is the same in both approaches. If the contribution of diffuse functions (iv) is neglected in the original scheme, the "ChS-mAUG" model is obtained. Computational Results. A preliminary scan of the potential energy surface (PES) of the molecular imidazole− water complex has been carried out at the B3 level and confirmed the DFT results reported in ref 26. Next, the geometries of Imid−H 2 O and H 2 O−Imid have been optimized at the rev-DSD, ChS, ChS-mAUG, and jun-ChS levels of theory. Their rev-DSD structures are sketched in Figure 1, while the geometrical parameters obtained at the different computational levels are collected in Table 1. From an inspection of the results collected in this table, we note that the four different approaches provide very similar results for the intramolecular parameters, with rev-DSD showing deviations within a few mÅ for distances and well within 0.1°for angles. Larger deviations, even within the different ChS variants, are noted for intermolecular parameters and, in particular, for the N···O distance. From Table 1, the cost-effectiveness of the rev-DSD level is evident: while providing results in nearly quantitative agreement with those issuing from composite schemes, its cost is comparable with the cheapest step of the latter approaches (i.e., MP2 in conjunction with a triple-ζ basis set).
Unlike H 2 O−Imid, the structure of Imid−H 2 O is characterized by a symmetry plane containing the imidazole ring and the oxygen atom of the water molecule. Thus, Imid−H 2 O belongs to the C s symmetry point group. In order to ease the interpretation of the geometrical parameters, we have used a dummy atom (X) placed on the HOH angle bisector to locate the water molecule with respect to the ring (see Table 1). In addition to the symmetry issue, the HB geometry presents some slight differences between the two isomers. On average, the computed N···O distance is about 0.11 Å longer in Imid− H 2 O than in H 2 O−Imid. Furthermore, the O−H bond in H 2 O−Imid and the N−H distance in Imid−H 2 O, which are involved in the HB donation, show stretches of about 13 and 6 mÅ, respectively, with respect to the isolated fragments. Concerning Imid−H 2 O, we note that, for the X11−O10−N3 angle, the original formulation of the ChS scheme leads to a value ∼15°larger than that obtained with jun-ChS and ChS-mAUG. This implies that ChS orients the water molecule differently. As expected, no significant differences are noted for the other geometrical parameters, the only exception being the HOH angle of water, which is overestimated at the ChS level in both H 2 O−Imid and Imid−H 2 O complexes. Noted is that the dihedral angles not explicitly reported in Table 1 are either 0°or 180°, as a consequence of the planarity of the imidazole ring.
For comparison purposes, we have also computed the structure of several complexes formed by water with Ncontaining heterocycles (Het) and reported the O···N distance at the rev-DSD level of theory (see Table 2). From the analysis of the results obtained, a systematic increase of the HB length when the heterocycle acts as HBD is noted, with a difference by averaging on the two familiesof 0.1 Å. Table 3 collects the rotational constants. While the B e 's have been straightforwardly derived from the equilibrium structures, 10    The Journal of Physical Chemistry A pubs.acs.org/JPCA Article deviates by 0.25%, while the relative discrepancy decreases by 1 order of magnitude moving to ChS-mAUG (0.02%) and jun-ChS (0.03%). This can be associated with the larger value, mentioned above, of ∠(X11−O10−N3), in the case of ChS. Concerning rev-DSD, which isas already mentioneda level of theory affordable also for larger systems, the calculated B 0 i 's are in good agreement with the experimental ones, indeed showing a relative error of about 0.4% for both H 2 O−Imid and Imid−H 2 O complexes, which improves to 0.3% if one does not consider the A constant. In this respect, if we compare our rev-DSD/B3 results (with B3 referring to the vibrational corrections) with the current practice of directly using the B e i 's, evaluated from global-hybrid DFT calculations, for guiding spectral recording and analysis (as done, for example, in ref 26), a reduction of the discrepancies ranging from a factor of 2 to more than 1 order of magnitude is evident.
Finally, the imidazole ring is characterized by the presence of two nitrogen atoms, which are quadrupolar nuclei. Having significantly different local environments, the corresponding quadrupole coupling interactions lead to distinctive spectroscopic features. 26 Therefore, we decided to compare the experimental nuclear quadrupole coupling constants (χ ii 's) with the calculated ones. The results, reported in Table 4, show an overall good agreement with experiment at any level of theory considered, with the largest discrepancies being observed for the experimental data affected by large uncertainty. A remarkable improvement is noted when going from B3LYP 26 to rev-DSD and CCSD(T)/jun-cc-pVTZ. Application of the jun-ChS approach only marginally changes the constants with respect to the latter level of theory.
Semiexperimental Structure. As mentioned in the Computational Methodology section, the determination of the SE equilibrium structure for the two molecular complexes starts from a guess geometry obtained with the TMA. For the exploitation of the SE approach, the experimental rotational constants of four and three isotopic species (reported in ref 26) for H 2 O−Imid and Imid−H 2 O, respectively, have been employed. The number of experimental data not being sufficient for a complete geometry evaluation, the LSF procedure has been applied to a reduced number of intermolecular structural parameters, while keeping the intramolecular and the other intermolecular parameters fixed at the guess values. Therefore, first of all, the accuracy of the   The Journal of Physical Chemistry A pubs.acs.org/JPCA Article equilibrium structures of the monomers, which are at the basis of the TMA, needs to be discussed. Focusing on the isolated fragments (see Table 5), it is noted that all ChS schemes show a very good performance in terms of deviation from the SE equilibrium values. An inspection of Table 5 confirms the accuracy expected on the basis of the literature on this topic: 0.001−0.002 Å for bond lengths and 0.1−0.2°for angles. 44,45,55 A slight overestimation of the water angle is observed at the ChS level. A good agreement is also found between the rev-DSD and SE values, with an average error of 3 mÅ and 0.15°for bond lengths and angles, respectively.
The LSF procedure has been carried out using the molecular structure refinement (MSR) software. 56 Figure 2). These investigations locate the minimum in the same position found in all geometry optimizations, therefore leading to the exclusion of the outcomes of this fitting procedure.
The source of the anomalous results addressed above may be traced back to the water mobility. Indeed, the presence of large amplitude motions has a huge impact on the geometry and questions the foundations of the SE approach, which relies on second-order vibrational perturbation theory. Experimental information on isotopic substitution at the hydrogen−water atoms would have helped in correctly deriving their positions within the molecular adduct. This cannot be accomplished by fitting instead intramolecular parameters such as the ∠(H−O− H) angle, the overall conclusion being the exclusion of any structural parameter involving water hydrogens from our LSF procedure. A further evidence of this problem has been met in the fit of Imid−H 2 O. For this isomer, the value of the moment of inertia along the a axis (and, therefore, the rotational constant A) is strongly affected by the position of the hydrogen atoms of the water molecule, which, however, experience large amplitude motions. Within the VPT2 framework, the treatment of the latter represents a well-known issue, 57 which has been confirmed, and partially solved, for this adduct by the exclusion of the A's from the LSF procedure.
The results obtained by employing the TMA (fit 1) or theoretical (fit 2) intramolecular parameters are reported in Table 6 and confirm that the O···N distance is shorter in H 2 O−Imid than in Imid−H 2 O by at least 0.1 Å. The already mentioned difficulties in positioning the water molecule with respect to imidazole are at the origin of the slight differences in the SE equilibrium structure of H 2 O−Imid derived from the different fits. However, by comparing the results of fit 1 and fit 2 and taking into consideration the confidence interval of one standard deviation, we note that the fitted values are comparable. For Imid−H 2 O, the O···N distance as well as the ∠(N···O···X) and ∠(O···N−C1) angles are in remarkable agreement. Larger deviations are noted for the ChS results for both isomers, probably due to a greater difference in the "guess" values for the water molecule. Interestingly, the comparison of the results of fit 1 points out the strength of the TMA, which is able to provide reliable geometries even when the reference structure is obtained at the DFT level (instead of a computational expensive CC-based composite scheme).
Energetics. The energetic characterization of the two isomers has been carried out at both the jun-ChS and rev-DSD levels, with the former model for interaction energies being fully described in ref 20. Actually, the different "cheap" expressions can be derived from eq 4 by replacing the geometrical parameter r with the total energy. In particular, the jun-ChS model chemistry is proven to provide a very good compromise between accuracy and computational cost in the description of noncovalent interactions. 20 Furthermore, two different reference geometries (rev-DSD and jun-ChS) have been used in order to determine the influence of the structure on the energetics. The results are reported in Table 7. According to them, H 2 O−Imid is about 6.2 kJ/mol more stable than the Imid−H 2 O isomer, in agreement with both a previous FTIR experiment 27 and the findings of ref 26. With respect to the relative stability, the effect of the different geometry on the energy is negligible (i.e., less than 0.1 kJ/ mol).
The jun-ChS and rev-DSD levels have also been employed to evaluate the interaction energy (int en), possibly accounting The Journal of Physical Chemistry A pubs.acs.org/JPCA Article for the basis set superposition error (BSSE) by means of the counterpoise (CP) correction. 58 The results (Table 7) show a good agreement between the two levels of theory (within 0.1 kJ/mol) when CP corrections are not incorporated (denoted as NCP). As expected, NCP and CP results are quite similar to one another at the jun-ChS level (within 0.2 kJ/mol) because of the extrapolation to the CBS limit, whereas the CP correction significantly worsens (by about 1.5 kJ/mol) the rev-DSD results (with respect to jun-ChS). It is thus confirmed that jun-ChS computations (for both energies and geometries) can be safely performed without any CP correction. The same applies to rev-DSD results, possibly due to to a fortuitous error compensation and/or the conditions employed in the parametrization of the functional. From a close inspection of the results of Table 7, a conclusion on the deformation experienced by the monomers (from isolated to part of the adduct) can be drawn. The difference between the jun-ChS-CP interaction energies of the two isomers (6.65 and 6.46 kJ/mol at the rev-DSD and jun-ChS geometries, respectively) is mainly ascribable to the relative jun-ChS electronic energy (6.20 and 6.27 kJ/mol at the two reference geometries). The remaining contribution (0.45 and 0.20 kJ/mol), due to the deformation of the monomers, slightly favors the Imid−H 2 O isomer.
While the interaction energy clearly points out a stronger interaction in H 2 O−Imid, a deeper insight on the origin of different HB patterns can be gained by the natural energy decomposition analysis (NEDA). 59 This has been carried out at the B3 level using the NBO 7.0 60 suite of programs interfaced to the latest revision of the Gaussian16 program. Furthermore, in order to obtain general information, such an analysis has been extended to different N-heterocycle−water adducts. The results, collected in Table 8, show a close agreement between B3 (last column) and rev-DSD (last column of Table 7) total energies for the case of imidazole and point out a clear trend of the two hydrogen bond patterns The error within brackets is one standard deviation. b No TMA has been used to correct the intramolecular parameters.  . When the N-heterocycle molecule acts as HBD, the interaction is weaker than that experienced in those adducts where water acts as HBD, this outcome being in agreement with the basicity of the heterocycles and the O···N distances reported in Table 2.
Within the two groups, no significant difference in the various contributions to the total interaction energy can be noted. Focusing on the total energy, the results within the two group are in line with the acidity and basicity of imidazole, which always shows the largest interaction energy. Indeed, the pK A value of imidazole is 14.5, thus being more acidic than pyrrole.
On the other hand, the pK A value of the conjugated acid is ∼7, which means that imidazole is about 60 times more basic than pyridine.
The two attractive terms, the electrical interaction (which includes the electrostatic and polarization contributions) and the charge transfer, are similar, with the maximum discrepancy being smaller than 7.5 kJ/mol. Furthermore, similar values are also found for the repulsive core interaction, with average values of 64.4 and 85.0 kJ/mol for Het−H 2 O and H 2 O−Het, respectively. In both the investigated groups, imidazole is found to have the strongest interaction with the water molecule.

■ CONCLUSIONS
A computational characterization of H 2 O−Imid and Imid− H 2 O, two isomers of the imidazole−water complex, has been carried out using different levels of theory for the geometry optimization, also aiming to test the reliability of the jun-ChS composite scheme and the DSD-PBEP86 double-hybrid density functional in its most recent reparametrization. To check the performance of these two levels of theory, we relied on the results obtained in a recent experimental work based on rotational spectroscopy. 26 Indeed, rotational constants intrinsically contain structural information. We proceeded in two different ways: we derived the rotational constants from our structures to be compared with the experimental ones and we used the latter, available for different isotopic species, in the semiexperimental approach for deriving accurate structures (to be compared with the computed ones). Both strategies required the computation of the vibrational corrections to rotational constants (either to be added to the computed equilibrium rotational constants or to be subtracted from the experimental ground-state rotational constants), which have been determined from B3LYP-D3(BJ)/jun-cc-pVDZ anharmonic calculations within the VPT2 model. While both strategies pointed out the accuracy of the jun-ChS model applied to geometries and the reliability of the rev-DSD level of theory, some discrepancies have been found that could be ascribed to the high flexibility of the water molecule position within the adduct. Further developments in the treatment of large amplitude motions will be addressed in a future work.
The semiexperimental equilibrium structure of the two isomers has been determined by means of a least-squares fit of the SE equilibrium moments of inertia for several isotopologues, with the intramolecular parameters being kept fixed. The values used for the latter have been derived from the application of the template model approach. One important finding is that the use of this approach leads to robust and reliable results, which are independent of the level of theory employed. The overall conclusion is that the rev-DSD-PBEP86-D3(BJ)/jun-cc-pVTZ level of theory in conjunction with the TMA is a valuable choice for the determination of the equilibrium structure of medium-sized molecular complexes at a reasonable computational cost.
Finally, an accurate energetic characterization of the imidazole−water complexes has been carried out. In addition, their energy decomposition analysis (EDA) is reported also for several N-heterocycle complexes with water for comparison purposes. The EDA pointed out that the different contributions are essentially unchanged within the series of adducts considered.
In our opinion, together with the interest of the studied system, the proposed computational strategy paves the route toward accurate structural and energetic characterizations for large noncovalent complexes of current technological and biological interest.