Exploring the Stability and Disorder in the Polymorphs of L-Cysteine through Density Functional Theory and Vibrational Spectroscopy

Static and dynamic density functional calculations are reported for the four known polymorphs of l-cysteine. Static calculations are used to explore the relative free energies (within the harmonic approximation) of the polymorphs as a function of pressure. An important feature of the structural differences between the polymorphs is shown to be the dihedral angle of the C–C–S–H bond. It is shown that, by varying this angle, it is possible to move between hydrogen bonding motifs S–H···S and S–H···O in all four polymorphs. The energetics for dihedral angle rotation are explored, and the barriers for rotation between the hydrogen bonding motifs have been calculated for each polymorph. Two possible models for the experimental disorder observed in Form I at room temperature are explored using both static and dynamic methods; a domain disorder model, where the disorder is localized, and a dispersed disorder model, where the disorder is randomly distributed throughout the crystal. Molecular dynamics calculations show transitions between the two hydrogen bonding motifs occurring in the dispersed disorder model at 300 and 350 K. In addition, molecular dynamics calculations of Form IV also showed the onset of hydrogen bond disorder at 300 K. Calculations of the predicted infrared and terahertz absorption are performed for both the static and dynamic simulations, and the results are compared with experimental results to understand the influence of disorder on the observed spectra.


Experimental Crystal Structures
A summary of experimental determination of the four polymorphs of l-Cysteine is shown in Table 1. The results include the change in Form I as a function of temperature 1 and study of the effect of pressure and the stability of all the polymorphs. 2 The Table also Table 2 and Table 3. No dispersion correction was used for these calculations.   The wavefunction was converged so that the energy changed less than 1 × 10 −8 eV and the atom positions and unit cell dimensions were optimised so that all forces were less than 5 × 10 −4 eV/Å.
The results of calculations of the optimised unit-cell dimensions of Form I and their percentage deviation from experiment are reported in Table 4. The method that gives the closest agreement with experiment is MBD, with the lowest average deviation in unit cell parameters. However, it was found that MBD in this version of VASP did not scale well with system size and proved difficult to apply to the large unit cells which were used later in the project. It was therefore decided to use the GD3/BJ dipersion correction as this performed nearly as well as MBD and had no problems in treating the larger unit-cells. The GD3/BJ dispersion has been shown to be reliable in previous work. 15 For full geometry optimisations of the unit-cell and the molecular geometries, the default optimiser within VASP were used. For those calculations which required constraints the GADGET optimiser 16,17 was used, which has an interface to the VASP package. The optimisations were terminated when the following criteria were met; predicted internal coordinate changes were less than 0.02 (Bohr or radians), the gradients were less than 0.0004 (Hartree/Bohr or Hartree/radian) and the energy change was less than 1.0 × 10 −5 Hartree.

Phonopy Protocol
Phonopy 18 was used to calculate the quasi harmonic free energies of each polymorph as a function of pressure and temperature. The super-cells used to calculate the atomic displacements for the calculation were 1x1x2, 1x2x1, 1x1x2 and 1x2x1 for Forms I, II, III and IV respectively. Single point calculations were performed with VASP at these displacements using the same settings a described in Section 2.

CP2K Energy Cutoffs
Ab Initio molecular dynamics calculations were performed with CP2K package. 19 As was used in VASP, all calculations employed the PBE exchange-correlation potential along with the GD3/BJ dispersion correction with no additional Axilrod-Teller-Muto (C9) correction to allow a fair comparison with VASP. The Molopt double zeta valence plus polarisation basis sets 20 were used with the GTH PBE pseudopotentials. 21 CP2K has two energy cutoffs which affect the accuracy of the calculation; the cutoff and the relative cutoff.
With the relative cutoff set at 60 Rydberg, the cutoff was varied and the effect on the first cycle of the self consistent procedure was calculated. The results are show in Table 5. Based on these results a cutoff of 450 Rydberg was chosen. With this value of the cutoff, the relative cutoff was changed and the results shown in Table 6. The final values of the cutoff and the relative cutoff used in the subsequent calculations were 450 and 50 Rydberg respectively.  After equilibration of the NPT calculations the average cell dimensions were determined from a simulation of at least 3 ps. The average cell dimensions from the NPT calculations were then used in the NVT ensemble simulations. NVT equilibration was at least 2 ps before a production run of at least 30 ps.

Calculation of Dielectric Permittivity from MD Calculations
The development of the theory for calculating the permittivity from an MD simulation closely follows the appendix in the paper by Chen and Li. 22 SI units are used here and account is taken of the fact that the dipole moment of the cell is not zero, but it may have a permanent dipole moment.

Static Permittivity
The displacement field is related to the polarisation of the system due to the applied field.
Here ϵ 0 is the permittivity of free space and P is the induced polarisation due to an electric field. The permittivity, ϵ, is given by the relationship between the displacement field and the applied field E; The induced polarisation and the permittivity are also related to the susceptibility χ or defining the relative permittivity (dielectric constant), ϵ r ; For a molecular dynamics calculation of an infinite system, we have a periodic cell with a fluctuating dipole moment M . The average of the dipole moment vector can be written as; The average of the square of the dipole moment can be written as; where M is a vector of the 3 components of the dipole moment, and the average is over the trajectory.
The polarisation and the average of the induced dipole moment M I are related by (V is the volume of the cell); The expectation value of the dipole moment of a system perturbed by a field E according to Chen and Li (equation A6) is; Here β = 1 kT . We are interested in the induced dipoles which are the terms only depending on the field. This gives; Since the polarisation is related to the field through the permittivity tensor (Equation 1); The induced polarisation can be expressed in terms of the average induced dipole moments, using equation 7.
Substituting the expectation value of the induced dipole moment (equation 9) and cancelling out the field, gives;

Frequency Dependent Permittivity
The starting equation for this is equation A13 in the paper by Chen and Li; Here C ij (t) is the dipole correlation function for dipole moments M i and M j and needs to be calculated from the trajectory.

The Total Permittivity
The sections above deal with the atomic motion contribution to the permittivity but the calculation of the absorption of spectrum in the infrared region requires the total permittivity and this requires the inclusion of the electronic polarisation at zero frequency (ϵ optical ). The assumption being that for insulators the electronic contribution is constant for the frequency range of interest in infrared spectroscopy. This term is calculated routinely in most DFT packages. In the calculations here the term has been taken from the VASP calculations of the phonon spectra.

The Discrete Dipole Moment Correlation Function
The dipole moment correlation function was calculated from the cell dipole moments calculated by CP2K at 0.5 fs time intervals. A typical trajectory will contain over 60,000 data points. A discrete version of Equation 17 was used to calculate F ϕ ij as Equation 18 seemed to have more numerical problems. The time derivative of the dipole moment correlation function was calculated using numerical differentiation.
Various signal processing ideas were used in processing the correlation function. If there is a sequence of calculated dipole moments at regular intervals the sequence can be written as a vector M i . If the sequence is long enough it can be sampled at different starting points along the sequence, the starting points being separated by a correlation depth, N corr . The correlation depth is chosen to be an integer power of two so that the discrete fast Fourier transform can be used in the calculation of equation 17. In this way a single trajectory can be split into several, hopefully independent trajectories, each of which can be used to calculate a dipole correlation function which can be averaged.
The correlation function can now be written as a vector C i,j of length N corr . For a perfect simulation of sufficient length and sufficient size, the function should decay to zero in an exponential manner as the phonons scatter have a lifetime related to the decay of the correlation function. In practice the systems that are simulated are small and are often not simulated for long enough to determine the system phonon lifetimes. To compensate for this a window function, W is applied to the correlation function to ensure that it behaves in a physical manner.
Thus the new correlation function is written as a convolution of the window function with the correlation function; The elements of the window function are written as an exponential; where the index k runs from 1 to N corr . The parameter a determines the decay rate of the exponential and as the Fourier transform of an exponential is a Lorentzian, it also imposes a Lorentzian line shape onto the signal. In addition to applying a window to the correlation function, the function is padded with zeros so that the total number of points in the signal is pN corr where p is the padding factor.
Two sets of parameters were used to process the correlation function in this work as shown in the table below. A similar approach is taken by Travis. The default settings are used in the paper unless specified otherwise. The narrow settings are used to give narrower peaks and more definition to the peaks. This latter setting is especially useful for the terahertz region of the spectrum, which otherwise seems to have absorption peaks which are too broad when compared to experiment. The draw back of using the "narrow" settings is that the calculated spectrum is noisier.

Calculation of the Absorption Spectrum
The trace of the total permittivity tensor is used as an estimate of the permittivity of a powder containing randomly oriented crystallites. From this the absorption coefficient (A) is calculated from the resulting imaginary component of the refractive index (k). In the equations below the dependence on the frequency and hence the wavelength (λ) is implicit. The units of A are inverse distance.

Super-cells used in VASP and CP2K calculations
Super-cells for VASP and CP2K calculations were created by extending the initial unit-cell in the a, b or c direction. The choice of which dimension to extend when enlarging the cell was to double the size of the smallest cell dimension. A super-cell is indicated by the use of the "SC" or "DC" designation followed by the polymorph, followed by the number of molecules in the cell and optionally a letter to distinguish between cells. Thus SCII8 refers to a super-cell of Form II with 8 molecules in the super-cell. Where there is some disorder in hydrogen bonding pattern the designation "DC" is used instead of "SC" and therefore DCI32 would refer to a disordered cell of Form I containing 32 molecules. Where more than one super-cell of the same dimensions was created but with different molecular geometries they are labelled a or b. Table 9 shows the CSD code of the reference unit cell, the designation of the super-cell, the dimensions of the super-cell, and the number of molecules and atoms in the cell for each polymorph considered.

Optimisation of all Polymorphs at Experimental Pressures
It is assumed that ambient pressure ( 0.000 101 GPa) is adequately represented computationally by calculations at 0 GPa. Table 10 shows the optimised unit cell parameters using a pressure appropriate to the corresponding experimental conditions. Table 11 shows the percentage deviations of the calculated unit-cell parameters from the experimental values. Form I shows particularly good agreement between the calculated and experimental unit-cell parameters which were measured at 30 K.   the results is a measure of the accuracy of the geometry optimisation.

Variation of Polymorph Geometry with Pressure
The molecular geometries of molecules in the unit cell were calculated as a function of pressure from 0 to   Table 14 shows the calculated internal energies of the polymorphs as a function of pressure. Table 15 shows the calculated volume of the polymorph unit-cells as a function of pressure. Using Phonopy the phonon contributions to the free energy were calculated using the harmonic approximation. The vibrational Helmholtz free energy is shown in Table 16 and the total Helmholtz free energy is shown in Table 17.    The internal energy relative to Form I is shown in Figure 6 and the relative total Helmholtz free energies at 300K as a function of pressure are shown in Figure 7.  Table 18 show the results of optimising a single unit-cell maintaining the experimental space group symmetry (P 2 1 2 1 2 1 ). The hydrogen bonding pattern SH...S refers to the low temperature bonding pattern observed experimentally. The SH...O bonding pattern refers to a unit cell created from the experimental one with the C-C-S-H torsion angle rotated so that the the S-H is directed at a nearby oxygen atom and the unit cell and atoms are subsequently optimised to lower the energy  Table 19 summarises and compares the significant torsion angles found by experiment and determined using

Form I Unit-Cells and Super-Cells
A series of super-cells of Form I were created as described in Table 9