Atomistic and Thermodynamic Analysis of N6-Methyladenosine (m 6 A) Recognition by the Reader Domain of YTHDC1

: N6-Methyladenosine (m 6 A) is the most frequent modi ﬁ cation in eukaryotic messenger RNA (mRNA) and its cellular processing and functions are regulated by the reader proteins YTHDCs and YTHDFs. However, the mechanism of m 6 A recognition by the reader proteins is still elusive. Here, we investigate this recognition process by combining atomistic simulations, site-directed mutagenesis, and biophysical experiments using YTHDC1 as a model. We ﬁ nd that the N6 methyl group of m 6 A contributes to the binding through its speci ﬁ c interactions with an aromatic cage (formed by Trp377 and Trp428) and also by favoring the association-prone conformation of m 6 A-containing RNA in solution. The m 6 A binding site dynamically equilibrates between multiple metastable conformations with four residues being involved in the regulation of m 6 A binding (Trp428, Met438, Ser378, and Thr379). Trp428 switches between two conformational states to build and dismantle the aromatic cage. Interestingly, mutating Met438 and Ser378 to alanine does not alter m 6 A binding to the protein but signi ﬁ cantly redistributes the binding enthalpy and entropy terms, i.e., enthalpy − entropy compensation. Such compensation is reasoned by di ﬀ erent entropy − enthalpy transduction associated with both conformational changes of the wild-type and mutant proteins and the redistribution of water molecules. In contrast, the point mutant Thr379Val signi ﬁ cantly changes the thermal stability and binding capability of YTHDC1 to its natural ligand. Additionally, thermodynamic analysis and free energy calculations shed light on the role of a structural water molecule that synergistically binds to YTHDC1 with m 6 A and acts as the hub of a hydrogen-bond network. Taken together, the experimental data and simulation results may accelerate the discovery of chemical probes, m 6 A-editing tools, and drug candidates against reader proteins.


INTRODUCTION
N6-methyladenosine (m 6 A), as a representative of RNA epigenetic 1 (also called epitranscriptomic) 2 modification, plays essential roles in modulating many biological functions, including brain development, 3 chromatin accessibility, 4 heat shock response, 5 oocyte maturation, 6 immune response to infection, 7 and tumorigenesis. 8 Although it is the most prevalent chemical modification in human mRNAs, it is also found in tRNA, rRNA, and small nuclear RNA (snRNA) as well as several long noncoding RNAs, such as Xist and is related to important functions, 9 thus attracting profound attention as a therapeutically relevant target. The level of m 6 A modification is maintained by homeostasis reversibly and dynamically. 9 The m 6 A modification is installed on mRNA by a methyltransferase complex, i.e., METTL3/METTL14 (also called the writer enzyme). 10,11 It is reversibly removed by the corresponding demethylases (also called erasers), for example, FTO 12 and ALKBH5, 13 for regulating the homeostasis of m 6 A. The methylated RNA exerts its effects on the downstream cellular functions by binding m 6 A-reader proteins. Five direct reader proteins have been discovered thus far; they are YTHDC1, 14 YTHDC2, 15 and YTHDF1-3. 14, 16 These reader proteins are located and populated differently in different types of cells, thus modulating diverse biological functions. 17 YTHDC1, as a reader protein in the nucleolus, regulates mRNA splicing, 18 mediates nuclear export of m 6 A-containing mRNA, 19 and participates in Xist-mediated gene silencing. 20 All of these regulatory functions are achieved through its binding to m 6 A-modified RNA and therefore it is necessary to fully understand its molecular recognition mechanism. YTHDC1 is the first human reader protein whose structure was successfully determined 21 and was recently validated as a target of small-molecule inhibitors. 22 These pioneer studies make YTHDC1 an ideal model system for studying the binding mechanisms of reader proteins and their binders. The crucial interaction between YTHDC1 and its binders has been confirmed wherein an aromatic cage consisting of two tryptophan residues facilitates binding (Figure 1a,b). 21 A part of key residues contributing to the binding has been confirmed by X-ray crystallography and isothermal titration calorimetry. 23 However, the detailed mechanism of the molecular recognition is still vague because most previous studies drew conclusions based either on static structural models or macroscopic observations, thus missing the dynamic feature of the interactions or atomic resolution, respectively. Moreover, as newly discovered functions of epitranscriptomic proteins, the field has gradually turned its focus into unlocking new therapeutic targets, 24 editing tools of RNA methylation and demethylation, 25−27 medicinal chemistry, 22,28−30 and molecular mechanics. 31,32 This transition requires a thorough elucidation of how the proteins recognize their natural and artificial binders.
In this study, we focus on YTHDC1 as a model system to dissect the mechanism of m 6 A-RNA binding utilizing X-ray crystallography, isothermal titration calorimetry (ITC), differential scanning fluorimetry (DSF), site-directed mutagenesis experiments, and explicit solvent molecular dynamics (MD) simulations (Table S1). We analyze multiple factors impacting molecular recognition, namely, the residues in the binding site that (in)directly interact with m 6 A, bridging water molecules, and protein dynamics.

MATERIALS AND METHODS
Main computational methods implemented in this work are described in this section, and the experimental methods, including cloning, protein expression, purification, crystallography, isothermal titration calorimetry (ITC), differential Figure 1. Contributions of the N6 methyl group to the binding of GG(m 6 A)CU to YTHDC1. (a) YTHDC1-m 6 A complex structure. The recognition loop (binding loop) is colored in cyan and the binding site is shown in the transparent gray surface. (b) Key residues of the m 6 A binding to YTHDC1 are colored according to interactions (carbon atoms in green for direct hydrogen bonds, white for water-bridged hydrogen bonds, and salmon for aromatic stacking). (c) Thermodynamic cycle is used for determining the contribution of the methyl group of m 6 A to the free energy of binding. The binding contribution (ΔΔG) is obtained by alchemical simulations and the thermodynamic integration (TI) method. ΔΔG is achieved through the horizontal direction (alchemical pathway) instead of the vertical one (physical pathway). The chemical structures were drawn by MarvinSketch. 52 scanning fluorimetry (DSF), and other in silico protocols are presented in the Supporting Information.
2.1. Conventional (Unbiased) Molecular Dynamics Simulations. Eleven molecular systems for MD simulations were constructed based on previously published structures and newly released ones in this study. These systems include three apo systems (YTHDC1 protein only), four holo systems (YTHDC1 bound with oligoribonucleotides or m 6 A), two oligoribonucleotides (GGACU with m 6 A modification or not), and two unphysical hybrid systems for the alchemical simulations. Their setting information is summarized in Table S1. We take the holo structure YTHDC1 bound with m 6 A as an example to describe the model-building procedure. The original coordinates, including protein, ligand, and crystal water molecules, were extracted from Chain A of structure 6ZCN. The hydrogen atoms were added by the CHARMM program (version 42b2) 33 and the protonation states were determined at neutral pH conditions. The complex system was solvated in a rhombic dodecahedron (RHDO) TIP3P 34 water box (lattice length: 67 Å) to ensure a 10 Å buffer space between the macromolecular atoms and the boundary of the water box. To neutralize the system and mimic the physiological conditions, Na+ and Cl− ions at a 0.15 M concentration were added to the solvated systems. Other holo and apo systems were trimmed similarly with a slightly varied number of water molecules and salt atoms. For the two free oligoribonucleotides, namely, GGACU and GG(m 6 A)CU, they were solvated in a 47 Å of RHDO water box with 0.15 M of NaCl.
Each simulation system was initially minimized for 10 000 steps under a series of restraints and constraints on the solute atoms to release bad contacts and poor geometry. The minimized structure was heated to 300 K and equilibrated in an NVT condition (constant volume and temperature). Finally, the structure was further equilibrated in an NPT condition (constant pressure and temperature). All of the equilibration phases lasted for 1 ns using the CHARMM program (version 42b2). 33 Production runs of 1000 ns each were carried out in NPT conditions using the NAMD program (version 2.13). 35 The pressure was controlled by the Nose− Hoover Langevin piston method with a 200 ps piston period and 100 ps piston decay time. 36,37 The temperature was maintained at 300 K using the Langevin thermostat with a 5 ps friction coefficient. The integration time step was set to 2 fs by constraining all of the bonds involving hydrogen atoms by the SHAKE algorithm. van der Waals (vdW) energies were calculated using a switching function with a switching distance from 10 to 12 Å, 38 and electrostatic interactions were evaluated using the particle mesh Ewald summation (PME) method. 39 The Lennard−Jones long-range correction was enabled. 40 The CHARMM36 force field was used for protein 41 and oligoribonucleotide 42 molecules. For parameterizing m 6 A, the force field for naturally occurring modified ribonucleotides was used. 31 Five independent runs with random initial velocities were carried out for each system for a total of 5 μs. MD snapshots were saved every 20 ps along the MD trajectories for further analysis. Geometric measurements, for example, dihedral angles, root-mean-square deviation (RMSD), and solvent-accessible surface area (SASA) analysis, were performed with CHARMM routines. All statistical figures were plotted by MATLAB (Version 2018a) 43 and structural figures were generated with PyMOL graphic software (Version 2.2). 44 Maestro (Version 11.5) was used for analyzing protein−ligand interactions. 45 2.2. Alchemical Free Energy Calculations. 2.2.1. m 6 Ato-Adenosine Transformation. The contribution of the methyl group to the binding was determined using the alchemical free energy simulation method. In doing so, a thermodynamic cycle was designed to reduce the computational expense (Figure 1c). Instead of simulating the binding event physically (vertical arms in Figure 1c), an unphysical pathway (horizontal arms in Figure 1c) was followed. Because the realistic association process of YTHDC1 and m 6 A is not the focus, such alchemical simulations enable us to enhance the sampling on the local dynamics, therefore producing accurate thermodynamic quantities, namely, the binding difference caused by methylation (ΔΔG).
A hybrid molecule presenting m 6 A and adenosine simultaneously was constructed. The molecular topology follows the dual topology strategy. The transformation was conducted in both protein and aqueous environments. Each transformation (addition of the methyl group to or removing it from adenosine) was carried out in three steps. In the first step, the partial charges on the methyl group and several associated atoms were removed (i.e., uncharging step). In the second step, the uncharged methyl group was transformed into an uncharged amino group with a soft-core potential (implemented by the PSSP routine in the PERT module). 46 In the final step, the charges on the adenosine were restored (i.e., charging step). Each transformation was accomplished in 11 λ steps (that is λ = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0 values), where λ is the coupling parameter for a linear transformation between the starting and ending points. At each λ value, a simulation of 3 ns in total was carried out and the last 2 ns were used for free energy calculations by thermodynamic integration (TI). To remove the bias when estimating the free energy change, uncorrelated data were extracted by observing their correlation time. The bootstrapping protocol was applied to obtain the associated error of the free energies with 1000 repetitions for each transformation. The convergence of the determined free energy values was checked by plotting its time series ( Figure S1c). All simulations were carried out using the PERT module in the CHARMM program (version 42b2).

Binding Free Energy
Calculations of Water 1 to YTHDC1. The binding free energy of water 1 was calculated by a double decoupling protocol, that is, the water molecule was annihilated in protein (or complex) and bulk water separately by alchemical transformation simulations. For each annihilation process, a two-step alchemical transformation protocol was applied. The electrostatic interaction of water 1 was first decoupled from the environment, and then the vdW interaction was removed with the soft-core potentials. The removal of the electrostatic interaction was accomplished by 11 λ simulations, that is, λ = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0. The removal of the vdW interaction was achieved by 16 λ simulations, that is, λ = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.825, 0.850, 0.875, 0.9, 0.95, 0.975, and 1.0. Each λ simulation lasted for 3 ns, and thus in total 81 ns were needed to finish an annihilation process.
We are interested in the binding free energies of water 1 to both apo and complex (bound to m 6 A) systems in their wild types and Thr379Val mutants. Therefore, water 1 was annihilated in five systems, i.e., apo proteins in their wild type and mutant, complex systems in their wild type and mutant and bulk water. For four protein-involved systems, the Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article system setting-up, minimization, and equilibration were the same as the description in Section 2.1. Throughout the alchemical simulations, a mass-weighted harmonic potential (force constant of 0.5 kcal/mol/Å 2 ) was applied to restrain heavy atoms of protein, m 6 A, and water 1. For the bulk water system, water 1 was restrained in the center of a 40 Å of Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article RHDO water box with the same force constant as those in protein systems. All of the alchemical simulations were conducted by the PERT module of CHARMM. Finally, the four binding free energies were calculated based on the alchemical simulations described above, that is, ΔGw1 apo wild (binding free energy of water 1 to the wild-type apo protein), ΔGw1 apo Thr379Val (to the Thr379Val mutant apo protein), ΔGw1 holo wild (binding free energy of water 1 to the wild-type holo protein), and ΔGw1 holo Thr379Val (to the Thr379Val mutant holo protein). Each of the values was the difference between an annihilation free energy of water 1 in a protein-involved system and the one in bulk water (shown in Table S3).
2.3. Metadynamics. The unbiased MD simulations revealed significant conformational changes of several residues in the binding pocket of YTHDC1. However, the conformational changes cannot be sampled sufficiently by the unbiased MD simulations. To settle this issue, metadynamics 47 was used to examine the conformational changes quantitatively. The metadynamics routine implemented in NAMD (version 2.13) was used in this study. 35 Trp428 experienced a flipping during the MD simulations ( Figure 2c). The flipping can be well presented by its dihedral angle χ 2 (CA-CB-CG-CD2 shown in Figure 2c). This dihedral angle was selected as the collective variable for the metadynamics simulations. The Gaussian "hill" was added to the original potential by setting the parameters "hillweight" and "hillwidth" to 0.05 and 1.5, respectively. χ 2 ranged from −180 to 180°with a 2°bin width. The frequency of a newly added Gaussian hill was set to every 500-integration step. A complete PMF profile was saved every 10 ns. To ensure enough sampling and reproducibility, 5 replicas were conducted with 500 ns for each. In this study, we were only interested in the local conformational changes of Trp428, which is its flipping in the closed binding pocket. Thus, the contribution from large conformational changes, for example, the loop opening, was excluded. To map the final PMF profile (Figure 2c), all PMF output files from 5 replicas were combined, and the associated error was estimated by the bootstrapping method.

RESULTS AND DISCUSSION
3.1. Influence of N6 Methylation on the Recognition of m 6 A-Containing RNA by YTHDC1. The N6 methyl group of m 6 A is the structural warhead for the specific recognition by m 6 A reader proteins and thus is expected to contribute substantially to the binding of m 6 A-containing RNAs to reader proteins. However, it is not clear how and how much the methyl group contributes towards the binding. A study has shown that the methyl group is vital for the binding as the unmodified GGACU completely loses its binding affinity to YTHDC1. 21 Another work, however, reported an approximately 50-fold reduction (∼2.3 kcal/mol) after removing the methyl group from an oligoribonucleotide. 48 To quantify the contribution of the methyl group, we devised a thermodynamic cycle (Figure 1c) and conducted a series of alchemical free energy calculations. 49,50 The calculations show that the van der Waals (vdW) interaction dominates the binding energy and the methyl group contributes approximately 1.7 kcal/mol to the binding, which corresponds to a 16-fold stronger affinity than the unmethylated nucleoside. The robustness and convergence of the calculations were checked by the alchemical transformation integrands and the time series of the simulations ( Figure S1). Multiple MD simulations (for a cumulative sampling of 5 μs) show that the methyl group stabilizes the consensus oligoribonucleotide GG(m 6 A)CU in the binding pocket of YTHDC1. By contrast, the complex with the unmethylated adenosine of GGACU is structurally less stable ( Figure S2). The key hydrogen bond interactions (shown in Figure 1b) are not well maintained in the unmethylated GGACU system and the adenosine even dissociates from the aromatic cage occasionally in some of the trajectories.
The binding difference can be partially explained by the direct interaction of the methyl group with the aromatic cage. Besides, previous MD simulations suggest that the methylation shifts the conformational preference of the free oligoribonucleotide GG(m 6 A)CU towards bound-like conformations in the aqueous solvent. 51 Here, we extended the MD simulations for the unbound oligoribonucleotides in solution (collectively 5 μs for each system) and reanalyzed the data by a twodimensional projection of the conformational changes. The conformational free energy analysis clearly shows that the m 6 Amodified oligoribonucleotide adopts a metastable state near the bound-like region in the aqueous solvent while the unmethylated counterpart does not (circled regions in Figure  S3a,b). Moreover, methylation confines the conformational space of the oligoribonucleotide. A further projection onto a single dimension shows that the bound-like conformation is favored by 0.5−1.0 kcal/mol caused by methylation ( Figure  S3c,d). In short, m 6 A not only fits better in the aromatic cage than the unmodified adenine but the additional methyl also results in a free energy landscape of the oligoribonucleotide GG(m 6 A)CU in the aqueous solution more prone to binding to the cage.
3.2. Hidden Metastable Conformations of the Aromatic Cage. Most residues that are directly in contact with m 6 A do not undergo evident changes in existing apo and holo forms of YTHDC1, 21 which is further confirmed by our new m 6 A-YTHDC1 complex (PDB ID: 6ZCN). A previously published YTHDC1 apo structure (PDB ID: 4R3H) has an unaccounted electron density in the binding pocket, which might originate from imidazole of the purification buffer. Also, the aromatic cage may be sustained by other buffer ingredients, for example, 1,2-ethanediol (PDB ID: 6WE8). 53 Thus, we sought to purify the protein without the histidine-tag, which requires the use of imidazole. Therefore, a GST-tagged YTHDC1 construct was cloned, purified, and crystallized in the same conditions as those used for solving the structure 4R3H. A new crystal arrangement was obtained for the apo structure with two YTH domains in one asymmetric unit. Interestingly, the geometries of the binding pocket are different in the two chains (PDB ID: 6ZD9, Figure 2a,b). The conformation in chain A resembles the canonical one (called here Met438-out conformation). In chain B, we observe a sidechain flipping of Met438 from the solvent into the binding pocket (called Met438-in conformation), resulting in an obstructed aromatic cage for the m 6 A binding.
The plasticity of the binding site, which emerges from our new structure, motivated us to further explore conformational changes. To this end, MD simulations were carried out starting from two different apo conformations (Figure 2a,b). First, five independent MD simulations of 1μs each were run for each of the two conformations. From the canonical apo conformation, we observed a conformational change of Trp428. Its side chain rotated more than 100°relative to its crystal position and thus the conformational change deformed the aromatic cage (Figures 2c and S4a). Because this new conformation was Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article not observed in crystal structures, we used a metadynamics 47 enhanced sampling protocol to validate its reliability. The potential of mean force (PMF) along the χ 2 dihedral of Trp428 shows that the two conformations are equally populated and they are separated by two free energy barriers of similar height (about 5 kcal/mol) for rotation in the two directions, respectively (Figure 2c). For the Met438-in conformation (Figure 2b), the unbiased simulations (collectively 5 μs) show that Met438 rotates out of the aromatic cage, while Leu439 enters in, resulting in the recovery of the canonical aromatic cage ( Figure S4b). This result suggests that the Met438-in conformation is a metastable state. By comparing the two new conformations (Met438-in and Trp428-flipped, respectively) with the canonical one (Met438-out conformation), we find that the conformational changes reshape the binding site (Figure 2d−f). Thus, these unique conformations may be beneficial to the discovery of selective chemical probes to YTHDC1 against other m 6 A-binding proteins. Besides, as the newly observed conformations sterically block the binding site of m 6 A, their existence may play a role in disfavoring the association of m 6 A with YTHDC1.

Structural and Thermodynamic Analysis of m 6 A-Recognizing
Residues. Previous studies have revealed several key residues responsible for the m 6 A binding, including Trp377, Trp428, and Asn367, 21,23 which directly interact with m 6 A. However, our simulations and new crystal structures suggest that some residues that are not in direct contact with m 6 A also play a substantial role in ligand recognition. For example, Ser378 interacts with the side chain of Trp428 and the backbone of Gly441 through hydrogen bonds and thus may contribute to the structural stability of the aromatic cage and binding pocket loop (Figure 2a). Met438 reshapes the binding site as discussed above (Figure 2b). The side-chain hydroxyl group of Thr379 is integrated into a hydrogen-bond  The table shows the values of stoichiometry (N), enthalpy (ΔH), and entropy (−TΔS). Data for the wild-type protein and GG(m 6 A)CU are from a recent work. 51 The given standard deviation is calculated from the ITC curve fitting by MicroCal Origin software.
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article network and indirectly interacts with m 6 A via a conserved water molecule. To confirm their roles in the binding of m 6 A, we performed point mutagenesis and crystallized apo and m 6 Abound structures (Figure 3 and Table S2). The structural comparison shows that the mutations do not affect the overall architecture of the binding site or the position of the conserved water but increase the plasticity of the binding pocket loop. Furthermore, m 6 A stabilizes the recognition loop (binding loop) in both wild-type and mutant structures, in contrast with the partially disordered loop in the absence of m 6 A (comparing apo and holo structures in Figure 3). To complement the structural data and simulation results, isothermal titration calorimetry (ITC) was used to measure the thermodynamic parameters for binding of the m 6 A nucleoside and the preferred oligoribonucleotide GG(m 6 A)CU to the wild type and mutants of YTHDC1. Met438Ala and Ser378Ala do not significantly alter the dissociation constants (K d ) compared to the wild type (Table 1 and Figure S5). In contrast, the Thr379Val mutation substantially reduces the affinity to m 6 A (approximately 20-fold) and GG(m 6 A)CU (approximately 140-fold). Further, we employed the differential scanning fluorimetry (DSF) assay to measure changes in the melting temperature of the YTHDC1 variants upon binding of m 6 A and GG(m 6 A)CU. Met438Ala and Ser378Ala show a similar thermal stability as the wild type (ΔT m ≈ 2°C) while no significant thermal shift was observed for Thr378Val even at a 500 μM concentration of m 6 A or 100 μM concentration of GG(m 6 A)CU (Figure 4a). Moreover, the protein thermal stability is significantly reduced (ΔT m ≈ 3°C) by the Thr379Val mutation (Figure 4b), indicating a dual role of Thr379 in the binding of m 6 A and stabilization of the protein.
3.4. Roles of the Pocket Residues in Regulating the Protein and Water Structures upon m 6 A Binding. Although Met438Ala and Ser378Ala mutations do not considerably alter the binding affinity of YTHDC1 to m 6 A and GG(m 6 A)CU, they significantly change the thermodynamic parameters, i.e., enthalpic and entropic terms as measured by ITC (Table 1 and Figure S5). These changes indicate that Met438 and Ser378 may play roles in fine-tuning the specificity of the m 6 A binding to YTHDC1 and the protein dynamics. To dissect these aspects, we compared the differences between the wild type and mutants on their thermodynamics, structural stability, and protein dynamics before and after the m 6 A binding.
The Met438Ala mutation redistributes enthalpic and entropic contributions upon the binding of GG(m 6 A)CU (Table 1 and Figure S5) compared to the wild type. This leads to an enthalpy−entropy compensation with an unfavored enthalpic change (ΔΔH = 4.4 kcal/mol) and favored entropic change (−TΔΔS = −3.6 kcal/mol) as measured by ITC. The compensation phenomenon can be explained, at least in part, by the existence of the Met438-in conformation (Figure 2b). First, the Met438-in conformation is more rigid (lower Bfactor in Figure S6) than the Met438-out conformation. The transition from Met438-in to Met438-out elevates the local protein dynamics and thus increases the conformational entropy of the binding site. Moreover, the Met438-in conformation confines more crystal water molecules in its binding pocket and pocket loop than the Met438-out conformation (25 versus 17 crystal water molecules, respectively, the top panel of Figure S7). This observation is further supported by the number of water molecules in the first hydration shell sampled by MD simulations (70 versus 65 crystal water molecules, the bottom panel of Figure S7). The switch from Met438-in to Met438-out conformation releases a number of water molecules, thereby increasing the entropy of the system. Therefore, the existence of the Met438-in conformation, which is unlikely for the Met438Ala mutant, disfavors the conformational entropy of the protein and solvent, while it improves the enthalpy of the system because of interactions among Met438, confined water molecules, and the rest of the binding pocket.
The Ser378Ala mutation also leads to an enthalpy/entropy redistribution according to the ITC measurements, but its compensatory mechanism is different from the Met438Ala mutation. Compared to the binding of GG(m 6 A)CU to the wild-type protein, this mutation penalizes the entropic term by 2.4 kcal/mol and ameliorates the enthalpic term by 1.9 kcal/ mol. Ser378 interacts with Trp428, Val442, and Gly441 by multiple hydrogen bonds and connects the binding pocket and the pocket loop (PDB ID: 6ZD9 in Figure 3). The Ser378Ala mutation loses the polar interactions and thus destabilizes the closed binding loop (PDB ID: 6ZD4 in Figure 3). Upon the binding of m 6 A, the Ser378Ala mutant has to pay more entropic penalty for transforming the more flexible loop to its bound state than the wild type. This is consistent with the increased entropic term. Concerning the more favorable enthalpic term, one plausible explanation is that the Ser378Ala mutation makes the aromatic cage more lipophilic and thereby enables a more favorable vdW interaction with the N6 methyl group of m 6 A.
3.5. Synergistic Binding of the Structural Water Molecule (Water 1) and m 6 A to YTHDC1. The Thr379Val mutation is an isosteric replacement that results in the loss of a hydrogen bond with the structural water (called water 1 hereafter, Figure 5a,b). The DSF data and crystal structure (PDB ID: 6ZD7) suggest that water 1 is involved in m 6 A recognition and protein stability (Figure 4a,b) but its contributions are difficult to quantify specifically by experiments. To solve this issue, we used three theoretical methods  (Table S3). First, a rapid water sampling method, i.e., SZMAP, 54 shows that water 1 binds to the wild type much stronger than to Thr379Val. The same trend was also obtained from the inhomogeneous solvation theory (IST). 55−57 Finally, we used the alchemical free energy calculation method to evaluate the binding affinity of water 1 to apo and holo YTHDC1. For the wild type, the binding free energy of water 1 to the m 6 A-YTHDC1 complex is −5.1 kcal/mol and that to the apo protein is −1.9 kcal/mol. For Thr379Val, the corresponding values are −3.0 and −2.0 kcal/mol for the holo and apo proteins, respectively. These values suggest that water 1 is indeed a structural water molecule and thus is important for maintaining the folded structures of YTHDC1. Moreover, water 1 may also be essential for the recognition between m 6 A and YTHDC1. This is supported by the calculated free energy difference of water 1 to the holo proteins of the wild type and Thr379Val (ΔΔG = −2.1 kcal/ mol). The number is surprisingly in line with the corresponding ITC data for the m 6 A binding difference (ΔΔG = −1.8 kcal/mol). The free energy difference originates, at least in part, from the impairment of the hydrogen-bond network bridged by water 1 between m 6 A and YTHDC1 (Figure 5a,b). Interestingly, water 1 has a similar binding affinity to the wild type and Thr379Val in their apo forms (−1.9 kcal/mol versus −2.0 kcal/mol, respectively), which cannot explain their different thermal stabilities (Figure 4b).
We thus decided to investigate the relative stability by MD simulations of the unbound state (5 μs for each system). The backbone fluctuations along the MD trajectories show that the wild-type YTHDC1 is more rigid than Thr379Val ( Figure S8), which indicates that the higher thermal stability of the wild type is mainly due to more stable intraprotein interactions (enthalpic contribution). Taken together, it is a synergistic binding of m 6 A and water 1 to YTHDC1.
The inhomogeneous solvation theory (IST) results show that the apo binding site is mostly occupied by thermodynamically unfavored water molecules except water 1 ( Figure S9). Upon the binding of m 6 A, the unstable water molecules are well replaced by the nucleobase of m 6 A. Notably, the N6 methyl group replaces the most unstable water molecule (water 2 in Figure S9: +7.3 kcal/mol). Simultaneously, N7 of m 6 A forms a hydrogen-bond network together with water 1 and multiple polar residues. The water analysis by IST provides a unique angle to view the specific molecular recognition beyond hydrogen bonding of m 6 A to residues Ser378, Asn363, and Asn367. Because water 1 is thermodynamically stable, replacing it with YTHDC1 binders must compensate for the binding free energy of water 1. We have recently released in the PDB 50 crystal structures of YTHDC1 in the complex with small-molecule ligands. A few of these ligands replace water 1 according to the crystal structures ( Figure S10) but show undetectable binding even at a very high concentration near their solubility limit. Most ligandefficient binders discovered so far do not replace water 1 but rather form a hydrogen bond with it. 22 Nonetheless, an unstable water molecule was also found near water 1 and located in a hydrophobic tunnel (water 3 in Figure S9). Thus, it seems possible to simultaneously replace water 1 and water 3 by new chemical entities, thereby merging the m 6 A binding pocket and the hydrophobic tunnel.

CONCLUSIONS
We have used atomistic simulations and biophysical experiments to examine the molecular recognition mechanism of m 6 A-containing RNA to the reader domain of YTHDC1 and in particular the role of key residues in the binding site. We have confirmed that the specific recognition originates from the direct interaction of the N6 methyl group of m 6 A with the aromatic cage. Furthermore, we have unveiled that the addition of the methyl group shifts the energy landscape of the free GG(m 6 A)CU (in aqueous solution) toward a bindingprone conformation. Concerning individual side chains, Met438 and Ser378 regulate the flexibility and hydration around the binding pocket. The Thr379 side chain plays an important role in both the binding of m 6 A and protein stability. A structural water molecule that interacts with Thr379 synergistically binds to YTHDC1 with m 6 A. This water molecule is conserved among five reader proteins, including Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article YTHDF proteins, 23,58,59 thus providing a general mechanism for recognizing m 6 A-containing RNA. Importantly, the MD simulations and protein crystallography have revealed that the reader domain of YTHDC1 undergoes conformational switches between multiple metastable states. The comprehensive atomistic and thermodynamic analysis is useful for the development of selective chemical probes for m 6 A-related functional studies in cells.
■ ASSOCIATED CONTENT * sı Supporting Information The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.0c01136. Experimental and simulation methods; X-ray crystallographic statistical data; ITC titration curves; and more figures for the analyses of MD simulations (PDF)