Structural Insights into Methylated DNA Recognition by the Methyl-CpG Binding Domain of MBD6 from Arabidopsis thaliana

Cytosine methylation is an epigenetic modification essential for formation of mature heterochromatin, gene silencing, and genomic stability. In plants, methylation occurs not only at cytosine bases in CpG but also in CpHpG and CpHpH contexts, where H denotes A, T, or C. Methyl-CpG binding domain (MBD) proteins, which recognize symmetrical methyl-CpG dinucleotides and act as gene repressors in mammalian cells, are also present in plant cells, although their structural and functional properties remain poorly understood. To fill this gap, we determined the solution structure of the MBD domain of the MBD6 protein from Arabidopsis thaliana and investigated its binding to methylated DNA by binding assays and an in-depth NMR spectroscopic analysis. The AtMBD6 MBD domain folds into a canonical MBD structure in line with its binding specificity toward methyl-CpG and possesses a DNA binding interface similar to mammalian MBD domains. Intriguingly, however, the binding affinity of the AtMBD6 MBD domain toward methyl-CpG-containing DNA was found to be much lower than that of known mammalian MBD domains. The main difference arises from the absence of positively charged residues in AtMBD6 that supposedly interact with the DNA backbone as seen in mammalian MBD/methyl-CpG-containing DNA complexes. Taken together, we have established a structural basis for methyl-CpG recognition by AtMBD6 to develop a deeper understanding of how MBD proteins work as mediators of epigenetic signals in plant cells.


■ INTRODUCTION
A certain percentage of cytosine bases in eukaryotic DNA exist in an epigenetically modified form, as 5-methylcytosine. Cytosine methylation plays essential roles in numerous vital functions such as repression of gene expression, organization of the chromatin structure, and inactivation of transposons. 1,2 In animal cells, cytosine methylation occurs mostly at symmetrical CpG sequences and is achieved by the action of DNA methyltransferases. 1,3 5-Methylcytosine may be further oxidized to 5-hydroxymethylcytosine. These two modifications are examples of distinct epigenetic marks, which can then be specifically recognized ("read-out") with specific reader domains in many proteins.
Methyl-CpG binding domain (MBD) proteins were first identified in mammals, which have CpG methylation levels as high as 70−80%, 3 as chromatin regulators that recognize cytosine methylation at CpG dinucleotides (hereafter: methyl-CpG sites). 4,5 Within the MBD family, several "canonical" MBD proteins from humans (Homo sapiens, Hs) have been intensively studied. For example, HsMeCP2 and HsMBD1 act as transcriptional repressors by binding to the methyl-CpG site and then recruiting histone methyltransferases and histone deacetylases, which promote chromatin condensation (i.e., formation of mature heterochromatin). 6 As another example of a canonical MBD protein, the HsMBD4 MBD domain preferentially recognizes (mismatch) TpG/methyl-CpG and hydroxymethyl-CpG/methyl-CpG sites, which are intermediate products in the course of active DNA demethylation. 7,8 In addition to the MBD domain, these MBD proteins contain one or more functional domains that define their specific function, such as the transcription repression domain in HsMBD1 and the glycosylase domain in HsMBD4. 6
In plants, cytosine bases in not only CpG but also symmetrical CpHpG and asymmetrical CpHpH contexts (H represents A, T, or C) can be methylated. 9,10 All these methylation patterns are thought to be established by the RNA-directed DNA methylation mechanism and maintained through different regulatory pathways. 11,12 Previous studies demonstrated that genome-wide levels of cytosine methylation in plants vary widely among species; 24% CpG, 6.7% CpHpG, and 1.7% CpHpH are methylated in thale cress (Arabidopsis thaliana, At), 10 whereas high methylation rates of 87−88% CpG, 67−68% CpHpG, and 41−43% CpHpH are observed in rice cultivars. 13 Epigenetic mark reader domains, such as MBD proteins, are also found in plants. Interestingly, however, the domain organization of plant MBD proteins is different from that of mammalian MBD proteins. 14 The differences in the sequence contexts and the recognition machinery of cytosine methylation between mammals and plants indicate that epigenetic regulation via DNA methylation in plant cells contains multiple pathways that do not exist in mammalian cells. However, the molecular mechanism by which each methylation pattern is read out, interpreted, and translated into downstream signals is still largely unclear.
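The three sequence contexts defined above (CpG, CpHpG, and CpHpH, with H = A, T, or C) can be made concrete with a short sketch. The function name and the "undetermined" label for cytosines too close to the 3' end are illustrative assumptions and are not taken from this study.

```python
# Minimal sketch: classify the sequence context of a cytosine on one strand.
# H = A, T, or C, as defined in the text. Names and the "undetermined" label
# are illustrative assumptions.
H_BASES = set("ATC")


def cytosine_context(seq: str, i: int) -> str:
    """Return the methylation context of the cytosine at 0-based index i (5'->3')."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position i is not a cytosine")
    first = seq[i + 1:i + 2]   # base directly 3' of the cytosine ('' if absent)
    second = seq[i + 2:i + 3]  # second base 3' of the cytosine ('' if absent)
    if first == "G":
        return "CpG"
    if first in H_BASES and second == "G":
        return "CpHpG"
    if first in H_BASES and second in H_BASES:
        return "CpHpH"
    return "undetermined"  # too close to the 3' end, or a non-ACGT symbol


# The three site types probed in the binding assay: CpG, CpApG, and CpApA.
for site in ("CG", "CAG", "CAA"):
    print(site, cytosine_context(site, 0))
```

Only one strand is inspected here; CpG and CpHpG are symmetrical contexts because the opposite strand also carries a cytosine in the same context, whereas CpHpH is not.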
In A. thaliana, 13 proteins, AtMBD1−AtMBD12 and AtIDM1, were identified as putative MBD proteins on the basis of their sequence homology with mammalian MBD proteins. Previous studies have shown by fluorescence microscopy that at least three of them, AtMBD5, AtMBD6, and AtMBD7, are localized to chromocenters abundant in cytosine methylation. 15 Although the subnuclear distribution indicated that these MBD proteins recognize methylated DNA, both their binding specificity and affinity toward methylated DNA in various sequence contexts remain elusive. Furthermore, no structural information on the MBD domains from plants has been published.
Here, we focused on one particular MBD protein from A. thaliana, AtMBD6. AtMBD6 does not contain any known functional domains except the MBD domain. On the basis of the subnuclear colocalization observed by fluorescence microscopy and in vitro binding examined by pull-down assays, it has been speculated that AtMBD6 is involved in maintenance of DNA methylation mediated by an ATP-dependent DNA helicase, AtDDM1. 15 Another previous study suggested, on the basis of yeast two-hybrid assays and Förster resonance energy transfer experiments, that AtMBD6 interacts with proteins involved in the RNA-directed DNA methylation pathway, including an RNA binding protein, AtAGO4, and a histone deacetylase, AtHDA6. 16 However, many fundamental properties of AtMBD6 such as the structure and the binding specificity toward methylated DNA remain to be elucidated. In the present study, we characterized the AtMBD6 MBD domain using NMR spectroscopy to establish a structural basis for understanding the roles of MBD domains in the complex epigenomic regulation of plant cells.
■ RESULTS
Binding Specificity of the AtMBD6 MBD Domain toward Methylated DNA. First, we examined the binding preference of the MBD domain of AtMBD6 (residues 78−140; MBD AtMBD6) toward DNA harboring single CpG, CpApG, or CpApA sites in methylated or nonmethylated states (Table 1) using a gel shift assay. MBD AtMBD6 showed specific binding to the DNA harboring the methyl-CpG site and roughly the same degree of nonspecific binding to all other DNA samples examined (Figures 1 and S1). Since a previous study had indicated that AtMBD6 could form both a homodimer and a heterodimer with AtMBD5 in vitro, 17 we considered that formation of a homodimer might alter the observed binding specificity of MBD AtMBD6 toward methylated DNA. However, two-fold serial dilutions of a highly concentrated solution of MBD AtMBD6 showed no detectable changes in the backbone amide chemical shifts, indicating that MBD AtMBD6 exists as a monomer in solution even at concentrations as high as 200 μM (Figure S2). Taken together, these results indicated that MBD AtMBD6 is a canonical monomeric MBD domain that specifically recognizes the methyl-CpG sites in DNA.
Solution Structure of the AtMBD6 MBD Domain. To gain structural insights into the DNA binding of MBD AtMBD6, we determined the solution structure of MBD AtMBD6 in the free form by NMR spectroscopy. In the initial structure calculation, interproton distance restraints from three-dimensional NOESY spectra and dihedral angle restraints derived from backbone and β-carbon chemical shifts were used. The lowest-energy structure of the NOE-based structure calculation was then used as the initial structure for further structural refinement, in which residual dipolar coupling (RDC) restraints were additionally applied. The 20 lowest-energy structures obtained were well defined, as indicated by a backbone root-mean-square deviation (RMSD) of 0.6 ± 0.2 Å, except for a loop region (residues 93−98) and a disordered C-terminal region (residues 128−140) (Figure 2A and Table 2).
The structure is composed of three β-strands, an α-helix, and a flexible loop L1 that connects the β-strands β1 and β2 (Figure 2B). This fold is indeed the canonical fold of mammalian MBD domains. Accordingly, the overall structure is quite similar to the solution and crystal structures of mammalian MBD domains in complex with methyl-CpG-containing DNA reported previously (Figure 2C). 18−20 The structural flexibility of the L1 loop is consistent with relatively small backbone heteronuclear NOE values previously observed for residues in this loop, 21 and very similar observations have also been reported for the free forms of the MBD domains of HsMBD1 and HsMeCP2. 22,23 In addition, four important residues, including two arginine residues termed arginine fingers (RFs), which are conserved among mammalian MBD domains, are also present at the corresponding positions in the structure of MBD AtMBD6 (Figure 2C, shown as sticks). In the reported complexes of human MBD domains and DNA, these residues were shown to form unique structural motifs to recognize the methyl-CpG sites via hydrophobic interactions with the methyl groups of 5-methylcytosine bases and direct or water-mediated hydrogen bonds with 5-methylcytosine and guanine bases. 18,19,24 Taken together, our data showed that MBD AtMBD6 adopts the canonical MBD fold in solution, which is in good agreement with the result of the gel shift assay showing the binding specificity of MBD AtMBD6 toward methyl-CpG-containing DNA.
Binding Interface and Affinity of the AtMBD6 MBD Domain for Methyl-CpG-Containing DNA. In order to reveal the mechanism of methyl-CpG recognition by MBD AtMBD6, we performed NMR titration experiments using 15N-labeled MBD AtMBD6 and unlabeled methyl-CpG-containing DNA as a ligand. Almost all cross-peaks of MBD AtMBD6 showed some degree of displacement upon the addition of DNA solution (Figure 3A). To examine whether any structural changes of MBD AtMBD6 occurred upon binding to methyl-CpG-containing DNA, we compared the secondary structure propensity (SSP) score of each residue in the free state 21 and the bound state. No significant difference between the SSP scores of the two forms was observed (Figure S3), suggesting that there are no major changes in the secondary structure of MBD AtMBD6 upon methyl-CpG-containing DNA binding. Next, we calculated the normalized chemical shift difference (CSD) values of all backbone amide resonances to understand which residues contribute most significantly to methyl-CpG-containing DNA binding. The binding surfaces were similar to those observed in the case of human MBD proteins (Figure 3B); 22,23 the residues that displayed large CSD values are mainly located in the two RFs, the L1 loop, and the α-helix, whose corresponding regions in mammalian MBD proteins are in contact with methyl-CpG-containing DNA (Figure 3C). 8,19 Notably, two residues in the L1 loop, G95 and A98, known as unique reporters for the methyl-CpG-specific binding mode of previously studied MBD domains, 25−27 showed chemical shift changes very similar to the corresponding residues in these MBD domains (in terms of both direction and magnitude) upon interaction with methyl-CpG-containing DNA. All in all, our results indicate that MBD AtMBD6 uses an interface similar to that of conserved mammalian MBD domains to bind methyl-CpG-containing DNA.
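As an illustration of how a normalized CSD value can be computed from the 1H and 15N shift changes of a backbone amide, the following sketch uses a commonly applied weighted Euclidean form. The 15N scaling factor (0.2), the mean + 1 SD cutoff, and the residue labels and shift values are assumptions made for illustration; the exact normalization used in this study is not stated in this section.

```python
import math
import statistics


def normalized_csd(d_h_ppm: float, d_n_ppm: float, n_weight: float = 0.2) -> float:
    """Weighted 1H/15N chemical shift difference in ppm.

    n_weight scales the 15N change onto the 1H scale; 0.2 is an assumed,
    commonly used value, not one reported in this study.
    """
    return math.sqrt(d_h_ppm ** 2 + (n_weight * d_n_ppm) ** 2)


def residues_above_cutoff(csd_by_residue: dict) -> list:
    """Return residues whose CSD exceeds mean + 1 population SD (assumed cutoff)."""
    values = list(csd_by_residue.values())
    cutoff = statistics.mean(values) + statistics.pstdev(values)
    return [res for res, value in csd_by_residue.items() if value > cutoff]


# Invented example values, for illustration only (not measured data).
example = {"res_a": 0.02, "res_b": 0.03, "res_c": 0.25, "res_d": 0.04}
print(normalized_csd(0.03, 0.2))       # combined shift change for one amide
print(residues_above_cutoff(example))  # residues flagged as strongly perturbed
```

Other weights for the 15N dimension are in use in the literature, so CSD values are comparable between studies only when the same normalization is applied.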
[Figure caption fragment] Four conserved residues that have been reported to be essential for specific recognition of the methyl-CpG sites in mammalian MBD domains are displayed as sticks. In the reported crystal structure of the HsMeCP2/methyl-CpG-containing DNA complex, these four residues play essential roles in methyl-CpG recognition as follows: 19 R111 (RF1) and R133 (RF2) form hydrogen bonds with guanine bases of the methyl-CpG site. R133 also engages in hydrophobic interactions with the methyl group of one of the two 5-methylcytosines. The carbonyl group of D121 forms a weak hydrogen bond with the methyl group of the second 5-methylcytosine and is a part of a water-mediated hydrogen bond network, in which the methyl groups of both 5-methylcytosines are involved. D121 also stabilizes the RF1 by formation of a salt bridge. Y123 contributes to recognition of the two 5-methylcytosines by participating in the hydrogen bond network. In HsMBD4, which preferentially recognizes the TpG/methyl-CpG mismatch sites, the side chain of this tyrosine (Y109) is flipped out.
[Table fragment]
(row label not recoverable): 1.1 ± 0.1 Å
Ramachandran plot statistics (residues 78−127)
residues in the most favored regions: 89.2%
residues in additionally allowed regions: 9.0%
residues in generously allowed regions: 1.8%
residues in disallowed regions: 0.0%
Interestingly, several peaks exhibited significant line broadening when the molar ratio of DNA to MBD was in the range of 0.1−0.5 (Figure 3A, enlarged view). In addition, the magnitude of line broadening appeared to be roughly proportional to the degree of change in chemical shift upon binding. This observation indicated that MBD AtMBD6 binds to methyl-CpG-containing DNA in the fast to intermediate exchange regime on the NMR timescale; in general, this regime is associated with dissociation constants K d ranging from several micromolar to several tens of micromolar. 28,29 In order to determine the K d value for this interaction, we fitted the chemical shift changes during the course of the titration to a two-state binding model (Figure 3D). The K d value was obtained as 40.2 ± 0.5 μM, which is reasonable for the observed fast to intermediate exchange regime. Previous studies have revealed that several mammalian MBD domains show high affinity toward methyl-CpG-containing DNA with K d values from several tens of nanomolar to a few micromolar. 24,30 Therefore, our results clearly show that the binding affinity of MBD AtMBD6 toward methyl-CpG-containing DNA is much lower in comparison to that of the canonical MBD domains, even though MBD AtMBD6 shares the DNA binding interface with its mammalian counterparts.
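To illustrate the kind of two-state (1:1) fit used to extract K d from titration data, the sketch below implements the standard binding isotherm with ligand depletion and a simple one-dimensional search over K d. It is not the fitting procedure of this study: the function names, the search strategy (coarse log-scale scan followed by golden-section refinement), and all concentrations and shift values in the usage example are assumptions, and the data are synthetic.

```python
import math


def fraction_bound(p_tot: float, lig_tot: float, kd: float) -> float:
    """Fraction of protein bound for 1:1 binding (all concentrations in the same unit)."""
    b = p_tot + lig_tot + kd
    return (b - math.sqrt(b * b - 4.0 * p_tot * lig_tot)) / (2.0 * p_tot)


def _sse_and_dmax(kd, p_tot, lig_tots, shifts):
    """For a fixed Kd, the best-fit maximal shift change is a linear least-squares problem."""
    f = [fraction_bound(p_tot, lig, kd) for lig in lig_tots]
    d_max = sum(x * y for x, y in zip(f, shifts)) / sum(x * x for x in f)
    sse = sum((y - d_max * x) ** 2 for x, y in zip(f, shifts))
    return sse, d_max


def fit_kd(p_tot, lig_tots, shifts, log_lo=-2.0, log_hi=4.0, n_grid=241):
    """Return (Kd, d_max) minimizing the squared error; Kd is searched on a log10 scale."""
    def sse(log_kd):
        return _sse_and_dmax(10.0 ** log_kd, p_tot, lig_tots, shifts)[0]

    # Coarse scan to bracket the minimum, then golden-section refinement.
    grid = [log_lo + (log_hi - log_lo) * k / (n_grid - 1) for k in range(n_grid)]
    best = min(range(n_grid), key=lambda k: sse(grid[k]))
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, n_grid - 1)]
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    kd = 10.0 ** ((a + b) / 2.0)
    return kd, _sse_and_dmax(kd, p_tot, lig_tots, shifts)[1]


# Synthetic, noise-free example (assumed concentrations in µM; not measured data).
protein = 100.0
dna = [10.0, 25.0, 50.0, 100.0, 150.0, 200.0, 300.0, 400.0]
observed = [0.25 * fraction_bound(protein, lig, 40.2) for lig in dna]
kd_fit, dmax_fit = fit_kd(protein, dna, observed)
print(round(kd_fit, 3), round(dmax_fit, 4))
```

Because the maximal shift change enters the model linearly, it can be eliminated analytically for each trial K d, which reduces the fit to a single nonlinear parameter. With real data, per-residue or global fits and an error estimate (for example, from repeated fits over several residues) would be needed in addition.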
Consequence of Variation of Important Amino Acids in the AtMBD6 MBD Domain. To understand the comparably weak binding of MBD AtMBD6 to methyl-CpG-containing DNA, we performed sequence and structural alignment of MBD AtMBD6 with human and other A. thaliana MBD domains (Figures 4A, S4, and S5). MBD AtMBD6 possesses all the key residues that have been shown to be responsible for specific recognition of the methyl-CpG sites in canonical MBD domains, 8,19 in line with the aforementioned results. However, two critical differences were found in the amino acid sequences. First, the C-terminal region of MBD AtMBD6 (residues 127−149) shows poor homology with its human counterparts. Previous studies on human MBD domains demonstrated that this region forms a "hairpin loop" structure that is also associated with DNA binding. 23 Therefore, we suppose that the affinity of MBD AtMBD6 toward DNA is reduced as a consequence of the absence of the hairpin loop. Second, three positively charged residues conserved among human MBD proteins (R17, K23, and R30 in HsMBD1) correspond to uncharged residues (V87, T93, and S100, respectively) in AtMBD6. In canonical human MBD proteins, the first of these three residues stabilizes the hairpin loop by electrostatic interactions with backbone carbonyl groups in the hairpin loop (Figure 4B). 19 The absence of this positively charged residue therefore supports our idea that the hairpin loop is missing in MBD AtMBD6, not only in the free state but also in the DNA-bound state. The remaining two residues are located near the two ends of the L1 loop and contact the phosphate backbone of DNA in the reported structures of canonical MBD-DNA complexes (Figure 4C). 18,19 Previous studies demonstrated that the positive charge of these residues plays a crucial role in the DNA binding by human and chicken MBD domains. 18,23,31 Therefore, we hypothesized that the absence of these important electrostatic contributions to the binding energy significantly reduces the binding affinity of MBD AtMBD6 toward methyl-CpG-containing DNA.
To explore this hypothesis, we constructed a single-point mutant, S100R MBD AtMBD6, and conducted NMR titration experiments under the same conditions as for wild-type (WT) MBD AtMBD6. As the molar ratio of DNA to MBD was raised, cross-peaks of S100R MBD AtMBD6 corresponding to the free state gradually disappeared with only very slight changes in the chemical shift, while the peak intensity of the bound state increased (Figures 4D and S6A). Several cross-peaks of the free state did not move straight toward the bound state, suggesting that an intermediate state detectable on the chemical shift timescale might be involved in the DNA binding of S100R MBD AtMBD6. These observations indicated that the off-rate of the binding was lower than that of WT MBD AtMBD6 and the exchange between the intermediate and bound states was in the slow exchange regime on the chemical shift timescale. Taken together, our results suggest that the S100R mutant has an increased affinity toward methyl-CpG-containing DNA as compared with WT MBD AtMBD6, in line with the hypothesis based on the sequence alignment. To obtain the K d value of this interaction, we tried several methods including this NMR titration, isothermal titration calorimetry, and fluorescence polarization. Unfortunately, however, we could not obtain a reliable K d value, likely because this interaction is not a simple two-state exchange and because several properties of this interaction and of MBD AtMBD6 itself, such as binding enthalpy and molecular weight, are not suitable for quantification of the binding affinity by these methods. Therefore, hereafter, we focus on the differences in the chemical shift change upon DNA binding between WT and S100R MBD AtMBD6.
For most of the residues, the CSD values between the free and bound states were similar to the corresponding CSD values of WT MBD AtMBD6; however, four residues (V101, D102, R117, and E119) showed large differences in the chemical shift change upon DNA binding between WT and S100R MBD AtMBD6 (Figure 4E, left panel; see also Figures 3B and S6B). Among these four residues, only R117 experienced a "downward" change, suggesting that this residue is less involved in the DNA binding in the S100R mutant compared to WT MBD AtMBD6. By contrast, V101, D102, and E119 showed "upward" changes, suggesting that these residues became more involved in DNA binding due to the introduction of the S100R mutation. It should also be noted that R117 and E119 are not close to the mutation site (more than 10 Å apart from S/R100). Intriguingly, we found that these four residues are located near either of the RFs (Figure 4E, right). Overall, the main distinction between the CSD values of WT and S100R MBD AtMBD6 results from the difference in the chemical shifts of the bound state, not the free state (Figures S6C,D). Furthermore, as the molar ratio of DNA over protein increased, an aliased side-chain signal emerged in the 1H−15N HSQC spectra at a markedly downfield-shifted 1H chemical shift (Figure S6A). Based on a 15N-edited NOESY-HSQC spectrum, this resonance could be identified as the ε-NH resonance of R92 of the structural element RF1. By contrast, this signal was not observed in the case of WT MBD AtMBD6, further underlining the differential involvement of RF1 in WT and S100R MBD AtMBD6. Thus, taken together, these results suggest that "exchanging" positively charged residues responsible for electrostatic interactions with the DNA backbone (in human MBD domains) for uncharged residues (in MBD AtMBD6) specifically affects local conformational states around the two distinct RFs in the bound state.
In other words, the reduced binding affinity of MBD AtMBD6 toward methyl-CpG-containing DNA can be considered a result of the lack of important nonspecific interactions and the consequent alterations in the DNA binding mode of the RFs that are responsible for the binding specificity.
[Figure caption fragment] Figure 3A. (E) Correlation plot of the normalized CSD values upon binding between WT MBD AtMBD6 and the S100R mutant (left). Mapping of the residues displaying significant differences in the correlation on the structure of WT MBD AtMBD6 (right).
■ DISCUSSION
Structural Properties of the AtMBD6 MBD Domain. Among the 13 MBD proteins from A. thaliana, AtMBD6, AtMBD5, and AtMBD7 show the highest sequence homology in the MBD domain with mammalian MBD domains that specifically recognize the methyl-CpG sites. 32 Correspondingly, the overall structure of MBD AtMBD6 in the free form is almost identical to the canonical structures of known mammalian MBD domains, except for the C-terminal region. The insertion of a single residue, N109, in the short loop between β2 and β3 of MBD AtMBD6 (Figure 4A) has almost no effect on the overall structure of the MBD core (Figure 2). On the basis of sequence homology, other MBD proteins from A. thaliana also seem to adopt the canonical MBD fold, although one or more residues in the L1 loop are missing, except in AtMBD5 and AtMBD8, which do possess these residues (i.e., they have a "full" L1 loop). 33,34 While we found that the L1 loop of MBD AtMBD6 shows high structural flexibility prior to DNA binding, interestingly, previous structural and chemical shift analyses indicated that the specific binding of human MBD domains to the methyl-CpG site stabilizes the dynamic L1 loop, thereby reducing its conformational flexibility. 18,25 Since the amino acid composition in this region of AtMBD6 is similar to that of human MBD proteins, the L1 loop of MBD AtMBD6 would also recognize the major groove of DNA by hydrophobic interactions and hydrogen bonds and become rigid upon binding to methyl-CpG-containing DNA. 18 The C-terminal region of MBD AtMBD6 shows no noticeable sequence homology with human MBD domains and was found to be highly disordered in the solution structure, in good agreement with the negative backbone heteronuclear NOE values reported in our initial study on MBD AtMBD6. 21 This implies that the requirements for formation of a stabilized hairpin loop in plant MBD domains are the same as in human MBD domains: both a tripeptide motif FBF (B represents D or N) at the respective position in the C-terminal region (see Figure 4A) and a positively charged residue at the beginning of β1 that forms polar contacts with the hairpin loop. According to amino acid sequence similarity (Figure S4), MBD proteins from A. thaliana can be classified into two groups by the presence or absence of the hairpin loop. However, our results indicate that the presence or absence of the hairpin loop alone does not correlate with the binding ability of the MBD domains to methyl-CpG-containing DNA in A. thaliana.
DNA Binding Specificity of the AtMBD6 MBD Domain. The published reports on the DNA binding specificity of AtMBD6, which were based on gel shift assay experiments, are not in mutual agreement; for example, one group claims that AtMBD6 binds all types of methylated DNA in plants, 35 whereas other groups claim that AtMBD6 specifically recognizes the methyl-CpG sites. 32,34 Possible reasons for this discrepancy include artifacts derived from the unstructured regions of AtMBD6, the presence of an N-terminal affinity tag, and differences in the experimental buffer composition. To resolve this contradiction, we prepared high-purity tag-free MBD AtMBD6 samples and qualitatively evaluated their DNA binding specificity using all types of methylated DNA at the same time, that is, on the same gel. Under the conditions employed here, MBD AtMBD6 specifically recognized methyl-CpG-containing DNA, suggesting that AtMBD6 is not directly involved in epigenetic regulation via methylation in CpHpG and CpHpH contexts. With respect to methyl-CpHpG and methyl-CpHpH contexts, previous reports had also revealed that these sequences are recognized by plant-specific histone methyltransferase SUVH family proteins that contain the SET- and RING-associated domains. 36−38 Thus, we speculate that the MBD and SUVH family proteins might function in different signaling pathways distinguished by DNA methylation patterns in plant cells.
[Figure caption fragment] In the description of the structure of HsMeCP2, the DNA moiety is omitted for clarity. Note that the orientations of the side chains of the two RFs are not completely defined in WT MBD AtMBD6 in the free state; however, our results indicate that the orientations of the RFs are still not entirely fixed in the DNA-bound state. (B) Model of the conformational changes upon DNA binding in S100R MBD AtMBD6. Repulsion of positive charges between R92 (RF1) and R100 retains the conformation of R92 and directs R100 toward the DNA backbone (shown as "1"). This coordinated effect enables formation of a stable salt bridge between R92 and D102, which cooperatively induces maturation of the RF2 motif by the electrostatic stabilization between R115 and E119 (shown as "2").
Mechanism of Weak Binding of AtMBD6 to Methyl-CpG-Containing DNA. We revealed by solution NMR spectroscopy that MBD AtMBD6 possesses a similar binding interface to methyl-CpG-containing DNA as canonical mammalian MBD domains, albeit with one to three orders of magnitude lower binding affinity. One of the reasons for the comparably low affinity of MBD AtMBD6 is the absence of two positively charged residues that have been shown to electrostatically interact with the DNA backbone in other MBD domains (T93 and S100 in AtMBD6). 18,19 NMR titration experiments using a single-point mutant to "correct" one of these atypical residues, S100R, indicated that the difference between MBD AtMBD6 and canonical MBD domains lies in the chemical environment in the vicinity of the two RFs (V101, D102, R117, and E119 in AtMBD6) in the DNA-bound state. However, the backbone amide CSD values of the RFs themselves were not significantly changed by the S100R mutation (see Figure S6D), indicating that only their side-chain guanidino groups might be involved in the mechanistic difference in DNA binding affinity between AtMBD6 and canonical MBD proteins.
A simple effect of the S100R mutation is an attractive electrostatic interaction between the side chain of the introduced arginine R100 and the DNA backbone, as observed in the reported structures of canonical MBD domains in complex with methyl-CpG-containing DNA. Another possible aspect of this mutation may be an electrostatic repulsion between the positively charged guanidino groups of R100 and the RF1, that is, R92. These effects seem to explain the observed changes in the vicinity of RF1 but not in RF2. Importantly, D121 and E137 in HsMeCP2, corresponding to D102 and E119 in AtMBD6, form salt bridges with the guanidino groups of the two RFs to fix them in appropriate positions for specific recognition of the methyl-CpG site in the complex (Figure 5A, left panel). 19 Therefore, the observed differences between WT MBD AtMBD6 and the S100R mutant indicate that the absence of important local interactions with the DNA backbone in MBD AtMBD6 leaves RF1 and RF2 insufficiently retained in a proper conformation (cooperative binding) (Figure 5A, right panel, Figure 5B). Indeed, the downfield-shifted RF1 side-chain resonance is reminiscent of the methyl-CpG-specific binding mode exhibited by the RF1 motif of several MBD domains from human and even the invertebrate Ephydatia muelleri. 25,26 Two facts support our conclusion that the RF1 motif is not properly formed in WT MBD AtMBD6: in the complex, the side chain of D121 in HsMeCP2 also forms a polar contact with its backbone amide group (3.2 Å distance between one of the two carboxy oxygen atoms and the amide nitrogen atom), and D102 in WT MBD AtMBD6 showed only a small CSD value upon DNA binding. Collectively, the conformational incompleteness of the RF motifs in AtMBD6 in the complex appears to increase the off-rate, resulting in a significant reduction in the binding affinity toward methyl-CpG-containing DNA, although we stress again that MBD AtMBD6 does still preserve the binding specificity.
Our analysis further indicates that the DNA binding affinity of AtMBD5 is likely to be even lower than that of AtMBD6, as the important tyrosine (Y104 in AtMBD6) is substituted by phenylalanine (Figures 2C and S4) and, thus, AtMBD5 could not form the water-mediated hydrogen bond network that has been implicated in DNA base recognition by HsMeCP2. 19 The sequence homology of the MBD domains of AtMBD7 with mammalian MBD domains is slightly lower than that of AtMBD6. Therefore, the DNA binding affinity of each individual MBD unit of AtMBD7 is likely to be even lower than that of AtMBD6. However, AtMBD7 may still retain a moderate binding affinity toward methyl-CpG-containing DNA by cooperative binding of the three tandem MBD domains to multiple methyl-CpG sites.
Possible Mechanism of Subnuclear Localization of AtMBD6 Harboring an MBD Domain with Low DNA Binding Affinity. Although the binding affinity of the MBD domain toward methyl-CpG-containing DNA is significantly reduced, AtMBD6 localizes to highly methylated heterochromatin. 15,34 Given that 24% of the CpG sites in genomic DNA are methylated in A. thaliana, 10 the density of the methyl-CpG sites may aid proper subnuclear localization of AtMBD6 even with a comparably low DNA binding affinity. In addition, regions other than the MBD domain itself might also contribute to binding of AtMBD6 to methyl-CpG-containing DNA, although at present, no significant sequence homology with any known DNA-binding domain is detected for the N- or C-terminal regions of AtMBD6.
Another possible explanation for the localization of AtMBD6 to heterochromatin would be that binding partners compensate for the low DNA binding affinity of AtMBD6. One of the putative AtMBD6-binding proteins, AtDDM1, possesses a DNA helicase domain and was shown to bind to both free DNA and nucleosomal DNA. 39 Another potential binding partner of AtMBD6, AtAGO4, plays a crucial role in RNA-directed DNA methylation 9 and thus might also bind to DNA with sequence preference. Indeed, a previous study showed by fluorescence recovery analysis that AtMBD6 is highly mobile at perinucleolar chromocenters, while a small fraction of the protein is relatively immobile. 17 We assume that the immobile fraction that tightly binds to chromatin is involved in formation of protein−protein complexes and the mobile fraction corresponds to the free molecules.

■ CONCLUSIONS
In this study, we revealed that MBD AtMBD6 specifically recognizes methyl-CpG sites and characterized its structural properties at the atomic level. MBD AtMBD6 binds methyl-CpG-containing DNA with a significantly reduced affinity compared to mammalian MBD domains, while its DNA binding interface is conserved. Future structural studies of AtMBD6 in complex with methyl-CpG-containing DNA and analysis of its interactions with previously identified binding partners ought to pave the way to understanding the mechanism by which AtMBD6 serves as a repressor of gene expression in plant cells.

■ MATERIALS AND METHODS
Sample Preparation. The expression vector pGEX-6P-1 encoding the MBD domain of AtMBD6 (UniProtKB entry Q9LTJ1, residues 78−140) was used for protein overexpression in bacterial cells. Site-directed mutagenesis to generate S100R MBD AtMBD6 was achieved by inverse PCR. Proteins were prepared as previously described 21 with a few modifications. Briefly, Escherichia coli BL21(DE3) cells carrying the expression plasmids were grown in Lennox's LB media or M9 minimal media containing 2 g/L uniformly 13C-labeled D-glucose (Cambridge Isotope Laboratories) or 4 g/L unlabeled D-glucose (Nacalai Tesque), 1 g/L 15N-labeled ammonium chloride (Cambridge Isotope Laboratories), and 100 mg/L ampicillin for antibiotic selection. Proteins were expressed for 20 h at 16°C and purified using glutathione affinity chromatography. After on-column digestion of the N-terminal glutathione S-transferase tag by HRV3C protease overnight at 4°C, proteins were eluted from the column and further purified by size-exclusion chromatography. Purity and integrity of the samples were verified by SDS-PAGE and MALDI-TOF mass spectrometry. The yields of proteins per 1 L culture were 4−5 mg for LB and 2−4 mg for M9.
All DNA samples used in this study were purchased from Hokkaido System Science. The DNA sequences are summarized in Table 1. Single-stranded DNA samples were dissolved in 10 mM Tris buffer (pH 8.0 at 4°C) containing 50 mM sodium chloride for the gel shift assay or in the titration buffer containing 20 mM sodium phosphate (pH 6.5 at 25°C) and 150 mM sodium chloride for NMR titration experiments. Annealing was performed by incubating the solutions for 3 min at 98°C and then gradually lowering the temperature to 4°C over a total duration of 140 min.
DNA Binding Assay. A mixture of 80 pmol annealed DNA (final 4 μM) and 400 pmol protein (final 20 μM) was incubated in 20 μL of the binding buffer containing 25 mM Tris, 25 mM boric acid, 150 mM sodium chloride, 5% glycerol, and 1 mM dithiothreitol for 30 min at 4°C. Then, 5 μL of the reaction sample was loaded onto an 8% polyacrylamide gel containing 25 mM Tris and 25 mM boric acid and subjected to electrophoresis at 150 V for 60 min at 4°C (i.e., in the cold room). Double-stranded DNA was stained with ethidium bromide, and the corresponding fluorescent bands were detected by UV irradiation using a ChemiDoc XRS Plus system (Bio-Rad).
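The final concentrations quoted for the binding assay follow directly from the amounts and the reaction volume. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative and not part of the original protocol.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in micromolar from an amount (pmol) and a volume (uL).

    pmol / uL = 1e-12 mol / 1e-6 L = 1e-6 mol/L, i.e. micromolar.
    """
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 20.0  # binding reaction volume stated in the text
LOADED_VOLUME_UL = 5.0     # volume loaded onto the gel, stated in the text

dna_um = micromolar(80.0, REACTION_VOLUME_UL)       # annealed DNA
protein_um = micromolar(400.0, REACTION_VOLUME_UL)  # protein

# Molar excess of protein over DNA in the reaction.
protein_to_dna = protein_um / dna_um

# Amount of DNA present in the loaded aliquot (uM * uL = pmol).
dna_loaded_pmol = dna_um * LOADED_VOLUME_UL

print(f"DNA {dna_um} uM, protein {protein_um} uM, ratio {protein_to_dna}:1")
print(f"DNA loaded per lane: {dna_loaded_pmol} pmol")
```

The calculation indicates a fivefold molar excess of protein over DNA and 20 pmol of DNA per lane.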
General NMR Spectroscopy. All NMR experiments were performed at 20°C using an AVANCE II 700 MHz NMR spectrometer (Bruker) equipped with a TCI cryogenic probe with the exception of in-phase/anti-phase (IPAP) HSQC experiments, which were performed on an AVANCE 600 MHz NMR spectrometer (Bruker) equipped with a TXI cryogenic probe. NMR samples contained 5% deuterium oxide (Cambridge Isotope Laboratories) and were measured in 5 mm diameter Shigemi tubes (Shigemi). Sodium 3-(trimethylsilyl)-1-propanesulfonate (Tokyo Chemical Industry) was used as an external standard of the 1 H chemical shift; 13 C and 15 N chemical shifts were calibrated indirectly. 40 All acquired NMR data (i.e., free induction decays) were processed using NMRPipe, 41 and the resulting spectra were further analyzed using various software packages (see below).
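Indirect calibration of the 13 C and 15 N chemical shifts can be illustrated with a minimal Python sketch. It assumes the commonly used frequency ratios (Ξ) relative to the 1 H signal of the DSS standard, namely 0.251449530 for 13 C and 0.101329118 for 15 N; the text does not state which values were applied. The 1 H zero-ppm frequency used here is a hypothetical placeholder, not a value from this study.

```python
# Assumed frequency ratios (Xi) relative to the 1H resonance of DSS at 0 ppm.
XI_13C = 0.251449530
XI_15N = 0.101329118


def indirect_zero_mhz(h1_zero_mhz: float, xi: float) -> float:
    """Frequency (MHz) corresponding to 0 ppm for a heteronucleus."""
    return h1_zero_mhz * xi


def shift_ppm(observed_mhz: float, zero_mhz: float) -> float:
    """Chemical shift in ppm relative to the 0 ppm frequency."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6


# Hypothetical 1H frequency of the DSS methyl signal on a 700 MHz instrument.
h1_zero = 700.13

c13_zero = indirect_zero_mhz(h1_zero, XI_13C)
n15_zero = indirect_zero_mhz(h1_zero, XI_15N)

print(f"13C 0 ppm at {c13_zero:.4f} MHz, 15N 0 ppm at {n15_zero:.4f} MHz")
```

Once the zero-ppm frequencies are known, any observed resonance frequency can be converted with `shift_ppm`.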
NMR Analysis of the Homo-oligomerization Status of the AtMBD6 MBD Domain. Two-dimensional 1 H− 15 N HSQC spectra of 13 C, 15 N-labeled MBD AtMBD6 in the titration buffer were acquired while the protein concentration was decreased stepwise by twofold dilution. The concentration of MBD AtMBD6 in these experiments ranged from 200 to 3.1 μM. Analysis of the obtained spectra was performed using CcpNmr Analysis. 42
NOESY Experiments. Three-dimensional 15 15 N-labeled MBD AtMBD6 dissolved in the NMR buffer containing 20 mM sodium phosphate (pH 6.5 at 25°C) and 50 mM sodium chloride; the numbers in the parentheses indicate, respectively, the spectral widths and the number of complex points in the F 3 , F 2 , and F 1 dimensions. Each experiment was performed with a mixing time of 150 ms. NOE peaks were picked automatically, and the peak lists were refined manually using MagRO 43 on NMRView. 44
IPAP HSQC Experiments. IPAP 1 H− 15 N HSQC experiments were performed using 0.5 mM 13 C, 15 N-labeled MBD AtMBD6 samples in the presence and absence of pentaethylene glycol monododecyl ether (C 12 E 5 ) bicelles as alignment medium. 45 The bicelle solution was prepared by gradually adding a total of 15 μL 1-hexanol (Tokyo Chemical Industry) to a solution composed of 50 μL of C 12 E 5 (Sigma-Aldrich), 200 μL of the NMR buffer, and 50 μL of deuterium oxide. The final concentration of C 12 E 5 in the NMR sample was 4%. IPAP HSQC (F 2 : 7246.377 Hz/512 complex points, F 1 : 1520.447 Hz/350 complex points) experiments were conducted twice in succession for each sample. Each acquired IPAP spectrum was split into two distinct spectra (IP + AP and IP − AP) in TopSpin (Bruker) and analyzed with CcpNmr Analysis. The averaged backbone amide RDC constants were used to generate RDC restraints in the structure refinement.
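As an illustration of how averaged RDC values can be derived from IPAP data, the following Python sketch takes the 15 N-dimension positions of the two doublet components (one from each of the IP + AP and IP − AP spectra) and computes the RDC as the splitting in the aligned sample minus the splitting in the isotropic sample, averaged over the two consecutive measurements. The peak positions are invented for demonstration, and the use of absolute splittings (so that the sign of the one-bond coupling is ignored) is a simplifying assumption rather than the procedure documented in this study.

```python
from statistics import mean

# A doublet is described by the 15N positions (Hz) of its two components.
Doublet = tuple[float, float]


def splitting_hz(doublet: Doublet) -> float:
    """Absolute separation of the two doublet components in Hz."""
    upfield, downfield = doublet
    return abs(downfield - upfield)


def rdc_hz(isotropic: Doublet, aligned: Doublet) -> float:
    """RDC estimate: splitting with alignment medium minus splitting without."""
    return splitting_hz(aligned) - splitting_hz(isotropic)


def averaged_rdc_hz(runs: list[tuple[Doublet, Doublet]]) -> float:
    """Average the RDC over repeated (isotropic, aligned) measurements."""
    return mean(rdc_hz(iso, ali) for iso, ali in runs)


# Hypothetical duplicate measurements for a single backbone amide.
example_runs = [
    ((1000.0, 1093.0), (1000.0, 1101.0)),  # first measurement
    ((1000.2, 1093.0), (1000.1, 1101.3)),  # second measurement
]

example_rdc = averaged_rdc_hz(example_runs)
print(f"averaged RDC: {example_rdc:.2f} Hz")
```

Averaging duplicate measurements in this way also gives a rough sense of the measurement uncertainty, since the spread between the two runs can be inspected per residue.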
Structure Determination of the AtMBD6 MBD Domain in the Free Form. The initial structure calculation of MBD AtMBD6 in the free form was performed using CYANA 46 with interproton distance restraints derived from NOESY spectra (NOE peaks were automatically assigned based on the chemical shifts of backbone and side-chain resonances reported previously 21 ) and backbone dihedral angle restraints estimated by TALOS+. 47 The lowest-energy structure of the 20 structures calculated using CYANA was used as an initial structure in the subsequent refinement procedure. Structure refinement was performed using 500 cycles of a combination of simulated annealing and energy minimization using XPLOR-NIH 48,49 while additionally applying RDC-based restraints. The top 20 structures of the 41 structures that meet the acceptance criteria are reported here with the statistics shown in Table 2. The coordinates of the solution structure of MBD AtMBD6 in the free form were deposited in the Protein Data Bank with accession ID 7D8K.
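The ensemble selection step (20 structures reported out of 41 accepted ones) can be sketched in Python as a filter followed by a ranking. The text does not specify the acceptance criteria or the ranking metric, so the `accepted` flag and the ranking by total energy below are assumptions, and the `Model` class and its values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str       # identifier of a refined structure (hypothetical)
    energy: float   # total energy after refinement (assumed ranking metric)
    accepted: bool  # whether the model passed the acceptance criteria


def select_ensemble(models: list[Model], size: int) -> list[Model]:
    """Keep accepted models only and return the `size` lowest-energy ones."""
    accepted = [m for m in models if m.accepted]
    return sorted(accepted, key=lambda m: m.energy)[:size]


# Toy example with invented energies.
toy_models = [
    Model("m1", -120.5, True),
    Model("m2", -150.0, False),  # lowest energy, but rejected
    Model("m3", -135.2, True),
    Model("m4", -110.0, True),
]

toy_ensemble = select_ensemble(toy_models, size=2)
print([m.name for m in toy_ensemble])
```

With real data, `size` would be 20 and the input list would contain the structures produced by the 500 refinement cycles.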
NMR Titration and Resonance Assignment of the AtMBD6 MBD Domain in the Bound State. A solution of 4.5 mM double-stranded methyl-CpG-containing DNA termed MG (see Table 1) was added in a stepwise manner to 0.6 mM 13 C, 15 N-labeled WT MBD AtMBD6 or 15 N-labeled S100R MBD AtMBD6 in the titration buffer. After each addition of the ligand solution, a 1 H− 15 N HSQC spectrum was acquired. To obtain backbone resonance assignments of WT MBD AtMBD6 in the bound state, HNCO, HN(CA)CO, CBCA(CO)NH, and HNCACB experiments 50 were carried out after the final titration experiment. The triple resonance spectra were analyzed using MagRO-NMRView, and initial assignments were performed using FLYA, 51 followed by manual verification and correction of the automated assignments. For each residue, the SSP score was evaluated from the chemical shifts of its amide proton, amide nitrogen, α-carbon, β-carbon, and carbonyl carbon with the SSP program. 52 In the case of the S100R mutant, resonances of backbone amide groups were assigned using three-dimensional 15