Chemical Biology of Protein Arginine Modifications in Epigenetic Regulation

Figure 1. N-terminal tails of histone proteins are the preferred targets of histone-modifying enzymes. The major modifications of histone arginine residues are citrullination and methylation. Abbreviations: Cit, citrulline; MMA, monomethylarginine; ADMA, asymmetric ...


INTRODUCTION
Post-translational modifications (PTMs) of histone proteins are a hallmark of epigenetic regulation. They provide a mechanism to modulate chromatin structure and constitute the main features of the so-called "histone code". 1 The proposed function of this code is to integrate exogenous and endogenous signals into a diverse set of histone PTM patterns to enable the epigenetic control of gene expression. The key regulators of this process are the so-called "writers" and "erasers", which act by dynamically modifying histones, and other chromatin-associated proteins, as well as the "readers", which interpret these PTMs, thereby facilitating the downstream activation or repression of gene expression. 2 The writers are histone-modifying enzymes that can be grouped according to their amino acid substrate preference, affecting mainly lysine, arginine, and serine residues. 3 These enzymes can be further classified according to the type of covalent modification that they catalyze. Histone modifications include acetylation, methylation, phosphorylation, and the more recently described modifications of citrullination, ubiquitination, SUMOylation, proline isomerization, O-GlcNAcylation, and ADP-ribosylation. 1b, 3 On the basis of detailed mass spectrometric analyses, there are at least 15 different types of covalent histone modifications, 4 and since histone proteins are modified at multiple sites, and different stoichiometries, the total number of histone marks is >160. 5 Although our understanding of how histone modifications contribute to the epigenetic control of gene transcription has grown immensely over the past ∼15 years, the precise impact of this vast number of modifications, not to mention the crosstalk between them, has yet to be fully realized.
Histone proteins are small, highly basic proteins consisting of a globular domain and flexible N-terminal and C-terminal tails that protrude from the nucleosome. The core histone proteins (histones H2A, H2B, H3, and H4) form an octameric particle consisting of two H2A−H2B dimers and an H3−H4 tetramer, around which two helical turns of DNA (∼150 bp) are wrapped. 6 This structure, which is generally termed a nucleosome, is the basic building block of higher order chromatin structures that are further organized through the function of linker histones such as histone H1. On the basis of nucleosome positioning studies, around 80% of the yeast genome and as much as 99% of the mappable genome of human granulocytes is occupied by nucleosomes, highlighting the importance of nucleosome-packaged DNA for eukaryotic cells. 7 Importantly, while histone PTMs are found throughout the entire protein, they are most often clustered within the N-terminal tail. Although research on histone lysine modifications has drawn considerable attention and even resulted in the approval of novel anticancer drugs, 8 the modification of histone arginine residues has recently emerged as a nucleosomal mark of similar importance (Figure 1).
Arginine residues possess a characteristic guanidinium-containing side chain that has one of the highest pKa values (pKa = 12.5; this value refers to the side chain of the free amino acid in aqueous solution at 25 °C) of all amino acids; 9 thus, these residues are protonated and positively charged at physiological pH. The high pKa value also renders the guanidinium group a poor nucleophile and as such presents a considerable challenge for modifying this residue. In addition to its positive charge, the arginine guanidinium contains five potential hydrogen bond donors that can be used to interact with other polar groups (Figure 2). Notably, the strong charge favors the location of arginine residues on the outer hydrophilic surfaces of proteins. Consequently, they are readily accessible for binding to molecules with negative charges, e.g., nucleic acids. In this respect, arginines are among the most common residues involved in the formation of protein/DNA and protein/RNA complexes. 10 The frequent use of arginine for this type of interaction can be explained by opposite charge attraction, the length and flexibility of the side chain, and the ability to produce excellent hydrogen-bonding geometries with nucleobases or phosphate groups (Figure 2). For example, the orientation of the planar guanidinium nitrogen atoms matches the oxygen atoms of phosphate (present in nucleic acids and phosphoproteins) to form a stable bidentate salt bridge that is at least 2-fold stronger than the interaction between a lysine ammonium group and a phosphoryl group. 11 In addition, the bidentate hydrogen-bonding ability of arginine not only confers a more economical way to optimize bond energies but also increases specificity in the recognition of particular DNA sequences, as exemplified by the recognition of guanine (Figure 2). 10a
Given the ability to form these types of interactions, it is clear that the post-translational modification of an arginine residue could have dramatic effects on cell signaling and, like other histone modifications, contribute to disease pathogenesis. In fact, there is increasing experimental evidence to suggest that the dysregulation of arginine-modifying enzymes plays pivotal roles in cancer, inflammatory diseases, neurodegenerative diseases, and other conditions. 8b, 12 In this review, we aim to summarize the current knowledge surrounding the post-translational modification of histone arginine residues, focusing on enzyme classes that catalyze the citrullination and methylation of arginine residues as well as noncanonical arginine modifications such as phosphorylation, ADP-ribosylation, and arginylation. The major focus will be given to the PADs (protein arginine deiminases) and PRMTs (protein arginine methyltransferases).
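The statement that arginine is protonated at physiological pH follows directly from the Henderson-Hasselbalch relationship. The following is a minimal sketch, assuming a physiological pH of 7.4 (an illustrative value not given in the text) and using the free-amino-acid pKa of 12.5 quoted above; the function name is hypothetical, and the pKa of a residue inside a folded protein may differ.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a basic group present in its protonated (charged) form.

    Derived from the Henderson-Hasselbalch equation:
    [BH+] / ([BH+] + [B]) = 1 / (1 + 10**(pH - pKa))
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ARG_SIDE_CHAIN_PKA = 12.5  # value quoted in the text (free amino acid, 25 °C)
PHYSIOLOGICAL_PH = 7.4     # assumption made for this illustration

fraction = protonated_fraction(ARG_SIDE_CHAIN_PKA, PHYSIOLOGICAL_PH)
print(f"Protonated guanidinium at pH {PHYSIOLOGICAL_PH}: {fraction:.6f}")
```

With a pKa roughly five units above the solution pH, essentially every guanidinium group carries a positive charge, which is the basis for the electrostatic arguments made in this section.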

Overview of Protein Citrullination
Protein citrullination is an emerging PTM that results from the conversion of peptidyl arginine to peptidyl citrulline. This PTM is catalyzed by the calcium-regulated PAD family of enzymes (Figure 3). 13 Due to the exchange of an imine for a carbonyl group, this reaction is referred to as deimination. The PAD-catalyzed hydrolysis of a guanidinium group has a profound effect on the electrostatics and the hydrogen-bonding potential of the original side chain as citrulline is neutral and contains two hydrogen bond acceptor sites and only three potential hydrogen bond donors compared to the five present in arginine (Figure 4). Despite the large electronic effects, the overall change in mass is marginal: −0.02 Da, accounting for the extra proton in the charged guanidinium group, or +0.98 Da, as encountered during mass spectrometric analyses of the neutral guanidine form.
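The two mass shifts quoted above can be checked with simple arithmetic. The sketch below assumes standard monoisotopic masses (rounded to six decimals) and treats deimination as the net replacement of an =NH group by =O; the function name is hypothetical.

```python
# Monoisotopic masses in Da (standard values, rounded to six decimals).
MASS_H = 1.007825
MASS_N = 14.003074
MASS_O = 15.994915
MASS_PROTON = 1.007276


def citrullination_mass_shift(arginine_protonated: bool) -> float:
    """Mass change on converting arginine to citrulline.

    Net chemistry: the imine nitrogen and its hydrogen are lost and one
    oxygen is gained (+O, -N, -H). If the starting arginine is counted in
    its charged guanidinium form, one additional proton is lost as well.
    """
    shift = MASS_O - MASS_N - MASS_H
    if arginine_protonated:
        shift -= MASS_PROTON
    return shift


print(f"Neutral guanidine form:  {citrullination_mass_shift(False):+.3f} Da")
print(f"Charged guanidinium form: {citrullination_mass_shift(True):+.3f} Da")
```

The neutral-form value is the shift usually searched for in mass spectrometric data; because it is so small, it is easily confused with other near-isobaric changes such as deamidation.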
There are five human PAD isozymes, including PAD1, PAD2, PAD3, and PAD4, which are catalytically active, and PAD6, for which no activity has been detected. 13,14 On the basis of the historic nomenclature, human PAD5 was thought to represent a novel PAD family member that differed from mouse PAD4. 15 However, detailed sequence and expression analysis revealed that human PAD5 was the mouse PAD4 orthologue. As such, it was renamed PAD4, leaving PAD5 unused. 13 The PADs share a high degree of sequence conservation (70−95% identity among each isozyme in different mammals and 50−55% identity between individual isozymes within one species) (Figure 5) and possess low pI values, typically around 5.8. 13 The net negative charge is thought to be instrumental for recognizing the positive charge of a substrate arginine residue as well as for the binding of essential calcium ions (see below).
The PADs are widely distributed in higher eukaryotes, and their expression as well as activity is associated with the regulation of gene expression and various developmental stages (see below). 13 PAD1 is highly expressed in the epidermis and uterus, PAD2 is expressed in numerous tissues, PAD3 is mainly found in the skin and hair follicles, PAD4 is primarily expressed in granulocytes, macrophages, and neutrophils, and PAD6 is expressed in oocytes and embryos. 15,16 Within the cell, PAD protein and/or activity are detected in various cellular compartments, including the cytoplasm, mitochondria, and nucleus. 17 Although PAD4, which contains a canonical nuclear localization signal (NLS; P 56 PAKKKST 63 ), was long thought to be the only nuclear PAD enzyme, 17a emerging evidence indicates that other PAD isozymes can localize to the nucleus as well. 17b For example, PAD2 was recently found in the nucleus of murine mammary epithelial cells, and in nuclear fractions of astrocytes as well as hippocampal neurons of scrapie-infected mice. 17b,c
guanidinoacetate, the arginine deiminases (ADIs), which act on nonpeptidyl arginine and the dimethylarginine dimethylaminohydrolase (DDAH) enzymes, which are highly selective for methylated nonpeptidyl arginine. 18a In contrast to these enzyme groups, the PADs only act on peptidyl arginine residues and require at least an N-terminal amide bond and a C-terminal carbonyl for efficient substrate recognition. 19 The PADs are highly efficient enzymes as exemplified by PAD1, which exhibits a rate enhancement over the noncatalyzed reaction (kcat/knon) under near-neutral pH conditions of ∼8.5 × 10¹¹ (kcat = 0.45 s⁻¹ versus knon = 5.3 × 10⁻¹³ s⁻¹).
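The two figures of merit used for PAD1 in this passage, the rate enhancement (kcat/knon) and the catalytic proficiency ((kcat/KM)/knon), are simple ratios of the quoted constants. As a minimal sketch using only the values given in the surrounding text (function and constant names are hypothetical):

```python
K_CAT = 0.45          # s^-1, PAD1 turnover number quoted in the text
K_NON = 5.3e-13       # s^-1, uncatalyzed rate constant quoted in the text
KCAT_OVER_KM = 4.1e3  # M^-1 s^-1, second-order rate constant quoted in the text


def rate_enhancement(k_cat: float, k_non: float) -> float:
    """Dimensionless ratio of catalyzed to uncatalyzed first-order rates."""
    return k_cat / k_non


def catalytic_proficiency(kcat_over_km: float, k_non: float) -> float:
    """(kcat/KM)/knon, in units of M^-1."""
    return kcat_over_km / k_non


print(f"Rate enhancement:      {rate_enhancement(K_CAT, K_NON):.2e}")
print(f"Catalytic proficiency: {catalytic_proficiency(KCAT_OVER_KM, K_NON):.2e} M^-1")
```

Both ratios reproduce the rounded values reported in the text (∼8.5 × 10¹¹ and ∼8 × 10¹⁵ M⁻¹).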
In terms of catalytic proficiency (i.e., (kcat/KM)/knon), evaluated by comparing the second-order rate constant (kcat/KM = 4.1 × 10³ M⁻¹ s⁻¹) with the rate of the spontaneous reaction in neutral solution in the absence of a catalyst (knon = 5.3 × 10⁻¹³ s⁻¹), the rate enhancement is 8 × 10¹⁵ M⁻¹ (Table 1). 20
2.2.1. PAD Structure. As with other members of the guanidino-group-modifying pentein family, PADs contain a catalytic α/β propeller domain that is located in the C-terminal half of the protein. 18b, 21 PADs typically exist as homodimeric proteins in solution and, according to crystallographic studies, associate in a head-to-tail fashion (Figure 6A). 21 The dissociation constant (Kd) for PAD4 dimer formation was estimated to be ∼450 nM. 22 Moreover, dimerization was shown to be important for activity and cooperativity, since disruption of the dimer interface reduces the activity by 50−75%. 22 The dimerization interface covers a large surface area of ∼2000 Å² comprising multiple contacts between the N-terminal domain of one protomer and the catalytic domain of the other protomer. Each subunit comprises two domains, including the N-terminal domain, which can be further subdivided into two immunoglobulin-like (Ig) subdomains that are proposed to be important for protein−protein interactions and to facilitate substrate selection. 21 The C-terminal catalytic domain harbors all the active site residues, and the dimeric PAD4 structure revealed that both active site cavities are located on the same dimer face and are separated by ∼65 Å (Figure 6B).
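To give a feel for what a dimer dissociation constant of ∼450 nM implies, the fraction of protomers present as dimer can be estimated for a simple two-state equilibrium. This is a minimal sketch that assumes an ideal monomer-dimer equilibrium (2 M ⇌ D, with Kd = [M]²/[D]) and no other species; the function name and the example concentrations are hypothetical and are not taken from the cited study.

```python
import math


def dimer_fraction(total_protomer_nM: float, kd_nM: float) -> float:
    """Fraction of protomers found in dimers for 2 M <-> D.

    Mass balance: total = [M] + 2[D], with [D] = [M]**2 / Kd.
    Solving 2[M]**2/Kd + [M] - total = 0 for the positive root gives
    [M] = Kd * (sqrt(1 + 8*total/Kd) - 1) / 4.
    """
    if total_protomer_nM <= 0:
        return 0.0
    monomer = kd_nM * (math.sqrt(1.0 + 8.0 * total_protomer_nM / kd_nM) - 1.0) / 4.0
    return 1.0 - monomer / total_protomer_nM


KD_PAD4_NM = 450.0  # approximate value quoted in the text

for total in (45.0, 450.0, 4500.0):  # illustrative concentrations only
    print(f"{total:7.0f} nM protomer -> {dimer_fraction(total, KD_PAD4_NM):.2f} dimeric")
```

Under these assumptions, half of the protomers are dimeric when the total protomer concentration equals Kd, and the dimer predominates only well above that concentration.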
2.2.2. Calcium Dependency of PAD Enzymes. The crystal structure of PAD4 further showed that each protomer contains two bound calcium ions (Ca1, Ca2) in the catalytic domain and three additional calcium ions (Ca3, Ca4, and Ca5) located in the Ig II subdomain (Figure 7). Although calcium binding is crucial for catalytic activity, it has no influence on dimer formation. 21,22 Detailed comparisons of calcium-free and calcium-bound PAD4 and PAD2 indicate that calcium binding has a profound effect on the PAD structure, and is critical for forming a catalytically competent active site (Figure 8). 21,23 Notably, loop regions comprising the active site cavity are largely disordered in the absence of calcium and only become visible (ordered) in the presence of calcium. Calcium ions Ca3, Ca4, and Ca5 bind to a conserved negatively charged region and are thought to stabilize the structure, whereas Ca2 and in particular Ca1 are positioned close to the bottom of the active site cleft and are thus essential for maintaining the architecture of the active site (Figure 8). 21 This calcium-induced conformational change is also highlighted by the fact that the nucleophilic cysteine residue (C645 in PAD4 and C647 in PAD2) moves ≥5 Å into the active site when calcium binds to the enzyme. The organization of the active site cavity of calcium-bound, substrate-free PAD4 is almost identical to that of the substrate-bound complex, indicating that substrate binding has little effect on the formation of the active site. 21 On the basis of calcium titration experiments with PAD2, which were monitored by X-ray crystallography, Ca1 and Ca6 bind first, and Ca3, Ca4, and Ca5 bind next and cause a conformational change that generates the Ca2 site. Calcium binding at Ca2 is critical for triggering the movement of the active site cysteine into a catalytically competent position.
Despite Ca2 binding last, these titration experiments, coupled with mutagenesis and hydrogen/deuterium exchange experiments, indicate that Ca3, Ca4, and Ca5 act as a calcium switch to control the overall calcium dependence of the enzyme.
Notably, the metal dependency of PAD4 was tested using various cations, including barium, calcium, magnesium, manganese, samarium, strontium, and zinc. 19 None of the examined metals except calcium could efficiently activate the catalytic activity of PAD4. Conversely, some of these metals even inhibited the enzyme. The most potent inhibition was accomplished using samarium (present as the Sm³⁺ ion in samarium sulfate, Sm₂(SO₄)₃) and zinc (present as the Zn²⁺ ion in zinc chloride, ZnCl₂), which possessed IC50 values of 40 and 750 μM, respectively. Thus, calcium binding is a critical prerequisite for proper substrate binding.
The strict calcium dependency of PAD activity is further reflected by the fact that calcium activates the enzyme by more than 10,000-fold. Also notable is the fact that full PAD4 activity requires a high concentration of calcium: the half-saturation constant for calcium binding (K0.5,Ca) ranges from 130 to 710 μM depending on the substrate. 19,22 How the PADs are activated in cells is, however, still unknown. For instance, the concentration of calcium required for maximal PAD4 activity is 100−1000-fold higher than that observed in activated cells. It is therefore unclear how PADs are efficiently activated in vivo and/or how the calcium dependency is lowered inside cells. It has been proposed that PADs may temporarily locate to intracellular calcium channels that can provide local calcium concentrations in the millimolar range upon channel opening, 24 sufficient to activate the PADs. 23 In addition, one can speculate that the PADs' calcium dependency may be altered by specific PTMs or via the interaction of binding proteins. In this regard, it was recently shown that antibodies, isolated from patients with rheumatoid arthritis, can bind and activate PAD4 by lowering the concentration of calcium required for activity. 25
Figure 5. Sequence alignment of human PAD family members. Catalytic residues are highlighted with red asterisks below the alignment. The sequence alignment was generated using Clustal Omega and visualized using Espript 3.0. 229 The consensus sequence is abbreviated as follows: uppercase letters indicate identical residues, lowercase letters indicate consensus level >0.5, "!" represents any conserved residue of isoleucine (I) or valine (V), "$" represents any conserved residue of leucine (L) or methionine (M), "%" represents any conserved residue of phenylalanine (F) or tyrosine (Y), and "#" represents any conserved residue of asparagine (N), aspartate (D), glutamine (Q), or glutamate (E). The relative accessibility of each residue is depicted below the consensus motif: blue indicates accessible residues, cyan marks intermediately accessible residues, white stands for buried residues, and red indicates that the accessibility is not predicted.
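The mismatch between the calcium half-saturation constants quoted above for PAD4 and typical cellular calcium levels can be illustrated with a simple saturation curve. The sketch below is only a rough model: it assumes a Hill-type dependence with a Hill coefficient of 1 by default (the text does not give one), and the 1 μM cellular calcium concentration is a hypothetical illustrative value. Only the K0.5,Ca range of 130 to 710 μM comes from the text.

```python
def fractional_activation(calcium_uM: float, k_half_uM: float, hill: float = 1.0) -> float:
    """Fraction of maximal activity for a Hill-type calcium dependence.

    activity / max = [Ca]**n / (K0.5**n + [Ca]**n)
    The Hill coefficient n is an assumption of this sketch.
    """
    ca_term = calcium_uM ** hill
    return ca_term / (k_half_uM ** hill + ca_term)


K_HALF_RANGE_UM = (130.0, 710.0)  # range quoted in the text
CELLULAR_CALCIUM_UM = 1.0         # hypothetical value for illustration

for k_half in K_HALF_RANGE_UM:
    frac = fractional_activation(CELLULAR_CALCIUM_UM, k_half)
    print(f"K0.5 = {k_half:5.0f} uM -> {100 * frac:.2f}% of maximal activity")
```

Under these assumptions the enzyme would operate at well under 1% of its maximal activity, which is the quantitative core of the open question about how PADs are activated in vivo.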

PAD Substrate Recognition.
In contrast to other members of the guanidino-group-modifying pentein family such as ADI and DDAH, where peptidyl arginine is occluded from the solvent by a loop closed over the active site, PAD4 substrate binding occurs at the molecular surface of the enzyme, which is accessible for peptide/protein interactions. 26 Thus, steric effects provide at least a partial explanation for why peptidyl arginine residues are recognized by PADs but not by ADI or DDAH. 21 The active site of PAD4 has a characteristic U-shaped tunnel containing two entrance doors (Figure 9). This tunnel is also found in other hydrolases of the pentein superfamily such as DDAH. 27 The "front door" is the actual substrate-binding site accommodating the arginine side chain and derived inhibitors, while the "back door" provides solvent access through a highly polar tunnel connected to the base of the active site. This narrow solvent channel presumably allows the ammonia generated by the hydrolysis of arginine to diffuse away and provides access for a water molecule so that the subsequent hydrolytic phase of the reaction can occur. 28 As a result, access of other small molecules is restricted, which prevents their reaction with the S-alkylthiouronium intermediate during enzyme catalysis (see below); only water can enter the active site. Interestingly, in related amidinotransferases that do not utilize a hydrolytic mechanism but transfer the activated amidine via a reactive cysteine thiouronium intermediate onto their respective substrate molecule such as glycine, the amidine-donor arginine substrate occupies a position similar to that of the back door solvent channel in PAD4. 18c This observation raises the possibility that the solvent channel in PAD4 and related hydrolases is a vestigial substrate-binding site that is retained in amidinotransferases from a common promiscuous ancestor. 18c
Although the front door in PAD4 represents the major site of inhibitor binding, it has been proposed that the back door solvent channel might represent an alternative target for inhibitor development. 29 The structure of PAD4 bound to benzoyl-L-arginine amide (BAA), a small-molecule mimic of peptidyl arginine, illustrates key residues that are important for substrate recognition (Figure 7). For example, aspartates D473 and D350 form two bidentate salt bridges with the substrate guanidinium group, positioning it for nucleophilic attack by C645. Notably, C645 and H471, which are involved in general acid/base catalysis, are located on opposite sides of the guanidinium carbon center. The aliphatic portion of the substrate arginine side chain is clamped between W347 and V469 via hydrophobic interactions. W347, along with R372, also appears to be a key player in generating a catalytically competent active site since mutation of either residue decreases activity to near background levels. 30 In the case of W347, this effect is best explained by the importance of this large hydrophobic side chain in forming the wall of the substrate-binding pocket. Although R372 does not directly hydrogen bond to the substrate, except for a water-mediated interaction, it directly interacts with D345 and with E351, which binds Ca2 and is close to the active site residue D350. Therefore, R372 is critical for maintaining the structural organization of the active site, but plays only a minor role in substrate binding.
As mentioned above, PAD4 is highly selective for peptidyl arginine substrates over nonpeptidyl arginine. This observation can be rationalized by the engagement of the peptide backbone. Specifically, the main chain carbonyl oxygen of the preceding N-terminal residue and the arginine backbone carbonyl oxygen atom form hydrogen bond interactions with the side chain of R374 (see Figure 7), thereby conferring specificity toward peptidyl arginine as opposed to free arginine, which lacks these amide bonds. 21 Consistent with this residue being important for substrate recognition, mutation of R374 to alanine results in a ∼20−50-fold reduction in enzymatic activity. 31 The crystal structures of PAD4 bound to several histone substrate peptides further confirmed this observation and revealed that the great majority of contacts occur between the backbone carbonyl groups of the peptide and the side chains of residues surrounding the active site, i.e., Q346, W347, R372, and R374 (Figure 10). 31 The lack of significant contacts between the enzyme and side chains of the substrate may explain why sequence-specific substrate recognition elements have been difficult to identify. While it was suggested that PAD4 recognizes five successive residues with the consensus sequence of ΦXRXX, where Φ denotes amino acids with small side chain moieties and X denotes any amino acid, it is evident from both the lack of strong sequence specificity and the available PAD4 crystal structures that there are no obvious substrate-specificity determinants apart from the arginine-binding site. 31 These structures do, however, indicate that PAD4 peptide substrates adopt a β-turn-like conformation in which the peptide backbone is kinked to allow proper penetration of the arginine side chain into the deep active site cavity. Therefore, in contrast to most histone-modifying enzymes that interact extensively with their peptide ligands and recognize their substrates in a sequence-specific manner, the PADs likely modify exposed arginine residues that can adopt the type of β-turn-like structure described above. 32 Thus, the enzyme has a fairly broad sequence specificity and could target multiple arginine sites in histones. However, specificity may be provided by controlling access of the enzyme to a limited subset of arginine residues in higher order chromatin structures, by crosstalk with other PTMs, or through cooperation of PAD4 with additional factors. In this respect, the Ig-like domains might also contribute to substrate selection. For example, the binding affinity between PAD4 and HDAC1 was decreased by 3.3-fold upon introduction of an autocitrullination-mimicking glutamine residue at position R123, while there was no effect on the PAD4−H3 interaction. 30b
Figure 9. Top view of the PAD4 C645A mutant bound to BAA colored according to its electrostatic surface potential, highlighting two connected cavities that form a continuous tunnel (orange rod) of ∼21 Å (PDB code 1WDA). The lower image illustrates the side view of the active site cavity (front door), occupied by BAA, and the back door tunnel, presumably involved in incoming water channeling and ammonia (product) extrusion.
2.2.4. Proposed Catalytic Mechanism of PADs. On the basis of the available crystal structures and biochemical studies, the following catalytic mechanism has been proposed (Figure 11). 21,33 Briefly, PAD4 catalyzes arginine citrullination by using a nucleophilic cysteine residue that forms a covalent reaction intermediate, which is hydrolyzed by an incoming water molecule.
Substrate binding is initiated by strong electrostatic interactions between the substrate guanidinium group and the two active site aspartate residues, D350 and D473. The carboxylate of D473 binds to both terminal ω-nitrogens of the guanidinium group, whereas D350 coordinates to one ω-nitrogen and the δ-nitrogen. Consequently, the thiolate of the active site cysteine, C645, is appropriately positioned for nucleophilic attack on the guanidinium carbon, which results in the formation of a covalent tetrahedral intermediate. H471, which is located on the opposite side of the guanidinium group, is thought to promote catalysis by protonating the tetrahedral intermediate, either concomitantly with nucleophilic attack by the active site cysteine or in a stepwise fashion, thereby generating a better leaving group and promoting cleavage of the scissile C−N bond. After the collapse of this reaction intermediate, ammonia is released. The covalent S-alkylthiouronium intermediate is subsequently hydrolyzed by the attack of an H471-activated water molecule, forming a second tetrahedral intermediate that collapses to eliminate the C645 thiolate, ultimately generating citrulline. Support for this mechanism comes from initial studies confirming that PADs are cysteine-dependent enzymes as evidenced by the strong inhibition afforded by iodoacetate or PCMB and the requirement of a reducing agent such as DTT. 34 Detailed mutagenesis, pH rate profile, solvent isotope effect, and solvent viscosity effect studies also support this mechanism. 33 In addition, several inhibitor-bound structures, i.e., F-amidine (N-α-benzoyl-N5-(2-fluoro-1-iminoethyl)-L-ornithine amide), Cl-amidine (N-α-benzoyl-N5-(2-chloro-1-iminoethyl)-L-ornithine amide), and TDFA (threonine-aspartate-F-amidine) (see below), 30a,35 are consistent with the proposed mechanism.
Notably, PAD4 has a pH optimum of ∼7.6 and uses a reverse protonation mechanism that is manifested by the high pKa of the active site cysteine C645 (pKa ≈ 8.3); this pKa value was measured by pH-dependent kinetic inactivation studies using the cysteine reactive compound iodoacetamide. 33a In addition, H471 has to be protonated for efficient catalysis to take place. However, the pKa for H471 is ∼7.3 and is therefore below the pH optimum of the enzyme. Consequently, there is just a small pH window for optimal PAD4 activity, indicating that only a fraction of PAD4 (∼15%) exists in the proper deprotonated C645-thiolate and protonated H471-imidazolium forms at the pH optimum. Mechanistic studies on PAD1 and PAD3 have confirmed that they also proceed through a similar reverse protonation mechanism. 33b Interestingly, however, PAD2 appears to use a substrate-assisted mechanism, in which the positively charged substrate guanidinium promotes catalysis by depressing the pKa of the nucleophilic cysteine. 36 This conclusion was mainly based on pH-dependent inactivation studies comparing the guanidinium-mimicking compound 2-chloroacetamidine with the neutral iodoacetamide. Consistent with PAD4, iodoacetamide inactivation revealed a pKa of 8.2 for the active site cysteine, whereas the use of 2-chloroacetamidine yielded a pKa value of 7.2 for the same cysteine. Therefore, it was proposed that PAD2, like DDAH, 37 uses a substrate-assisted mechanism rather than a reverse-protonation mechanism. 36
Figure 11. Proposed catalytic mechanism for PAD enzymes.
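The idea behind reverse protonation, that the catalytically required ionization states are each a minority species at the pH optimum, can be illustrated with the Henderson-Hasselbalch equation. The following is a minimal sketch using the pKa values and pH optimum quoted above; it treats each group as an independent, ideal titratable site, which is an assumption of this illustration. It estimates each population separately and does not attempt to reproduce the ∼15% figure, which derives from the cited kinetic analysis. Function names are hypothetical.

```python
def deprotonated_fraction(pka: float, ph: float) -> float:
    """Fraction of an acid group in its deprotonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of the same group in its protonated form."""
    return 1.0 - deprotonated_fraction(pka, ph)


PH_OPTIMUM = 7.6         # PAD4 pH optimum quoted in the text
PKA_C645 = 8.3           # PAD4 active site cysteine (iodoacetamide inactivation)
PKA_H471 = 7.3           # PAD4 active site histidine
PKA_PAD2_CYS_FREE = 8.2  # PAD2 cysteine probed with neutral iodoacetamide
PKA_PAD2_CYS_MIMIC = 7.2 # PAD2 cysteine probed with 2-chloroacetamidine

thiolate = deprotonated_fraction(PKA_C645, PH_OPTIMUM)
imidazolium = protonated_fraction(PKA_H471, PH_OPTIMUM)
print(f"PAD4 C645 thiolate at pH {PH_OPTIMUM}:    {thiolate:.3f}")
print(f"PAD4 H471 imidazolium at pH {PH_OPTIMUM}: {imidazolium:.3f}")

# PAD2: effect of the substrate-like positive charge on the thiolate population.
for label, pka in (("iodoacetamide", PKA_PAD2_CYS_FREE),
                   ("2-chloroacetamidine", PKA_PAD2_CYS_MIMIC)):
    print(f"PAD2 thiolate ({label}): {deprotonated_fraction(pka, PH_OPTIMUM):.3f}")
```

In this simple picture, a one-unit drop in the cysteine pKa, as reported for PAD2 with the guanidinium mimic, shifts the thiolate from a minority to the majority species at the same pH, which is the essence of the substrate-assisted proposal.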

Do PADs Function as "Demethyliminases"?
There has been some controversy regarding whether the PADs act on methylated arginine residues. One study claimed that PAD4 catalyzes the conversion of monomethylarginine into citrulline. 38 However, the "demethylimination" reaction occurs at rates that are several orders of magnitude (100−10000-fold) slower for peptide substrates containing either a single monomethyl or a single asymmetrically dimethylated arginine residue than the actual deimination reaction of unmodified arginine residues. 19 Two further studies report that methylation of the guanidinium group prevents citrullination. 39 Using synthetic peptides that contain methylated arginine residues, neither human PAD2, PAD3, PAD4, and PAD6 enzymes nor PADs present in mouse tissue extracts are capable of generating peptidyl citrulline or peptidyl methylcitrulline from either mono- or dimethylated peptidyl arginine. 39b Notably, structural comparison of the active site cavity of PAD4 with DDAH further revealed striking differences that might explain the individual substrate preferences (Figure 12A). DDAH is very selective for asymmetrically dimethylated nonpeptidyl arginines; this selectivity ensures that DDAH only hydrolyzes methylated arginine, which acts as a physiological inhibitor of nitric oxide synthase, while sparing unmodified arginine, which is the substrate for nitric oxide synthase. 27b The active site residues in PAD4 and DDAH are highly conserved; however, PAD4 has a much smaller active site pocket, containing an aspartate at the bottom. This aspartate directly forms a bidentate hydrogen bond to the substrate guanidinium group, whereas DDAH possesses a lysine at the corresponding position that bends away from the substrate-binding site, thereby forming a larger active site pocket that can accommodate methylated arginine residues or even an iminopentyl group as illustrated in Figure 12A. 27b, 40
In the case of PAD4, a methylated guanidinium group would clash with the active site aspartate residues, thereby preventing proper substrate alignment (Figure 12B). Taken together, methylarginines are unlikely to represent physiologically relevant substrates, and it is more likely that arginine methylation antagonizes citrullination as proposed by Cuthbert et al. and Kearney et al. 19,28,41 Notably, since citrulline residues are not methylated by the PRMTs, these two modifications are mutually exclusive.

Is Protein Citrullination a Reversible Modification?
Although many histone PTMs are reversibly regulated by the action of writers and erasers, there is currently no known citrulline eraser. However, the level of H3 citrullination is dynamically controlled, indicating that citrulline deposition is transient during gene expression. 41 For example, PAD4 was shown to be recruited to the pS2 promoter and to citrullinate histone H3 when MCF-7 cells were stimulated with estrogen. Notably, this promoter region elicits a strong signal for H3 citrullination 40 min poststimulation; however, after an additional 10 min, the amount of H3 citrullination drops rapidly to its original levels, observed before estrogen stimulation, suggesting that a "decitrullinase" may exist. 41 While the dynamic nature of citrulline marks could be caused by histone tail clipping, epitope occlusion, or nucleosome displacement, 42 the existence of a decitrullinase remains a formal possibility.
Precedent for such a reaction comes from the urea cycle, where nonpeptidyl citrulline is converted to free arginine by the combined actions of argininosuccinate synthetase and argininosuccinate lyase (Figure 13). 43 Argininosuccinate synthetase catalyzes the conversion of L-citrulline into argininosuccinate (Figure 13A). The first step involves the activation of the urea oxygen of citrulline via covalent modification by an adenosine monophosphate (AMP) moiety. 43b Subsequently, the α-amine of aspartate acts as the nucleophile and attacks the carbon center of the urea, thereby displacing the AMP moiety and generating argininosuccinate. Thus, argininosuccinate synthetase converts the neutral urea of citrulline into a guanidinium connected to succinate. To remove the succinate moiety, the second enzyme, argininosuccinate lyase, catalyzes C−N bond cleavage with the subsequent release of fumarate and arginine (Figure 13B). The catalytic mechanism proceeds through the deprotonation of the β-carbon of succinate, thereby forming a highly reactive carbanion intermediate. 43a Redistribution of the negative charge into the carboxylate group generates the aci-carboxylate intermediate, which provides the driving force for the cleavage of the fumarate group. Protonation of the guanidinium by a general-acid catalyst further facilitates the reaction.
Whether a similar set of enzymes might act on peptidyl citrulline is unclear, but this mechanism provides the principal requirements for promoting a decitrullination reaction, i.e., covalent modification of the urea oxygen atom coupled to incorporation of ammonia, donated by an aspartate-, glutamate-, or glutamine-dependent enzyme. Thus, one could imagine a variety of alternative strategies to achieve the same outcome, including activation by phosphorylation to generate a better leaving group and cleavage of the phospho intermediate by ammonia.
(Figure 14). For example, Taxol (the generic name is paclitaxel) inhibited PAD4 in the low millimolar range. 44 The authors suggested that paclitaxel acts as a noncompetitive inhibitor of PAD4, since the KM was not affected while the Vmax was reduced. Given the absence of citrullination when testing methylated arginine residues as PAD4 substrates (see above), Hidaka and colleagues also determined whether these arginine derivatives can inhibit PAD4 activity. 39a They observed that both benzoyl-Nω-monomethylarginine (Bz-MMA) and benzoyl-Nω,Nω-dimethylarginine (Bz-ADMA) inhibit the enzymatic activity of PAD4. However, these compounds are relatively modest inhibitors. The IC50 value of Bz-ADMA was estimated to be ∼400 μM, whereas the IC50 value for the weaker inhibitor Bz-MMA was not determined.
Employing a PAD4-targeted activity-based protein profiling (ABPP) inhibitory screen (described below), Thompson and colleagues screened a small library of therapeutics used for the treatment of rheumatoid arthritis (RA). 45 Given that streptomycin contains two guanidinium groups, the authors also considered the possibility that streptomycin might act as an alternative substrate; however, streptomycin was not deiminated by PAD4. In general, the potency of the tested compounds is relatively weak, ranging from low millimolar to mid-micromolar. Notably, streptomycin is a competitive inhibitor of PAD4, with a Ki value of ∼0.56 mM, and the tetracycline derivative minocycline was shown to be a mixed-type inhibitor with a Ki value of ∼0.63 mM. A similar compound, chlortetracycline, which only differs from minocycline by the addition of a hydroxyl and methyl group at position 6 and a chloro group replacing the dimethylamine moiety at position 7, is a significantly more potent inhibitor (Ki value of ∼0.11 mM). Kinetic studies further revealed that chlortetracycline is also a mixed-type inhibitor, similar to minocycline. Although the detailed mechanism of inhibition and the specific binding site of the tetracycline derivatives are currently unknown, the tetracycline scaffold might be exploited in the future design of reversible PAD4 inhibitors. Although it is tempting to speculate that the efficacy of the most effective PAD4 inhibitors identified in this screen, minocycline and other tetracycline derivatives, is due in part to their ability to inhibit cellular PAD4, we note that it is not clear whether the high concentrations of compound needed to inhibit PAD4 in vitro could be achieved systemically.
Additionally, these compounds are known to inhibit a wide range of other enzymes, including collagenase, poly-ADP-ribose polymerase-1 (PARP-1), arachidonate 5-lipoxygenase, and several cysteine proteinases, that may also contribute to their efficacy as RA therapeutics and their ability to impair neutrophil chemotaxis and act as anti-inflammatory agents.
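The inhibition modes reported above (noncompetitive for paclitaxel, competitive for streptomycin, mixed-type for minocycline and chlortetracycline) differ in how they shift the apparent KM and Vmax. The following Python sketch illustrates this with the textbook Michaelis-Menten rate law for reversible inhibition. The function names and the split of the inhibition constant into `kic` and `kiu` are assumptions made for illustration only; the cited studies report a single Ki per compound, so a second constant for a mixed-type inhibitor would be hypothetical.

```python
def _factors(i, kic=None, kiu=None):
    """Return (alpha, alpha_prime) for inhibitor concentration i.

    kic: dissociation constant for inhibitor binding to free enzyme.
    kiu: dissociation constant for inhibitor binding to the ES complex.
    None means that binding mode is absent.
    """
    alpha = 1.0 + (i / kic if kic is not None else 0.0)
    alpha_prime = 1.0 + (i / kiu if kiu is not None else 0.0)
    return alpha, alpha_prime


def initial_rate(s, vmax, km, i=0.0, kic=None, kiu=None):
    """Initial rate v = Vmax*S / (alpha*KM + alpha_prime*S)."""
    alpha, alpha_prime = _factors(i, kic, kiu)
    return vmax * s / (alpha * km + alpha_prime * s)


def apparent_constants(vmax, km, i, kic=None, kiu=None):
    """Return (apparent Vmax, apparent KM) at inhibitor concentration i."""
    alpha, alpha_prime = _factors(i, kic, kiu)
    return vmax / alpha_prime, km * alpha / alpha_prime


if __name__ == "__main__":
    # Competitive case, using the Ki of ~0.56 mM quoted for streptomycin.
    # Vmax and KM are set to 1.0 (arbitrary units) purely for illustration.
    print(apparent_constants(1.0, 1.0, i=0.56, kic=0.56))
    # Pure noncompetitive case (kic == kiu): KM unchanged, Vmax reduced.
    # The value 1.0 mM used here is hypothetical.
    print(apparent_constants(1.0, 1.0, i=1.0, kic=1.0, kiu=1.0))
```

In this formulation a competitive inhibitor raises the apparent KM while leaving Vmax unchanged, a pure noncompetitive inhibitor lowers Vmax while leaving KM unchanged, and a mixed-type inhibitor (kic and kiu both finite but unequal) changes both.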
In addition to these compounds, the guanidine derivative 6 (Figure 14) was recently shown to inhibit PAD4 activity (8% inhibition at 1 μM and 36% inhibition at 10 μM). 46 The authors claim that 6 most likely acts through a noncovalent mechanism to block the activity of PAD4; however, detailed mode-of-action and inhibition studies were not performed. Ferretti and colleagues also recently described a novel PAD inhibitor (7) comprising a 3,5-dihydroimidazol-4-one ring that replaces the acyclic guanidine moiety present in arginine residues. 47 This new small-molecule PAD3 inhibitor was reported to show inhibition at 100 nM using cell extracts containing recombinant PAD3. The high potency of this inhibitor is rather surprising since the pKa of acylguanidines is typically 4−5 orders of magnitude lower than that of the corresponding guanidines. 48 Additionally, it is noteworthy that this strong level of inhibition has not been replicated in our hands.
More recently, an ABPP-based inhibitor screening strategy designed to identify inhibitors that target the calcium-free form of PAD2 identified ruthenium red (8) as a PAD2 inhibitor. 49 This compound preferentially binds the apoenzyme with a Ki of 17 μM for PAD2 and was shown to be competitive with calcium, presumably binding at the calcium 3, 4, and 5 sites. Ruthenium red is also a potent inhibitor of the other PAD isozymes, with apparent Ki values of 30 μM for PAD1, 25 μM for PAD3, and 10 μM for PAD4. The ability to identify inhibitors targeting the apo form of the PADs holds great promise for developing highly potent non-active-site-directed reversible inhibitors targeting the PADs.
Lewis and colleagues recently described a highly potent and reversible inhibitor that shows remarkable selectivity for PAD4. 50 In this study, the authors screened a DNA-encoded small-molecule library for PAD4 inhibitors in the absence and presence of calcium. Optimization of the primary hits yielded GSK199 (9) and GSK484 (10) (Figure 15A). Notably, inhibition is calcium dependent for both compounds. In the absence of calcium, GSK199 and GSK484 inhibit PAD4 with IC50 values of 200 and 50 nM, respectively. Interestingly, detailed kinetic analysis demonstrated a mixed mode of inhibition for these compounds and showed that they possess more than 35-fold selectivity for PAD4 compared to the other PADs. The crystal structure of PAD4 bound to GSK199 revealed that the inhibitor directly interacts with the active site residues D473 and H471 (Figure 15B). Comparison of inhibitor- and substrate-bound PAD4 shows that an active site α-helical region, present in the substrate-bound form, adopts a new conformation and is reordered to form a β-hairpin in the inhibitor-bound structure (Figure 15C). In addition, detailed inspection of the orientation of GSK199 highlights a partial overlap between the aminopiperidine group of 9 and the substrate guanidinium group. However, in contrast to the substrate side chain that occupies the front door channel, the benzimidazole and pyrrolopyridine moieties of 9 protrude out into the back door solvent exchange channel (Figure 15C). Notably, these compounds bind a form of PAD4 that lacks calcium at the Ca2 site, mimicking our calcium titration data with PAD2, and again highlighting the importance of Ca2 for generating a catalytically competent conformation. Overall, these inhibitors represent a great example of a successful combination of high-throughput screening efforts with detailed biochemical and structural characterizations to yield novel compounds with potential therapeutic applications.
Figure 15. (A) Reversible, mixed-type PAD4 inhibitors. Kis is the dissociation constant for the enzyme−inhibitor complex. (B) Crystal structure of PAD4 bound to inhibitor 9, GSK199 (PDB code 4X8G). GSK199 (gray) directly interacts with active site residues H471 and D473, and is further stabilized by binding to F634 and N588. Hydrogen bonds of <3.5 Å are represented as dashed black lines. (C) The image on the left side depicts the structure of PAD4 (orange) bound to inhibitor GSK199 (gray, PDB code 4X8G) superimposed onto the structure of PAD4 (green) bound to BAA substrate (green stick model, PDB code 1WDA). Residues 633−640 (red, denoted by α) of PAD4 bound to BAA adopt an α-helical conformation, while residues 633−645 (yellow, denoted by β) of PAD4 bound to GSK199 form an antiparallel β-sheet. The image on the right side compares the binding sites of BAA (green) and GSK199 (gray) mapped onto the structure of PAD4 (PDB code 1WDA), colored according to its electrostatic surface potential.

Chemical Reviews
2.5.2. Irreversible, Covalent Inhibitors of PADs. Over the past several years, major progress has been made in generating irreversible inhibitors targeting the PADs (Figure 16A). Initial studies suggested that 2-chloroacetamidine, having a guanidinium-like amidinium group, represented a suitable candidate for PAD inhibition. 51 In fact, 2-chloroacetamidine is a modest PAD4 inactivator, which blocks enzyme activity in a time-dependent manner, characteristic of a covalent inhibitor. Inspired by BAA, one of the best small-molecule PAD4 substrates, the Thompson group installed reactive electrophilic fluoroacetamidine or chloroacetamidine warheads onto the BAA scaffold. 35a, 52 The generated compounds, denoted as F-amidine or Cl-amidine, respectively, were the first highly potent PAD4 inhibitors. In the context of BAA, this haloacetamidine-based warhead is targeted to the active site of PAD4, where it reacts with C645 to form a stable thioether adduct. Indeed, in vitro studies revealed that both Cl-amidine and F-amidine act as mechanism-based inhibitors that irreversibly inactivate PAD4 and other PAD isozymes in a calcium-dependent manner via the specific modification of C645, the active site cysteine. 35a, 52 The alkylation of C645 proceeds through one of two potential mechanisms. 53 In the first mechanism, C645 directly displaces the halide through an SN2 mechanism. Alternatively, inactivation could proceed via a multistep mechanism that involves nucleophilic attack of the cysteine thiolate on the amidinium carbon, forming a tetrahedral intermediate that mimics the initial tetrahedral intermediate formed during substrate hydrolysis. The protonation of the tetrahedral intermediate by H471, acting as a general acid, is thought to stabilize the rather unstable hemi-iminal tetrahedral intermediate such that it is sufficiently long-lived to undergo an intramolecular halide displacement reaction, which generates a three-membered sulfonium ring.
Although the proposed dicationic intermediate depicted in Figure 16B is unprecedented in the literature, dianionic intermediates have been described. 43a Regardless of the specific mechanism, formation of the three-membered sulfonium ring ultimately induces the collapse of the tetrahedral intermediate, leading to a 1,2-shift that generates a thioether linkage, whose existence has been verified crystallographically (Figure 17). 35a Although, in principle, both mechanisms are plausible, the bell-shaped pH inactivation rate profiles observed for both F-amidine and Cl-amidine strongly support the second inactivation mechanism, especially the importance of H471 as a general-acid catalyst.
Notably, the pH-dependent rate of inactivation correlates with the pKa values obtained for H471 and C645, indicating that these two residues likely possess a critical role not only for substrate turnover but also for enzyme inactivation by haloacetamidine-containing compounds. 53 The second mechanism also accounts for the otherwise poor leaving group potential of the fluoride. Furthermore, crystal structures of PAD4 bound to several inhibitors (PDB codes 2DW5, 3B1T, 3B1U, and 4DKT) show that the histidine (H471) nitrogen atom Nδ1 directly points toward the Nω atom of the amidine inhibitor, as exemplified by the PAD4·F-amidine complex, where one can observe a 2.9 Å distance between Nδ1 from H471 and Nω from the amidine group of the inhibitor. More recently, however, a computational study of the PAD4 inactivation mechanism suggested that proton donation to the departing halide may, alternatively, account for the loss of reactivity at higher pH values. 54 Discriminating between these two potential mechanisms will undoubtedly be the subject of future research.
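A bell-shaped pH rate profile of the kind described above is commonly modeled with two ionizations, one governing the ascending limb and one governing the descending limb. The Python sketch below shows that generic two-pKa model. The function names and the pKa values in the example are hypothetical and are not taken from the cited work, and the sketch does not assign either pKa to a particular residue.

```python
def bell_shaped_rate(ph, k_lim, pka_low, pka_high):
    """Generic two-ionization model for a bell-shaped pH rate profile.

    The rate rises as the group with pKa = pka_low is deprotonated and
    falls as the group with pKa = pka_high is deprotonated.
    k_lim is the pH-independent limiting rate constant.
    """
    return k_lim / (1.0 + 10.0 ** (pka_low - ph) + 10.0 ** (ph - pka_high))


def ph_optimum(pka_low, pka_high):
    """pH at which the two-ionization model reaches its maximum."""
    return 0.5 * (pka_low + pka_high)


if __name__ == "__main__":
    # Hypothetical pKa values, chosen only to show the shape of the curve.
    pka_low, pka_high = 7.0, 9.0
    for ph in (6.0, 7.0, 8.0, 9.0, 10.0):
        rate = bell_shaped_rate(ph, k_lim=1.0, pka_low=pka_low, pka_high=pka_high)
        print(f"pH {ph:4.1f}: relative rate {rate:.3f}")
    print("optimum pH:", ph_optimum(pka_low, pka_high))
```

Fitting measured inactivation rates to such a model yields two apparent pKa values, which is the kind of comparison with the pKa values of H471 and C645 that the text refers to.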
Cl-amidine and 2-chloroacetamidine possess similar maximal rates of inactivation (i.e., kinact). Cl-amidine, however, is a far more potent inhibitor due to increased binding energy. Thus, selective enzyme inactivation is driven in part by the affinity of the enzyme for the inhibitor. Further exploration of the Cl-amidine scaffold resulted in the identification of more potent PAD inactivators, such as o-carboxyl-Cl-amidine (14). 30a Structural analysis of this compound bound to PAD4 revealed that the o-carboxylate forms a direct hydrogen bond with the indole NH of W347 and a water-mediated hydrogen bond with the side chain of Q346, which might explain the enhanced potency of 14 compared to Cl-amidine (Figure 17B). 30a Additional selectivity studies revealed that 14 preferentially inactivates PAD1. The selectivity for PAD1 inhibition is 8-, 10-, and 3-fold higher than that obtained for PAD2, PAD3, and PAD4, respectively. 30a Using a solid-phase peptide library approach, the Thompson group also identified a novel PAD4-selective inhibitor that consists of a tripeptide comprising threonine, aspartate, and the warhead-containing F-amidine residue (TDFA). 35b TDFA is highly selective for PAD4 (up to 65-fold) with excellent in vivo potency. 35b The crystal structure of TDFA bound to PAD4 further revealed that the carboxylate group from the TDFA aspartate residue directly interacts with the amide nitrogen of the glutamine residue Q346 of PAD4 (Figure 17B). Interestingly, the TDFA carboxylate adopts a position similar to that of the o-carboxylate group in o-Cl-amidine as well as the carbonyl oxygen of T7 in the H3 substrate (Figures 10 and 17), but does not directly hydrogen bond with the side chain of W347. 35b The negative charge of the carboxylate might further enhance inhibitor binding through long-range electrostatic interactions with residues R374 and R639. 35b
Despite the availability of several PAD inhibitors, Cl-amidine is still the most widely used compound and serves as a benchmark to estimate the potency of novel inhibitors. Although Cl-amidine was shown to reduce protein citrullination in cell and animal studies, and to ameliorate disease severity in several animal models (see below), several obstacles must still be overcome before its potential clinical use, including a short in vivo half-life, poor bioavailability, and, because Cl-amidine is an irreversible inhibitor, the potential for off-target effects. 55 Therefore, efforts have been undertaken to generate more stable Cl-amidine derivatives that resist proteolysis in vivo. To this end, the D-amino acid derivative of Cl-amidine (D-Cl-amidine) has been synthesized. 55 D-Cl-amidine is slightly less potent in vitro and preferentially inactivates PAD1. The inhibition of PAD4 by D-Cl-amidine is consistent with the observation that PADs are also active on D-arginine derivatives, at a rate that is ∼5-fold lower than that for the L-isomer. 56 However, D-Cl-amidine is equally potent in cells as compared to L-Cl-amidine and exhibits better pharmacokinetics, presumably due to decreased proteolysis. 55
Since Cl-amidine is a highly hydrophilic compound that readily dissolves in water, Wang and colleagues tried to increase the hydrophobicity and thereby bioavailability of Cl-amidine. 57 Specifically, they synthesized a diverse panel of molecules, all comprising the Cl-amidine scaffold, flanked by alternative hydrophobic groups. The most potent inhibitor (16) contained a Cα-amide-methylbenzene as well as an Nα-amide-dimethylnaphthylamine moiety attached to the regular Cl-amidine scaffold. Compound 16 showed in vitro inhibition values (IC50 = 1−5 μM) similar to that of Cl-amidine (IC50 = 5 μM); however, inhibition of cellular proliferation was increased by ∼50-fold, most likely due to an improvement in cell permeability. 57 Another interesting Cl-amidine derivative is BB-Cl-amidine (17), which contains a C-terminal benzimidazole and an N-terminal biphenyl moiety. The increased hydrophobicity of this compound also improves its cellular potency, bioavailability, and in vivo half-life. 58 BB-Cl-amidine exhibits similar in vitro potencies and selectivities compared to Cl-amidine. However, the cellular potency of BB-Cl-amidine is increased by more than 20-fold; the EC50 value is 8.8 μM versus >200 μM for Cl-amidine when tested against U2OS osteosarcoma cells, a PAD4-expressing cell line.
Projecting forward, with the exception of TDFA and GSK199, most of the currently available compounds are pan-PAD inhibitors that block all of the active PAD isozymes with similar potencies. Thus, the identification of isozyme-selective PAD inhibitors remains of crucial importance, and will facilitate the discovery of the individual contributions of PAD isozymes to both cellular physiology and disease.
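The observation in this section that two inactivators can share a similar kinact yet differ greatly in potency follows from the standard two-step model for irreversible inhibition, in which reversible binding (governed by KI) precedes covalent modification (governed by kinact). The Python sketch below illustrates that model. All numerical values are hypothetical and chosen only to show the trend; they are not measured constants for Cl-amidine or 2-chloroacetamidine, and the function names are assumptions.

```python
import math


def k_obs(inhibitor_conc, k_inact, k_i):
    """Observed pseudo-first-order inactivation rate constant.

    k_obs = k_inact * [I] / (K_I + [I]), with [I] and K_I in the same units.
    """
    return k_inact * inhibitor_conc / (k_i + inhibitor_conc)


def fraction_active(time, inhibitor_conc, k_inact, k_i):
    """Fraction of enzyme still active after `time` (same time unit as k_inact)."""
    return math.exp(-k_obs(inhibitor_conc, k_inact, k_i) * time)


def second_order_efficiency(k_inact, k_i):
    """k_inact / K_I, the usual measure of covalent inhibitor potency."""
    return k_inact / k_i


if __name__ == "__main__":
    k_inact = 2.0            # min^-1, hypothetical, shared by both compounds
    weak_binder_ki = 20e-3   # M, hypothetical (small warhead-only fragment)
    tight_binder_ki = 0.2e-3  # M, hypothetical (warhead on a substrate-like scaffold)
    conc = 50e-6             # M inhibitor, hypothetical
    for name, k_i in (("weak binder", weak_binder_ki), ("tight binder", tight_binder_ki)):
        print(name,
              "k_inact/K_I =", second_order_efficiency(k_inact, k_i), "min^-1 M^-1;",
              "fraction active after 5 min =", fraction_active(5.0, conc, k_inact, k_i))
```

With the same kinact, the compound with the lower KI inactivates the enzyme faster at any subsaturating concentration, which is why kinact/KI rather than kinact alone is used to compare irreversible inhibitors.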
2.5.3. Chemical Probes for the PADs. Given the high potency and ability to irreversibly modify PAD4, as well as the fact that F-amidine and Cl-amidine selectively modify the active, calcium-bound, form of PAD4, these compounds were adapted for use as ABPP reagents. The first synthesized PAD-selective ABPP was rhodamine-conjugated F-amidine (RFA, 19; Figure 18A). 59 This compound retains the F-amidine scaffold but is linked to a fluorescent reporter tag (rhodamine) via a p-benzylic triazole group. Preliminary studies confirmed that this fluorescently tagged PAD4-targeted ABPP preferentially labels the calcium-bound, active, form of PAD4 and does not modify a C645S mutant. 59 The probe shows potency equal to that of the non-reporter-tagged F-amidine, indicating that the reporter group does not interfere with enzyme binding. In addition, a biotinylated version of F-amidine (BFA, 20) was synthesized to isolate endogenous PADs. 60 The BFA probe further contains a TEV (tobacco etch virus) cleavage site to release bound PAD4 under gentle conditions. This probe was used to coisolate PAD4-interacting proteins from MCF7 cells. While the probe selectively targets endogenous PADs, it was also able to copurify several known PAD4-associated proteins, including p53, HDAC1, and histone H3, indicating that this approach might be used to identify novel PAD4-binding proteins. 60 Therefore, these PAD-specific ABPPs represent valuable tools that can be used to label PAD4 in cells, as well as to enrich active PAD4 and PAD4-interacting proteins.
In addition to their potential use in targeted proteomic studies and activity-based protein profiling applications, these probes have been used as the basis for developing a number of inhibitor screening platforms. To this end, the Thompson group developed a screening assay that relies on RFA to identify novel PAD inhibitors from diverse chemical libraries in a gel-based format. 45 RFA can also be used to measure changes in PAD activity as a function of added inhibitor in a plate-based assay that is compatible with large compound libraries used in high-throughput screening (HTS) approaches by monitoring the changes in fluorescence polarization evoked by probe labeling of the enzyme (Figure 18B). The basis of this assay comes from the fact that when the fluorescent group is excited with polarized light, the RFA−PAD4 complex will rotate slowly and therefore emit highly polarized light. Conversely, free RFA rotates faster and emits nonpolarized light. If an inhibitor is bound to PAD4, it will compete with RFA for enzyme interaction, thereby yielding a low fluorescence polarization signal. Using this fluorescence polarization activity-based protein profiling (fluopol-ABPP)-based HTS assay, the Thompson group screened the NIH validation set, comprising 2000 compounds, and identified streptonigrin (18) as a potent and selective (>35-fold selective for PAD4) PAD4 inactivator. 61 This compound shows time-dependent enzyme inactivation and acts as an irreversible PAD4 inhibitor (kinact/KI = 4.4 × 10⁵ min⁻¹ M⁻¹). The detailed mode of inactivation is still, however, unknown. 62 In addition to its in vitro activity, streptonigrin also inhibits the histone citrullination activity of PAD4 in HL-60 granulocytes and MCF7 cells. 61 Unfortunately, streptonigrin has a number of off targets leading to pleiotropic effects on cell viability and signaling, thereby limiting its utility as a probe of PAD4 activity.
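The fluorescence polarization readout described above can be sketched in a few lines of Python. Polarization is computed from the emission intensities measured parallel and perpendicular to the excitation plane, and an inhibitor's effect is expressed relative to the free-probe and fully labeled controls. The function names, the optional instrument correction factor `g_factor`, and the example intensities are assumptions made for illustration; the actual assay parameters used in the cited screen are not given in the text.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP).

    P = (I_par - G * I_perp) / (I_par + G * I_perp), reported as 1000 * P.
    """
    corrected = g_factor * i_perpendicular
    return 1000.0 * (i_parallel - corrected) / (i_parallel + corrected)


def percent_inhibition(mp_sample, mp_free_probe, mp_labeled_enzyme):
    """Percent inhibition of probe labeling.

    mp_free_probe: signal of probe alone (low polarization, fast tumbling).
    mp_labeled_enzyme: signal of probe fully bound to enzyme (high polarization).
    """
    window = mp_labeled_enzyme - mp_free_probe
    return 100.0 * (mp_labeled_enzyme - mp_sample) / window


if __name__ == "__main__":
    # Hypothetical raw intensities (arbitrary units) for three wells.
    free = polarization_mp(1100.0, 1000.0)       # probe only
    bound = polarization_mp(1700.0, 1000.0)      # probe + enzyme, no inhibitor
    test_well = polarization_mp(1300.0, 1000.0)  # probe + enzyme + test compound
    print("free probe (mP):", round(free, 1))
    print("labeled enzyme (mP):", round(bound, 1))
    print("percent inhibition:", round(percent_inhibition(test_well, free, bound), 1))
```

A compound that blocks probe labeling leaves more probe tumbling freely, so the well reads closer to the free-probe control and the computed percent inhibition approaches 100.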
More recently, the Thompson group adapted this fluopol-ABPP-based HTS approach to identify PAD2 inhibitors. Here, the authors hypothesized that by lowering the concentration of calcium in the reaction mixture they might identify compounds that specifically bind to the apo, calcium-free form of the enzyme. Using this biased assay format, Lewallen et al. screened the LOPAC collection of pharmacologically active compounds and identified ruthenium red as the first calcium competitive inhibitor for the PADs. 49 Although this compound shows limited utility as a cellular probe of PAD activity, its discovery does demonstrate that it is possible to identify potent non-active-site-targeted reversible PAD inhibitors.
(Figure 19). In addition, H4R17, H4R19, and H4R23 can be citrullinated by PAD4 in vitro but have not been found to be targets of citrullination in vivo. 31 Histone citrullination is associated with both transcriptional repression and activation. 38,41,64 For example, the citrullination of H3R17 by PAD4 at the estrogen receptor α (ERα)-regulated pS2 promoter was shown to correlate with transcriptional repression by interfering with activating, PRMT4-mediated, arginine methylation events. 38,41 Moreover, p53-dependent recruitment of PAD4 to the p21 promoter resulted in citrullination of histone H3 and inhibition of gene transcription. 65 The function of PAD4 as a p53 corepressor is further enhanced by direct interaction between the histone deacetylase HDAC2, which also represses p53 target genes, and PAD4. 66 Exposure to genotoxic stress induced the release of PAD4 from the p21 promoter, which is subsequently derepressed by activating methylation of histone H3R17 by PRMT4. 65 PAD4 also mediates histone H3 citrullination on the promoter of the OKL38 gene, thereby repressing the expression of this pro-apoptotic tumor suppressor gene. 67 However, following DNA damage, increased p53 binding and histone arginine methylation, as well as a decrease in histone citrullination on the OKL38 promoter, accompany the activation of OKL38, suggesting a direct role of PAD4 and p53 in the expression of OKL38. 67
Citrullination of H3R8 following estrogen-induced activation of PAD4 also correlates with target gene activation by abolishing the H3K9me3-directed recruitment of HP1α to ERα-dependent promoters. 68 HP1α is a chromatin-binding protein that specifically interacts with the H3K9me3 mark to repress gene expression by inducing a heterochromatin-like state that is refractory to high-level transcription. It was further shown that methylation of H3R8 slightly reduces the binding of the transcriptional repressor HP1α to H3K9me3. 68 This example of histone PTM crosstalk raises the interesting possibility that differences in HP1α-binding affinity to H3K9me3 are caused by H3R8 methylation or citrullination, thereby regulating the gradual activation of HP1α target genes upon estrogen stimulation. Increased citrullination at H3R8 in peripheral blood mononuclear cells also results in the activation of downstream genes such as the cytokines TNFα and IL8, which ultimately leads to inappropriate T-lymphocyte activation and uncontrolled immune response in multiple sclerosis. 68
More recently, PAD4 was shown to interact with TAL1, a transcription factor that is essential for the generation of embryonic hematopoietic stem cells. 69 There, it was demonstrated that TAL1-bound PAD4 acts as an epigenetic coactivator by competing with PRMT6-mediated methylation of the repressive H3R2me2a mark and thus increases IL6ST expression. Alternatively, TAL1-recruited PAD4 can function as a corepressor by counteracting the activating H3R17me2a mark by PRMT4, thereby inhibiting the expression of the CTCF-encoding gene.
More recently, it was shown that PAD4 citrullinates a single arginine residue, H1R54, within the DNA-binding site of histone H1. Modification of this residue results in H1 displacement from chromatin, thereby inducing global chromatin decondensation in pluripotent cells. 70 It was also shown that PAD4 is expressed and active in murine embryonic stem (ES) cells as well as reprogrammed induced pluripotent stem (iPS) cells. Interestingly, the expression of PAD1, PAD2, and PAD3, but not that of PAD6, is also observed in pluripotent cells, indicating a potential function for these PADs in pluripotency or cell differentiation as well. 70 In these cells, PAD4 plays a critical role in the pluripotency transcriptional network by enhancing the expression of genes involved in stem cell development and maintenance that can be inhibited by addition of Cl-amidine. For example, PAD4 binds to the promoter region of key stem cell genes such as Klf2, Tcl1, Tcfap2c, Kit, and Nanog, thereby activating their expression. 70 Chromatin immunoprecipitation−quantitative polymerase chain reaction (ChIP−qPCR) analyses revealed that the association of H1 with chromatin at the regulatory regions of Tcl1 and Nanog is low in pluripotent cells; however, upon Pad4 knockdown, it is significantly enhanced. Moreover, mutation of H1R54 to alanine impairs its interaction with nucleosomes, supporting the critical role of this residue in nucleosome binding. The citrullination of H1 induces chromatin decompaction and may enhance the accessibility of RNA polymerase, transcription factors, and further histone-modifying enzymes. The PAD4-induced open chromatin architecture is also important for stem cell pluripotency during early mouse embryogenesis and can be impaired by Cl-amidine and TDFA treatment. 70 The overexpression of PADs in multiple cancers (see below) might induce a similar stem-cell-like state, containing decondensed chromatin, and thereby promote uncontrolled cell growth. 71 Interestingly, Dwivedi and colleagues recently observed that the citrullination of histone H1 at arginine 54 is also critical for neutrophil extracellular trap (NET) formation (see below) and represents an autoantibody epitope in sera from patients with systemic lupus erythematosus and Sjögren's syndrome. 72
PAD4 was long thought to be the only nuclear PAD and as such was assumed to be responsible for all nuclear histone citrullination events. Recent studies, however, indicate that this is not the case and that PAD2 is also capable of citrullinating histones. 17b,64 For example, PAD2 expression was shown to be upregulated by epidermal growth factor (EGF) stimulation in mammary epithelial cells. 17b There, nuclear PAD2 citrullinates histone H3 at R2, R8, and R17. It was proposed that PAD2-induced histone citrullination may play a regulatory role in the expression of lactation-related genes during the diestrus phase of the estrous cycle. 17b Moreover, citrullination of H3R26 by PAD2 at ERα target genes has been linked to transcriptional activation of more than 200 genes. 64 The presence of H3R26Cit destabilizes the nucleosome structure to allow for efficient ER binding to nucleosomal DNA. 73 The altered nucleosome structure directly correlates with estradiol administration. Hence, it was proposed that, following estradiol exposure, ER directly or indirectly recruits PAD2 to ER target genes where PAD2 then citrullinates H3R26. The citrullinated H3 was postulated to induce an altered conformation of the nucleosome, manifested by core nucleosome particle protection size shifts from 149 to 125 bp upon H3R26 deimination, which allows for a more stable interaction between ER and its nucleosomal ER-binding sites. 73 Interestingly, this arginine residue can also be methylated, and these two modifications are inversely correlated. 41 It is also interesting to note that citrullination at H3R26 strongly colocalizes with H3K27 acetylation in MCF-7 cells, thereby raising the possibility for crosstalk between these two modifications. 64
2.6.2. Epigenetic Effects of Nonhistone Citrullination. Apart from direct histone citrullination, PADs can also act as direct coactivators of specific transcription factors to induce other histone modifications that affect gene expression. As such, it was shown that PAD4 associates with several transcriptionally active promoters and functions as an activator of c-Fos via a mechanism that involves facilitated phosphorylation of the ETS-domain protein Elk-1. 74 EGF-induced activation of PAD4 results in the direct targeting of Elk-1 for citrullination, thereby increasing ERK-mediated-phosphorylation-induced activation of Elk-1. Activated Elk-1 exhibits enhanced association with the histone acetyltransferase p300, which ultimately induces histone H4K5 acetylation and concomitant increased gene transcription.
In addition to the citrullination of histones and transcription factors, the PADs citrullinate a number of other proteins, including themselves. For example, PAD4 is known to autocitrullinate at numerous sites in vitro and in vivo. 30b, 75 There have been some conflicting observations regarding the functional impact of PAD4 autocitrullination. One report claimed that autocitrullination reduces PAD4 activity. 75 By contrast, another study did not detect any significant influence on catalytic activity. 30b In addition, it was shown that PAD4 autocitrullination alters protein−protein interactions and is thought to weaken the interaction between PAD4 and citrullinated H3 as well as PRMT1 and the histone deacetylase HDAC1. 30b Autocitrullination of PAD4 was also proposed to destabilize a corepressor complex consisting of PAD4 and HDAC1, thereby providing a potential mechanism for decreasing the corepressor activity of this complex. 30b
2.6.3. Nonepigenetic Effects of Histone Citrullination. It was postulated that PAD4-mediated citrullination at H4R3 may represent an "apoptotic histone code" to detect damaged cells and induce nuclear fragmentation. 76 DNA damage induces PAD4 expression and the concomitant citrullination of various proteins. 77 In this respect, PAD4 was shown to citrullinate H4 at arginine 3, and this activity was blocked by small interfering RNAs (siRNAs) against p53 or PAD4. 76 In addition, the presence of H4R3Cit correlates with the level of apoptosis induction in DNA-damage-exposed U2OS cells. Citrullination of H4R3 was proposed to enhance accessibility of nucleosomal DNA, thereby promoting its apoptotic fragmentation. 76 Moreover, PAD4 was shown to citrullinate the nuclear lamina protein lamin C, suggesting the involvement of lamin C citrullination in nuclear fragmentation during apoptosis. 76
Histone citrullination is also implicated in innate immunity, where PAD4-mediated histone citrullination is involved in the formation of NETs (Figure 20). 78 NETs, which were first identified in 2004 by the Zychlinsky group, are composed of nuclear DNA and associated proteins that are ejected by neutrophils in response to an infection. 79 Although the physiological function of NET formation is incompletely understood, NETs are thought to function as a proinflammatory form of cell death, known as NETosis, that links the innate and adaptive immune responses. Specifically, NETs are formed in response to a number of stimuli of both bacterial and human host origin that trigger the release of chromatin to form a weblike structure that can trap pathogens and prevent them from spreading throughout the body.
NETs also increase the local concentration of coextruded antimicrobial agents, including histones, myeloperoxidase (MPO), and proteases. Notably, several decades earlier, it had already been shown that histones, in particular H3 and H4, possess bactericidal activity, 80 thereby providing at least a partial explanation for the release of histones during NETosis. NET-forming stimuli include lipopolysaccharide (LPS), N-formyl-methionine-leucine-phenylalanine (f-MLP), lipoteichoic acid (LTA), tumor necrosis factor (TNF), interleukin-8 (IL-8), and hydrogen peroxide. 78b,d Although the specific cellular pathways that trigger NET formation are an area of intense investigation, they are currently incompletely understood.
Nonetheless, there appears to be a requirement for the generation of reactive oxygen species (ROS), since neutrophils from patients with chronic granulomatous disease, which is due to mutations in the ROS-generating enzyme nicotinamide adenine dinucleotide phosphate (NADPH) oxidase, do not form NETs. In addition, PAD4 activity is a prerequisite and likely the terminal point in this signal transduction cascade because genetic deletion or chemical inhibition of PAD4 results in mouse neutrophils that are unable to citrullinate histones and do not form NETs. 78a,c,81 Thus, PAD4 is a crucial component of the innate immune system in mammals.

Histone Citrullination in Disease
Dysregulated PAD expression and aberrant protein citrullination have been implicated in numerous human diseases as summarized in several excellent reviews. 13,82 Here, we focus on diseases where a direct link between histone citrullination and disease has been established by discussing the involvement of histone citrullination in cancer and inflammatory diseases.
2.7.1. Histone Citrullination in Cancer. Histone modifications play an important role in tumor development and cancer. 83 In this respect, PAD4 is overexpressed in various malignant tumor tissues, including osteosarcoma, several adenocarcinomas affecting the colon, esophagus, ovaries, pancreas, and stomach, and carcinomas of the breast, bladder, endometrium, and liver, suggesting that it might be involved in tumors derived from multiple tissue origins. 84 It was proposed that aberrant histone modifications can induce tumor suppressor gene silencing, thereby promoting tumorigenesis. 83b A key regulator of cell cycle arrest and programmed cell death is the tumor suppressor p53, which is mutated in about half of cancers and thereby represents the most frequently altered gene in human cancers. 85 p53 is a DNA-binding protein that is responsible for the integration of diverse signals, such as starvation, DNA damage, and various stress signals, by regulating numerous downstream genes that help to cope with stress and control cell fate. As described above, PAD4 functions as a corepressor of p53 to repress its downstream tumor suppressor genes such as p21, GADD45, and PUMA. 65,66 Notably, both PAD4 knockdown and inhibition with Cl-amidine result in increased expression of these p53 target genes and increased cell death. 65,66 In addition, in osteosarcoma U2OS cancer cells, PAD4 represses the expression of the p53 target gene SESN2, which encodes an upstream inhibitor of the mammalian target of rapamycin complex 1 (mTORC1) signaling pathway to regulate cellular autophagy. 57 Consistently, the PAD inhibitor YW3-56 (14) induces autophagy in U2OS cells. 57 In summary, PAD4 might induce tumorigenesis through multiple mechanisms, including chromatin decondensation resulting in a stem-cell-like state, inhibition of tumor suppressor genes by corepressing p53 target genes, and inhibition of autophagy and the concomitant increase in protein synthesis and cell growth.
Interestingly, Tanikawa and colleagues observed that the expression of PAD4 is activated by p53, which binds to the p53-response element p53BS-A in intron 1 of the PAD4 gene. These data indicate that PAD4 may form part of a negative feedback loop to regulate p53 activity. 76,77 These data further imply that aberrant histone citrullination caused by dysregulated PAD enzyme activity is actively involved in cancer progression. Given the close link between PADs and cancer, PAD4 represents a suitable target for cancer drug development. In this respect, the PAD inhibitors Cl- and F-amidine both exhibit cytotoxic effects toward several cancerous cell lines such as HL-60, MCF7, and HT-2, whereas no effect was observed in noncancerous lines such as NIH 3T3 and HL-60 granulocytes. 86 In addition, these PAD inactivators also potentiate the cytotoxicity of the commonly used anticancer drug doxorubicin. 86 Moreover, the Coonrod group showed that the level and activity of PAD2 increase during the transition of normal mammary epithelium to fully malignant breast carcinomas and coincide with HER2/ERBB2 upregulation. 87 In addition, when treating MCF10DCIS monolayer carcinoma cells with Cl-amidine, they observed a strong suppression of cell growth in culture, which is induced by cell cycle arrest in the S-phase, followed by apoptosis. Moreover, administration of Cl-amidine to mice containing MCF10DCIS-injected tumor xenografts, a preclinical model of breast cancer, suppresses tumor growth. 87 Although high-level PAD expression and histone citrullination are generally considered to be characteristic features of cancer cell proliferation, the Coonrod group recently identified a strong correlation of high PAD2 expression and H3R26 Cit with increased survival in estrogen receptor positive (ER+) tumor patients. 73 The authors further suggest that histone citrullination might be a critical prognostic marker for ER+ tumor development and is thus suited to stratify ER+ tumors into clinically relevant subsets. This observation also raises the intriguing question of whether histone citrullination may act as either a tumor promoter or a suppressor mark in a context- and/or isozyme-dependent manner. Therefore, further studies are necessary to clarify the detailed roles of individual PADs during tumor development and to understand whether they function as tumor suppressors or oncogenes.
2.7.2. Histone Citrullination in Inflammatory Diseases. Altered histone citrullination is also observed in a range of inflammatory diseases such as RA, lupus, ulcerative colitis, Alzheimer's disease, and multiple sclerosis. 82d Despite their great diversity, a common molecular feature of at least a subset of these diseases is aberrantly upregulated NET formation due to the presence of activated immune cells. As such, progression of these inflammatory diseases may result from the inappropriate and exaggerated induction of NET formation. For example, neutrophils obtained from RA patients are more likely than control neutrophils to spontaneously release NETs. 78e Lupus neutrophils possess a similar phenotype, and it is notable that hallmarks of RA and lupus include autoantibodies that bind specifically to citrullinated proteins (RA) and double-stranded DNA (lupus), which are both released from neutrophils during NETosis. Consistent with a role for PAD4 in this process is the fact that both Cl-amidine and GSK199 inhibit NET formation. 50,58 Additionally, neutrophils isolated from PAD4(−/−) mice cannot form NETs after stimulation with chemokines or incubation with bacteria, highlighting that the PAD4 function is critical in diseases associated with aberrant NET formation. 78c With respect to RA, the links between dysregulated PAD4 activity and disease onset are extremely strong because, in addition to its important role in NET formation, a genome-wide haplotype study identified four single-nucleotide polymorphisms (SNPs) in PAD4 that are associated with an increased risk of developing RA. 88 Additionally, the most specific diagnostic for RA is the presence of antibodies to citrullinated proteins, the product of the PAD reaction, and these antibodies now form part of the clinical diagnostic criteria. 89 Furthermore, these autoantibodies are present before clinical disease and are predictive of a more severe and erosive form of the disease. 
90 Notably, it was demonstrated that treatment of mice suffering from collagen-induced arthritis (CIA) with Cl-amidine reduces disease severity, joint inflammation, and joint damage in a dose-dependent manner without apparent signs of cytotoxic effects. 91 Moreover, Cl-amidine was also effective in a mouse model of ulcerative colitis where its oral or intraperitoneal administration increased the colon length, as well as mouse mobility and activity, and reduced the disease severity. 92 Recently it was shown that PAD inhibition using BB-Cl-amidine and Cl-amidine mitigates vascular, kidney, and skin disease in an MRL/lpr mouse model of lupus. 58 Specifically, these PAD inhibitors not only reduce NET formation and interferon (IFN) production, which has been associated with the development of endothelial dysfunction in lupus, 93 but also decrease immune complex deposition in kidneys and reduce proteinuria, which constitute major characteristics of this disease. Taken together, inhibition of NET formation by PAD-specific inhibitors represents a promising therapeutic strategy to combat different inflammatory diseases.
Since aberrant NET formation including concomitant increased levels of citrullinated histones is also present in deep vein thrombosis and myocardial infarct formation, inhibition of PAD4 activity also represents a suitable target to interfere with these serious cardiovascular diseases. 94 Indeed, recent data with both Cl-amidine and BB-Cl-amidine support this hypothesis. 58,81 Apart from PAD-mediated citrullination of histone proteins and NET formation, PAD4 and PAD2 were also shown to hypercitrullinate myelin basic protein (MBP), resulting in the demyelination of the myelin sheath and affecting nerve cell signal transduction, thereby promoting the development of multiple sclerosis (MS). 95 In this respect, the PAD inhibitor 2-chloroacetamidine has shown efficacy in multiple preclinical models of MS, indicating that PAD enzymes may also represent a therapeutic target for MS. 96

Future Areas of Protein Citrullination Research
Although most studies dealing with histone citrullination focus on its nuclear effects regarding the modulation of gene expression, it is evident that protein citrullination, even on histone proteins, also exists in the extracellular environment of human sera. The presence of hypercitrullinated proteins, including histones, is a well-documented highly specific marker for rheumatoid arthritis that has diagnostic and prognostic value as well as potential therapeutic implications. 97 Since PAD inhibitors can block the accumulation of these hypercitrullinated (histone) proteins and the incidence of anticitrulline antibodies as well as aberrant NET formation, inhibition of PAD function represents a promising target to treat a number of different inflammatory diseases which are linked to hypercitrullination and abnormal NET formation.
Regarding the function of protein citrullination in epigenetic regulation, it will be of great interest to screen for enzymes that can reverse protein citrullination as well as to identify potential citrullination reader proteins that may recognize and integrate this emerging epigenetic mark into the cellular interpretation of the histone code. Since PADs are calcium-dependent enzymes that require near millimolar concentrations of calcium to efficiently citrullinate their protein substrates in vitro, it also remains a topic of future research to elucidate how the PADs get activated in vivo, as cellular calcium concentrations typically remain at low micromolar levels. Another critical aspect to further advance the current knowledge of PAD biology is the generation of isozyme-selective inhibitors to ascertain the physiological contribution of the individual enzymes in health and disease. In addition, the development of next-generation PAD inhibitors with enhanced selectivity, bioavailability, and pharmacokinetic stability, and preferably reversible inhibitors that minimize the potential off-target effects encountered with irreversible inhibitors, is highly desirable.

Overview of Protein Arginine Methylation
Protein arginine methylation is a common post-translational modification that regulates numerous cellular processes, including gene transcription, mRNA splicing, DNA repair, protein cellular localization, cell fate determination, and signaling. 98 Notably, it was shown that about 2% of the total arginine residues isolated from rat liver nuclei are dimethylated. 99 The formation of this PTM is catalyzed by the PRMT family of methyltransferases. Currently, there are nine PRMTs annotated in the human genome (Figure 21). 100 In addition, two distantly related proteins, FBXO10 and FBXO11, show a low degree of sequence homology to some PRMT motifs, but lack the important substrate-binding double E-loop and THW loop. 101 Although the human FLAG-tagged FBXO11 protein was proposed to harbor arginine methyltransferase activity, the human HA-tagged version of the protein and the Caenorhabditis elegans FBXO11 orthologue DRE-1 did not show any methyltransferase activity in a subsequent study. 101a,102 Therefore, FBXO10 and FBXO11 are not considered to be true PRMTs. 103 Notably, chemogenetic analyses suggest that there may be as many as 44 PRMTs in the human proteome. 101c Whether all of these enzymes represent bona fide PRMTs has yet to be proven definitively.
The nine S-adenosyl-L-methionine (SAM or AdoMet)-dependent enzymes can be further classified into three types according to their preferred methylation products, as the terminal amine(s) of the arginine guanidinium group may be monomethylated or symmetrically or asymmetrically dimethylated to form monomethylarginine (MMA), symmetric dimethylarginine (SDMA), and asymmetric dimethylarginine (ADMA), respectively (Figure 22). Therefore, PRMT-mediated arginine modifications add either ∼14 Da (MMA) or ∼28 Da (SDMA or ADMA) to the overall mass of the histone protein. However, as can be seen in Figure 23A, methylation does not perturb the overall positive charge of the arginine guanidinium group, but changes potential hydrogen bond interactions, since each added methyl group removes one hydrogen bond donor site. On the basis of experimental studies performed with guanidine and a diverse series of methylated guanidine derivatives, the pKa of guanidine is 13.6, whereas the pKa values for N-methylguanidine and N,N-dimethylguanidine are 13.4. 104 Interestingly, the pKa value for N,N′-dimethylguanidine is 13.6, indicating that monomethylated and asymmetrically dimethylated guanidines are only slightly weaker bases than symmetrically dimethylated guanidines. Thus, the major effects of this modification are steric effects as well as changes in hydrogen bond interactions as opposed to the electronic effects observed with citrullination.
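The approximate mass shifts quoted above follow from simple arithmetic: each methylation replaces one N−H hydrogen with a CH3 group, for a net gain of CH2. The short Python sketch below illustrates this using standard monoisotopic atomic masses; the function and dictionary names are hypothetical and chosen only for illustration.

```python
# Monoisotopic atomic masses in Da (standard values).
MASS_C = 12.000000
MASS_H = 1.007825

# Net change per methylation: one H is replaced by CH3, i.e. a gain of CH2.
CH2 = MASS_C + 2 * MASS_H

# Number of methyl groups carried by each methylarginine species.
METHYL_GROUPS = {"MMA": 1, "SDMA": 2, "ADMA": 2}


def methylation_mass_shift(state: str) -> float:
    """Return the monoisotopic mass (Da) added to one arginine residue."""
    return METHYL_GROUPS[state] * CH2


for state in METHYL_GROUPS:
    print(f"{state}: +{methylation_mass_shift(state):.4f} Da")
```

Because SDMA and ADMA carry the same number of methyl groups, they have identical mass shifts, so an intact mass difference alone cannot distinguish the two dimethylated forms.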
Moreover, the addition of methyl groups alters the shape of the arginine side chain. Due to the electron delocalization in the arginine guanidinium group and possible rotations around the central carbon−nitrogen bonds, distinct stereoisomers can occur in MMA and SDMA (Figure 23B). On the basis of density functional theory (DFT) calculations, the anti−syn conformation was shown to be the most favorable SDMA form, highlighted by a difference in the ground-state energies for the anti−anti and anti−syn SDMA conformations of 2.8 kcal mol−1. 105 A plausible reason for this difference may be steric effects that would disfavor the close proximity between both methyl groups in the anti−anti conformation. However, the activation energy for rotation around the bond between the central guanidinium carbon, Cζ, and one of the terminal nitrogens, Nω, is 14 kcal mol−1, indicating that interconversion of the different conformations can occur at room temperature. 105,106 In addition, the different stereoisomers present different hydrogen-bonding patterns that might be harnessed for specific protein interactions. For instance, the anti−anti SDMA conformation was shown to be preferentially bound to the Tudor domains of the human SMN and SPF30 proteins, whereas the extended Tudor domains of the Drosophila TUDOR protein and the human SND1 were shown to bind SDMA in the anti−syn conformation. 105,107
Most PRMTs generate ADMA and are classified as type I enzymes (PRMT1−4, PRMT6, PRMT8), with PRMT1 accounting for >50% of the normal steady-state levels of ADMA. 108 Of the remaining enzymes, PRMT5 and PRMT9 are the only known type II enzymes that catalyze the formation of SDMA, whereas PRMT7 is a type III enzyme that can only generate MMA on its substrates (Figure 22). 109 In addition to these human PRMT orthologues, yeast encodes a PRMT that catalyzes the monomethylation of the internal (Nε) guanidinium nitrogen atom. 110 Given the unique specificity of this enzyme, it has been classified as a type IV enzyme. Notably, however, sequence analyses indicate that there is no homologue of this type IV enzyme in higher eukaryotic organisms.
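To put the two energies quoted for SDMA into perspective, the following Python sketch converts the 2.8 kcal/mol conformer energy difference into an equilibrium population ratio and the 14 kcal/mol rotational barrier into an approximate interconversion rate. It is only a rough estimate under stated assumptions: the ground-state energy difference is treated as a free-energy difference, the barrier is inserted directly into the Eyring equation with a transmission coefficient of 1, and the temperature is taken as 298.15 K. None of these choices comes from the cited calculations, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3        # gas constant, kcal mol^-1 K^-1
K_B = 1.380649e-23          # Boltzmann constant, J K^-1
H_PLANCK = 6.62607015e-34   # Planck constant, J s


def boltzmann_ratio(delta_e_kcal: float, temp_k: float = 298.15) -> float:
    """Population ratio (lower-energy : higher-energy conformer)."""
    return math.exp(delta_e_kcal / (R_KCAL * temp_k))


def eyring_rate(barrier_kcal: float, temp_k: float = 298.15) -> float:
    """First-order rate constant (s^-1) from the Eyring equation, kappa = 1."""
    prefactor = K_B * temp_k / H_PLANCK
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temp_k))


# Assumed inputs taken from the text: 2.8 and 14 kcal/mol.
print(f"anti-syn : anti-anti population ratio ~ {boltzmann_ratio(2.8):.0f} : 1")
print(f"rotation rate at 298 K ~ {eyring_rate(14.0):.0f} per second")
```

Under these assumptions, the anti−syn form is favored by roughly two orders of magnitude, while the barrier still permits on the order of hundreds of rotations per second, which is consistent with the statement that the conformations can interconvert at room temperature.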

Structure−Mechanism of PRMTs
All PRMTs contain a conserved catalytic core region of approximately 310 amino acids, and several PRMTs possess additional domains (e.g., SH3, Zn finger, TIM barrel, and TPR) that have been suggested to diversify the substrate specificity of the enzymes and to regulate their activity (Figure 21). 98d Typically, PRMTs possess a single catalytic core region. PRMT7 and PRMT9, however, harbor two consecutive methyltransferase domains that may have arisen by gene duplication. 111 The catalytic core region comprises five highly characteristic signature motifs (Figures 19 and 22), including (i) motif I (VLD/EVGXGXG), which forms the base of the SAM-binding site and is structurally homologous to sequences found in other nucleotide-binding proteins, (ii) post I (L/V/IXG/AXD/E), which is important for hydrogen bond formation to each hydroxyl of the ribose part of SAM via the carboxylate of the acidic residue, (iii) motif II (F/I/VDI/L/K), which stabilizes motif I by the formation of a parallel β-sheet, (iv) motif III (LR/KXXG), which forms a parallel β-sheet with motif II, and (v) the THW loop, which is close to the active site cavity and helps stabilize the N-terminal helix, which is important for substrate recognition. 100
3.2.1. Structure of PRMTs. The crystal structures of several PRMTs revealed that these proteins mainly exist as homodimeric head-to-tail protein complexes (Figure 25A). 112 It was proposed that the dimer is critical for proper substrate binding and therefore is required for activity. 112c By contrast, PRMT7, the only known type III methyltransferase, is unusual in that it contains two PRMT core units arranged in tandem (Figure 21). Gel filtration profiles and small-angle X-ray scattering (SAXS) experiments indicated that the PRMT7 orthologue from C. elegans exists as a monomer in solution, even in the presence of SAM. 113 Notably, the recent crystal structures of mouse and C. elegans PRMT7 revealed that the second methyltransferase domain folds back onto the first catalytically active domain, and thereby forms a pseudodimeric form of the enzyme (Figure 26). 113,114 In mouse PRMT7, the two PRMT modules are bridged by a 19-residue linker. Overall, the general architecture is very similar to that of other known PRMT dimer structures. However, only the first (N-terminal) module is active, since the second PRMT module contains several mutations that impair SAM cofactor binding and does not contain a proper double E-loop, thereby rendering this module catalytically nonproductive.
114 Moreover, the crystal structures revealed a tight interaction between both modules, which are further stabilized by a zinc finger. 114 Interestingly, the interface between both PRMT modules does not contain the typical hole observed in the other PRMT dimer structures and might therefore restrict the flexibility and orientation of peptidyl substrates (Figure 26). 114 Notably, PRMT7 enzymes derived from plants and Trypanosoma are composed of a single PRMT module. Wang et al., however, observed that PRMT7 derived from Trypanosoma brucei exists as a dimer on the basis of SAXS analysis, and the available crystal structure of T. brucei PRMT7 (TbPRMT7) also confirms its dimeric organization (Figure 26). 115 Although TbPRMT7 is active as a homodimer, it strictly generates monomethylated arginine residues, indicating that the presence of tandemly arranged PRMT modules versus two active PRMT modules in trans orientation is not critical for determining product specificity. 114,115
Figure 24. Sequence alignment of the SAM-binding methyltransferase region of human PRMT family members. The product specificity-determining residue is highlighted with a blue asterisk below the alignment. Catalytic residues located on the double E-loop are highlighted with red asterisks below the alignment. The sequence alignment was generated using Clustal Omega and visualized using Espript 3.0. 229 The relative accessibility of each residue is depicted below the consensus motif: blue indicates accessible residues, cyan marks intermediately accessible residues, white stands for buried residues, and red indicates that the accessibility is not predicted.
On the basis of the available crystal structures, the catalytic core region consists of three structurally and functionally distinguishable regions as exemplified by the structure of PRMT1 (Figure 25A). The most critical is the SAM-binding domain, which is highly conserved in other SAM-dependent methyltransferases.
116 This SAM-binding domain adopts a typical Rossmann fold and is followed by the β-barrel domain, which is quite unique to the PRMT family and is thought to be important for substrate binding. 117 Moreover, the β-barrel domain contains an α-helical insertion that acts as a dimerization arm. Despite the variation in amino acid sequences (Figure 24), the crystal structures of several PRMTs reveal highly similar general folds. In addition, key structural features such as the active site double E-loop, which is critical for guanidinium binding, as well as the SAM-binding residues and several β-strand-forming signature motifs, are conserved among all PRMTs.
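The signature motifs of the catalytic core are written in a compact slash notation, in which "D/E" means either residue at one position and "X" means any residue. As an illustrative aid, the Python sketch below translates that notation into regular expressions that could be used to scan a protein sequence. The helper name, the dictionary, and the test sequence are hypothetical, and the parsing rule (a slash joins the residues on either side into one position) is an assumption about how the notation is meant to be read.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Convert slash notation (e.g. 'VLD/EVGXGXG') into a regex string."""
    positions = []        # allowed residues at each sequence position
    join_next = False     # True right after a '/', so the next residue is an alternative
    for ch in motif:
        if ch == "/":
            join_next = True
        elif join_next:
            positions[-1] += ch
            join_next = False
        else:
            positions.append(ch)
    parts = []
    for allowed in positions:
        if allowed == "X":
            parts.append(".")              # X = any residue
        elif len(allowed) == 1:
            parts.append(allowed)
        else:
            parts.append(f"[{allowed}]")   # alternatives at one position
    return "".join(parts)


# Motifs as written in the text.
MOTIFS = {
    "motif I": "VLD/EVGXGXG",
    "post I": "L/V/IXG/AXD/E",
    "motif II": "F/I/VDI/L/K",
    "motif III": "LR/KXXG",
}

for name, motif in MOTIFS.items():
    print(f"{name}: {motif} -> {motif_to_regex(motif)}")

# Hypothetical toy sequence used only to demonstrate the search.
toy_sequence = "MAAVLDVGSGTGIL"
match = re.search(motif_to_regex(MOTIFS["motif I"]), toy_sequence)
print(match.group() if match else "no match")
```

The shorter motifs (for example motif II, only three positions long) will match many unrelated sequences by chance, so such patterns are best read as a screening aid rather than as evidence that a protein is a PRMT.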

Within the active site are a number of conserved residues that are important for SAM binding, catalysis, and maintaining the overall architecture of the PRMT1 active site (Figure 27). 112c These residues include E129 and V128, which are both located on a loop preceding motif II and directly interact with the SAM adenine ring, and E100, which forms a bidentate hydrogen bond to the ribose moiety. The side chain of H45 projecting from an N-terminal helix, denoted αY, also hydrogen bonds to a ribose hydroxyl group. Moreover, proper positioning of the SAM cofactor is further mediated by the side chain of M155. The methionine portion of SAM is bound by the side chain of D76, which recognizes the free α-amine, whereas R54 forms a bidentate interaction with the carboxylate group. R54 also hydrogen bonds with the side chain of E144 to orient the γ-carboxylate of this residue for optimal electrostatic and hydrogen bond interactions with the ω-nitrogen atom of a substrate arginine. E144 is part of the substrate-binding double E-loop, which also harbors E153. The side chain carboxylates of both of these residues are thought to recognize and align the arginine guanidinium substrate for proper catalysis to occur. Notably, in the structure of PRMT1, the position of E153 does not appear to be catalytically competent since the orientation of its side chain is out of the active site. 112c
3.2.2. PRMT Substrate Recognition. On the basis of sequence motif analysis, there are no conserved residues surrounding the sites of arginine methylation. 118 The only exception is glycine, which is slightly enriched, especially at positions +1 and +2 following the arginine substrate. However, arginine methylation is often found in unstructured protein regions, including loops, as well as N- or C-terminal regions. 119 The lack of structured segments can be rationalized by the deep active site cavity of the PRMTs, which only allows arginine residues present on kinked loops to enter.
This structural restraint was highlighted by the peptide-bound PRMT5·MEP50 complex, which revealed the presence of a characteristic β-turn in the peptide substrate (Figure 28). 120 Specifically, the substrate arginine is located at the tip of the β-turn, which is stabilized by a hydrogen bond between the main chain carbonyl oxygen of S1 and the backbone amide nitrogen of G4. In addition, the carbonyl oxygen and the amide nitrogen of S1 form a bidentate hydrogen bond to the Q309 amide side chain group of PRMT5. Similar to the PAD4−substrate peptide interaction, the majority of the interactions between the substrate peptide and PRMT5 are mediated by peptide backbone interactions, rationalizing the lack of strict sequence specificity. The only exception is a direct hydrogen bond between the side chain of K5 and the carbonyl oxygen of PRMT5 P311. Interestingly, PRMT5 also utilizes several backbone hydrogen bond interactions originating from the main chain of S310, P311, L312, and F580. This observation may explain the lack of sequence conservation of these residues among different PRMT members. Moreover, the hydroxyl group of Y307 hydrogen bonds to the backbone carbonyl oxygen and amide nitrogen of substrate residues G6 and K8, respectively. Thus, the fact that glycine-rich sequences are preferentially targeted is consistent with their ability to endow the polypeptide chain with enhanced conformational freedom, which would facilitate the formation of such β-turn structures. The context in which an arginine is placed is therefore important for substrate recognition. Moreover, the preference for peptide stretches that can adopt β-turn-like structures coincides with the substrate specificity of PADs, implying that PRMTs and PADs might compete for similar substrates. This appears to be the case because, as noted above, in several instances the methylation and citrullination status of specific arginines in histones are inversely correlated and possess distinct and opposing effects on transcription (see section 2.6.1).
In addition to these structural constraints, local sequences next to the modified arginine are also important for substrate recognition by some PRMTs. As mentioned above, PRMTs, such as PRMT1, PRMT3, PRMT5, PRMT6, and PRMT8, typically target glycine-rich sequences. 98d,121 By contrast, PRMT4 prefers to methylate arginines embedded within proline-, glycine-, and methionine-rich motifs. 122 Moreover, crosstalk with lysine acetylation was shown to be critical to enhance H3 peptide substrate methylation by PRMT4 at H3R17, as kinetic studies revealed that PRMT4 has a 5-fold higher activity toward an H3K18-acetylated peptide than the unmodified peptide. 112d Interestingly, the only type III enzyme, PRMT7, specifically recognizes arginines within an RXR motif present in H2B and H4 (see section 3.6). 123 There, it was shown that PRMT7 preferentially modifies the N-terminal arginine of the RXR motif (i.e., H2BR29, H2BR31, and H4R17). Notably, efficient catalysis depends on the presence of the second arginine since the replacement of the second arginine by lysine leads to a significant reduction in the methylation signal. 123 Although the cocrystal structure of TbPRMT7 bound to a 21-residue histone H4 peptide was recently solved, only the first four residues (SGRG) of this peptide could be observed. 115 This structure revealed that the peptide substrate forms a wide turn on the surface of the active site, but not a characteristic β-turn (Figure 29). The side chain guanidinium of the arginine substrate forms five hydrogen bonds to both double E-loop glutamate residues (E172 and E181) and glutamine Q329, which occupies the same position as the central histidine in the THW loop. Mutagenesis studies further revealed that the glutamine residue Q329 can be substituted by a histidine without loss of activity. 115 Similar to that in PRMT5, the peptide substrate in PRMT7 is also recognized by several hydrogen bond interactions with the substrate amide backbone.
Specifically, the carboxylate of D70, originating from helix αY, makes hydrogen bonds with the main chain amides of residues S1 and G2, and T176, situated on the double E-loop, hydrogen bonds to the backbone carbonyl and the amide of the substrate residue R3. These data indicate that main chain interactions are critical for substrate binding, and the lack of a β-turn in the peptide substrate indicates that PRMT7 may accommodate a wider range of peptide sequences apart from glycine-rich elements.
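The RXR preference described above for PRMT7 lends itself to a simple sequence scan. The Python sketch below, offered only as an illustration, lists the residue number of the N-terminal arginine of every R-X-R motif in a sequence. A zero-width lookahead is used so that overlapping motifs are all reported. The function name, the toy fragment, and its starting residue number are hypothetical inputs chosen for the example.

```python
import re


def rxr_sites(sequence: str, first_residue_number: int = 1) -> list[int]:
    """Return residue numbers of the first arginine in every R-X-R motif."""
    # The lookahead (?=...) consumes no characters, so overlapping
    # motifs such as R-K-R-S-R are each found.
    return [
        match.start() + first_residue_number
        for match in re.finditer(r"(?=R.R)", sequence)
    ]


# Hypothetical toy fragment; numbering is assumed to start at residue 27.
toy_fragment = "KKRKRSRKES"
print(rxr_sites(toy_fragment, first_residue_number=27))  # -> [29, 31]
```

A motif match of this kind only flags candidate sites; as the text notes, substrate recognition also depends on main chain interactions and on the second arginine of the motif.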
For most PRMTs, substrate recognition relies to a great extent on remote sequences (>14 residues distant from the arginine). 124 This stands in contrast to the PADs, where long-range interactions are unimportant. 19 These distal elements are typically positively charged and are thought to interact via ionic interactions with several negatively charged patches found on PRMT1 (Figure 25B). 112c In this respect, mutation of acidic residues in PRMT1 leads to compromised enzyme activity or altered methyltransferase substrate specificity. 125 Notably, efficient PRMT-mediated methylation reactions require long peptide substrates, i.e., 21 residues of the N-terminal tail of histone H4 (AcH4-21), to achieve H4R3 methylation kinetics comparable to that of full-length H4. 124b N-terminal truncation of the two residues preceding the methylated arginine residue (H4−21Δ(1−2)) decreased the methylation efficiency by ∼10⁴-fold, thereby indicating that interactions between the enzyme and the backbone amide N-terminal to the site of methylation are critical for substrate recognition as confirmed by structural analysis. A similar approach employing C-terminal truncation peptides revealed that removal of three residues (AcH4-18) decreased arginine methylation by 150-fold in human PRMT1, while further truncation by three additional residues (AcH4-15) further diminished activity by 860-fold. 124b These data clearly indicate that remote sequences are critical for efficient substrate capture, by mediating strong charge−charge interactions between the acidic patches in the SAM-binding domain as well as the β-barrel domain of PRMT1 (as described above) and positively charged residues such as K16, R17, R19, and K20 in the distal portion of the H4 tails.
112c,124b,125 Since the great majority of PRMTs form dimers, it is conceivable that this increased amount of distal substrate-binding sites, represented by acidic patches in PRMT1, facilitates the processive dimethylation of Arg residues by allowing the product of the first methylation reaction, monomethylarginine, to enter the active site of the second molecule of the dimer without releasing the substrate from the homodimer. This model is further supported by the high affinity, also reflected by a low KM value, between substrates and the PRMT dimer and the loss of activity of engineered PRMT monomers. 112b,c Osborne and colleagues further revealed that PRMT1 employs a partially or semiprocessive mechanism, indicating that the substrate stays bound to the enzyme for two consecutive methylation reactions, and the major product released is ADMA. 124b Similar results have been obtained for PRMT6 as this enzyme has also been suggested to show some processivity. 126 By performing double-turnover reaction experiments, the Hevel group observed both MMA and ADMA formation, confirming a general semiprocessive mechanism of PRMT1 catalysis. 127 Interestingly, the proportions of MMA and ADMA generated are different among distinct peptide substrates. For example, the N-terminal H4 peptide SGRGKGGKGLGKGGAKR is preferentially processed through a processive mechanism, leading to a dimethylated product, while other sequences such as the fibrillarin-based RKK peptide GGRGGFGGKGGFGGKW partition more frequently through a distributive mechanism where both monomethylated and dimethylated products are ultimately generated. These observations clearly indicate that the degree of processivity is controlled in a substrate-dependent manner and thus distinct patterns of methylation can be deposited by the same PRMT enzyme.
127 Thus, conflicting observations regarding the processive or distributive nature of PRMT1 activity may be partially due to the different substrates used and hence changes in the substrate-induced processivity rate. By contrast, recent data indicate that PRMT5 uses a distributive mechanism to catalyze the symmetric dimethylation of histone H4 and its Nterminal peptide fragment. 120,124c, 128 3.2.3. PRMT Product Selectivity. Structural comparison of several PRMTs reveals similar active site architectures, thereby suggesting that these enzymes employ a similar catalytic mechanism to effect substrate methylation ( Figure  30). 112a,c,e,114 Interestingly, the catalytic domain of PRMT5, which catalyzes SDMA formation, is also highly similar to that of the ADMA-generating type I enzymes as judged by the very high degree of consersation between the available crystal structures ( Figure 31). 120,129 However, the molecular basis for their distinct product formation paths is largely unknown. Recent reports proposed that a conserved phenylalanine (F327) in the active site of PRMT5 is important for directing symmetric dimethylation, while a methionine (M48) residue in PRMT1 confers specificity toward asymmetric dimethylation. 129,130 These observations imply that the generation of symmetrically and asymmetrically dimethylated arginine residues share a common catalytic mechanism, as type I and type II mutant enzymes are capable of performing both of these reactions. Notably, on the basis of the PRMT5 structure, F327, which has been shown to be important for specifying the symmetric dimethylating activity of PRMT5, interacts with the substrate guanidinium group, thereby orienting it for methyl transfer ( Figure 31). 120 Quantum mechanical models further revealed that the free energy of activation, ΔG ⧧ , for SDMA formation is 13.4 kcal/mol, and therefore, SDMA formation is more energetically costly than ADMA formation, for which a ΔG ⧧ of 10.2 kcal/mol was calculated. 
130 The higher energy barrier for forming SDMA over ADMA presumably explains the low amount of SDMA formed by a PRMT1M48F mutant and is also reflected by the 160-fold slower rate for symmetric dimethylation, using monomethylated arginine as the substrate, compared to the monomethylation rate observed with PRMT5. 131 By contrast, the PRMT1-catalyzed rate of dimethylation is only 2−4-fold slower than the rate of the monomethylation reaction. 124b Moreover, introduction of an F379M substitution into PRMT5 partially shifts the product formation specificity such that this enzyme can now generate ADMA, albeit with reduced efficiency compared to that of PRMT1. 129 However, it still remains to be determined what other structural features, apart from the mentioned phenylalanine to methionine switch, contribute to product specificity, especially since PRMT9, the other type II enzyme, does not contain a phenylalanine but a methionine at this position. As such, it was recently proposed that subtle differences in the size of the arginine-binding pocket, mainly due to alterations of the THW loop, may be important for controlling the product selectivity of the PRMTs. 113,114 In this respect, it is interesting to note that all PRMTs, except the type II enzymes PRMT5 and PRMT9, contain a histidine in the THW loop. PRMT5 and PRMT9 possess serine and cysteine residues at the corresponding site, respectively. It is tempting to speculate that the bulky side chain of histidine impairs binding of MMA, whereas the much smaller side chains of serine and cysteine allow for proper accommodation of a symmetrically dimethylated guanidinium group in the active site pocket (Figure 32).
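To put the two computed barriers quoted above on a common scale, a back-of-the-envelope transition-state-theory estimate can be useful. The sketch below is an illustration only and is not part of the cited calculations: it assumes that the Eyring prefactors of the two pathways are identical, so that the rate ratio depends only on the difference in ΔG‡, and it assumes a temperature of 298.15 K. The function name `rate_ratio` is ours.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def rate_ratio(dg_slow, dg_fast, temperature=298.15):
    """Ratio k_fast/k_slow from two activation free energies (kcal/mol).

    Assumes identical Eyring prefactors for both pathways, so only the
    barrier difference contributes to the ratio.
    """
    ddg = dg_slow - dg_fast
    return math.exp(ddg / (R_KCAL * temperature))

# Barriers quoted in the text: SDMA 13.4 kcal/mol, ADMA 10.2 kcal/mol.
ratio = rate_ratio(13.4, 10.2)
print(f"ADMA formation is predicted to be ~{ratio:.0f}-fold faster than SDMA formation")
```

Under these assumptions, a 3.2 kcal/mol difference in barrier height corresponds to a rate difference of roughly two orders of magnitude. This is only an order-of-magnitude guide; the experimental rate comparisons cited above involve different enzymes and reaction steps and should not be read as a direct test of the calculation.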

Proposed Catalytic Mechanism of PRMTs.
PRMTs employ a bisubstrate mechanism, transferring the methyl group of SAM to specific arginine residues in histone and nonhistone protein substrates, resulting in mono- and dimethylated arginine residues and the byproduct S-adenosyl-L-homocysteine (SAH or AdoHcy). According to the enzyme classification nomenclature, PRMTs belong to the class of transferases that transfer one-carbon groups (EC 2.1.1.125). Notably, PRMT1, PRMT5, and PRMT6 employ rapid equilibrium random kinetic mechanisms wherein substrate binding and product release occur in a random fashion, 131,132 although it should be noted that prior work with PRMT6 suggested that this enzyme utilized an ordered kinetic mechanism in which SAM binds first and SAH dissociates last from the enzyme. 133 However, this study only used product inhibition patterns to assign the order of substrate binding and product release, and the overall quality of the observed inhibition patterns is quite poor. Thus, PRMT6, like PRMT1 and PRMT5, most likely binds its substrates in a random fashion. Notably, though, the PRMT4-catalyzed reaction has also been suggested to proceed via an ordered sequential mechanism where SAM binding is the first step and SAH is the last product to leave the enzyme. 112d
The PRMT1 catalytic mechanism proceeds via a bimolecular nucleophilic substitution (SN2) methyl transfer reaction (Figure 33). The two invariant glutamate residues E144 and E153 are hypothesized to localize the positive charge of the guanidinium group to one ω-nitrogen atom, thereby leaving a lone pair of electrons on the other terminal nitrogen to attack the methylsulfonium group of SAM. Originally, it was proposed that the substrate guanidinium is deprotonated and thereby activated by the carboxylate of E144. However, using solvent isotope effect experiments, Rust et al. suggested that general acid/base catalysis is not important for promoting methyl transfer in PRMT1. 134 Instead they proposed that the PRMT1-catalyzed reaction is primarily driven by proper substrate guanidinium alignment by E144 and E153 with respect to the S-methyl group of SAM and that the prior deprotonation of the substrate guanidinium group is not required for methyl transfer. Subsequent quantum mechanical (QM) calculations indicated that E144 abstracts a proton from the reacting arginine immediately after methyl transfer, consistent with the mechanism proposed by Rust et al. 135 These QM studies also suggested that the guanidinium loses planarity in the transition state, as predicted by the observed inverse solvent isotope effect. 134,135 The positioning of the ω-nitrogen of the guanidinium group to attack the SAM methyl group ultimately results in arginine N-methylation and generation of the byproduct SAH.
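For orientation, the rapid equilibrium random mechanism discussed earlier in this section for PRMT1, PRMT5, and PRMT6 has a standard textbook rate law for two substrates A and B (here, SAM and the peptide). The sketch below simply evaluates that rate law. The function name, the parameter names, and all numerical values are hypothetical placeholders chosen for illustration; they are not measured constants for any PRMT.

```python
def random_bibi_rate(a, b, vmax, k_a, k_b, alpha=1.0):
    """Initial velocity for a rapid equilibrium random Bi Bi mechanism.

    a, b     : concentrations of the two substrates (e.g., SAM and peptide)
    k_a, k_b : dissociation constants of the binary E.A and E.B complexes
    alpha    : interaction factor; alpha < 1 means that binding of one
               substrate strengthens binding of the other
    """
    denominator = alpha * k_a * k_b + alpha * k_b * a + alpha * k_a * b + a * b
    return vmax * a * b / denominator

# Hypothetical values in arbitrary units, for illustration only.
v_at_kd = random_bibi_rate(a=1.0, b=1.0, vmax=1.0, k_a=1.0, k_b=1.0)
v_saturating = random_bibi_rate(a=1e6, b=1e6, vmax=1.0, k_a=1.0, k_b=1.0)
print(v_at_kd, v_saturating)
```

The expression treats the two substrates symmetrically, reflecting the fact that in a random mechanism either substrate can bind first; the velocity approaches `vmax` only when both substrates are saturating.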

Is Protein Arginine Methylation a Reversible Modification?
Several PTMs, such as phosphorylation, ubiquitination, and lysine acetylation, are reversible; for arginine methylation, however, PRMTs act as writers, but the existence of a corresponding eraser that removes the methyl group is still controversial. There was a report claiming that the iron- and α-ketoglutarate-dependent dioxygenase Jumonji domain 6 (Jmjd6) protein acts as an arginine demethylase (eraser). 136 However, subsequent detailed analyses revealed that this enzyme is in fact a lysine hydroxylase and does not erase the methyl mark from methylated arginine residues. 137 As described above, PAD4 was also suggested to convert methylated arginines to citrulline, 38 but this activity is unlikely to be physiologically relevant owing to its extraordinarily low rate. 19,39b Therefore, PAD activity antagonizes arginine methylation by competition with PRMTs but does not directly convert methylarginines into citrulline. 19,28,41 Nonetheless, due to the dynamic appearance and disappearance of methylarginine marks, 41,138 the existence of an arginine demethylase is [...] 100
Examples of aminomethyl demethylases are found in nature, and two separate classes of histone lysine demethylases are known to exist. 139 The first class was discovered in 2004 and consists of the lysine-specific demethylases (LSDs), which are flavin adenine dinucleotide (FAD)-dependent amine oxidases that remove mono- and dimethylation marks (Figure 34A). 140 The FAD cofactor oxidizes the methyllysine to form an imine intermediate, which is then hydrolyzed to yield unmodified lysine and formaldehyde. The resulting reduced FADH2 is reoxidized by molecular oxygen, thereby forming hydrogen peroxide as a byproduct. The second group comprises the Jumonji C-terminal domain (JmjC) family of histone demethylases, which use iron and α-ketoglutarate as cofactors, thereby acting as oxygenase enzymes to remove mono-, di-, and trimethyl groups from methylated lysine residues (Figure 34B). 141 The JmjC-catalyzed demethylation reaction involves oxidative decarboxylation of α-ketoglutarate, coupled to hydroxylation of the methyl group, generating an unstable hydroxymethylammonium intermediate, which decomposes to release formaldehyde.

Methylarginine-Binding Proteins
In contrast to the still-elusive arginine demethylase, there is strong structural and biochemical evidence to support the existence of methylarginine readers (Figure 35). 107,142 For example, several members of the Tudor protein family specifically recognize methylated arginine residues. These proteins contain a conserved Tudor domain, which is responsible for either methylarginine or methyllysine binding. 142c On the basis of sequence analysis, however, it is not possible to unequivocally predict the binding specificity of individual Tudor domains. Structural studies of Tudor domains revealed that an aromatic cage surrounds the methylarginine, thereby forming extensive cation−π and hydrophobic interactions (Figure 36). 105 In addition, it was shown that the hydrophobic cage for methylarginine recognition is much narrower (diameter ∼7 Å) than the cage for methyllysine (diameter >8 Å), thus favoring binding of the planar guanidinium group. 105 The structure of ADMA-bound Tudor domains is a good example of how nature utilizes noncovalent cation−π interactions to recognize the arginine guanidinium group. 143 In this respect, it was shown that methylation of arginine residues increases the strength of cation−π interactions compared to that for unmodified arginine. 144 Notably, the high affinity and specificity in the binding of trimethylated lysine to chromodomains that harbor a similar aromatic cage structure also depend on favorable cation−π interactions. 145 The strong contribution of multiple cation−π interactions might also provide a plausible mechanism to exclude an uncharged peptidyl citrulline from binding to Tudor domains, as it cannot form this type of interaction, although it should be noted that this has not been systematically tested.
Recently, it was shown that the TDRD3 protein recognizes ADMA-methylated histone H4 tails via its Tudor domain, thereby activating transcription. 146 Although most of the Tudor domains prefer SDMA, it was recently shown that TDRD3 also recognizes ADMA in H3 (H3R17me2a) and H4 (H4R3me2a). 146,147 Moreover, it was shown that the PHD domain of the DNA methyltransferase DNMT3A can bind to the repressive H4R3me2s mark. 148 There, it was proposed that PRMT5 generates H4R3me2s on targeted promoters, which is then recognized by DNMT3A. The recruited DNMT3A promotes DNA methylation, thereby inducing gene silencing. However, a subsequent study by Otani et al. could not confirm any interaction between the DNMT3A PHD domain and an H4R3me2s peptide. 149

3.5. Chemical Probes−Inhibitors for PRMTs

3.5.1. Inhibitors of PRMTs. Since PRMTs affect a plethora of different target genes, it is unsurprising that, when dysregulated, they play a role in human disease. In fact, dysregulated PRMT activity has been causally linked to the development and progression of numerous cancers, as well as to viral replication and cardiovascular disease. Therefore, the PRMTs constitute promising targets for drug discovery, and inhibitor development is at the frontline of current PRMT research. 8b One of the earliest PRMT inhibitors to be described is autogenerated by the enzyme during catalysis. As described above, the methyl donor substrate SAM is converted to SAH (21) (Figure 37), which represents a potent feedback inhibitor of PRMT activity and accumulates as an inevitable byproduct of protein methylation. In cells, SAH levels can be raised by blocking its degradation via SAH hydrolase. In fact, adenosine dialdehyde (22) is a potent SAH hydrolase inhibitor that is often used in cell studies to increase the amount of intracellular SAH. Higher SAH levels in turn result in feedback inhibition of most SAM-dependent methylation reactions including PRMT activity. SAM analogues, such as methylthioadenosine (MTA, 23) and sinefungin (24), also function as general PRMT inhibitors (Figure 37); however, they exhibit limited specificity, due to their structural homology to SAM. Thus, they inhibit numerous other SAM-dependent methyltransferases, thereby affecting the cellular methylation of phospholipids, proteins, DNA, and RNA as well. As such, SAM-mimicking methyltransferase inhibitors display limited specificity by indiscriminately inhibiting all SAM-utilizing enzymes.
Figure 36. SMN Tudor domain bound to the asymmetric dimethylated arginine residue (PDB code 4A4G). The left panel illustrates the Tudor domain colored according to its electrostatic surface potential. The image on the right highlights the ADMA-interacting residues that form a hydrophobic cage around the methylated guanidinium group.
Figure 37. Nonselective PRMT inhibitors. Note that neither adenosine dialdehyde (22) nor AMI-1 (25) is a direct PRMT inhibitor. Adenosine dialdehyde blocks the activity of SAH hydrolase, which induces an increase in SAH levels, thereby inhibiting PRMT activity. AMI-1 binds to the histone substrates and prevents recognition by the PRMT enzyme.
To obtain PRMT-selective inhibitors, several approaches, including both virtual and high-throughput screens as well as substrate analogue inhibitor design, were pursued. One of the first high-throughput screens directed against the yeast type I PRMT, Hmt1p, resulted in the identification of several arginine methyltransferase inhibitors, denoted as AMI. 150 Most of these inhibitors were highly nonspecific and also blocked the activity of the protein lysine methyltransferases. AMI-1 (25), however, a symmetric naphthalenesulfonate molecule, inhibited PRMT1 with an IC50 of 8.8 μM but did not diminish the activity of distinct lysine methyltransferases. In addition, AMI-1 blocked the activity of PRMT3, PRMT4, and PRMT6. It was also shown that AMI-1 is cell-permeable and that it inhibits endogenous PRMT1 activity in a concentration-dependent manner. 150 However, on the basis of circular dichroism, fluorescence, and absorption spectral analysis, Feng et al. unequivocally determined that AMI-1, and related naphthalenesulfonate derivatives, target the substrate instead of the PRMT enzyme and are therefore not direct PRMT inhibitors. 151 Notably, AMI-1 forms a complex with an H4 peptide substrate, most likely via bidentate electrostatic interactions between its sulfonate groups and the arginine guanidinium groups present in the peptide substrate, thereby blocking substrate access to the PRMTs. 151
Fragment-based virtual screening identified RM65 (26) as a cell-permeable PRMT1 inhibitor. 152 Docking studies suggest that 26 is a competitive inhibitor, occupying both the SAM-binding and the substrate arginine-binding sites. Compound 26 was also shown to reduce histone H4R3 methylation in HepG2 cancer cells at concentrations above 100 μM. Using a similar virtual docking and pharmacophore-based filtering approach, the Jung group identified the diamidine stilbamidine (27) and allantodapsone (28) as PRMT1 inhibitors. 153 Both compounds are competitive for the protein substrate, but not for the cofactor SAM. Moreover, these inhibitors are active in a functional assay of estrogen receptor activation and also reduced the cellular methylation of R3 in histone H4 at concentrations below 50 μM while having minor effects on the lysine methylation at H3K4. 153 Further optimization resulted in the generation of even more potent inhibitors, such as the dapsone derivative 29, which inhibits PRMT1 with an IC50 of 1.5 μM. 154
The Thompson group observed that the SAM congener 5′-(diaminobutyric acid)-N-(iodoethyl)-5′-deoxyadenosine ammonium hydrochloride (AAI, 30) blocks PRMT1 activity with an IC50 of 18.5 μM and a 4.4-fold preference for PRMT1 over CARM1 (Figure 38). 155 This compound is thought to form a reactive aziridinium moiety that is susceptible to nucleophilic attack. 156 Interestingly, upon incubation with PRMT1 and H4 peptide substrate, this compound reacts with the incoming arginine substrate (H4R3) in situ in an enzyme-dependent manner, thereby autogenerating an effective bisubstrate inhibitor within the enzyme active site (Figure 38). The ability of PRMT1 to chemoenzymatically generate an effective bisubstrate analogue inspired the development of defined bisubstrate derivatives. In this respect, the partial-bisubstrate analogue 31, which comprises an arginine-containing peptide fragment that was linked to the amino acid moiety of SAM, was shown to block PRMT1 activity with an IC50 of 14 μM (Figure 39). 157 This compound shows limited PRMT selectivity and also blocks PRMT4 and PRMT6 with similar efficiency. In addition, Dowden and colleagues reported the development of SAM derivatives conjugated to a guanidinium group via varying carbon linkers (32−34). 158 Comparison of the generated derivatives revealed that a four-carbon spacer, present in 33 between the guanidinium group and the SAM analogue, is most suited for efficient PRMT inhibition (IC50 = 2.9 μM). Though the selectivity of this inhibitor for other PRMTs was not evaluated, it did not show substantial inhibitory activity against the lysine methyltransferase SET7. 158 The Martin group recently developed a partial bisubstrate inhibitor where the SAM adenosine moiety is connected to the guanidinium group (35). 159 Interestingly, 35 was more potent than the previously described (partial) bisubstrate inhibitors with IC50 values of 1.3 μM, 560 nM, and 720 nM for PRMT1, PRMT4, and PRMT6, respectively. Moreover, this compound did not display any measurable inhibitory effect on the lysine methyltransferase G9a; however, it also did not show any inhibitory effect on cell proliferation using MCF7 and Caco2 cells. 159 Although bisubstrate analogues represent interesting tools to analyze PRMT activity, they have several limitations, including a lack of selectivity within the PRMT family and, for peptide-based bisubstrate inhibitors, undesirable pharmacological properties. Therefore, great efforts have recently been made to obtain potent and isozyme-selective PRMT inhibitors.
The stilbamidine derivative furamidine 36 contains two amidine moieties and is specific for PRMT1 with an IC50 of 9.4 μM (Figure 40). 160 Another potent PRMT1 inhibitor is a peptide-based haloacetamidine-containing compound dubbed C21 (37). 161 This compound is derived from the N-terminal sequence of histone H4 with a chloroacetamidine-modified residue in place of H4R3 to serve as a reactive warhead. In contrast to most other PRMT inhibitors, C21 acts as an irreversible inhibitor that forms a covalent bond with a hyperreactive cysteine residue, C101, 162 present in the active site of the enzyme. 163 In addition, despite the presence of a chloroacetamidine warhead, which is also present in the most potent PAD inhibitors (see above), C21 is selective for PRMT1 (IC50 = 1.8 μM) and displays poor inhibitory activity toward the PAD enzymes (IC50 for PAD4 = 145 μM). 161 Notably, C21 selectively inhibits cellular PRMT1 activity over PRMT4 when delivered into cells with peptide transfection reagents. 161 However, C21 is only ∼5-fold more selective for PRMT1 over PRMT6. To screen for more selective PRMT1 inhibitors, Bicker and colleagues employed a combinatorial peptide library approach. 164 The identified hit, denoted C21-1F (38), contains a phenylalanine instead of a glycine at the R-1 position (Figure 40). Although C21-1F is slightly less potent than C21, it is 3 times more selective for PRMT1 over PRMT6, indicating that residues around the substrate arginine might be exchanged or modified to develop isozyme-selective PRMT inhibitors.
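Selectivity figures such as those quoted for C21 are ratios of IC50 values. As a minimal illustration, the snippet below recomputes the PRMT1-versus-PAD4 ratio from the two IC50 values given above. The helper name `fold_selectivity` is ours, and an IC50 ratio is only a rough proxy for selectivity because IC50 values depend on assay conditions and, for an irreversible inhibitor such as C21, on the incubation time.

```python
def fold_selectivity(ic50_off_target, ic50_target):
    """Fold selectivity as the ratio of off-target to on-target IC50 (same units)."""
    if ic50_target <= 0 or ic50_off_target <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_off_target / ic50_target

# IC50 values quoted in the text for C21, in micromolar.
c21_ic50 = {"PRMT1": 1.8, "PAD4": 145.0}
selectivity = fold_selectivity(c21_ic50["PAD4"], c21_ic50["PRMT1"])
print(f"C21 is ~{selectivity:.0f}-fold selective for PRMT1 over PAD4")
```

The same ratio can be formed for any pair of enzymes measured under comparable conditions, which is how statements such as "∼5-fold more selective for PRMT1 over PRMT6" are usually derived.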
With respect to PRMT3, Siarheyeva et al. employed a library screening approach to identify selective inhibitors targeting this enzyme. 165 Optimization of the initial hit compounds yielded inhibitor 39. 166 Notably, this inhibitor is noncompetitive with respect to both SAM and the peptide substrate and binds to an allosteric site present in PRMT3 (Figure 41). This compound is highly selective for PRMT3, and detailed structural analyses showed that 39 binds to an allosteric pocket at the base of the dimerization arm between two PRMT3 subunits. There, 39 interacts with and distorts the activation helix, which is critical for proper SAM binding. It is thought that 39 induces conformational constraints on this α-helix that prevent formation of a catalytically competent state. 166 The isoquinoline moiety of 39 forms a buried hydrogen bond with T466, the urea group forms hydrogen bonds with the side chains of E422 and R39, and the pyrrolidine amide pushes against the α-helix (Figure 41). To test its cellular efficacy, 39 was evaluated for target engagement using an InCELL Hunter assay. There, it was shown that 39 stabilized PRMT3 in HEK293 cells with an EC50 value of 1.3 μM. Moreover, 39 can block the PRMT3-dependent dimethylation of endogenous histone H4 at H4R3 with an IC50 of 225 nM. These data demonstrate that 39 is a potent, selective, and cell-active allosteric inhibitor of PRMT3 that is suitable for further cell or animal studies.
High-throughput screening efforts also led to the identification of pyrazole and benzimidazole derivatives as potent PRMT4 inhibitors. 167 Subsequent optimization of the initial hit compounds resulted in nanomolar inhibitors such as the indole derivative 40 and the pyrazole derivative 41, which possess IC50 values of 30 and 27 nM, respectively (Figure 42). 168 Interestingly, both of these inhibitors bind to the substrate arginine-binding cavity of PRMT4 and require the presence of bound cofactor SAH (Figure 42). 168 Structural studies of PRMT4 in complex with sinefungin and 40 revealed that the N-methylethanamine moiety of the inhibitor is directed toward the bottom of the arginine-binding cavity and directly interacts with the active site residue E258. The piperidine group is positioned at the entrance of the active site cavity and hydrogen bonds to H415 of the THW loop, whereas the indole moiety makes hydrophobic interactions with several aromatic side chains of PRMT4 and forms a water-mediated hydrogen bond to the main chain carbonyl of K471 and the side chain of N266. The crystal structure of PRMT4 bound to 41 revealed that the terminal L-alaninamide moiety mimics the arginine guanidinium group and makes several polar interactions with PRMT4, including the carboxyl groups of the double E-loop residues E258 and E267. In addition, the alanylmethyl group of 41 forms CH···O hydrogen bonds to the hydroxyl oxygen of Y154 and the backbone carbonyl oxygen of M260, while the carbonyl oxygen of the L-alaninamide hydrogen bonds with one of the imidazole nitrogens of H415. Notably, the (trifluoromethyl)pyrazole, 1,3,4-oxadiazole, and indole scaffolds are thought to mainly interact with PRMT4 via shape complementarity rather than by polar interactions, except for two hydrogen bonds formed between the side chain of Y262 and the oxadiazole oxygen atom and between the hydroxyl of Y477 and a fluorine of the trifluoromethyl group.
Very recently, researchers from Epizyme developed selective PRMT5 inhibitors containing a di- or tetrahydroisoquinoline−hydroxypropyl−arylcarboxamide scaffold. 169 One of the most effective compounds, EPZ015666 (42, Figure 43), was shown to act as a very potent (Ki = 5 nM), selective (>20000-fold compared to other protein methyltransferases), and orally bioavailable inhibitor of PRMT5. 170 Interestingly, the tetrahydroisoquinoline moiety of 42 is thought to directly bind, via π−π stacking interactions, to the characteristic F327 residue that is present in PRMT5 but absent in other PRMTs. 170 In addition, it was demonstrated that 42 reduces cellular SDMA levels in Z-138 cell lines with an IC50 of 44 nM and displays robust antitumor activity in mantle cell lymphoma (MCL) xenograft mouse models. Thus, EPZ015666 has cellular activity and in vivo efficacy and represents a promising lead compound for the development of PRMT5 inhibitors as potential cancer therapeutics.
In 2015, Alinari and colleagues also reported the discovery of a selective PRMT5 inhibitor employing a structure-based virtual screening approach. 171 After initial cell testing assays, CMP5 (43) was identified as the top hit and used for further characterization. It was shown that 43 is selective for PRMT5 and does not block PRMT1, PRMT4, or PRMT7 at concentrations below 100 μM. Moreover, on the basis of modeling studies, this compound is predicted to occupy the SAM-binding pocket via its carbazole ring, while the pyridine ring is thought to form π-stacking interactions with the characteristic F327 of PRMT5. Inhibitor 43 was selectively toxic to lymphoma cells and killed Pfeiffer cells and SUDHL2 cells with IC50 values of 30 and 35 μM, respectively. 171 In addition, CMP5 treatment reduces the cellular level of H4R2me2s as well as H3R8me2s marks, thus highlighting its cellular efficacy.
3.5.2. Chemical Probes for the PRMTs. Since C21 is an irreversible, covalent, and highly selective PRMT1 inhibitor, it was adapted for use as a chemical probe of PRMT1 activity by attaching fluorescein (F-C21, 44) and biotin (B-C21, 45) reporter tags (Figure 44). 172 Although F-C21 labels recombinant PRMT1, the fluorescent probe cannot detect cellular PRMT1, likely due to low levels of the active protein. However, B-C21 can efficiently be used to label and isolate endogenous PRMT1 from MCF-7 whole cell extracts. Notably, using B-C21 as an ABPP, it was shown that cellular PRMT1 activity is regulated in response to estrogen. Specifically, the amount of active PRMT1 isolated from nuclear extracts is reduced after estrogen treatment, suggesting that PRMT1 activity is negatively regulated in a manner that ultimately precludes the enzyme from interacting with its substrates (mimicked by B-C21). 172 Interestingly, the subcellular localization of the PRMTs also appears to be altered in cancer cells (see below), suggesting that the non-nuclear effects of the PRMTs may be more important than previously thought.
Since the fluorescent conjugate F-C21 acts as a potent PRMT1 inactivator, it could also be used in Fluopol-ABPP-based HTS approaches to identify specific PRMT1 inhibitors in a manner similar to that described for the PAD enzymes (see above). In this respect, Dillon and colleagues adapted a cysteine-reactive maleimide conjugated to AlexaFluor488 as an ABPP to screen for PRMT1 inhibitors. 173 This probe, although much less specific than F-C21, can label PRMT1 by forming a covalent bond with a hyperreactive cysteine residue, C101, 162 that is located in the SAM-binding pocket close to the purine ring. Using this Fluopol-ABPP assay, they identified two potent PRMT1 inhibitors, 46 and 47 (Figure 45), that also block PRMT8 activity but not PRMT4 activity. 173 Both of these compounds are nitroalkenes and are expected to react with the cysteine residue in PRMT1. Notably, PRMT8 also possesses a cysteine residue in its SAM-binding pocket, whereas PRMT4 does not.

Physiological Role of Histone Arginine Methylation
PRMTs methylate numerous cellular protein substrates, including nuclear proteins such as transcription factors, other coregulators, and histones. The importance of this PTM to cellular growth is probably best exemplified by the fact that both PRMT1 and CARM1 mouse knockouts are embryonically lethal. The currently identified sites of histone methylation are H2AR3 and R11; H2BR29, R31, and R33; H3R2, R8, R17, and R26; and H4R3, R17, R19, and R23 (Figure 46). The diversity of arginine methylation sites on histone proteins provides multiple routes to directly link arginine methylation to the epigenetic regulation of gene expression (Figure 46). In addition, the methylation of arginine residues in several transcriptional coactivators (e.g., the histone acetyltransferases p300 and CBP) provides an indirect route to influence the epigenetic state of affected genes. Originally, it was assumed that the methylation of histone arginine residues was associated with gene activation, as the dimethylation of histone H4R3 by PRMT1 facilitates transcriptional activation by a variety of nuclear hormone receptors. 174 More recently, however, it became apparent that arginine methylation can be either an activating or a repressing mark that regulates the expression of multiple genes. Below we highlight several examples of PRMT-mediated histone methylation events that are correlated with the activation and repression of gene transcription.
3.6.1. Methylation of H2AR3. H2AR3 methylation can be catalyzed by PRMT1, PRMT5, PRMT6, or PRMT7, generating ADMA (PRMT1 and PRMT6), SDMA (PRMT5), or MMA (PRMT7). The effects of these methylation events are varied. For example, in mouse ES cells, PRMT5 expression and activity are upregulated and the enzyme was shown to symmetrically dimethylate H2AR3. This event helps maintain pluripotency by repressing differentiation genes, including Fgf5, Gata4, Gata6, and HoxD9. 175 This report also indicated that PRMT5 is critical for embryonic stem cell (ESC) generation, as no ESCs were generated in Prmt5 −/− mouse embryos. 175 However, in a separate study investigating human ESCs, PRMT5 knockdown had no effect on pluripotency, as evidenced by the fact that the PRMT5 knockdown cells and wild-type cells had similar RNA levels for genes associated with multiple tissue types. 176 PRMT5 knockdown did, however, correlate with the repression of 78 genes. Notably, only two of these genes are known developmental genes, whereas the rest are associated with basic cellular processes. 176 These results illustrate the differences between the human and mouse epigenetic landscape.
PRMT7 also methylates H2AR3 in response to DNA damage to repress the transcription of the DNA polymerases POLD1 and POLD2. Importantly, when PRMT7 was knocked down in a cellular model, cells were more resistant to DNA damage due to the derepression of DNA damage response genes such as POLD1 and POLD2. 177 Collectively, these results illustrate the importance of isozyme-specific PRMT inhibitors. In this case, if a PRMT inhibitor were used in combination with a DNA-damaging agent, inhibition of PRMT7 could promote tumor resistance to the chemotherapeutic drug.
3.6.2. Methylation of H2AR11 and R29. H2AR3, R11, and R29 were all shown to be methylated in a proteomic study of histones extracted from HeLa cells. Notably, H2AR29 was found to be methylated at sites known to be associated with PRMT6-mediated gene repression, suggesting that PRMT6 is able to asymmetrically dimethylate H2AR29 in vivo. 178 These genes include EIF1b, MMP9, THBS1, and TNFRSF11B. THBS1 is of particular interest as it is dysregulated in a variety of cancers, specifically playing a role in angiogenesis. 178
3.6.3. Methylation of H3R2. On the basis of current data, histone H3 is the most heavily modified histone, and the methylation of arginine residues in this protein follows this trend. PRMT6 dimethylates H3R2, and this modification alters the binding of several effector proteins that typically bind H3K4Me. These effectors, including JMJD2 and several Tudor- and PHD-domain-containing proteins, play a variety of roles in gene activation. It follows then that H3R2 methylation causes a change in the gene activation profile of the cell. These changes were characterized prominently by investigating the downstream effects of one particular effector protein, WDR5, which is an integral component of the MLL complex that catalyzes the methylation of H3K4. Interestingly, this WD40-repeat protein was shown to have decreased binding upon H3R2 dimethylation, resulting in a reduction in the activation of a number of target genes, including HoxA5 and cyclin D1. 179 Moreover, PRMT6 is able to methylate H3R2 regardless of the methylation state of H3K4, suggesting that the methylation of H3R2 has a dominant effect on H3K4 methylation.
PRMT5 also symmetrically dimethylates H3R2, and this modification promotes chromatin remodeling, exposing a binding site for a transcription factor called CREB (cAMP-response-element-binding protein). CREB is then phosphorylated by PKA, allowing for the activation of target genes involved in glucose metabolism, including G6pc, Pck1, and Ppargc1a. 180 Expression of CARM1 is similarly necessary for expression of glucose homeostasis factors, including Gys1, Pgam2, and Pygm. This role, however, was not linked to a specific histone methylation event. 181
3.6.4. Methylation of H3R8. PRMT5 has also been implicated in metabolic regulation via the dimethylation of H3R8. In both cell culture and primary adipocytes, ChIP results show that PRMT5 associates with PPARγ2 and PPARγ2-responsive promoter sequences, activating adipogenesis. In type 2 diabetes, PRMT5 regulates metabolic signals during a fasting state by associating with CRTC2, which directs it to target genes. 182
3.6.5. Methylation of H3R17 and R26. CARM1 activity on H3R17 also regulates effector binding. In a recent study, asymmetric dimethylation of this site not only blocked association of the TIF family of corepressors, but prevented deacetylation by abrogating the interaction of H3 with the NuRD complex. 183 Like PRMT5, CARM1 was also found to maintain pluripotency in mouse ESCs. Notably in this study, ChIP experiments using antibodies for CARM1 as well as for two different methylated H3 substrates, H3R26me2a and H3R17me2a, showed that CARM1 is recruited to the promoters of a variety of genes involved in differentiation. This more precisely implicates not only CARM1 but histone modification in gene activation. 184 In another study, CARM1 was shown to be recruited to the creatine kinase promoter during skeletal myogenesis. 185 Further studies indicate that CARM1 is specifically expressed during differentiation and recruited to the nucleus and subsequently to chromatin. 185 Knockdown studies show that, in the absence of CARM1, other members of the transcription factor complex associated with creatine kinase activation are not expressed, giving another facet of the role of CARM1 in this process. Together, this study shows an enhancing role for CARM1 in muscle differentiation. 185
3.6.6. Methylation of H4R3. H4R3 is methylated by a number of PRMTs, including PRMT1, PRMT5, and PRMT6. The first indication that the methylation of this residue could alter gene transcription began with reports investigating PRMT1 activity on purified H4 with varied levels of lysine acetylation. 174 These studies showed that PRMT1 methylates unacetylated histones more efficiently than acetylated histones. This allowed Wang et al. to conclude that asymmetric dimethylation of H4R3 was likely a transcriptional activating event that promoted the acetylation of histone H4 by the histone acetyltransferase p300 to activate gene expression. 174

The first report to clearly identify a role for a PRMT in tumorigenesis came from a study showing that PRMT1 methylates H4R3 as a part of the MLL complex in hematopoietic cells. 186 With the introduction of the full MLL complex, these cells showed enhanced self-renewal when compared to those with an MLL complex with a catalytically dead PRMT1. This implicates PRMT1, specifically its role in dimethylating H4R3, in cell survival. 186 By contrast with the results obtained for PRMT1, the symmetric dimethylation of H4R3 by PRMT5 is generally a repressive mark. For example, in an early study of arginine methylation, PRMT5 was shown to associate with the SWI/SNF (switch/sucrose nonfermentable) chromatin remodeling complex as well as with the Brg1 complex. As a part of this complex, PRMT5-mediated methylation of H4R3 displayed a repressive functionality specifically on the c-Myc target gene Cad. 187 This report further implicated PRMT5 as playing a role in oncogenesis as well as being a part of a larger chromatin-modifying complex and thereby working in collaboration with other epigenetic transformations. A later bioinformatics study using ChIP-seq data analysis of histone methylation found PRMT5-mediated H4R3 dimethylation to be the second most repressive mark of the 20 lysine and arginine methylation events tested. Also shown in this study was the dependence of DNMT3A-mediated DNA methylation on the symmetric dimethylation of H4R3. 188 DNA methylation by DNMT3A is also known to be a transcriptionally repressive modification. 148 H4R3 methylation also affects the binding of the effector proteins SRP68/72 to H4. Interestingly, both asymmetric dimethylation and symmetric dimethylation of H4R3 inhibit the binding of the SRP68/72 heterodimer to chromatin both in vitro and in cells. 189

Nonhistone Methylation in Epigenetic Regulation
PRMT1 also methylates ERα at R260, within its DNA-binding domain. This methylation event occurs in response to estradiol and allows for the association of ERα with PI3K and Src, leading to Akt1 activation. Moreover, in highly malignant ER+ breast cancer samples, ERα was found to be hypermethylated, suggesting a role for PRMT1 in breast cancer. 138b BRCA1 is also methylated by PRMT1, which alters its binding to a variety of promoters, leading to increased binding to the APEX, ARHG, and GADD45G promoters, and a decrease in binding to ESR2, SREB, and FGF9 promoters. 190 PRMT1 also affects mRNA processing by methylating FUS (fused in sarcoma). 191 FUS is an mRNA-binding protein that is important for mRNA processing and shuttling RNAs from the nucleus to the cytoplasm. However, nuclear import of FUS relies upon its binding to transportin 1. The FUS/transportin 1 interaction is abrogated by a PRMT1-mediated methylation event, causing FUS to be trafficked to inclusion bodies that are found in amyotrophic lateral sclerosis (ALS) patients. PRMT1 knockdown increases FUS/transportin 1 binding as well as the nuclear localization of FUS. 191 PRMT5-mediated methylation of the tumor suppressor p53 also helps to regulate the expression of p53 target genes. The sites of modification were mapped to R333, R335, and R337, and methylation at these sites induced cell cycle arrest, whereas the deletion of PRMT5 induced apoptosis. 192 These data indicate that PRMT5-mediated methylation of p53 causes a change in the response to DNA damage, inducing cell proliferation. 192 In a later study, the role of PRMT5 in p53 signaling was further investigated. Here, PRMT5 was found to be essential for p53 stability as well as for the expression of two p53 target genes, MDM2 and p21. 193

Arginine Methylation in Cancer
Dysregulated PRMT expression and activity have been observed in a variety of cancers. Specifically, PRMT1, PRMT2, PRMT3, PRMT4, PRMT5, PRMT6, and PRMT7 have been shown to be overexpressed or otherwise contribute to tumorigenesis, while PRMT8 and PRMT9 have not yet been implicated in oncogenesis.
3.8.1. PRMT1 in Cancer. Recent studies have linked the increased expression of different PRMT1 splice variants (i.e., PRMT1v1 and PRMT1v2) to enhanced malignancy and poor prognosis. 194 In one study, based on immunohistochemistry of primary tumor samples, overall PRMT1 expression was linked to the patient's age, menopausal status, and progesterone receptor status. 194a In the same study, low expression levels of PRMT1v1 were associated with increased survival. This research found no link between PRMT1v2 levels and survival rate. 194a However, in a cell-based study using MCF7 cells, specific knockdown of PRMT1v2 increased apoptosis. Similarly, induced expression of PRMT1v2 promoted cell invasion in nonaggressive cell lines; this effect was not achieved by overexpressing other splice variants of PRMT1. 194b Interestingly, both of these papers stress the importance of PRMT1 as a cytosolic methyltransferase and both report that PRMT1 was shown to be more malignant when expressed in the extranuclear environment. While the role of PRMT1 as a histone arginine methyltransferase has been clearly described in the literature, the effect of arginine methylation on general cell signaling is only beginning to be explored.
Dysregulation of PRMT1 has also been linked to breast cancer and leukemia, again suggesting that it is a potential therapeutic target. 186 As mentioned previously, asymmetric dimethylation of H4R3 upregulates the expression of estrogen receptor target genes. While this response was first attributed to PRMT1, the increased expression of estrogen receptor target genes was later found to be promoted by either PRMT1 or CARM1. 138a In this report, the protein complexes that associate with the pS2 promoter were identified by ChIP and reChIP analysis. In two of the six complexes, a type 1 PRMT was found, either PRMT1 or CARM1, but never both. In one of these complexes, PRMT1 and CARM1 were interchangeable. This complex, notably, also contains histone acetyltransferases. The other complex contained only PRMT1, together with the SWI/SNF chromatin remodeling proteins Brg1 and Ini1. 138a
3.8.2. PRMT2 in Cancer. PRMT2 is also implicated in breast cancer relating to its ability to act as a transcriptional coactivator for ERα. 195 In breast cancer cell lines, the levels of PRMT2 and a splice variant, PRMT2L2, were shown to be increased in ER+ lines. Upon overexpression of PRMT2L2, increased expression of ERα target genes was observed. 195 It was later observed that different splice variants of PRMT2 have a distinct subcellular localization.
3.8.3. PRMT3 in Cancer. PRMT3 has been shown to be regulated post-transcriptionally by a tumor suppressor protein, DAL-1/4.1B, which inhibits its methyltransferase activity both in vitro and in cell culture. 196 Overexpression of DAL-1/4.1B in MCF7 cells reduces PRMT3-catalyzed methylation of a variety of unidentified cellular proteins. As DAL-1/4.1B has been shown to have an antiproliferative role, this activity indirectly implicates PRMT3 in oncogenesis. 196
3.8.4. PRMT4/CARM1 in Cancer. CARM1 levels have been shown to be increased in colon, prostate, and breast cancer. 197 Like PRMT1, CARM1 plays a role in ER+ breast cancer by activating ERα target genes. Furthermore, CARM1 is necessary for this activation event as knockdown of CARM1 abrogates the estrogen response in mouse embryos. 198 CARM1 has also been shown to be a marker for well-differentiated breast cancer, suggesting that CARM1 plays a role in reprogramming the epigenome of breast cancer cells. Overexpression of CARM1 in MCF7 cells causes a change in morphology as well as a change in the estradiol-induced gene signature. 197c In another breast cancer study, CARM1 was shown to be necessary for E2F1 expression. E2F1, in turn, is required for cyclin E1 expression. The cyclin E1 promoter has high levels of both H3R17me2a and H3R26me2a, both of which are reduced upon CARM1 knockdown. 199 CARM1 is also necessary for NFκB target gene expression as shown in MEF cells by CARM1 knockdown. 200 NFκB is a transcription factor responsible for the regulation of genes involved in inflammation and cell survival. The effect of CARM1 was linked to H3R17me2a, though investigation into the potential role of H3R26me2a was not performed. These studies also showed a link between CARM1 and p300 acetyltransferase activity in NFκB recruitment and gene activation. Both enzymes are critical for this event. 200
3.8.5. PRMT5 in Cancer.
PRMT5 methylates both H4R3 and H3R8 in chronic lymphocytic lymphoma, causing transcriptional silencing of known target genes Rb1, Rbl1, and Rbl2. 201 Knockdown of PRMT5 in a B-CLL cell line model decreases H4R3 and H3R8 dimethylation, increases protein expression of RBL2, and inhibits cell proliferation. 201 Furthermore, PRMT5 overexpression has been linked to a number of cancers, including colon, lung, astrocytoma, and fibrosarcoma. 202 Although it is unclear how PRMT5 promotes tumorigenesis, PRMT5 does regulate eIF4E expression and p53 function as a prosurvival factor. 193 PRMT5 also associates with the SWI/SNF chromatin remodeling proteins, and this interaction has been suggested to be causal in dysregulating the expression of tumor suppressor genes, including ST7 and NM23. 203 PRMT5 has also been shown to methylate NF-κB, promoting its gene regulation functionality in a colon cancer cell model. By mutating the target arginine on NF-κB to a lysine residue and therein blocking the ability of PRMT5 to methylate NF-κB, a more dramatic dysregulation of gene regulation was observed than by knocking down PRMT5. This indicates the necessity of arginine methylation for NF-κB activity. 202a Many of the functions of PRMT5 have been attributed to cytosolic and other nonhistone targets of arginine methylation. While PRMT5, as with the other isozymes, was initially characterized as a histone-modifying enzyme, the role of PRMT5 in the cellular environment is constantly evolving and appears to reside more frequently in the cytoplasm, not the nucleus. For example, Shilo et al. showed that PRMT5 mRNA and protein levels are dramatically increased in lung carcinoma tissues over normal lung tissues. 202b Along with this trend, dimethylation of H4R3 also increases, illustrating the role of PRMT5 in tumor suppression. 
However, the more striking trend from this report indicates that there is more cytoplasmic PRMT5 present not only when comparing cancerous to normal lung tissues, but also in higher grade tumors with poorer prognosis. 202b
3.8.6. PRMT7 in Cancer. As described previously, PRMT7-mediated methylation of H2AR3 affects DNA damage repair. The effects of inhibiting PRMT7 are complex and could result in resistance to certain chemotherapeutics. 177 For these reasons, isozyme-specific PRMT inhibitors are essential.

Arginine Methylation in Atherosclerosis
Nitric oxide (NO) is a regulator of vasodilation known to be integral to cardiac health. The free amino acid L-arginine is the precursor for the generation of NO by the nitric oxide synthase (NOS) family of enzymes. However, when excess ADMA is present in the system due to the breakdown of proteins containing ADMA, this serves as an inhibitor of the NOS enzymes, lowering the production of NO. 204 In the absence of L-arginine or presence of ADMA, NOS can also produce superoxide, which can have the opposite physiological effect compared to NO. 205 In patients with heart disease, PRMT1 is observed at elevated levels, as well as a lowered expression of DDAH, which metabolizes ADMA. Both of these factors cause increases in ADMA levels and decreased vasodilation. 206

Protein Arginine Phosphorylation
The phosphorylation of arginine residues in histone H3 was first reported in 1994. 207 Although the underlying kinase could not be identified, using a partially purified kinase fraction derived from nuclear cell extracts of mouse leukemia cells, it was shown that histone H3 can be arginine phosphorylated at R2 as well as R128, R129, and R131 in the C-terminus of the protein. 207 This modification introduces significant negative charge into the histone that undoubtedly would influence its DNA-binding ability and thereby chromatin structure. In contrast to other phosphorylated residues that are stable under acidic conditions (e.g., phosphoserine, phosphothreonine, and phosphotyrosine; O-phosphorylations), phosphoarginine, like other N-linked phosphorylation events, is an acid-labile modification. Therefore, current methods to directly detect protein arginine phosphorylation are sparse, and care has to be taken to preserve the acid-labile phospho mark during sample preparation and analysis. 208 Although the molecular identity of the responsible eukaryotic protein arginine kinase is still unclear, it was recently demonstrated that protein arginine phosphorylation also occurs in bacteria. 208a The underlying protein arginine kinase (PAK) was identified as McsB (EC 2.7.14.1) (Figure 47). 208a This enzyme shows close homology to guanidine phosphotransferase enzymes such as creatine kinase and L-arginine kinase. Guanidine phosphotransferases are typically involved in energy homeostasis and shuttle the γ-phosphate from ATP onto small-molecule guanidine-containing compounds that serve as chemical energy storage devices in cells relying on high rates of energy turnover such as muscles and neurons. 209 McsB shares several highly conserved residues which are important for nucleotide and substrate guanidinium binding with other guanidine phosphotransferases.
However, it lacks the entire N-terminal domain that is critical for trapping small-molecule substrates, such as the amino acid L-arginine, thereby allowing larger peptidyl substrates to enter the active site. 208a Besides McsB's preference for peptidyl arginine residues, the catalytic mechanism is thought to be similar to that of L-arginine kinases that employ a direct, in-line phosphoryl transfer between ATP and the guanidinium group (Figure 48A). Notably, it was shown that L-arginine kinase binds arginine and ATP randomly, i.e., without a specific order, to the active site and that the resulting products phosphoarginine and ADP are individually released. 210 Moreover, structural analysis of arginine kinase bound with a transition-state analogue, composed of nitrate, ADP, and L-arginine, revealed that five arginine residues form a dense network of electrostatic interactions with the phosphate groups and the γ-phosphate mimicking nitrate (Figure 48A). 211 In addition, a tightly bound magnesium ion, which is essential for catalysis, is thought to coordinate all three ATP phosphate groups to properly orient the γ-phosphate for the transfer reaction to occur.
The substrate arginine is aligned between two carboxylate groups originating from E225 and E314 that are both thought to act as general bases. However, mutagenesis studies suggest that base catalysis by these residues may enhance the catalytic rate but is not absolutely essential. 212 It was suggested that proper substrate prealignment might be more important than acid/base chemistry, electrostatics, or other potential effects. 212 Due to the polarization of the guanidinium group by the two glutamate residues, the substrate arginine Nω atom is highly nucleophilic and predisposed to attack the electrophilic phosphorus of the ATP γ-phosphate in a classical SN2 reaction. In addition, the positive charges of several arginine residues and the magnesium ion pull electrons toward the phosphate oxygens, away from the phosphorus, thereby increasing its electrophilicity.

Figure 48. (A) Active site of L-arginine kinase with bound ADP, nitrate, and L-arginine (PDB code 1BG0) and proposed reaction mechanism. Note that several arginine residues were omitted in the proposed reaction scheme for clarity. (B) Active site of YwlE C7S with bound peptidyl arginine and containing a phosphorylated S7 residue, which mimics the thiophosphate reaction intermediate generated after the first SN2 reaction (PDB code 4KK4). Residue R149* originates from a symmetry-equivalent molecule. Polar contacts of <3.5 Å are represented as dashed lines. The proposed catalytic mechanism for the PAP enzyme YwlE is shown on the right side. The guanidinium group of the incoming phosphoarginine substrate is colored in blue, whereas the phosphoryl group is shown in red.

Recently, a corresponding protein arginine phosphatase (PAP), YwlE (EC 3.9.1.2), was identified that can hydrolyze the phosphoramidate (P−N) bond of phosphoarginine, thereby releasing unmodified arginine (Figure 47). 213 Notably, YwlE shares homology with the protein tyrosine phosphatase (PTP) family and utilizes a similar catalytic mechanism. In contrast to other PTP members, YwlE employs a dual size and polarity filter to select phosphoarginine residues. 213a The highly polar substrate guanidinium group is sandwiched between the carboxylate of D118 and the hydroxyl of T11 and further stabilized by cation−π interactions with the phenyl ring of F120 (Figure 48B). The molecular mechanism of phosphoarginine hydrolysis consists of a two-step process comprising two consecutive SN2 reactions (Figure 48B). 213a The incoming phosphoarginine phosphoryl group forms a bidentate bond with the guanidinium group of R13, whereas the phosphoarginine guanidinium group is clamped between the hydroxyl group of T11 and the carboxyl group of D118. The phosphorus atom of the substrate molecule is attacked by the highly nucleophilic active site cysteine residue C7, forming a thiophosphate reaction intermediate. This reaction is accompanied by the protonation of the arginine leaving group by D118. Subsequently, an incoming water molecule is deprotonated by D118, and the water-derived nucleophilic hydroxyl anion attacks the phosphocysteine residue, performing the second SN2 reaction, thereby releasing the phosphate ion and regenerating the active C7 thiolate anion. The active site cysteine was shown to be essential for catalysis and subject to oxidative regulation by the formation of a disulfide bridge with an adjacent backdoor cysteine. 214
4.1.1. Physiological Role of Histone Arginine Phosphorylation. As shown by Wakim et al., histone H3 is subject to arginine phosphorylation by a Ca2+-calmodulin-dependent kinase derived from mouse leukemia cells. 207 It was also demonstrated that in vivo 32P incorporation into H3 in rat heart endothelial cells results in phosphorylation of a basic amino acid in quiescent but not in dividing cells. 215 This Ca2+-calmodulin-dependent kinase is present in nearly equal amounts in both quiescent and dividing cells; however, the histone H3 phosphorylating activity was 20−100-fold higher in quiescent cells. 215 In addition, it was suggested that phosphorylation of histone H3 was involved in cell cycle exit.
The studies by Wakim et al. have, however, not been followed up, and there is still some uncertainty about the existence of a eukaryotic protein arginine kinase. Notably, the phosphorylation of arginine residues was solely derived from indirect methods, including the acid lability of 32P radiolabeling and a missing signal for arginine using Edman sequencing. However, to unequivocally prove the presence of phosphoarginine in histone proteins, direct phosphoarginine detection methods such as recently optimized mass spectrometry techniques or 31P NMR analysis should be employed. 208a,216

Histone Arginine ADP-Ribosylation
ADP-ribosylation is a covalent PTM which is catalyzed by ADP-ribosyltransferases (ARTs) and is involved in several cellular processes such as cell cycle regulation and DNA damage response, replication, or transcription. The generation of ADP-ribosylated proteins requires nicotinamide adenine dinucleotide (NAD+) as a cofactor and leads to the formation of nicotinamide (Figure 49). ADP-ribosylation was shown to be reversibly regulated by the hydrolysis of the ADP-ribose group catalyzed by ADP-ribosyl hydrolase (ARH) enzymes. ARTs can either attach mono-ADP-ribosyl groups (catalyzed by mARTs) or poly-ADP-ribosyl groups (catalyzed by pARTs). In contrast to mARTs that only transfer a single ADP-ribose moiety onto a specific amino acid side chain, pARTs (also known as poly-ADP-ribose polymerases or PARPs) additionally catalyze the elongation and branching of ADP-ribose units on ADP-ribosylated targets. 217 ADP-ribose can be linked to negatively charged glutamates and aspartates via ester bonds that are highly sensitive to hydroxylamine, to positively charged arginine or lysine residues via N-glycosidic bonds that are resistant to hydroxylamine treatment, or to cysteine and asparagine residues. 218 Interestingly, most known mARTs transfer ADP-ribose onto arginine or lysine residues, whereas pARTs mainly target glutamate residues. 219 Notably, ARTs are mainly extracellular enzymes that modify integrins and growth factor receptors. The only known intracellular enzymes possessing mono-ADP-ribosyltransferase activity are members of the sirtuin (SIRT) family of NAD+-dependent deacetylases. Sirtuins SIRT1, SIRT2, SIRT4, and SIRT6 harbor weak intrinsic mono-ADP-ribosylation activity, transferring a single ADP-ribose to an arginine residue of specific target proteins. 220,221 Three ARH enzymes and one poly-ADP-ribose glycohydrolase (PARG) are known in humans.
The poly-ADP-ribose polymer can be degraded by PARG and ARH3, which hydrolyze glycosidic bonds between two ADP-ribose units, thus removing ADP-ribose moieties from the polymers. The only enzyme able to release a mono-ADP-ribose moiety from ADP-ribosylated proteins is ARH1, which cleaves off a mono-ADP-ribose from arginine residues. 222

Figure 49. Generation of ADP-ribosylated arginines is catalyzed by ART enzymes, while the hydrolysis of peptidyl ADP-ribosylated arginine residues is mediated by ARH enzymes.

Several decades ago, it was shown that histone proteins can be ADP-ribosylated. 223 There, it was reported that ADP-ribosylated histone proteins contain primarily ADP-ribose monomers or short oligomers rather than long polymers. 223b Despite some evidence that histone proteins can be ADP-ribosylated on arginine residues, further studies are necessary to unequivocally prove the presence of ADP-ribosylated histones and to identify the underlying transferase enzyme(s).

Histone Arginylation
Histone proteins are also subject to protein arginylation, which represents the post-translational addition of an arginine residue. This type of modification affects the proteolytically processed and thereby exposed α-amine of histone H2B type 2-B at Q48 and A59, histone H4 at T136 and V116, and histone H2A.1 at S41, as well as histone H1 at L61. 224 Interestingly, arginylation sites can be further modified by arginine methylation. On the basis of structural modeling, arginylation of histone proteins potentially facilitates the interaction of the histones with DNA due to the introduced positive charge of the additional arginine residue. 224b Protein arginylation is mediated by the arginyl-tRNA-protein transferase 1 (ATE1), which transfers a single tRNA-bound arginine onto proteins (Figure 50). 225 Typically, arginylation occurs at the unprotected N-terminal α-amino groups. However, following proteolytic processing, internal α-amino groups can be arginylated in vivo. 224a It was proposed that the main function of protein arginylation is to mark target protein substrates for degradation by the ubiquitin-dependent N-end rule pathway. 226 However, the detailed functions and the extent of histone arginylation are currently not known.

CONCLUDING REMARKS
Epigenetic regulation governed by arginine modification is an emerging hallmark of eukaryotic organisms, and interference with the underlying enzymes holds great promise to intervene in various diseases ranging from cancer to rheumatoid arthritis. The abundance of various histone modifications on nucleosomes implies that crosstalk between these modifications is very likely. Different types of modifications occur on arginine residues, resulting in some form of antagonism since distinct types of modifications on arginines are mutually exclusive. In addition, the introduction of arginine modifications can either create or compromise a substrate recognition site for other histone-modifying enzymes even spanning different histone tails. The molecular details of such communication between modifications are a topic of intense research. In this respect, it is interesting to note that PAD-mediated histone citrullination functions as a general opponent of methylation by PRMTs. Depending on the methylation signal, i.e., activating or inhibiting, PADs might act as a repressor if activating arginine methylation is inhibited or an activator if the balance of PAD activity is shifted toward the suppression of a repressive arginine methylation mark.

ASSOCIATED CONTENT

Special Issue Paper
This paper is an additional review for Chem. Rev. 2015, 115, issue 6, "Epigenetics".

Notes
The authors declare the following competing financial interest(s): P.R.T. is a cofounder and consultant to Padlock Therapeutics.