HIV-1 Nucleocapsid Protein Unfolds Stable RNA G-Quadruplexes in the Viral Genome and Is Inhibited by G-Quadruplex Ligands

The G-quadruplexes that form in the HIV-1 RNA genome hinder progression of reverse transcriptase in vitro, but not in infected cells. We investigated the possibility that the HIV-1 nucleocapsid protein NCp7, which remains associated with the viral RNA during reverse transcription, modulates HIV-1 RNA G-quadruplex stability. By electrophoresis, circular dichroism, mass spectrometry, and reverse transcriptase stop assays, we demonstrated that NCp7 binds and unfolds the HIV-1 RNA G-quadruplexes and promotes DNA/RNA duplex formation, allowing reverse transcription to proceed. The G-quadruplex ligand BRACO-19 was able to partially counteract this effect. These results identify NCp7 as the first known viral protein able to unfold RNA G-quadruplexes, and they explain how the extra-stable HIV-1 RNA G-quadruplexes are processed; they also show that G-quadruplex ligands hinder the reverse transcription process at the level of both reverse transcriptase and NCp7. This information can lead to the development of more effective anti-HIV-1 drugs with a new mechanism of action.

G-quadruplexes (G4s) are noncanonical secondary structures formed by single-stranded guanine (G)-rich DNA or RNA. 1,2 Four Gs bind through Hoogsteen hydrogen bonds to form G-quartets, planar square structures that stack upon each other to form the G4, which in turn is significantly stabilized by monovalent cations. 3 In the human genome, many G4s have been characterized as regulatory elements of critical cellular processes, including inhibition of telomerase activity, modulation of oncogene expression via transcriptional suppression, polymerase stalling during replication, and translational regulation in the untranslated regions (UTRs), where RNA G4s support both positive and negative control of mRNA translation. 4−7 G4 folding has been shown to be triggered or hindered by cellular proteins. 8,9 Nucleolin is an example of a G4-binding protein, first identified as the binding and stabilizing partner of the G4-structured, transcriptionally inactive form of the c-myc promoter 10 and later demonstrated to bind G4s in viruses, 11−14 with a preference for long-looped G4s. 15 Various families of helicases have been shown to unfold G4 structures, 16,17 thus allowing polymerase progression that otherwise would be impaired.
The human immunodeficiency virus type 1 (HIV-1) is a retrovirus with two identical copies of its RNA genome encapsidated in the viral particle together with the enzyme machinery necessary for viral replication. Right after reverse transcription, which takes place in the cytoplasm of the host cell, the newly synthesized dsDNA genome is transported into the nucleus and integrated into the host genome thanks to the two long terminal repeat (LTR) flanking regions. 18,19 In the proviral DNA, the 5′-LTR serves as promoter of all HIV-1 genes and is segmented into U3, R, and U5 regions. 20,21 The U3 region is also present at the 3′-end of the viral RNA genome. We have previously demonstrated formation of multiple G4 structures in the U3 region of both the proviral DNA and RNA genome. 22,23 G4-ligands with both low and high selectivity for viral G4s displayed antiviral activity ascribable to the interaction with the viral G4s. 23,24 The antiviral activity of the G4 ligand BRACO-19 (B19) was linked to inhibition of both post- and preintegration steps of the viral cycle. 23 At the preintegration steps, B19 hindered progression of reverse transcriptase (RT) on the RNA template, thereby inhibiting HIV-1 DNA synthesis and thus viral production. In infected cells, HIV-1 RT is assisted by the viral nucleocapsid protein (NCp7). 25 HIV-1 NCp7 is a small 55-amino acid basic protein with two CCHC zinc-finger domains, produced as part of the Gag polyprotein. 26 During reverse transcription, NCp7 assists RT by resolving HIV-1 RNA genome secondary structures to ensure enzyme progression and complete synthesis of the viral DNA. 27−29 NCp7 was shown to be able to recognize DNA G4s, in particular the intermolecular G4 that forms in the HIV-1 DNA flap, an intermediate and ultimately removed sequence of the reverse transcription process, 30 and to unfold an intramolecular synthetic DNA G4. 31 In addition, NCp7 was able to induce surface-attached oligonucleotides to fold into tetramolecular G4. 32
Because the G4 structures that form in the U3 region of the HIV-1 RNA genome have been reported to be extremely stable, 23 we here investigated the possibility that NCp7 is also involved in the unfolding of these structures and tested whether G4 ligands could inhibit this process. We demonstrated that NCp7 is indeed able to bind and unfold the HIV-1 RNA G4s and stimulate RNA/DNA hybrid formation, thus greatly enhancing the processivity of RT and allowing synthesis of the proviral DNA genome. The presence of the G4-ligand B19 was able to partially counteract these effects.

■ RESULTS
NCp7 Binds and Unfolds the RNA G4s in the U3 Region of the HIV-1 RNA Genome. We have previously shown that two RNA G-quadruplexes (G4s), i.e., U3-III and U3-IV, form in the G-rich U3 region of the HIV-1 RNA genome (Figure 1). 22 Because these are extremely stable under physiological conditions (i.e., Tm = 82.1 and 71.2°C for U3-III and U3-IV, respectively, in 100 mM K+), we envisaged the presence of a protein with G4-unfolding activity that would allow progression of the reverse transcriptase (RT) in infected cells. Since the viral nucleocapsid protein (NCp7) associates with RT during reverse transcription, 25,29 we explored the possibility that NCp7 itself was able to unfold the U3 G4s.
We first assessed the ability of NCp7 to bind the G4-folded U3 sequences. To this end, the RNA U3-III+IV oligonucleotide was folded into G4, incubated with increasing amounts of NCp7, and analyzed by electrophoretic mobility shift assay (EMSA) (Figure 2A).
The initial increase of the free G4 band (Figure 2B) was ascribed to the aggregation properties of NCp7, mediated by unspecific electrostatic binding that is too weak to make the protein/RNA complex visible in these conditions. Alternatively, different RNA conformations unable to enter the gel and/or species below the detection limit in the gel may be unfolded by NCp7 to yield the most stable G4 structure. At the highest protein concentrations, we observed a decrease in the free G4 band. Complexes with NCp7 were reported to aggregate/precipitate, 30 and thus, the disappearance of the free G4 starting from 75 nM NCp7 was deemed indicative of binding. At 300 nM NCp7 (Figure 2A, lane 8), a faint slower migrating band became visible, alongside the almost complete disappearance of the free G4, which indicated binding saturation (Figure 2A). We attributed the slower migrating band to the G4−NCp7 complex, which was visible because, besides electrostatic interactions, it also involves higher energy bonds, such as hydrogen, hydrophobic, and stacking interactions.

ACS Infectious Diseases

Since a DNA/RNA intermediate forms during reverse transcription, we next evaluated the binding properties of NCp7 to the G4-forming U3-III+IV sequence in the presence of the complementary DNA oligonucleotide (Figure 3A). Addition of the cold complementary DNA strand (i.e., the unlabeled DNA strand complementary to the U3-III+IV RNA sequence) to the labeled and G4-folded RNA sequence induced formation of a hybrid RNA-DNA duplex, which had a slower migration rate and thus could be separated from the single-stranded G4-folded species on a native polyacrylamide gel (Figure 3A, lanes 1−3). In the presence of NCp7, formation of the RNA/DNA duplex competed with formation of the RNA G4−NCp7 complex (Figure 3A, lanes 4−6). The free G4 species completely disappeared upon G4−NCp7 complex formation, while the amount of the free duplex was not perturbed. These data indicate that NCp7 binds the G4-folded RNA in preference to the DNA/RNA duplex. To form the duplex, the thermodynamically stable RNA G4 has to be unfolded prior to base pairing with its complementary strand. We thus analyzed duplex formation kinetics: the amount of duplex increased over time and reached 60% at 24 h (Figure 3C). Addition of NCp7 greatly increased the amount of the duplex species, especially at 24 h, when the G4-folded oligonucleotide was completely converted to the duplex form (Figure 3C).
These results indicate that when NCp7 binds to the HIV-1 RNA G4 structures, it stimulates their unfolding in the presence of the complementary strand.
To confirm this observation, NCp7 activity was investigated by circular dichroism (CD). The G4-folded U3-III+IV RNA was incubated in the presence/absence of NCp7 and analyzed by CD (Figures 4, S1, and S2). Upon addition of NCp7, the molar ellipticity of U3-III+IV G4 drastically decreased, while the CD spectrum maintained the G4 signature, with a maximum at 265 nm and a minimum at 238 nm. Usually, low molar ellipticity indicates low stability of the tested G4s.
To check the actual stability of U3-III+IV G4 in the presence/absence of NCp7, CD spectra were recorded at increasing temperature (Figure 4). The melting temperature (Tm), calculated by applying the van 't Hoff equation to the molar ellipticity signal at 265 nm as a function of temperature, was 68.3 and 44.8°C in the absence and presence of NCp7, respectively, indicating effective NCp7-mediated unfolding of the RNA G4. When the unfolded U3-III+IV G4 was reannealed by steadily decreasing the temperature from 90 to 20°C, the Tm maintained its value of 68.3°C in the absence of NCp7, whereas in the presence of the protein the oligonucleotide was unable to regain the folded G4 structure (Figure 4), indicating that NCp7 was able to maintain its unfolding properties in this condition. We next investigated whether the G4-ligand B19, which has been shown to stabilize the HIV-1 U3 G4s, 23 could inhibit the G4-unfolding activity of NCp7. We incubated U3-III+IV G4 with B19 before addition of, and further incubation with, NCp7 (Figure 4). B19 increased the G4 Tm to >90°C; in the presence of NCp7, this was reduced to 67.8°C (Figure 4), indicating that B19 was able to partially suppress the NCp7 unfolding activity.
NCp7 unfolding properties were further assessed by electrospray ionization (ESI) mass spectrometry (MS) (Figure 5), a powerful technique to investigate both G4 structures and G4/small-molecule binding. 33 The number of coordinated K+ ions is diagnostic of the number of G-quartets involved in the G4 structure, and therefore of the G4 folded conformation.
The MS spectrum of the RNA U3-III+IV G4 presented a peak corresponding to the oligonucleotide coordinated to two K+ ions, which indicated the expected three-layered U3-III+IV G4 in this condition (Figure 5A, B). In the presence of NCp7, two additional peaks corresponding to the oligo-protein complex appeared (Figure 5C). In the complex, the species with 0 and 1 K+ ions were prevalent with respect to the 2 K+ ion species (Figures 5D, E), indicating the unfolded state of the oligonucleotide in the presence of and in complex with NCp7, and thus confirming the unfolding activity of NCp7 toward the HIV-1 RNA G4s.
NCp7-Mediated Unfolding of the HIV-1 RNA G4s Promotes Reverse Transcriptase Processivity. To investigate whether NCp7 unfolding properties could abolish the previously observed HIV-1 RNA G4-mediated RT stalling, 22 we performed the RT stop assay in the presence of increasing concentrations of NCp7. The U3-III+IV sequence in the presence of K+ induced RT pausing at all G-tracts involved in formation of the overlapping G4s (i.e., U3-III and U3-IV, Figure 6A, lane 1). Upon addition of NCp7, the stop sites decreased and the full-length RT product increased (Figure 6A, lanes 2−4, and Figure 6B). When the RNA U3-III+IV G4 template was treated with increasing concentrations of B19, the stop sites significantly increased at the expense of the full-length product (Figure 6A, lanes 5−8, and Figure 6B). When NCp7 was added to the B19-treated samples, the full-length product was restored, while the stop sites were mainly maintained, with a visible B19 concentration-dependent effect on both the full-length product and stop sites (Figure 6A, lanes 9−12, and Figure 6B). We next tested a control G-rich sequence unable to fold into G4 (Figure 6C). Minor pausing sites were observed (lane 1), likely due to transient base pairing within short RNA tracts. Also in this case, addition of NCp7 was able to resolve most of these pausing sites (lanes 2−4). In the presence of B19, one stop site paralleled by reduction of the full-length products was visible (lanes 5−8): this behavior is compatible with the reported incomplete G4 specificity of B19, especially at high concentrations. 34 Incubation with NCp7 fully released the pausing sites (lanes 9−12), indicating that, in the absence of a G4 structure, unspecific binding of B19 to the RNA is not able to inhibit NCp7 and that B19 does not inhibit the protein activity per se.
The full-length product in the presence of NCp7 was more abundant than that in the absence of the protein (lanes 9−12 vs lanes 2−4), confirming the previously reported higher polymerization rates with NCp7 in crowding conditions. 42 Altogether, these data indicate that NCp7 unfolds G4 structures that form in the HIV-1 RNA genome, favoring the progression of reverse transcription. In the absence of G4-stabilizing compounds, NCp7 is able to destabilize the structures to allow complete synthesis of the DNA, while the G4-ligand B19 can partially counteract this effect.

■ DISCUSSION
We have previously observed that the U3 region of the HIV-1 genome can fold into extremely stable G4 structures that inhibit RT progression in vitro. 22 In contrast to DNA G-rich regions that may form G4s only when dissociated from their complementary sequence, for example, temporarily by DNA- and RNA-polymerase processing during replication and transcription, 35 the HIV-1 RNA genome is single-stranded and therefore folded into secondary structures most of the time. We reasoned that the very stable RNA G4 structures would be deleterious for viral survival and that viral/cellular proteins therefore need to be present to resolve them. While a cellular protein, i.e., hnRNP A2/B1, has been reported to unfold the HIV-1 DNA G4s, 36 we hypothesized that, in this case, a viral protein could resolve RNA G4s to allow the correct completion of the RT process: in fact, reverse transcription has recently been reported to initiate within the capsid of the mature virus, where cellular proteins do not have access. 37 We thus focused on NCp7, a small retroviral protein generated by proteolytic cleavage of the Gag precursor: several hundred molecules of NCp7 coat and protect the HIV-1 dimeric RNA genome in the virion and later assist various steps of the HIV-1 replication cycle, including reverse transcription, genome dimerization, and selective genome packaging. 26,28,38−42 NCp7 displays nucleic acid chaperone activity that, during reverse transcription, facilitates the rearrangement of nucleic acids into their most thermodynamically stable structures. 27 NCp7 has been reported to bind DNA G4s: it was able to unfold a short and synthetic monomeric DNA G4 31 and to assemble tetramolecular G4 structures. 32 This latter activity probably resulted from the nucleic acid aggregation properties of NCp7. However, the ability of NCp7 to bind and process RNA G4s has not been reported so far.
Using different and complementary techniques, we showed here that NCp7 was able to bind and unfold the U3 G4s in vitro. The presence of NCp7 stimulated production of full-length amplification products by RT, as assessed in the RT stop assay. In addition, we showed that NCp7 preferentially binds the G4 sequence vs its duplex counterpart. NCp7 has been reported to preferentially bind single-stranded nucleic acid regions. 38 Here we take this concept further and demonstrate that NCp7 binds to conformationally structured single-stranded regions, such as G4s, and unfolds them. In our case, this activity resulted in the increased formation of duplex RNA/DNA hybrid, a structure that is thermodynamically more stable than the G4, as demonstrated by competition experiments in EMSA. A similar activity has been reported for another single-stranded structured RNA in HIV-1, the TAR hairpin, which is unfolded by NCp7 to favor annealing to the complementary sequence and thus formation of the double-stranded molecule. 43−45 In the case of TAR RNA, exposed G bases are the sites preferentially bound by NCp7. 45 This evidence supports the unfolding activity observed on the HIV-1 RNA U3 G4s, where exposed G bases are present in the G4 loops. In addition, in the G4 conformation, Gs base-pair through Hoogsteen-type hydrogen bonds, which are less thermodynamically stable than Watson−Crick ones and may thus be recognition sites for NCp7 as well. 46 These results indicate that NCp7 is indeed the protein able to resolve the stable RNA G4s, thus allowing viral reverse transcription to occur in vivo. This is the first time a viral protein is reported to unfold RNA G4s. So far, only one other viral protein, i.e., EBNA1 of the Epstein−Barr virus, has been shown to bind to folded RNA G4s to promote viral DNA replication. 47
The initial stability of the secondary structure processed by NCp7 dictates the efficiency of the unfolding activity. In fact, as previously reported for the annealing of TAR to its complementary strand, NCp7 destabilizes the less stable complementary TAR hairpin at a higher rate. 44 Therefore, from a therapeutic point of view, increasing the stability of the HIV-1 RNA structures could be a valuable strategy to decrease the unfolding capacity of NCp7 and thus further inhibit RT progression. 48 We have previously shown that the U3 G4s could be stabilized by the G4-ligand B19, which was able to inhibit RT progression at the sites of G4 formation. This activity resulted in inhibition of the viral life cycle at the preintegration step, 23 which we proposed to be due exclusively to inhibition of RT progression at the G4 site. We showed here, by CD and RT stop assay, that B19 is also able to counteract the unfolding activity of NCp7.
Therefore, our data indicate that G4 ligands possess a dual activity at the U3 RNA level (Figure 7): on one hand, they sterically hinder RT processing of the structured RNA template; on the other, they inhibit the chaperone activity of NCp7, which in turn assists RT activity. Both activities thus contribute to the final effect of inhibition of the reverse transcription process and hence of the viral life cycle. Our data also point out the strength of NCp7 as a chaperone, as this protein is able to process extremely stable structures, both naturally occurring and further stabilized by small molecules. Therefore, to take advantage of the dual inhibition at the HIV-1 RNA level, we envisage the need for G4 ligands able to potently stabilize the U3 region. In addition, since G4s are also largely present in the cell genome and their stabilization may not be beneficial to noninfected cells, G4 ligands that are also selective for the HIV-1 G4s are highly desirable.

■ CONCLUSIONS
Inhibition of NCp7 is an additional and previously unknown activity of G4 ligands. G4 ligands with improved U3 G4 stabilizing activity will likely allow researchers to exploit inhibition of both NCp7 and RT to the fullest extent, and they may lead to the development of anti-HIV-1 drugs with new targets and mechanism of action.

■ METHODS
HIV-1 Recombinant Nucleocapsid Protein. The full-length recombinant nucleocapsid protein (NCp7) was prepared as previously reported. 49 The stock solution was stored in aliquots at −80°C until use. For each analysis, the lowest possible amount of protein was used.
Electrophoretic Mobility Shift Assay (EMSA). RNA oligonucleotides, labeled with [γ-32P]ATP using T4 polynucleotide kinase at 37°C for 30 min, were annealed by heating at 95°C for 5 min in lithium cacodylate (10 mM, pH 7.4) and KCl (50 mM) buffer and gradually cooled to room temperature. The annealed oligonucleotides at 15 nM final concentration were added to 20 μL of binding reaction (8% glycerol, 30 mM Tris-HCl, 15 mM MgCl2, 50 μM ZnCl2) containing appropriate concentrations of NCp7. For EMSA unfolding assays, labeled RNA oligonucleotides were annealed to form G4s, and cold DNA complementary oligonucleotides were added to the binding reactions at equimolar or 2-fold excess strand ratio. Binding reactions were incubated for the indicated time at 37°C in the presence of appropriate protein concentrations. Mixtures (80% of the sample) were then loaded on a 12% native polyacrylamide gel and run at 4°C for 90 min at 90 V. Gels were dried, exposed overnight, and visualized by phosphorimaging (Typhoon FLA 9000, GE Healthcare).
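As an illustration of how percentages such as the duplex fraction in Figure 3C can be derived from a phosphorimager scan, the following Python sketch normalizes background-corrected band intensities within one lane. It is not the quantification procedure used in this work: the function name `percent_species`, the uniform background subtraction, and the intensity values are assumptions made for the example.

```python
def percent_species(band_intensities, background=0.0):
    """Return each band's share (%) of the total signal in one gel lane.

    band_intensities: dict mapping a species label to its integrated intensity.
    background: hypothetical per-band background subtracted before normalizing.
    """
    corrected = {
        name: max(value - background, 0.0)
        for name, value in band_intensities.items()
    }
    total = sum(corrected.values())
    if total <= 0:
        raise ValueError("no signal left after background subtraction")
    return {name: 100.0 * value / total for name, value in corrected.items()}


# Hypothetical intensities (arbitrary units) for one lane of a native gel.
lane = {"free G4": 450.0, "RNA/DNA duplex": 650.0}
shares = percent_species(lane, background=50.0)
print({name: round(value, 1) for name, value in shares.items()})
# prints {'free G4': 40.0, 'RNA/DNA duplex': 60.0}
```

Normalizing within a lane, rather than against a reference lane, makes the result insensitive to loading differences, but it cannot account for material that aggregates and fails to enter the gel.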
Circular Dichroism (CD) Analysis. For CD analysis, the RNA oligonucleotides were diluted to 2 μM concentration in 10 mM lithium cacodylate buffer (pH 7.4) supplemented with 50 mM KCl. Samples were annealed by heating at 95°C for 5 min and gradually cooled to room temperature to allow G4 formation. When the unfolding properties of NCp7 were analyzed, the protein was added to the samples at a 10-fold NCp7/oligonucleotide ratio and incubated for 3 h before CD analysis. Where specified, B19 was added at 8 μM concentration 4 h after the annealing step, and the samples were placed at 4°C for 24 h to permit G4 stabilization. CD spectra were recorded on a Chirascan-Plus (Applied Photophysics, Leatherhead, UK) instrument equipped with a Peltier temperature controller using a quartz cell of 5 mm optical path length and an instrument scanning speed of 50 nm/min over a wavelength range of 230−320 nm. The reported spectrum of each sample represents the average of 2 scans, and it is baseline corrected for signal contributions due to the buffer. Observed ellipticities were converted to mean residue ellipticity (θ) = deg × cm² × dmol⁻¹ (mol ellip). Unfolding spectra were recorded over a temperature range of 20−90°C, while 90−20°C was used for annealing experiments, with a temperature increase/decrease rate of 1°C/min. Tm values were calculated according to the van 't Hoff equation, applied for a two-state transition from a folded to an unfolded state, assuming that the heat capacities of the folded and unfolded states are equal.
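A minimal sketch of a two-state van 't Hoff analysis of a CD melting curve is given below in Python. It assumes flat folded and unfolded baselines that are supplied by the user and a temperature-independent enthalpy (equal heat capacities of the two states); the function names, the 0.1−0.9 fitting window, and the synthetic curve are illustrative choices and do not reproduce the fitting software used in this work. The equilibrium constant K = [unfolded]/[folded] is obtained from the fraction folded, ln K is regressed against 1/T, and Tm is the temperature at which K = 1.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def two_state_signal(temp_c, tm_c, dh, theta_folded, theta_unfolded):
    """CD signal of a folded <-> unfolded equilibrium (delta Cp = 0 assumed)."""
    t_k, tm_k = temp_c + 273.15, tm_c + 273.15
    k_unfold = math.exp(dh / R * (1.0 / tm_k - 1.0 / t_k))  # K = [U]/[F]
    frac_folded = 1.0 / (1.0 + k_unfold)
    return theta_unfolded + (theta_folded - theta_unfolded) * frac_folded


def vant_hoff_tm(temps_c, signal, theta_folded, theta_unfolded, window=(0.1, 0.9)):
    """Estimate Tm (deg C) and dH (J/mol) from a linear van 't Hoff plot."""
    xs, ys = [], []
    for t_c, theta in zip(temps_c, signal):
        frac = (theta - theta_unfolded) / (theta_folded - theta_unfolded)
        if window[0] < frac < window[1]:  # keep the transition region only
            xs.append(1.0 / (t_c + 273.15))
            ys.append(math.log((1.0 - frac) / frac))  # ln K
    n = len(xs)
    if n < 2:
        raise ValueError("too few points inside the transition window")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals -dH/R
    intercept = mean_y - slope * mean_x  # equals dS/R
    tm_k = -slope / intercept            # Tm = dH/dS, where ln K = 0
    return tm_k - 273.15, -slope * R


# Synthetic, noise-free curve with hypothetical parameters (1 deg C steps).
temps = [float(t) for t in range(20, 91)]
curve = [two_state_signal(t, 68.3, 200e3, 10.0, 1.0) for t in temps]
tm_est, dh_est = vant_hoff_tm(temps, curve, theta_folded=10.0, theta_unfolded=1.0)
print(f"Tm = {tm_est:.1f} C, dH = {dh_est / 1000:.0f} kJ/mol")
```

With real data the baselines are usually sloped and noisy, so a nonlinear fit of the full curve is often preferred; the linearized form is shown here because it makes the two-state assumption explicit.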
Mass Spectrometry (MS) Analysis. The RNA oligonucleotides were diluted to 5 μM concentration in a final buffer composition consisting of 0.8 mM KCl, 120 mM trimethylammonium acetate (TMAA) adjusted from pH ∼ 7 to 7.4 with triethylamine (TEA). Samples were annealed by heating at 95°C for 5 min, gradually cooled to room temperature, and incubated overnight at 4°C. Where appropriate, NCp7 was added to the sample at a 1:1 protein/oligonucleotide ratio: the high sensitivity of MS allowed the use of a lower amount of protein compared to the CD analysis. At the time of analysis, a volume of 5 μL of each sample was typically scanned by direct infusion using electrospray ionization (ESI) on a Xevo G2-XS QTOF mass spectrometer (Waters, Manchester, UK). The ESI source settings were as follows: electrospray capillary voltage 1.8 kV; source and desolvation temperatures 45 and 65°C, respectively; sampling cone voltage 65 V. All these parameters ensured minimal fragmentation of the DNA complexes. The instrument was calibrated using a 2 mg/mL solution of sodium iodide in 50% isopropanol (IPA). Additionally, the use of the internal standard LockSpray (a solution of 1 μg/mL leu-enkephalin in acetonitrile/water (50:50, v/v) containing 0.1% formic acid) provided a typical <5 ppm mass accuracy. This high-resolution system allowed us to visualize the isotopic pattern, identify the charge state, and therefore unambiguously calculate the neutral mass of the detected species.
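The arithmetic behind charge-state assignment, neutral-mass calculation, and K+ counting can be sketched in Python as follows. This is an illustration only: negative-ion mode (ions of the form [M − zH]z−) is an assumption, since the polarity is not stated above, and the function names and all mass values are hypothetical. Each coordinated K+ is taken to replace one H+, so it shifts the neutral mass by the K−H mass difference; observed and reference masses must be on the same basis (both monoisotopic here).

```python
PROTON = 1.007276                      # proton mass, Da
C13_SPACING = 1.003355                 # 13C - 12C mass difference, Da
K_FOR_H_SHIFT = 38.963706 - 1.007825   # 39K replacing 1H, Da


def charge_from_isotope_spacing(mz_peaks):
    """Charge state from the mean m/z spacing of adjacent isotope peaks."""
    spacings = [b - a for a, b in zip(mz_peaks, mz_peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_SPACING / mean_spacing)


def neutral_mass(mz, z, negative_mode=True):
    """Neutral mass from m/z and charge; negative mode is assumed by default."""
    return z * (mz + PROTON) if negative_mode else z * (mz - PROTON)


def count_potassium(mass_observed, mass_free):
    """Number of K+ ions bound, from the shift relative to the K+-free species."""
    return round((mass_observed - mass_free) / K_FOR_H_SHIFT)


# Hypothetical example: an oligonucleotide of 9000 Da carrying two K+ ions, z = 5.
free_mass = 9000.0
z = 5
adduct_mass = free_mass + 2 * K_FOR_H_SHIFT
mz = adduct_mass / z - PROTON
isotope_peaks = [mz + i * C13_SPACING / z for i in range(4)]

print(charge_from_isotope_spacing(isotope_peaks),
      count_potassium(neutral_mass(mz, z), free_mass))
# prints 5 2
```

In the interpretation used in the Results, two coordinated K+ ions correspond to a three-layered G4, so a shift of the distribution toward 0 and 1 K+ ions is read as loss of the folded structure.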
Reverse Transcriptase (RT) Stop Assay. The DNA primer was 5′-labeled with [γ-32P]ATP using T4 polynucleotide kinase at 37°C for 30 min. The labeled primer (70 nM) was annealed to the RNA U3-III+IV in the presence of 50 mM KCl. The primer extension reaction was performed with recombinant HIV-1 reverse transcriptase (1 U/reaction; Calbiochem) in the provided buffer (50 mM Tris-HCl (pH 8.3), 75 mM KCl, 3 mM MgCl2) at 44°C for 1 h. Where specified, samples were incubated overnight with increasing concentrations of B19 (500 nM to 4 μM) at room temperature before primer extension. When NCp7 was used, appropriate concentrations (350 nM to 1.5 μM) were added immediately before the elongation reaction. Reaction products were treated with NaOH (2 N) at 95°C for 3 min to permit the alkaline hydrolysis of RNA, and the pH was adjusted to neutrality with HCl (2 N). Samples were ethanol precipitated, and extension products were separated on 16% denaturing

Author Contributions
The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Notes
The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS
We are thankful to D. Fabris (University of Albany, USA) for helpful discussions. This work was supported by grants to S.N.R. from the European Research Council (ERC Consolidator grant number 615879) and the Bill and Melinda Gates Foundation (grant numbers OPP1035881, OPP1097238).