A Mild, DNA-Compatible Nitro Reduction Using B2(OH)4

A hypodiboric acid system for the reduction of nitro groups on DNA–chemical conjugates has been developed. This transformation provided good to excellent yields of the reduced amine product for a variety of functionalized aromatic, heterocyclic, and aliphatic nitro compounds. DNA tolerance to reaction conditions, extension to decigram scale reductions, successful use in a DNA-encoded chemical library synthesis, and subsequent target selection are also described.


General Information
Some of the general materials, equipment, and procedures used in this study are adapted from those our group has reported previously¹ or from other DNA-encoded library publications.²
1a. Materials and equipment used for the synthesis and analysis of oligonucleotides and DNA-encoded chemical libraries. The chemically modified DNA oligonucleotide DTSU ("DEC-Tec Starting Unit", S1, Figure S1) and 5'-phosphorylated oligonucleotides were obtained from LGC Biosearch Technologies. All DNA oligonucleotides were assessed for purity through the general analytical procedure, and sequences were designed on principles intended to maximize sequence reads. High-concentration T4 DNA ligase was obtained from Enzymatics (Qiagen) and its activity was determined through test DNA oligomer ligations on DTSU. Chemical building blocks and reagents were sourced from a variety of vendors and were generally used from aliquots dissolved in acetonitrile or mixed aqueous acetonitrile solutions. Building block aliquots for DECL synthesis were stored in Tracetraq barcoded tubes (Biosero) with either screw caps or septa caps. Barcoded tubes were read using a SampleScan 96 scanner (BiomicroLab) and decoded using Vortex software (Dotmatics). All buffers and ionic solutions, including HEPES 10X ligation buffer (300 mM 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid, 100 mM MgCl2, 100 mM dithiothreitol, 10 mM adenosine triphosphate, pH 7.8), aq. NaOH (1.5 M), aq. NaCl (5 M) and basic borate buffer (250 mM sodium borate/boric acid, pH 9.5), were prepared in-house. DNA working solutions were prepared using DNAse free ultrapure water (Invitrogen), HPLC-grade acetonitrile (Fisher) or high-purity absolute ethanol (Koptec). LC/MS running solvents were made from Optima LC/MS grade water (Fisher), Optima LC/MS grade methanol (Fisher), 99+% purity hexafluoroisopropanol (Sigma) and HPLC-grade triethylamine (Fisher).
Solutions were generally transferred or pooled using Biotix brand pipette tips and reservoirs (various sizes). Reactions were generally performed in polypropylene 96-well deep-well plates (USA Scientific, various sizes), polypropylene 96-well PCR plates (Phenix), or polypropylene Eppendorf tubes (various brands). Plates were sealed for incubation with AlumaSeal II foil seals (Excel Scientific), and large-volume DNA precipitations were performed in polypropylene 250 mL screw-cap bottles, Eppendorf tubes, or Falcon tubes (from various vendors). Heated reactions were performed in ep384 Mastercyclers (Eppendorf), benchtop heating blocks, or laboratory ovens (Fisher). Solutions were centrifuged in either Avanti J-30I or Allegra X-15R centrifuges (Beckman-Coulter). Optical density measurements were made using a Biophotometer (Eppendorf). A Vanquish UHPLC system was integrated with an LTQ XL ion trap mass spectrometer (ThermoFisher Scientific) for LC/MS analysis of oligonucleotides.
Figure S1. Structure of "DTSU" S1 (5'-Phos-CTGCAT-Spacer 9-Amino C7-Spacer 9-ATGCAGGT 3').
1b. General analytical procedure for oligonucleotides. Samples were analyzed on a Thermo Vanquish UHPLC system coupled to an electrospray LTQ ion trap mass spectrometer. An ion-pairing mobile phase comprising 15 mM TEA/100 mM HFIP in a water/methanol solvent system was used with a Thermo DNAPac RP oligonucleotide column (2.1 × 50 mm, 4 µm) for all separations. All mass spectra were acquired in full-scan negative-ion mode over the mass range 500–2000 m/z. Data analysis was performed by exporting the raw instrument data (.RAW) to automated biomolecule deconvolution and reporting software (ProMass), which uses a novel algorithm known as ZNova to produce artifact-free mass spectra. Deconvoluted mass spectra were standardized against concurrently run samples of DTSU S1 and HP S2 to account for any drift from theoretical mass during deconvolution.
1c. General procedure for ethanol precipitation and DNA reconstitution. To a DNA reaction mixture was added 5% (v/v) 5 M NaCl solution and 2.5−3 times the volume of absolute ethanol. The colloidal solution was then incubated at −20 °C overnight. After centrifugation, the supernatant was decanted, the pellet was centrifuged with 70% aq. ethanol, the supernatant again was decanted and the DNA pellet was dried in air or under gentle vacuum. Water was added to reconstitute the DNA to 0.5-1 mM. Ethanol precipitation was generally performed after each chemical reaction.
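As a quick illustration of the volumes involved in the precipitation step, the following minimal sketch computes the NaCl and ethanol additions for a given reaction volume. It assumes that both the 5% (v/v) NaCl and the 2.5 to 3 volumes of ethanol are calculated from the starting reaction volume; the function name and defaults are illustrative and not part of the original procedure.

```python
def ethanol_precipitation_volumes(reaction_ul, ethanol_fold=2.5, nacl_fraction=0.05):
    """Return (uL of 5 M NaCl, uL of absolute ethanol) for one precipitation.

    Assumption: both additions are calculated from the starting reaction volume.
    """
    if reaction_ul <= 0:
        raise ValueError("reaction volume must be positive")
    if not 2.5 <= ethanol_fold <= 3.0:
        raise ValueError("the procedure specifies 2.5 to 3 volumes of ethanol")
    nacl_ul = reaction_ul * nacl_fraction    # 5% (v/v) of 5 M NaCl
    ethanol_ul = reaction_ul * ethanol_fold  # 2.5 to 3 volumes of ethanol
    return nacl_ul, ethanol_ul


# Example: a 15 uL reaction precipitated with 2.5 volumes of ethanol
nacl, etoh = ethanol_precipitation_volumes(15.0)
print(f"{nacl:.2f} uL of 5 M NaCl, {etoh:.1f} uL of ethanol")
```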
1d. Representative general procedure for DNA ligation.
To DNA conjugate 7a (1.5 nmol, 5 μL, 1.0 equiv) was added DNA_1 (5'-CGGCTACAGTGT-3', 1.95 nmol, 1.95 μL, 1.3 equiv), DNA_2 (5'-ACTGTAGCCGGA-3', 1.95 nmol, 1.95 μL, 1.3 equiv), and nuclease-free water (3.6 μL), followed by the addition of 10× HEPES buffer (1.5 μL) and T4 DNA ligase (1.0 μL). The reaction mixture was incubated at room temperature overnight. The crude material was purified by EtOH precipitation and then analyzed by gel electrophoresis. Gel electrophoresis was performed using precast 10% TBE acrylamide gels from Invitrogen (12 wells). The gel box was filled with 1× TBE buffer until the gel was covered. The purified DNA was diluted to a concentration of 12 ng/μL. To a tube was added 10 μL of one DNA sample and 2 μL of 6× DNA loading dye to make a loading sample. The first lane of the gel was loaded with a DNA molecular weight ladder, and 5 μL of a DNA–dye sample was loaded into each remaining lane. Gels were run at 160 V for 35 min and then stained in a container with 0.5 ng/mL ethidium bromide in 1× TBE buffer for 50 min. DNA fragments were visualized under a UV light device and assessed for complete ligation.
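The volumes in the representative ligation can be checked with a few lines of arithmetic. The sketch below sums the stated component volumes and derives the final buffer strength, the oligonucleotide equivalents, and the implied oligonucleotide stock concentration. The dictionary layout and function name are illustrative, and additive volumes are assumed.

```python
# Volumes (uL) and amounts (nmol) copied from the representative ligation.
CONJUGATE_NMOL = 1.5
components_ul = {
    "DNA conjugate 7a": 5.0,
    "DNA_1": 1.95,
    "DNA_2": 1.95,
    "nuclease-free water": 3.6,
    "10x HEPES buffer": 1.5,
    "T4 DNA ligase": 1.0,
}
oligo_nmol = {"DNA_1": 1.95, "DNA_2": 1.95}


def ligation_summary(components_ul, oligo_nmol, conjugate_nmol, buffer_stock_x=10.0):
    """Summarize a ligation mix, assuming the component volumes are additive."""
    total_ul = sum(components_ul.values())
    return {
        "total_ul": total_ul,
        # final buffer strength after dilution of the 10x stock
        "buffer_final_x": buffer_stock_x * components_ul["10x HEPES buffer"] / total_ul,
        # equivalents of each codon oligonucleotide relative to the conjugate
        "oligo_equiv": {k: v / conjugate_nmol for k, v in oligo_nmol.items()},
        # implied stock concentration: nmol / uL = mM
        "oligo_stock_mM": {k: v / components_ul[k] for k, v in oligo_nmol.items()},
        "conjugate_final_mM": conjugate_nmol / total_ul,
    }


summary = ligation_summary(components_ul, oligo_nmol, CONJUGATE_NMOL)
for key, value in summary.items():
    print(key, value)
```

Under these assumptions the mix totals 15 μL, which dilutes the 10× HEPES buffer to 1× and gives 1.3 equiv of each oligonucleotide from a 1 mM stock.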
1e. Elaboration of "DTSU" S1 to "HP" S2 or S3 for substrate preparation. All substrates were prepared on DTSU S1 that had been further elaborated by the ligation of additional oligonucleotides and a longer amino-terminating linker. This elaborated DNA, "HP" S2, was prepared through ligation of two duplexed 11-mer oligonucleotides through the general DNA ligation procedure (final sequence: 5' d TGA GTG AAT ACC TGC AT -Spacer 9-Amino C7-Spacer 9-ATG CAG GTA TTC ACT GAG G 3'), followed by amidation with Fmoc-15-amino-4,7,10,13-tetraoxapentadecanoic acid through the general acylation procedure and Fmoc deprotection. A special, longer DNA oligonucleotide, S3, was prepared for ligation tests through ligation of two duplexed 39-mer oligonucleotides by the general procedure (final sequence: 5' d TAT GAT ACT AAA GTA AGT CAC ACA CAA TTG GAG CAG TCC TGA GTG AAT ACC TGC AT -Spacer 9-Amino C7-Spacer 9-ATG CAG GTA TTC ACT GAG GAC TGC TCC AAT TGT GTG TGA CTT ACT TTA GTA TCA TAT C 3').

Nitro reduction
2a) Preparation of nitro substrates.
From HP 2, compounds 1a–1g, 1j–1o, 1x, 1z–1ee, and 1gg–1ll were prepared with the appropriate acid building block through the general acylation procedure; compounds 1h, 1i, 1v, 1w, and 1y were prepared with the appropriate sulfonyl chloride building block through the general sulfonylation procedure; compounds 1p, 1q, and 1u were prepared with the appropriate aldehyde through the general reductive alkylation procedure; compounds 1r–1t were prepared through the general nucleophilic aromatic substitution procedure; compound 1k was prepared using urea formation method a, and 1l was prepared using urea formation method b; compound 1ff was prepared using the carbamate formation procedure.
2b) General procedure for nitro reduction. To a solution of DNA-conjugated nitro substrate (3 nmol, 3 μL, 1.0 mM, 1 equiv) in H2O was added NaOH (1500 nmol, 1 μL, 1500 mM, 500 equiv), H2O (2 μL) and EtOH (4.5 μL), followed by B2(OH)4 (450 nmol, 4.5 μL, 100 mM in H2O, 150 equiv). The reaction mixture was incubated at room temperature for 2 h prior to EtOH precipitation. The solution of B2(OH)4 in neutral water was freshly prepared by vortexing or brief sonication before use. B2(OH)4 was purchased from Ark Pharm, Inc. and used without further purification. When a subsequent ligation is planned, small molecules and salts should be removed as completely as possible beforehand, as large residual amounts may inhibit DNA ligase. In DECL production, a double ethanol precipitation is recommended.
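The stoichiometry of the general procedure can be expressed as a small calculation. The following sketch scales the 3 nmol recipe linearly to another amount of DNA and reports the resulting final concentrations. Linear scaling and additive volumes are assumptions made here for illustration; the function name is hypothetical, and larger-scale reductions may well use different conditions.

```python
def nitro_reduction_recipe(dna_nmol, dna_mM=1.0):
    """Scale the 3 nmol general procedure linearly to `dna_nmol` of substrate.

    Assumptions: linear scaling and additive volumes (illustrative only).
    """
    scale = dna_nmol / 3.0
    naoh_nmol = 500 * dna_nmol    # 500 equiv NaOH
    b2oh4_nmol = 150 * dna_nmol   # 150 equiv B2(OH)4
    volumes_ul = {
        "DNA substrate": dna_nmol / dna_mM,      # nmol / mM = uL
        "NaOH (1500 mM)": naoh_nmol / 1500.0,
        "H2O": 2.0 * scale,
        "EtOH": 4.5 * scale,
        "B2(OH)4 (100 mM)": b2oh4_nmol / 100.0,
    }
    total_ul = sum(volumes_ul.values())
    final = {
        "DNA_mM": dna_nmol / total_ul,
        "NaOH_mM": naoh_nmol / total_ul,
        "B2(OH)4_mM": b2oh4_nmol / total_ul,
        "EtOH_percent_v/v": 100.0 * volumes_ul["EtOH"] / total_ul,
    }
    return volumes_ul, total_ul, final


volumes, total, final = nitro_reduction_recipe(3.0)
print(volumes)
print(total, final)
```

For the stated 3 nmol scale this reproduces the listed volumes (15 μL total), corresponding to roughly 0.2 mM DNA, 100 mM NaOH, 30 mM B2(OH)4, and 30% (v/v) ethanol in the reaction mixture.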

Synthesis of a DNA-Encoded Chemical Library using Nitro reduction
7a. Architecture of the Main Library build. The DECL build described is a three-cycle library, produced through three iterative cycles, each containing a chemical reaction phase, an encoding DNA oligonucleotide (codon) ligation phase, and a final pooling phase. The library is constructed on "HP" S2 (shown here as a combination of DTSU, first overhang, forward primer unit, and second overhang), which had been further diversified on the "small molecule end" with a series of different amino- or carboxy-terminating linkers. Overhangs between codons are two base pairs and encoding regions within codons are eleven base pairs. Specific details and principles related to the overall oligonucleotide sequence design utilized in our DECL production pipeline have been discussed previously.¹ᶜ
Figure S98. The structure of a completed Main Library Build. Separately assembled/ligated oligonucleotides (codons) are shown in alternate colors.
7b. General procedures utilized in the DECL build. The previously listed general information and procedures for materials, oligonucleotide analysis, ethanol precipitation, and ligation were utilized in the build of this library (see SI, sections 1a-1d). In addition, cycle 2 and cycle 3 reactions were monitored by use of a cholesterol-tagged DNA oligomer ("spike in") with very different LC-MS retention to assess post-pool reaction completion. Other general procedures related to chemical transformations were performed as follows.
"Reverse" acylation of amines with on-DNA carboxylic acids: To carboxylic acid-terminated DNA (35 nmol, 32 μL, 1 equiv, soln in H2O) in pH 5.8 MES buffer (17500 nmol, 35 μL, 500 equiv) were added additional water (29.25 μL) and acetonitrile (43.75 μL), followed by an amino building block (3500 nmol, 17.5 μL, 100 equiv, 200 mM in CH3CN) and DMTMM (3500 nmol, 17.5 μL, 100 equiv, 200 mM in H2O), and the solution was incubated overnight at room temperature. The reactions were then assessed for completion and/or precipitated by the general procedures.
Reductive amination of aldehydes with on-DNA amines using NaBH4: To amino-terminated DNA (47.6 nmol, 50 μL, 1 equiv, soln in H2O) in pH 9.5 borate buffer (23810 nmol, 95 μL, 500 equiv) were added additional acetonitrile (20 μL) and an aldehyde building block (4762 nmol, 23.8 μL, 100 equiv, 200 mM in CH3CN), and the solution was allowed to stand at room temperature for 1 h. NaBH4 (4762 nmol, 47.6 μL, 100 equiv, 100 mM in CH3CN) was added and the solution was incubated at room temperature overnight. The reactions were then assessed for completion and/or precipitated by the general procedures.
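Each reagent entry in these procedures lists an amount, a volume, a stock concentration, and an equivalents value, so the entries can be cross-checked against one another. The sketch below audits the reductive amination entries using the relation μL × mM = nmol. It assumes the pH 9.5 borate buffer is the 250 mM stock described in section 1a; the helper names are illustrative.

```python
DNA_NMOL = 47.6  # amino-terminated DNA in the reductive amination procedure

# (reagent, stated nmol, stated uL, stock mM, stated equiv)
# The 250 mM borate concentration is assumed from the buffer listed in section 1a.
entries = [
    ("borate buffer", 23810.0, 95.0, 250.0, 500),
    ("aldehyde", 4762.0, 23.8, 200.0, 100),
    ("NaBH4", 4762.0, 47.6, 100.0, 100),
]


def relative_error(stated, computed):
    """Relative deviation of a recomputed value from the stated value."""
    return abs(stated - computed) / stated


def audit(entries, dna_nmol):
    """Compare stated nmol with volume x concentration and with equiv x DNA."""
    report = {}
    for name, nmol, ul, mM, equiv in entries:
        report[name] = {
            "amount_error": relative_error(nmol, ul * mM),          # uL x mM = nmol
            "equiv_error": relative_error(nmol, equiv * dna_nmol),  # equiv x nmol DNA
        }
    return report


report = audit(entries, DNA_NMOL)
for name, errors in report.items():
    print(name, errors)
```

With these inputs every stated amount agrees with the recomputed value to well within 1%, the small differences being consistent with rounding of the listed volumes.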

Nucleophilic aromatic substitution with DABCO:
To amino-terminated DNA (47.62 nmol, 50 μL, 1 equiv, soln in H2O) were added NaOH (47620 nmol, 9.52 μL, 1000 equiv, 5 M in H2O) and additional water (100 μL). Then an electrophilic dihaloarene (14286 nmol, 71 μL, 300 equiv, 200 mM in CH3CN) and DABCO (476 nmol, 4.76 μL, 10 equiv, 100 mM in CH3CN) were added and the solution was incubated at room temperature overnight. The reactions were then assessed for completion and/or precipitated by the general procedures.
Figure S99. Synthetic sequence of the Main Build. Cycle 1 is shown in blue, cycle 2 is shown in red and cycle 3 is shown in green.
Procedure for Cycle 1. Several linker-functionalized S2 substrates (35 nmol/well) were plated individually into wells of 96-well plates. N-Boc diamines (183) and nitroanilines (23) were attached by "reverse" acylation onto two different carboxylic acid-terminated DNA substrates in separate wells. N-Boc amino acids (57) and nitrobenzoic acids (23) were acylated onto four different amine-terminated DNA substrates in separate wells. Nitrobenzaldehydes (14) were attached by reductive amination to four different amine-terminated DNA substrates in separate wells. In addition, blank wells encoding the addition of no building block or reagents were included. After precipitation, each chemical transformation was encoded by the ligation of a pair of 13-mer duplexed DNA oligonucleotides
88831) were added to the incubation to instantaneously pull down the protein. Beads were washed once with the aforementioned selection buffer without salmon sperm DNA by brief, vigorous vortexing. The binding molecules were then eluted by heating the beads at 80 °C for 10 min in washing buffer. The resulting eluent was quantitated by qPCR to determine whether another round of selection was needed. A total of 3 rounds of selection were performed for PLK1, and the oligonucleotides in the final eluent were PCR amplified for 18 cycles (NTC) and 17 cycles (PLK1) using Platinum Taq DNA Polymerase High Fidelity (Thermo Fisher Scientific, 10966026) with denaturation at 95 °C, annealing at 58 °C, and extension at 72 °C, using primers that incorporate sequences complementary to the library headpiece or tailpiece along with the Illumina READ 1 or READ 2 sequences required for Illumina clustering and sequencing. The amplified DNA was cleaned with Agencourt AMPure XP beads and quantitated with an Agilent high sensitivity DNA kit using a Bioanalyzer before sequencing.

Sequencing analysis:
Raw DNA sequence reads (in the form of FASTQ files), quality metrics, and sequencing index-to-sample attribute value pairs were obtained from Illumina BaseSpace at the conclusion of sequencing. Samples were linked to their respective FASTQ files based on their sequencing index (DTSU) and were expanded into individual experiments if they were part of a larger pool. Individual samples were then decoded by perfectly matching individual oligonucleotide sub-structures without gaps and in the order defined by the known DNA encoding structure (Main Library Build). Valid DNA barcodes were annotated with the corresponding oligonucleotide sequence-to-building block lookup for each of the three codon cycles, which collectively represent a distinct small molecule within a specific DECL. The degenerate UMI (unique molecular identifier) portions of the DNA barcodes were accumulated into a list of UMIs for each unique codon tuple as a method to distinguish experimental from amplification events. Unique molecule counts were then evaluated using a directed-graph counting model as described previously.¹ᶜ The set of unique codon tuples with unique molecule counts was then aggregated across all possible combinations of codons (all n-synthons), and enrichment for each n-synthon was evaluated independently. Enrichment was evaluated with a normalized z-score metric, which normalizes for sampling and library diversity.¹ᶜ
The sequencing of the naïve DECL resulted in 18,568,815 complete barcodes read and 17,166,316 unique molecules sampled. This corresponded to a sampling factor of 21674.6 for the cycle 1 codons of the DECL, enabling accurate statistics for the cycle 1 codon distribution of the library. The codon distribution was evaluated by normalizing observed cycle 1 (c1) codon counts by the average cycle 1 codon count, and codon populations are reported as a percentage of the mean count.
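The counting and aggregation steps described above can be illustrated with a toy example. The sketch below is not the pipeline used in this work: it replaces the directed-graph counting model with a naive count of distinct UMIs per codon tuple, and the codon labels, UMIs, and function names are hypothetical. It then aggregates counts over every mono-, di-, and tri-synthon and normalizes cycle 1 codon counts to a percentage of the mean.

```python
from collections import defaultdict
from itertools import combinations


def count_unique_molecules(reads):
    """Naive UMI deduplication: count distinct UMIs per codon tuple.

    Stand-in for the directed-graph counting model cited in the text.
    """
    umis = defaultdict(set)
    for codons, umi in reads:
        umis[codons].add(umi)
    return {codons: len(seen) for codons, seen in umis.items()}


def aggregate_n_synthons(tuple_counts):
    """Sum unique-molecule counts over every combination of cycle positions."""
    totals = defaultdict(int)
    for codons, count in tuple_counts.items():
        for n in range(1, len(codons) + 1):
            for idxs in combinations(range(len(codons)), n):
                # key records the cycle number alongside each codon
                key = tuple((i + 1, codons[i]) for i in idxs)
                totals[key] += count
    return dict(totals)


def percent_of_mean(codon_counts):
    """Express each codon count as a percentage of the mean count."""
    mean = sum(codon_counts.values()) / len(codon_counts)
    return {codon: 100.0 * n / mean for codon, n in codon_counts.items()}


# Hypothetical decoded reads: ((cycle 1, cycle 2, cycle 3 codons), UMI)
reads = [
    (("A1", "B1", "C1"), "AAAA"),
    (("A1", "B1", "C1"), "AAAA"),  # same UMI: treated as a PCR duplicate
    (("A1", "B1", "C1"), "CCCC"),
    (("A1", "B2", "C1"), "GGGG"),
]
counts = count_unique_molecules(reads)
synthons = aggregate_n_synthons(counts)
print(synthons[((1, "A1"),)], synthons[((1, "A1"), (3, "C1"))])
print(percent_of_mean({"A1": 3, "A2": 1}))
```

Keying each n-synthon by (cycle, codon) pairs keeps codons from different cycles distinct even if they share a label, which is one simple way to evaluate each n-synthon independently.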
The PLK1-selected library as well as a Non-Target Control (NTC) library sample were amplified, sequenced, and analyzed for enrichment of each n-synthon in the DECL. The enrichment of each n-synthon was measured using a normalized z-score metric¹ᶜ and the Agresti-Coull estimation interval for proportions.⁵ The resulting enrichment values (labeled as "AC_zscore_n") were then compared by plotting the enrichment in the target sample against enrichment in the NTC sample (Figure S100). DECL members with significant binding affinity for the target are detected by observing their component n-synthons as enriched on the Target axis but not the NTC axis. Figure S100 highlights n-synthon components of the PLK1 hit series described in the text. Two di-synthons related to the hit series are shown in blue, which correspond to two different syntheses of the cycle 1 component: N-Boc deprotection and nitro reduction. Similarly, two tri-synthons in yellow and two mono-synthons in green correspond to components of the hit series arising from the two synthetic routes. Importantly, the measured enrichment of each pair of displayed n-synthons was comparable for the two synthetic routes in cycle 1.
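For readers unfamiliar with the statistics named above, the following sketch shows the standard Agresti-Coull interval for a proportion together with a generic binomial z-score. The z-score here is only a stand-in: the exact normalization behind "AC_zscore_n" is defined in the cited reference, not in this section. The threshold, the helper names, and the example values are hypothetical.

```python
import math


def agresti_coull_interval(count, total, z=1.96):
    """Agresti-Coull confidence interval for a proportion count/total."""
    n_adj = total + z * z
    p_adj = (count + z * z / 2.0) / n_adj
    half_width = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def binomial_zscore(count, total, expected_p):
    """Generic z-score of an observed count against an expected proportion.

    Hypothetical stand-in; not the normalized z-score metric of the cited work.
    """
    expected = total * expected_p
    sd = math.sqrt(total * expected_p * (1.0 - expected_p))
    return (count - expected) / sd


def target_specific(target_z, ntc_z, threshold=3.0):
    """List n-synthons enriched in the target sample but not in the NTC.

    The threshold of 3.0 is an arbitrary illustrative choice.
    """
    return [
        key for key, z in target_z.items()
        if z >= threshold and ntc_z.get(key, 0.0) < threshold
    ]


low, high = agresti_coull_interval(10, 100)
print(round(low, 3), round(high, 3))
print(binomial_zscore(50, 10000, 0.001))
print(target_specific({"a": 12.0, "b": 8.0, "c": 1.0}, {"a": 0.5, "b": 9.0}))
```

The last helper mirrors the comparison made in Figure S100: an n-synthon is of interest when it is enriched on the target axis but not on the NTC axis.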