The Evolution of DNA-Templated Synthesis as a Tool for Materials Discovery

Conspectus: Precise control over reactivity and molecular structure is a fundamental goal of the chemical sciences. Billions of years of evolution by natural selection have resulted in chemical systems capable of information storage, self-replication, catalysis, capture and production of light, and even cognition. In all these cases, control over molecular structure is required to achieve a particular function: without structural control, function may be impaired, unpredictable, or impossible. The search for molecules with a desired function is often achieved by synthesizing a combinatorial library, which contains many or all possible combinations of a set of chemical building blocks (BBs), and then screening this library to identify “successful” structures. The largest libraries made by conventional synthesis are currently of the order of 10^8 distinct molecules. To put this in context, there are 10^13 ways of arranging the 21 proteinogenic amino acids in chains up to 10 units long. Given that we know that a number of these compounds have potent biological activity, it would be highly desirable to be able to search them all to identify leads for new drug molecules. Large libraries of oligonucleotides can be synthesized combinatorially and translated into peptides using systems based on biological replication such as mRNA display, with selected molecules identified by DNA sequencing; but these methods are limited to BBs that are compatible with cellular machinery. In order to search the vast tracts of chemical space beyond nucleic acids and natural peptides, an alternative approach is required. DNA-templated synthesis (DTS) could enable us to meet this challenge. DTS controls chemical product formation by using the specificity of DNA hybridization to bring selected reactants into close proximity, and is capable of the programmed synthesis of many distinct products in the same reaction vessel.
By making use of dynamic, programmable DNA processes, it is possible to engineer a system that can translate instructions coded as a sequence of DNA bases into a chemical structure—a process analogous to the action of the ribosome in living organisms but with the potential to create a much more chemically diverse set of products. It is also possible to ensure that each product molecule is tagged with its identifying DNA sequence. Compound libraries synthesized in this way can be exposed to selection against suitable targets, enriching successful molecules. The encoding DNA can then be amplified using the polymerase chain reaction and decoded by DNA sequencing. More importantly, the DNA instruction sequences can be mutated and reused during multiple rounds of amplification, translation, and selection. In other words, DTS could be used as the foundation for a system of synthetic molecular evolution, which could allow us to efficiently search a vast chemical space. This has huge potential to revolutionize materials discovery—imagine being able to evolve molecules for light harvesting, or catalysts for CO2 fixation. The field of DTS has developed to the point where a wide variety of reactions can be performed on a DNA template. Complex architectures and autonomous “DNA robots” have been implemented for the controlled assembly of BBs, and these mechanisms have in turn enabled the one-pot synthesis of large combinatorial libraries. Indeed, DTS libraries are being exploited by pharmaceutical companies and have already found their way into drug lead discovery programs. This Account explores the processes involved in DTS and highlights the challenges that remain in creating a general system for molecular discovery by evolution.
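The figure of roughly 10^13 peptide sequences quoted in the Conspectus can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it simply counts ordered chains of length 1 to 10 drawn from 21 building blocks, with repeats allowed.

```python
def count_linear_sequences(n_building_blocks: int, max_length: int) -> int:
    """Count ordered chains of length 1..max_length (repeats allowed)."""
    return sum(n_building_blocks ** length for length in range(1, max_length + 1))


total = count_linear_sequences(21, 10)
print(f"{total:.2e}")            # 1.75e+13
print(f"{total / 10**8:.0f}")    # how many 10^8-member libraries this would fill
```

On this count, a conventional library of 10^8 molecules covers only a tiny fraction of the decapeptide space.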


■ INTRODUCTION
Two centuries of research has furnished chemists with the ability to synthesize a huge variety of molecular architectures based on organic and inorganic components and to create materials with new functions ranging from therapeutics to solar cells. While the majority of new molecules with precisely defined structures are "small" (i.e., <1000 Da), solid-phase synthesis techniques have made it possible to produce monodisperse macromolecules such as DNA, peptides and their analogues, 1,2 and advances in sequence-controlled polymerization continue. 3 While much work remains to be done, we now have access to a very large chemical space. Searching this space for new molecules capable of meeting challenges in human health, energy, and security is of vital importance. However, even the largest combinatorial libraries are many orders of magnitude too small to search even the most synthetically accessible regions of chemical space effectively. 4 A system capable of tackling the above challenge would need to (1) operate in parallel rather than in series, drastically reducing synthesis time; (2) use extremely small amounts of material, in order to bring costs down and render synthesis of very large libraries of compounds practical, while still allowing product selection and identification (typically below the detection limit of common analytical techniques such as mass spectrometry); (3) enable molecular evolution. Evolution is perhaps the most important innovation as it allows a very large chemical space to be sampled without the requirement to synthesize all possible molecules within that space. Sequential rounds of selection, mutation and resynthesis can allow for the identification of functional molecules that were not present in the initial compound library (Figure 1). 
While criterion (1) may be addressed by improvements in synthetic methods/technology, it is extremely difficult, if not impossible, to envisage how conventional combinatorial synthesis could address points (2) and (3).
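The benefit of criterion (3) can be illustrated with a toy simulation of repeated selection, amplification, and mutation. This is a minimal sketch under loudly stated assumptions: the eight-letter building-block alphabet, the hidden target sequence, the fitness score, and all parameter values are hypothetical and stand in for a real selection experiment.

```python
import random

ALPHABET = "ABCDEFGH"   # hypothetical set of 8 building blocks
TARGET = "HGFEDCBA"     # hypothetical optimum, unknown to the experimenter


def fitness(seq: str) -> int:
    """Toy selection score: number of positions matching the hidden optimum."""
    return sum(a == b for a, b in zip(seq, TARGET))


def mutate(seq: str, rng: random.Random, rate: float = 0.1) -> str:
    """Replace each building block by a random one with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else c for c in seq)


def evolve(rounds: int = 20, library_size: int = 200, keep: int = 20, seed: int = 0):
    """Return the best fitness observed in each round."""
    rng = random.Random(seed)
    library = ["".join(rng.choice(ALPHABET) for _ in TARGET)
               for _ in range(library_size)]
    history = []
    for _ in range(rounds):
        # Selection: keep the best-scoring members of the current library.
        survivors = sorted(library, key=fitness, reverse=True)[:keep]
        history.append(fitness(survivors[0]))
        # Amplification and mutation: copy survivors with random changes.
        offspring = [mutate(rng.choice(survivors), rng)
                     for _ in range(library_size - keep)]
        library = survivors + offspring
    return history


history = evolve()
print(history[0], history[-1])
```

Each round synthesizes only 200 sequences out of 8^8 (about 1.7 × 10^7) possibilities, yet the best score cannot decrease from round to round because the selected survivors are carried forward unchanged.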
Figure 1. Molecular evolution allows large chemical spaces to be searched efficiently. A library of instructions is translated into the corresponding library of molecular products, which are then selected for target properties (Round 1). The instructions for the enriched products are then amplified, mutated and translated again to generate a library of new products (some or all of which may not have been present in the original product library) which can be selected against to identify products with improved properties (Round 2). Repeated cycles of translation, selection, amplification, and mutation can enable the system to converge on an optimized product (Round N) without the need for every possible library member to be synthesized.

One method that has been developed to allow functional evolution is messenger RNA (mRNA) display 5,6 (Figure 2). Here, a combinatorial library of DNA sequences is converted by transcription into the corresponding mRNA library. Each mRNA strand is then modified by ligation of a puromycin-modified DNA strand to its 3′ end and translated into the corresponding peptide by in vitro ribosomal peptide synthesis (RPS). When a ribosome reaches the RNA/DNA junction at the end of an mRNA template it stalls: at this point the terminal puromycin, a peptidyl acceptor antibiotic, can enter the active site, causing the peptide product to be transferred to it. The resulting library of peptide products can comprise as many as 10^13 unique members, each of which is attached to its encoding mRNA sequence. After subjecting the library to selection, the mRNA attached to successful products can be reverse transcribed to the corresponding DNA sequences and amplified using the polymerase chain reaction (PCR). Mutation can be achieved by cutting members of the DNA library using restriction enzymes and then randomly recombining the fragments, or by error-prone PCR.
Multiple rounds of selection, mutation, and amplification allow many more peptide sequences, not present in the original library, to be explored. Eventually, a peptide that is highly optimized for a particular function can be identified. 7
mRNA display, the related techniques of ribosome 8 and phage 9 display, and DNA aptamer libraries 10 provide a solution to the selection and evolution problems identified above. However, techniques involving RPS are limited to peptides incorporating proteinogenic amino acids. Expanding the library of BBs to include nonnatural amino acids is possible but difficult as it involves engineering the translation machinery of cells, a nontrivial undertaking. 11 In order to truly revolutionize the way that we search chemical space, we need a system with the capacity of mRNA display for directed evolution but with fewer constraints on the chemical structures of the products. The purpose of this Account is to chart the development of just such a technology: DNA-Templated Synthesis (DTS). 12
The basic principle of DTS is illustrated in Figure 3. Reactive BBs are conjugated to short adapter strands of DNA. At suitably low concentrations (nM), reaction rates between BBs are negligible in the absence of DNA−DNA interactions. 13 However, if two of the DNA adapters hybridize to form a duplex their attached BBs are brought into close proximity, greatly increasing their effective local concentration and hence the rate of reaction. This mechanism allows for the selective activation of reactions in the presence of many reactive species in the same mixture, a feat not ordinarily possible in conventional synthetic chemistry. The use of a nucleic acid template to control synthesis has a precedent in RPS (Figure 3c) 14 in which peptide bond formation is directed by base pairing between aminoacyl transfer RNAs (aatRNAs) and an mRNA template. Its ability to direct multiple reactions in parallel means that DTS is capable of addressing criterion (1), a key challenge in combinatorial synthesis.

Accounts of Chemical Research

As Gartner and Liu realized nearly 20 years ago, DTS also has the potential to address the more difficult questions of product identification and molecular evolution. 13 The products of DTS are tagged with DNA: it is possible to design ribosome-inspired DTS systems that encode information about the order of addition of BBs in the base sequence of this DNA tag (Figure 4). 15 Following selection against a target, DNA amplification and sequencing methods can be applied to "read off" the reaction sequence, from which the chemical structure of the successful product can be inferred. It is important to note that due to the expense associated with synthesizing BB-DNA adapters, it is usually practical to make only very small amounts of product by DTS (usually on the order of picomoles). However, since amplification by PCR requires, in principle, only a single DNA molecule, product detection is still possible even at such small reaction scales (criterion (2)). Finally, molecular evolution could be achieved by iterated cycles of DTS, selection, amplification and mutation 13 (for example, by cutting or "restricting" the DNA into fragments then randomly recombining them). Translation is key to this process: the DNA tags attached to selected products must be capable of directing subsequent rounds of product synthesis. DTS thus has the potential for development into a tool to search efficiently and quickly a vast chemical space. In this Account, we outline the evolution of DTS toward this goal and the challenges associated with its development.
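The idea of "reading off" a reaction sequence from a DNA tag can be sketched as a simple lookup. The example below is an assumption-laden illustration, not a published encoding scheme: the four-base codon length, the codon table, and the building-block names are all hypothetical.

```python
CODON_LENGTH = 4
# Hypothetical codon table: each 4-base codon names one building block.
CODON_TABLE = {"ACGT": "BB1", "TTAG": "BB2", "GCCA": "BB3"}


def decode_tag(tag: str) -> list[str]:
    """Split a sequenced DNA tag into codons and look up each building block."""
    if len(tag) % CODON_LENGTH != 0:
        raise ValueError("tag length is not a whole number of codons")
    codons = [tag[i:i + CODON_LENGTH] for i in range(0, len(tag), CODON_LENGTH)]
    # A KeyError here signals a codon that is not in the table.
    return [CODON_TABLE[codon] for codon in codons]


print(decode_tag("TTAGACGTGCCA"))  # ['BB2', 'BB1', 'BB3']
```

In a real system the codons would need to be long enough, and different enough from one another, to hybridize selectively; that design problem is not addressed here.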

■ DNA-TEMPLATED CHEMISTRY
A simple example of DTS is the use of a DNA template to facilitate ligation of two DNA strands through a native phosphodiester bond 16 or a non-natural linkage, 12 as pioneered by the groups of Orgel, Liu, and many others. Numerous examples of bond-forming and bond-breaking reactions directed by DNA templates have been reported (Figure 5), 17 including Heck coupling, the copper-catalyzed azide−alkyne cycloaddition "click" reaction, transition metal-mediated catalysis, and synthesis of conductive polymers and macrocyclic drug-like molecules. Thanks to the work of Kool, Seitz, and others, there is a well-developed field of research into DNA/RNA probes based on fluorogenic reactions templated by a target nucleic acid. 18
Three different architectures are commonly used to bring BBs into close proximity (Figure 6). 19 In an "end of helix" design, the reactants coupled to each strand are brought together at the end of a double helix. In "cross nick", reactions take place across a gap between DNA adapters held on a template strand. "Junction"-based designs template reactions in small volumes where multiple DNA strands intersect; an example is the YoctoReactor reported by Hansen and coworkers (see below). 20
For programmed, multistep synthesis, perhaps the most useful DTS reactions are transfer reactions in which bond formation is coordinated with cleavage from one of the DNA adapters (Figure 5, blue box). Transfer reactions can facilitate autonomous, multistep DTS as they avoid steric problems caused by the accumulation of DNA adapters. An exemplar from nature is RPS (Figure 3c). Here, amino acid BBs are linked to transfer RNAs (tRNAs) by activated ester bonds. As the ribosome scans from codon to codon along an mRNA, the growing peptide chain is continually passed to the incoming tRNA (selected by the next codon in the mRNA program) by means of an acyl transfer reaction that coordinates peptide bond formation with cleavage from the penultimate tRNA.
Figure 4. Principle of product encoding and molecular evolution enabled by DTS. The base sequence of a DNA tag directs the synthesis of a product and defines its chemical structure. Selection against a target followed by amplification, shuffling of the instructions encoded in the DNA tag (restriction and recombination), and then resynthesis by another round of DTS allows the production of new products with improved properties. Molecular evolution is therefore possible.

Figure 5. Examples of reactions that have been performed on a DNA template. For acyl transfer X = S or N-hydroxysuccinimide. 17

Relatively few DTS transfer reactions have been reported to date. The two predominant examples in the literature are acyl transfer and Wittig olefination. Acyl transfer is useful as it enables the creation of peptidomimetic molecules, 21 and several research groups have used this reaction to create oligopeptides and for various other applications. 17 The limited stability of the activated ester BBs in solution can cause problems, however. 22,23 Wittig olefination results in the formation of a carbon−carbon double bond, and so allows the exploration of a different region of chemical space. It has been used for DTS of macrocycles 24−27 and linear oligomers (see ref 22 for a recent example). However, its broader application is limited by the stability of the phosphorane BBs, which can be oxidized in water. 22 Less commonly used transfer reactions include a modified Staudinger ligation, 28 native chemical ligation, 29 nucleophilic aromatic substitution 30 and a tetrazine-transfer reaction. 31 In combination, these reactions could be very useful for the introduction of specific functional groups during DTS. In our opinion, this avenue remains underexplored. However, with the current state of the art, multistep syntheses take around a day to complete, and the best yields per step are around 80%, resulting in rather low overall yields. Investigation of alternative transfer chemistries compatible with DTS conditions should be given high priority as the discovery of a highly efficient and versatile method for DNA-templated oligomer synthesis could make the development of autonomous systems analogous to the ribosome much more straightforward.
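The effect of a per-step yield of around 80% on a multistep synthesis can be made concrete with a short calculation. This sketch assumes, for illustration only, that every step has the same yield and that steps are independent; the function name is hypothetical.

```python
def overall_yield(step_yield: float, n_steps: int) -> float:
    """Overall yield of n_steps sequential reactions with equal per-step yield."""
    return step_yield ** n_steps


for n_steps in (3, 5, 10):
    print(n_steps, f"{overall_yield(0.8, n_steps):.1%}")
# 3 51.2%
# 5 32.8%
# 10 10.7%
```

Under these assumptions, ten steps at 80% leave roughly a tenth of the starting material as full-length product, which is why more efficient transfer chemistry matters for long oligomers.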

■ PRODUCT ENCODING
The idea of encoding the identity of a small-molecule product using an attached DNA sequence was first proposed by Brenner and Lerner 25 years ago. 15 DNA is an ideal identifying tag because it is straightforward to synthesize large libraries of unique oligonucleotides which can be sequenced to identify products. Its most useful feature, however, is its ability to be amplified by PCR, which has a limit of detection far below that of conventional analysis methods.
The original proposal was that solid-phase synthesis of a combinatorial library of target molecules (by repeatedly pooling then splitting support beads between different reactions) would proceed in parallel with the construction of DNA tags on each bead to encode the sequence of addition of BBs (Figure 7a). However, DNA serves only as a post hoc record of the reaction steps: it does not program synthesis, and cannot be used to direct the resynthesis of enriched products. As a result, this system is not suitable for the implementation of molecular evolution. An elegant alternative, termed "DNA routing", was devised by Halpin and Harbury (Figure 7b): 32 successive codons in a DNA "gene" are used to route a growing oligomer between reaction vessels, determining the sequence of BB coupling reactions and, therefore, the structure of the final product.
Using DNA-encoded chemical libraries (DECLs) for molecular discovery is advantageous because compounds can be selected from a pooled library as opposed to serial screening, enabling a 10^6-fold increase in library size. 32 Selection from DECLs has become a well-established method and has been used by pharmaceutical companies in drug discovery programs. 4 In both split-and-pool and DNA-routed syntheses, each reaction occurs in a different reaction vessel without the direct involvement of the DNA tag. These methods are thus distinct from DTS, in which reactions occur in the same pot and are programmed by DNA interactions. For this reason, we will not include them in our discussion below, but readers are directed to a recent paper illustrating the potential of DNA routing for molecular evolution. 33
DTS has been employed in a number of ways to create products tagged with a unique identifying DNA sequence. These approaches fall into three categories, which we have termed "templated parallel", "templated sequential" and "autonomous" (Figure 8). In each case, the use of DNA amplification and sequencing to confirm the identity of the DNA-tagged product oligomers has been demonstrated. 20,22,34
In the templated parallel approach, BB-DNA adapters are arrayed in sequence by hybridization to a DNA template (Figure 8a). Template domains act as codons, each of which uniquely specifies a single BB. The BBs are then chemically linked to each other and released from the now-redundant adapters. Niu and co-workers elegantly demonstrated this idea by connecting BBs to peptide nucleic acid (PNA) adapters 36 via cleavable linkers. Upon completion of the synthesis, the product was liberated while remaining tagged at one end with the templating DNA sequence. 34 Zhu and coworkers have also applied the templated parallel approach to produce "nylon DNA" using amide condensation reactions. 37
The templated sequential approach provides a more flexible but laborious route to oligomer synthesis. As in the templated parallel approach, the DNA template provides an ordered array of binding sites for BB-DNA adapters. However, the assembly of the BBs on the template, and hence the BB transfer reactions, occurs sequentially in this case, generally at the terminus of the template, and is controlled externally by strand-displacement reactions that bring successive reactants into close proximity with the growing oligomer (Figure 8b). 38−41 Again, the product remains covalently attached to the DNA template, which can encode its chemical structure.
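The way a template arrays adapters by hybridization can be sketched as reverse-complement matching between template codons and adapter sequences. This is a simplified illustration under stated assumptions: the adapter sequences, the four-base codon length, and the building-block names are hypothetical, and real adapters (DNA or PNA) would be longer and would not hybridize with perfect fidelity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical adapters: adapter sequence -> attached building block.
ADAPTERS = {"AACC": "BB1", "GGTA": "BB2", "CTGA": "BB3"}


def assemble(template: str, codon_length: int = 4) -> list[str]:
    """List the building blocks arrayed along the template, codon by codon."""
    product = []
    for i in range(0, len(template), codon_length):
        codon = template[i:i + codon_length]
        # The adapter that binds a codon is its reverse complement.
        product.append(ADAPTERS[reverse_complement(codon)])
    return product


print(assemble("TACCGGTTTCAG"))  # ['BB2', 'BB1', 'BB3']
```

Changing the order of codons in the template changes the order of building blocks in the product, which is the sense in which the template acts as a program.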
Finally, autonomous systems use a DNA "program" to control the sequential addition of BBs without the need for external intervention. One example of this approach couples motion of a DNA "walker" with chemical reactions between BBs (Figure 8c, upper scheme). 35 The walker, a single strand of DNA, moves along a track consisting of an array of single-stranded anchorages. At each step the walker catalyzes cleavage of the anchorage to which it is bound, thereby initiating a strand-displacement reaction that transfers it to the next anchorage. In a sequence programmed by the track, BBs attached to the anchorages are transferred to the growing oligomer attached to the walker. In principle, the final product could be ligated to a template strand on which the track is built to enable the sequence of BBs to be read off.
We recently reported a second example of autonomous DTS, using a hybridization chain reaction (HCR) to bring BBs into proximity with the growing oligomer in sequence 22 (Figure 8c, lower scheme). The DNA components are hairpins formed by partially self-complementary strands. A staggered duplex is formed by HCR between the hairpins, in which the sequence of hairpin addition is controlled by hybridization between complementary "toehold" domains. A set of "instruction" hairpins programs the sequence in which "chemistry" hairpins are incorporated and thus the sequence in which BBs coupled to these hairpins are added. The growing oligomer is carried forward on a strand of DNA that remains at the reactive end of the duplex. Ligation of the "instruction" hairpins creates a DNA-encoded record of the reaction sequence.

■ EXTENDING THE LENGTH OF SEQUENCE-CONTROLLED OLIGOMERS
Extending the length of sequence-controlled oligomers that can be synthesized is important for two reasons. First, product diversity increases rapidly with oligomer length. For example, a library of 1 billion trimers requires 1000 distinct BBs, while a comparable library of decamers needs only eight. Second, sequence-controlled macromolecules are a "holy grail" of polymer science as perfect control may allow the discovery of synthetic polymers with well-defined folded conformations, analogous to native proteins, with greatly enhanced properties.
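The trimer and decamer figures quoted above follow from the fact that m building blocks give m^n distinct sequences of length n. The short check below uses a hypothetical helper name and a plain integer search, so it avoids floating-point root errors.

```python
def min_building_blocks(library_size: int, oligomer_length: int) -> int:
    """Smallest m such that m ** oligomer_length >= library_size."""
    m = 1
    while m ** oligomer_length < library_size:
        m += 1
    return m


print(min_building_blocks(10**9, 3))   # 1000
print(min_building_blocks(10**9, 10))  # 8
```

Eight building blocks give 8^10 (about 1.07 × 10^9) decamers, just over the one-billion target.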
The mechanism and architecture used in DTS determines the maximum number of BBs that can be concatenated. For example, the YoctoReactor restricts the number of reactants that can be colocalized and thus cannot produce products longer than tetramers. 20 By using the templated parallel approach, Niu and coworkers were able to synthesize long, precisely defined polymers by templating the concatenation of up to ten BBs that were themselves oligomers (of amino acids or ethylene glycol subunits). 34 This method therefore makes it possible to explore the structure−function relationships of artificial polymers similar in length to proteins; this is extremely important for the development of artificial enzymes, for example. However, the system cannot encode variability within the oligomeric BBs, so only a fraction of possible sequences of the subunits could be explored.
Figure 8 (caption fragment). [...] 35 which steps down a track (driven by ribozyme-catalyzed cleavage of the track anchorages) picking up BBs in a programmed order. Lower: the HCR system, developed in our laboratories, 22 which coordinates programmed DNA polymerization with oligomer assembly. Complementary "toehold" domains, whose hybridization controls the reaction sequence, are identified by color.

Strategies for oligomer synthesis using the templated sequential method can be limited by the lengths of adapter and template that it is practical to synthesize, usually around 200 nucleotides. Alternative approaches in which the adapter length is kept constant usually result in the reactive site at the distal end of the oligomer moving further and further from the new BB as the synthesis proceeds, potentially limiting yield. As a result, hexamers are currently the longest oligomers that have been produced using these methods. 40
To prevent the DNA mechanism from imposing limits on oligomer length we have developed a simpler strand-displacement mechanism (Figure 9a). 42 Here, the growing oligomer is transferred to the incoming BB which remains attached to its adapter, as in the ribosome. The spent adapter is then removed by strand displacement, making way for hybridization of the next DNA-BB adapter. Adapters are distinguished by the unhybridized toehold domain used to drive their eventual displacement so they can all be designed to be the same length. Using this method we have so far been able to demonstrate the construction of decamers which, with those synthesized using the templated parallel method described above, are currently the longest oligomers constructed by DTS. 43 However, control over reaction sequence requires the sequential addition of BBs. A more sophisticated system (Figure 9b) uses the serial addition of instruction strands to control reactions within a vessel containing a mixture of all BBs. 44 Neither method lends itself to encoding the identity of the product in the final DNA tag, however.
The autonomous systems described above have perhaps the greatest potential for the synthesis of long oligomers by DTS. The DNA walker 35 and HCR 22 systems have significant potential for optimization and extension, but they have yet to realize sequence-controlled synthesis of products longer than tetramers.

■ COMBINATORIAL SYNTHESIS BY DTS
Gartner and Liu first demonstrated the potential of DTS for combinatorial synthesis by templating 1025 distinct thiol−maleimide additions in a single pot. 13 The authors expanded this approach to produce a DNA-templated library of 65 macrocycles, 24 and later larger libraries of 13 000 26 and 160 000 similar molecules (Figure 10a). 45 The library size in these and similar systems is ultimately limited by the number of orthogonal adapter sequences required, as a unique adapter is needed to encode not only the identity of each BB but also each possible position of that BB within the product (as with DNA routing 33).
Li and co-workers exploited the non-natural base deoxyinosine, which forms base pairs almost indiscriminately with all four natural bases, in order to simplify the design of adapters. 46 This enabled the use of a single "universal" template for the combinatorial synthesis of a model library of 114 688 distinct products, with the identity of the products encoded in DNA regions opposite deoxyinosine tracts (Figure 10b). However, the use of a universal template means that mutation is not possible.
The YoctoReactor has been used to generate libraries comprising more than 10^7 unique members. 47 This method, as well as a related approach described by Cao and co-workers, 41 simplifies adapter design by decoupling the DNA domain encoding BB identity from that directly involved in DNA templating (Figure 10c). However, in both cases a BB requires a different adapter for each position in the product. The preparation of all oligomers of length n therefore requires the synthesis of n different DNA-linked versions of the same BB, making this methodology time-consuming and limited in flexibility.
Our work on HCR goes some way toward solving issues related to adapter sequence design and BB interchangeability, since the identity of a BB is encoded in the base sequence of the chemistry hairpin loop and any chemistry hairpin can be added at any point in the sequence (Figure 10d). Using the HCR approach with a branching (nondeterministic) synthesis program we were able to demonstrate the combinatorial synthesis of a library of 12 different products 22 and are working toward the synthesis of larger libraries.
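The practical cost of position-specific adapters, compared with adapters that can be used at any position, can be illustrated by counting the DNA-linked reagents that must be prepared. The numbers and function names below are hypothetical and chosen only for illustration; they do not describe any published library.

```python
def library_size(n_bbs: int, length: int) -> int:
    """Number of distinct oligomers of a given length from n_bbs building blocks."""
    return n_bbs ** length


def adapters_position_specific(n_bbs: int, length: int) -> int:
    """One DNA-linked version of each BB for every position in the product."""
    return n_bbs * length


def adapters_position_independent(n_bbs: int) -> int:
    """One DNA-linked version of each BB, usable at any position."""
    return n_bbs


n_bbs, length = 100, 4  # hypothetical values
print(library_size(n_bbs, length))                # 100000000
print(adapters_position_specific(n_bbs, length))  # 400
print(adapters_position_independent(n_bbs))       # 100
```

The library size is the same in both cases; only the number of conjugates to be synthesized differs, and the gap widens in proportion to oligomer length.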
Given that the largest combinatorial libraries synthesized by DTS (more than 10^7 members) are still several orders of magnitude smaller than the libraries of up to 10^13 members generated by mRNA display, 9 reaching libraries of this size remains a key target.

■ SELECTION OF FUNCTIONAL PRODUCTS
The principle of functional product selection from DNA-tagged libraries is illustrated in Figure 11. The most widely used method involves the incubation of a library with an immobilized target followed by stringent washing to remove any products that do not bind. The DNA tags of selected products are then amplified by PCR and sequenced: the chemical structures of the successful binders can be inferred from the DNA sequence. This approach has the advantage that it may not be necessary to remove unreacted DNA adapters/templates from the reaction mixture since these will be removed during the washing step, simplifying library synthesis.
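One common way to interpret sequencing data from such a selection is to compare how often each tag appears before and after exposure to the target. The sketch below assumes, hypothetically, that reads are available from both the starting library and the selected pool; the tag sequences and read counts are invented for illustration and the function name is not taken from any published pipeline.

```python
from collections import Counter


def enrichment(before: list[str], after: list[str]) -> dict[str, float]:
    """Ratio of each tag's read fraction after selection to its fraction before."""
    if not before or not after:
        raise ValueError("both read lists must be non-empty")
    pre, post = Counter(before), Counter(after)
    return {
        tag: (post[tag] / len(after)) / (pre[tag] / len(before))
        for tag in pre
    }


# Invented read lists: two tags, equally abundant before selection.
before = ["AAAA"] * 50 + ["CCCC"] * 50
after = ["AAAA"] * 90 + ["CCCC"] * 10
scores = enrichment(before, after)
print(scores)
```

A ratio above 1 marks a tag that became more abundant after washing, and so a candidate binder; real data would also need corrections for PCR bias and sampling noise, which are not modeled here.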
Many of the compound libraries produced by DTS have been exposed to selection experiments, resulting in the identification of inhibitors 25,27,47 and antagonists 45 of several important biological targets including kinases and apoptosis inhibitors. These examples demonstrate the great potential of DTS in drug discovery, but the libraries involved remain limited by their relatively small sizes. To our knowledge there are as yet no therapeutics discovered using DTS libraries that have made it to market, although both Vipergen and Ensemble Therapeutics are working toward this end.
Libraries produced by DTS are constrained by the requirement that product synthesis be water-compatible. In contrast, DECLs produced by solid phase methods encounter no such limitation since BB conjugation, DNA tag extension and hybridization reactions can be carried out in different solvents (Figure 7). Since comparable library sizes are achievable with both, it is perhaps not surprising that the adoption of DTS by the pharmaceutical industry has been slower, in spite of the promise shown by functional selection experiments. 4

■ CONCLUSIONS AND FUTURE CHALLENGES
Over the past 20 years, DTS has developed enormously. A diverse range of chemical reactions can now be directed by DNA templates. Different template architectures allow the synthesis of oligomeric and macrocyclic products. Mechanisms for autonomous DTS have been developed, and synthesis of large combinatorial libraries containing DNA-tagged molecules for selection against various biological targets is now possible. However, there remains much work to be done to identify water-stable yet reactive BBs, to develop autonomous DTS systems to the point where they can produce large combinatorial product libraries, and to diversify the range of targets against which selection experiments are performed. DTS has developed to meet most of the requirements of a system for molecular evolution, but not all. The capacity for mutation and resynthesis is still missing from all published DTS systems: we believe that this is a priority for those working in the field.
Some of the most exciting possibilities for DTS lie in nonnatural materials discovery. The current approach to materials chemistry can largely be characterized by "make one, test one": a material is made with a particular function in mind, it is tested, and then improvements are proposed based on the outcome. A DTS system capable of evolving molecules to meet challenges such as light harvesting or carbon sequestration would be truly revolutionary: this technology has the potential to usher in a new and exciting era of materials discovery.

■ ACKNOWLEDGMENTS
We thank Annie Morton and Dr. Charlotte Zammit for providing feedback on the manuscript.