uPIC–M: Efficient and Scalable Preparation of Clonal Single Mutant Libraries for High-Throughput Protein Biochemistry
- Mason J. Appel, Department of Biochemistry, Stanford University, Stanford, California 94305, United States
- Scott A. Longwell, Department of Bioengineering, Stanford University, Stanford, California 94305, United States
- Maurizio Morri, Chan Zuckerberg Biohub, San Francisco, California 94110, United States
- Norma Neff
- Daniel Herschlag*, Department of Biochemistry, Stanford University, Stanford, California 94305, United States; Email: [email protected]
- Polly M. Fordyce*, Department of Bioengineering, Stanford University, Stanford, California 94305, United States; Chan Zuckerberg Biohub, San Francisco, California 94110, United States; Department of Genetics, Stanford University, Stanford, California 94305, United States; ChEM-H Institute, Stanford University, Stanford, California 94305, United States; Email: [email protected]
Abstract
New high-throughput biochemistry techniques complement selection-based approaches and provide quantitative kinetic and thermodynamic data for thousands of protein variants in parallel. With these advances, library generation rather than data collection has become rate-limiting. Unlike pooled selection approaches, high-throughput biochemistry requires mutant libraries in which individual sequences are rationally designed, efficiently recovered, sequence-validated, and separated from one another, but current strategies are unable to produce these libraries at the needed scale and specificity at reasonable cost. Here, we present a scalable, rapid, and inexpensive approach for creating User-designed Physically Isolated Clonal–Mutant (uPIC–M) libraries that utilizes recent advances in oligo synthesis, high-throughput sample preparation, and next-generation sequencing. To demonstrate uPIC–M, we created a scanning mutant library of SpAP, a 541 amino acid alkaline phosphatase, and recovered 94% of desired mutants in a single iteration. uPIC–M uses commonly available equipment and freely downloadable custom software and can produce a 5000 mutant library at 1/3 the cost and 1/5 the time of traditional techniques.
This publication is licensed under
License Summary*
You are free to share (copy and redistribute) this article in any medium or format within the parameters below:
Creative Commons (CC): This is a Creative Commons license.
Attribution (BY): Credit must be given to the creator.
Non-Commercial (NC): Only non-commercial uses of the work are permitted.
No Derivatives (ND): Derivative works may be created for non-commercial purposes, but sharing is prohibited.
*Disclaimer
This summary highlights only some of the key features and terms of the actual license. It is not a license and has no legal value. Carefully review the actual license before using these materials.
Introduction
Figure 1. Overview of the uPIC–M pipeline to generate user-defined clonal mutant libraries. (A) Examples of clonal libraries from uPIC–M and potential high-throughput biochemistry applications. Applications are listed along with examples of the types of variants involved. (B) Comparison of cost (including materials and labor) of conventional mutagenesis vs uPIC–M for libraries of 50–20,000 mutants. A uPIC–M clone sampling rate of 384 per 50 desired mutants (7.68-fold excess) was used for these calculations. uPIC–M (modified) represents a lower cost version of uPIC–M with the addition of pipet tip washing for plate liquid transfer steps. See Table S1 for full time and cost calculations. (C) Workflow for generating uPIC–M libraries in three phases: (1) Mutagenic oligos are synthesized for ∼50 residue windows on a pooled array and selective PCR amplification of each window generates a primer pool used for QuikChange; (2) pooled QuikChange reactions are transformed and plated, with each plate containing a mixture of ∼50 possible single mutants, facilitating colony picking into multiwell plates to isolate clonal libraries of unidentified variants; (3) clonal libraries are prepared and sequenced by NGS to reveal the genotype and location of each variant.
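The sampling rate quoted in panel B (384 clones per 50 desired mutants, a 7.68-fold excess) can be turned into a rough picking budget. The sketch below is only an illustration: it assumes one 384-clone pick per ~50-mutant window at any library size, and the function name `picking_plan` is hypothetical rather than part of the authors' software.

```python
import math

CLONES_PER_WINDOW = 384   # clones picked per pooled window (from the caption)
MUTANTS_PER_WINDOW = 50   # desired single mutants per window (from the caption)


def picking_plan(n_mutants, per_window=MUTANTS_PER_WINDOW, clones=CLONES_PER_WINDOW):
    """Return (windows, clones_picked, fold_excess) for a library of n_mutants.

    Assumption: every window is sampled with the same number of clones.
    """
    windows = math.ceil(n_mutants / per_window)
    clones_picked = windows * clones
    return windows, clones_picked, clones_picked / n_mutants


for n in (50, 5000, 20000):
    windows, clones_picked, excess = picking_plan(n)
    print(f"{n:>6} mutants: {windows:>4} windows, {clones_picked:>7} clones, "
          f"{excess:.2f}-fold excess")
```

Under these assumptions the fold excess stays at 7.68 whenever the library size is a multiple of 50, so the picking effort grows linearly with the number of desired mutants.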
Results and Discussion
Overview of uPIC–M
Design of Tiling ORF Windows Allows Selective Mutagenesis from Oligo Arrays
Figure 2. Tiling window strategy for uPIC–M mutagenic oligo array design. (A) Tiling window strategy (see Figure 1C) divides the ORF from the protein of interest into mutagenic sublibrary regions, with sublibrary oligo length constrained by DNA synthesis limits. Each window contains unique forward and reverse priming sites (dark shading, here ∼25 nt each) at the 5′- and 3′-termini surrounding a mutational region (light shading, here ∼150 nt). For a scanning library, each codon along the length of a sublibrary mutational region is substituted via an individual mutagenic oligo. (B) Selective amplification of oligos from a single window (sublibrary 11). Forward and reverse primers specific to a single sublibrary are used to amplify oligos from the resuspended array material, yielding an oligo pool containing ∼50 codon substitutions from the same mutagenic window.
QuikChange-HT Mutagenesis
Simulated Mutant Sampling to Predict Screening Requirements
Figure 3. Simulated sampling of pooled single mutant libraries. (A, B) Simulation of the number of unique mutants obtained as a function of the number of clones sampled for pooled libraries containing 50 (A) or 541 (B) unique single mutants with single mutant frequencies from 0.1 to 1.0. The remaining fraction of each pool represents all other variants (e.g., WT, indels, double, and higher-order mutants). Each curve represents the average of 10³ simulations; shaded bands represent the 95% confidence interval; horizontal dashed lines (A, B) indicate the total possible number of unique mutants; vertical line (B) indicates the number of colonies picked for the SpAP library constructed herein (for legend, see A). (C–E) Simulated picking results for a sublibrary containing 50 single mutants at equal relative abundances sampled 384 times with a single mutant frequency of 0.5. (C) Simulated positional frequencies of single mutants; the results of five sampling simulations were chosen at random. (D) Histogram of expected mutant yields and (E) histogram of expected yields per sublibrary position (from 10³ sampling events).
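A sampling simulation of the kind described in panels C–E can be approximated with the standard library. The sketch below assumes the simplest model consistent with the caption (each picked clone is an intended single mutant with a fixed probability, and single mutants are equally abundant); the function names are hypothetical and this is not the authors' simulation code.

```python
import random


def simulate_picking(n_mutants=50, n_picks=384, single_freq=0.5, rng=random):
    """Return the number of unique single mutants found among n_picks clones.

    Each pick is a single mutant with probability single_freq; single mutants
    are drawn uniformly from n_mutants possibilities. All other picks (WT,
    indels, multi-mutants) contribute nothing.
    """
    seen = set()
    for _ in range(n_picks):
        if rng.random() < single_freq:
            seen.add(rng.randrange(n_mutants))
    return len(seen)


def expected_unique(n_mutants, n_picks, single_freq):
    """Closed-form expectation of the unique-mutant count under the same model."""
    return n_mutants * (1 - (1 - single_freq / n_mutants) ** n_picks)


rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [simulate_picking(rng=rng) for _ in range(1000)]
print(sum(runs) / len(runs), expected_unique(50, 384, 0.5))
```

For 50 mutants, 384 picks, and a single mutant frequency of 0.5, the closed-form expectation is about 49 of the 50 possible mutants, which suggests why a single 384-well plate per window is a reasonable sampling depth under this model.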
Clonal Mutant Isolation from Plasmid Libraries by a Pick-and-Grow Step
Preparation of Mutant DNA Amplicons
Figure 4. Schematic of uPIC–M sequencing library preparation. Preparation of sequencing libraries takes place in multiwell plate format (96 or 384) via the following steps: (i) ORF regions of target plasmids are amplified from each clone using universal primers to obtain enriched amplicon DNA (A); (iia) for amplicons ≤600 bp, universal Illumina adapters may be ligated directly to amplicons or added by amplification in a second PCR step; (iib) for amplicons >600 bp, DNA is fragmented and tagged using adapter-loaded Tn5 transposase, i.e., tagmented; (iii) amplicons or fragments are further amplified with Nextera primers that incorporate dual-unique i7 and i5 index barcodes; (iv–vii) amplified and barcoded clonal libraries are pooled for NGS, purified, sequenced, and barcodes are used to report the plate-well location and genotype of each variant (B). Mutant amplicons generated at (i) can be used directly for high-throughput biochemistry applications (shown here: cell-free expression and fluorogenic assay of an enzyme library using a microfluidic platform to obtain kinetic parameters) (C).
Tagmentation and Barcoding of Mutant Amplicons
Automated Plate Processing
Sequencing Library QC
Figure 5. Sequencing library quality control results. (A) Plot of fluorescence (arbitrary units) vs fragment length for sublibrary 1 following tagmentation and barcoding amplification. See Figure S7 for analogous data for the other sublibraries. (B) Electropherograms of sublibraries 1–13 (see Table S6 for integrated peak concentrations).
Sequencing Library Analysis
Figure 6. NGS data processing and read mapping pipeline and results for the SpAP scanning library. (A) Data processing steps and observed statistics. Raw FASTQ files (demultiplexed and unpaired) are filtered for barcodes containing 1 or more reads followed by adapter sequence trimming and pairing with read mates (if both reads are present and meet length/quality thresholds). Sequence-redundant readthrough read pairs are flagged at this stage and redundant read mates are discarded. (B) Trimmed and paired reads are mapped to the SpAP-eGFP amplicon, E. coli, and full plasmid genomes. (C) Histogram of total reads per barcode across all sublibraries following read trimming and pairing (n = 4645). (D) Barcode counts for each sublibrary plate at several read depth thresholds for the SpAP-eGFP ORF (>0 represents barcodes containing any mapped reads and remaining thresholds represent the minimum number of mapped reads at all positions; only barcodes containing at least one mapped read are included). The horizontal dashed line at 384 barcodes represents the maximum possible number of barcodes.
sublibrary | total reads (×10⁶) | barcodes [a] | median reads | SpAP reads [b] | E. coli reads [b] | plasmid reads [b] | unmapped reads [b] |
---|---|---|---|---|---|---|---|
all | 28.6 | 4645 | 5974 | 0.96 | 0 | 0.02 | 0.02 |
1 | 2.7 | 369 | 7794 | 0.97 | 0 | 0.02 | 0.02 |
2 | 2.5 | 372 | 7561 | 0.96 | 0 | 0.02 | 0.02 |
3 | 2.7 | 344 | 7771 | 0.94 | 0 | 0.03 | 0.02 |
4 | 1.4 | 383 | 3421 | 0.94 | 0 | 0.02 | 0.03 |
5 | 2.3 | 356 | 6339 | 0.96 | 0 | 0.02 | 0.02 |
6 | 1.5 | 382 | 3560 | 0.96 | 0 | 0.02 | 0.02 |
7 | 2.3 | 377 | 6782 | 0.97 | 0 | 0.01 | 0.02 |
8 | 1.7 | 380 | 469 | 0.08 | 0 | 0.02 | 0.89 |
8 [c] | 1.6 | 184 | 9418 | 0.95 | 0 | 0.02 | 0.03 |
9 | 1.7 | 359 | 4471 | 0.93 | 0 | 0.03 | 0.04 |
10 | 3.8 | 384 | 9787 | 0.96 | 0 | 0.02 | 0.02 |
11 | 3.4 | 377 | 9011 | 0.97 | 0 | 0.02 | 0.01 |
12 | 1.4 | 253 | 6171 | 0.96 | 0 | 0.01 | 0.02 |
13 | 1.2 | 309 | 3755 | 0.96 | 0 | 0.02 | 0.02 |
[a] Number of barcodes (out of a possible 4992 total or 384 per sublibrary) with >0 reads (mapped or unmapped).
[b] Reported as the median value across barcodes with >0 reads.
[c] This entry contains only sublibrary 8 barcodes with ≥500 reads.
sublibrary | barcodes [a] | depth ≥1 [b] | depth ≥10 [b] | depth ≥100 [b] | depth ≥1000 [b] | kept [c] |
---|---|---|---|---|---|---|
all | 4571 | 3603 | 3386 | 2926 | 0 | 3530 |
1 | 369 | 325 | 301 | 262 | 0 | 318 |
2 | 367 | 277 | 260 | 246 | 0 | 274 |
3 | 342 | 300 | 289 | 244 | 0 | 298 |
4 | 367 | 251 | 234 | 194 | 0 | 247 |
5 | 355 | 327 | 299 | 234 | 0 | 315 |
6 | 378 | 276 | 250 | 192 | 0 | 264 |
7 | 371 | 252 | 230 | 209 | 0 | 241 |
8 | 376 | 151 | 139 | 131 | 0 | 146 |
9 | 345 | 263 | 247 | 202 | 0 | 260 |
10 | 384 | 352 | 342 | 331 | 0 | 352 |
11 | 356 | 329 | 320 | 302 | 0 | 329 |
12 | 252 | 218 | 209 | 180 | 0 | 216 |
13 | 309 | 282 | 266 | 199 | 0 | 270 |
[a] Number of barcodes (out of a possible 4992 total or 384 per sublibrary) with ≥1 SpAP reads.
[b] Number of barcodes with at least this read depth at all positions in the SpAP-eGFP genome.
[c] Number of barcodes meeting the depth threshold of at least 1 read at all positions, and ≥10 reads at ≥95% of positions. Only barcodes meeting this depth threshold were carried forward for subsequent variant analyses.
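The criterion for keeping a barcode (at least 1 read at every position and ≥10 reads at ≥95% of positions) is simple to express in code. The following is a minimal sketch, assuming per-position read depths for one barcode are already available as a list of integers; the function name and argument names are hypothetical and do not come from the authors' software.

```python
def passes_depth_filter(depths, min_everywhere=1, min_most=10, frac_most=0.95):
    """Apply the two-part depth threshold to one barcode.

    depths: per-position read depths across the reference amplicon.
    Returns True only if every position has at least min_everywhere reads
    and at least frac_most of positions have min_most or more reads.
    """
    if not depths:
        return False
    if min(depths) < min_everywhere:
        return False
    well_covered = sum(1 for d in depths if d >= min_most)
    return well_covered / len(depths) >= frac_most


print(passes_depth_filter([12] * 95 + [5] * 5))   # shallow at 5% of positions
print(passes_depth_filter([12] * 99 + [0]))       # one uncovered position
```

The first call passes because only 5% of positions fall below 10 reads; the second fails because a single position has no coverage.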
Quantifying the Yield of Single Mutants
Figure 7. Characterization of the SpAP alkaline phosphatase scanning mutant library created with uPIC–M. (A) Overview of variant detection analyses and calculated yields (red) for the SpAP mutant library. (B) Overall distribution of single mutants, WT, double mutants, triple and greater mutants, and indels across all mutational sublibraries (indel count reflects variants containing one or more indels). (C) Location and frequency of intended single mutants across the entire SpAP-eGFP ORF. (D) Scatter plot and histograms of variant reads vs WT reads for all intended single mutants. (E) Comparison of simulated and observed single mutant frequency distributions for three sublibraries. The legend specifies the observed yield of unique single mutants and simulated 95% confidence interval from 1000 events; “n” indicates the total number of observed intended single mutants. Results for all sublibraries are shown in Figure S9.
sublibrary | residues | total positions | barcodes [a] | single total | single intended | single unintended | double | triple+ | indels | WT |
---|---|---|---|---|---|---|---|---|---|---|
all | 2–542 | 541 | 3530 | 2056 | 1996 | 60 | 761 | 212 | 174 | 327 |
1 | 2–41 | 40 | 318 | 178 | 175 | 3 | 61 | 18 | 20 | 41 |
2 | 42–89 | 48 | 274 | 154 | 148 | 6 | 66 | 17 | 8 | 29 |
3 | 90–137 | 48 | 298 | 168 | 161 | 7 | 65 | 20 | 14 | 31 |
4 | 138–185 | 48 | 247 | 135 | 131 | 4 | 43 | 27 | 24 | 18 |
5 | 186–232 | 47 | 315 | 175 | 169 | 6 | 85 | 11 | 21 | 23 |
6 | 233–279 | 47 | 264 | 141 | 138 | 3 | 60 | 23 | 23 | 17 |
7 | 280–326 | 47 | 241 | 141 | 134 | 7 | 59 | 12 | 5 | 24 |
8 | 327–356 | 30 | 146 | 94 | 91 | 3 | 21 | 5 | 2 | 24 |
9 | 357–402 | 46 | 260 | 148 | 142 | 6 | 50 | 11 | 16 | 35 |
10 | 403–448 | 46 | 352 | 210 | 204 | 6 | 81 | 24 | 7 | 30 |
11 | 449–491 | 43 | 329 | 230 | 224 | 6 | 60 | 15 | 5 | 19 |
12 | 492–517 | 26 | 216 | 120 | 120 | 0 | 52 | 20 | 12 | 12 |
13 | 518–542 | 25 | 270 | 162 | 159 | 3 | 58 | 9 | 17 | 24 |
[a] Number of barcodes meeting the depth threshold of at least 1 read at all positions, and ≥10 reads at ≥95% of positions.
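The variant categories in the "all" row partition the 3530 kept barcodes, so the composition of the picked clones can be read off as fractions. The short sketch below copies the counts from that row; the variable names are illustrative only.

```python
# Counts copied from the "all" row of the table above.
counts = {
    "single intended": 1996,
    "single unintended": 60,
    "double": 761,
    "triple+": 212,
    "indels": 174,
    "WT": 327,
}

total = sum(counts.values())  # should equal the kept barcode count
fractions = {name: n / total for name, n in counts.items()}

for name, frac in fractions.items():
    print(f"{name:<18} {counts[name]:>5}  {frac:6.1%}")
print(f"{'total':<18} {total:>5}")
```

Intended single mutants make up 1996 of 3530 kept barcodes, or roughly 57% of sequenced clones. Note that this is a per-clone fraction and differs from the fraction of unique desired mutants recovered, which the abstract reports as 94%.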
Assessment of Single Mutant Purity
Evaluation of Method Performance
Conclusions
Materials and Methods
Description of Plasmid
Design of Mutagenic Oligo Arrays
Preparation of Sublibrary Mutagenic Primer Pools and PCR Mutagenesis
Transformation, Plating, Colony Picking & Growth
qPCR Detection of E. coli Genomic DNA
Preparation of Enzyme ORF Amplicons
Fluorescence Quantification of Amplicon DNA
Tn5 Tagmentation
i7/i5 Barcoding PCR and Library Cleanup
Next-Generation Sequencing
Picking Simulations
NGS Data Processing and Analysis
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.1c04180.
Timeline of uPIC–M library generation, amplification of window-specific sublibrary pools from an oligo array, quantification of E. coli genomic DNA in diluted mutant culture templates, selection of PCR conditions for SpAP mutant amplicons, quantification of amplicon DNA concentrations per sublibrary plate, Tn5 tagmentation reaction conditions, electropherograms of tagmented and amplified mutant sublibraries, histogram of variant:WT read ratios among single, double, and triple and greater mutants, comparison of observed and simulated single mutant frequency distributions, time and cost calculations for uPIC–M and conventional mutagenesis, plasmid map of PURExpress-SpAP-eGFP, complete DNA sequence of PURExpress-SpAP-eGFP plasmid, protein sequence of SpAP-(10mer linker)-eGFP, oligo array and window design details for SpAP scanning mutant library, concentration of purified sublibrary mutagenic primer pools, expected mutant yields from simulations of mutant sampling, variant composition of small-scale QuikChange-HT reactions, sublibrary transformation and colony picking results, amplicon DNA and library concentration statistics, unique single mutant yields for the SpAP scanning library, comparison of uPIC–M performance with simulated picking experiments, per sublibrary, and oligo array price summary (PDF)
Acknowledgments
We thank Daniel Mokhtari for assistance with data analysis, Agilent Technologies for generously providing oligo arrays, Dr. Robert St. Onge and the Stanford Genome Technology Center for assistance with robotic colony picking, and members of the Herschlag and Fordyce research groups for critical discussions and reviewing the manuscript. This work was supported by the NIH grant R01 (GM064798) awarded to D.H. and P.M.F., an Ono Pharma Foundation Breakthrough Innovation Prize, and the Gordon and Betty Moore Foundation (grant number 8415). P.M.F. is a Chan Zuckerberg Biohub Investigator.
References
This article references 44 other publications.
- 1Tang, Q.; Fenton, A. W. Whole-Protein Alanine-Scanning Mutagenesis of Allostery: A Large Percentage of a Protein Can Contribute to Mechanism. Hum. Mutat. 2017, 38, 1132– 1143, DOI: 10.1002/humu.23231Google Scholar1https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtlGit7rP&md5=e18150de48be6b6421b98621237c0dc4Whole-protein alanine-scanning mutagenesis of allostery: A large percentage of a protein can contribute to mechanismTang, Qingling; Fenton, Aron W.Human Mutation (2017), 38 (9), 1132-1143CODEN: HUMUE3; ISSN:1059-7794. (Wiley-Liss, Inc.)Many studies of allosteric mechanisms use limited nos. of mutations to test whether residues play "key" roles. However, if a large percentage of the protein contributes to allosteric function, mutating any residue would have a high probability of modifying allostery. Thus, a predicted mechanism that is dependent on only a few residues could erroneously appear to be supported. We used whole-protein alanine-scanning mutagenesis to det. which amino acid sidechains of human liver pyruvate kinase (hL-PYK; approved symbol PKLR) contribute to regulation by fructose-1,6-bisphosphate (Fru-1,6-BP; activator) and alanine (inhibitor). Each nonalanine/nonglycine residue of hL-PYK was mutated to alanine to generate 431 mutant proteins. Allosteric functions in active proteins were quantified by following substrate affinity over a concn. range of effectors. Results show that different residues contribute to the two allosteric functions. Only a small fraction of mutated residues perturbed inhibition by alanine. In contrast, a large percentage of mutated residues influenced activation by Fru-1,6-BP; inhibition by alanine is not simply the reverse of activation by Fru-1,6-BP. Moreover, the results show that Fru-1,6-BP activation would be extremely difficult to elucidate using a limited no. of mutations. 
Addnl., this large mutational data set will be useful to train and test computational algorithms aiming to predict allosteric mechanisms.
- 2Nisthal, A.; Wang, C. Y.; Ary, M. L.; Mayo, S. L. Protein Stability Engineering Insights Revealed by Domain-Wide Comprehensive Mutagenesis. Proc. Natl. Acad. Sci. U. S. A. 2019, 116, 16367– 16377, DOI: 10.1073/pnas.1903888116Google Scholar2https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXhsFylsLnO&md5=fd2600724b4aaa6c8684ca183baad9bdProtein stability engineering insights revealed by domain-wide comprehensive mutagenesisNisthal, Alex; Wang, Connie Y.; Ary, Marie L.; Mayo, Stephen L.Proceedings of the National Academy of Sciences of the United States of America (2019), 116 (33), 16367-16377CODEN: PNASA6; ISSN:0027-8424. (National Academy of Sciences)The accurate prediction of protein stability upon sequence mutation is an important but unsolved challenge in protein engineering. Large mutational datasets are required to train computational predictors, but traditional methods for collecting stability data are either low-throughput or measure protein stability indirectly. Here, the authors develop an automated method to generate thermodn. stability data for nearly every single mutant in a small 56-residue protein. Anal. reveals that most single mutants have a neutral effect on stability, mutational sensitivity is largely governed by residue burial, and unexpectedly, hydrophobics are the best tolerated amino acid type. Correlating the output of various stability-prediction algorithms against the authors' data shows that nearly all perform better on boundary and surface positions than for those in the core and are better at predicting large-to-small mutations than small-to-large ones. The most stable variants in the single-mutant landscape are better identified using combinations of 2 prediction algorithms and including more algorithms can provide diminishing returns. 
In most cases, poor in silico predictions were tied to compositional differences between the data being analyzed and the datasets used to train the algorithm. Finally, strategies to ext. stabilities from high-throughput fitness data such as deep mutational scanning are promising and data produced by these methods may be applicable toward training future stability-prediction tools.
- 3Fordyce, P. M.; Gerber, D.; Tran, D.; Zheng, J.; Li, H.; DeRisi, J. L.; Quake, S. R. De Novo Identification and Biophysical Characterization of Transcription-Factor Binding Sites with Microfluidic Affinity Analysis. Nat. Biotechnol. 2010, 28, 970– 975, DOI: 10.1038/nbt.1675Google Scholar3https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXhtVyiu7zI&md5=4b7720856ab810d08d11ddad872b967eDe novo identification and biophysical characterization of transcription-factor binding sites with microfluidic affinity analysisFordyce, Polly M.; Gerber, Doron; Tran, Danh; Zheng, Jiashun; Li, Hao; DeRisi, Joseph L.; Quake, Stephen R.Nature Biotechnology (2010), 28 (9), 970-975CODEN: NABIF9; ISSN:1087-0156. (Nature Publishing Group)Gene expression is regulated in part by protein transcription factors that bind target regulatory DNA sequences. Predicting DNA binding sites and affinities from transcription factor sequence or structure is difficult; therefore, exptl. data are required to link transcription factors to target sequences. We present a microfluidics-based approach for de novo discovery and quant. biophys. characterization of DNA target sequences. We validated our technique by measuring sequence preferences for 28 Saccharomyces cerevisiae transcription factors with a variety of DNA-binding domains, including several that have proven difficult to study by other techniques. For each transcription factor, we measured relative binding affinities to oligonucleotides covering all possible 8-bp DNA sequences to create a comprehensive map of sequence preferences; for four transcription factors, we also detd. abs. affinities. We expect that these data and future use of this technique will provide information essential for understanding transcription factor specificity, improving identification of regulatory sites and reconstructing regulatory interactions.
- 4Aditham, A. K.; Markin, C. J.; Mokhtari, D. A.; DelRosso, N.; Fordyce, P. M. High-Throughput Affinity Measurements of Transcription Factor and DNA Mutations Reveal Affinity and Specificity Determinants. Cell Syst. 2021, 112, DOI: 10.1016/j.cels.2020.11.012Google Scholar4https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXks1Grs7o%253D&md5=6fb89fb454dd97ce76062724965d3a73High-Throughput Affinity Measurements of Transcription Factor and DNA Mutations Reveal Affinity and Specificity DeterminantsAditham, Arjun K.; Markin, Craig J.; Mokhtari, Daniel A.; DelRosso, Nicole; Fordyce, Polly M.Cell Systems (2021), 12 (2), 112-127.e11CODEN: CSEYA4; ISSN:2405-4712. (Cell Press)Transcription factors (TFs) bind regulatory DNA to control gene expression, and mutations to either TFs or DNA can alter binding affinities to rewire regulatory networks and drive phenotypic variation. While studies have profiled energetic effects of DNA mutations extensively, we lack similar information for TF variants. Here, we present STAMMP (simultaneous transcription factor affinity measurements via microfluidic protein arrays), a high-throughput microfluidic platform enabling quant. characterization of hundreds of TF variants simultaneously. Measured affinities for ∼210 mutants of a model yeast TF (Pho4) interacting with 9 oligonucleotides (>1,800 Kds) reveal that many combinations of mutations to poorly conserved TF residues and nucleotides flanking the core binding site alter but preserve physiol. binding, providing a mechanism by which combinations of mutations in cis and trans could modulate TF binding to tune occupancies during evolution. Moreover, biochem. double-mutant cycles across the TF-DNA interface reveal mol. mechanisms driving recognition, linking sequence to function. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
- 5Markin, C. J.; Mokhtari, D. A.; Sunden, F.; Appel, M. J.; Akiva, E.; Longwell, S. A.; Sabatti, C.; Herschlag, D.; Fordyce, P. M. Revealing Enzyme Functional Architecture via High-Throughput Microfluidic Enzyme Kinetics. Science 2021, 373, eabf8761, DOI: 10.1126/science.abf8761Google ScholarThere is no corresponding record for this reference.
- 6Liang, S.; Mort, M.; Stenson, P. D.; Cooper, D. N.; Yu, H. PIVOTAL: Prioritizing Variants of Uncertain Significance with Spatial Genomic Patterns in the 3D Proteome. Genomics 2020. DOI: 10.1101/2020.06.04.135103 .Google ScholarThere is no corresponding record for this reference.
- 7Starita, L. M.; Ahituv, N.; Dunham, M. J.; Kitzman, J. O.; Roth, F. P.; Seelig, G.; Shendure, J.; Fowler, D. M. Variant Interpretation: Functional Assays to the Rescue. Am. J. Hum. Genet. 2017, 101, 315– 325, DOI: 10.1016/j.ajhg.2017.07.014Google Scholar7https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhsVOhu73M&md5=a6037f71813bdb50453c42d011c2eee0Variant Interpretation: Functional Assays to the RescueStarita, Lea M.; Ahituv, Nadav; Dunham, Maitreya J.; Kitzman, Jacob O.; Roth, Frederick P.; Seelig, Georg; Shendure, Jay; Fowler, Douglas M.American Journal of Human Genetics (2017), 101 (3), 315-325CODEN: AJHGAG; ISSN:0002-9297. (Cell Press)Classical genetic approaches for interpreting variants, such as case-control or co-segregation studies, require finding many individuals with each variant. Because the overwhelming majority of variants are present in only a few living humans, this strategy has clear limits. Fully realizing the clin. potential of genetics requires that we accurately infer pathogenicity even for rare or private variation. Many computational approaches to predicting variant effects have been developed, but they can identify only a small fraction of pathogenic variants with the high confidence that is required in the clinic. Exptl. measuring a variant's functional consequences can provide clearer guidance, but individual assays performed only after the discovery of the variant are both time and resource intensive. Here, we discuss how multiplex assays of variant effect (MAVEs) can be used to measure the functional consequences of all possible variants in disease-relevant loci for a variety of mol. and cellular phenotypes. The resulting large-scale functional data can be combined with machine learning and clin. knowledge for the development of "lookup tables" of accurate pathogenicity predictions. 
A coordinated effort to produce, analyze, and disseminate large-scale functional data generated by multiplex assays could be essential to addressing the variant-interpretation crisis.
- 8Hochberg, G. K. A.; Thornton, J. W. Reconstructing Ancient Proteins to Understand the Causes of Structure and Function. Annu. Rev. Biophys. 2017, 46, 247– 269, DOI: 10.1146/annurev-biophys-070816-033631Google Scholar8https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXksVCqs70%253D&md5=19552f9d9e82ad02000e1650203db066Reconstructing Ancient Proteins to Understand the Causes of Structure and FunctionHochberg, Georg K. A.; Thornton, Joseph W.Annual Review of Biophysics (2017), 46 (), 247-269CODEN: ARBNCV; ISSN:1936-122X. (Annual Reviews)A review. A central goal in biochem. is to explain the causes of protein sequence, structure, and function. Mainstream approaches seek to rationalize sequence and structure in terms of their effects on function and to identify function's underlying determinants by comparing related proteins to each other. Although productive, both strategies suffer from intrinsic limitations that have left important aspects of many proteins unexplained. These limits can be overcome by reconstructing ancient proteins, exptl. characterizing their properties, and retracing their evolution through time. This approach has proven to be a powerful means for discovering how historical changes in sequence produced the functions, structures, and other phys./chem. characteristics of modern proteins. It has also illuminated whether protein features evolved because of functional optimization, historical constraint, or blind chance. Here this review recent studies employing ancestral protein reconstruction and show how they have produced new knowledge not only of mol. evolutionary processes but also of the underlying determinants of modern proteins' phys., chem., and biol. properties.
- 9Lim, S. A.; Bolin, E. R.; Marqusee, S. Tracing a Protein’s Folding Pathway over Evolutionary Time Using Ancestral Sequence Reconstruction and Hydrogen Exchange. eLife 2018, 7, e38369 DOI: 10.7554/eLife.38369Google Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXitlyrurzJ&md5=9a82831cc2e19b9fb662ce0842abf3a1Tracing a protein's folding pathway over evolutionary time using ancestral sequence reconstruction and hydrogen exchangeLim, Shion An; Bolin, Eric Richard; Marqusee, SusaneLife (2018), 7 (), e38369/1-e38369/19CODEN: ELIFA8; ISSN:2050-084X. (eLife Sciences Publications Ltd.)The conformations populated during protein folding have been studied for decades; yet, their evolutionary importance remains largely unexplored. Ancestral sequence reconstruction allows access to proteins across evolutionary time, and new methods such as pulsed-labeling hydrogen exchange coupled with mass spectrometry allow detn. of folding intermediate structures at near amino-acid resoln. Here, we combine these techniques to monitor the folding of the RNase H family along the evolutionary lineages of T. thermophilus and E. coli RNase H. All homologs and ancestral proteins studied populate a similar folding intermediate despite being sepd. by billions of years of evolution. Even though this conformation is conserved, the pathway leading to it has diverged over evolutionary time, and rational mutations can alter this trajectory. Our results demonstrate that evolutionary processes can affect the energy landscape to preserve or alter specific features of a protein's folding pathway.
- 10. Flamholz, A. I.; Prywes, N.; Moran, U.; Davidi, D.; Bar-On, Y. M.; Oltrogge, L. M.; Alves, R.; Savage, D.; Milo, R. Revisiting Trade-Offs between Rubisco Kinetic Parameters. Biochemistry 2019, 58, 3365–3376, DOI: 10.1021/acs.biochem.9b00237
- 11. Furukawa, R.; Toma, W.; Yamazaki, K.; Akanuma, S. Ancestral Sequence Reconstruction Produces Thermally Stable Enzymes with Mesophilic Enzyme-like Catalytic Properties. Sci. Rep. 2020, 10, 15493, DOI: 10.1038/s41598-020-72418-4
- 12. Alejaldre, L.; Pelletier, J. N.; Quaglia, D. Methods for Enzyme Library Creation: Which One Will You Choose?: A Guide for Novices and Experts to Introduce Genetic Diversity. BioEssays 2021, 43, 2100052, DOI: 10.1002/bies.202100052
- 13. Cirino, P. C.; Mayer, K. M.; Umeno, D. Generating Mutant Libraries Using Error-Prone PCR. In Directed Evolution Library Creation; Humana Press: New Jersey, 2003; Vol. 231, pp 3–10. DOI: 10.1385/1-59259-395-X:3
- 14. Hanson-Manful, P.; Patrick, W. M. Construction and Analysis of Randomized Protein-Encoding Libraries Using Error-Prone PCR. In Protein Nanotechnology; Gerrard, J. A., Ed.; Methods in Molecular Biology; Humana Press: Totowa, NJ, 2013; Vol. 996, pp 251–267. DOI: 10.1007/978-1-62703-354-1_15
- 15. Firth, A. E.; Patrick, W. M. Statistics of Protein Library Construction. Bioinformatics 2005, 21, 3314–3315, DOI: 10.1093/bioinformatics/bti516
- 16. Steffens, D. L.; Williams, J. G. Efficient Site-Directed Saturation Mutagenesis Using Degenerate Oligonucleotides. J. Biomol. Tech. 2007, 18, 147–149
- 17. Shimko, T. C.; Fordyce, P. M.; Orenstein, Y. DeCoDe: Degenerate Codon Design for Complete Protein-Coding DNA Libraries. Bioinformatics 2020, 36, 3357–3364, DOI: 10.1093/bioinformatics/btaa162
- 18. Picelli, S.; Faridani, O. R.; Björklund, Å. K.; Winberg, G.; Sagasser, S.; Sandberg, R. Full-Length RNA-Seq from Single Cells Using Smart-Seq2. Nat. Protoc. 2014, 9, 171–181, DOI: 10.1038/nprot.2014.006
- 19. Tee, K. L.; Wong, T. S. Back to Basics: Creating Genetic Diversity. In Directed Enzyme Evolution: Advances and Applications; Alcalde, M., Ed.; Springer International Publishing: Cham, CH, 2017; pp 201–227. DOI: 10.1007/978-3-319-50413-1_8
- 20. Firnberg, E.; Ostermeier, M. PFunkel: Efficient, Expansive, User-Defined Mutagenesis. PLoS One 2012, 7, e52031, DOI: 10.1371/journal.pone.0052031
- 21. Wrenbeck, E. E.; Klesmith, J. R.; Stapleton, J. A.; Adeniran, A.; Tyo, K. E. J.; Whitehead, T. A. Plasmid-Based One-Pot Saturation Mutagenesis. Nat. Methods 2016, 13, 928–930, DOI: 10.1038/nmeth.4029
- 22. Bihani, S. C.; Das, A.; Nilgiriwala, K. S.; Prashar, V.; Pirocchi, M.; Apte, S. K.; Ferrer, J.-L.; Hosur, M. V. X-Ray Structure Reveals a New Class and Provides Insight into Evolution of Alkaline Phosphatases. PLoS One 2011, 6, e22767, DOI: 10.1371/journal.pone.0022767
- 23. Rinke, C.; Lee, J.; Nath, N.; Goudeau, D.; Thompson, B.; Poulton, N.; Dmitrieff, E.; Malmstrom, R.; Stepanauskas, R.; Woyke, T. Obtaining Genomes from Uncultivated Environmental Microorganisms Using FACS–Based Single-Cell Genomics. Nat. Protoc. 2014, 9, 1038–1048, DOI: 10.1038/nprot.2014.067
- 24. Brower, K. K.; Carswell-Crumpton, C.; Klemm, S.; Cruz, B.; Kim, G.; Calhoun, S. G. K.; Nichols, L.; Fordyce, P. M. Double Emulsion Flow Cytometry with High-Throughput Single Droplet Isolation and Nucleic Acid Recovery. Lab Chip 2020, 20, 2062–2074, DOI: 10.1039/D0LC00261E
- 25. Bronner, I. F.; Quail, M. A. Best Practices for Illumina Library Preparation. Curr. Protoc. Hum. Genet. 2019, 102, e86, DOI: 10.1002/cphg.86
- 26. Picelli, S.; Björklund, Å. K.; Reinius, B.; Sagasser, S.; Winberg, G.; Sandberg, R. Tn5 Transposase and Tagmentation Procedures for Massively Scaled Sequencing Projects. Genome Res. 2014, 24, 2033–2040, DOI: 10.1101/gr.177881.114
- 27. Adey, A.; Morrison, H. G.; Asan; Xun, X.; Kitzman, J. O.; Turner, E. H.; Stackhouse, B.; MacKenzie, A. P.; Caruccio, N. C.; Zhang, X.; Shendure, J. Rapid, Low-Input, Low-Bias Construction of Shotgun Fragment Libraries by High-Density in Vitro Transposition. Genome Biol. 2010, 11, R119, DOI: 10.1186/gb-2010-11-12-r119
- 28. Glenn, T. C.; Nilsen, R. A.; Kieran, T. J.; Sanders, J. G.; Bayona-Vásquez, N. J.; Finger, J. W.; Pierson, T. W.; Bentley, K. E.; Hoffberg, S. L.; Louha, S.; Garcia-De Leon, F. J.; Del Rio Portilla, M. A.; Reed, K. D.; Anderson, J. L.; Meece, J. K.; Aggrey, S. E.; Rekaya, R.; Alabady, M.; Belanger, M.; Winker, K.; Faircloth, B. C. Adapterama I: Universal Stubs and Primers for 384 Unique Dual-Indexed or 147,456 Combinatorially-Indexed Illumina Libraries (ITru & INext). PeerJ 2019, 7, e7755, DOI: 10.7717/peerj.7755
- 29. Tegally, H.; San, J. E.; Giandhari, J.; de Oliveira, T. Unlocking the Efficiency of Genomics Laboratories with Robotic Liquid-Handling. BMC Genomics 2020, 21, 729, DOI: 10.1186/s12864-020-07137-1
- 30. Bolger, A. M.; Lohse, M.; Usadel, B. Trimmomatic: A Flexible Trimmer for Illumina Sequence Data. Bioinformatics 2014, 30, 2114–2120, DOI: 10.1093/bioinformatics/btu170
- 31. Li, H. Aligning Sequence Reads, Clone Sequences and Assembly Contigs with BWA-MEM. arXiv:1303.3997 [q-bio], 2013.
- 32. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R.; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map Format and SAMtools. Bioinformatics 2009, 25, 2078–2079, DOI: 10.1093/bioinformatics/btp352
- 33. Li, H. A Statistical Framework for SNP Calling, Mutation Discovery, Association Mapping and Population Genetical Parameter Estimation from Sequencing Data. Bioinformatics 2011, 27, 2987–2993, DOI: 10.1093/bioinformatics/btr509
- 34. Kirsch, R. D.; Joly, E. An Improved PCR-Mutagenesis Strategy for Two-Site Mutagenesis or Sequence Swapping between Related Genes. Nucleic Acids Res. 1998, 26, 1848–1850, DOI: 10.1093/nar/26.7.1848
- 35Wang, W.; Malcolm, B. A. Two-Stage PCR Protocol Allowing Introduction of Multiple Mutations, Deletions and Insertions Using QuikChange TM Site-Directed Mutagenesis. BioTechniques 1999, 26, 680– 682, DOI: 10.2144/99264st03Google Scholar35https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK1MXislGmtrk%253D&md5=96add1e80ac3ed042f88de046997a848Two-stage PCR protocol allowing introduction of multiple mutations, deletions and insertions using QuikChange Site-Directed MutagenesisWang, Wenyan; Malcolm, Bruce A.BioTechniques (1999), 26 (4), 680-682CODEN: BTNQDO; ISSN:0736-6205. (Eaton Publishing Co.)We developed a two-stage procedure, based on the QuikChange Site-Directed Mutagenesis Protocol, that significantly expands its application to a variety of gene modification expts. A pre-PCR, single-primer extension stage before the std. protocol allows the efficient introduction of not only point mutation but also multiple mutations and deletions and insertions to a sequence of interest.
- 36Li, M. Z.; Elledge, S. J. Harnessing Homologous Recombination in Vitro to Generate Recombinant DNA via SLIC. Nat. Methods 2007, 4, 251– 256, DOI: 10.1038/nmeth1010Google Scholar36https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2sXitFChsbY%253D&md5=daf627b0eb2b744aa33a7e109f6df6b6Harnessing homologous recombination in vitro to generate recombinant DNA via SLICLi, Mamie Z.; Elledge, Stephen J.Nature Methods (2007), 4 (3), 251-256CODEN: NMAEA3; ISSN:1548-7091. (Nature Publishing Group)We describe a new cloning method, sequence and ligation-independent cloning (SLIC), which allows the assembly of multiple DNA fragments in a single reaction using in vitro homologous recombination and single-strand annealing. SLIC mimics in vivo homologous recombination by relying on exonuclease-generated ssDNA overhangs in insert and vector fragments, and the assembly of these fragments by recombination in vitro. SLIC inserts can also be prepd. by incomplete PCR (iPCR) or mixed PCR. SLIC allows efficient and reproducible assembly of recombinant DNA with as many as 5 and 10 fragments simultaneously. SLIC circumvents the sequence requirements of traditional methods and functions much more efficiently at very low DNA concns. when combined with RecA to catalyze homologous recombination. This flexibility allows much greater versatility in the generation of recombinant DNA for the purposes of synthetic biol.
- 37Gibson, D. G.; Young, L.; Chuang, R.-Y.; Venter, J. C.; Hutchison, C. A., III; Smith, H. O. Enzymatic Assembly of DNA Molecules up to Several Hundred Kilobases. Nat. Methods 2009, 6, 343– 345, DOI: 10.1038/nmeth.1318Google Scholar37https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1MXksVemsbw%253D&md5=46284924c7d73c47cfb490983338e480Enzymatic assembly of DNA molecules up to several hundred kilobasesGibson, Daniel G.; Young, Lei; Chuang, Ray-Yuan; Venter, J. Craig; Hutchison, Clyde A.; Smith, Hamilton O.Nature Methods (2009), 6 (5), 343-345CODEN: NMAEA3; ISSN:1548-7091. (Nature Publishing Group)The authors describe an isothermal, single-reaction method for assembling multiple overlapping DNA mols. by the concerted action of a 5' exonuclease, a DNA polymerase and a DNA ligase. First they recessed DNA fragments, yielding single-stranded DNA overhangs that specifically annealed, and then covalently joined them. This assembly method can be used to seamlessly construct synthetic and natural genes, genetic pathways and entire genomes, and could be a useful mol. engineering tool.
- 38Longwell, S. A.; Appel, M. J.; Orenstein, Y.; Fordyce, P. M. OpTile: An Optimized Method for Creating Overlapping Tiled Oligonucleotide Libraries. In preparation .Google ScholarThere is no corresponding record for this reference.
- 39Logsdon, G. A.; Vollger, M. R.; Eichler, E. E. Long-Read Human Genome Sequencing and Its Applications. Nat. Rev. Genet. 2020, 21, 597– 614, DOI: 10.1038/s41576-020-0236-xGoogle Scholar39https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3cXhtFSgs77F&md5=d42d9831df87803e6a64387d9d05b95cLong-read human genome sequencing and its applicationsLogsdon, Glennis A.; Vollger, Mitchell R.; Eichler, Evan E.Nature Reviews Genetics (2020), 21 (10), 597-614CODEN: NRGAAM; ISSN:1471-0056. (Nature Research)A review. Over the past decade, long-read, single-mol. DNA sequencing technologies have emerged as powerful players in genomics. With the ability to generate reads tens to thousands of kilobases in length with an accuracy approaching that of short-read sequencing technologies, these platforms have proven their ability to resolve some of the most challenging regions of the human genome, detect previously inaccessible structural variants and generate some of the first telomere-to-telomere assemblies of whole chromosomes. Long-read sequencing technologies will soon permit the routine assembly of diploid genomes, which will revolutionize genomics by revealing the full spectrum of human genetic variation, resolving some of the missing heritability and leading to the discovery of novel mechanisms of disease.
- 40Nilgiriwala, K. S.; Alahari, A.; Rao, A. S.; Apte, S. K. Cloning and Overexpression of Alkaline Phosphatase PhoK from Sphingomonas Sp. Strain BSAR-1 for Bioprecipitation of Uranium from Alkaline Solutions. Appl. Environ. Microbiol. 2008, 74, 5516– 5523, DOI: 10.1128/aem.00107-08Google Scholar40https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1cXhtV2ru7jL&md5=5c37868a32eafcda29070f38c50c870cCloning and overexpression of alkaline phosphatase phoK from Sphingomonas sp. strain BSAR-1 for bioprecipitation of uranium from alkaline solutionsNilgiriwala, Kayzad S.; Alahari, Anuradha; Rao, Amara Sambasiva; Apte, Shree KumarApplied and Environmental Microbiology (2008), 74 (17), 5516-5523CODEN: AEMIDF; ISSN:0099-2240. (American Society for Microbiology)Cells of Sphingomonas sp. strain BSAR-1 constitutively expressed an alk. phosphatase, which was also secreted in the extracellular medium. A null mutant lacking this alk. phosphatase activity was isolated by Tn5 random mutagenesis. The corresponding gene, designated phoK, was cloned and overexpressed in Escherichia coli strain BL21(DE3). The resultant E. coli strain EK4 overexpressed cellular activity 55 times higher and secreted extracellular PhoK activity 13 times higher than did BSAR-1. The recombinant strain very rapidly pptd. >90% of input uranium in less than 2 h from alk. solns. (pH, 9) contg. 0.5 to 5 mM of uranyl carbonate, compared to BSAR-1, which pptd. uranium in >7 h. In both strains BSAR-1 and EK4, pptd. uranium remained cell bound. The EK4 cells exhibited a much higher loading capacity of 3.8 g U/g dry wt. in <2 h compared to only 1.5 g U/g dry wt. in >7 h in BSAR-1. The data demonstrate the potential utility of genetically engineering PhoK for the biopptn. of uranium from alk. solns.
- 41Chern, E. C.; Siefring, S.; Paar, J.; Doolittle, M.; Haugland, R. A. Comparison of Quantitative PCR Assays for Escherichia Coli Targeting Ribosomal RNA and Single Copy Genes. Lett. Appl. Microbiol. 2011, 52, 298– 306, DOI: 10.1111/j.1472-765X.2010.03001.xGoogle Scholar41https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXktVGnsLo%253D&md5=f397ad35fbed0bbea6bae713119c1003Comparison of quantitative PCR assays for Escherichia coli targeting ribosomal RNA and single copy genesChern, E. C.; Siefring, S.; Paar, J.; Doolittle, M.; Haugland, R. A.Letters in Applied Microbiology (2011), 52 (3), 298-306CODEN: LAMIE7; ISSN:0266-8254. (Wiley-Blackwell)Aims: Compare specificity and sensitivity of quant. PCR (qPCR) assays targeting single and multi-copy gene regions of Escherichia coli. Methods and Results: A previously reported assay targeting the uidA gene (uidA405) was used as the basis for comparing the taxonomic specificity and sensitivity of qPCR assays targeting the rodA gene (rodA984) and two regions of the multi-copy 23S rRNA gene (EC23S and EC23S857). Exptl. analyses of 28 culture collection strains representing E. coli and 21 related non-target species indicated that the uidA405 and rodA984 assays were both 100% specific for E. coli while the EC23S assay was only 29% specific. The EC23S857 assay was only 95% specific due to detection of E. fergusonii. The uidA405, rodA984, EC23S and EC23S857 assays were 85%, 85%, 100% and 86% sensitive, resp., in detecting 175 presumptive E. coli culture isolates from fresh, marine and waste water samples. In analyses of DNA exts. from 32 fresh, marine and waste water samples, the rodA984, EC23S and EC23S857 assays detected mean densities of target sequences at ratios of approx. 1:1, 243:1 and 6:1 compared with the mean densities detected by the uidA405 assay. Conclusions: The EC23S assay was less specific for E. 
coli, whereas the rodA984 and EC23S857 assay taxonomic specificities and sensitivities were similar to those of the uidA405 gene assay. Significance and Impact: The EC23S857 assay has a lower limit of detection for E. coli cells than the uidA405 and rodA984 assays due to its multi-copy gene target and therefore provides greater anal. sensitivity in monitoring for these fecal pollution indicators in environmental waters by qPCR methods.
- 42van Rossum, G.; Drake, F. L.; Van Rossum, G. The Python Language Reference, Release 3.0.1 [Repr.].; Python documentation manual; Python Software Foundation: Hampton, NH, 2010.Google ScholarThere is no corresponding record for this reference.
- 43Mölder, F.; Jablonski, K. P.; Letcher, B.; Hall, M. B.; Tomkins-Tinch, C. H.; Sochat, V.; Forster, J.; Lee, S.; Twardziok, S. O.; Kanitz, A.; Wilm, A.; Holtgrewe, M.; Rahmann, S.; Nahnsen, S.; Köster, J. Sustainable Data Analysis with Snakemake. F1000Res 2021, 10, 33, DOI: 10.12688/f1000research.29032.1Google Scholar43https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BB2c%252FptVKjug%253D%253D&md5=57e1bba76c4646bce4fead738211a8e8Sustainable data analysis with SnakemakeMolder Felix; Forster Jan; Koster Johannes; Molder Felix; Jablonski Kim Philipp; Jablonski Kim Philipp; Letcher Brice; Hall Michael B; Tomkins-Tinch Christopher H; Tomkins-Tinch Christopher H; Sochat Vanessa; Forster Jan; Lee Soohyun; Twardziok Sven O; Holtgrewe Manuel; Kanitz Alexander; Kanitz Alexander; Wilm Andreas; Holtgrewe Manuel; Rahmann Sven; Nahnsen Sven; Koster JohannesF1000Research (2021), 10 (), 33 ISSN:.Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that data analyses should ideally be conducted in a reproducible way. Reproducibility enables technical validation and regeneration of results on the original or even new data. However, reproducibility alone is by no means sufficient to deliver an analysis that is of lasting impact (i.e., sustainable) for the field, or even just one research group. We postulate that it is equally important to ensure adaptability and transparency. The former describes the ability to modify the analysis to answer extended or slightly different research questions. The latter describes the ability to understand the analysis in order to judge whether it is not only technically, but methodologically valid. Here, we analyze the properties needed for a data analysis to become reproducible, adaptable, and transparent. 
We show how the popular workflow management system Snakemake can be used to guarantee this, and how it enables an ergonomic, combined, unified representation of all steps involved in data analysis, ranging from raw data processing, to quality control and fine-grained, interactive exploration and plotting of final results.
- 44Robinson, J. T.; Thorvaldsdóttir, H.; Winckler, W.; Guttman, M.; Lander, E. S.; Getz, G.; Mesirov, J. P. Integrative Genomics Viewer. Nat. Biotechnol. 2011, 29, 24– 26, DOI: 10.1038/nbt.1754Google Scholar44https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXjsFWrtg%253D%253D&md5=312a2139d048ade04dedb2f6f13eae63Integrative genomics viewerRobinson, James T.; Thorvaldsdottir, Helga; Winckler, Wendy; Guttman, Mitchell; Lander, Eric S.; Getz, Gad; Mesirov, Jill P.Nature Biotechnology (2011), 29 (1), 24-26CODEN: NABIF9; ISSN:1087-0156. (Nature Publishing Group)There is no expanded citation for this reference.
Cited By
This article is cited by 5 publications.
- Yueming Long, Ariane Mora, Francesca-Zhoufan Li, Emre Gürsoy, Kadina E. Johnston, Frances H. Arnold. LevSeq: Rapid Generation of Sequence-Function Data for Directed Evolution and Machine Learning. ACS Synthetic Biology 2025, 14 (1), 230-238. https://doi.org/10.1021/acssynbio.4c00625
- Bruce J. Wittmann, Kadina E. Johnston, Patrick J. Almhjell, Frances H. Arnold. evSeq: Cost-Effective Amplicon Sequencing of Every Variant in a Protein Library. ACS Synthetic Biology 2022, 11 (3), 1313-1324. https://doi.org/10.1021/acssynbio.1c00592
- Alicia Maciá Valero, Rianne C. Prins, Thijs de Vroet, Sonja Billerbeck. Combining Oligo Pools and Golden Gate Cloning to Create Protein Variant Libraries or Guide RNA Libraries for CRISPR Applications. 2025, 265-295. https://doi.org/10.1007/978-1-0716-4220-7_15
- Yueming Long, Ariane Mora, Emre Gürsoy, Kadina E. Johnston, Francesca Zhoufan-Li, Frances H. Arnold. LevSeq: Rapid Generation of Sequence-Function Data for Directed Evolution and Machine Learning. 2024. https://doi.org/10.1101/2024.09.04.611255
- Nicole DelRosso, Peter H. Suzuki, Daniel Griffith, Jeffrey M. Lotthammer, Borna Novak, Selin Kocalar, Maya U. Sheth, Alex S. Holehouse, Lacramioara Bintu, Polly Fordyce. High-throughput affinity measurements of direct interactions between activation domains and co-activators. 2024. https://doi.org/10.1101/2024.08.19.608698
- Weiyi Li, Darach Miller, Xianan Liu, Lorenzo Tosi, Lamia Chkaiban, Han Mei, Po-Hsiang Hung, Biju Parekkadan, Gavin Sherlock, Sasha F Levy. Arrayed in vivo barcoding for multiplexed sequence verification of plasmid DNA and demultiplexing of pooled libraries. Nucleic Acids Research 2024, 52 (10), e47-e47. https://doi.org/10.1093/nar/gkae332
- Kadina E. Johnston, Clara Fannjiang, Bruce J. Wittmann, Brian L. Hie, Kevin K. Yang, Zachary Wu. Machine Learning for Protein Engineering. 2023, 277-311. https://doi.org/10.1007/978-3-031-37196-7_9
Figure 1. Overview of the uPIC–M pipeline to generate user-defined clonal mutant libraries. (A) Examples of clonal libraries from uPIC–M and potential high-throughput biochemistry applications. Applications are listed along with examples of the types of variants involved. (B) Comparison of cost (including materials and labor) of conventional mutagenesis vs uPIC–M for libraries of 50–20,000 mutants. A uPIC–M clone sampling rate of 384 per 50 desired mutants (7.68-fold excess) was used for these calculations. uPIC–M (modified) represents a lower cost version of uPIC–M with the addition of pipet tip washing for plate liquid transfer steps. See Table S1 for full time and cost calculations. (C) Workflow for generating uPIC–M libraries in three phases: (1) Mutagenic oligos are synthesized for ∼50 residue windows on a pooled array and selective PCR amplification of each window generates a primer pool used for QuikChange; (2) pooled QuikChange reactions are transformed and plated, with each plate containing a mixture of ∼50 possible single mutants, facilitating colony picking into multiwell plates to isolate clonal libraries of unidentified variants; (3) clonal libraries are prepared and sequenced by NGS to reveal the genotype and location of each variant.
Figure 2. Tiling window strategy for uPIC–M mutagenic oligo array design. (A) Tiling window strategy (see Figure 1C) divides the ORF from the protein of interest into mutagenic sublibrary regions, with sublibrary oligo length constrained by DNA synthesis limits. Each window contains unique forward and reverse priming sites (dark shading, here ∼25 nt each) at the 5′- and 3′-termini surrounding a mutational region (light shading, here ∼150 nt). For a scanning library, each codon along the length of a sublibrary mutational region is substituted via an individual mutagenic oligo. (B) Selective amplification of oligos from a single window (sublibrary 11). Forward and reverse primers specific to a single sublibrary are used to amplify oligos from the resuspended array material, yielding an oligo pool containing ∼50 codon substitutions from the same mutagenic window.
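The tiling logic in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' design software: the function names, the alanine codon `GCT`, and the choice to take the ∼25 nt flanks directly from the ORF sequence (truncated at the ORF ends) are all assumptions made for the example.

```python
def tile_windows(n_codons, codons_per_window=50):
    """Split codon indices 0..n_codons-1 into consecutive windows (start, end)."""
    return [
        (start, min(start + codons_per_window, n_codons))
        for start in range(0, n_codons, codons_per_window)
    ]


def scanning_oligos(orf, window, new_codon="GCT", flank=25):
    """Build one mutagenic oligo per codon in a window.

    Each oligo spans the whole window plus `flank` nt on either side
    (hypothetical priming sites), with a single codon replaced by `new_codon`.
    Codons that already equal `new_codon` are skipped.
    """
    start, end = window
    lo = max(0, 3 * start - flank)          # flank is truncated at the ORF ends
    hi = min(len(orf), 3 * end + flank)
    oligos = {}
    for codon in range(start, end):
        pos = 3 * codon
        if orf[pos:pos + 3] == new_codon:
            continue
        mutated = orf[:pos] + new_codon + orf[pos + 3:]
        oligos[codon] = mutated[lo:hi]
    return oligos


# Toy 120-codon ORF (not SpAP): ATG followed by 119 lysine codons.
example_orf = "ATG" + "AAA" * 119
example_windows = tile_windows(len(example_orf) // 3)
example_oligos = scanning_oligos(example_orf, example_windows[1])
print(example_windows)
print(len(example_oligos), "oligos of length", len(example_oligos[50]))
```

With 50-codon windows and 25 nt flanks, an interior window yields oligos of about 200 nt, which matches the ∼150 nt mutational region plus two ∼25 nt priming sites described in the caption.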
Figure 3. Simulated sampling of pooled single mutant libraries. (A, B) Simulation of the number of unique mutants obtained as a function of the number of clones sampled for pooled libraries containing 50 (A) or 541 (B) unique single mutants with single mutant frequencies from 0.1 to 1.0. The remaining fraction of each pool represents all other variants (e.g., WT, indels, double, and higher-order mutants). Each curve represents the average of 10³ simulations; shaded bands represent the 95% confidence interval; horizontal dashed lines (A, B) indicate the total possible number of unique mutants; vertical line (B) indicates the number of colonies picked for the SpAP library constructed herein (for legend, see A). (C–E) Simulated picking results for a sublibrary containing 50 single mutants at equal relative abundances sampled 384 times with a single mutant frequency of 0.5. (C) Simulated positional frequencies of single mutants; the results of five sampling simulations were chosen at random. (D) Histogram of expected mutant yields and (E) histogram of expected yields per sublibrary position (from 10³ sampling events).
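The sampling scenario in panels C–E (50 equally abundant single mutants, 384 clones picked, single mutant frequency 0.5) can be reproduced approximately with a short Monte Carlo sketch. The code below is an illustration under those stated assumptions and is not the authors' simulation code; the function names and the fixed random seed are hypothetical. It also compares the simulated mean against the closed-form expectation M(1 − (1 − f/M)ⁿ) for M mutants, n clones, and single mutant frequency f.

```python
import random
import statistics


def expected_unique(n_mutants, n_clones, single_mutant_freq):
    """Closed-form expected number of distinct mutants recovered."""
    p_hit = single_mutant_freq / n_mutants  # chance one clone is a given mutant
    return n_mutants * (1 - (1 - p_hit) ** n_clones)


def simulate_unique(n_mutants, n_clones, single_mutant_freq, rng):
    """Pick n_clones colonies once and count the distinct single mutants seen."""
    seen = set()
    for _ in range(n_clones):
        # A clone is an intended single mutant with probability
        # single_mutant_freq; otherwise it is WT, an indel, a multi-mutant, etc.
        if rng.random() < single_mutant_freq:
            seen.add(rng.randrange(n_mutants))  # equal relative abundances
    return len(seen)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
yields = [simulate_unique(50, 384, 0.5, rng) for _ in range(1000)]
mean_yield = statistics.mean(yields)
analytic_yield = expected_unique(50, 384, 0.5)
print(f"simulated mean: {mean_yield:.2f}, analytic: {analytic_yield:.2f}")
```

Under these assumptions the expected yield is close to 49 of 50 mutants, which illustrates why a 7.68-fold sampling excess recovers nearly all variants when about half of the picked clones carry an intended single mutation.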
Figure 4. Schematic of uPIC–M sequencing library preparation. Preparation of sequencing libraries takes place in multiwell plate format (96 or 384) via the following steps: (i) ORF regions of target plasmids are amplified from each clone using universal primers to obtain enriched amplicon DNA (A); (iia) for amplicons ≤600 bp, universal Illumina adapters may be ligated directly to amplicons or added by amplification in a second PCR step; (iib) for amplicons >600 bp, DNA is fragmented and tagged using adapter-loaded Tn5 transposase, i.e., tagmented; (iii) amplicons or fragments are further amplified with Nextera primers that incorporate dual-unique i7 and i5 index barcodes; (iv–vii) amplified and barcoded clonal libraries are pooled for NGS, purified, sequenced, and barcodes are used to report the plate-well location and genotype of each variant (B). Mutant amplicons generated at (i) can be used directly for high-throughput biochemistry applications (shown here: cell-free expression and fluorogenic assay of an enzyme library using a microfluidic platform to obtain kinetic parameters) (C).
Figure 5. Sequencing library quality control results. (A) Plot of fluorescence (arbitrary units) vs fragment length for sublibrary 1 following tagmentation and barcoding amplification. See Figure S7 for analogous data for the other sublibraries. (B) Electropherograms of sublibraries 1–13 (see Table S6 for integrated peak concentrations).
Figure 6. NGS data processing and read mapping pipeline and results for the SpAP scanning library. (A) Data processing steps and observed statistics. Raw FASTQ files (demultiplexed and unpaired) are filtered for barcodes containing 1 or more reads followed by adapter sequence trimming and pairing with read mates (if both reads are present and meet length/quality thresholds). Sequence-redundant readthrough read pairs are flagged at this stage and redundant read mates are discarded. (B) Trimmed and paired reads are mapped to the SpAP-eGFP amplicon, E. coli, and full plasmid genomes. (C) Histogram of total reads per barcode across all sublibraries following read trimming and pairing (n = 4645). (D) Barcode counts for each sublibrary plate at several read depth thresholds for the SpAP-eGFP ORF (>0 represents barcodes containing any mapped reads and remaining thresholds represent the minimum number of mapped reads at all positions; only barcodes containing at least one mapped read are included). The horizontal dashed line at 384 barcodes represents the maximum possible number of barcodes.
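The read-depth thresholding described for panel D can be sketched as follows. This is a simplified illustration rather than the published pipeline: the input format (a dictionary mapping each barcode to a list of per-position mapped read depths), the function name, and the specific threshold values are all assumptions chosen for the example, since the caption does not list them.

```python
def barcodes_passing(coverage_by_barcode, thresholds=(0, 10, 20, 50)):
    """Count barcodes whose minimum per-position depth meets each threshold.

    Barcodes with no mapped reads at any position are excluded up front.
    A threshold of 0 is reported as ">0": any mapped reads at all.
    """
    mapped = {
        barcode: depths
        for barcode, depths in coverage_by_barcode.items()
        if any(depths)
    }
    counts = {}
    for threshold in thresholds:
        if threshold == 0:
            counts[">0"] = len(mapped)
        else:
            counts[f">={threshold}"] = sum(
                1 for depths in mapped.values() if min(depths) >= threshold
            )
    return counts


# Toy data: four wells of one plate, three ORF positions each (hypothetical).
toy_coverage = {
    "A01": [30, 25, 40],
    "A02": [12, 0, 15],   # has reads, but one position is uncovered
    "A03": [0, 0, 0],     # no mapped reads, so it is excluded
    "A04": [60, 55, 70],
}
toy_counts = barcodes_passing(toy_coverage)
print(toy_counts)
```

Using the minimum depth across positions, rather than the mean, ensures that a barcode is counted at a given threshold only if every position in the ORF is covered well enough to call a variant there.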
Figure 7. Characterization of the SpAP alkaline phosphatase scanning mutant library created with uPIC–M. (A) Overview of variant detection analyses and calculated yields (red) for the SpAP mutant library. (B) Overall distribution of single mutants, WT, double mutants, triple and greater mutants, and indels across all mutational sublibraries (indel count reflects variants containing one or more indels). (C) Location and frequency of intended single mutants across the entire SpAP-eGFP ORF. (D) Scatter plot and histograms of variant reads vs WT reads for all intended single mutants. (E) Comparison of simulated and observed single mutant frequency distributions for three sublibraries. The legend specifies the observed yield of unique single mutants and simulated 95% confidence interval from 1000 events; “n” indicates the total number of observed intended single mutants. Results for all sublibraries are shown in Figure S9.
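The variant categories in panel B (WT, single, double, triple and greater mutants, and indels) can be illustrated with a toy classifier that compares a clone's consensus ORF with the wild-type ORF codon by codon. This sketch is an assumption-laden simplification and not the variant-calling procedure used in the paper: it takes already-assembled consensus sequences as input, and it flags an indel only by a length difference, so compensating insertions and deletions would be missed.

```python
def classify_variant(wt_orf, clone_orf):
    """Assign a clone to one of the categories used in the caption."""
    if len(clone_orf) != len(wt_orf):
        return "indel"
    # Count codons that differ from wild type; one changed codon counts as
    # one mutation even if several of its nucleotides changed.
    n_changed = sum(
        wt_orf[i:i + 3] != clone_orf[i:i + 3]
        for i in range(0, len(wt_orf), 3)
    )
    if n_changed == 0:
        return "WT"
    if n_changed == 1:
        return "single mutant"
    if n_changed == 2:
        return "double mutant"
    return "triple and greater mutant"


toy_wt = "ATGGCTAAA"  # hypothetical three-codon ORF
toy_calls = {
    clone: classify_variant(toy_wt, clone)
    for clone in ("ATGGCTAAA", "ATGGATAAA", "ATGGATAAG", "TTGGATAAG", "ATGGCTAA")
}
print(toy_calls)
```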
References
This article references 44 other publications.
- 1Tang, Q.; Fenton, A. W. Whole-Protein Alanine-Scanning Mutagenesis of Allostery: A Large Percentage of a Protein Can Contribute to Mechanism. Hum. Mutat. 2017, 38, 1132– 1143, DOI: 10.1002/humu.232311https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtlGit7rP&md5=e18150de48be6b6421b98621237c0dc4Whole-protein alanine-scanning mutagenesis of allostery: A large percentage of a protein can contribute to mechanismTang, Qingling; Fenton, Aron W.Human Mutation (2017), 38 (9), 1132-1143CODEN: HUMUE3; ISSN:1059-7794. (Wiley-Liss, Inc.)Many studies of allosteric mechanisms use limited nos. of mutations to test whether residues play "key" roles. However, if a large percentage of the protein contributes to allosteric function, mutating any residue would have a high probability of modifying allostery. Thus, a predicted mechanism that is dependent on only a few residues could erroneously appear to be supported. We used whole-protein alanine-scanning mutagenesis to det. which amino acid sidechains of human liver pyruvate kinase (hL-PYK; approved symbol PKLR) contribute to regulation by fructose-1,6-bisphosphate (Fru-1,6-BP; activator) and alanine (inhibitor). Each nonalanine/nonglycine residue of hL-PYK was mutated to alanine to generate 431 mutant proteins. Allosteric functions in active proteins were quantified by following substrate affinity over a concn. range of effectors. Results show that different residues contribute to the two allosteric functions. Only a small fraction of mutated residues perturbed inhibition by alanine. In contrast, a large percentage of mutated residues influenced activation by Fru-1,6-BP; inhibition by alanine is not simply the reverse of activation by Fru-1,6-BP. Moreover, the results show that Fru-1,6-BP activation would be extremely difficult to elucidate using a limited no. of mutations. 
Addnl., this large mutational data set will be useful to train and test computational algorithms aiming to predict allosteric mechanisms.
- 2Nisthal, A.; Wang, C. Y.; Ary, M. L.; Mayo, S. L. Protein Stability Engineering Insights Revealed by Domain-Wide Comprehensive Mutagenesis. Proc. Natl. Acad. Sci. U. S. A. 2019, 116, 16367– 16377, DOI: 10.1073/pnas.19038881162https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXhsFylsLnO&md5=fd2600724b4aaa6c8684ca183baad9bdProtein stability engineering insights revealed by domain-wide comprehensive mutagenesisNisthal, Alex; Wang, Connie Y.; Ary, Marie L.; Mayo, Stephen L.Proceedings of the National Academy of Sciences of the United States of America (2019), 116 (33), 16367-16377CODEN: PNASA6; ISSN:0027-8424. (National Academy of Sciences)The accurate prediction of protein stability upon sequence mutation is an important but unsolved challenge in protein engineering. Large mutational datasets are required to train computational predictors, but traditional methods for collecting stability data are either low-throughput or measure protein stability indirectly. Here, the authors develop an automated method to generate thermodn. stability data for nearly every single mutant in a small 56-residue protein. Anal. reveals that most single mutants have a neutral effect on stability, mutational sensitivity is largely governed by residue burial, and unexpectedly, hydrophobics are the best tolerated amino acid type. Correlating the output of various stability-prediction algorithms against the authors' data shows that nearly all perform better on boundary and surface positions than for those in the core and are better at predicting large-to-small mutations than small-to-large ones. The most stable variants in the single-mutant landscape are better identified using combinations of 2 prediction algorithms and including more algorithms can provide diminishing returns. In most cases, poor in silico predictions were tied to compositional differences between the data being analyzed and the datasets used to train the algorithm. 
Finally, strategies to ext. stabilities from high-throughput fitness data such as deep mutational scanning are promising and data produced by these methods may be applicable toward training future stability-prediction tools.
- 3Fordyce, P. M.; Gerber, D.; Tran, D.; Zheng, J.; Li, H.; DeRisi, J. L.; Quake, S. R. De Novo Identification and Biophysical Characterization of Transcription-Factor Binding Sites with Microfluidic Affinity Analysis. Nat. Biotechnol. 2010, 28, 970– 975, DOI: 10.1038/nbt.16753https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXhtVyiu7zI&md5=4b7720856ab810d08d11ddad872b967eDe novo identification and biophysical characterization of transcription-factor binding sites with microfluidic affinity analysisFordyce, Polly M.; Gerber, Doron; Tran, Danh; Zheng, Jiashun; Li, Hao; DeRisi, Joseph L.; Quake, Stephen R.Nature Biotechnology (2010), 28 (9), 970-975CODEN: NABIF9; ISSN:1087-0156. (Nature Publishing Group)Gene expression is regulated in part by protein transcription factors that bind target regulatory DNA sequences. Predicting DNA binding sites and affinities from transcription factor sequence or structure is difficult; therefore, exptl. data are required to link transcription factors to target sequences. We present a microfluidics-based approach for de novo discovery and quant. biophys. characterization of DNA target sequences. We validated our technique by measuring sequence preferences for 28 Saccharomyces cerevisiae transcription factors with a variety of DNA-binding domains, including several that have proven difficult to study by other techniques. For each transcription factor, we measured relative binding affinities to oligonucleotides covering all possible 8-bp DNA sequences to create a comprehensive map of sequence preferences; for four transcription factors, we also detd. abs. affinities. We expect that these data and future use of this technique will provide information essential for understanding transcription factor specificity, improving identification of regulatory sites and reconstructing regulatory interactions.
- 4Aditham, A. K.; Markin, C. J.; Mokhtari, D. A.; DelRosso, N.; Fordyce, P. M. High-Throughput Affinity Measurements of Transcription Factor and DNA Mutations Reveal Affinity and Specificity Determinants. Cell Syst. 2021, 112, DOI: 10.1016/j.cels.2020.11.0124https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXks1Grs7o%253D&md5=6fb89fb454dd97ce76062724965d3a73High-Throughput Affinity Measurements of Transcription Factor and DNA Mutations Reveal Affinity and Specificity DeterminantsAditham, Arjun K.; Markin, Craig J.; Mokhtari, Daniel A.; DelRosso, Nicole; Fordyce, Polly M.Cell Systems (2021), 12 (2), 112-127.e11CODEN: CSEYA4; ISSN:2405-4712. (Cell Press)Transcription factors (TFs) bind regulatory DNA to control gene expression, and mutations to either TFs or DNA can alter binding affinities to rewire regulatory networks and drive phenotypic variation. While studies have profiled energetic effects of DNA mutations extensively, we lack similar information for TF variants. Here, we present STAMMP (simultaneous transcription factor affinity measurements via microfluidic protein arrays), a high-throughput microfluidic platform enabling quant. characterization of hundreds of TF variants simultaneously. Measured affinities for ∼210 mutants of a model yeast TF (Pho4) interacting with 9 oligonucleotides (>1,800 Kds) reveal that many combinations of mutations to poorly conserved TF residues and nucleotides flanking the core binding site alter but preserve physiol. binding, providing a mechanism by which combinations of mutations in cis and trans could modulate TF binding to tune occupancies during evolution. Moreover, biochem. double-mutant cycles across the TF-DNA interface reveal mol. mechanisms driving recognition, linking sequence to function. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
- 5Markin, C. J.; Mokhtari, D. A.; Sunden, F.; Appel, M. J.; Akiva, E.; Longwell, S. A.; Sabatti, C.; Herschlag, D.; Fordyce, P. M. Revealing Enzyme Functional Architecture via High-Throughput Microfluidic Enzyme Kinetics. Science 2021, 373, eabf8761, DOI: 10.1126/science.abf8761There is no corresponding record for this reference.
- 6Liang, S.; Mort, M.; Stenson, P. D.; Cooper, D. N.; Yu, H. PIVOTAL: Prioritizing Variants of Uncertain Significance with Spatial Genomic Patterns in the 3D Proteome. Genomics 2020. DOI: 10.1101/2020.06.04.135103 .There is no corresponding record for this reference.
- 7. Starita, L. M.; Ahituv, N.; Dunham, M. J.; Kitzman, J. O.; Roth, F. P.; Seelig, G.; Shendure, J.; Fowler, D. M. Variant Interpretation: Functional Assays to the Rescue. Am. J. Hum. Genet. 2017, 101, 315–325, DOI: 10.1016/j.ajhg.2017.07.014
- 8. Hochberg, G. K. A.; Thornton, J. W. Reconstructing Ancient Proteins to Understand the Causes of Structure and Function. Annu. Rev. Biophys. 2017, 46, 247–269, DOI: 10.1146/annurev-biophys-070816-033631
- 9. Lim, S. A.; Bolin, E. R.; Marqusee, S. Tracing a Protein’s Folding Pathway over Evolutionary Time Using Ancestral Sequence Reconstruction and Hydrogen Exchange. eLife 2018, 7, e38369, DOI: 10.7554/eLife.38369
- 10. Flamholz, A. I.; Prywes, N.; Moran, U.; Davidi, D.; Bar-On, Y. M.; Oltrogge, L. M.; Alves, R.; Savage, D.; Milo, R. Revisiting Trade-Offs between Rubisco Kinetic Parameters. Biochemistry 2019, 58, 3365–3376, DOI: 10.1021/acs.biochem.9b00237
- 11. Furukawa, R.; Toma, W.; Yamazaki, K.; Akanuma, S. Ancestral Sequence Reconstruction Produces Thermally Stable Enzymes with Mesophilic Enzyme-like Catalytic Properties. Sci. Rep. 2020, 10, 15493, DOI: 10.1038/s41598-020-72418-4
- 12. Alejaldre, L.; Pelletier, J. N.; Quaglia, D. Methods for Enzyme Library Creation: Which One Will You Choose?: A Guide for Novices and Experts to Introduce Genetic Diversity. BioEssays 2021, 43, 2100052, DOI: 10.1002/bies.202100052
- 13. Cirino, P. C.; Mayer, K. M.; Umeno, D. Generating Mutant Libraries Using Error-Prone PCR. In Directed Evolution Library Creation; Humana Press: New Jersey, 2003; Vol. 231, pp 3–10. DOI: 10.1385/1-59259-395-X:3.
- 14. Hanson-Manful, P.; Patrick, W. M. Construction and Analysis of Randomized Protein-Encoding Libraries Using Error-Prone PCR. In Protein Nanotechnology; Gerrard, J. A., Ed.; Methods in Molecular Biology; Humana Press: Totowa, NJ, 2013; Vol. 996, pp 251–267. DOI: 10.1007/978-1-62703-354-1_15.
- 15. Firth, A. E.; Patrick, W. M. Statistics of Protein Library Construction. Bioinformatics 2005, 21, 3314–3315, DOI: 10.1093/bioinformatics/bti516
- 16. Steffens, D. L.; Williams, J. G. Efficient Site-Directed Saturation Mutagenesis Using Degenerate Oligonucleotides. J. Biomol. Tech. 2007, 18, 147–149
- 17. Shimko, T. C.; Fordyce, P. M.; Orenstein, Y. DeCoDe: Degenerate Codon Design for Complete Protein-Coding DNA Libraries. Bioinformatics 2020, 36, 3357–3364, DOI: 10.1093/bioinformatics/btaa162
- 18. Picelli, S.; Faridani, O. R.; Björklund, Å. K.; Winberg, G.; Sagasser, S.; Sandberg, R. Full-Length RNA-Seq from Single Cells Using Smart-Seq2. Nat. Protoc. 2014, 9, 171–181, DOI: 10.1038/nprot.2014.006
- 19. Tee, K. L.; Wong, T. S. Back to Basics: Creating Genetic Diversity. In Directed Enzyme Evolution: Advances and Applications; Alcalde, M., Ed.; Springer International Publishing: Cham, CH, 2017; pp 201–227. DOI: 10.1007/978-3-319-50413-1_8.
- 20. Firnberg, E.; Ostermeier, M. PFunkel: Efficient, Expansive, User-Defined Mutagenesis. PLoS One 2012, 7, e52031, DOI: 10.1371/journal.pone.0052031
- 21. Wrenbeck, E. E.; Klesmith, J. R.; Stapleton, J. A.; Adeniran, A.; Tyo, K. E. J.; Whitehead, T. A. Plasmid-Based One-Pot Saturation Mutagenesis. Nat. Methods 2016, 13, 928–930, DOI: 10.1038/nmeth.4029
- 22. Bihani, S. C.; Das, A.; Nilgiriwala, K. S.; Prashar, V.; Pirocchi, M.; Apte, S. K.; Ferrer, J.-L.; Hosur, M. V. X-Ray Structure Reveals a New Class and Provides Insight into Evolution of Alkaline Phosphatases. PLoS One 2011, 6, e22767, DOI: 10.1371/journal.pone.0022767
- 23. Rinke, C.; Lee, J.; Nath, N.; Goudeau, D.; Thompson, B.; Poulton, N.; Dmitrieff, E.; Malmstrom, R.; Stepanauskas, R.; Woyke, T. Obtaining Genomes from Uncultivated Environmental Microorganisms Using FACS–Based Single-Cell Genomics. Nat. Protoc. 2014, 9, 1038–1048, DOI: 10.1038/nprot.2014.067
- 24. Brower, K. K.; Carswell-Crumpton, C.; Klemm, S.; Cruz, B.; Kim, G.; Calhoun, S. G. K.; Nichols, L.; Fordyce, P. M. Double Emulsion Flow Cytometry with High-Throughput Single Droplet Isolation and Nucleic Acid Recovery. Lab Chip 2020, 20, 2062–2074, DOI: 10.1039/D0LC00261E
- 25. Bronner, I. F.; Quail, M. A. Best Practices for Illumina Library Preparation. Curr. Protoc. Hum. Genet. 2019, 102, e86, DOI: 10.1002/cphg.86
- 26. Picelli, S.; Björklund, Å. K.; Reinius, B.; Sagasser, S.; Winberg, G.; Sandberg, R. Tn5 Transposase and Tagmentation Procedures for Massively Scaled Sequencing Projects. Genome Res. 2014, 24, 2033–2040, DOI: 10.1101/gr.177881.114
- 27. Adey, A.; Morrison, H. G.; Xun, X.; Kitzman, J. O.; Turner, E. H.; Stackhouse, B.; MacKenzie, A. P.; Caruccio, N. C.; Zhang, X.; Shendure, J. Rapid, Low-Input, Low-Bias Construction of Shotgun Fragment Libraries by High-Density in Vitro Transposition. Genome Biol. 2010, 11, R119, DOI: 10.1186/gb-2010-11-12-r119
- 28. Glenn, T. C.; Nilsen, R. A.; Kieran, T. J.; Sanders, J. G.; Bayona-Vásquez, N. J.; Finger, J. W.; Pierson, T. W.; Bentley, K. E.; Hoffberg, S. L.; Louha, S.; Garcia-De Leon, F. J.; Del Rio Portilla, M. A.; Reed, K. D.; Anderson, J. L.; Meece, J. K.; Aggrey, S. E.; Rekaya, R.; Alabady, M.; Belanger, M.; Winker, K.; Faircloth, B. C. Adapterama I: Universal Stubs and Primers for 384 Unique Dual-Indexed or 147,456 Combinatorially-Indexed Illumina Libraries (ITru & INext). PeerJ 2019, 7, e7755, DOI: 10.7717/peerj.7755
- 29. Tegally, H.; San, J. E.; Giandhari, J.; de Oliveira, T. Unlocking the Efficiency of Genomics Laboratories with Robotic Liquid-Handling. BMC Genomics 2020, 21, 729, DOI: 10.1186/s12864-020-07137-1
- 30. Bolger, A. M.; Lohse, M.; Usadel, B. Trimmomatic: A Flexible Trimmer for Illumina Sequence Data. Bioinformatics 2014, 30, 2114–2120, DOI: 10.1093/bioinformatics/btu170
- 31. Li, H. Aligning Sequence Reads, Clone Sequences and Assembly Contigs with BWA-MEM. arXiv:1303.3997 [q-bio] 2013.
- 32. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R.; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map Format and SAMtools. Bioinformatics 2009, 25, 2078–2079, DOI: 10.1093/bioinformatics/btp352
- 33. Li, H. A Statistical Framework for SNP Calling, Mutation Discovery, Association Mapping and Population Genetical Parameter Estimation from Sequencing Data. Bioinformatics 2011, 27, 2987–2993, DOI: 10.1093/bioinformatics/btr509
- 34. Kirsch, R. D.; Joly, E. An Improved PCR-Mutagenesis Strategy for Two-Site Mutagenesis or Sequence Swapping between Related Genes. Nucleic Acids Res. 1998, 26, 1848–1850, DOI: 10.1093/nar/26.7.1848
- 35. Wang, W.; Malcolm, B. A. Two-Stage PCR Protocol Allowing Introduction of Multiple Mutations, Deletions and Insertions Using QuikChange™ Site-Directed Mutagenesis. BioTechniques 1999, 26, 680–682, DOI: 10.2144/99264st03
- 36. Li, M. Z.; Elledge, S. J. Harnessing Homologous Recombination in Vitro to Generate Recombinant DNA via SLIC. Nat. Methods 2007, 4, 251–256, DOI: 10.1038/nmeth1010
- 37. Gibson, D. G.; Young, L.; Chuang, R.-Y.; Venter, J. C.; Hutchison, C. A., III; Smith, H. O. Enzymatic Assembly of DNA Molecules up to Several Hundred Kilobases. Nat. Methods 2009, 6, 343–345, DOI: 10.1038/nmeth.1318
- 38. Longwell, S. A.; Appel, M. J.; Orenstein, Y.; Fordyce, P. M. OpTile: An Optimized Method for Creating Overlapping Tiled Oligonucleotide Libraries. In preparation.
- 39. Logsdon, G. A.; Vollger, M. R.; Eichler, E. E. Long-Read Human Genome Sequencing and Its Applications. Nat. Rev. Genet. 2020, 21, 597–614, DOI: 10.1038/s41576-020-0236-x
- 40. Nilgiriwala, K. S.; Alahari, A.; Rao, A. S.; Apte, S. K. Cloning and Overexpression of Alkaline Phosphatase PhoK from Sphingomonas Sp. Strain BSAR-1 for Bioprecipitation of Uranium from Alkaline Solutions. Appl. Environ. Microbiol. 2008, 74, 5516–5523, DOI: 10.1128/aem.00107-08
- 41. Chern, E. C.; Siefring, S.; Paar, J.; Doolittle, M.; Haugland, R. A. Comparison of Quantitative PCR Assays for Escherichia Coli Targeting Ribosomal RNA and Single Copy Genes. Lett. Appl. Microbiol. 2011, 52, 298–306, DOI: 10.1111/j.1472-765X.2010.03001.x
- 42. van Rossum, G.; Drake, F. L.; Van Rossum, G. The Python Language Reference, Release 3.0.1 [Repr.].; Python documentation manual; Python Software Foundation: Hampton, NH, 2010.
- 43. Mölder, F.; Jablonski, K. P.; Letcher, B.; Hall, M. B.; Tomkins-Tinch, C. H.; Sochat, V.; Forster, J.; Lee, S.; Twardziok, S. O.; Kanitz, A.; Wilm, A.; Holtgrewe, M.; Rahmann, S.; Nahnsen, S.; Köster, J. Sustainable Data Analysis with Snakemake. F1000Res 2021, 10, 33, DOI: 10.12688/f1000research.29032.1
- 44. Robinson, J. T.; Thorvaldsdóttir, H.; Winckler, W.; Guttman, M.; Lander, E. S.; Getz, G.; Mesirov, J. P. Integrative Genomics Viewer. Nat. Biotechnol. 2011, 29, 24–26, DOI: 10.1038/nbt.1754
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.1c04180.
Timeline of uPIC–M library generation, amplification of window-specific sublibrary pools from an oligo array, quantification of E. coli genomic DNA in diluted mutant culture templates, selection of PCR conditions for SpAP mutant amplicons, quantification of amplicon DNA concentrations per sublibrary plate, Tn5 tagmentation reaction conditions, electropherograms of tagmented and amplified mutant sublibraries, histogram of variant:WT read ratios among single, double, and triple and greater mutants, comparison of observed and simulated single mutant frequency distributions, time and cost calculations for uPIC–M and conventional mutagenesis, plasmid map of PURExpress-SpAP-eGFP, complete DNA sequence of PURExpress-SpAP-eGFP plasmid, protein sequence of SpAP-(10mer linker)-eGFP, oligo array and window design details for SpAP scanning mutant library, concentration of purified sublibrary mutagenic primer pools, expected mutant yields from simulations of mutant sampling, variant composition of small-scale QuikChange-HT reactions, sublibrary transformation and colony picking results, amplicon DNA and library concentration statistics, unique single mutant yields for the SpAP scanning library, comparison of uPIC–M performance with simulated picking experiments per sublibrary, and oligo array price summary (PDF)