Delving into Eukaryotic Origins of Replication Using DNA Structural Features
- Venkata Rajesh Yella* (corresponding author), Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Guntur 522502, Andhra Pradesh, India
- Akkinepally Vanaja, Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Guntur 522502, Andhra Pradesh, India; KL College of Pharmacy, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
- Umasankar Kulandaivelu, KL College of Pharmacy, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
- Aditya Kumar* (corresponding author), Department of Molecular Biology and Biotechnology, Tezpur University, Tezpur 784028, Assam, India
Abstract
DNA replication in eukaryotes is an intricate process that is precisely synchronized by a set of regulatory proteins; replication forks emanate from discrete sites on chromatin called origins of replication (Oris). These sites are considered the gateway to chromosomal replication and are typified by sequence motifs. However, the cognate sequences are found in only a small subset of origin regions, or are absent altogether, across different metazoans. DNA secondary structural features can therefore provide information beyond that contained in the primary sequence. In this article, we report the trends in DNA sequence-based structural properties of origin sequences in nine eukaryotic systems representing different families of life. Biologically relevant DNA secondary structural properties, namely, stability, propeller twist, flexibility, and minor groove shape, were studied in the sequences flanking replication start sites. The results indicate that Oris in yeasts show lower stability, greater rigidity, and a preference for a narrow minor groove compared with the genomic sequences surrounding them. Yeast Oris also show a preference for A-tracts and the promoter element TATA box in the vicinity of replication start sites. In contrast, Drosophila melanogaster, humans, and Arabidopsis thaliana do not have such features in their Oris; instead, they show a high preponderance of G-rich sequence motifs, such as putative G-quadruplexes or i-motifs, and CpG islands. Our study applies DNA structural feature computation to examine origins of replication across organisms ranging from yeasts to mammals, including a plant. Insights from this study should aid in understanding origin architecture and in designing new algorithms for predicting DNA trans-acting factor recognition events.
Note
This paper was originally published ASAP on June 1, 2020. Due to a production error, there were mistakes in Table 1. The corrected version was reposted on June 2, 2020.
Introduction
Results and Discussion
name of organism | genome size (Mb) | no. of Chr | no. of Ori sites | genome GC % | GC % of Ori region | most represented dinucleotides | most represented trinucleotides | most represented tetranucleotides | most represented heptanucleotides |
---|---|---|---|---|---|---|---|---|---|
Saccharomyces cerevisiae | 12.07 | 16 | 357 | 38.15 | 36.05 | AA (11.59), TT (11.53), AT (9.77), TA (8.56), TG (6.20) | AAA (4.49), TTT (4.43), AAT (3.23), ATT (3.21), ATA (3.05) | AAAA (1.85), TTTT (1.79), AAAT (1.22), ATTT (1.21), ATAT (1.14) | AAAAAAA (0.217), TTTTTTT (0.198), ATATATA (0.097), TATATAT (0.091), AAAAAAT (0.077) |
Kluyveromyces lactis | 10.68 | 6 | 144 | 38.76 | 35.02 | TT (11.57), AA (11.34), AT (10.12), TA (8.95), TG (6.42) | AAA (4.16), TTT (4.16), AAT (3.32), ATT (3.29), TAT (3.21) | AAAA (1.49), TTTT (1.44), ATAT (1.24), ATTT (1.21), AAAT (1.17) | AAAAAAA (0.133), TTTTTTT (0.116), ATATATA (0.105), TATATAT (0.101), AAATAAA (0.063) |
Candida glabrata strain CBS138 | 4.81 | 13 | 256 | 39.03 | 33.85 | AA (11.61), TT (11.41), AT (10.73), TA (9.82), TG (6.57) | AAA (4.29), TTT (4.20), TAT (3.61), ATA (3.58), AAT (3.54) | AAAA (1.64), TTTT (1.61), ATAT (1.41), AAAT (1.31), TATT (1.25) | AAAAAAA (0.128), TTTTTTT (0.116), ATGTTTT (0.102), ACCAAAA (0.087), TTTTTAT (0.084) |
Pichia pastoris | 9.35 | 4 | 294 | 41.13 | 39.51 | AA (10.16), TT (10.08), AT (8.39), TA (6.92), GA (6.66) | AAA (3.42), TTT (3.40), AAT (2.75), ATT (2.68), TTG (2.40) | AAAA (1.21), TTTT (1.19), AAAT (0.89), AATT (0.86), ATTT (0.85) | AAAAAAA (0.091), TTTTTTT (0.088), AAAAAAT (0.046), ATTTTTT (0.045), TCTTTTT (0.043) |
Schizosaccharomyces pombe | 12.59 | 3 | 345 | 36.06 | 30.79 | AA (14.27), TT (14.25), AT (10.55), TA (9.80), TG (5.53) | TTT (6.26), AAA (6.24), AAT (4.12), ATT (4.09), TAA (3.59) | TTTT (2.81), AAAA (2.78), AAAT (1.75), ATTT (1.74), TAAA (1.59) | TTTTTTT (0.459), AAAAAAA (0.428), TTTATTT (0.162), ATTTTTT (0.160), AAATAAA (0.160) |
Drosophila melanogaster (S2) | 137.55 | 4 | 7156 | 42.29 | 43.8 | TT (9.40), AA (9.37), AT (7.59), CA (6.84), TG (6.84) | TTT (3.38), AAA (3.37), ATT (2.53), AAT (2.51), TTG (2.16) | TTTT (1.23), AAAA (1.23), ATTT (0.97), AAAT (0.96), AATT (0.83) | AAAAAAA (0.140), TTTTTTT (0.131), TTTATTT (0.071), AAAAATA (0.062), TTTTATT (0.061) |
Arabidopsis thaliana | 119.16 | 5 | 1533 | 36.05 | 41.53 | AA (9.75), TT (9.64), AT (7.70), GA (7.13), TC (7.13) | AAA (3.33), TTT (3.28), AGA (2.41), TCT (2.39), GAA (2.32) | AAAA (1.15), TTTT (1.14), AAGA (0.87), TCTT (0.87), AGAA (0.85) | AAAAAAA (0.099), TTTTTTT (0.098), AAGAAGA (0.059), TCTTCTT (0.059), AGAAGAA (0.057) |
mouse (P19) | 2716.96 | 20 | 2412 | 42 | 50.38 | CT (7.97), AG (7.87), TG (7.65), CA (7.58), CC (7.21) | CTG (2.65), CAG (2.61), TTT (2.45), AAA (2.37), CCT (2.33) | TTTT (0.87), AAAA (0.82), CTGG (0.78), CCAG (0.78), CCTG (0.78) | TTTTTTT (0.167), AAAAAAA (0.153), TGTGTGT (0.100), ACACACA (0.099), GTGTGTG (0.096) |
human (MCF7) | 3259.56 | 23 | 94,195 | 41 | 57.76 | GG (10.33), CC (10.32), CT (8.17), AG (8.17), TG (8.05) | GGG (3.61), CCC (3.61), CAG (3.31), CTG (3.30), CCT (3.00) | CAGG (1.20), CCTG (1.20), CTGG (1.15), CCCC (1.14), GGGG (1.14) | CCCTCCC (0.064), GGGAGGG (0.063), GGCTGGG (0.062), GGGCAGG (0.062), CCCAGCC (0.062) |
Origin sequences were downloaded from the DeOri database for computing the GC percentage and the k-mer frequencies (k = 2, 3, 4, and 7). The numbers in parentheses indicate the absolute percentage frequency of the oligonucleotides observed in the data sets. The five most frequent words are displayed in the table. The frequency of a k-mer depends on the GC percentage and on the arrangement of nucleotide steps, which is characteristic of Ori regions. The word composition of the different cell-type data sets for D. melanogaster, mouse, and human is shown in Supplementary Table 2.
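As an illustration of the kind of calculation summarized in the table, the following minimal Python sketch computes a pooled GC percentage and the percentage frequency of k-mers for a set of sequences. It is not the authors' code. It assumes that k-mers are counted in overlapping windows, that words containing characters other than A, C, G, or T are skipped, and that the percentage is taken over all counted words of length k. The function names (`gc_percent`, `kmer_percent`, `top_words`) are hypothetical.

```python
from collections import Counter

def gc_percent(seqs):
    """GC percentage pooled over all sequences (only A/C/G/T are counted)."""
    counts = Counter()
    for s in seqs:
        counts.update(ch for ch in s.upper() if ch in "ACGT")
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total if total else 0.0

def kmer_percent(seqs, k):
    """Percentage frequency of each k-mer, counted in overlapping windows.

    Assumption: words with ambiguous bases (e.g. N) are ignored, and the
    denominator is the total number of counted k-mers across all sequences.
    """
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            word = s[i:i + k]
            if set(word) <= set("ACGT"):
                counts[word] += 1
    total = sum(counts.values())
    return {w: 100.0 * c / total for w, c in counts.items()} if total else {}

def top_words(seqs, k, n=5):
    """The n most frequent k-mers, ties broken alphabetically."""
    freqs = kmer_percent(seqs, k)
    return sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

if __name__ == "__main__":
    toy = ["AAATTTGC", "TTTAAAGC"]  # toy input, not real Ori sequences
    print(round(gc_percent(toy), 2))
    for k in (2, 3, 4, 7):
        print(k, top_words(toy, k))
```

Applied to a full set of origin sequences with k = 2, 3, 4, and 7, `top_words` would yield ranked lists of the same form as the last four columns of the table, although the exact values depend on the counting conventions assumed here.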
Ori Regions Display Signature Structural Profiles
Ori Sequences Are Enriched with Characteristic Sequence and Structural Motifs
organism | i-motif density | G-quad density | A-tracts | G-tracts | ARS | TATA box |
---|---|---|---|---|---|---|
S. cerevisiae | 0.01 | 0.02 | 0.99 | 0.11 | 0.26 | 0.95 |
K. lactis | 0.00 | 0.01 | 0.94 | 0.08 | 0.13 | 0.89 |
P. pastoris | 0.01 | 0.02 | 0.94 | 0.04 | 0.07 | 0.75 |
C. glabrata | 0.09 | 0.05 | 0.96 | 0.35 | 0.21 | 0.93 |
S. pombe | 0.01 | 0.01 | 1.00 | 0.08 | 0.33 | 0.97 |
D. melanogaster | 0.19 | 0.20 | 0.96 | 0.33 | 0.28 | 0.92 |
A. thaliana | 0.02 | 0.02 | 0.98 | 0.04 | 0.21 | 0.88 |
mouse | 0.64 | 0.66 | 0.92 | 0.70 | 0.09 | 0.61 |
human | 0.57 | 0.57 | 0.86 | 0.34 | 0.10 | 0.49 |
Densities of i-motifs, G-quadruplexes, A-tracts, G-tracts, autonomously replicating sequences (ARS), and TATA boxes are shown in the table. Sequences of 1000 nucleotides downstream of the Ori start sites were considered for this table.
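A scan of this kind can be sketched with regular expressions. The Python example below is only an illustration under stated assumptions, not the procedure used in the paper: the motif definitions are common textbook-style patterns chosen for the example (a G3+N1-7 pattern for putative G-quadruplexes, its C-rich counterpart for i-motifs, runs of four or more A or T for A-tracts, and TATAWAW for the TATA box), and "density" is read here as the fraction of sequences that contain at least one match within the first 1000 nucleotides. The names `MOTIFS` and `motif_fraction` are hypothetical.

```python
import re

# Illustrative motif definitions (assumptions, not taken from the paper).
LOOP = "[ACGT]{1,7}"
MOTIFS = {
    "G-quad": re.compile("G{3,}" + LOOP + "G{3,}" + LOOP + "G{3,}" + LOOP + "G{3,}"),
    "i-motif": re.compile("C{3,}" + LOOP + "C{3,}" + LOOP + "C{3,}" + LOOP + "C{3,}"),
    "A-tract": re.compile("A{4,}|T{4,}"),
    "TATA box": re.compile("TATA[AT]A[AT]"),
}

def motif_fraction(seqs, window=1000):
    """Fraction of sequences with at least one motif match in the window.

    Assumption: each sequence starts at the Ori start site, so the first
    `window` nucleotides are the downstream region being scanned.
    """
    trimmed = [s.upper()[:window] for s in seqs]
    if not trimmed:
        return {name: 0.0 for name in MOTIFS}
    return {
        name: sum(1 for s in trimmed if pattern.search(s)) / len(trimmed)
        for name, pattern in MOTIFS.items()
    }

if __name__ == "__main__":
    toy = ["GGGAGGGAGGGAGGG", "AAAATTTT", "TATAAATCCC"]  # toy sequences
    for name, value in motif_fraction(toy).items():
        print(name, round(value, 2))
```

Only one strand is scanned in this sketch; a fuller treatment would also need to decide how matches on the complementary strand are counted.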
Eukaryotic Origins of Replication May Be Linked to Promoter Regions
Conclusions
Materials and Methods
Origins of Replication Data Sets
DNA Structural Profile Enumeration
DNA Stability and Melting Temperature Models
DNA Bendability Models
Propeller Twist and Minor Groove Width
Computation of Structural Motifs
CpG Island Calculations and Promoter Motif Element Search
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.0c00441.
(Supplementary figure 1) Structural profiles of DNA helicoidal parameters for Ori sequences, (Supplementary figure 2) a profile of DNA structural properties of tissue-specific Ori sequences, and (Supplementary figure 3) positional distribution of promoter elements in Ori sequences (PDF)
(Supplementary table 1) Characteristic structural feature values observed in Ori sequences and (Supplementary table 2) oligonucleotide compositional analysis of Ori sequences (XLS)
Acknowledgments
The authors are grateful to the Science and Engineering Research Board (SERB), Department of Science and Technology (DST), Government of India for the grant (ECR/2017/001100/LS) and for supporting A.V. with her JRF. We would like to thank the management of Koneru Lakshmaiah Education Foundation for helping us with the necessary resources.
Abbreviations
Oris | origins of replication |
References
This article references 66 other publications.
- 1Masai, H.; Matsumoto, S.; You, Z.; Yoshizawa-Sugata, N.; Oda, M. Eukaryotic chromosome DNA replication: where, when, and how?. Annu. Rev. Biochem. 2010, 79, 89– 130, DOI: 10.1146/annurev.biochem.052308.103205Google Scholar1https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXpslShsbs%253D&md5=c8a0dfe38359a7eb2fe724d0693116b2Eukaryotic chromosome DNA replication: where, when, and how?Masai, Hisao; Matsumoto, Seiji; You, Zhiying; Yoshizawa-Sugata, Naoko; Oda, MasakoAnnual Review of Biochemistry (2010), 79 (), 89-130CODEN: ARBOAW; ISSN:0066-4154. (Annual Reviews Inc.)A review. DNA replication is central to cell proliferation. Studies in the past six decades since the proposal of a semiconservative mode of DNA replication have confirmed the high degree of conservation of the basic machinery of DNA replication from prokaryotes to eukaryotes. However, the need for replication of a substantially longer segment of DNA in coordination with various internal and external signals in eukaryotic cells has led to more complex and versatile regulatory strategies. The replication program in higher eukaryotes is under a dynamic and plastic regulation within a single cell, or within the cell population, or during development. We review here various regulatory mechanisms that control the replication program in eukaryotes and discuss future directions in this dynamic field.
- 2Aladjem, M. I.; Redon, C. E. Order from clutter: selective interactions at mammalian replication origins. Nat. Rev. Genet. 2017, 18, 101– 116, DOI: 10.1038/nrg.2016.141Google Scholar2https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhvV2iur7E&md5=297fb77793aee0fe319fe3af7dcb2a84Order from clutter: selective interactions at mammalian replication originsAladjem, Mirit I.; Redon, Christophe E.Nature Reviews Genetics (2017), 18 (2), 101-116CODEN: NRGAAM; ISSN:1471-0056. (Nature Publishing Group)Mammalian chromosome duplication progresses in a precise order and is subject to constraints that are often relaxed in developmental disorders and malignancies. Mol. information about the regulation of DNA replication at the chromatin level is lacking because protein complexes that initiate replication seem to bind chromatin indiscriminately. High-throughput sequencing and math. modeling have yielded detailed genome-wide replication initiation maps. Combining these maps and models with functional genetic analyses suggests that distinct DNA-protein interactions at subgroups of replication initiation sites (replication origins) modulate the ubiquitous replication machinery and supports an emerging model that delineates how indiscriminate DNA-binding patterns translate into a consistent, organized replication program.
- 3Fragkos, M.; Ganier, O.; Coulombe, P.; Méchali, M. DNA replication origin activation in space and time. Nat. Rev. Mol. Cell Biol. 2015, 16, 360– 374, DOI: 10.1038/nrm4002Google Scholar3https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXht1WltbrP&md5=dc4e7fed0cfedf7e71d31a8fa3434751DNA replication origin activation in space and timeFragkos, Michalis; Ganier, Olivier; Coulombe, Philippe; Mechali, MarcelNature Reviews Molecular Cell Biology (2015), 16 (6), 360-374CODEN: NRMCBP; ISSN:1471-0072. (Nature Publishing Group)A review. DNA replication begins with the assembly of pre-replication complexes (pre-RCs) at thousands of DNA replication origins during the G1 phase of the cell cycle. At the G1-S-phase transition, pre-RCs are converted into pre-initiation complexes, in which the replicative helicase is activated, leading to DNA unwinding and initiation of DNA synthesis. However, only a subset of origins are activated during any S phase. Recent insights into the mechanisms underlying this choice reveal how flexibility in origin usage and temporal activation are linked to chromosome structure and organization, cell growth and differentiation, and replication stress.
- 4Jacob, F.; Brenner, S. On the regulation of DNA synthesis in bacteria: the hypothesis of the replicon. C R Hebd. Seances Acad. Sci. 1963, 256, 298– 300Google Scholar4https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADyaF387ls1Snsw%253D%253D&md5=f717738803200309a7b33310aec1b128On the regulation of DNA synthesis in bacteria: the hypothesis of the repliconJACOB F; BRENNER SComptes rendus hebdomadaires des seances de l'Academie des sciences (1963), 256 (), 298-300 ISSN:0001-4036.There is no expanded citation for this reference.
- 5Marahrens, Y.; Stillman, B. A yeast chromosomal origin of DNA replication defined by multiple functional elements. Science 1992, 255, 817– 823, DOI: 10.1126/science.1536007Google Scholar5https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK38Xht1amu70%253D&md5=bcabc93006be8d984b0314d9cd4d8215A yeast chromosomal origin of DNA replication defined by multiple functional elementsMarahrens, York; Stillman, BruceScience (Washington, DC, United States) (1992), 255 (5046), 817-23CODEN: SCIEAS; ISSN:0036-8075.Although it has been demonstrated that discrete origins of DNA replication exist in eukaryotic cellular chromosomes, the detailed organization of a eukaryotic cellular origin remains to be detd. Linker substitution mutations were constructed across the entire Saccharomyces cerevisiae chromosomal origin, ARS1. Functional studies of these mutants revealed 1 essential element (A), which includes a match to the ARS consensus sequence, and 3 addnl. elements (B1, B2, and B3), which collectively are also essential for origin function. These 4 elements arranged exactly as in ARS1, but surrounded by completely unrelated sequence, functioned as an efficient origin. Element B3 is the binding site for the transcription factor-origin binding protein ABF1. Other transcription factor binding sites substitute for B3 element and a trans-acting transcriptional activation domain is required. The multipartite nature of a chromosomal replication origin and the role of transcriptional activators in its function present a striking similarity to the organization of eukaryotic promoters.
- 6Dai, J.; Chuang, R. Y.; Kelly, T. J. DNA replication origins in the Schizosaccharomyces pombe genome. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 337– 342, DOI: 10.1073/pnas.0408811102Google Scholar6https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2MXptFSltA%253D%253D&md5=01796a598a126899d5ee00fe7d4e40f4DNA replication origins in the Schizosaccharomyces pombe genomeDai, Jianli; Chuang, Ray-Yuan; Kelly, Thomas J.Proceedings of the National Academy of Sciences of the United States of America (2005), 102 (2), 337-342CODEN: PNASA6; ISSN:0027-8424. (National Academy of Sciences)Origins of DNA replication in Schizosaccharomyces pombe lack a specific consensus sequence analogous to the Saccharomyces cerevisiae autonomously replicating sequence (ARS) consensus, raising the question of how they are recognized by the replication machinery. Because all well characterized S. pombe origins are located in intergenic regions, we analyzed the sequence properties and biol. activity of such regions. The AT content of intergenes is very high (≈70%), and runs of A's or T's occur with a significantly greater frequency than expected. Addnl., the two DNA strands in intergenes display compositional asymmetry that strongly correlates with the direction of transcription of flanking genes. Importantly, the sequence properties of known S. pombe origins of DNA replication are similar to those of intergenes in general. In functional studies, we assayed the in vivo origin activity of 26 intergenes in a 68-kb region of S. pombe chromosome 2. We also assayed the origin activity of sets of randomly chosen intergenes with the same length or AT content. Our data demonstrate that at least half of intergenes have potential origin activity and that the relative ability of an intergene to function as an origin is governed primarily by AT content and length. We propose a stochastic model for initiation of DNA replication in the fission yeast. 
In this model, the no. of AT tracts in a given sequence is the major determinant of its probability of binding SpORC and serving as a replication origin. A similar model may explain some features of origins of DNA replication in metazoans.
- 7Xu, J.; Yanagisawa, Y.; Tsankov, A. M.; Hart, C.; Aoki, K.; Kommajosyula, N.; Steinmann, K. E.; Bochicchio, J.; Russ, C.; Regev, A.; Rando, O. J.; Nusbaum, C.; Niki, H.; Milos, P.; Weng, Z.; Rhind, N. Genome-wide identification and characterization of replication origins by deep sequencing. Genome Biol. 2012, 13, R27, DOI: 10.1186/gb-2012-13-4-r27Google Scholar7https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XptV2jur0%253D&md5=27aea1830370a1204e20ede571e7da09Genome-wide identification and characterization of replication origins by deep sequencingXu, Jia; Yanagisawa, Yoshimi; Tsankov, Alexander M.; Hart, Christopher; Aoki, Keita; Kommajosyula, Naveen; Steinmann, Kathleen E.; Bochicchio, James; Russ, Carsten; Regev, Aviv; Rando, Oliver J.; Nusbaum, Chad; Niki, Hironori; Milos, Patrice; Weng, Zhiping; Rhind, NicholasGenome Biology (2012), 13 (), R27CODEN: GNBLFW; ISSN:1474-760X. (BioMed Central Ltd.)Background: DNA replication initiates at distinct origins in eukaryotic genomes, but the genomic features that define these sites are not well understood. Results: We have taken a combined exptl. and bioinformatic approach to identify and characterize origins of replication in three distantly related fission yeasts: Schizosaccharomyces pombe, Schizosaccharomyces octosporus and Schizosaccharomyces japonicus. Using single-mol. deep sequencing to construct amplification-free high-resoln. replication profiles, we located origins and identified sequence motifs that predict origin function. We then mapped nucleosome occupancy by deep sequencing of mononucleosomal DNA from the corresponding species, finding that origins tend to occupy nucleosome-depleted regions. Conclusions: The sequences that specify origins are evolutionarily plastic, with low complexity nucleosome-excluding sequences functioning in S. pombe and S. octosporus and binding sites for trans-acting nucleosome-excluding proteins functioning in S. japonicus. 
Furthermore, chromosome-scale variation in replication timing is conserved independently of origin location and via a mechanism distinct from known heterochromatic effects on origin function. These results are consistent with a model in which origins are simply the nucleosome-depleted regions of the genome with the highest affinity for the origin recognition complex. This approach provides a general strategy for understanding the mechanisms that define DNA replication origins in eukaryotes.
- 8Cayrou, C.; Coulombe, P.; Puy, A.; Rialle, S.; Kaplan, N.; Segal, E.; Méchali, M. New insights into replication origin characteristics in metazoans. Cell Cycle 2012, 11, 658– 667, DOI: 10.4161/cc.11.4.19097Google Scholar8https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38Xks1enurw%253D&md5=e8b8e8081dbfcc3f4b7152f609efab8bNew insights into replication origin characteristics in metazoansCayrou, Christelle; Coulombe, Philippe; Puy, Aurore; Rialle, Stephanie; Kaplan, Noam; Segal, Eran; Mechali, MarcelCell Cycle (2012), 11 (4), 658-667CODEN: CCEYAS; ISSN:1538-4101. (Landes Bioscience)We recently reported the identification and characterization of DNA replication origins (Oris) in metazoan cell lines. Here, we describe addnl. bioinformatic analyses showing that the previously identified GC-rich sequence elements form origin G-rich repeated elements (OGREs) that are present in 67% to 90% of the DNA replication origins from Drosophila to human cells, resp. Our analyses also show that initiation of DNA synthesis takes place precisely at 160 bp (Drosophila) and 280 bp (mouse) from the OGRE. We also found that in most CpG islands, an OGRE is positioned in opposite orientation on each of the two DNA strands and detected two sites of initiation of DNA synthesis upstream or downstream of each OGRE. Conversely, Oris not assocd. with CpG islands have a single initiation site. OGRE d. along chromosomes correlated with previously published replication timing data. Ori sequences centered on the OGRE are also predicted to have high intrinsic nucleosome occupancy. Finally, OGREs predict G-quadruplex structures at Oris that might be structural elements controlling the choice or activation of replication origins.
- 9Ghosh, A.; Bansal, M. A glossary of DNA structures from A to Z. Acta Crystallogr. D. Biol. Crystallogr. 2003, 59, 620– 626, DOI: 10.1107/S0907444903003251Google Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3sXit12kt7Y%253D&md5=9d1306e34c1930e36d870b596f28ee55A glossary of DNA structures from A to ZGhosh, Anirban; Bansal, ManjuActa Crystallographica, Section D: Biological Crystallography (2003), D59 (4), 620-626CODEN: ABCRE6; ISSN:0907-4449. (Blackwell Munksgaard)A review. The right-handed double-helical Watson-Crick model for B-form DNA is the most commonly known DNA structure. In addn. to this classic structure, several other forms of DNA have been obsd., and it is clear that the DNA mol. can assume different structures depending on the base sequence and environment. The various forms of DNA have been identified as A, B, C etc. In fact, a detailed inspection of the literature reveals that only the letters F, Q, U, V and Y are now available to describe any new DNA structure that may appear in the future. It is also apparent that it may be more relevant to talk about the A, B or C type dinucleotide steps, since several recent structures show mixts. of various different geometries and a careful anal. is essential before identifying it as a 'new structure'. This review provides a glossary of currently identified DNA structures and is quite timely as it outlines the present understanding of DNA structure exactly 50 yr after the original discovery of DNA structure by Watson and Crick.
- 10Guiblet, W. M.; Cremona, M. A.; Cechova, M.; Harris, R. S.; Kejnovská, I.; Kejnovsky, E.; Eckert, K.; Chiaromonte, F.; Makova, K. D. Long-read sequencing technology indicates genome-wide effects of non-B DNA on polymerization speed and error rate. Genome Res. 2018, 28, 1767– 1778, DOI: 10.1101/gr.241257.118Google Scholar10https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXjsVKjtQ%253D%253D&md5=86c11cc6d9a78dcef3bac531e7e99f36Long-read sequencing technology indicates genome-wide effects of non-B DNA on polymerization speed and error rateGuiblet, Wilfried M.; Cremona, Marzia A.; Cechova, Monika; Harris, Robert S.; Kejnovska, Iva; Kejnovsky, Eduard; Eckert, Kristin; Chiaromonte, Francesca; Makova, Kateryna D.Genome Research (2018), 28 (12), 1767-1778CODEN: GEREFS; ISSN:1088-9051. (Cold Spring Harbor Laboratory Press)DNA conformation may deviate from the classical B-form in ∼13% of the human genome. Non-B DNA regulates many cellular processes; however, its effects on DNA polymn. speed and accuracy have not been investigated genome-wide. Such an inquiry is crit. for understanding neurol. diseases and cancer genome instability. Here, we present the first simultaneous examn. of DNA polymn. kinetics and errors in the human genome sequenced with Single-Mol. Real-Time (SMRT) technol. We show that polymn. speed differs between non-B and B-DNA: It decelerates at G-quadruplexes and fluctuates periodically at disease-causing tandem repeats. Analyzing polymn. kinetics profiles, we predict and validate exptl. non-B DNA formation for a novel motif. We demonstrate that several non-B motifs affect sequencing errors (e.g., G-quadruplexes increase error rates), and that sequencing errors are pos. assocd. with polymerase slowdown. Finally, we show that highly divergent G4 motifs have pronounced polymn. slowdown and high sequencing error rates, suggesting similar mechanisms for sequencing errors and germline mutations.
- 11Marathe, A.; Karandur, D.; Bansal, M. Small local variations in B-form DNA lead to a large variety of global geometries which can accommodate most DNA-binding protein motifs. BMC Struct. Biol. 2009, 9, 24, DOI: 10.1186/1472-6807-9-24Google Scholar11https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BD1MzntlCitg%253D%253D&md5=85d1302530d372af53200164a807f2bbSmall local variations in B-form DNA lead to a large variety of global geometries which can accommodate most DNA-binding protein motifsMarathe Arvind; Karandur Deepti; Bansal ManjuBMC structural biology (2009), 9 (), 24 ISSN:.BACKGROUND: An important question of biological relevance is the polymorphism of the double-helical DNA structure in its free form, and the changes that it undergoes upon protein-binding. We have analysed a database of free DNA crystal structures to assess the inherent variability of the free DNA structure and have compared it with a database of protein-bound DNA crystal structures to ascertain the protein-induced variations. RESULTS: Most of the dinucleotide steps in free DNA display high flexibility, assuming different conformations in a sequence-dependent fashion. With the exception of the AA/TT and GA/TC steps, which are 'A-phobic', and the GG/CC step, which is 'A-philic', the dinucleotide steps show no preference for A or B forms of DNA. Protein-bound DNA adopts the B-conformation most often. However, in certain cases, protein-binding causes the DNA backbone to take up energetically unfavourable conformations. At the gross structural level, several protein-bound DNA duplexes are observed to assume a curved conformation in the absence of any large distortions, indicating that a series of normal structural parameters at the dinucleotide and trinucleotide level, similar to the ones in free B-DNA, can give rise to curvature at the overall level. 
CONCLUSION: The results illustrate that the free DNA molecule, even in the crystalline state, samples a large amount of conformational space, encompassing both the A and the B-forms, in the absence of any large ligands. A-form as well as some non-A, non-B, distorted geometries are observed for a small number of dinucleotide steps in DNA structures bound to the proteins belonging to a few specific families. However, for most of the bound DNA structures, across a wide variety of protein families, the average step parameters for various dinucleotide sequences as well as backbone torsion angles are observed to be quite close to the free 'B-like' DNA oligomer values, highlighting the flexibility and biological significance of this structural form.
- 12Gorin, A. A.; Zhurkin, V. B.; Wima, K. B-DNA twisting correlates with base-pair morphology. J. Mol. Biol. 1995, 247, 34– 48, DOI: 10.1006/jmbi.1994.0120Google Scholar12https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK2MXkslSmsr8%253D&md5=201abbbf842e6fafdc9ac9ca86a125e7B-DNA twisting correlates with base-pair morphologyGorin, Andrey A.; Zhurkin, Victor B.; Olson, Wilma K.Journal of Molecular Biology (1995), 247 (1), 34-48CODEN: JMOBAK; ISSN:0022-2836. (Academic)The obsd. sequence dependence of the mean twist angles in 38 B-DNA crystal structures can be understood in terms of simple geometrical features of the constituent base-pairs. Structures with low twist appear to unwind in response to severe steric clashes of large exocyclic groups (such as NH2-NH2) in the major and minor grooves, while those with high twist are subjected to lesser contacts (H-O and H-H). The authors offer a simple clash function that depends on base-pair morphol. (i.e. the chem. constitution of base-pairs) and satisfactorily accounts for the twist angles of the ten common Watson-Crick dimer steps both in the solid state and in soln. The twist-clash correlation that the authors find here still holds when extended to modified bases. In addn. to Calladine's purine-purine clashes, the authors add other close contacts between bases in the grooves, and consider the conformational restrictions on the geometry of the sugar-phosphate backbone (namely, the authors emphasize the tendency of DNA to conserve virtual backbone length). The significance of this finding is threefold: (1) sequence-dependent DNA twisting is directly involved with protein-DNA interactions; (2) strong correlation between Twist and Roll helps to elucidate the bending of the double helix as a function of base sequence; (3) it is possible to anticipate the effects of chem. modifications on twisting and bending. 
The mutual correlations of other structural parameters with the twist make this angle a primary determinant of DNA conformational heterogeneity.
- 13Drew, H. R.; Dickerson, R. E. Structure of a B-DNA dodecamer. III. Geometry of hydration. J. Mol. Biol. 1981, 151, 535– 556, DOI: 10.1016/0022-2836(81)90009-7Google Scholar13https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaL38XmsVentg%253D%253D&md5=4378664a4da9ec83155a3f7e046027d9Structure of a B-DNA dodecamer. III. Geometry of hydrationDrew, Horace R.; Dickerson, Richard E.Journal of Molecular Biology (1981), 151 (3), 535-56CODEN: JMOBAK; ISSN:0022-2836.The B-DNA dodecamer, C-G-C-G-A-A-T-T-C-G-C-G, crystd. as slightly more than 1 full turn of right-handed B-DNA in space group P212121 with cell dimensions a = 24.87 Å, b = 40.39 Å, and c = 66.20 Å. X-ray anal. showed that it was surrounded by 72 ordered water mols., mostly assocd. with polar N and O atoms at exposed edges of base-pairs. Hydration in the major groove was mainly in the form of a monodentate monolayer. Hydration of backbone phosphate O atoms were not ordered, except when immobilized by the 5-Me groups of adjacent thymines. The minor groove was extensively and regularly hydrated, with a zigzag spine of 1st- and 2nd-shell hydration along the floor of the groove serving as a foundation for less-regular outer shells extending beyond the radius of the phosphate backbone. The spine network bridged purine N-3 and pyrimidine O-2 atoms in adjacent base pairs; it was regular in the A-A-T-T center, but was disrupted at the C-G-C-G ends, partly by the guanine N-2 amino groups. The minor groove hydration spine may be responsible for the stability of the B form of polymers contg. only A-T and I-C base pairs, and its disruption may explain the ease of transition to the A form of polymers with G-C pairs.
- 14. Rohs, R.; West, S. M.; Sosinsky, A.; Liu, P.; Mann, R. S.; Honig, B. The role of DNA shape in protein-DNA recognition. Nature 2009, 461, 1248–1253, DOI: 10.1038/nature08473
- 15. Morey, C.; Mookherjee, S.; Rajasekaran, G.; Bansal, M. DNA free energy-based promoter prediction and comparative analysis of Arabidopsis and rice genomes. Plant Physiol. 2011, 156, 1300–1315, DOI: 10.1104/pp.110.167809
- 16. Yella, V. R.; Bansal, M. DNA structural features and architecture of promoter regions play a role in gene responsiveness of S. cerevisiae. J. Bioinform. Comput. Biol. 2013, 11, 1343001, DOI: 10.1142/S0219720013430014
- 17. Yella, V. R.; Bansal, M. DNA structural features of eukaryotic TATA-containing and TATA-less promoters. FEBS Open Bio 2017, 7, 324–334, DOI: 10.1002/2211-5463.12166
- 18. Yella, V. R.; Kumar, A.; Bansal, M. DNA Structure and Promoter Engineering. In Systems and Synthetic Biology; Singh, V., Dhar, P. K., Eds.; Springer Netherlands: Dordrecht, 2015; ISBN 978-94-017-9514-2; pp 241–254.
- 19. Yella, V. R.; Kumar, A.; Bansal, M. Identification of putative promoters in 48 eukaryotic genomes on the basis of DNA free energy. Sci. Rep. 2018, 8, 4520, DOI: 10.1038/s41598-018-22129-8
- 20. Kumar, A.; Bansal, M. Modulation of Gene Expression by Gene Architecture and Promoter Structure. In Bioinformatics in the Era of Post Genomics and Big Data; Abdurakhmonov, I. Y., Ed.; IntechOpen, 2018; pp 37–53.
- 21. Bansal, M.; Kumar, A.; Yella, V. R. Role of DNA sequence based structural features of promoters in transcription initiation and gene expression. Curr. Opin. Struct. Biol. 2014, 25, 77–85, DOI: 10.1016/j.sbi.2014.01.007
- 22. Kanhere, A.; Bansal, M. Structural properties of promoters: similarities and differences between prokaryotes and eukaryotes. Nucleic Acids Res. 2005, 33, 3165–3175, DOI: 10.1093/nar/gki627
- 23. Kumar, A.; Bansal, M. Unveiling DNA structural features of promoters associated with various types of TSSs in prokaryotic transcriptomes and their role in gene expression. DNA Res. 2017, 24, 25–35, DOI: 10.1093/dnares/dsw045
- 24. Marin-Gonzalez, A.; Vilhena, J. G.; Moreno-Herrero, F.; Perez, R. DNA Crookedness Regulates DNA Mechanical Properties at Short Length Scales. Phys. Rev. Lett. 2019, 122, 048102, DOI: 10.1103/PhysRevLett.122.048102
- 25. Parker, S. C. J.; Hansen, L.; Abaan, H. O.; Tullius, T. D.; Margulies, E. H. Local DNA topography correlates with functional noncoding regions of the human genome. Science 2009, 324, 389–392, DOI: 10.1126/science.1169050
- 26. Meysman, P.; Marchal, K.; Engelen, K. DNA structural properties in the classification of genomic transcription regulation elements. Bioinform. Biol. Insights 2012, 6, 155–168, DOI: 10.4137/BBI.S9426
- 27. Yella, V. R.; Bhimsaria, D.; Ghoshdastidar, D.; Rodríguez-Martínez, J. A.; Ansari, A. Z.; Bansal, M. Flexibility and structure of flanking DNA impact transcription factor affinity for its core motif. Nucleic Acids Res. 2018, 46, 11883–11897, DOI: 10.1093/nar/gky1057
- 28. Kumar, A.; Manivelan, V.; Bansal, M. Structural features of DNA are conserved in the promoter region of orthologous genes across different strains of Helicobacter pylori. FEMS Microbiol. Lett. 2016, 363, fnw207, DOI: 10.1093/femsle/fnw207
- 29. Cao, X. Q.; Zeng, J.; Yan, H. Structural properties of replication origins in yeast DNA sequences. Phys. Biol. 2008, 5, 036012, DOI: 10.1088/1478-3975/5/3/036012
- 30. Comoglio, F.; Schlumpf, T.; Schmid, V.; Rohs, R.; Beisel, C.; Paro, R. High-resolution profiling of Drosophila replication start sites reveals a DNA shape and chromatin signature of metazoan origins. Cell Rep. 2015, 11, 821–834, DOI: 10.1016/j.celrep.2015.03.070
- 31. Gao, F.; Luo, H.; Zhang, C. T. DeOri: a database of eukaryotic DNA replication origins. Bioinformatics 2012, 28, 1551–1552, DOI: 10.1093/bioinformatics/bts151
- 32. Chen, W.; Feng, P.; Lin, H. Prediction of replication origins by calculating DNA structural properties. FEBS Lett. 2012, 586, 934–938, DOI: 10.1016/j.febslet.2012.02.034
- 33. Kumar, A.; Bansal, M. Characterization of structural and free energy properties of promoters associated with Primary and Operon TSS in Helicobacter pylori genome and their orthologs. J. Biosci. 2012, 37, 423–431, DOI: 10.1007/s12038-012-9214-6
- 34. Cao, X. Q.; Zeng, J.; Yan, H. Physical signals for protein-DNA recognition. Phys. Biol. 2009, 6, 036012, DOI: 10.1088/1478-3975/6/3/036012
- 35. Bleichert, F.; Botchan, M. R.; Berger, J. M. Mechanisms for initiating cellular DNA replication. Science 2017, 355, eaah6317, DOI: 10.1126/science.aah6317
- 36. Gai, D.; Chang, Y. P.; Chen, X. S. Origin DNA melting and unwinding in DNA replication. Curr. Opin. Struct. Biol. 2010, 20, 756–762, DOI: 10.1016/j.sbi.2010.08.009
- 37. Rajewska, M.; Wegrzyn, K.; Konieczny, I. AT-rich region and repeated sequences - the essential elements of replication origins of bacterial replicons. FEMS Microbiol. Rev. 2012, 36, 408–434, DOI: 10.1111/j.1574-6976.2011.00300.x
- 38. Dao, F. Y.; Lv, H.; Wang, F.; Feng, C. Q.; Ding, H.; Chen, W.; Lin, H. Identify origin of replication in Saccharomyces cerevisiae using two-step feature selection technique. Bioinformatics 2019, 35, 2075–2083, DOI: 10.1093/bioinformatics/bty943
- 39. Li, W.-C.; Deng, E.-Z.; Ding, H.; Chen, W.; Lin, H. iORI-PseKNC: A predictor for identifying origin of replication with pseudo k-tuple nucleotide composition. Chemom. Intell. Lab. Syst. 2015, 141, 100–106, DOI: 10.1016/j.chemolab.2014.12.011
- 40. Zhang, C. J.; Tang, H.; Li, W. C.; Lin, H.; Chen, W.; Chou, K. C. iOri-Human: identify human origin of replication by incorporating dinucleotide physicochemical properties into pseudo nucleotide composition. Oncotarget 2016, 7, 69783–69793, DOI: 10.18632/oncotarget.11975
- 41. Gowers, D. M.; Wilson, G. G.; Halford, S. E. Measurement of the contributions of 1D and 3D pathways to the translocation of a protein along DNA. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 15883–15888, DOI: 10.1073/pnas.0505378102
- 42. Halford, S. E.; Marko, J. F. How do site-specific DNA-binding proteins find their targets? Nucleic Acids Res. 2004, 32, 3040–3052, DOI: 10.1093/nar/gkh624
- 43. Jiang, C.; Pugh, B. F. Nucleosome positioning and gene regulation: advances through genomics. Nat. Rev. Genet. 2009, 10, 161–172, DOI: 10.1038/nrg2522
- 44. Hoskins, R. A.; Landolin, J. M.; Brown, J. B.; Sandler, J. E.; Takahashi, H.; Lassmann, T.; Yu, C.; Booth, B. W.; Zhang, D.; Wan, K. H.; Yang, L.; Boley, N.; Andrews, J.; Kaufman, T. C.; Graveley, B. R.; Bickel, P. J.; Carninci, P.; Carlson, J. W.; Celniker, S. E. Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 2011, 21, 182–192, DOI: 10.1101/gr.112466.110
- 45. Dao, F. Y.; Lv, H.; Zulfiqar, H.; Yang, H.; Su, W.; Gao, H.; Ding, H.; Lin, H. A computational platform to identify origins of replication sites in eukaryotes. Brief Bioinform 2020, DOI: 10.1093/bib/bbaa017
- 46. Takai, D.; Jones, P. A. Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 3740–3745, DOI: 10.1073/pnas.052410099
- 47. Mirkin, E. V.; Mirkin, S. M. Replication fork stalling at natural impediments. Microbiol. Mol. Biol. Rev. 2007, 71, 13–35, DOI: 10.1128/MMBR.00030-06
- 48. Kaushik Tiwari, M.; Adaku, N.; Peart, N.; Rogers, F. A. Triplex structures induce DNA double strand breaks via replication fork collapse in NER deficient cells. Nucleic Acids Res. 2016, 44, 7742–7754, DOI: 10.1093/nar/gkw515
- 49. Prioleau, M. N.; MacAlpine, D. M. DNA replication origins-where do we begin? Genes Dev. 2016, 30, 1683–1697, DOI: 10.1101/gad.285114.116
- 50. Cayrou, C.; Ballester, B.; Peiffer, I.; Fenouil, R.; Coulombe, P.; Andrau, J.-C.; van Helden, J.; Méchali, M. The chromatin environment shapes DNA replication origin organization and defines origin classes. Genome Res. 2015, 25, 1873–1885, DOI: 10.1101/gr.192799.115
- 51. Antequera, F. Structure, function and evolution of CpG island promoters. Cell. Mol. Life Sci. 2003, 60, 1647–1658, DOI: 10.1007/s00018-003-3088-6
- 52. Delgado, S.; Gómez, M.; Bird, A.; Antequera, F. Initiation of DNA replication at CpG islands in mammalian chromosomes. EMBO J. 1998, 17, 2426–2435, DOI: 10.1093/emboj/17.8.2426
- 53. Eaton, M. L.; Galani, K.; Kang, S.; Bell, S. P.; MacAlpine, D. M. Conserved nucleosome positioning defines replication origins. Genes Dev. 2010, 24, 748–753, DOI: 10.1101/gad.1913210
- 54. Li, W.-C.; Zhong, Z.-J.; Zhu, P.-P.; Deng, E.-Z.; Ding, H.; Chen, W.; Lin, H. Sequence analysis of origins of replication in the Saccharomyces cerevisiae genomes. Front. Microbiol. 2014, 5, 574, DOI: 10.3389/fmicb.2014.00574
- 55. Gilbert, D. M. Evaluating genome-scale approaches to eukaryotic DNA replication. Nat. Rev. Genet. 2010, 11, 673–684, DOI: 10.1038/nrg2830
- 56. Tyner, C.; Barber, G. P.; Casper, J.; Clawson, H.; Diekhans, M.; Eisenhart, C.; Fischer, C. M.; Gibson, D.; Gonzalez, J. N.; Guruvadoo, L.; Haeussler, M.; Heitner, S.; Hinrichs, A. S.; Karolchik, D.; Lee, B. T.; Lee, C. M.; Nejad, P.; Raney, B. J.; Rosenbloom, K. R.; Speir, M. L.; Villarreal, C.; Vivian, J.; Zweig, A. S.; Haussler, D.; Kuhn, R. M.; Kent, W. J. The UCSC Genome Browser database: 2017 update. Nucleic Acids Res. 2017, 45, D626–D634, DOI: 10.1093/nar/gkw1134
- 57. SantaLucia, J., Jr. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl. Acad. Sci. U. S. A. 1998, 95, 1460–1465, DOI: 10.1073/pnas.95.4.1460
- 58. Anselmi, C.; De Santis, P.; Paparcone, R.; Savino, M.; Scipioni, A. From the sequence to the superstructural properties of DNAs. Biophys. Chem. 2002, 95, 23–47, DOI: 10.1016/S0301-4622(01)00246-0
- 59. Brukner, I.; Sánchez, R.; Suck, D.; Pongor, S. Trinucleotide models for DNA bending propensity: comparison of models based on DNaseI digestion and nucleosome packaging data. J. Biomol. Struct. Dyn. 1995, 13, 309–317, DOI: 10.1080/07391102.1995.10508842
- 60. Satchwell, S. C.; Drew, H. R.; Travers, A. A. Sequence periodicities in chicken nucleosome core DNA. J. Mol. Biol. 1986, 191, 659–675, DOI: 10.1016/0022-2836(86)90452-3
- 61. Friedel, M.; Nikolajewa, S.; Sühnel, J.; Wilhelm, T. DiProDB: a database for dinucleotide properties. Nucleic Acids Res. 2009, 37, D37–D40, DOI: 10.1093/nar/gkn597
- 62. Qin, Y.; Hurley, L. H. Structures, folding patterns, and functions of intramolecular DNA G-quadruplexes found in eukaryotic promoter regions. Biochimie 2008, 90, 1149–1171, DOI: 10.1016/j.biochi.2008.02.020
- 63. Todd, A. K.; Johnston, M.; Neidle, S. Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res. 2005, 33, 2901–2907, DOI: 10.1093/nar/gki553
- 64. Zeraati, M.; Langley, D. B.; Schofield, P.; Moye, A. L.; Rouet, R.; Hughes, W. E.; Bryan, T. M.; Dinger, M. E.; Christ, D. I-motif DNA structures are formed in the nuclei of human cells. Nat. Chem. 2018, 10, 631–637, DOI: 10.1038/s41557-018-0046-3
- 65. Drew, H. R.; Travers, A. A. DNA bending and its relation to nucleosome positioning. J. Mol. Biol. 1985, 186, 773–790, DOI: 10.1016/0022-2836(85)90396-1
- 66. Tsankov, A.; Yanagisawa, Y.; Rhind, N.; Regev, A.; Rando, O. J. Evolutionary divergence of intrinsic and trans-regulated nucleosome positioning sequences reveals plastic rules for chromatin organization. Genome Res. 2011, 21, 1851–1862, DOI: 10.1101/gr.122267.111
References
This article references 66 other publications.
- 1Masai, H.; Matsumoto, S.; You, Z.; Yoshizawa-Sugata, N.; Oda, M. Eukaryotic chromosome DNA replication: where, when, and how?. Annu. Rev. Biochem. 2010, 79, 89– 130, DOI: 10.1146/annurev.biochem.052308.1032051https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXpslShsbs%253D&md5=c8a0dfe38359a7eb2fe724d0693116b2Eukaryotic chromosome DNA replication: where, when, and how?Masai, Hisao; Matsumoto, Seiji; You, Zhiying; Yoshizawa-Sugata, Naoko; Oda, MasakoAnnual Review of Biochemistry (2010), 79 (), 89-130CODEN: ARBOAW; ISSN:0066-4154. (Annual Reviews Inc.)A review. DNA replication is central to cell proliferation. Studies in the past six decades since the proposal of a semiconservative mode of DNA replication have confirmed the high degree of conservation of the basic machinery of DNA replication from prokaryotes to eukaryotes. However, the need for replication of a substantially longer segment of DNA in coordination with various internal and external signals in eukaryotic cells has led to more complex and versatile regulatory strategies. The replication program in higher eukaryotes is under a dynamic and plastic regulation within a single cell, or within the cell population, or during development. We review here various regulatory mechanisms that control the replication program in eukaryotes and discuss future directions in this dynamic field.
- 2Aladjem, M. I.; Redon, C. E. Order from clutter: selective interactions at mammalian replication origins. Nat. Rev. Genet. 2017, 18, 101– 116, DOI: 10.1038/nrg.2016.1412https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhvV2iur7E&md5=297fb77793aee0fe319fe3af7dcb2a84Order from clutter: selective interactions at mammalian replication originsAladjem, Mirit I.; Redon, Christophe E.Nature Reviews Genetics (2017), 18 (2), 101-116CODEN: NRGAAM; ISSN:1471-0056. (Nature Publishing Group)Mammalian chromosome duplication progresses in a precise order and is subject to constraints that are often relaxed in developmental disorders and malignancies. Mol. information about the regulation of DNA replication at the chromatin level is lacking because protein complexes that initiate replication seem to bind chromatin indiscriminately. High-throughput sequencing and math. modeling have yielded detailed genome-wide replication initiation maps. Combining these maps and models with functional genetic analyses suggests that distinct DNA-protein interactions at subgroups of replication initiation sites (replication origins) modulate the ubiquitous replication machinery and supports an emerging model that delineates how indiscriminate DNA-binding patterns translate into a consistent, organized replication program.
- 3Fragkos, M.; Ganier, O.; Coulombe, P.; Méchali, M. DNA replication origin activation in space and time. Nat. Rev. Mol. Cell Biol. 2015, 16, 360– 374, DOI: 10.1038/nrm40023https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXht1WltbrP&md5=dc4e7fed0cfedf7e71d31a8fa3434751DNA replication origin activation in space and timeFragkos, Michalis; Ganier, Olivier; Coulombe, Philippe; Mechali, MarcelNature Reviews Molecular Cell Biology (2015), 16 (6), 360-374CODEN: NRMCBP; ISSN:1471-0072. (Nature Publishing Group)A review. DNA replication begins with the assembly of pre-replication complexes (pre-RCs) at thousands of DNA replication origins during the G1 phase of the cell cycle. At the G1-S-phase transition, pre-RCs are converted into pre-initiation complexes, in which the replicative helicase is activated, leading to DNA unwinding and initiation of DNA synthesis. However, only a subset of origins are activated during any S phase. Recent insights into the mechanisms underlying this choice reveal how flexibility in origin usage and temporal activation are linked to chromosome structure and organization, cell growth and differentiation, and replication stress.
- 4Jacob, F.; Brenner, S. On the regulation of DNA synthesis in bacteria: the hypothesis of the replicon. C R Hebd. Seances Acad. Sci. 1963, 256, 298– 3004https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADyaF387ls1Snsw%253D%253D&md5=f717738803200309a7b33310aec1b128On the regulation of DNA synthesis in bacteria: the hypothesis of the repliconJACOB F; BRENNER SComptes rendus hebdomadaires des seances de l'Academie des sciences (1963), 256 (), 298-300 ISSN:0001-4036.There is no expanded citation for this reference.
- 5Marahrens, Y.; Stillman, B. A yeast chromosomal origin of DNA replication defined by multiple functional elements. Science 1992, 255, 817– 823, DOI: 10.1126/science.15360075https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK38Xht1amu70%253D&md5=bcabc93006be8d984b0314d9cd4d8215A yeast chromosomal origin of DNA replication defined by multiple functional elementsMarahrens, York; Stillman, BruceScience (Washington, DC, United States) (1992), 255 (5046), 817-23CODEN: SCIEAS; ISSN:0036-8075.Although it has been demonstrated that discrete origins of DNA replication exist in eukaryotic cellular chromosomes, the detailed organization of a eukaryotic cellular origin remains to be detd. Linker substitution mutations were constructed across the entire Saccharomyces cerevisiae chromosomal origin, ARS1. Functional studies of these mutants revealed 1 essential element (A), which includes a match to the ARS consensus sequence, and 3 addnl. elements (B1, B2, and B3), which collectively are also essential for origin function. These 4 elements arranged exactly as in ARS1, but surrounded by completely unrelated sequence, functioned as an efficient origin. Element B3 is the binding site for the transcription factor-origin binding protein ABF1. Other transcription factor binding sites substitute for B3 element and a trans-acting transcriptional activation domain is required. The multipartite nature of a chromosomal replication origin and the role of transcriptional activators in its function present a striking similarity to the organization of eukaryotic promoters.
- 6. Dai, J.; Chuang, R. Y.; Kelly, T. J. DNA replication origins in the Schizosaccharomyces pombe genome. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 337–342, DOI: 10.1073/pnas.0408811102
- 7. Xu, J.; Yanagisawa, Y.; Tsankov, A. M.; Hart, C.; Aoki, K.; Kommajosyula, N.; Steinmann, K. E.; Bochicchio, J.; Russ, C.; Regev, A.; Rando, O. J.; Nusbaum, C.; Niki, H.; Milos, P.; Weng, Z.; Rhind, N. Genome-wide identification and characterization of replication origins by deep sequencing. Genome Biol. 2012, 13, R27, DOI: 10.1186/gb-2012-13-4-r27
- 8. Cayrou, C.; Coulombe, P.; Puy, A.; Rialle, S.; Kaplan, N.; Segal, E.; Méchali, M. New insights into replication origin characteristics in metazoans. Cell Cycle 2012, 11, 658–667, DOI: 10.4161/cc.11.4.19097
- 9. Ghosh, A.; Bansal, M. A glossary of DNA structures from A to Z. Acta Crystallogr. D. Biol. Crystallogr. 2003, 59, 620–626, DOI: 10.1107/S0907444903003251
- 10. Guiblet, W. M.; Cremona, M. A.; Cechova, M.; Harris, R. S.; Kejnovská, I.; Kejnovsky, E.; Eckert, K.; Chiaromonte, F.; Makova, K. D. Long-read sequencing technology indicates genome-wide effects of non-B DNA on polymerization speed and error rate. Genome Res. 2018, 28, 1767–1778, DOI: 10.1101/gr.241257.118
- 11. Marathe, A.; Karandur, D.; Bansal, M. Small local variations in B-form DNA lead to a large variety of global geometries which can accommodate most DNA-binding protein motifs. BMC Struct. Biol. 2009, 9, 24, DOI: 10.1186/1472-6807-9-24
- 12. Gorin, A. A.; Zhurkin, V. B.; Olson, W. K. B-DNA twisting correlates with base-pair morphology. J. Mol. Biol. 1995, 247, 34–48, DOI: 10.1006/jmbi.1994.0120
- 13. Drew, H. R.; Dickerson, R. E. Structure of a B-DNA dodecamer. III. Geometry of hydration. J. Mol. Biol. 1981, 151, 535–556, DOI: 10.1016/0022-2836(81)90009-7
- 14. Rohs, R.; West, S. M.; Sosinsky, A.; Liu, P.; Mann, R. S.; Honig, B. The role of DNA shape in protein-DNA recognition. Nature 2009, 461, 1248–1253, DOI: 10.1038/nature08473
- 15. Morey, C.; Mookherjee, S.; Rajasekaran, G.; Bansal, M. DNA free energy-based promoter prediction and comparative analysis of Arabidopsis and rice genomes. Plant Physiol. 2011, 156, 1300–1315, DOI: 10.1104/pp.110.167809
- 16. Yella, V. R.; Bansal, M. DNA structural features and architecture of promoter regions play a role in gene responsiveness of S. cerevisiae. J. Bioinform. Comput. Biol. 2013, 11, 1343001, DOI: 10.1142/S0219720013430014
- 17. Yella, V. R.; Bansal, M. DNA structural features of eukaryotic TATA-containing and TATA-less promoters. FEBS Open Bio 2017, 7, 324–334, DOI: 10.1002/2211-5463.12166
- 18. Yella, V. R.; Kumar, A.; Bansal, M. DNA Structure and Promoter Engineering. In Systems and Synthetic Biology; Singh, V.; Dhar, P. K., Eds.; Springer Netherlands: Dordrecht, 2015; ISBN 978-94-017-9514-2; pp 241–254.
- 19. Yella, V. R.; Kumar, A.; Bansal, M. Identification of putative promoters in 48 eukaryotic genomes on the basis of DNA free energy. Sci. Rep. 2018, 8, 4520, DOI: 10.1038/s41598-018-22129-8
- 20. Kumar, A.; Bansal, M. Modulation of Gene Expression by Gene Architecture and Promoter Structure. In Bioinformatics in the Era of Post Genomics and Big Data; Abdurakhmonov, I. Y., Ed.; IntechOpen: 2018; pp 37–53.
- 21. Bansal, M.; Kumar, A.; Yella, V. R. Role of DNA sequence based structural features of promoters in transcription initiation and gene expression. Curr. Opin. Struct. Biol. 2014, 25, 77–85, DOI: 10.1016/j.sbi.2014.01.007
- 22. Kanhere, A.; Bansal, M. Structural properties of promoters: similarities and differences between prokaryotes and eukaryotes. Nucleic Acids Res. 2005, 33, 3165–3175, DOI: 10.1093/nar/gki627
- 23. Kumar, A.; Bansal, M. Unveiling DNA structural features of promoters associated with various types of TSSs in prokaryotic transcriptomes and their role in gene expression. DNA Res. 2017, 24, 25–35, DOI: 10.1093/dnares/dsw045
- 24. Marin-Gonzalez, A.; Vilhena, J. G.; Moreno-Herrero, F.; Perez, R. DNA Crookedness Regulates DNA Mechanical Properties at Short Length Scales. Phys. Rev. Lett. 2019, 122, 048102, DOI: 10.1103/PhysRevLett.122.048102
- 25Parker, S. C. J.; Hansen, L.; Abaan, H. O.; Tullius, T. D.; Margulies, E. H. Local DNA topography correlates with functional noncoding regions of the human genome. Science 2009, 324, 389– 392, DOI: 10.1126/science.116905025https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1MXksVOltr0%253D&md5=f6d1ebe92eb784e7672b5e9a0f8c5b65Local DNA topography correlates with functional noncoding regions of the human genomeParker, Stephen C. J.; Hansen, Loren; Abaan, Hatice Ozel; Tullius, Thomas D.; Margulies, Elliott H.Science (Washington, DC, United States) (2009), 324 (5925), 389-392CODEN: SCIEAS; ISSN:0036-8075. (American Association for the Advancement of Science)The three-dimensional mol. structure of DNA, specifically the shape of the backbone and grooves of genomic DNA, can be dramatically affected by nucleotide changes, which can cause differences in protein-binding affinity and phenotype. The authors developed an algorithm to measure constraint on the basis of similarity of DNA topog. among multiple species, using hydroxyl radical cleavage patterns to interrogate the solvent-accessible surface area of DNA. This algorithm found that 12% of bases in the human genome are evolutionarily constrained-double the no. detected by nucleotide sequence-based algorithms. Topog.-informed constrained regions correlated with functional noncoding elements, including enhancers, better than did regions identified solely on the basis of nucleotide sequence. These results support the idea that the mol. shape of DNA is under selection and can identify evolutionary history.
- 26Meysman, P.; Marchal, K.; Engelen, K. DNA structural properties in the classification of genomic transcription regulation elements. Bioinform. Biol. Insights 2012, 6, 155– 168, DOI: 10.4137/BBI.S942626https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XhtVars77J&md5=9b66d13eba1a584da9a8398d8c1026d5DNA structural properties in the classification of genomic transcription regulation elementsMeysman, Pieter; Marchal, Kathleen; Engelen, KristofBioinformatics and Biology Insights (2012), 6 (), 155-168CODEN: BBIIGM; ISSN:1177-9322. (Libertas Academica)A review. It has been long known that DNA mols. encode information at various levels. The most basic level comprises the base sequence itself and is primarily important for the encoding of proteins and direct base recognition by DNA-binding proteins. A more elusive level consists of the local structural properties of the DNA mol. wherein the DNA sequence only plays an indirect supportive role. These properties are nevertheless an important factor in a large no. of biomol. processes and can be considered as informative signals for the presence of a variety of genomic features. Several recent studies have unequivocally shown the benefit of relying on such DNA properties for modeling and predicting genomic features as diverse as transcription start sites, transcription factor binding sites, or nucleosome occupancy. This review is meant to provide an overview of the key aspects of these DNA conformational and physicochem. properties. To illustrate their potential added value compared to relying solely on the nucleotide sequence in genomics studies, we discuss their application in research on transcription regulation mechanisms as representative cases.
- 27Yella, V. R.; Bhimsaria, D.; Ghoshdastidar, D.; Rodríguez-Martínez, J. A.; Ansari, A. Z.; Bansal, M. Flexibility and structure of flanking DNA impact transcription factor affinity for its core motif. Nucleic Acids Res. 2018, 46, 11883– 11897, DOI: 10.1093/nar/gky105727https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXovFWgurg%253D&md5=abd9ce678c06f8b6e2681471ba0f1ba8Flexibility and structure of flanking DNA impact transcription factor affinity for its core motifYella, Venkata Rajesh; Bhimsaria, Devesh; Ghoshdastidar, Debostuti; Rodriguez-Martinez, Jose A.; Ansari, Aseem Z.; Bansal, ManjuNucleic Acids Research (2018), 46 (22), 11883-11897CODEN: NARHAD; ISSN:1362-4962. (Oxford University Press)Spatial and temporal expression of genes is essential for maintaining phenotype integrity. Transcription factors (TFs) modulate expression patterns by binding to specific DNA sequences in the genome. Along with the core binding motif, the flanking sequence context can play a role in DNA-TF recognition. Here, we employ high-throughput in vitro and in silico analyses to understand the influence of sequences flanking the cognate sites in binding of three most prevalent eukaryotic TF families (zinc finger, homeodomain and bZIP). In vitro binding preferences of each TF toward the entire DNA sequence space were correlated with a wide range of DNA structural parameters, including DNA flexibility. Results demonstrate that conformational plasticity of flanking regions modulates binding affinity of certain TF families. DNA duplex stability and minor groove width also play an important role in DNA-TF recognition but differ in how exactly they influence the binding in each specific case. Our analyses further reveal that the structural features of preferred flanking sequences are not universal, as similar DNA-binding folds can employ distinct DNA recognition modes.
- 28Kumar, A.; Manivelan, V.; Bansal, M. Structural features of DNA are conserved in the promoter region of orthologous genes across different strains of Helicobacter pylori. FEMS Microbiol. Lett. 2016, 363, fnv207, DOI: 10.1093/femsle/fnw207There is no corresponding record for this reference.
- 29Cao, X. Q.; Zeng, J.; Yan, H. Structural properties of replication origins in yeast DNA sequences. Phys. Biol. 2008, 5, 036012 DOI: 10.1088/1478-3975/5/3/03601229https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BD1cnjtlKntA%253D%253D&md5=c65d783931fb9bc46f6979a5ed53e2cfStructural properties of replication origins in yeast DNA sequencesCao Xiao-Qin; Zeng Jia; Yan HongPhysical biology (2008), 5 (3), 036012 ISSN:.Sequence-dependent DNA flexibility is an important structural property originating from the DNA 3D structure. In this paper, we investigate the DNA flexibility of the budding yeast (S. Cerevisiae) replication origins on a genome-wide scale using flexibility parameters from two different models, the trinucleotide and the tetranucleotide models. Based on analyzing average flexibility profiles of 270 replication origins, we find that yeast replication origins are significantly rigid compared with their surrounding genomic regions. To further understand the highly distinctive property of replication origins, we compare the flexibility patterns between yeast replication origins and promoters, and find that they both contain significantly rigid DNAs. Our results suggest that DNA flexibility is an important factor that helps proteins recognize and bind the target sites in order to initiate DNA replication. Inspired by the role of the rigid region in promoters, we speculate that the rigid replication origins may facilitate binding of proteins, including the origin recognition complex (ORC), Cdc6, Cdt1 and the MCM2-7 complex.
- 30Comoglio, F.; Schlumpf, T.; Schmid, V.; Rohs, R.; Beisel, C.; Paro, R. High-resolution profiling of Drosophila replication start sites reveals a DNA shape and chromatin signature of metazoan origins. Cell Rep. 2015, 11, 821– 834, DOI: 10.1016/j.celrep.2015.03.07030https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXntFKjurw%253D&md5=870f20a0e1fbb9fb3aebec7d857b6657High-Resolution Profiling of Drosophila Replication Start Sites Reveals a DNA Shape and Chromatin Signature of Metazoan OriginsComoglio, Federico; Schlumpf, Tommy; Schmid, Virginia; Rohs, Remo; Beisel, Christian; Paro, RenatoCell Reports (2015), 11 (5), 821-834CODEN: CREED8; ISSN:2211-1247. (Cell Press)At every cell cycle, faithful inheritance of metazoan genomes requires the concerted activation of thousands of DNA replication origins. However, the genetic and chromatin features defining metazoan replication start sites remain largely unknown. Here, we delineate the origin repertoire of the Drosophila genome at high resoln. We address the role of origin-proximal G-quadruplexes and suggest that they transiently stall replication forks in vivo. We dissect the chromatin configuration of replication origins and identify a rich spatial organization of chromatin features at initiation sites. DNA shape and chromatin configurations, not strict sequence motifs, mark and predict origins in higher eukaryotes. We further examine the link between transcription and origin firing and reveal that modulation of origin activity across cell types is intimately linked to cell-type-specific transcriptional programs. Our study unravels conserved origin features and provides unique insights into the relationship among DNA topol., chromatin, transcription, and replication initiation across metazoa.
- 31Gao, F.; Luo, H.; Zhang, C. T. DeOri: a database of eukaryotic DNA replication origins. Bioinformatics 2012, 28, 1551– 1552, DOI: 10.1093/bioinformatics/bts15131https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38Xns1Slsbk%253D&md5=cbc68d259df8cd5e296a4b6b4b45a26bDeOri: a database of eukaryotic DNA replication originsGao, Feng; Luo, Hao; Zhang, Chun-TingBioinformatics (2012), 28 (11), 1551-1552CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Summary: DNA replication, a central event for cell proliferation, is the basis of biol. inheritance. The identification of replication origins helps to reveal the mechanism of the regulation of DNA replication. However, only few eukaryotic replication origins were characterized not long ago; nevertheless, recent genome-wide approaches have boosted the no. of mapped replication origins. To gain a comprehensive understanding of the nature of eukaryotic replication origins, we have constructed a Database of Eukaryotic ORIs (DeOri), which contains all the eukaryotic ones identified by genome-wide analyses currently available. A total of 16 145 eukaryotic replication origins have been collected from 6 eukaryotic organisms in which genome-wide studies have been performed, the replication-origin nos. being 433, 7489, 1543, 148, 348 and 6184 for humans, mice, Arabidopsis thaliana, Kluyveromyces lactis, Schizosaccharomyces pombe and Drosophila melanogaster, resp. Availability: Database of Eukaryotic ORIs (DeOri) can be accessed from http://tubic.tju.edu.cn/deori/ Contact: [email protected].
- 32Chen, W.; Feng, P.; Lin, H. Prediction of replication origins by calculating DNA structural properties. FEBS Lett. 2012, 586, 934– 938, DOI: 10.1016/j.febslet.2012.02.03432https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38Xjs1Ghsbk%253D&md5=00cf6edcb1c9f388ddb05f95454ab3aePrediction of replication origins by calculating DNA structural propertiesChen, Wei; Feng, Pengmian; Lin, HaoFEBS Letters (2012), 586 (6), 934-938CODEN: FEBLAL; ISSN:0014-5793. (Elsevier B.V.)In this study, we introduced two DNA structural characteristics, namely, bendability and hydroxyl radical cleavage intensity to analyze origin of replication (ORI) in the Saccharomyces cerevisiae genome. We found that both DNA bendability and cleavage intensity in core replication regions were significantly lower than in the linker regions. By using these two DNA structural characteristics, we developed a computational model for ORI prediction and evaluated the model in a benchmark dataset. The predictive performance of the jackknife cross-validation indicates that DNA bendability and cleavage intensity have the ability to describe core replication regions and our model is effective in ORI prediction.
- 33Kumar, A.; Bansal, M. Characterization of structural and free energy properties of promoters associated with Primary and Operon TSS in Helicobacter pylori genome and their orthologs. J. Biosci. 2012, 37, 423– 431, DOI: 10.1007/s12038-012-9214-633https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XhtVOgsb7P&md5=a784b131e71d8a937bf37a57eb07928fCharacterization of structural and free energy properties of promoters associated with Primary and Operon TSS in Helicobacter pylori genome and their orthologsKumar, Aditya; Bansal, ManjuJournal of Biosciences (New Delhi, India) (2012), 37 (3), 423-431CODEN: JOBSDN; ISSN:0250-5991. (Springer (India) Private Ltd.)Promoter regions in the genomes of all domains of life show similar trends in several structural properties such as stability, bendability, curvature, etc. In current study we analyzed the stability and bendability of various classes of promoter regions (based on the recent identification of different classes of transcription start sites) of Helicobacter pylori 26695 strain. It is found that primary TSS and operon-assocd. TSS promoters show significantly strong features in their promoter regions. DNA free-energy-based promoter prediction tool PromPredict was used to annotate promoters of different classes, and very high recall values (∼80%) are obtained for primary TSS. Orthologous genes from other strains of H. pylori show conservation of structural properties in promoter regions as well as coding regions. PromPredict annotates promoters of orthologous genes with very high recall and precision.
- 34Cao, X. Q.; Zeng, J.; Yan, H. Physical signals for protein-DNA recognition. Phys. Biol. 2009, 6, 036012 DOI: 10.1088/1478-3975/6/3/03601234https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXntFOqu70%253D&md5=588fd51aa107baf5171771d1c02d0978Physical signals for protein-DNA recognitionCao, Xiao-Qin; Zeng, Jia; Yan, HongPhysical Biology (2009), 6 (3), 036012/1-036012/10CODEN: PBHIAT; ISSN:1478-3975. (Institute of Physics Publishing)This paper discovers consensus phys. signals around eukaryotic splice sites, transcription start sites, and replication origin start and end sites on a genome-wide scale based on their DNA flexibility profiles calcd. by three different flexibility models. These salient phys. signals are localized highly rigid and flexible DNAs, which may play important roles in protein-DNA recognition by the sliding search mechanism. The found phys. signals lead us to a detailed hypothetical view of the search process in which a DNA-binding protein first finds a genomic region close to the target site from an arbitrary starting location by three-dimensional (3D) hopping and intersegment transfer mechanisms for long distances, and subsequently uses the one-dimensional (1D) sliding mechanism facilitated by the localized highly rigid DNAs to accurately locate the target flexible binding site within 30 bp (base pair) short distances. Guided by these phys. signals, DNA-binding proteins rapidly search the entire genome to recognize a specific target site from the 3D to 1D pathway. Our findings also show that current promoter prediction programs (PPPs) based on DNA phys. properties may suffer from lots of false positives because other functional sites such as splice sites and replication origins have similar phys. signals as promoters do.
- 35Bleichert, F.; Botchan, M. R.; Berger, J. M. Mechanisms for initiating cellular DNA replication. Science 2017, 355, eaah6317, DOI: 10.1126/science.aah6317There is no corresponding record for this reference.
- 36Gai, D.; Chang, Y. P.; Chen, X. S. Origin DNA melting and unwinding in DNA replication. Curr. Opin. Struct. Biol. 2010, 20, 756– 762, DOI: 10.1016/j.sbi.2010.08.00936https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXhsV2js77J&md5=95a2965414712fd572204137c0f8e9bbOrigin DNA melting and unwinding in DNA replicationGai, Dahai; Chang, Y. Paul; Chen, Xiaojiang S.Current Opinion in Structural Biology (2010), 20 (6), 756-762CODEN: COSBEF; ISSN:0959-440X. (Elsevier Ltd.)A review. Genomic DNA replication is a necessary step in the life cycles of all organisms. To initiate DNA replication, the double-stranded DNA (dsDNA) at the origin of replication must be sepd. or melted; this melted region is propagated and a mature replication fork is formed. To accomplish origin recognition, initial DNA melting, and the eventual formation of a replication fork, coordinated activity of initiators, helicases, and other cellular factors are required. Here, the authors focus on recent advances in the structural and biochem. studies of the initiators and the replicative helicases in multiple replication systems, with emphasis on the systems in archaeal and eukaryotic cells. These studies have yielded insights into the plausible mechanisms of the early stages of DNA replication.
- 37Rajewska, M.; Wegrzyn, K.; Konieczny, I. AT-rich region and repeated sequences - the essential elements of replication origins of bacterial replicons. FEMS Microbiol. Rev. 2012, 36, 408– 434, DOI: 10.1111/j.1574-6976.2011.00300.x37https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38Xjt1ClsLs%253D&md5=d8eeeedc2aa9f709bd66dab82d28c7e3AT-rich region and repeated sequences - the essential elements of replication origins of bacterial repliconsRajewska, Magdalena; Wegrzyn, Katarzyna; Konieczny, IgorFEMS Microbiology Reviews (2012), 36 (2), 408-434CODEN: FMREE4; ISSN:0168-6445. (Wiley-Blackwell)A review. Repeated sequences are commonly present in the sites for DNA replication initiation in bacterial, archaeal, and eukaryotic replicons. Those motifs are usually the binding places for replication initiation proteins or replication regulatory factors. In prokaryotic replication origins, the most abundant repeated sequences are DnaA boxes which are the binding sites for chromosomal replication initiation protein DnaA, iterons which bind plasmid or phage DNA replication initiators, defined motifs for site-specific DNA methylation, and 13-nucleotide-long motifs of a not too well-characterized function, which are present within a specific region of replication origin contg. higher than av. content of adenine and thymine residues. In this review, we specify methods allowing identification of a replication origin, basing on the localization of an AT-rich region and the arrangement of the origin's structural elements. We describe the regularity of the position and structure of the AT-rich regions in bacterial chromosomes and plasmids. The importance of 13-nucleotide-long repeats present at the AT-rich region, as well as other motifs overlapping them, was pointed out to be essential for DNA replication initiation including origin opening, helicase loading and replication complex assembly. 
We also summarize the role of AT-rich region repeated sequences for DNA replication regulation.
- 38Dao, F. Y.; Lv, H.; Wang, F.; Feng, C. Q.; Ding, H.; Chen, W.; Lin, H. Identify origin of replication in Saccharomyces cerevisiae using two-step feature selection technique. Bioinformatics 2019, 35, 2075– 2083, DOI: 10.1093/bioinformatics/bty94338https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3cXhs1WktL4%253D&md5=0f7b1cbe27a6b90f91dca0bc04248835Identify origin of replication in Saccharomyces cerevisiae using two-step feature selection techniqueDao, Fu-Ying; Lv, Hao; Wang, Fang; Feng, Chao-Qin; Ding, Hui; Chen, Wei; Lin, HaoBioinformatics (2019), 35 (12), 2075-2083CODEN: BOINFP; ISSN:1367-4811. (Oxford University Press)DNA replication is a key step to maintain the continuity of genetic information between parental generation and offspring. The initiation site of DNA replication, also called origin of replication (ORI), plays an extremely important role in the basic biochem. process. Thus, rapidly and effectively identifying the location of ORI in genome will provide key clues for genome anal. Although biochem. expts. could provide detailed information for ORI, it requires high exptl. cost and long exptl. period. As good complements to exptl. techniques, computational methods could overcome these disadvantages. Thus, in this study, we developed a predictor called iORI-PseKNC2.0 to identify ORIs in the Saccharomyces cerevisiae genome based on sequence information. The PseKNC including 90 physicochem. properties was proposed to formulate ORI and non-ORI samples. In order to improve the accuracy, a two-step feature selection was proposed to exclude redundant and noise information. As a result, the overall success rate of 88.53% was achieved in the 5-fold cross-validation test by using support vector machine.
- 39Li, W.-C.; Deng, E.-Z.; Ding, H.; Chen, W.; Lin, H. iORI-PseKNC: A predictor for identifying origin of replication with pseudo k-tuple nucleotide composition. Chemom. Intell. Lab. Syst. 2015, 141, 100– 106, DOI: 10.1016/j.chemolab.2014.12.01139https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXls1Kmug%253D%253D&md5=feb3c6a88cf48926717db64ee78422d2iORI-PseKNC: A predictor for identifying origin of replication with pseudo k-tuple nucleotide compositionLi, Wen-Chao; Deng, En-Ze; Ding, Hui; Chen, Wei; Lin, HaoChemometrics and Intelligent Laboratory Systems (2015), 141 (), 100-106CODEN: CILSEN; ISSN:0169-7439. (Elsevier B.V.)The initiation of replication origin is an extremely important process of DNA replication. The distribution of replication origin regions (ORIs) is the major determinant of the timing of genome replication. Thus, correctly identifying ORIs is crucial to understand DNA replication mechanism. With the avalanche of genome sequences generated in the post-genomic age, it is highly desired to develop computational methods for rapidly, effectively and automatically identifying the ORIs in genome. In this paper, we developed a predictor called iORI-PseKNC for identifying ORIs in Saccharomyces cerevisiae genome. In the predictor, based on the concept of the global and long-range sequence-order effects of DNA sequence, the feature called "pseudo k-tuple nucleotide compn." (PseKNC) was used to encode the DNA sequences by incorporating six local structural properties of 16 dinucleotides. The overall success rate of 83.72% was achieved from the jackknife cross-validation test on an objective benchmark dataset. Comparisons demonstrate that the new predictor is superior to other methods. As a user-friendly web-server, iORI-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iORI-PseKNC. 
We hope that iORI-PseKNC will become a useful tool or at least as a complement to existing methods for identifying ORIs.
- 40Zhang, C. J.; Tang, H.; Li, W. C.; Lin, H.; Chen, W.; Chou, K. C. iOri-Human: identify human origin of replication by incorporating dinucleotide physicochemical properties into pseudo nucleotide composition. Oncotarget 2016, 7, 69783– 69793, DOI: 10.18632/oncotarget.1197540https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BC2svgt1eksQ%253D%253D&md5=585648a09f582ab81fe8eb6870d7f6dfiOri-Human: identify human origin of replication by incorporating dinucleotide physicochemical properties into pseudo nucleotide compositionZhang Chang-Jian; Li Wen-Chao; Lin Hao; Chen Wei; Chou Kuo-Chen; Tang Hua; Lin Hao; Chen Wei; Chou Kuo-Chen; Chen Wei; Chou Kuo-ChenOncotarget (2016), 7 (43), 69783-69793 ISSN:.The initiation of replication is an extremely important process in DNA life cycle. Given an uncharacterized DNA sequence, can we identify where its origin of replication (ORI) is located? It is no doubt a fundamental problem in genome analysis. Particularly, with the rapid development of genome sequencing technology that results in a huge amount of sequence data, it is highly desired to develop computational methods for rapidly and effectively identifying the ORIs in these genomes. Unfortunately, by means of the existing computational methods, such as sequence alignment or kmer strategies, it could hardly achieve decent success rates. To address this problem, we developed a predictor called "iOri-Human". Rigorous jackknife tests have shown that its overall accuracy and stability in identifying human ORIs are over 75% and 50%, respectively. In the predictor, it is through the pseudo nucleotide composition (an extension of pseudo amino acid composition) that 96 physicochemical properties for the 16 possible constituent dinucleotides have been incorporated to reflect the global sequence patterns in DNA as well as its local sequence patterns. 
Moreover, a user-friendly web-server for iOri-Human has been established at http://lin.uestc.edu.cn/server/iOri-Human.html, by which users can easily get their desired results without the need to through the complicated mathematics involved.
- 41Gowers, D. M.; Wilson, G. G.; Halford, S. E. Measurement of the contributions of 1D and 3D pathways to the translocation of a protein along DNA. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 15883– 15888, DOI: 10.1073/pnas.050537810241https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2MXht1Wru7bK&md5=349563bc658cb21f548f0efc222170f6Measurement of the contributions of 1D and 3D pathways to the translocation of a protein along DNAGowers, Darren M.; Wilson, Geoffrey G.; Halford, Stephen E.Proceedings of the National Academy of Sciences of the United States of America (2005), 102 (44), 15883-15888CODEN: PNASA6; ISSN:0027-8424. (National Academy of Sciences)Proteins that act at specific DNA sequences bind DNA randomly and then translocate to the target site. The translocation is often ascribed to the protein sliding along the DNA while maintaining continuous contact with it. Proteins also can move on DNA by multiple cycles of dissocn./reassocn. within the same chain. To distinguish these pathways, a strategy was developed to analyze protein motion between DNA sites. The strategy reveals whether the protein maintains contact with the DNA as it transfers from one site to another by sliding or whether it loses contact by a dissocn./reassocn. step. In reactions at low salt, the test protein stayed on the DNA as it traveled between sites, but only when the sites were <50 bp apart. Transfers of >30 bp at in vivo salt, and over distances of >50 bp at any salt, always included at least one dissocn. step. Hence, for this enzyme, 1D sliding operates only over short distances at low salt, and 3D dissocn./reassocn. is its main mode of translocation.
- 42Halford, S. E.; Marko, J. F. How do site-specific DNA-binding proteins find their targets?. Nucleic Acids Res. 2004, 32, 3040– 3052, DOI: 10.1093/nar/gkh62442https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2cXltVWlsro%253D&md5=b6e3b1d67b846ed95ce2d17d2a4754c0How do site-specific DNA-binding proteins find their targets?Halford, Stephen E.; Marko, John F.Nucleic Acids Research (2004), 32 (10), 3040-3052CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)A review. Essentially all the biol. functions of DNA depend on site-specific DNA-binding proteins finding their targets, and therefore searching through megabases of non-target DNA. In this article, we review current understanding of how this sequence searching is done. We review how simple diffusion through soln. may be unable to account for the rapid rates of assocn. obsd. in expts. on some model systems, primarily the Lac repressor. We then present a simplified version of the facilitated diffusion model of Berg, Winter and von Hippel, showing how non-specific DNA-protein interactions may account for accelerated targeting, by permitting the protein to sample many binding sites per DNA encounter. We discuss the 1-dimensional sliding motion of protein along non-specific DNA, often proposed to be the mechanism of this multiple site sampling, and we discuss the role of short-range diffusive-hopping motions. We then derive the optimal range of sliding for a few phys. situations, including simple models of chromosomes in vivo, showing that a sliding range of ∼100 bp before dissocn. optimizes targeting in vivo. Going beyond first-order binding kinetics, we discuss how processivity, the interaction of a protein with two or more targets on the same DNA, can reveal the extent of sliding and we review recent expts. studying processivity using the restriction enzyme EcoRV. Finally, we discuss how single mol. 
techniques might be used to study the dynamics of DNA site-specific targeting of proteins.
- 43Jiang, C.; Pugh, B. F. Nucleosome positioning and gene regulation: advances through genomics. Nat. Rev. Genet. 2009, 10, 161– 172, DOI: 10.1038/nrg252243https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1MXhvFGlu7Y%253D&md5=5a40109670e7b2f3dd754f8bc7a2d3dcNucleosome positioning and gene regulation: advances through genomicsJiang, Cizhong; Pugh, B. FranklinNature Reviews Genetics (2009), 10 (3), 161-172CODEN: NRGAAM; ISSN:1471-0056. (Nature Publishing Group)A review. Principles and patterns of nucleosome positioning have emerged through recent advances in genome-wide mapping technologies. These patterns have improved understanding of how DNA sequence and protein complexes control nucleosome location and the influence of nucleosome positioning on transcriptional control. Knowing the precise locations of nucleosomes in a genome is key to understanding how genes are regulated. Recent 'next generation' ChIP-chip and ChIP-Seq technologies have accelerated our understanding of the basic principles of chromatin organization. Here we discuss what high-resoln. genome-wide maps of nucleosome positions have taught us about how nucleosome positioning demarcates promoter regions and transcriptional start sites, and how the compn. and structure of promoter nucleosomes facilitate or inhibit transcription. A detailed picture is starting to emerge of how diverse factors, including underlying DNA sequences and chromatin remodelling complexes, influence nucleosome positioning.
- 44. Hoskins, R. A.; Landolin, J. M.; Brown, J. B.; Sandler, J. E.; Takahashi, H.; Lassmann, T.; Yu, C.; Booth, B. W.; Zhang, D.; Wan, K. H.; Yang, L.; Boley, N.; Andrews, J.; Kaufman, T. C.; Graveley, B. R.; Bickel, P. J.; Carninci, P.; Carlson, J. W.; Celniker, S. E. Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 2011, 21, 182–192, DOI: 10.1101/gr.112466.110
- 45. Dao, F. Y.; Lv, H.; Zulfiqar, H.; Yang, H.; Su, W.; Gao, H.; Ding, H.; Lin, H. A computational platform to identify origins of replication sites in eukaryotes. Brief. Bioinform. 2020, DOI: 10.1093/bib/bbaa017
- 46. Takai, D.; Jones, P. A. Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 3740–3745, DOI: 10.1073/pnas.052410099
- 47. Mirkin, E. V.; Mirkin, S. M. Replication fork stalling at natural impediments. Microbiol. Mol. Biol. Rev. 2007, 71, 13–35, DOI: 10.1128/MMBR.00030-06
- 48. Kaushik Tiwari, M.; Adaku, N.; Peart, N.; Rogers, F. A. Triplex structures induce DNA double strand breaks via replication fork collapse in NER deficient cells. Nucleic Acids Res. 2016, 44, 7742–7754, DOI: 10.1093/nar/gkw515
- 49. Prioleau, M. N.; MacAlpine, D. M. DNA replication origins-where do we begin? Genes Dev. 2016, 30, 1683–1697, DOI: 10.1101/gad.285114.116
- 50. Cayrou, C.; Ballester, B.; Peiffer, I.; Fenouil, R.; Coulombe, P.; Andrau, J.-C.; van Helden, J.; Méchali, M. The chromatin environment shapes DNA replication origin organization and defines origin classes. Genome Res. 2015, 25, 1873–1885, DOI: 10.1101/gr.192799.115
- 51. Antequera, F. Structure, function and evolution of CpG island promoters. Cell. Mol. Life Sci. 2003, 60, 1647–1658, DOI: 10.1007/s00018-003-3088-6
- 52. Delgado, S.; Gómez, M.; Bird, A.; Antequera, F. Initiation of DNA replication at CpG islands in mammalian chromosomes. EMBO J. 1998, 17, 2426–2435, DOI: 10.1093/emboj/17.8.2426
- 53. Eaton, M. L.; Galani, K.; Kang, S.; Bell, S. P.; MacAlpine, D. M. Conserved nucleosome positioning defines replication origins. Genes Dev. 2010, 24, 748–753, DOI: 10.1101/gad.1913210
- 54. Li, W.-C.; Zhong, Z.-J.; Zhu, P.-P.; Deng, E.-Z.; Ding, H.; Chen, W.; Lin, H. Sequence analysis of origins of replication in the Saccharomyces cerevisiae genomes. Front. Microbiol. 2014, 5, 574, DOI: 10.3389/fmicb.2014.00574
- 55. Gilbert, D. M. Evaluating genome-scale approaches to eukaryotic DNA replication. Nat. Rev. Genet. 2010, 11, 673–684, DOI: 10.1038/nrg2830
- 56. Tyner, C.; Barber, G. P.; Casper, J.; Clawson, H.; Diekhans, M.; Eisenhart, C.; Fischer, C. M.; Gibson, D.; Gonzalez, J. N.; Guruvadoo, L.; Haeussler, M.; Heitner, S.; Hinrichs, A. S.; Karolchik, D.; Lee, B. T.; Lee, C. M.; Nejad, P.; Raney, B. J.; Rosenbloom, K. R.; Speir, M. L.; Villarreal, C.; Vivian, J.; Zweig, A. S.; Haussler, D.; Kuhn, R. M.; Kent, W. J. The UCSC Genome Browser database: 2017 update. Nucleic Acids Res. 2017, 45, D626–D634, DOI: 10.1093/nar/gkw1134
- 57. SantaLucia, J., Jr. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl. Acad. Sci. U. S. A. 1998, 95, 1460–1465, DOI: 10.1073/pnas.95.4.1460
- 58. Anselmi, C.; De Santis, P.; Paparcone, R.; Savino, M.; Scipioni, A. From the sequence to the superstructural properties of DNAs. Biophys. Chem. 2002, 95, 23–47, DOI: 10.1016/S0301-4622(01)00246-0
- 59. Brukner, I.; Sánchez, R.; Suck, D.; Pongor, S. Trinucleotide models for DNA bending propensity: comparison of models based on DNaseI digestion and nucleosome packaging data. J. Biomol. Struct. Dyn. 1995, 13, 309–317, DOI: 10.1080/07391102.1995.10508842
- 60. Satchwell, S. C.; Drew, H. R.; Travers, A. A. Sequence periodicities in chicken nucleosome core DNA. J. Mol. Biol. 1986, 191, 659–675, DOI: 10.1016/0022-2836(86)90452-3
- 61. Friedel, M.; Nikolajewa, S.; Sühnel, J.; Wilhelm, T. DiProDB: a database for dinucleotide properties. Nucleic Acids Res. 2009, 37, D37–D40, DOI: 10.1093/nar/gkn597
- 62. Qin, Y.; Hurley, L. H. Structures, folding patterns, and functions of intramolecular DNA G-quadruplexes found in eukaryotic promoter regions. Biochimie 2008, 90, 1149–1171, DOI: 10.1016/j.biochi.2008.02.020
- 63. Todd, A. K.; Johnston, M.; Neidle, S. Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res. 2005, 33, 2901–2907, DOI: 10.1093/nar/gki553
- 64. Zeraati, M.; Langley, D. B.; Schofield, P.; Moye, A. L.; Rouet, R.; Hughes, W. E.; Bryan, T. M.; Dinger, M. E.; Christ, D. I-motif DNA structures are formed in the nuclei of human cells. Nat. Chem. 2018, 10, 631–637, DOI: 10.1038/s41557-018-0046-3
- 65. Drew, H. R.; Travers, A. A. DNA bending and its relation to nucleosome positioning. J. Mol. Biol. 1985, 186, 773–790, DOI: 10.1016/0022-2836(85)90396-1
- 66. Tsankov, A.; Yanagisawa, Y.; Rhind, N.; Regev, A.; Rando, O. J. Evolutionary divergence of intrinsic and trans-regulated nucleosome positioning sequences reveals plastic rules for chromatin organization. Genome Res. 2011, 21, 1851–1862, DOI: 10.1101/gr.122267.111
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.0c00441.
(Supplementary figure 1) Structural profiles of DNA helicoidal parameters for Ori sequences, (Supplementary figure 2) a profile of DNA structural properties of tissue-specific Ori sequences, and (Supplementary figure 3) positional distribution of promoter elements in Ori sequences (PDF)
(Supplementary table 1) Characteristic structural feature values observed in Ori sequences and (Supplementary table 2) oligonucleotide compositional analysis of Ori sequences (XLS)