SPECTRa: The Deposition and Validation of Primary Chemistry Research Data in Digital Repositories
- Jim Downing
- Peter Murray-Rust
- Alan P. Tonge
- Peter Morgan
- Henry S. Rzepa
- Fiona Cotterill
- Nick Day
- Matt J. Harvey
Abstract

The SPECTRa (Submission, Preservation and Exposure of Chemistry Teaching and Research Data) project has investigated the practices of chemists in archiving and disseminating primary chemical data from academic research laboratories. To redress the loss of the large amount of data never archived or disseminated, we have developed software for data publication into departmental and institutional Open Access digital repositories (DSpace). Data adhering to standard formats in selected disciplines (crystallography, NMR, computational chemistry) is transformed to XML (CML, Chemical Markup Language), which provides added validation. Context-specific chemical metadata and persistent Handle identifiers are added to enable long-term data reuse. It was found essential to provide an embargo mechanism, and policies for operating this and other processes are presented.



