PubChemLite Plus Collision Cross Section (CCS) Values for Enhanced Interpretation of Nontarget Environmental Data
- Anjana Elapavalore, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
- Dylan H. Ross, Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States. Current address: Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Valentin Grouès, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
- Dagny Aurich, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
- Allison M. Krinsky, Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States
- Sunghwan Kim, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, United States
- Paul A. Thiessen, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, United States
- Jian Zhang, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, United States
- James N. Dodds, Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, United States
- Erin S. Baker, Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, United States
- Evan E. Bolton*, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, United States. Phone: +1 301 451 1811. Fax: +1 301 480 9241. Email: [email protected]
- Libin Xu*, Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States. Phone: +1 206 543-1080. Fax: +1 206 685 3252. Email: [email protected]
- Emma L. Schymanski*, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg. Phone: +352 46 66 44 5616. Email: [email protected]
Abstract
Finding relevant chemicals in the vast (known) chemical space is a major challenge for environmental and exposomics studies leveraging nontarget high resolution mass spectrometry (NT-HRMS) methods. Chemical databases now contain hundreds of millions of chemicals, yet many are not relevant. This article details an extensive collaborative, open science effort to provide a dynamic collection of chemicals for environmental, metabolomics, and exposomics research, along with supporting information about their relevance to assist researchers in the interpretation of candidate hits. The PubChemLite for Exposomics collection is compiled from ten annotation categories within PubChem, enhanced with patent, literature and annotation counts, predicted partition coefficient (logP) values, as well as predicted collision cross section (CCS) values using CCSbase. Monthly versions are archived on Zenodo under a CC-BY license, supporting reproducible research, and a new interface has been developed, including historical trends of patent and literature data, for researchers to browse the collection. This article details how PubChemLite can support researchers in environmental and exposomics studies, describes efforts to increase the availability of experimental CCS values, and explores known limitations and potential for future developments. The data and code behind these efforts are openly available. PubChemLite can be browsed at https://pubchemlite.lcsb.uni.lu.
This publication is licensed under
License Summary*
You are free to share (copy and redistribute) this article in any medium or format and to adapt (remix, transform, and build upon) the material for any purpose, even commercially, within the parameters below:
Creative Commons (CC): This is a Creative Commons license.
Attribution (BY): Credit must be given to the creator.
*Disclaimer
This summary highlights only some of the key features and terms of the actual license. It is not a license and has no legal value. Carefully review the actual license before using these materials.
Special Issue
Published as part of Environmental Science & Technology Letters special issue “Non-Targeted Analysis of the Environment”.
Introduction
Methods and Materials
Building PubChemLite
Figure 1. PubChemLite categories in the PubChem Table of Contents (TOC) (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=72), selected subcategories, and associated annotation examples. Yellow shading denotes “environmental” categories (example CID 47759) (https://pubchem.ncbi.nlm.nih.gov/compound/47759#section=EU-Pesticides-Data), red the “exposomics” (example CID 114481) (https://pubchem.ncbi.nlm.nih.gov/compound/114481#section=Associated-Disorders-and-Diseases) and purple the “metabolomics” sections (example CID 1) (https://pubchem.ncbi.nlm.nih.gov/compound/1#section=Pathways). For high resolution live images, please click the embedded hyperlinks. Logo image from GitLab. (28)
Adding Predicted CCS Values to PubChemLite
Adding Experimental CCS Values to PubChem
Figure 2. Aggregated Collision Cross Section (CCS) Classification Tree (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=106) in PubChem. Inset: Experimental CCS values in individual PubChem compound records for Cl-PFOPA (CID 138395139) (https://pubchem.ncbi.nlm.nih.gov/compound/138395139#section=Collision-Cross-Section) and the transformation product 2-hydroxyatrazine (CID 135398733) (https://pubchem.ncbi.nlm.nih.gov/compound/135398733#section=Collision-Cross-Section). For high resolution live images, please click the embedded hyperlinks. Logo image from GitLab. (28)
PubChemLite Web Interface
Figure 3. PubChemLite web interface (composite image), compound view of Atrazine (https://pubchemlite.lcsb.uni.lu/e/compound/2256). For high resolution live images, please click the embedded hyperlink. Logo image from GitLab. (28)
Figure 4. PubChemLite web interface (composite image), view of additional data including annotations, CCS values, and patent and literature stripes for Streptomycin (https://pubchemlite.lcsb.uni.lu/e/compound/19649). For high resolution live images, please click the embedded hyperlink. Logo image from GitLab. (28)
Results and Discussion
PubChemLite Over Time
Figure 5. PubChemLite annotation content (total and by category) between 4 Feb. 2022 and 3 Nov. 2024.
Incorporating CCS Values into Candidate Selection with PubChemLite
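As a minimal, hedged sketch of how a predicted CCS value could enter candidate selection, the snippet below computes the percent deviation between each candidate's predicted CCS and a measured CCS, then keeps candidates within a user-chosen tolerance. The `Candidate` record, its field names, the example values, and the tolerance are all hypothetical illustrations; they are not PubChemLite column names, MetFrag parameters, or values taken from this article.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record: field names are illustrative, not PubChemLite columns.
    identifier: str
    predicted_ccs: float  # predicted CCS (square angstroms) for the adduct of interest


def ccs_percent_error(predicted: float, measured: float) -> float:
    """Absolute deviation of a predicted CCS from the measured CCS, in percent."""
    if measured <= 0:
        raise ValueError("measured CCS must be positive")
    return abs(predicted - measured) / measured * 100.0


def filter_by_ccs(candidates, measured_ccs, tolerance_percent):
    """Keep candidates within the tolerance, ordered from smallest to largest deviation."""
    scored = [(ccs_percent_error(c.predicted_ccs, measured_ccs), c) for c in candidates]
    kept = [pair for pair in scored if pair[0] <= tolerance_percent]
    # Sort on the deviation only, so Candidate objects are never compared directly.
    return [c for _, c in sorted(kept, key=lambda pair: pair[0])]


# Made-up example values, for illustration only.
example_candidates = [
    Candidate("A", 150.0),
    Candidate("B", 160.0),
    Candidate("C", 153.0),
]
selected = filter_by_ccs(example_candidates, measured_ccs=151.0, tolerance_percent=3.0)
selected_ids = [c.identifier for c in selected]
```

The tolerance is deliberately left as a parameter: a sensible value depends on the prediction model and the instrument, and should be chosen by the user rather than read from this sketch.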
Future Perspectives
Data Availability
The PubChemLite web interface (https://pubchemlite.lcsb.uni.lu) is openly available. PubChemLite is compiled weekly from openly available files downloaded from PubChem (50) and is archived monthly on Zenodo (DOI: https://doi.org/10.5281/zenodo.5995885). CCS values are added using the open c3sdb (https://github.com/dylanhross/c3sdb/) code (29), and the PubChemLite-CCS files are archived on Zenodo (DOI: https://doi.org/10.5281/zenodo.4081056). The Zenodo links redirect to the latest version. The code for the PubChemLite build system (https://gitlab.com/uniluxembourg/lcsb/eci/pclbuild) (25), inputs (https://gitlab.com/uniluxembourg/lcsb/eci/pubchemlite-input) (24), chemical stripes (https://gitlab.com/uniluxembourg/lcsb/eci/chemicalstripes) (53), and interface (https://gitlab.com/uniluxembourg/lcsb/eci/pubchemlite-web) (49) is openly available on the Environmental Cheminformatics (ECI) GitLab (https://gitlab.com/uniluxembourg/lcsb/eci/). (26) All resources are available under open licenses; see individual pages for details. This article was submitted as a preprint: Anjana Elapavalore, Dylan Ross, Valentin Groues, Dagny Aurich, Allison Krinsky, Sunghwan Kim, Paul Thiessen, Jian Zhang, James Dodds, Erin Baker, Evan Bolton, Libin Xu, Emma Schymanski. 2024. ChemRxiv. DOI: https://doi.org/10.26434/chemrxiv-2024-2xcsq.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.estlett.4c01003.
A document including additional details about the CCSbase training data sets (S1), using PubChemLite in MetFrag (S2), and additional rank and CCS results (S3) (PDF)
Terms & Conditions
Most electronic Supporting Information files are available without a subscription to ACS Web Editions. Such files may be downloaded by article for research use (if there is a public use license linked to the relevant article, that license may permit other uses). Permission may be obtained from ACS for other uses through requests via the RightsLink permission system: http://pubs.acs.org/page/copyright/permissions.html.
Acknowledgments
The authors thank Todor Kondic (now at LDNS) for his earlier contributions to parts of this work, Steffen Neumann (IPB Halle) for merging countless monthly pull requests into MetFrag, Rick Helmus (University of Amsterdam) for the continuing patRoon integration, and Christine Gallampois (Umeå University) for her insights on the sediment NTS. The authors also thank the Environmental Cheminformatics, Bioinformatics Core, Xu lab, BakerLab, and PubChem team members, as well as other colleagues and collaborators who contributed to this work indirectly via other collaborative and scientific activities and discussions, and the reviewers and editor for their comments and suggestions.
References
This article references 58 other publications.
- (1) Hollender, J.; Schymanski, E. L.; Ahrens, L.; Alygizakis, N.; Béen, F.; Bijlsma, L.; Brunner, A. M.; Celma, A.; Fildier, A.; Fu, Q.; Gago-Ferrero, P.; Gil-Solsona, R.; Haglund, P.; Hansen, M.; Kaserzon, S.; Kruve, A.; Lamoree, M.; Margoum, C.; Meijer, J.; Merel, S. NORMAN Guidance on Suspect and Non-Target Screening in Environmental Monitoring. Environmental Sciences Europe 2023, 35 (1), 75, DOI: 10.1186/s12302-023-00779-4
- (2) Lai, Y.; Koelmel, J. P.; Walker, D. I.; Price, E. J.; Papazian, S.; Manz, K. E.; Castilla-Fernández, D.; Bowden, J. A.; Nikiforov, V.; David, A.; Bessonneau, V.; Amer, B.; Seethapathy, S.; Hu, X.; Lin, E. Z.; Jbebli, A.; McNeil, B. R.; Barupal, D.; Cerasa, M.; Xie, H. High-Resolution Mass Spectrometry for Human Exposomics: Expanding Chemical Space Coverage. Environ. Sci. Technol. 2024, 58 (29), 12784–12822, DOI: 10.1021/acs.est.4c01156
- (3) Belova, L.; Caballero-Casero, N.; van Nuijs, A. L. N.; Covaci, A. Ion Mobility-High-Resolution Mass Spectrometry (IM-HRMS) for the Analysis of Contaminants of Emerging Concern (CECs): Database Compilation and Application to Urine Samples. Anal. Chem. 2021, 93 (16), 6428–6436, DOI: 10.1021/acs.analchem.1c00142
- (4) Celma, A.; Bade, R.; Sancho, J. V.; Hernandez, F.; Humphries, M.; Bijlsma, L. Prediction of Retention Time and Collision Cross Section (CCSH+, CCSH-, and CCSNa+) of Emerging Contaminants Using Multiple Adaptive Regression Splines. J. Chem. Inf. Model. 2022, 62 (22), 5425–5434, DOI: 10.1021/acs.jcim.2c00847
- (5) Song, X.-C.; Dreolin, N.; Canellas, E.; Goshawk, J.; Nerin, C. Prediction of Collision Cross-Section Values for Extractables and Leachables from Plastic Products. Environ. Sci. Technol. 2022, 56 (13), 9463–9473, DOI: 10.1021/acs.est.2c02853
- (6) Ieritano, C.; Hopkins, W. S. Assessing Collision Cross Section Calculations Using MobCal-MPI with a Variety of Commonly Used Computational Methods. Mater. Today Commun. 2021, 27, 102226, DOI: 10.1016/j.mtcomm.2021.102226
- (7) Colby, S. M.; Thomas, D. G.; Nuñez, J. R.; Baxter, D. J.; Glaesemann, K. R.; Brown, J. M.; Pirrung, M. A.; Govind, N.; Teeguarden, J. G.; Metz, T. O.; Renslow, R. S. ISiCLE: A Quantum Chemistry Pipeline for Establishing in Silico Collision Cross Section Libraries. Anal. Chem. 2019, 91 (7), 4346–4356, DOI: 10.1021/acs.analchem.8b04567
- (8) Zhou, Z.; Luo, M.; Chen, X.; Yin, Y.; Xiong, X.; Wang, R.; Zhu, Z.-J. Ion Mobility Collision Cross-Section Atlas for Known and Unknown Metabolite Annotation in Untargeted Metabolomics. Nat. Commun. 2020, 11 (1), 4334, DOI: 10.1038/s41467-020-18171-8
- (9) Ross, D. H.; Cho, J. H.; Xu, L. Breaking Down Structural Diversity for Comprehensive Prediction of Ion-Neutral Collision Cross Sections. Anal. Chem. 2020, 92 (6), 4548–4557, DOI: 10.1021/acs.analchem.9b05772
- (10) Guo, R.; Zhang, Y.; Liao, Y.; Yang, Q.; Xie, T.; Fan, X.; Lin, Z.; Chen, Y.; Lu, H.; Zhang, Z. Highly Accurate and Large-Scale Collision Cross Sections Prediction with Graph Neural Networks. Communications Chemistry 2023, 6 (1), 1–10, DOI: 10.1038/s42004-023-00939-w
- (11) Plante, P.-L.; Francovic-Fontaine, É.; May, J. C.; McLean, J. A.; Baker, E. S.; Laviolette, F.; Marchand, M.; Corbeil, J. Predicting Ion Mobility Collision Cross-Sections Using a Deep Neural Network: DeepCCS. Anal. Chem. 2019, 91 (8), 5191–5199, DOI: 10.1021/acs.analchem.8b05821
- 12. Rainey, M. A.; Watson, C. A.; Asef, C. K.; Foster, M. R.; Baker, E. S.; Fernández, F. M. CCS Predictor 2.0: An Open-Source Jupyter Notebook Tool for Filtering Out False Positives in Metabolomics. Anal. Chem. 2022, 94 (50), 17456–17466, DOI: 10.1021/acs.analchem.2c03491
- 13. Wishart, D. S.; Guo, A.; Oler, E.; Wang, F.; Anjum, A.; Peters, H.; Dizon, R.; Sayeeda, Z.; Tian, S.; Lee, B. L.; Berjanskii, M.; Mah, R.; Yamamoto, M.; Jovel, J.; Torres-Calzada, C.; Hiebert-Giesbrecht, M.; Lui, V. W.; Varshavi, D.; Varshavi, D.; Allen, D. HMDB 5.0: The Human Metabolome Database for 2022. Nucleic Acids Res. 2022, 50 (D1), D622–D631, DOI: 10.1093/nar/gkab1062
- 14. Williams, A. J.; Grulke, C. M.; Edwards, J.; McEachran, A. D.; Mansouri, K.; Baker, N. C.; Patlewicz, G.; Shah, I.; Wambaugh, J. F.; Judson, R. S.; Richard, A. M. The CompTox Chemistry Dashboard: A Community Data Resource for Environmental Chemistry. Journal of Cheminformatics 2017, 9 (1), 61, DOI: 10.1186/s13321-017-0247-6
- 15. Kim, S.; Chen, J.; Cheng, T.; Gindulyte, A.; He, J.; He, S.; Li, Q.; Shoemaker, B. A.; Thiessen, P. A.; Yu, B.; Zaslavsky, L.; Zhang, J.; Bolton, E. E. PubChem 2023 Update. Nucleic Acids Res. 2023, 51, D1373–D1380, DOI: 10.1093/nar/gkac956
- 16. Pence, H. E.; Williams, A. ChemSpider: An Online Chemical Information Resource. J. Chem. Educ. 2010, 87 (11), 1123–1124, DOI: 10.1021/ed100697w
- 17. American Chemical Society. CAS REGISTRY - The CAS Substance Collection, 2024. https://www.cas.org/cas-data/cas-registry (accessed 2024-08-03).
- 18. Wang, Z.; Walker, G. W.; Muir, D. C. G.; Nagatani-Yoshida, K. Toward a Global Understanding of Chemical Pollution: A First Comprehensive Analysis of National and Regional Chemical Inventories. Environ. Sci. Technol. 2020, 54 (5), 2575–2584, DOI: 10.1021/acs.est.9b06379
- 19. Mohammed Taha, H.; Aalizadeh, R.; Alygizakis, N.; Antignac, J.-P.; Arp, H. P. H.; Bade, R.; Baker, N.; Belova, L.; Bijlsma, L.; Bolton, E. E.; Brack, W.; Celma, A.; Chen, W.-L.; Cheng, T.; Chirsir, P.; Čirka, L.; D’Agostino, L. A.; Djoumbou Feunang, Y.; Dulio, V.; Fischer, S. The NORMAN Suspect List Exchange (NORMAN-SLE): Facilitating European and Worldwide Collaboration on Suspect Screening in High Resolution Mass Spectrometry. Environmental Sciences Europe 2022, 34 (1), 104, DOI: 10.1186/s12302-022-00680-6
- 20. Schymanski, E. L.; Kondić, T.; Neumann, S.; Thiessen, P. A.; Zhang, J.; Bolton, E. E. Empowering Large Chemical Knowledge Bases for Exposomics: PubChemLite Meets MetFrag. Journal of Cheminformatics 2021, 13 (1), 19, DOI: 10.1186/s13321-021-00489-0
- 21. Helmus, R.; ter Laak, T. L.; van Wezel, A. P.; de Voogt, P.; Schymanski, E. L. patRoon: Open Source Software Platform for Environmental Mass Spectrometry Based Non-Target Screening. Journal of Cheminformatics 2021, 13 (1), 1, DOI: 10.1186/s13321-020-00477-w
- 22. Ruttkies, C.; Schymanski, E. L.; Wolf, S.; Hollender, J.; Neumann, S. MetFrag Relaunched: Incorporating Strategies Beyond in Silico Fragmentation. Journal of Cheminformatics 2016, 8 (1), 3, DOI: 10.1186/s13321-016-0115-9
- 23. NCBI/NLM/NIH. PubChem Table of Contents Classification Browser, 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=72 (accessed 2024-07-08).
- 24. LCSB-ECI. Uniluxembourg/LCSB/Environmental Cheminformatics/Pubchemlite-Input. GitLab, 2024. https://gitlab.com/uniluxembourg/lcsb/eci/pubchemlite-input (accessed 2024-12-16).
- 25. LCSB-ECI. Uniluxembourg/LCSB/Environmental Cheminformatics/Pubchemlite-Build-System. GitLab, 2024. https://gitlab.com/uniluxembourg/lcsb/eci/pclbuild (accessed 2024-12-16).
- 26. LCSB-ECI. Uniluxembourg/LCSB/Environmental Cheminformatics. GitLab, 2024. https://gitlab.com/uniluxembourg/lcsb/eci/ (accessed 2024-12-16).
- 27. Bolton, E.; Schymanski, E.; Kondic, T.; Thiessen, P.; Zhang, J. PubChemLite for Exposomics. Zenodo 2024, DOI: 10.5281/zenodo.5995885
- 28. LCSB-ECI. Pubchemlite/Logos. GitLab, 2021. https://gitlab.com/uniluxembourg/lcsb/eci/pubchem/-/tree/master/pubchemlite/logos (accessed 2024-12-17).
- 29. Ross, D. C3SDB (Combined Collision Cross Section DataBase) - Dylanhross/C3sdb. GitHub, 2024. https://github.com/dylanhross/c3sdb (accessed 2024-12-16).
- 30. IPB Halle. MetFrag Web, 2024. https://msbi.ipb-halle.de/MetFrag/ (accessed 2024-12-03).
- 31. Kirkwood, K. I.; Christopher, M. W.; Burgess, J. L.; Littau, S. R.; Foster, K.; Richey, K.; Pratt, B. S.; Shulman, N.; Tamura, K.; MacCoss, M. J.; MacLean, B. X.; Baker, E. S. Development and Application of Multidimensional Lipid Libraries to Investigate Lipidomic Dysregulation Related to Smoke Inhalation Injury Severity. J. Proteome Res. 2022, 21 (1), 232–242, DOI: 10.1021/acs.jproteome.1c00820
- 32. Foster, M.; Rainey, M.; Watson, C.; Dodds, J. N.; Kirkwood, K. I.; Fernández, F. M.; Baker, E. S. Uncovering PFAS and Other Xenobiotics in the Dark Metabolome Using Ion Mobility Spectrometry, Mass Defect Analysis, and Machine Learning. Environ. Sci. Technol. 2022, 56 (12), 9133–9143, DOI: 10.1021/acs.est.2c00201
- 33. Picache, J. A.; Rose, B. S.; Balinski, A.; Leaptrot, K. L.; Sherrod, S. D.; May, J. C.; McLean, J. A. Collision Cross Section Compendium to Annotate and Predict Multi-Omic Compound Identities. Chemical Science 2019, 10 (4), 983–993, DOI: 10.1039/C8SC04396E
- 34. Picache, J.; McLean, J. S50 CCSCOMPEND The Unified Collision Cross Section (CCS) Compendium. Zenodo 2019, DOI: 10.5281/zenodo.2658162
- 35. Celma, A.; Sancho, J. V.; Schymanski, E. L.; Fabregat-Safont, D.; Ibáñez, M.; Goshawk, J.; Barknowitz, G.; Hernández, F.; Bijlsma, L. Improving Target and Suspect Screening High-Resolution Mass Spectrometry Workflows in Environmental Analysis by Ion Mobility Separation. Environ. Sci. Technol. 2020, 54 (23), 15120–15131, DOI: 10.1021/acs.est.0c05713
- 36. Celma, A.; Fabregat-Safont, D.; Ibàñez, M.; Bijlsma, L.; Hernandez, F.; Sancho, J. V. S61 UJICCSLIB Collision Cross Section (CCS) Library from UJI. Zenodo 2019, DOI: 10.5281/zenodo.3549476
- 37. Belova, L.; Caballero-Casero, N.; Nuijs, A. L. N. van; Covaci, A. S79 UACCSCEC Collision Cross Section (CCS) Library from UAntwerp. Zenodo 2021, DOI: 10.5281/zenodo.4704648
- 38. Muller, H.; Palm, E.; Schymanski, E. S116 REFCCS Collision Cross Section (CCS) Values from Literature. Zenodo 2024, DOI: 10.5281/zenodo.10932895
- 39. PubChem. PubChem Classification Browser: CCSbase Classification, 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=104 (accessed 2024-12-17).
- 40. PubChem. PubChem Classification Browser: CCS Classification - Baker Lab, 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=124 (accessed 2024-12-17).
- 41. PubChem. PubChem Classification Browser: NORMAN-SLE Classification, 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=106 (accessed 2024-12-17).
- 42. PubChem. PubChem Classification Browser: Aggregated CCS Classification, 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=106 (accessed 2024-12-17).
- 43. Schymanski, E. Annotations/CCS/CCS_retrieval · Master · Uniluxembourg/LCSB/Environmental Cheminformatics/Pubchem. GitLab, 2024. https://gitlab.com/uniluxembourg/lcsb/eci/pubchem/-/tree/master/annotations/CCS/CCS_retrieval (accessed 2024-12-16).
- 44. Schymanski, E.; Zhang, J.; Thiessen, P.; Bolton, E. Experimental CCS Values in PubChem. Zenodo 2024, DOI: 10.5281/zenodo.6800138
- 45Grouès, V.; Rocca-Serra, P.; Ded, V. Elixir-Luxembourg/Data-Catalog. GitHub , 2023. https://github.com/elixir-luxembourg/data-catalog (accessed 2024-08-04).Google ScholarThere is no corresponding record for this reference.
- 46Welter, D.; Rocca-Serra, P.; Grouès, V.; Sallam, N.; Ancien, F.; Shabani, A.; Asariardakani, S.; Alper, P.; Ghosh, S.; Burdett, T.; Sansone, S.-A.; Gu, W.; Satagopam, V. The Translational Data Catalog - Discoverable Biomedical Datasets. Scientific Data 2023, 10 (1), 470, DOI: 10.1038/s41597-023-02258-0Google ScholarThere is no corresponding record for this reference.
- 47Landrum, G.. RDKit: Open-Source Cheminformatics Software , 2024. https://www.rdkit.org/ (accessed 2024-08-04).Google ScholarThere is no corresponding record for this reference.
- 48Landrum, G.; Tosco, P.; Kelley, B.; Rodriguez, R.; Cosgrove, D.; Vianello, R.; sriniker; gedeck; Jones, G.; NadineSchneider; Kawashima, E.; Nealschneider, D.; Dalke, A.; Swain, M.; Cole, B.; Turk, S.; Savelev, A.; Vaucher, A.; Wójcikowski, M.; Take, I. Rdkit/Rdkit: 2024_03_5 (Q1 2024) Release. Zenodo 2024, DOI: 10.5281/zenodo.591637Google ScholarThere is no corresponding record for this reference.
- 49Grouès, V. Uniluxembourg/LCSB/Environmental Cheminformatics/PubChemLite-Web. GitLab , 2024. https://gitlab.com/uniluxembourg/lcsb/eci/pubchemlite-web (accessed 2024-08-04).Google ScholarThere is no corresponding record for this reference.
- 50NCBI/NLM/NIH. PubChem Download Pages , 2024. https://ftp.ncbi.nlm.nih.gov/pubchem/ (accessed 2024-12-16).Google ScholarThere is no corresponding record for this reference.
- 51Aurich, D.; Schymanski, E. L.; De Jesus Matias, F.; Thiessen, P. A.; Pang, J. Revealing Chemical Trends: Insights from Data-Driven Visualization and Patent Analysis in Exposomics Research. Environ. Sci. Technol. Lett. 2024, 11 (10), 1046– 1052, DOI: 10.1021/acs.estlett.4c00560Google ScholarThere is no corresponding record for this reference.
- 52Arp, H. P. H.; Aurich, D.; Schymanski, E. L.; Sims, K.; Hale, S. E. Avoiding the Next Silent Spring: Our Chemical Past, Present, and Future. Environ. Sci. Technol. 2023, 57 (16), 6355– 6359, DOI: 10.1021/acs.est.3c01735Google Scholar52https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3sXnsFCgtLw%253D&md5=dd8d9b5a8381cfe4ef36643f525f8144Avoiding the Next Silent Spring: Our Chemical Past, Present, and FutureArp, Hans Peter H.; Aurich, Dagny; Schymanski, Emma L.; Sims, Kerry; Hale, Sarah E.Environmental Science & Technology (2023), 57 (16), 6355-6359CODEN: ESTHAG; ISSN:1520-5851. (American Chemical Society)There is no expanded citation for this reference.
- 53Aurich, D. Uniluxembourg/LCSB/Environmental Cheminformatics/Chemicalstripes. GitLab , 2024. https://gitlab.com/uniluxembourg/lcsb/eci/chemicalstripes (accessed 2024-08-04).Google ScholarThere is no corresponding record for this reference.
- 54Talavera Andújar, B.; Mary, A.; Venegas, C.; Cheng, T.; Zaslavsky, L.; Bolton, E. E.; Heneka, M. T.; Schymanski, E. L. Can Small Molecules Provide Clues on Disease Progression in Cerebrospinal Fluid from Mild Cognitive Impairment and Alzheimer’s Disease Patients?. Environ. Sci. Technol. 2024, 58, 4181– 4192, DOI: 10.1021/acs.est.3c10490Google ScholarThere is no corresponding record for this reference.
- 55WishartLab. FooDB , 2024. https://foodb.ca/ (accessed 2024-11-06).Google ScholarThere is no corresponding record for this reference.
- 56Menger, F.; Celma, A.; Schymanski, E. L.; Lai, F. Y.; Bijlsma, L.; Wiberg, K.; Hernández, F.; Sancho, J. V.; Ahrens, L. Enhancing Spectral Quality in Complex Environmental Matrices: Supporting Suspect and Non-Target Screening in Zebra Mussels with Ion Mobility. Environ. Int. 2022, 170, 107585 DOI: 10.1016/j.envint.2022.107585Google ScholarThere is no corresponding record for this reference.
- 57Baker, E. S.; Hoang, C.; Uritboonthai, W.; Heyman, H. M.; Pratt, B.; MacCoss, M.; MacLean, B.; Plumb, R.; Aisporna, A.; Siuzdak, G. METLIN-CCS: An Ion Mobility Spectrometry Collision Cross Section Database. Nat. Methods 2023, 20 (12), 1836– 1837, DOI: 10.1038/s41592-023-02078-5Google ScholarThere is no corresponding record for this reference.
- 58Baker, E. S.; Uritboonthai, W.; Aisporna, A.; Hoang, C.; Heyman, H. M.; Connell, L.; Olivier-Jimenez, D.; Giera, M.; Siuzdak, G. METLIN-CCS Lipid Database: An Authentic Standards Resource for Lipid Classification and Identification. Nature Metabolism 2024, 6 (6), 981– 982, DOI: 10.1038/s42255-024-01058-zGoogle ScholarThere is no corresponding record for this reference.
Figure 1. PubChemLite categories in the PubChem Table of Contents (TOC) (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=72), selected subcategories, and associated annotation examples. Yellow shading denotes “environmental” categories (example CID 47759) (https://pubchem.ncbi.nlm.nih.gov/compound/47759#section=EU-Pesticides-Data), red the “exposomics” (example CID 114481) (https://pubchem.ncbi.nlm.nih.gov/compound/114481#section=Associated-Disorders-and-Diseases) and purple the “metabolomics” sections (example CID 1) (https://pubchem.ncbi.nlm.nih.gov/compound/1#section=Pathways). For high resolution live images, please click the embedded hyperlinks. Logo image from GitLab. (28)
Figure 2. Aggregated Collision Cross Section (CCS) Classification Tree (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=106) in PubChem. Inset: Experimental CCS values in individual PubChem compound records for Cl-PFOPA (CID 138395139) (https://pubchem.ncbi.nlm.nih.gov/compound/138395139#section=Collision-Cross-Section) and the transformation product 2-hydroxyatrazine (CID 135398733) (https://pubchem.ncbi.nlm.nih.gov/compound/135398733#section=Collision-Cross-Section). For high resolution live images, please click the embedded hyperlinks. Logo image from GitLab. (28)
Figure 3. PubChemLite web interface (composite image), compound view of Atrazine (https://pubchemlite.lcsb.uni.lu/e/compound/2256). For high resolution live images, please click the embedded hyperlink. Logo image from GitLab. (28)
Figure 4. PubChemLite web interface (composite image), view of additional data including annotations, CCS values, and patent and literature stripes for Streptomycin (https://pubchemlite.lcsb.uni.lu/e/compound/19649). For high resolution live images, please click the embedded hyperlink. Logo image from GitLab. (28)
Figure 5. PubChemLite annotation content (total and by category) between 4 Feb. 2022 and 3 Nov. 2024.
content underpinning the Dashboard has been aggregated over the past 15 years by both manual and auto-curation techniques within EPA's DSSTox project. DSSTox chem. content is subject to strict quality controls to enforce consistency among chem. substance-structure identifiers, as well as list curation review to ensure accurate linkages of DSSTox substances to chem. lists and assocd. data. The Dashboard, publicly launched in Apr. 2016, has expanded considerably in content and user traffic over the past year. It is continuously evolving with the growth of DSSTox into high-interest or data-rich domains of interest to EPA, such as chems. on the Toxic Substances Control Act listing, while providing the user community with a flexible and dynamic web-based platform for integration, processing, visualization and delivery of data and resources. The Dashboard provides support for a broad array of research and regulatory programs across the worldwide community of toxicologists and environmental scientists.
- 15Kim, S.; Chen, J.; Cheng, T.; Gindulyte, A.; He, J.; He, S.; Li, Q.; Shoemaker, B. A.; Thiessen, P. A.; Yu, B.; Zaslavsky, L.; Zhang, J.; Bolton, E. E. PubChem 2023 Update. Nucleic Acids Res. 2023, 51, D1373– D1380, DOI: 10.1093/nar/gkac956There is no corresponding record for this reference.
- 16Pence, H. E.; Williams, A. ChemSpider: An Online Chemical Information Resource. J. Chem. Educ. 2010, 87 (11), 1123– 1124, DOI: 10.1021/ed100697w16https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXhtV2jtr%252FF&md5=c7c75704f57cf699598c0250001ebb48ChemSpider: An Online Chemical Information ResourcePence, Harry E.; Williams, AntonyJournal of Chemical Education (2010), 87 (11), 1123-1124CODEN: JCEDA8; ISSN:0021-9584. (American Chemical Society and Division of Chemical Education, Inc.)ChemSpider is a free, online chem. database offering access to phys. and chem. properties, mol. structure, spectral data, synthetic methods, safety information, and nomenclature for almost 25 million unique chem. compds. sourced and linked to almost 400 sep. data sources on the Web. ChemSpider is quickly becoming the primary chem. Internet portal and it can be very useful for both chem. teaching and research.
- 17American Chemical Society. CAS REGISTRY - The CAS Substance Collection , 2024. https://www.cas.org/cas-data/cas-registry (accessed 2024-08-03).There is no corresponding record for this reference.
- 18Wang, Z.; Walker, G. W.; Muir, D. C. G.; Nagatani-Yoshida, K. Toward a Global Understanding of Chemical Pollution: A First Comprehensive Analysis of National and Regional Chemical Inventories. Environ. Sci. Technol. 2020, 54 (5), 2575– 2584, DOI: 10.1021/acs.est.9b0637918https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3cXhsFaitL0%253D&md5=5f731031fadaccfe5bd723138725c8a7Toward a Global Understanding of Chemical Pollution: A First Comprehensive Analysis of National and Regional Chemical InventoriesWang, Zhanyun; Walker, Glen W.; Muir, Derek C. G.; Nagatani-Yoshida, KakukoEnvironmental Science & Technology (2020), 54 (5), 2575-2584CODEN: ESTHAG; ISSN:0013-936X. (American Chemical Society)Chems., while benefitting society, may be released during their life cycle and possibly harm humans and ecosystems. Chem. pollution is mentioned as a planetary boundaries within which humanity can safely operate, but is not comprehensively understood. This work analyzed 22 chem. inventories from 19 countries and regions to achieve a first comprehensive overview of chems. on the market as an essential first step toward a global understanding of chem. pollution. More than 350,000 chems. and chem. mixts. have been registered for prodn. and use, up to three times as many as previously estd. and with substantial differences across countries/regions. A noteworthy finding was that identities of many chems. remain publicly unknown because they are claimed as confidential (>50,000) or ambiguously described (up to 70,000). Coordinated efforts by all stake-holders including scientists from different disciplines are urgently needed; new areas of interest and opportunities are highlighted.
- 19Mohammed Taha, H.; Aalizadeh, R.; Alygizakis, N.; Antignac, J.-P.; Arp, H. P. H.; Bade, R.; Baker, N.; Belova, L.; Bijlsma, L.; Bolton, E. E.; Brack, W.; Celma, A.; Chen, W.-L.; Cheng, T.; Chirsir, P.; Čirka, L.; D’Agostino, L. A.; Djoumbou Feunang, Y.; Dulio, V.; Fischer, S. The NORMAN Suspect List Exchange (NORMAN-SLE): Facilitating European and Worldwide Collaboration on Suspect Screening in High Resolution Mass Spectrometry. Environmental Sciences Europe 2022, 34 (1), 104, DOI: 10.1186/s12302-022-00680-6There is no corresponding record for this reference.
- 20Schymanski, E. L.; Kondić, T.; Neumann, S.; Thiessen, P. A.; Zhang, J.; Bolton, E. E. Empowering Large Chemical Knowledge Bases for Exposomics: PubChemLite Meets MetFrag. Journal of Cheminformatics 2021, 13 (1), 19, DOI: 10.1186/s13321-021-00489-0There is no corresponding record for this reference.
- 21Helmus, R.; ter Laak, T. L.; van Wezel, A. P.; de Voogt, P.; Schymanski, E. L. patRoon: Open Source Software Platform for Environmental Mass Spectrometry Based Non-Target Screening. Journal of Cheminformatics 2021, 13 (1), 1, DOI: 10.1186/s13321-020-00477-w21https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXhtFShs7zM&md5=4ee142079555535910b3f76297a8317apatRoon: open source software platform for environmental mass spectrometry based non-target screeningHelmus, Rick; ter Laak, Thomas L.; van Wezel, Annemarie P.; de Voogt, Pim; Schymanski, Emma L.Journal of Cheminformatics (2021), 13 (1), 1CODEN: JCOHB3; ISSN:1758-2946. (SpringerOpen)Abstr.: Mass spectrometry based non-target anal. is increasingly adopted in environmental sciences to screen and identify numerous chems. simultaneously in highly complex samples. However, current data processing software either lack functionality for environmental sciences, solve only part of the workflow, are not openly available and/or are restricted in input data formats. In this paper we present patRoon, a new R based open-source software platform, which provides comprehensive, fully tailored and straightforward non-target anal. workflows. In addn., patRoon offers various functionality and strategies to simplify and perform automated processing of complex (environmental) data effectively. patRoon implements several effective optimization strategies to significantly reduce computational times. The ability of patRoon to perform time-efficient and automated non-target data annotation of environmental samples is demonstrated with a simple and reproducible workflow using open-access data of spiked samples from a drinking water treatment plant study. In addn., the ability to easily use, combine and evaluate different algorithms was demonstrated for three commonly used feature finding algorithms. 
This article, combined with already published works, demonstrate that patRoon helps make comprehensive (environmental) non-target anal. readily accessible to a wider community of researchers.
- 22Ruttkies, C.; Schymanski, E. L.; Wolf, S.; Hollender, J.; Neumann, S. MetFrag Relaunched: Incorporating Strategies Beyond in Silico Fragmentation. Journal of Cheminformatics 2016, 8 (1), 3, DOI: 10.1186/s13321-016-0115-922https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXmtVKltL0%253D&md5=0135b38f288a51bec794ee93800590ecMetFrag relaunched: incorporating strategies beyond in silico fragmentationRuttkies, Christoph; Schymanski, Emma L.; Wolf, Sebastian; Hollender, Juliane; Neumann, SteffenJournal of Cheminformatics (2016), 8 (), 3/1-3/16CODEN: JCOHB3; ISSN:1758-2946. (Chemistry Central Ltd.)Background: The in silico fragmenter MetFrag, launched in 2010, was one of the first approaches combining compd. database searching and fragmentation prediction for small mol. identification from tandem mass spectrometry data. Since then many new approaches have evolved, as has MetFrag itself. This article details the latest developments to MetFrag and its use in small mol. identification since the original publication. Results: MetFrag has gone through algorithmic and scoring refinements. New features include the retrieval of ref., data source and patent information via ChemSpider and PubChem web services, as well as InChIKey filtering to reduce candidate redundancy due to stereoisomerism. Candidates can be filtered or scored differently based on criteria like occurrence of certain elements and/or substructures prior to fragmentation, or presence in so-called "suspect lists". Retention time information can now be calcd. either within MetFrag with a sufficient amt. of user-provided retention times, or incorporated sep. as "user-defined scores" to be included in candidate ranking. The changes to MetFrag were evaluated on the original dataset as well as a dataset of 473 merged high resoln. tandem mass spectra (HR-MS/MS) and compared with another open source in silico fragmenter, CFM-ID. 
Using HR-MS/MS information only, MetFrag2.2 and CFM-ID had 30 and 43 Top 1 ranks, resp., using PubChem as a database. Including ref. and retention information in MetFrag2.2 improved this to 420 and 336 Top 1 ranks with ChemSpider and PubChem (89 and 71 %), resp., and even up to 343 Top 1 ranks (PubChem) when combining with CFM-ID. The optimal parameters and wts. were verified using three addnl. datasets of 824 merged HR-MS/MS spectra in total. Further examples are given to demonstrate flexibility of the enhanced features. Conclusions: In many cases addnl. information is available from the exptl. context to add to small mol. identification, which is esp. useful where the mass spectrum alone is not sufficient for candidate selection from a large no. of candidates. The results achieved with MetFrag2.2 clearly show the benefit of considering this addnl. information. The new functions greatly enhance the chance of identification success and have been incorporated into a command line interface in a flexible way designed to be integrated into high throughput workflows.
- 23NCBI/NLM/NIH. PubChem Table of Contents Classification Browser , 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=72 (accessed 2024-07-08).There is no corresponding record for this reference.
- 24LCSB-ECI. Uniluxembourg/LCSB/Environmental Cheminformatics/Pubchemlite-Input. GitLab , 2024. https://gitlab.com/uniluxembourg/lcsb/eci/pubchemlite-input (accessed 2024-12-16).There is no corresponding record for this reference.
- 25LCSB-ECI. Uniluxembourg/LCSB/Environmental Cheminformatics/Pubchemlite-Build-System. GitLab , 2024. https://gitlab.com/uniluxembourg/lcsb/eci/pclbuild (accessed 2024-12-16).There is no corresponding record for this reference.
- 26LCSB-ECI. Uniluxembourg/LCSB/Environmental Cheminformatics. GitLab , 2024. https://gitlab.com/uniluxembourg/lcsb/eci/ (accessed 2024-12-16).There is no corresponding record for this reference.
- 27Bolton, E.; Schymanski, E.; Kondic, T.; Thiessen, P.; Zhang, J. PubChemLite for Exposomics. Zenodo 2024, DOI: 10.5281/zenodo.5995885There is no corresponding record for this reference.
- 28LCSB-ECI. Pubchemlite/Logos. GitLab , 2021. https://gitlab.com/uniluxembourg/lcsb/eci/pubchem/-/tree/master/pubchemlite/logos (accessed 2024-12-17).There is no corresponding record for this reference.
- 29Ross, D. C3SDB (Combined Collision Cross Section DataBase) - Dylanhross/C3sdb. GitHub , 2024. https://github.com/dylanhross/c3sdb (accessed 2024-12-16).There is no corresponding record for this reference.
- 30IPB Halle. MetFrag Web , 2024. https://msbi.ipb-halle.de/MetFrag/ (accessed 2024-12-03).There is no corresponding record for this reference.
- 31Kirkwood, K. I.; Christopher, M. W.; Burgess, J. L.; Littau, S. R.; Foster, K.; Richey, K.; Pratt, B. S.; Shulman, N.; Tamura, K.; MacCoss, M. J.; MacLean, B. X.; Baker, E. S. Development and Application of Multidimensional Lipid Libraries to Investigate Lipidomic Dysregulation Related to Smoke Inhalation Injury Severity. J. Proteome Res. 2022, 21 (1), 232– 242, DOI: 10.1021/acs.jproteome.1c0082031https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXis1OgsLzE&md5=6cad3904c24aa34afa2488bd8b01fc0dDevelopment and Application of Multidimensional Lipid Libraries to Investigate Lipidomic Dysregulation Related to Smoke Inhalation Injury SeverityKirkwood, Kaylie I.; Christopher, Michael W.; Burgess, Jefferey L.; Littau, Sally R.; Foster, Kevin; Richey, Karen; Pratt, Brian S.; Shulman, Nicholas; Tamura, Kaipo; MacCoss, Michael J.; MacLean, Brendan X.; Baker, Erin S.Journal of Proteome Research (2022), 21 (1), 232-242CODEN: JPROBS; ISSN:1535-3893. (American Chemical Society)The implication of lipid dysregulation in diseases, toxic exposure outcomes, and inflammation has brought great interest to lipidomic studies. However, lipids have proven to be anal. challenging due to their highly isomeric nature and vast concn. ranges in biol. matrixes. Therefore, multidimensional techniques such as those integrating liq. chromatog., ion mobility spectrometry, collision-induced dissocn., and mass spectrometry (LC-IMS-CID-MS) have been implemented to sep. lipid isomers as well as provide structural information and increased identification confidence. These data sets are however extremely large and complex, resulting in challenges for data processing and annotation. Here, we have overcome these challenges by developing sample-specific multidimensional lipid libraries using the freely available software Skyline. 
Specifically, the human plasma library developed for this work contains over 500 unique lipids and is combined with adapted Skyline functions such as indexed retention time (iRT) for retention time prediction and IMS drift time filtering for enhanced selectivity. For comparison with other studies, this database was used to annotate LC-IMS-CID-MS data from a NIST SRM 1950 ext. The same workflow was then utilized to assess plasma and bronchoalveolar lavage fluid (BALF) samples from patients with varying degrees of smoke inhalation injury to identify lipid-based patient prognostic and diagnostic markers.
- 32Foster, M.; Rainey, M.; Watson, C.; Dodds, J. N.; Kirkwood, K. I.; Fernández, F. M.; Baker, E. S. Uncovering PFAS and Other Xenobiotics in the Dark Metabolome Using Ion Mobility Spectrometry, Mass Defect Analysis, and Machine Learning. Environ. Sci. Technol. 2022, 56 (12), 9133– 9143, DOI: 10.1021/acs.est.2c0020132https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB38XhsVWmu7nE&md5=c081d248c11892a225f6313c2dccc662Uncovering PFAS and Other Xenobiotics in the Dark Metabolome Using Ion Mobility Spectrometry, Mass Defect Analysis, and Machine LearningFoster, MaKayla; Rainey, Markace; Watson, Chandler; Dodds, James N.; Kirkwood, Kaylie I.; Fernandez, Facundo M.; Baker, Erin S.Environmental Science & Technology (2022), 56 (12), 9133-9143CODEN: ESTHAG; ISSN:1520-5851. (American Chemical Society)The identification of xenobiotics in nontargeted metabolomic analyses is a vital step in understanding human exposure. Xenobiotic metab., transformation, excretion, and coexistence with other endogenous mols., however, greatly complicate the interpretation of features detected in nontargeted studies. While mass spectrometry (MS)-based platforms are commonly used in metabolomic measurements, deconvoluting endogenous metabolites from xenobiotics is also often challenged by the lack of xenobiotic parent and metabolite stds. as well as the numerous isomers possible for each small mol. m/z feature. Here, we evaluate a xenobiotic structural annotation workflow using ion mobility spectrometry coupled with MS (IMS-MS), mass defect filtering, and machine learning to uncover potential xenobiotic classes and species in large metabolomic feature lists. Xenobiotic classes examd. included those of known high toxicities, including per- and polyfluoroalkyl substances (PFAS), polycyclic arom. hydrocarbons (PAHs), polychlorinated biphenyls (PCBs), polybrominated di-Ph ethers (PBDEs), and pesticides. 
Specifically, when the workflow was applied to identify PFAS in the NIST SRM 1957 and 909c human serum samples, it greatly reduced the hundreds of detected liq. chromatog. (LC)-IMS-MS features by utilizing both mass defect filtering and m/z vs. IMS collision cross sections relationships. These potential PFAS features were then compared to the EPA CompTox entries, and while some matched within specific m/z tolerances, there were still many unknowns illustrating the importance of nontargeted studies for detecting new mols. with known chem. characteristics. Addnl., this workflow can also be utilized to evaluate other xenobiotics and enable more confident annotations from nontargeted studies.
- 33Picache, J. A.; Rose, B. S.; Balinski, A.; Leaptrot, K. L.; Sherrod, S. D.; May, J. C.; McLean, J. A. Collision Cross Section Compendium to Annotate and Predict Multi-Omic Compound Identities. Chemical Science 2019, 10 (4), 983– 993, DOI: 10.1039/C8SC04396E33https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXitlSqtrbN&md5=177c56f0b3557ca849e5f224ffd5a224Collision cross section compendium to annotate and predict multi-omic compound identitiesPicache, Jaqueline A.; Rose, Bailey S.; Balinski, Andrzej; Leaptrot, Katrina L.; Sherrod, Stacy D.; May, Jody C.; McLean, John A.Chemical Science (2019), 10 (4), 983-993CODEN: CSHCCN; ISSN:2041-6520. (Royal Society of Chemistry)Ion mobility mass spectrometry (IM-MS) expands the analyte coverage of existing multi-omic workflows by providing an addnl. sepn. dimension as well as a parameter for characterization and identification of mols. - the collision cross section (CCS). This work presents a large, Unified CCS compendium of >3800 exptl. acquired CCS values obtained from traceable mol. stds. and measured with drift tube ion mobility-mass spectrometers. An interactive visualization of this compendium along with data analytic tools have been made openly accessible. Represented in the compendium are 14 structurally-based chem. super classes, consisting of a total of 80 classes and 157 subclasses. Using this large data set, regression fitting and predictive statistics have been performed to describe mass-CCS correlations specific to each chem. ontol. These structural trends provide a rapid and effective filtering method in the traditional untargeted workflow for identification of unknown biochem. species. The utility of the approach is illustrated by an application to metabolites in human serum, quantified trends of which were used to assess the probability of an unknown compd. belonging to a given class. CCS-based filtering narrowed the chem. 
search space by 60% while increasing the confidence in the remaining isomeric identifications from a single class, thus demonstrating the value of integrating predictive analyses into untargeted expts. to assist in identification workflows. The predictive abilities of this compendium will improve in specificity and expand to more chem. classes as addnl. data from the IM-MS community is contributed. Instructions for data submission to the compendium and criteria for inclusion are provided.
- 34Picache, J.; McLean, J. S50 CCSCOMPEND The Unified Collision Cross Section (CCS) Compendium. Zenodo 2019, DOI: 10.5281/zenodo.2658162There is no corresponding record for this reference.
- 35Celma, A.; Sancho, J. V.; Schymanski, E. L.; Fabregat-Safont, D.; Ibáñez, M.; Goshawk, J.; Barknowitz, G.; Hernández, F.; Bijlsma, L. Improving Target and Suspect Screening High-Resolution Mass Spectrometry Workflows in Environmental Analysis by Ion Mobility Separation. Environ. Sci. Technol. 2020, 54 (23), 15120– 15131, DOI: 10.1021/acs.est.0c0571335https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3cXitlGqsbbF&md5=3b89676a1e7faca3f5ddfa4e5686c850Improving target and suspect screening high-resolution mass spectrometry workflows in environmental analysis by ion mobility separationCelma, Alberto; Sancho, Juan V.; Schymanski, Emma L.; Fabregat-Safont, David; Ibanez, Maria; Goshawk, Jeff; Barknowitz, Gitte; Hernandez, Felix; Bijlsma, LubertusEnvironmental Science & Technology (2020), 54 (23), 15120-15131CODEN: ESTHAG; ISSN:0013-936X. (American Chemical Society)Currently, the most powerful approach to monitor org. micropollutants (OMPs) in environmental samples is the combination of target, suspect, and nontarget screening strategies using high-resoln. mass spectrometry (HRMS). However, the high complexity of sample matrixes and the huge no. of OMPs potentially present in samples at low concns. pose an anal. challenge. Ion mobility sepn. (IMS) combined with HRMS instruments (IMS-HRMS) introduces an addnl. anal. dimension, providing extra information, which facilitates the identification of OMPs. The collision cross-section (CCS) value provided by IMS is unaffected by the matrix or chromatog. sepn. Consequently, the creation of CCS databases and the inclusion of ion mobility within identification criteria are of high interest for an enhanced and robust screening strategy. In this work, a CCS library for IMS-HRMS, which is online and freely available, was developed for 556 OMPs in both pos. and neg. ionization modes using electrospray ionization. 
The inclusion of ion mobility data in widely adopted confidence levels for identification in environmental reporting is discussed. Illustrative examples of OMPs found in environmental samples are presented to highlight the potential of IMS-HRMS and to demonstrate the addnl. value of CCS data in various screening strategies.
- 36Celma, A.; Fabregat-Safont, D.; Ibàñez, M.; Bijlsma, L.; Hernandez, F.; Sancho, J. V. S61 UJICCSLIB Collision Cross Section (CCS) Library from UJI. Zenodo 2019, DOI: 10.5281/zenodo.3549476There is no corresponding record for this reference.
- 37Belova, L.; Caballero-Casero, N.; Nuijs, A. L. N. van; Covaci, A. S79 UACCSCEC Collision Cross Section (CCS) Library from UAntwerp. Zenodo 2021, DOI: 10.5281/zenodo.4704648There is no corresponding record for this reference.
- 38Muller, H.; Palm, E.; Schymanski, E. S116 REFCCS Collision Cross Section (CCS) Values from Literature. Zenodo 2024, DOI: 10.5281/zenodo.10932895There is no corresponding record for this reference.
- 39PubChem. PubChem Classification Browser: CCSbase Classification, 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=104 (accessed 2024-12-17).There is no corresponding record for this reference.
- 40PubChem. PubChem Classification Browser: CCS Classification - Baker Lab , 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=124 (accessed 2024-12-17).There is no corresponding record for this reference.
- 41PubChem. PubChem Classification Browser: NORMAN-SLE Classification , 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=106 (accessed 2024-12-17).There is no corresponding record for this reference.
- 42PubChem. PubChem Classification Browser: Aggregated CCS Classification , 2024. https://pubchem.ncbi.nlm.nih.gov/classification/#hid=106 (accessed 2024-12-17).There is no corresponding record for this reference.
- 43Schymanski, E. Annotations/CCS/CCS_retrieval · Master · Uniluxembourg/LCSB/Environmental Cheminformatics/Pubchem. GitLab , 2024. https://gitlab.com/uniluxembourg/lcsb/eci/pubchem/-/tree/master/annotations/CCS/CCS_retrieval (accessed 2024-12-16).There is no corresponding record for this reference.
- 44Schymanski, E.; Zhang, J.; Thiessen, P.; Bolton, E. Experimental CCS Values in PubChem. Zenodo 2024, DOI: 10.5281/zenodo.6800138There is no corresponding record for this reference.
- 45Grouès, V.; Rocca-Serra, P.; Ded, V. Elixir-Luxembourg/Data-Catalog. GitHub , 2023. https://github.com/elixir-luxembourg/data-catalog (accessed 2024-08-04).There is no corresponding record for this reference.
- 46Welter, D.; Rocca-Serra, P.; Grouès, V.; Sallam, N.; Ancien, F.; Shabani, A.; Asariardakani, S.; Alper, P.; Ghosh, S.; Burdett, T.; Sansone, S.-A.; Gu, W.; Satagopam, V. The Translational Data Catalog - Discoverable Biomedical Datasets. Scientific Data 2023, 10 (1), 470, DOI: 10.1038/s41597-023-02258-0There is no corresponding record for this reference.
- 47Landrum, G.. RDKit: Open-Source Cheminformatics Software , 2024. https://www.rdkit.org/ (accessed 2024-08-04).There is no corresponding record for this reference.
- 48Landrum, G.; Tosco, P.; Kelley, B.; Rodriguez, R.; Cosgrove, D.; Vianello, R.; sriniker; gedeck; Jones, G.; NadineSchneider; Kawashima, E.; Nealschneider, D.; Dalke, A.; Swain, M.; Cole, B.; Turk, S.; Savelev, A.; Vaucher, A.; Wójcikowski, M.; Take, I. Rdkit/Rdkit: 2024_03_5 (Q1 2024) Release. Zenodo 2024, DOI: 10.5281/zenodo.591637There is no corresponding record for this reference.
- 49Grouès, V. Uniluxembourg/LCSB/Environmental Cheminformatics/PubChemLite-Web. GitLab , 2024. https://gitlab.com/uniluxembourg/lcsb/eci/pubchemlite-web (accessed 2024-08-04).There is no corresponding record for this reference.
- 50NCBI/NLM/NIH. PubChem Download Pages , 2024. https://ftp.ncbi.nlm.nih.gov/pubchem/ (accessed 2024-12-16).There is no corresponding record for this reference.
- 51Aurich, D.; Schymanski, E. L.; De Jesus Matias, F.; Thiessen, P. A.; Pang, J. Revealing Chemical Trends: Insights from Data-Driven Visualization and Patent Analysis in Exposomics Research. Environ. Sci. Technol. Lett. 2024, 11 (10), 1046– 1052, DOI: 10.1021/acs.estlett.4c00560There is no corresponding record for this reference.
- 52Arp, H. P. H.; Aurich, D.; Schymanski, E. L.; Sims, K.; Hale, S. E. Avoiding the Next Silent Spring: Our Chemical Past, Present, and Future. Environ. Sci. Technol. 2023, 57 (16), 6355– 6359, DOI: 10.1021/acs.est.3c0173552https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3sXnsFCgtLw%253D&md5=dd8d9b5a8381cfe4ef36643f525f8144Avoiding the Next Silent Spring: Our Chemical Past, Present, and FutureArp, Hans Peter H.; Aurich, Dagny; Schymanski, Emma L.; Sims, Kerry; Hale, Sarah E.Environmental Science & Technology (2023), 57 (16), 6355-6359CODEN: ESTHAG; ISSN:1520-5851. (American Chemical Society)There is no expanded citation for this reference.
- 53Aurich, D. Uniluxembourg/LCSB/Environmental Cheminformatics/Chemicalstripes. GitLab , 2024. https://gitlab.com/uniluxembourg/lcsb/eci/chemicalstripes (accessed 2024-08-04).There is no corresponding record for this reference.
- 54Talavera Andújar, B.; Mary, A.; Venegas, C.; Cheng, T.; Zaslavsky, L.; Bolton, E. E.; Heneka, M. T.; Schymanski, E. L. Can Small Molecules Provide Clues on Disease Progression in Cerebrospinal Fluid from Mild Cognitive Impairment and Alzheimer’s Disease Patients?. Environ. Sci. Technol. 2024, 58, 4181– 4192, DOI: 10.1021/acs.est.3c10490There is no corresponding record for this reference.
- 55WishartLab. FooDB , 2024. https://foodb.ca/ (accessed 2024-11-06).There is no corresponding record for this reference.
- 56Menger, F.; Celma, A.; Schymanski, E. L.; Lai, F. Y.; Bijlsma, L.; Wiberg, K.; Hernández, F.; Sancho, J. V.; Ahrens, L. Enhancing Spectral Quality in Complex Environmental Matrices: Supporting Suspect and Non-Target Screening in Zebra Mussels with Ion Mobility. Environ. Int. 2022, 170, 107585 DOI: 10.1016/j.envint.2022.107585There is no corresponding record for this reference.
- 57Baker, E. S.; Hoang, C.; Uritboonthai, W.; Heyman, H. M.; Pratt, B.; MacCoss, M.; MacLean, B.; Plumb, R.; Aisporna, A.; Siuzdak, G. METLIN-CCS: An Ion Mobility Spectrometry Collision Cross Section Database. Nat. Methods 2023, 20 (12), 1836– 1837, DOI: 10.1038/s41592-023-02078-5There is no corresponding record for this reference.
- 58Baker, E. S.; Uritboonthai, W.; Aisporna, A.; Hoang, C.; Heyman, H. M.; Connell, L.; Olivier-Jimenez, D.; Giera, M.; Siuzdak, G. METLIN-CCS Lipid Database: An Authentic Standards Resource for Lipid Classification and Identification. Nature Metabolism 2024, 6 (6), 981– 982, DOI: 10.1038/s42255-024-01058-zThere is no corresponding record for this reference.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.estlett.4c01003.
A document including additional details about the CCSbase training data sets (S1), using PubChemLite in MetFrag (S2), and additional rank and CCS results (S3) (PDF)