Self-Contained Sequence Representation: Bridging the Gap between Bioinformatics and Cheminformatics

William L. Chen*, Burton A. Leland, Joseph L. Durant, David L. Grier, Bradley D. Christie, James G. Nourse, and Keith T. Taylor
Accelrys, Incorporated, 2440 Camino Ramon, Suite 300, San Ramon, California 94583, United States
J. Chem. Inf. Model., 2011, 51 (9), pp 2186–2208
DOI: 10.1021/ci2001988
Publication Date (Web): July 31, 2011
Copyright © 2011 American Chemical Society
E-mail: williamlingran.chen@accelrys.com. Telephone: (925) 543 7541.

Author Present Address

PerkinElmer, 100 Cambridge Park Drive, Cambridge, MA 02140, United States.

Abstract

The wide application of next-generation sequencing has presented a new hurdle to bioinformatics for managing the fast-growing sequence data. The management of biomacromolecules at the chemistry level imposes an even greater challenge in cheminformatics because of the lack of a good chemical representation of biopolymers. Here we introduce the self-contained sequence representation (SCSR). SCSR combines the best features of bioinformatics and cheminformatics notations. SCSR is the first general, extensible, and comprehensive representation of biopolymers in a compressed format that retains chemistry detail. The SCSR-based high-performance exact structure and substructure searching methods (NEMA key and SSS) offer new ways to search biopolymers that complement bioinformatics approaches. The widely used chemical structure file format (molfile) has been enhanced to support SCSR. SCSR offers a solid framework for future development of new methods and systems for managing and handling sequences at the chemistry level. SCSR lays the foundation for the integration of bioinformatics and cheminformatics.

History

  • Published In Issue: September 26, 2011
  • Article ASAP: August 22, 2011
  • Just Accepted Manuscript: July 31, 2011
  • Received: May 04, 2011
