Article
Self-Contained Sequence Representation: Bridging the Gap between Bioinformatics and Cheminformatics
‡ Author Present Address
PerkinElmer, 100 Cambridge Park Drive, Cambridge, MA 02140, United States.
Abstract

The wide application of next-generation sequencing has presented a new hurdle to bioinformatics for managing the fast-growing sequence data. The management of biomacromolecules at the chemistry level imposes an even greater challenge in cheminformatics because of the lack of a good chemical representation of biopolymers. Here we introduce the self-contained sequence representation (SCSR). SCSR combines the best features of bioinformatics and cheminformatics notations. SCSR is the first general, extensible, and comprehensive representation of biopolymers in a compressed format that retains chemistry detail. The SCSR-based high-performance exact structure and substructure searching methods (NEMA key and SSS) offer new ways to search biopolymers that complement bioinformatics approaches. The widely used chemical structure file format (molfile) has been enhanced to support SCSR. SCSR offers a solid framework for future development of new methods and systems for managing and handling sequences at the chemistry level. SCSR lays the foundation for the integration of bioinformatics and cheminformatics.
History
- Published In Issue: September 26, 2011
- Article ASAP: August 22, 2011
- Just Accepted Manuscript: July 31, 2011
- Received: May 04, 2011