Article
Optical Structure Recognition Software To Recover Chemical Information: OSRA, An Open Source Solution
SAIC-Frederick, Inc., ‡NCI-Frederick.
Abstract

Until recently, most scientific and patent documents dealing with chemistry have described molecular structures either with systematic names or with graphical images of Kekulé structures. The latter method poses inherent problems for the automated processing that is needed when the number of documents runs into the hundreds of thousands or even millions, since graphical representations cannot be directly interpreted by a computer. To recover this structural information, which is otherwise all but lost, we have built OSRA, an optical structure recognition application based on modern advances in image processing implemented in open source tools. OSRA can read documents in over 90 graphical formats, including GIF, JPEG, PNG, TIFF, PDF, and PS; it automatically recognizes and extracts the graphical information representing chemical structures in such documents and generates the SMILES or SD representation of the encountered molecular structure images.
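The workflow described in the abstract (image or document in, SMILES or SD out) can be illustrated with a minimal Python sketch. It is not taken from the article. It assumes that an `osra` executable is installed on the system path, that it accepts a `-f` option selecting the output format (`smi`, `can`, or `sdf`), and that in SMILES mode it prints one structure per line to standard output. The helper names `build_osra_command`, `parse_smiles_output`, and `recognize_structures` are hypothetical.

```python
import shutil
import subprocess
from typing import List

# Output formats this sketch assumes the OSRA command line accepts via -f.
ASSUMED_FORMATS = {"smi", "can", "sdf"}


def build_osra_command(image_path: str, output_format: str = "smi",
                       executable: str = "osra") -> List[str]:
    """Assemble the command line (assumed form: osra -f <format> <file>)."""
    if output_format not in ASSUMED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    return [executable, "-f", output_format, image_path]


def parse_smiles_output(stdout: str) -> List[str]:
    """Keep the first whitespace-separated token of each non-empty line.

    Assumption: each non-empty line starts with one SMILES string,
    possibly followed by extra fields that are ignored here.
    """
    return [line.split()[0] for line in stdout.splitlines() if line.strip()]


def recognize_structures(image_path: str, executable: str = "osra") -> List[str]:
    """Run the recognizer on one file and return the SMILES strings found."""
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable!r} was not found on PATH")
    result = subprocess.run(
        build_osra_command(image_path, "smi", executable),
        capture_output=True, text=True, check=True,
    )
    return parse_smiles_output(result.stdout)
```

With the tool installed, a call such as `recognize_structures("scheme1.png")` (a hypothetical file name) would return a list of SMILES strings, one per recognized structure image. Separating command construction and output parsing from the subprocess call keeps the parts that rest on assumptions easy to check and adjust against the installed version.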
Citing Articles
This article has been cited by 4 ACS Journal articles (4 most recent appear below).

AsteriX: A Web Server To Automatically Extract Ligand Coordinates from Figures in PDF Articles
V. Lounnas and G. Vriend. Journal of Chemical Information and Modeling 2012, Article ASAP.
Coordinates describing the chemical structures of small molecules that are potential ligands for pharmaceutical targets are used at many stages of the drug design process. The coordinates of the vast majority of ligands can be obtained from either ...

Chemical−Text Hybrid Search Engines
Yingyao Zhou, Bin Zhou, Shumei Jiang and Frederick J. King. Journal of Chemical Information and Modeling 2010, 50 (1), 47-54.
As the amount of chemical literature increases, it is critical that researchers be enabled to accurately locate documents related to a particular aspect of a given compound. Existing solutions, based on text and chemical search engines alone, suffer from ...

Tunable Machine Vision-Based Strategy for Automated Annotation of Chemical Databases
Jungkap Park, Gus R. Rosania and Kazuhiro Saitou. Journal of Chemical Information and Modeling 2009, 49 (8), 1993-2001.
We present a tunable, machine vision-based strategy for automated annotation of virtual small molecule databases. The proposed strategy is based on the use of a machine vision-based tool for extracting structure diagrams in research articles and ...

CLiDE Pro: The Latest Generation of CLiDE, a Tool for Optical Chemical Structure Recognition
Aniko T. Valko and A. Peter Johnson. Journal of Chemical Information and Modeling 2009, 49 (4), 780-787.
We present CLiDE Pro, the latest version of the output of the long-term CLiDE project for the development of tools for automatic extraction of chemical information from the literature. CLiDE Pro is concerned with the extraction of chemical structure and ...
History
- Published In Issue: March 23, 2009
- Article ASAP: February 17, 2009
- Received: February 22, 2008