Foreign Language Translation of Chemical Nomenclature by Computer

Abstract

Chemical compound names remain the primary method for conveying molecular structures between chemists and researchers. In research articles, patents, chemical catalogues, government legislation, and textbooks, the use of IUPAC and traditional compound names is universal, despite efforts to introduce more machine-friendly representations such as identifiers and line notations. Fortunately, advances in computing power now allow chemical names to be parsed and generated (read and written) with almost the same ease as conventional connection tables. A major complication, however, is that although the vast majority of chemistry is written using English nomenclature, a significant fraction is in other languages. This complicates the tasks of filing and analyzing chemical patents, purchasing from compound vendors, and text mining research articles or Web pages. We describe some issues with manipulating chemical names in various languages, including British and American English, German, Japanese, Chinese, Spanish, Swedish, Polish, and Hungarian, and describe the current state of the art in software tools to simplify the process.
History
- Published in Issue: March 23, 2009
- Article ASAP: February 24, 2009
- Received: July 21, 2008