Chapter 6

Representing Chemical Structures in Databases for Drug Design

Many different computer representations of chemical structures are used in drug design. Most treat the molecule as a topological graph, though the analogy is imperfect and runs into problems with features such as tautomerism, aromaticity and stereochemistry. Commonly used representations include 2D structure diagrams, systematic nomenclature, line notations (e.g. SMILES), connection tables and the recently developed International Chemical Identifier (InChI). Several approaches are used to indicate stereochemical configuration. 3D structural representations are also used in identifying molecules with appropriate conformations for biological activity. The presence or absence of substructure fragments in a molecule is used to build chemical “fingerprints”, which are useful both in structure search systems and for measuring the similarity between molecules. It is frequently important to establish a unique representation for a molecule, and a number of canonicalisation algorithms and structure normalisation procedures (especially to deal with tautomerism and protonation) are used to achieve this. In some cases, these need to consider the predominant form under physiological conditions. “Business rules” for normalisation are especially important in chemical registration systems, which also need to deal with different salts and isotopically labelled compounds, as well as unknown and partially known structures. In recent years, many techniques have been developed for the analysis of structural databases; these include clustering, R-group decomposition, “reduced feature” representations and matched molecular pair analysis.
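As an illustration of the fingerprint idea mentioned above, the following is a minimal Python sketch rather than the chapter's own method. It assumes a small, hypothetical fragment dictionary (`FRAGMENT_KEYS`) and fragment lists assigned by hand instead of by a substructure perception routine. It also uses the Tanimoto coefficient as the similarity measure, which is a common choice but is not named in the abstract.

```python
# Hypothetical fragment dictionary: each fingerprint position records the
# presence (1) or absence (0) of one substructure fragment.
FRAGMENT_KEYS = [
    "carboxylic acid",
    "benzene ring",
    "hydroxyl",
    "amine",
    "ester",
    "amide",
]


def make_fingerprint(fragments_present):
    """Return a tuple of 0/1 bits, one per entry in FRAGMENT_KEYS."""
    present = set(fragments_present)
    return tuple(1 if key in present else 0 for key in FRAGMENT_KEYS)


def tanimoto(fp_a, fp_b):
    """Bits set in both fingerprints divided by bits set in either."""
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    # Returning 0.0 for two empty fingerprints is a convention chosen here.
    return both / either if either else 0.0


def passes_screen(query_fp, target_fp):
    """Structure-search screen: a target can only contain the query
    substructure if every bit set in the query is also set in the target."""
    return all(t >= q for q, t in zip(query_fp, target_fp))


# Fragment lists assigned by hand for illustration.
aspirin = make_fingerprint(["carboxylic acid", "benzene ring", "ester"])
salicylic_acid = make_fingerprint(["carboxylic acid", "benzene ring", "hydroxyl"])
aryl_acid_query = make_fingerprint(["carboxylic acid", "benzene ring"])

similarity = tanimoto(aspirin, salicylic_acid)  # 2 shared bits / 4 set bits
```

The same bit vector serves both purposes described in the abstract: `passes_screen` cheaply rules out molecules that cannot match a substructure query, while `tanimoto` gives a similarity score between two molecules. Real systems use far larger fragment sets or hashed fingerprints.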

Print publication date: 10 Nov 2011
Copyright year: 2012
Print ISBN: 978-1-84973-166-9
PDF eISBN: 978-1-84973-341-0
From the book series: RSC Drug Discovery