Cheminformatics in Diverse Dimensions
Modern drug discovery relies heavily on experimental screening data and on computer-assisted methods. Both approaches generate enormous amounts of data that have to be processed and analysed, and numerous software programs are available to handle this information. The diversity of programs reflects the multitude of data sources, including instruments (HPLC, NMR, etc.), drawn structures, and expert systems that process the data and produce new information. All of these systems have one task in common: to store and handle data. Many organizations and software suppliers have developed their own data formats and, because no generally accepted exchange format exists, quite a few have made provisions for importing or exporting other file formats. Thus, processing data into information, and ultimately knowledge, more often than not requires the interaction and cooperation of several different software systems and databases. Furthermore, in drug design, an understanding of structure–activity relationships (SAR) is essential, since molecular features are responsible for compound properties and/or pharmacological behaviour. All these aspects require an adequate representation of molecular structures and their physicochemical properties.
This article reviews the multitude of computer-based chemical structural representations by considering aspects of their code dimensionality (zero dimension: numbers; one dimension: strings; two dimensions: tables, matrices, etc.). Furthermore, typical computer applications, characterized by their structural representations, are highlighted.
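As a rough illustration of the three code dimensionalities named above, the following Python sketch represents a single molecule (ethanol) as a number, as a string, and as a table. The choice of ethanol, the heavy-atom count as the example number, SMILES as the example string notation, and the helper name `adjacency_matrix` are all assumptions made for this sketch; they are not taken from the article.

```python
# Illustrative sketch only: ethanol at three code dimensionalities.
# All names and the choice of molecule are assumptions for this example.

# Zero dimensions: a single number (here, the count of non-hydrogen atoms).
heavy_atom_count = 3

# One dimension: a linear string (SMILES notation for ethanol).
smiles = "CCO"

# Two dimensions: a connection table listing atoms and bonds.
atoms = ["C", "C", "O"]
bonds = [(0, 1, 1), (1, 2, 1)]  # (atom index, atom index, bond order)


def adjacency_matrix(n_atoms, bonds):
    """Build a symmetric bond-order matrix from a list of bonds."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j, order in bonds:
        matrix[i][j] = order
        matrix[j][i] = order
    return matrix


matrix = adjacency_matrix(len(atoms), bonds)
for row in matrix:
    print(row)
```

The number says the least about the structure, the string encodes connectivity compactly, and the table makes every atom and bond explicitly addressable.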