Abstract
Compounds from 36 commercial supplier libraries and the NCI open database were analysed to address bias in structural features in the selection of small molecules for high-throughput screening (HTS). Initially, a meta-dataset of 11.8 million unique structures was identified from 15.6 million compounds by eliminating redundant molecules from individual libraries. HTS compounds were then selected from these libraries using common structural filters, physicochemical filters and recently introduced descriptors. Compound libraries from different suppliers were also analysed according to their exclusiveness, ‘