Issue 6, 2016

The creation and characterisation of a National Compound Collection: the Royal Society of Chemistry pilot


Abstract

We present a summary of the National Compound Collection (NCC) pilot, which harvested chemical structure data from 746 publicly available PhD theses to create an enhanced database of diverse and interesting (largely organic) molecular entities. The database comprised ∼75 000 structure entries, of which 70% were new to ChemSpider at the time of upload. The dataset was evaluated for structural uniqueness by twelve external drug discovery groups from the pharmaceutical, biotech, academic and not-for-profit sectors. These partners generated the data reported here, comparing the NCC pilot with their in-house compound collections. The proportion of NCC structures considered useful for drug discovery ranged from 5% to 80%, depending on the strictness of the filters used. Most interestingly from a drug discovery standpoint, ∼13k NCC compounds (18% of the NCC) passed the filters and were of good diversity. These compounds are quite different from those already present in the screening collections, but not so different that they are no longer considered drug-like. In general, the drug discovery teams would consider these compounds to be high-value molecules for inclusion in their screening collections. This pilot addressed the potential value of unpublished data and explored the practicalities of large-scale data extraction, to inform both retrospective and prospective extraction of chemical data from theses.
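The "new to ChemSpider" and in-house comparison figures above are, at heart, set comparisons between structure identifiers. The following is a minimal sketch of that idea only, not the method used in the pilot: it assumes each structure has already been reduced to a canonical string identifier (for example an InChIKey), and the function name `fraction_new` and the toy keys are hypothetical.

```python
def fraction_new(candidate_keys, reference_keys):
    """Return the share of unique candidate identifiers absent from the reference set.

    Both arguments are iterables of canonical structure identifiers
    (assumed here to be strings such as InChIKeys).
    """
    candidates = set(candidate_keys)      # de-duplicate repeated entries
    if not candidates:
        return 0.0
    reference = set(reference_keys)
    return len(candidates - reference) / len(candidates)


# Toy identifiers invented for illustration only (not real NCC data).
ncc = ["KEY-A", "KEY-B", "KEY-C", "KEY-D", "KEY-A"]
in_house = ["KEY-B", "KEY-X"]

share = fraction_new(ncc, in_house)
print(f"{share:.0%} of unique entries are new")
```

Exact identifier matching of this kind only measures novelty; the drug-likeness filters and diversity assessment mentioned in the abstract would be separate steps.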



Publication details

The article was received on 19 Jan 2016, accepted on 22 Feb 2016 and first published on 23 Feb 2016


Article type: Edge Article
DOI: 10.1039/C6SC00264A
Citation: Chem. Sci., 2016, 7, 3869-3878
  • Open access: Creative Commons BY license

    The creation and characterisation of a National Compound Collection: the Royal Society of Chemistry pilot

    D. M. Andrews, L. M. Broad, P. J. Edwards, D. N. A. Fox, T. Gallagher, S. L. Garland, R. Kidd and J. B. Sweeney, Chem. Sci., 2016, 7, 3869
    DOI: 10.1039/C6SC00264A

    This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. Material from this article can be used in other publications provided that the correct acknowledgement is given with the reproduced material.

    Reproduced material should be attributed as follows:

    • For reproduction of material from NJC:
      [Original citation] - Published by The Royal Society of Chemistry (RSC) on behalf of the Centre National de la Recherche Scientifique (CNRS) and the RSC.
    • For reproduction of material from PCCP:
      [Original citation] - Published by the PCCP Owner Societies.
    • For reproduction of material from PPS:
      [Original citation] - Published by The Royal Society of Chemistry (RSC) on behalf of the European Society for Photobiology, the European Photochemistry Association, and RSC.
    • For reproduction of material from all other RSC journals:
      [Original citation] - Published by The Royal Society of Chemistry.

    Information about reproducing material from RSC articles with different licences is available on our Permission Requests page.
