Issue 6, 2023

ChemDataWriter: a transformer-based toolkit for auto-generating books that summarise research

Abstract

Since the number of scientific papers has grown substantially over recent years, scientists spend a great deal of time searching, screening, and reading papers to follow the latest research trends. With the development of advanced natural-language-processing (NLP) models, transformer-based text-generation algorithms have the potential to summarise scientific papers and automatically write a literature review from numerous scientific publications. In this paper, we introduce a Python-based toolkit, ChemDataWriter, which auto-generates books about research in a completely unsupervised fashion. ChemDataWriter adopts a conservative book-generation pipeline that writes the book automatically by suggesting potential book content, retrieving and re-ranking the relevant papers, and then summarising and paraphrasing the text within those papers. To the best of our knowledge, ChemDataWriter is the first open-source toolkit in the area of chemistry that can compose a literature review entirely via artificial intelligence once a user has suggested a broad topic. We also provide an example of a book that ChemDataWriter has auto-generated about battery-materials research. To aid the use of ChemDataWriter, its code is provided with associated documentation that serves as a user guide.
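
The retrieve, re-rank, and summarise stages named in the abstract can be illustrated with a minimal sketch. This is not ChemDataWriter's code or API: every function name below is hypothetical, bag-of-words cosine similarity stands in for the transformer models, the summariser is extractive rather than abstractive, and the content-suggestion and paraphrasing stages are left out. The three sample papers are invented purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, papers):
    """Stage 1 (assumed form): keep papers sharing at least one token with the query."""
    q = set(tokenize(query))
    return [p for p in papers if q & set(tokenize(p["abstract"]))]


def rerank(query, papers):
    """Stage 2 (assumed form): order retrieved papers by similarity to the query."""
    q = Counter(tokenize(query))
    return sorted(
        papers,
        key=lambda p: cosine(q, Counter(tokenize(p["abstract"]))),
        reverse=True,
    )


def summarise(query, text, max_sentences=1):
    """Stage 3 (extractive stand-in): keep the sentences closest to the query."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    q = Counter(tokenize(query))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(q, Counter(tokenize(sentences[i]))),
        reverse=True,
    )
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def write_section(query, papers, top_k=2):
    """Chain the three stages and return (title, summary) pairs."""
    hits = rerank(query, retrieve(query, papers))[:top_k]
    return [(p["title"], summarise(query, p["abstract"])) for p in hits]


# Invented sample data, for illustration only.
PAPERS = [
    {"title": "Paper A",
     "abstract": "We study lithium-ion battery cathode materials. "
                 "The cathode capacity improves after doping."},
    {"title": "Paper B",
     "abstract": "Solar cells are reviewed. Perovskite stability remains a challenge."},
    {"title": "Paper C",
     "abstract": "Battery research is broad. Many topics exist."},
]

section = write_section("battery cathode materials", PAPERS)
for title, summary in section:
    print(f"{title}: {summary}")
```

Splitting retrieval from re-ranking lets a cheap filter discard unrelated papers before the more expensive scoring step runs, which is the usual motivation for a two-stage design when the scorer is a large model.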

Article information

Article type: Paper
Submitted: 20 Aug 2023
Accepted: 04 Oct 2023
First published: 04 Oct 2023
This article is Open Access
Creative Commons BY license

Digital Discovery, 2023, 2, 1710-1720

S. Huang and J. M. Cole, Digital Discovery, 2023, 2, 1710, DOI: 10.1039/D3DD00159H

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
