Deep learning for chemical reaction prediction
Reaction predictor is an application for predicting chemical reactions and reaction pathways. It uses deep learning to predict and rank elementary reactions by first identifying electron sources and sinks, pairing those sources and sinks to propose elementary reactions, and finally ranking the reactions by favorability. Global reactions can be identified by chaining together these elementary reaction predictions. We carefully curated a data set consisting of over 11 000 elementary reactions, covering a broad range of advanced organic chemistry. Using this data for training, we demonstrate an 80% top-5 recovery rate on a separate, challenging benchmark set of reactions drawn from modern organic chemistry literature. A fundamental problem of synthetic chemistry is the identification of unknown products observed via mass spectrometry. Reaction predictor includes a pathway search feature that can help identify such products through multi-target mass search. Finally, we discuss an alternative approach to predicting electron sources and sinks using recurrent neural networks, specifically long short-term memory (LSTM) architectures, operating directly on SMILES strings. This approach has shown promising preliminary results.
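The three-stage flow described above (identify electron sources and sinks, pair them into candidate elementary reactions, rank the candidates) and the top-5 recovery metric can be sketched roughly as follows. This is an illustrative sketch only, not the Reaction predictor implementation: the `Site` class, the atom labels, the numeric scores, and the product-of-scores ranking function are all hypothetical stand-ins for the trained deep learning models the abstract refers to.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Site:
    """A candidate reactive site (hypothetical representation)."""
    atom: str     # illustrative atom label, e.g. "N1"
    score: float  # illustrative model confidence in [0, 1]


Reaction = Tuple[Site, Site]  # (electron source, electron sink)


def propose_reactions(sources: List[Site], sinks: List[Site]) -> List[Reaction]:
    """Pair every source with every sink, skipping self-pairs."""
    return [(s, k) for s, k in product(sources, sinks) if s.atom != k.atom]


def rank_reactions(candidates: List[Reaction],
                   favorability: Callable[[Reaction], float]) -> List[Reaction]:
    """Order candidates from most to least favorable."""
    return sorted(candidates, key=favorability, reverse=True)


def top_k_recovered(ranked: List[Reaction], truth: Reaction, k: int = 5) -> bool:
    """True if the known reaction appears among the first k ranked candidates."""
    return truth in ranked[:k]


def product_score(pair: Reaction) -> float:
    """Assumed stand-in for a learned ranking model: multiply site scores."""
    source, sink = pair
    return source.score * sink.score


# Stage 1 (assumed output of a source/sink identification model).
sources = [Site("N1", 0.9), Site("O2", 0.4)]
sinks = [Site("C3", 0.8), Site("C5", 0.3)]

# Stage 2: propose elementary reactions; stage 3: rank them.
ranked = rank_reactions(propose_reactions(sources, sinks), product_score)
for source, sink in ranked:
    print(f"{source.atom} -> {sink.atom}")
```

Averaging `top_k_recovered(..., k=5)` over a benchmark set would give a top-5 recovery rate of the kind quoted above; chaining ranked elementary steps, feeding the products of one step in as the reactants of the next, is how a global reaction pathway would be assembled under this sketch.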
- This article is part of the themed collections "MSDE most-read Q1 2019" and "Machine Learning and Data Science in Materials Design"