Commit: Reaction classification and yield prediction using the differential reaction fingerprint DRFP
Abstract
In “Reaction classification and yield prediction using the differential reaction fingerprint DRFP”, we introduced a chemical reaction fingerprint based on the symmetric difference AΔB of two sets A and B. With DRFP, we represent a reaction as the two sets R and P, where R contains the fragments of one or more reactants and P the fragments of one or more products. The SMILES strings of the fragments in the symmetric difference RΔP are then hashed and folded into a binary vector. We evaluated DRFP-trained models on high-throughput experiment data, where DRFP performed at least as well as DFT-based and learned fingerprints. In this commit, we present the evaluation of DRFP-trained XGBoost and Random Forest regressors on a recently released set of Buchwald–Hartwig reactions extracted from electronic laboratory notebooks, where DRFP outperforms other methods by a wide margin. This result underlines the status of DRFP as a strong baseline for reaction representation and yield prediction.
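The hash-and-fold step described above can be illustrated with a minimal sketch. It is not the DRFP implementation: the fragment sets are hand-written placeholders rather than substructures extracted from molecules, and the function name `fold_symmetric_difference`, the use of SHA-256, and the vector length are assumptions made purely for illustration.

```python
import hashlib


def fold_symmetric_difference(reactant_fragments, product_fragments, n_bits=2048):
    """Toy sketch: hash the fragments in R Δ P and fold them into a binary vector.

    The hash function and folding scheme are illustrative assumptions,
    not the ones used by DRFP.
    """
    r = set(reactant_fragments)
    p = set(product_fragments)
    diff = r ^ p  # symmetric difference: fragments in exactly one of R and P

    fingerprint = [0] * n_bits
    for fragment in diff:
        # Hash the fragment SMILES string to an integer
        digest = hashlib.sha256(fragment.encode("utf-8")).digest()
        # Fold the hash into the fixed vector length with a modulo
        index = int.from_bytes(digest[:4], "big") % n_bits
        fingerprint[index] = 1
    return fingerprint, diff


# Hypothetical fragment SMILES, written by hand for this example
reactant_frags = {"CC", "CO", "C=O", "O"}
product_frags = {"CC", "C=O", "COC"}

fp, diff = fold_symmetric_difference(reactant_frags, product_frags, n_bits=64)
print(sorted(diff))
print(sum(fp), "of", len(fp), "bits set")
```

Fragments shared by reactants and products ("CC" and "C=O" here) cancel out, so only the fragments that change over the course of the reaction contribute bits. Because the vector is folded, distinct fragments can collide on the same bit, which is why the number of set bits can be smaller than the size of RΔP.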