Machine learning enhanced spectroscopic analysis: towards autonomous chemical mixture characterization for rapid process optimization
Autonomous chemical process development and optimization methods use algorithms to explore the operating parameter space based on feedback from experimentally determined exit stream compositions. Measuring the compositions of multicomponent streams is challenging, often requiring multiple analytical techniques to differentiate between similar chemical components in the mixture and determine their concentrations. Herein, we describe a universal analytical methodology based on multitarget regression machine learning (ML) models to rapidly determine the compositions of chemical mixtures from Fourier transform infrared (FTIR) absorption spectra. Specifically, we used simulated FTIR spectra for up to six components in water and tested seven different ML algorithms to develop the methodology. All algorithms resulted in regression models with mean absolute errors (MAEs) between 0 and 0.27 wt%. We validated the methodology with experimental data obtained on mixtures prepared using a network of programmable pumps in line with an FTIR transmission flow cell. ML models were trained using experimental data and evaluated for mixtures of up to four components with similar chemical structures, including alcohols (i.e., glycerol, isopropanol, and 1-butanol) and nitriles (i.e., acrylonitrile, adiponitrile, and propionitrile). Linear regression models predicted concentrations with coefficients of determination, R², between 0.955 and 0.986, while artificial neural network models showed slightly lower accuracy, with R² between 0.854 and 0.977. These R² values correspond to MAEs of 0.28–0.52 wt% for mixtures with component concentrations between 4 and 10 wt%. Thus, we demonstrate that ML models can accurately determine the compositions of multicomponent mixtures of similar species, enhancing spectroscopic chemical quantification for use in autonomous, fast process development and optimization.
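To make the idea of multitarget regression on spectra concrete, the following minimal sketch fits one linear model that maps a whole spectrum to all component concentrations at once, then reports per-component MAE and R². It is an illustration under stated assumptions, not the pipeline used in this work: the band positions, absorptivities, noise level, and sample counts are invented, the mixture spectra are assumed to add linearly (Beer-Lambert behavior), and ordinary least squares from NumPy stands in for the ML algorithms that were actually tested.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: three solutes in water, spectra sampled at 100 wavenumbers.
wavenumbers = np.linspace(800.0, 1800.0, 100)  # cm^-1
n_components = 3


def gaussian_band(center, width):
    """Unit-height Gaussian absorption band on the wavenumber grid."""
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)


# Made-up, strongly overlapping pure-component profiles (absorbance per wt%).
pure_spectra = np.stack([
    0.020 * gaussian_band(1040.0, 30.0) + 0.010 * gaussian_band(1110.0, 25.0),
    0.018 * gaussian_band(1060.0, 30.0) + 0.012 * gaussian_band(1130.0, 25.0),
    0.022 * gaussian_band(1075.0, 30.0) + 0.008 * gaussian_band(1160.0, 25.0),
])


def simulate(n_samples, noise=0.001):
    """Draw random compositions (4 to 10 wt%) and build noisy mixture spectra."""
    conc = rng.uniform(4.0, 10.0, size=(n_samples, n_components))
    spectra = conc @ pure_spectra  # assumed linear (Beer-Lambert) mixing
    spectra += rng.normal(0.0, noise, size=spectra.shape)
    return spectra, conc


def add_intercept(X):
    """Prepend a column of ones so the model can learn a constant offset."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


X_train, Y_train = simulate(300)
X_test, Y_test = simulate(100)

# Multitarget ordinary least squares: one coefficient column per component.
coef, *_ = np.linalg.lstsq(add_intercept(X_train), Y_train, rcond=None)
Y_pred = add_intercept(X_test) @ coef

# Per-component metrics of the kind quoted in the abstract.
mae = np.mean(np.abs(Y_test - Y_pred), axis=0)
ss_res = np.sum((Y_test - Y_pred) ** 2, axis=0)
ss_tot = np.sum((Y_test - Y_test.mean(axis=0)) ** 2, axis=0)
r2 = 1.0 - ss_res / ss_tot

print("MAE (wt%):", np.round(mae, 3))
print("R^2:      ", np.round(r2, 3))
```

Because the target matrix has one column per component, a single fit yields all concentrations simultaneously. Swapping the least-squares step for another regressor would leave the data layout and the metric calculations unchanged.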