Comment on “A simple constrained machine learning model for predicting high-pressure-hydrogen-compressor materials” by Hattrick-Simpers, et al., Molecular Systems Design & Engineering, 2018, 3, 509

Jason Hattrick-Simpers; Brian DeCost

doi:10.1039/C9ME00138G

Jason Hattrick-Simpers^a and Brian DeCost*^a

Author affiliations

* Corresponding authors

^a National Institute of Standards and Technology, Gaithersburg, Maryland, USA
E-mail: brian.decost@nist.gov

Abstract

In this short comment we present a reproducibility study for our recent manuscript, “A simple constrained machine learning model for predicting high-pressure-hydrogen-compressor materials” (Hattrick-Simpers et al., Mol. Syst. Des. Eng., 2018, 3, 509), using a suite of open source materials data science tools. The principal goal of this study is to let the interested reader reproduce our previous machine learning model with minimal effort and then make predictions on the holdout set used in that manuscript. In transcribing our model from the Java-based Magpie/Weka framework to the Python-based Matminer/scikit-learn framework, we noticed an unexpected discrepancy between the predictions of the two platforms. To compare the performance of nominally equivalent random forest (RF) regression models across the two platforms, we trained and evaluated 50 replicate models on each platform, with each replicate using a random 90% subset of the full hydride training set. On the holdout set, the Magpie/Weka models showed a somewhat higher mean absolute error (5.6 ± 0.4) than the Matminer/scikit-learn models (4.2 ± 0.4), although the validation statistics were within error of one another. Fully analyzing the ultimate source of the variance in these predictions is beyond the scope of this comment, but we speculate that some of it results from differences in how Magpie treats duplicate compositions in the training set and/or differences between the RF implementations in Weka and scikit-learn.
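The replicate protocol described in the abstract (many RF models, each trained on a random 90% subset of the training set and scored by mean absolute error on a fixed holdout set) can be sketched on the scikit-learn side roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `replicate_holdout_mae`, the hyperparameters, and the synthetic data are all hypothetical, and the Matminer featurization of the hydride compositions is replaced here by random placeholder features.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error


def replicate_holdout_mae(X_train, y_train, X_holdout, y_holdout,
                          n_replicates=50, subset_fraction=0.9,
                          n_estimators=100, seed=0):
    """Train replicate RF models on random subsets; return holdout MAEs.

    Hypothetical helper: the defaults mirror the 50 replicates and 90%
    subsets mentioned in the abstract, but n_estimators is an assumption.
    """
    rng = np.random.default_rng(seed)
    n_train = X_train.shape[0]
    n_subset = int(round(subset_fraction * n_train))
    maes = np.empty(n_replicates)
    for i in range(n_replicates):
        # Draw a random subset of the training rows without replacement.
        idx = rng.choice(n_train, size=n_subset, replace=False)
        model = RandomForestRegressor(
            n_estimators=n_estimators,
            random_state=int(rng.integers(2**31 - 1)),
        )
        model.fit(X_train[idx], y_train[idx])
        # Score every replicate on the same fixed holdout set.
        maes[i] = mean_absolute_error(y_holdout, model.predict(X_holdout))
    return maes


# Synthetic placeholder data (NOT the hydride data set from the paper).
demo_rng = np.random.default_rng(42)
X_all = demo_rng.normal(size=(220, 5))
y_all = X_all @ np.array([3.0, -2.0, 1.0, 0.5, 0.0]) + demo_rng.normal(scale=0.5, size=220)
X_train, y_train = X_all[:200], y_all[:200]
X_holdout, y_holdout = X_all[200:], y_all[200:]

maes = replicate_holdout_mae(X_train, y_train, X_holdout, y_holdout,
                             n_replicates=10, n_estimators=20)
print(f"holdout MAE: {maes.mean():.2f} +/- {maes.std(ddof=1):.2f}")
```

Reporting the mean and standard deviation over replicates, as in the last line, gives a summary in the same "value ± spread" form as the figures quoted in the abstract; whether the authors used the standard deviation or another measure of spread is not stated there.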

Article information

DOI: https://doi.org/10.1039/C9ME00138G
Article type: Comment
Submitted: 09 Oct 2019
Accepted: 03 Feb 2020
First published: 07 Feb 2020

Mol. Syst. Des. Eng., 2020, 5, 589-591