How big is Big Data?

Daniel Speckhard; Tim Bechtel; Luca M. Ghiringhelli; Martin Kuban; Santiago Rigamonti; Claudia Draxl

doi:10.1039/D4FD00102H

Abstract

Big data has ushered in a new wave of predictive power using machine learning models. In this work, we assess what "big" means in the context of typical materials-science machine-learning problems. This concerns not only data volume, but also data quality and veracity, as well as infrastructure issues. With selected examples, we ask (i) how models generalize to similar datasets, (ii) how high-quality datasets can be gathered from heterogeneous sources, (iii) how the feature set and complexity of a model can affect expressivity, and (iv) what infrastructure is needed to create larger datasets and train models on them. In sum, we find that big data present unique challenges along very different aspects that should serve to motivate further work.

This article is part of the themed collection: Data-driven discovery in the chemical sciences

Article information

DOI: https://doi.org/10.1039/D4FD00102H
Article type: Paper
Submitted: 14 May 2024
Accepted: 08 Jul 2024
First published: 11 Jul 2024
This article is Open Access

Faraday Discuss., 2024, Accepted Manuscript

D. Speckhard, T. Bechtel, L. M. Ghiringhelli, M. Kuban, S. Rigamonti and C. Draxl, Faraday Discuss., 2024, Accepted Manuscript, DOI: 10.1039/D4FD00102H

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
