Harikrishnan Sibia,
Jovita Bijub and
Chandra Chowdhury*c
aSchool of Mathematics, Indian Institute of Science Education and Research Thiruvananthapuram (IISER TVM), Maruthamala P. O, Thiruvananthapuram 695 551, India
bSchool of Data Science, Indian Institute of Science Education and Research Thiruvananthapuram (IISER TVM), Maruthamala P. O, Thiruvananthapuram 695 551, India
cAdvanced Materials Laboratory, CSIR-Central Leather Research Institute, Sardar Patel Road, Adyar, Chennai, 600020, India. E-mail: pc.chandra12@gmail.com
First published on 29th November 2024
Despite the increased research and scholarly attention on two-dimensional (2D) materials, there is still a limited range of practical applications for these materials. This is because it is challenging to acquire properties that are usually obtained by experiments or first-principles predictions, which require substantial time and resources. Descriptor-based machine learning models frequently require further density functional theory (DFT) calculations to enhance prediction accuracy due to the intricate nature of the systems and the constraints of the descriptors employed. Unlike these models, research has demonstrated that graph neural networks (GNNs), which solely rely on the systems' coordinates for model description, greatly improve the ability to represent and simulate atomistic materials. Within this framework, we employed the Atomistic Line Graph Neural Network (ALIGNN) to predict the work function, a crucial material characteristic, for a diverse array of 2D materials sourced from the Computational 2D Materials Database (C2DB). We found that the ALIGNN algorithm shows superior performance compared to standard feature-based approaches. It attained a mean absolute error of 0.20 eV, whereas random forest models achieved 0.27 eV.
The work function is a fundamental characteristic of a material that quantifies the minimum energy necessary to extract an electron from the Fermi level and transfer it to a vacuum state. In addition to its significance in surface science, the work function of a 2D material is relevant in several fields such as catalysis, energy storage/conversion, and electronics.11–14 Fluorinating graphene at varying concentrations has been used as work-function grading to enhance electron extraction in inverted structures.15 Deng and co-workers16 revealed that a graphene cover enclosing Fe nanoparticles displayed both high activity and high stability, owing to the work function difference between the outer cover and the inner nanometal. He et al. showed that, due to its very high work function value, CuBiP2Se6 acts as the most stable intrinsic p-type 2D material.17
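For reference, with the usual sign convention the definition above can be written compactly as

```latex
W = E_{\mathrm{vac}} - E_{\mathrm{F}}
```

where \(E_{\mathrm{vac}}\) is the energy of an electron at rest in vacuum just outside the surface and \(E_{\mathrm{F}}\) is the Fermi level.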
Researchers frequently employ sophisticated first-principles approaches such as density functional theory (DFT)18 to unravel electronic characteristics, including crucial attributes such as the work function. Although these techniques provide valuable insights, their computational requirements can be prohibitive, particularly when dealing with a wide range of materials. Notably, comprehensive collections of materials, carefully developed through extensive and meticulous research, open promising opportunities for applying machine learning (ML) to the investigation of new 2D materials.19–23 Predictive ML models offer practical alternatives to computationally demanding DFT calculations by providing statistical predictions for these important characteristics. These models are trained on either existing or newly curated data. Databases such as the Computational 2D Materials Database (C2DB)24 and 2DMatPedia,25 for example, systematically document the thermodynamic and electronic characteristics of 2D materials, providing essential resources for expediting materials exploration and design. Recently, Roy et al. applied ML to predict the work function of 2D materials26 using descriptor-based analysis. Descriptor-based models may struggle to capture the intricate linkages and non-linear interactions within materials data, limitations that are particularly evident for high-dimensional or unstructured data.27,28
Graph neural networks (GNNs)29 are a revolutionary method for representing and modelling atomistic materials. They go beyond traditional machine learning models that rely on predefined descriptors such as bond distances, angles, or local atomic environments. In contrast to descriptor-based models, GNNs naturally capture the complex and frequently non-linear connections between atoms by explicitly representing the graph-like arrangement of materials, in which atoms are treated as nodes and interatomic bonds as edges. GNNs utilize this graph-based representation to effectively learn the fundamental physical and chemical interactions, enabling more precise and reliable modelling of intricate material behaviours, which is essential for accurately predicting properties of new materials and for understanding phenomena at the quantum level. The ALIGNN (atomistic line graph neural network), proposed by Choudhary et al.,30 represents the latest achievement in this field. What sets ALIGNN apart from other GNN models is the inclusion of not only the atomic positions but also the associated bond lengths and bond angles as explicit input features. By incorporating this extra level of data, ALIGNN is able to accurately capture the intricate geometric and topological aspects of atomic structures, resulting in highly accurate predictions of material characteristics.
This manuscript examines the predictive accuracy of the ALIGNN approach, emphasizing its potential to greatly improve the overall understanding of work functions. The validity of our research is strengthened by thorough validation, performed on both synthetic and real-world datasets, which clearly demonstrates the model's excellent predictive ability. The findings highlight the model's potential as a fundamental tool for future study and practical applications, extending beyond work functions to a broad spectrum of materials properties.
In Fig. 2, we present a schematic representation of the ALIGNN model for predicting the work function. The first step is the conversion of the chemical structure into two separate graph representations. The first is a conventional molecular graph, in which atoms are depicted as nodes and bonds as edges. The second is a line graph, a more abstract depiction in which the bonds become nodes and the edges reflect the atoms shared between bonds. These graph representations are then fed into the ALIGNN model, which is specifically engineered to process both the graph and the line graph concurrently. These inputs allow the model to learn intricate linkages within the molecular structure, providing a comprehensive representation of both atomic interactions and bond connectivity. In the subsequent step, the processed data is used to predict the work function.
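The two representations described above can be sketched in a few lines of Python; the MoS2-like fragment and its bond list below are toy values for illustration, not structures drawn from C2DB:

```python
# Minimal sketch: build an atom graph and its line graph for a toy structure.
from itertools import combinations

# Atom graph: nodes are atoms, edges are bonds (pairs of atom indices).
atoms = ["Mo", "S", "S"]
bonds = [(0, 1), (0, 2)]          # edges of the atom graph

# Line graph: each bond becomes a node; two bond-nodes are connected
# when the original bonds share an atom (that pair encodes a bond angle).
line_nodes = list(range(len(bonds)))
line_edges = [
    (i, j)
    for i, j in combinations(line_nodes, 2)
    if set(bonds[i]) & set(bonds[j])  # shared atom => angle between bonds
]

print(line_edges)   # bonds (0,1) and (0,2) share atom 0 -> [(0, 1)]
```

Each edge of this line graph corresponds to one bond angle of the original structure, which is exactly the information ALIGNN adds on top of a conventional GNN.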
Fig. 3 illustrates a comprehensive flowchart of the methods used to construct a predictive model with the ALIGNN framework, designed specifically to predict the work function of materials. The initial stage is the acquisition of molecular data, which serves as the fundamental dataset. The molecular data is then processed through two separate channels: one that represents the data as a conventional graph, with atoms as nodes and bonds as edges, and another that generates a line graph, capturing more complex relationships by treating the bonds of the original graph as nodes. After the molecular data is represented in these two formats, an embedding procedure transforms both the graph and line-graph representations into a higher-dimensional space. This step is crucial for capturing the intricate, non-linear interactions present in the molecular structure. The higher-dimensional embeddings preserve the inherent duality of the data, maintaining both the graph and line-graph representations. Once the embedding phase is complete, the data is processed by the ALIGNN layers, which are built to learn efficiently from atomistic line graphs and thereby capture the complex interactions among atoms and their bonds that define materials properties. After the ALIGNN layers, average pooling is applied, consolidating the data across dimensions into a concise summary that encompasses the fundamental characteristics of the molecular structure.
Finally, the aggregated data is passed through a basic linear layer, which converts the consolidated features into a precise prediction: in this instance, the material's work function.
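The pipeline just described (embedding, message passing, average pooling, linear readout) can be sketched in plain Python. The toy dimensions, random weights, and simple mean aggregation below are assumptions standing in for ALIGNN's learned edge-gated convolutions; this is a schematic, not the actual implementation:

```python
# Schematic of Fig. 3: embed -> message passing -> average pool -> linear.
import random

random.seed(0)
DIM = 4                                   # toy hidden size (256 in the paper)

atoms = [42, 16, 16]                      # atomic numbers (e.g. Mo, S, S)
edges = [(0, 1), (0, 2), (1, 0), (2, 0)]  # directed atom-graph edges

# 1) Embedding: map each atomic number to a dense vector.
embed = {z: [random.uniform(-1, 1) for _ in range(DIM)] for z in set(atoms)}
h = [embed[z][:] for z in atoms]

# 2) Message passing: add the mean of neighbour features (two layers).
for _ in range(2):
    new_h = []
    for i in range(len(atoms)):
        nbrs = [h[s] for s, d in edges if d == i]
        agg = [sum(col) / len(nbrs) for col in zip(*nbrs)]
        new_h.append([hi + ai for hi, ai in zip(h[i], agg)])
    h = new_h

# 3) Average pooling over atoms -> one vector per structure.
pooled = [sum(col) / len(h) for col in zip(*h)]

# 4) Linear readout -> scalar work-function prediction (untrained weights).
w = [random.uniform(-1, 1) for _ in range(DIM)]
prediction = sum(wi * pi for wi, pi in zip(w, pooled))
print(round(prediction, 3))
```

With trained weights, step 4 would emit the predicted work function in eV; here the output is meaningless apart from illustrating the data flow.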
Fig. 3 Overall methodology utilized for developing our model, describing the architecture of the ALIGNN framework.
Table 1 provides a concise overview of the essential technical aspects of the ALIGNN model, emphasising its structure and settings. The model is composed of 2 ALIGNN layers and 2 EGCN layers; these layers are essential for capturing intricate interactions within the graph representations of molecules or materials. The hidden feature space is a 256-dimensional vector, enabling the model to learn and analyse the data efficiently. In addition, the bond lengths and bond angles are transformed through radial basis function (RBF) expansion, with output sizes of 80 and 40, respectively, which allows more complex patterns in the data to be captured. The RBF expansion embedding size is set to 128, encoding the expanded features into a fixed-size vector suitable for the model's processing. Understanding these specifications is crucial for assessing the model's ability to accurately predict material properties.
Table 1 ALIGNN model's technical details

| Parameter | Value |
| --- | --- |
| ALIGNN layers | 2 |
| EGCN layers | 2 |
| Hidden features | 256 |
| Bond length RBF expansion output size | 80 |
| Bond angle RBF expansion output size | 40 |
| RBF expansion embedding size | 128 |
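As an illustration of these settings, the sketch below collects them in a plain dictionary (the key names are illustrative and do not necessarily match the ALIGNN package's configuration schema) and shows one way a bond length could be expanded onto 80 radial basis functions; the Gaussian range and width are assumptions, not values from the paper:

```python
import math

# Table 1 collected as a plain dictionary; key names are illustrative.
config = {
    "alignn_layers": 2,
    "egcn_layers": 2,
    "hidden_features": 256,
    "bond_length_rbf_size": 80,   # RBF expansion of bond lengths
    "bond_angle_rbf_size": 40,    # RBF expansion of bond angles
    "rbf_embedding_size": 128,
}

def rbf_expand(x, vmin=0.0, vmax=8.0, bins=80):
    """Expand a scalar (e.g. a bond length in angstroms) onto `bins`
    Gaussian radial basis functions evenly centred on [vmin, vmax].
    The range and width here are assumed for illustration."""
    step = (vmax - vmin) / (bins - 1)
    centers = [vmin + i * step for i in range(bins)]
    gamma = 1.0 / step ** 2
    return [math.exp(-gamma * (x - c) ** 2) for c in centers]

feat = rbf_expand(2.4, bins=config["bond_length_rbf_size"])
print(len(feat))   # 80
```

The expansion turns a single bond length into a smooth 80-dimensional feature vector, which is what the embedding layer then maps into the model's hidden space.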
Fig. 4 shows a violin plot portraying the distribution of our dataset. The plot illustrates a symmetrical distribution centred around a work function of zero, and the work function dataset has a broad distribution that closely follows a normal distribution. Table 2 details the configuration of our model. The dataset originally contained 7208 data points; after removing null values, 4977 remain. These are divided into 3982 training (80% of the total), 497 testing, and 498 validation instances. For the model implementation, we used a k-nearest-neighbours (k = 12) approach with a mini-batch size of 64. To train the neural network, we employed a five-fold cross-validation strategy.
Table 2 Detailed configuration of our model

| Parameter | Value |
| --- | --- |
| Total data points | 4977 |
| Training set | 3982 |
| Testing set | 497 |
| Validation set | 498 |
| Neighbours | 12 |
| Mini batch size | 64 |
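The split sizes above follow directly from an 80/10/10 partition of the 4977 retained structures; the sketch below reproduces the arithmetic, with an illustrative seed and integer indices standing in for the actual data points:

```python
import random

# Reproduce the split sizes in Table 2; the seed is illustrative and the
# indices stand in for the 4977 structures left after dropping null values.
random.seed(42)
indices = list(range(4977))
random.shuffle(indices)

n_train = round(0.8 * len(indices))   # 80% -> 3982
n_test = int(0.1 * len(indices))      # 10%, floored -> 497
train = indices[:n_train]
test = indices[n_train:n_train + n_test]
val = indices[n_train + n_test:]      # remainder -> 498

print(len(train), len(test), len(val))   # 3982 497 498
```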
Fig. 5 presents a detailed view of the ALIGNN model's performance, with a parity plot in the left panel and training/validation loss curves in the right panel. The parity plot compares the predicted values against the actual values for both the training and test datasets. The bulk of the data points closely follow the diagonal reference line (orange dashed line), indicating that the model's predictions nearly match the true values throughout the dataset. This alignment shows that the model generalizes well and has successfully captured the fundamental patterns in the data, yielding minimal discrepancy between predicted and actual values.
Fig. 5 (left) Actual vs. predicted values showing their correlation. (right) Training and validation loss for ALIGNN.
The right panel of Fig. 5 displays the evolution of the training and validation losses over 700 epochs. At the beginning of training, the validation loss fluctuates noticeably, indicating that the model's ability to generalise is still unstable. As training advances, both losses progressively diminish, eventually reaching a stable region. By around 700 epochs, the gap between the training and validation losses narrows, suggesting that the model has successfully mitigated overfitting. At this point, the validation loss plateaus, indicating that additional training would not yield substantial improvements but would instead raise the risk of overfitting.
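The plateau behaviour described above is what a patience-based early-stopping rule detects. The paper does not state that such a rule was used, so the function below is only an illustrative sketch with an assumed patience value:

```python
# Illustrative patience-based early stopping; patience value is assumed.
def stop_epoch(val_losses, patience=50):
    """Return the epoch at which training would stop: the first epoch at
    which the validation loss has not improved for `patience` epochs."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(val_losses) - 1

# Synthetic loss curve: improves for 100 epochs, then plateaus slightly higher.
losses = [1.0 / (e + 1) for e in range(100)] + [0.02] * 200
print(stop_epoch(losses, patience=50))   # 149
```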
The ALIGNN model ultimately attains a mean absolute error (MAE) of 0.20 eV on the test set, surpassing alternative models. By comparison, a random forest model produces an MAE of 0.27 eV, whilst a conventional artificial neural network (ANN) yields an MAE of 0.25 eV. The ALIGNN model's lower MAE indicates improved predictive accuracy for this task, showcasing its capacity to capture intricate material features more efficiently. These results indicate that ALIGNN is a reliable choice for predicting work functions, demonstrating superior accuracy and generalization compared to traditional machine learning models.
During the ALIGNN model setup, we observed a validation loss of 0.0960 eV, which serves as a crucial measure of the model's capacity to adapt to new data. This comparatively low validation loss indicates that the model captured the fundamental patterns in the training data without severe overfitting. In addition, the model obtained a training loss of 0.0045 eV, indicating a high level of competency in learning from the training data. Taken together, the training and validation losses suggest that the model learns robustly while maintaining a useful level of generalisation, which is essential for the effectiveness of machine learning models.
In addition, the mean absolute error (MAE) was measured at 0.2033 eV. The MAE quantifies the average magnitude of the errors between the predicted and actual values, regardless of their direction, and thus offers a straightforward measure of the model's accuracy: on average, the predictions differ by around 0.2033 eV from the actual work functions. The precision achieved in this study is comparable to current norms in the area, especially for models that predict energy attributes derived from first-principles calculations.
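For clarity, the MAE reported here is computed as follows; the work-function values in the example are hypothetical, purely for illustration:

```python
# Mean absolute error, the metric behind the 0.20 eV figure reported above.
def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical work functions in eV.
actual = [4.5, 5.1, 3.9, 4.8]
predicted = [4.3, 5.3, 4.0, 4.6]
print(round(mae(actual, predicted), 3))   # 0.175
```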
The residual plot in Fig. 6 provides additional confirmation of these findings. Residuals, the discrepancies between the observed and model-predicted work functions, serve as a crucial diagnostic for assessing model performance. Around 60% of the data points in the test set exhibited residuals within ±0.20 eV. This is important because it indicates that most of the model's predictions closely match the actual values, falling within a small margin of error. The 0.20 eV threshold is chosen deliberately, as it corresponds to the level of accuracy commonly attained by machine learning models that predict properties such as adsorption energies, nanostructure stabilities, and electronic band gaps, properties frequently derived from density functional theory (DFT) or comparable first-principles techniques.31,32
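The 60% figure corresponds to a computation of this form; the arrays below are illustrative stand-ins for the actual test-set values:

```python
# Fraction of residuals within a threshold (the paper reports ~60% at 0.20 eV).
def within_threshold(actual, predicted, thr=0.20):
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= thr)
    return hits / len(actual)

# Hypothetical work functions in eV.
actual = [4.5, 5.1, 3.9, 4.8, 5.6]
predicted = [4.4, 5.5, 3.8, 4.7, 5.0]
print(within_threshold(actual, predicted))   # 0.6
```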
Fig. 6 Visualization of absolute differences between the actual and predicted values on the validation dataset using both models. |
One important component of our study is the novel use of graph properties in the ALIGNN framework, which sets our work apart from earlier studies; previous models did not include graph-based features. We believe that this technique improves the model's performance, especially when dealing with intricate porous materials, and has the capacity to surpass conventional models in forecasting the characteristics of such materials, indicating that further investigation in this direction could produce even more precise and dependable predictive models. This development could have a substantial impact on the field by offering a fresh approach to constructing machine learning models that accurately forecast the characteristics of advanced materials.
ALIGNN enhances this capability by integrating bond-level specifics via a dual representation that includes a secondary "line graph", wherein each bond in the primary atomic graph is treated as a node. This secondary network represents bond angles as links between bond-nodes, enabling ALIGNN to capture more complex geometric and topological features such as bond lengths and angles. This additional precision is essential for accurately predicting properties like the work function, which depends strongly on the material's electronic and atomic structure. The sensitivity of the work function to bonding environments and bond angles is crucial, as these elements directly affect electron distribution and Fermi levels. By learning both atomic- and bond-level data, ALIGNN improves prediction accuracy, surpassing traditional feature-based machine learning models and typical graph neural networks.
Feature-based machine learning models often necessitate comprehensive, domain-specific feature engineering to achieve generalization across materials, but ALIGNN's graph-based architecture directly learns these dependencies from the data. Conventional GNNs capture atomic configurations but do not incorporate bond-angle information, which ALIGNN integrates, rendering ALIGNN particularly effective for intricate materials such as 2D structures. The dual graph methodology renders ALIGNN exceptionally applicable in material science, where minor structural variations significantly influence surface properties, providing an advanced model architecture that is both versatile and highly precise across various configurations and compositions. This architecture enhances ALIGNN as a powerful tool for the discovery and prediction of electronic properties, including the work function, across a diverse range of 2D materials.
A further potential enhancement entails integrating ALIGNN with reinforcement learning or optimization algorithms to proactively direct material design. By establishing target work function ranges, such hybrid models could facilitate the discovery or engineering of materials with specified electronic properties tailored for certain applications. Furthermore, augmenting the training dataset with a broader array of 2D material systems may enhance the generalizability of ALIGNN across various chemical compositions, hence increasing its effectiveness in high-throughput materials screening initiatives.
This journal is © The Royal Society of Chemistry 2024