Page 8 of 15                         Wu et al. J. Mater. Inf. 2025, 5, 15  https://dx.doi.org/10.20517/jmi.2024.67

                Figure 3. (A) The model architecture of AGAT. The top panel denotes the AGAT layer, and the bottom panel denotes the AGAT model;
                (B) The interpretability of the AGAT model. The attention scores of the energy and forces models compared with the energy and forces
                variations; (C) ML-predicted vs. DFT-calculated adsorption enthalpies in 5-fold cross-validation using RBF-GPR, WWL-GPR, and
                XGBoost for the simple adsorbates database, respectively; (D) The model architecture of WWL-GPR. The adsorption enthalpy for the
                relaxed structure is predicted by representing the initial structure as a graph. Node attributes are calculated based on the gas-phase
                molecule and the pristine surface. The similarity between graphs is assessed using the WWL graph kernel, and this information is then
                used in a GPR model. Reproduced with permission from refs [78,79]. Copyright 2023 Elsevier and Copyright 2022 Springer Nature,
                respectively. AGAT: Atomic graph attention; ML: machine learning; DFT: density-functional theory; RBF-GPR: radial basis function and
                Gaussian process regression; WWL-GPR: Wasserstein Weisfeiler-Lehman graph kernel and GPR.

               Kernel-based models
               Gaussian process regression (GPR) is a kernel-based ML technique that models complex relationships
               between variables by treating the regression function as a random process governed by prior probability
               distributions. Because it is a probabilistic method, GPR returns not only a predicted value but also the
               uncertainty of that prediction, computing both mean estimates and confidence intervals [82]. As a result,
               GPR has been widely used to predict surface phase diagrams, model MLIPs, etc. [83,84].
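
               The way GPR yields both a mean estimate and an uncertainty can be illustrated with a minimal NumPy
               sketch. It is not taken from the cited works: the RBF kernel, its hyperparameters, the noise level, the
               toy sine data, and the function names rbf_kernel and gpr_predict are all assumptions chosen for
               illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel between two sets of 1-D inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gpr_predict(x_train, y_train, x_test, noise=1e-6, **kernel_args):
    """Return the GPR posterior mean and standard deviation at x_test."""
    # Kernel matrices: train/train (with noise on the diagonal), train/test, test/test.
    k_tt = rbf_kernel(x_train, x_train, **kernel_args) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test, **kernel_args)
    k_ss = rbf_kernel(x_test, x_test, **kernel_args)

    # Cholesky factorisation gives a numerically stable solve of k_tt @ alpha = y_train.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))

    # Posterior mean: k_ts^T k_tt^{-1} y_train.
    mean = k_ts.T @ alpha

    # Posterior variance: diag(k_ss) - diag(k_ts^T k_tt^{-1} k_ts).
    v = np.linalg.solve(chol, k_ts)
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


# Toy data (illustrative only, not from the paper).
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.sin(x_train)
x_test = np.array([1.0, 1.5, 6.0])  # a training point, an interpolation, an extrapolation

mean, std = gpr_predict(x_train, y_train, x_test)
for x, m, s in zip(x_test, mean, std):
    print(f"x = {x:.1f}: mean = {m:.3f}, std = {s:.3f}")
```

               In this sketch the predicted standard deviation is close to zero at a training point and grows towards
               the prior value far from the data, which is the behaviour that makes GPR useful for flagging
               unreliable predictions.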

               Xu et al. proposed a data-efficient model, the Wasserstein Weisfeiler-Lehman graph kernel combined with
               GPR (WWL-GPR), to predict the binding motifs and adsorption enthalpies of various adsorbates on transition
               metals (TMs) and their alloys [85,86]. It was trained on a TM dataset covering Cu, Rh, Pd, and Co and
               achieved a test root mean square error (RMSE) of 0.2 eV. Figure 3C shows that WWL-GPR outperforms
               RBF-GPR and XGBoost for 41 complex adsorbates in various adsorption motifs on Cu, Co, Pd, and Rh
               surfaces. Its accuracy is comparable to that of DFT calculations, which is attributed to node attributes
               that combine geometrical information from the graph representation with easily accessible physics-informed
               attributes [e.g., d-band moments and highest occupied molecular orbital/lowest unoccupied molecular
               orbital (HOMO/LUMO) energy levels]. Node embeddings are then generated, and the Wasserstein distance
               between their distributions is calculated. Based on the resulting Wasserstein distance-based graph
               similarities, GPR is used to predict adsorption enthalpies [Figure 3D]. Understanding