Page 2 of 14   Li et al. J Mater Inf 2024;4:4 | http://dx.doi.org/10.20517/jmi.2023.41



INTRODUCTION
Machine learning (ML) has found widespread applications in the field of materials science and
engineering [1,2]. Researchers such as Behler and Parrinello have utilized atomic neural networks to
learn total energy, which has been instrumental in developing interatomic potentials [3]. Moreover,
ML techniques, such as SchNet [4], Crystal Graph Convolutional Neural Network (CGCNN) [5], and
Atomistic Line Graph Neural Network (ALIGNN) [6], have been employed to establish relationships
between atomic structures and their properties. These methods have been used to predict up to 50
different characteristics of crystals and molecular materials, including formation energy and
electronic band gaps. Additionally, deep learning (DL) techniques
have been leveraged in various applications to identify chemically feasible spaces. For instance,
Bayesian optimization, in conjunction with MEGNet as the energy evaluator, has been employed for
direct structure relaxation [7]. To further enhance the performance, BOWSR incorporates symmetry
relaxation alongside Bayesian optimization [8]. Accurate characterization of localized
physicochemical properties is of paramount
importance in numerous scientific and engineering disciplines. In fields ranging from
electrochemical catalysis, sensors, carbon capture, and energy storage and conversion to drug
delivery, exploring and exploiting the intricacies of local environments is at the heart of many
frontier investigations. For instance, adsorption is a pervasive surface phenomenon in areas such as
electrocatalysis, and its understanding is rooted in foundational theories such as bonding and
adsorption thermodynamics [9]. Crucially, the influence of neighboring atoms on adsorption sites
must be fully accounted for, as factors including atomic electronic structures, spatial constraints,
surface stoichiometry, and surface defects can all affect the behavior of adsorbates on
surfaces [10].

Recently, DL models have found applications in the catalysis realm, as demonstrated by various
end-to-end graph neural network models developed in the Open Catalyst Project (OCP) challenge,
encompassing equivariant Euclidean Neural Networks (e3nn), Spherical Channel Network (SCN), and
Equivariant Spherical Channel Network (eSCN), among others [11–17]. While these models showcase
stellar performance on the data-rich
OC20 database [18], their computational complexity often leads to overfitting on smaller
datasets [19–22]. Specifically, unlike single-metal materials, multi-metal alloy catalysts exhibit
excellent physical and chemical properties in the field of nanoparticles [23]. However, for complex
adsorption systems of this type, such as those involving large organic molecules and certain
transition metal oxides, accurate Density Functional Theory (DFT) computations are challenging,
resulting in a paucity of reliable data [24,25]. Currently, graph neural networks based on
global information are designed to capture the topology of the data, making them well-suited for
processing small- to medium-scale molecular and crystalline materials, where local connectivity does
not increase significantly with system size. However, for large systems, the number of edges and
nodes in the graph network increases dramatically, resulting in significant performance degradation.
Furthermore, it is difficult for the graph network to distinguish which information is critical.
Extracting local information is therefore critical to predicting the adsorption energy. The
structural diversity of large nanoparticles or two-dimensional (2D) materials is also higher, and it
is difficult for graph networks to handle the complexity caused by structural differences. In
addition, implementing effective boundary condition processing in graph networks is also a
challenge. The Adsorbate Chemical Environment-based Graph Convolution Neural Network (ACE-GCN)
endeavors to convert
each surface-adsorbate configuration into a graph representation, which is initially split into
subgraphs to explicitly account for the local chemical and structural environment of the
adsorbate [26]. This approach suffers from weak interpretability and has difficulty capturing
interactions in complex molecular systems. The segmentation of subgraphs has limitations and
uncertainties, making it difficult to generalize to more diverse systems such as 2D materials.
Employing ML
approaches offers advantages such as heightened interpretability, reduced data dependency,
flexibility in designing and selecting features tailored to specific problems, and capabilities in
prediction interpretation and error analysis [27–32]. Earlier studies proposed numerous feature
engineering descriptors to enhance the prediction of ML models for adsorption energy, encompassing
atomic number, ionization energy, electronegativity, ionic radius, and inter-atomic
interactions [33–35]. The reported descriptors are equally pertinent to the field
               of electrocatalysis [36] . Yet, adequately representing metal compound adsorption sites remains a major chal-