Page 14 of 21                         Chen et al. J Mater Inf 2022;2:19  https://dx.doi.org/10.20517/jmi.2022.23

               optimal compositions with high activity. Combining HT DFT calculations, ML, data-guided combinatorial
               synthesis and HT characterization, these works demonstrate an efficient methodology for HT closed-loop
               materials design in the rising field of HEA catalysts.


               ML models of CO₂RR on HEA catalysts
               The ever-increasing global demand for energy and the need to replace CO₂-emitting fossil fuels with
               renewable sources have driven interest in energy conversion and storage. In particular, the electrochemical
               reduction of CO₂ to chemical feedstocks is a hot topic because it is closely tied to both CO₂ removal and
               renewable energy generation. To accelerate catalyst discovery for the CO₂RR, Zhong et al. developed an
               ML-accelerated HT DFT framework and explored 12,229 surfaces and 228,969 adsorption sites on 244
               copper-containing intermetallic crystals [85]. This work illustrates the significance of computation and ML for
               exploring multi-metallic systems in experiments. By combining DFT with supervised ML, Pedersen et al.
               presented a strategy for the probabilistic and unbiased discovery of high-performance CO₂RR catalysts on
               disordered CoCuGaNiZn and AgAuCuPdPt HEAs [86]. Gaussian process regressors were trained on
               hundreds of DFT-calculated adsorption energies of CO* (on-top site) and H* (hollow site) on the (111)
               surfaces of CoCuGaNiZn and AgAuCuPdPt, as illustrated in Figure 10A-F. The prediction errors of the
               Gaussian process regressors are normally distributed and similar to those obtained in cross-validation.
               As seen in Figure 10A-F, most predictions fall within the dotted lines (± 0.1 eV deviation from the DFT
               values), which indicates that the Gaussian process regressors successfully capture the essential features
               of the chemical environment of the adsorption sites. The learning curves, which relate the prediction
               error to the number of training samples, confirm that the prediction error of the Gaussian process
               regressors has converged for the current number of DFT-calculated adsorption energies.
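
               The learning-curve check described above can be sketched in a few lines. The example below is a
               minimal illustration only, not the workflow of Pedersen et al.: the data are synthetic stand-ins for
               DFT adsorption energies, the per-element weights in WEIGHTS, the nine-neighbor composition features
               and the kernel settings are all assumptions, and a small hand-written Gaussian process (RBF kernel,
               constant mean) replaces any production library. It reports the test MAE and the fraction of
               predictions within ± 0.1 eV as the training set grows.

```python
import math
import random

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def gp_predict(X_train, y_train, X_test, noise=1e-3):
    """Posterior mean of a GP with an RBF kernel and a constant prior mean."""
    mean = sum(y_train) / len(y_train)
    K = [[rbf_kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X_train)] for i, a in enumerate(X_train)]
    alpha = solve_linear(K, [y - mean for y in y_train])
    return [mean + sum(rbf_kernel(x, xt) * w for xt, w in zip(X_train, alpha))
            for x in X_test]

rng = random.Random(0)
# Hypothetical per-element contributions to the adsorption energy (eV).
WEIGHTS = [-0.9, -0.5, -0.1, 0.2, 0.4]

def make_site():
    """Synthetic site: fractions of 5 elements among 9 neighbors, noisy energy."""
    counts = [0] * 5
    for _ in range(9):
        counts[rng.randrange(5)] += 1
    features = [c / 9 for c in counts]
    energy = sum(w * f for w, f in zip(WEIGHTS, features)) + rng.gauss(0.0, 0.02)
    return features, energy

train = [make_site() for _ in range(80)]
test = [make_site() for _ in range(50)]
X_test = [f for f, _ in test]
y_test = [e for _, e in test]

# Learning curve: test error as a function of the number of training samples.
learning_curve = []
for n in (10, 20, 40, 80):
    X = [f for f, _ in train[:n]]
    y = [e for _, e in train[:n]]
    errors = [abs(p - t) for p, t in zip(gp_predict(X, y, X_test), y_test)]
    mae = sum(errors) / len(errors)
    within = sum(e <= 0.1 for e in errors) / len(errors)
    learning_curve.append((n, mae, within))
    print(f"n_train={n:3d}  MAE={mae:.3f} eV  within 0.1 eV: {within:.0%}")
```

               A curve that flattens as n_train grows suggests that adding more calculations of the same kind
               would bring little further gain, which is the sense in which the prediction error is called
               converged.
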
               For an ML model, the input features are of vital importance for the precision and generality of the
               model. More importantly, understanding the structure-activity relationships of HEA catalysts is
               essential. To address this aspect, Roy et al. applied the permutation importance module implemented in
               the scikit-learn library for Python to quantify the contribution of each input feature to the output,
               as depicted in Figure 10G [53]. To determine the correlations among the input features, a correlation
               matrix was generated, so that highly correlated features could be eliminated to reduce the
               dimensionality of the data set. Moreover, the correlation of each metal from every region with the
               corresponding adsorption energy can be readily obtained and analyzed through the feature importance.
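
               The two diagnostics above, permutation importance and a feature correlation matrix, can be
               illustrated with a small self-contained sketch. It is not the code of Roy et al.: it reimplements
               the idea with the Python standard library instead of calling scikit-learn, the feature names are
               hypothetical, the data are synthetic, and the function model is a fixed linear rule standing in for
               a trained regressor.

```python
import random

rng = random.Random(0)
# Hypothetical feature names; the last one nearly duplicates the first.
FEATURES = ["f_site", "f_neighbor", "f_unused", "f_site_copy"]

def model(row):
    """Stand-in for a trained regressor (a fixed linear rule, not a fitted model)."""
    return 0.8 * row[0] + 0.2 * row[1]

# Synthetic data set: targets follow the rule above plus small noise.
X, y = [], []
for _ in range(200):
    a, b, c = rng.random(), rng.random(), rng.random()
    row = [a, b, c, a + rng.gauss(0.0, 0.05)]
    X.append(row)
    y.append(model(row) + rng.gauss(0.0, 0.02))

def mae(rows, targets):
    """Mean absolute error of the stand-in model on the given rows."""
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(targets)

def permutation_importance(rows, targets, n_repeats=10):
    """Average increase in MAE when one feature column is shuffled."""
    baseline = mae(rows, targets)
    scores = []
    for j in range(len(rows[0])):
        increase = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increase += mae(shuffled, targets) - baseline
        scores.append(increase / n_repeats)
    return scores

def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

importance = permutation_importance(X, y)
columns = [[r[j] for r in X] for j in range(len(FEATURES))]
corr = [[pearson(ci, cj) for cj in columns] for ci in columns]

# Feature pairs correlated strongly enough that one of them could be dropped.
redundant = [(i, j) for i in range(len(FEATURES))
             for j in range(i + 1, len(FEATURES)) if abs(corr[i][j]) > 0.9]

for name, score in zip(FEATURES, importance):
    print(f"{name:12s} importance = {score:.3f} eV")
print("redundant pairs:", [(FEATURES[i], FEATURES[j]) for i, j in redundant])
```

               In this toy setting the near-duplicate feature receives no importance because the stand-in model
               never uses it, while the correlation matrix still flags it as redundant. The two diagnostics
               therefore answer different questions and are best read together.
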

               Descriptors in ML models of HEA catalysts
               The key to constructing an ML model is designing effective descriptors, which is even more important
               for HEA catalysts because of their complex active sites. Appropriate descriptors used as input features
               for an ML model should be obtainable directly from databases or from the simplest DFT calculations and
               should include sufficient information on the surface active sites. Some approaches, such as
               coordination atom fingerprints (CAFs), Coulomb matrices, the spectrum of London and
               Axilrod-Teller-Muto (SLATM), elemental properties and SLATM (EP & SLATM), smooth overlap of atomic
               positions, Voronoi connectivity-based crystal graph, labeled site crystal graph (LSCG) and FCHL19, have
               recently been reported [87-94]. Li et al. used elemental groups and periods (GP) to replace atomic
               numbers in the FCHL19, LSCG, Atomic Number and Coordination Number (ANCN) and CAF representations,
               which effectively improved the prediction of adsorption energies on alloys [95]. This strategy
               effectively enables ML models to learn from the periodic table. The adsorption energy MAE improves by
               up to ~0.2 eV compared with that obtained using the original ANCN, CAF, FCHL19 and LSCG
               representations. In particular, for the GP-LSCG representation, the MAE is 0.05 eV (near chemical
               accuracy) in predicting hydrogen adsorption and ~0.1 eV for other strong binding adsorbates (C*, N*,
               O* and S*). Although this work mainly focuses on bimetallic alloy systems, it has the potential to be
               extended to HEA catalysts, which has been verified by another research group, who proposed a
               transferable ML model by considering the intrinsic properties of substrates and adsorbates [96].
               Simply training the