
Mooraj et al. J Mater Inf 2023;3:4  https://dx.doi.org/10.20517/jmi.2022.41      Page 7 of 45

               field of high-throughput computational studies is extremely wide and covers too many topics to discuss
               succinctly. As such, the discussion of computational methods is limited to studies focused on phase
               formation and mechanical properties of HEAs to illustrate the potential advantages and disadvantages of
               the previously mentioned methods.


               Machine learning
Machine learning (ML) is a powerful computational tool to rapidly explore vast design spaces through statistical methods[78]. These methods include artificial neural networks (ANNs), support vector machines (SVMs), and decision trees, which can often be used to quantitatively predict material properties such as hardness[79] or to predict qualitative factors such as the expected phases of a given alloy composition[80]. Over the past decade, as computational power has continued to increase, there has been an explosion in the topics of machine learning and big data[81]. Machine learning methods have an extremely high potential to handle large databases due to their statistical nature. This section includes examples from the literature of various ML techniques and methods that are representative of the state-of-the-art results achieved in the field.
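To make the idea concrete, the sketch below trains a decision tree to classify a candidate alloy as single-phase or multi-phase from two simple composition descriptors. The data, the descriptor names (atomic-size mismatch and mixing entropy), and the decision rule are synthetic assumptions for illustration only, not values from the studies cited above.

```python
# Minimal sketch: decision-tree phase classification from synthetic
# composition descriptors (illustrative assumptions, not real alloy data).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical descriptors: atomic-size mismatch delta (%) and mixing
# entropy S_mix (J/mol K). Assume small mismatch plus high entropy
# favors a single solid-solution phase (label 1).
n = 200
delta = rng.uniform(0.0, 10.0, n)
s_mix = rng.uniform(8.0, 16.0, n)
y = ((delta < 6.0) & (s_mix > 11.0)).astype(int)

X = np.column_stack([delta, s_mix])
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Query a low-mismatch, high-entropy candidate composition.
pred = clf.predict([[3.0, 14.0]])[0]
print(pred)
```

Because the synthetic label is a simple conjunction of two thresholds, a shallow axis-aligned tree recovers it almost exactly; real phase-formation data are far noisier and motivate the feature-engineering care discussed below.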


ML techniques are capable of predicting the structure and properties of various alloys in reasonably short periods. However, this predictive capability is largely dependent on the size and quality of the training data, a thorough consideration of appropriate input variables (also known as feature engineering), and the choice of ML model[82]. Typically, robust databases of training data only exist for materials that have been well studied, such as the Ni-Ti-Hf shape memory alloy (SMA) systems[83]. For example, Liu et al. developed Gaussian process regression (GPR) models to estimate thermal parameters related to the martensite and austenite finish temperatures in a Ni-Ti-Hf alloy system to design an SMA[83]. The predicted parameters were described as T_avg = (A_f + M_f)/2 and ΔT = A_f - M_f, where A_f and M_f are the austenite finish and martensite finish temperatures, respectively. The value of T_avg represents the average of the austenite finish and martensite finish temperatures and thus illustrates the temperature region where an SMA is expected to transform. Tuning this range can be useful in aerospace applications where autonomous actuation can be induced by the temperature difference of the surroundings between take-off (typically 275 K) and cruising (usually 215 K)[83]. On the other hand, ΔT represents the total temperature range between the austenite finish and martensite finish temperatures, indicating the hysteresis during the transformation. A low ΔT can lead to more efficient actuation when the martensitic and austenitic phase transformations are activated.
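A minimal sketch of the GPR approach described above is given below, trained on synthetic data. The composition features (Ni and Hf fractions) and the assumed linear dependence of the transformation temperature on composition are illustrative stand-ins, not the actual inputs or values used by Liu et al.

```python
# Sketch: Gaussian process regression of an SMA transformation
# temperature from composition (synthetic data, assumed relationship).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)

# Hypothetical inputs: Ni and Hf contents (at.%) of a Ni-Ti-Hf alloy.
X = rng.uniform([48.0, 5.0], [52.0, 25.0], size=(40, 2))

# Assumed smooth dependence of T_avg (K) on composition, plus noise.
t_avg = 300.0 + 4.0 * (X[:, 1] - 15.0) - 6.0 * (X[:, 0] - 50.0)
t_avg += rng.normal(0.0, 2.0, len(t_avg))

# RBF kernel for the smooth trend, WhiteKernel for measurement noise.
kernel = RBF(length_scale=5.0) + WhiteKernel(noise_level=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, t_avg)

# Mean prediction and uncertainty for a new candidate composition.
mean, std = gpr.predict(np.array([[50.0, 15.0]]), return_std=True)
print(round(float(mean[0]), 1), round(float(std[0]), 1))
```

A practical advantage of GPR here is the predictive standard deviation returned alongside the mean, which lets a design loop target compositions where the model is most uncertain.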

As previously mentioned, an essential aspect of building an ML model is the determination of the input variables that will most accurately predict the output variables. Typically, adding more input variables can improve the model's accuracy, as variables that do not correlate strongly with the output variables are de-emphasized during training. However, using too many input variables increases the dimensionality of the model, making it computationally expensive to execute. Additionally, the solution space formed by many input variables can often contain local minima that require many iterations to escape. For this reason, Liu et al. initially started with 48 input variables based on the relevance of those variables to the physical processes involved in martensitic and austenitic phase transformations[83]. These chosen features included fundamental atomic properties (e.g., atomic radius, atomic number, relative atomic mass, etc.), thermal properties (e.g., melting point, boiling point, heat of fusion, thermal conductivity, etc.), overall alloy compositions, electronic configurations, and process conditions (e.g., solution temperature, aging temperature, etc.). This variable space was refined via mutual information (MI) and Pearson correlation (PC). MI indicates the dependency of the output variable on the input variables, which ensures that only the most impactful variables are used. In contrast, the PC between two variables illustrates their correlation. Input variables strongly correlated to each other produce redundant information.
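The MI/PC screening step can be sketched as follows. The four candidate features (an atomic radius, a melting point, a mass proxy that nearly duplicates the radius, and pure noise) and the target relationship are synthetic assumptions, not the 48 features of the actual study.

```python
# Sketch: rank candidate inputs by mutual information with the target,
# then flag redundant pairs via Pearson correlation (synthetic features).
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(2)
n = 500

radius = rng.normal(1.4, 0.1, n)              # informative feature
melting_pt = rng.normal(1600.0, 100.0, n)     # informative feature
mass = 100.0 * radius + rng.normal(0, 1, n)   # nearly duplicates radius
noise = rng.normal(0.0, 1.0, n)               # irrelevant feature

# Assumed target: depends on radius and melting point only.
y = 5.0 * radius + 0.01 * melting_pt + rng.normal(0.0, 0.1, n)
X = np.column_stack([radius, melting_pt, mass, noise])

# Mutual information: keep features that carry signal about y.
mi = mutual_info_regression(X, y, random_state=0)

# Pearson correlation between inputs: |r| near 1 marks redundancy,
# so one member of the (radius, mass) pair could be dropped.
r_radius_mass = np.corrcoef(X[:, 0], X[:, 2])[0, 1]
print(mi.round(2), round(r_radius_mass, 2))
```

In this toy setup the noise feature scores near-zero MI while the radius/mass pair shows a Pearson correlation close to 1, reproducing the two pruning criteria described above.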