
Mooraj et al. J Mater Inf 2023;3:4  https://dx.doi.org/10.20517/jmi.2022.41      Page 7 of 45

               field of high-throughput computational studies is extremely wide and covers too many topics to discuss
               succinctly. As such, the discussion of computational methods is limited to studies focused on phase
               formation and mechanical properties of HEAs to illustrate the potential advantages and disadvantages of
               the previously mentioned methods.


               Machine learning
Machine learning (ML) is a powerful computational tool to rapidly explore vast design spaces through statistical methods[78]. These methods include artificial neural networks (ANNs), support vector machines (SVMs), and decision trees, which can often be used to quantitatively predict material properties such as hardness[79] or to predict qualitative factors such as the expected phases of a given alloy composition[80]. Over the past decade, as computational power has continued to increase, there has been an explosion in the topics of machine learning and big data[81]. Machine learning methods have an extremely high potential to handle large databases due to their statistical nature. This section includes examples from the literature of various ML techniques and methods that are representative of the state-of-the-art results achieved in the field.
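To make the idea concrete, the sketch below trains a decision tree to classify a candidate alloy as single-phase or multi-phase from two simple composition descriptors. The data, the descriptor names (atomic-size mismatch and mixing entropy), and the decision rule are synthetic assumptions for illustration only, not values from the studies cited above.

```python
# Minimal sketch: decision-tree phase classification from synthetic
# composition descriptors (illustrative assumptions, not real alloy data).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical descriptors: atomic-size mismatch delta (%) and mixing
# entropy S_mix (J/mol K). Assume small mismatch plus high entropy
# favors a single solid-solution phase (label 1).
n = 200
delta = rng.uniform(0.0, 10.0, n)
s_mix = rng.uniform(8.0, 16.0, n)
y = ((delta < 6.0) & (s_mix > 11.0)).astype(int)

X = np.column_stack([delta, s_mix])
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Query a low-mismatch, high-entropy candidate composition.
pred = clf.predict([[3.0, 14.0]])[0]
print(pred)
```

Because the synthetic label is a simple conjunction of two thresholds, a shallow axis-aligned tree recovers it almost exactly; real phase-formation data are far noisier and motivate the feature-engineering care discussed below.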


ML techniques are capable of predicting the structure and properties of various alloys in reasonably short periods. However, this predictive capability is largely dependent on the size and quality of the training data, a thorough consideration of appropriate input variables (also known as feature engineering), and the choice of ML model[82]. Typically, robust databases of training data only exist for materials that have been well studied, such as the Ni-Ti-Hf shape memory alloy (SMA) systems[83]. For example, Liu et al. developed Gaussian process regression (GPR) models to estimate thermal parameters related to the martensite and austenite finish temperatures in a Ni-Ti-Hf alloy system to design an SMA[83]. The predicted parameters were described as T_avg = (A_f + M_f)/2 and ΔT = A_f - M_f, where A_f and M_f are the austenite finish and martensite finish temperatures, respectively. The value of T_avg represents the average of the austenite finish and martensite finish temperatures and thus illustrates the temperature region where an SMA is expected to transform. Tuning this range can be useful in aerospace applications where autonomous actuation can be induced by the temperature difference of the surroundings between take-off (typically 275 K) and cruising (usually 215 K)[83]. On the other hand, ΔT represents the total temperature range between the austenite finish and martensite finish temperatures, indicating the hysteresis during the transformation. A low ΔT can lead to more efficient actuation when the martensitic and austenitic phase transformations are activated.
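A minimal sketch of the GPR approach described above is given below, trained on synthetic data. The composition features (Ni and Hf fractions) and the assumed linear dependence of the transformation temperature on composition are illustrative stand-ins, not the actual inputs or values used by Liu et al.

```python
# Sketch: Gaussian process regression of an SMA transformation
# temperature from composition (synthetic data, assumed relationship).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)

# Hypothetical inputs: Ni and Hf contents (at.%) of a Ni-Ti-Hf alloy.
X = rng.uniform([48.0, 5.0], [52.0, 25.0], size=(40, 2))

# Assumed smooth dependence of T_avg (K) on composition, plus noise.
t_avg = 300.0 + 4.0 * (X[:, 1] - 15.0) - 6.0 * (X[:, 0] - 50.0)
t_avg += rng.normal(0.0, 2.0, len(t_avg))

# RBF kernel for the smooth trend, WhiteKernel for measurement noise.
kernel = RBF(length_scale=5.0) + WhiteKernel(noise_level=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, t_avg)

# Mean prediction and uncertainty for a new candidate composition.
mean, std = gpr.predict(np.array([[50.0, 15.0]]), return_std=True)
print(round(float(mean[0]), 1), round(float(std[0]), 1))
```

A practical advantage of GPR here is the predictive standard deviation returned alongside the mean, which lets a design loop target compositions where the model is most uncertain.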

As previously mentioned, an essential aspect of building an ML model is the determination of the input variables that will most accurately predict the output variables. Typically, adding more input variables can improve the model's accuracy, as variables that do not correlate strongly with the output variables are de-emphasized during training. However, using too many input variables increases the dimensionality of the model, making it computationally expensive to execute. Additionally, the solution space formed by many input variables can often contain local minima that require many iterations to escape. For this reason, Liu et al. initially started with 48 input variables based on the relevance of those variables to the physical processes involved in martensitic and austenitic phase transformations[83]. These chosen features included fundamental atomic properties (e.g., atomic radius, atomic number, relative atomic mass, etc.), thermal properties (e.g., melting point, boiling point, heat of fusion, thermal conductivity, etc.), overall alloy compositions, electronic configurations, and process conditions (e.g., solution temperature, aging temperature, etc.). This variable space was refined via mutual information (MI) and Pearson correlation (PC). MI indicates the dependency of the output variable on the input variables, which ensures that only the most impactful variables are used. In contrast, the PC between two variables illustrates their correlation. Input variables strongly correlated to each other produce redundant information.
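The MI/PC screening step can be sketched as follows. The four candidate features (an atomic radius, a melting point, a mass proxy that nearly duplicates the radius, and pure noise) and the target relationship are synthetic assumptions, not the 48 features of the actual study.

```python
# Sketch: rank candidate inputs by mutual information with the target,
# then flag redundant pairs via Pearson correlation (synthetic features).
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(2)
n = 500

radius = rng.normal(1.4, 0.1, n)              # informative feature
melting_pt = rng.normal(1600.0, 100.0, n)     # informative feature
mass = 100.0 * radius + rng.normal(0, 1, n)   # nearly duplicates radius
noise = rng.normal(0.0, 1.0, n)               # irrelevant feature

# Assumed target: depends on radius and melting point only.
y = 5.0 * radius + 0.01 * melting_pt + rng.normal(0.0, 0.1, n)
X = np.column_stack([radius, melting_pt, mass, noise])

# Mutual information: keep features that carry signal about y.
mi = mutual_info_regression(X, y, random_state=0)

# Pearson correlation between inputs: |r| near 1 marks redundancy,
# so one member of the (radius, mass) pair could be dropped.
r_radius_mass = np.corrcoef(X[:, 0], X[:, 2])[0, 1]
print(mi.round(2), round(r_radius_mass, 2))
```

In this toy setup the noise feature scores near-zero MI while the radius/mass pair shows a Pearson correlation close to 1, reproducing the two pruning criteria described above.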