Page 4 of 12 Liu et al. J Mater Inf 2022;2:20 https://dx.doi.org/10.20517/jmi.2022.29

where m = 350,000 is the dataset size; k = 50 is the number of nearest neighbors, comprising $n_h$
near-hits and $n_m$ near-misses; $P_t = 1/7$ and $P_n = 1/7$ are the prior probabilities of the classes to
which the atom $X_i$ and its neighbor $X_{i,n}$ belong; and $\Delta_j(a_i^j, a_{i,n}^j)$ measures the
difference in the values of feature $F_j$ between $X_i$ and $X_{i,n}$. In our model, it is calculated by
$\Delta_j(a_i^j, a_{i,n}^j) = |a_i^j - a_{i,n}^j| / [\max(F_j) - \min(F_j)]$, where $a^j$ is the value of
feature $F_j$. As illustrated in Supplementary Figure 2, the lower weights of the features with Δt < 5 ps
indicate that these features contribute little to the classifier and should be removed. Consequently, the
final dataset used for training and testing the kNN model included 350,000 instances. Every instance was
described by a feature vector composed of 17 features.
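The range-normalized feature difference described above can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB code: the function name, the argument names, and the handling of a constant feature (zero range) are assumptions made for the example.

```python
def feature_difference(a_i, a_n, f_min, f_max):
    """Range-normalized difference between two values of one feature.

    a_i, a_n: the feature's value for an atom and for one of its neighbors.
    f_min, f_max: the feature's minimum and maximum over the dataset.
    """
    span = f_max - f_min
    if span == 0:
        # Assumption: a constant feature cannot separate two instances.
        return 0.0
    return abs(a_i - a_n) / span
```

Dividing by the feature's range keeps every difference in [0, 1], so features with different units contribute on a comparable scale to the weights.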
Hyperparameter optimization and data preprocessing
The number of nearest neighbors k was optimized by minimizing the validation root-mean-square error
(RMSE). As shown in Supplementary Figure 3, the optimal value is k = 50. Ten-fold cross-validation was
applied during both the hyperparameter optimization and the subsequent temperature predictions. Prior to
training, all input data were normalized by min-max scaling, which maps the data into the range [0, 1].
The ML model was implemented in MATLAB R2020a.
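The preprocessing and k selection described in this subsection can be sketched in Python as follows. This is a hedged illustration using only the standard library, not the MATLAB R2020a implementation used in the paper. All function names are hypothetical. The fold assignment (shuffled, interleaved) and the scaling of the whole dataset in a single pass are assumptions.

```python
import math
import random


def min_max_scale(rows):
    """Scale every feature column into [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k):
    """Average the labels of the k nearest training points (Euclidean)."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    nearest = ranked[:k]
    return sum(label for _, label in nearest) / len(nearest)


def cv_rmse(x, y, k, n_folds=10, seed=0):
    """Root-mean-square error of kNN predictions over n_folds held-out folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    sq_err = 0.0
    for fold in folds:
        held = set(fold)
        tr_x = [x[i] for i in idx if i not in held]
        tr_y = [y[i] for i in idx if i not in held]
        for i in fold:
            sq_err += (knn_predict(tr_x, tr_y, x[i], k) - y[i]) ** 2
    return math.sqrt(sq_err / len(x))


def best_k(x, y, candidates, n_folds=10):
    """Return the candidate k with the smallest cross-validated RMSE."""
    return min(candidates, key=lambda k: cv_rmse(x, y, k, n_folds))
```

For example, `best_k(min_max_scale(features), labels, [10, 30, 50, 70])` would return the candidate with the lowest validation RMSE, where `features` and `labels` stand for the feature vectors and temperature class labels. The brute-force neighbor search is only practical for small illustrative datasets.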
RESULTS AND DISCUSSION
Machine-learned temperature prediction
For every testing atom, the kNN model searched the dataset and found its 50 nearest neighbors of the same
atom type, i.e., those with the smallest Euclidean distance between feature vectors. These nearest
neighbors are the most similar to the testing atom in the 17 features. $T_{\mathrm{ML}}$ was computed by
averaging the temperature class labels of these 50 nearest neighbors; thus, $T_{\mathrm{ML}}$ predicts which
temperature class the testing atom most likely belongs to. As shown in Figure 1A-E, there is a good linear
relation between the predicted $T_{\mathrm{ML}}$ and the actual temperature T, although slight deviations
occur at T = 100, 800, and 2000 K. Also, the model works for all types of atoms without obvious
differences. Therefore, the kNN model succeeds in recognizing the characteristics of temperature-induced
atomic motion. Applying the kNN model, we can establish a correlation between temperature and individual
atomic motion behavior. We must clarify the difference between the meaning of $T_{\mathrm{ML}}$ and
thermodynamic temperature. In thermodynamics, temperature is a macroscopic quantity. In our work, however,
temperatures served as the class labels for classification. The kNN model predicted the HEMG sample to
which a testing atom most probably belongs, and this sample had a temperature of $T_{\mathrm{ML}}$.
Therefore, the kNN model established a correlation between an atom and a sample temperature.
$T_{\mathrm{ML}}$ can be understood as a temperature-like parameter that reflects the characteristics of
individual atomic motion in a 200 ps time window. A high-$T_{\mathrm{ML}}$ “hot” atom is active in
responding to a thermal stimulus, whereas a low-$T_{\mathrm{ML}}$ “cold” atom behaves in an inactive
manner. In particular, atoms with $T_{\mathrm{ML}}$ above the glass transition temperature $T_g$ move like
atoms in supercooled liquids. Note that the active and inactive atoms are defined from atomic dynamics
without relying on any structural signature, unlike previous static structural parameters [10,12-14].
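The prediction step described above (same-type neighbor search, Euclidean distance, label averaging) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The record layout `(atom_type, features, temperature_label)`, the function name, and the brute-force search are assumptions made for the example.

```python
import math


def predict_t_ml(dataset, atom_type, features, k=50):
    """Average the temperature labels of the k nearest same-type atoms.

    dataset: iterable of (atom_type, feature_vector, temperature_label)
             records; this layout is assumed for the sketch.
    """
    # Restrict the search to atoms of the same type as the testing atom.
    same_type = [(f, t) for typ, f, t in dataset if typ == atom_type]
    if not same_type:
        raise ValueError("no atoms of type %r in the dataset" % (atom_type,))
    # Rank the candidates by Euclidean distance between feature vectors.
    ranked = sorted(same_type, key=lambda rec: math.dist(rec[0], features))
    nearest = ranked[:k]
    # The mean class label of the neighbors is the temperature-like value.
    return sum(t for _, t in nearest) / len(nearest)
```

Because the result is a mean of class labels, it can fall between two temperature classes. It therefore behaves as a continuous temperature-like value, not as a hard class assignment.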
Machine-learned temperature in the stress-induced viscoplastic flow
In fact, the kNN model simply learns the atomic trajectories, i.e., the parameter lg[ (∆t) /(Tm )],
-1
2
regardless of the cause of the motion. As a result, although this model is obtained from learning
temperature-induced atomic motions, it can be applied to other scenarios, such as the stress-induced atomic
motion from deformation. During creep, the atomic motion is activated by thermal and mechanical stimuli
simultaneously [36,37]. When the kNN model predicts $T_{\mathrm{ML}}$ from the creep data,
$T_{\mathrm{ML}}$ becomes a temperature-like parameter that reflects how an atom responds to the combined
agitations.
