Hansen et al. Microstructures 2023;3:2023029  https://dx.doi.org/10.20517/microstructures.2023.17  Page 13 of 17

               The elbow graph and the crystallographic variant maps for the SMA and Gaussian-filtered VO₂ on sapphire
               datasets with different k-values are shown in Figure 10. The SMA elbow graph [Figure 10A] has a rapidly
               dropping inertia from k = 1 to 3, after which the slope begins to flatten out. This indicates an optimal
               k-value of 3 or 4, beyond which additional clusters yield only a marginal reduction in inertia. From the previous discussion, the
               mapped region shows four predominant and two minor variants. Hence, the optimal k-value from k-means
               underestimates the number of variants. Figure 10B shows crystallographic variant maps for k = 3, 4, and 5. With
               k = 4, the four predominant martensite variants were identified, but the two minor variants were lost. When
               k was increased to 5, the minor variants were still not revealed. Instead, a high level of noise was observed in the
               martensite plates. The k-means analysis took roughly 7 h on the SMA dataset. Similar observations were
               made in the Gaussian-filtered VO₂ on sapphire datasets. The elbow graph in Figure 10C indicates an
               optimal k-value of 2 or 3. From the previous discussion, the dataset should contain four crystallographic
               variants (sapphire, two VO₂ variants, and vacuum). Again, the optimal k-value tends to underestimate the
               variant numbers. Figure 10D shows crystallographic variant maps for k = 3, 4, and 5. In the map with k = 3,
               the sapphire substrate, VO₂ film, and vacuum were identified. However, part of the VO₂ film was
               erroneously marked as vacuum, as indicated by the arrows. For k = 4, the vacuum and thin film regions
               remain the same, but the sapphire substrate is split into two pseudo-variants. For k = 5, the substrate region
               remains split, and the vacuum and thin film regions are accurately revealed, with the second VO₂ variant
               now showing. The k-means analysis took roughly 40 min on the cropped VO₂ on sapphire dataset
               (100 × 100 pixels). Based on the above observations, the k-means method can provide general information
               on the crystallographic variants in the materials with minimal user input but can fail to capture minor
               variants and may falsely split variants into pseudo-variants.
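
The elbow criterion discussed above can be illustrated with a minimal sketch. It is not the authors' pipeline: the 2-D points stand in for flattened diffraction patterns, and the helper names (`kmeans_inertia`, `farthest_point_init`, `elbow_k`), the farthest-point initialization, and the second-difference rule for locating the elbow are all assumptions made for this illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from those chosen."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points, key=lambda p: min(sq_dist(p, c) for c in chosen)))
    return chosen


def kmeans_inertia(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the inertia, i.e. the sum of squared
    distances from each point to its nearest centroid."""
    centroids = farthest_point_init(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        updated = [centroid(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def elbow_k(inertia_by_k):
    """Pick the k with the largest second difference of the inertia curve.
    Assumes consecutive integer k-values; one heuristic among several."""
    interior = sorted(inertia_by_k)[1:-1]
    return max(
        interior,
        key=lambda k: inertia_by_k[k - 1] - 2 * inertia_by_k[k] + inertia_by_k[k + 1],
    )


# Toy data: three well-separated 3 x 3 blobs of points.
centers = [(0, 0), (10, 0), (0, 10)]
points = [(cx + dx, cy + dy) for cx, cy in centers
          for dx in (-1, 0, 1) for dy in (-1, 0, 1)]

inertia_by_k = {k: kmeans_inertia(points, k) for k in range(1, 6)}
best_k = elbow_k(inertia_by_k)
```

On this toy set the inertia falls steeply up to k = 3 and flattens afterwards, so `best_k` is 3. With real PED data each point would be a high-dimensional pattern, and the elbow is usually less sharp, which is consistent with the ambiguity between neighboring k-values reported above.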

               It should be noted that while the k-means algorithm is well-suited for identifying similar-sized spherical
               clusters, it may not be the ideal choice for the SMA and VO₂ model systems. For example, in the case of the
               VO₂ on sapphire dataset, sapphire, two variants of VO₂, and vacuum exhibit significant differences in
               data sizes, which can potentially lead to misidentifications when using k = 3 or 4. To address this challenge,
               alternative unsupervised machine learning techniques, such as density-based spatial clustering of
               applications with noise (DBSCAN), mean shift, and Gaussian mixture models (GMM), could be considered
               in future studies for more effective identification of crystallographic variants using the PED data [42].
               Nevertheless, the use of k-means serves as a baseline for comparing the performance of other unsupervised
               learning techniques in future applications.
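
The sensitivity of k-means to unequal cluster sizes can be seen in a one-dimensional toy example. The sketch below is an illustration under stated assumptions, not an analysis of the PED data: the values, the group sizes, and the helper `kmeans_1d` are invented for this purpose. A large, broad group of 101 values and a small, tight group of 3 values are clustered with k = 2, starting from the true group means so that the outcome cannot be blamed on a poor initialization.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Lloyd's algorithm on scalar data, started from the given centroids.
    Returns the final labels and centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: (v - centroids[j]) ** 2)
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


large_group = [i * 0.1 for i in range(101)]  # 101 values spanning 0.0 to 10.0
small_group = [12.0, 12.1, 12.2]             # 3 values, clearly separated
values = large_group + small_group

# Start from the true group means (5.0 and 12.1).
labels, final_centroids = kmeans_1d(values, [5.0, 12.1])

small_label = labels[-1]
# Number of large-group values that end up sharing a cluster with the small group.
absorbed = sum(1 for lab in labels[:len(large_group)] if lab == small_label)
```

Even from this favorable start, the iterations drift toward cutting the large group roughly in half, because that lowers the total inertia more than isolating the three outlying values. The small group therefore shares its cluster with part of the large one (`absorbed` is greater than zero). This is the same kind of failure as a minor region being merged into, or a major region being split across, the wrong clusters.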

               Method comparison
               Each approach described above has ideal use cases along with advantages and disadvantages. The user-
               selecting-reference-pattern approach (Method 1) is best suited when the number and location of
               crystallographic variants in the sample are known. A region can be mislabeled if no reference point is selected for its
               diffraction pattern. Conversely, if more than one reference point is selected for the same pattern, a
               variant region can be divided between the selected points. Hence, if the user is not familiar with the
               material, this method can be prone to human errors. When all the variants are selected correctly, this
               approach can produce the most accurate similarity maps with the lowest computational cost (generally a
               few minutes). Among the diffraction pattern similarity quantification algorithms, both the Euclidean and
               SSIM methods outperform the Cosine method in generating more accurate crystallographic variant maps.
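
The reference-pattern idea behind Method 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 2 × 3 scan, the four-value "patterns", the variant names "A" and "B", and the helpers `closest_reference` and `variant_map` are all hypothetical. Each scan position is labeled with the user-selected reference it most resembles, using either Euclidean distance (smaller is more similar) or cosine similarity (larger is more similar). SSIM is omitted because it operates on 2-D images and is normally taken from an image-processing library.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two flattened patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two flattened patterns (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_reference(pattern, references, metric):
    """Return the name of the reference pattern most similar to `pattern`."""
    if metric == "euclidean":
        return min(references, key=lambda name: euclidean_distance(pattern, references[name]))
    if metric == "cosine":
        return max(references, key=lambda name: cosine_similarity(pattern, references[name]))
    raise ValueError(f"unknown metric: {metric}")


def variant_map(scan, references, metric="euclidean"):
    """Label every scan position with its closest reference pattern."""
    return [[closest_reference(p, references, metric) for p in row] for row in scan]


# Hypothetical 2 x 3 scan; each position holds a flattened 4-pixel "pattern".
scan = [
    [[9, 1, 0, 1], [8, 2, 1, 1], [1, 1, 9, 0]],
    [[9, 0, 1, 2], [0, 2, 8, 1], [1, 0, 9, 1]],
]

# The user picks one scan position per variant as its reference.
references = {"A": scan[0][0], "B": scan[1][2]}

map_euclidean = variant_map(scan, references, metric="euclidean")
map_cosine = variant_map(scan, references, metric="cosine")
```

For this toy scan both metrics give the same map, `[["A", "A", "B"], ["A", "B", "B"]]`. The sketch also shows why the choice of references matters: a variant with no entry in `references` can only be assigned to one of the listed variants, and two references picked from the same variant would divide its region between them.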


               The algorithmic-selecting-reference-pattern approach (Method 2) removes the need for the user to select the
               reference pattern of each variant. This approach can generate similarity maps automatically with only user
               input to select the ideal cut-off threshold based on the generated maps. The cut-off value in each map
               governs when a new variant is generated. Too high a value will capture the small differences between diffraction
               patterns within the same variant, thus splitting one variant into two or more pseudo-variants. Too low a