imum mean discrepancy (MKMMD) in different feature layers between the two domains [18]. Wang et al.
proposed a method that uses multi-scale convolution to extract fault features and combines adversarial training
to achieve effective transfer; this method reaches close to 100% accuracy on the bearing dataset [19].
The above studies demonstrate the effectiveness of deep transfer learning in rotary machine fault
diagnosis. However, some problems have not yet been taken seriously: (i) most transfer learning methods
perform domain-adaptive alignment only from a global perspective, and this alignment degrades sharply
when the data distribution varies dramatically [20]; (ii) when validating transfer learning algorithms, the
transfer of mixed fault types across different devices is rarely considered, which is difficult because of the
significant differences in data distribution.
To address these challenges, this paper proposes a semi-supervised joint adaptation transfer network
with conditional adversarial learning for rotary machine fault diagnosis. Its main innovations are as follows.
(1) To efficiently transfer the diagnostic power learned from the large amount of labeled data in the source
domain, a model is first pre-trained on the labeled source data and then used to generate pseudo-labels for
the unlabeled target-domain data, which effectively exploits the unlabeled data to boost the diagnostic model
(a minimal sketch of this pseudo-labeling step is given after this list). Then, to reduce domain shift and align
the joint distribution of the source and target domains, we consider both the global feature variation and the
intra-class similarity between the domains, so that the marginal and conditional probability distributions are
aligned simultaneously. Capturing both the global and local differences between the two domains in this way
reduces domain shift, significantly improves diagnostic performance on the target domain, and makes the
diagnostic model usable in real-world scenarios where labeled data are scarce.
(2) The mutual influence between different devices of modern rotary machines significantly increases the
difficulty of fault diagnosis. Our method produces highly reliable results for single-type fault diagnosis and,
more importantly, shows a large improvement on diagnostic tasks involving mixed fault types, leading to more
accurate diagnostic results.
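The following is a minimal sketch of the pseudo-labeling step referred to in innovation (1), assuming PyTorch; source_model is a classifier already trained on labeled source-domain data, and the confidence threshold, data loader interface, and names are illustrative rather than the exact procedure of the proposed network.

# Pseudo-label generation sketch (assumed PyTorch; names and threshold are illustrative).
import torch

@torch.no_grad()
def generate_pseudo_labels(source_model, target_loader, threshold=0.9):
    source_model.eval()
    samples, labels = [], []
    for x in target_loader:                      # batches of unlabeled target data
        probs = torch.softmax(source_model(x), dim=1)
        conf, pred = probs.max(dim=1)            # confidence and predicted class
        keep = conf >= threshold                 # keep only confident predictions
        samples.append(x[keep])
        labels.append(pred[keep])
    return torch.cat(samples), torch.cat(labels)

The pseudo-labeled pairs can then be mixed with the labeled source data to train the target-domain diagnostic model.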
The remaining sections of this paper are organized as follows: Section 2 introduces the related definitions of
transfer learning. Section 3 elaborates on the proposed method in detail. Section 4 presents experimental
results on three different experimental settings to demonstrate the effectiveness of our method. Finally,
Section 5 summarizes the contributions of this work and discusses potential avenues for future research.
2. TRANSFER LEARNING PROBLEM
Having sufficient annotated data is a prerequisite for a well-performing supervised model; however, annotating
data is tedious and time-consuming. Transfer learning is therefore a proven way to reuse a model pre-trained
on one task for a new task while maintaining strong performance. The main goal is to transfer the capabilities
learned from the source-domain data to the target-domain data, thereby addressing the difficulty of obtaining
sufficient knowledge in the target domain, where data are limited [21].
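A compact formalization of this setting, using standard transfer-learning notation (the symbols below are generic and are not necessarily those adopted later in this paper): the labeled source domain is $\mathcal{D}_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$ and the unlabeled target domain is $\mathcal{D}_t = \{x_j^t\}_{j=1}^{n_t}$, with $P_s(X) \neq P_t(X)$ and $P_s(Y \mid X) \neq P_t(Y \mid X)$. The goal is to learn a predictor $f: X \rightarrow Y$ that performs well on the target domain by exploiting both $\mathcal{D}_s$ and $\mathcal{D}_t$.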
From Figure 1, we can see that a traditional intelligent fault diagnosis method gives an accurate diagnosis
when the data distributions of the training and test sets are similar, so transfer learning is unnecessary in such
cases. When the two distributions are inconsistent, however, the generalization ability of the model is poor.
In these situations, transfer learning can exploit the diagnostic power learned from the training data by
reducing the difference between the two distributions.
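To make this concrete, the following sketch shows one common way of folding a distribution-discrepancy penalty into training so that the source and target feature distributions are drawn closer. CORAL (correlation alignment) is used here only as a stand-in discrepancy term, and the model interface, optimizer, and trade-off weight are assumptions for illustration, not the method proposed in this paper.

# Adaptation training-step sketch (assumed PyTorch; CORAL is a stand-in discrepancy).
import torch
import torch.nn.functional as F

def coral(source_feat, target_feat):
    # Distance between the second-order statistics (covariances) of two batches.
    def cov(f):
        f = f - f.mean(dim=0, keepdim=True)
        return f.t() @ f / (f.size(0) - 1)
    d = source_feat.size(1)
    return ((cov(source_feat) - cov(target_feat)) ** 2).sum() / (4.0 * d * d)

def adaptation_step(model, optimizer, xs, ys, xt, trade_off=1.0):
    # One step: supervised loss on labeled source data + discrepancy on features.
    fs, logits_s = model(xs)        # model assumed to return (features, class logits)
    ft, _ = model(xt)
    loss = F.cross_entropy(logits_s, ys) + trade_off * coral(fs, ft)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()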