Page 134 Liu et al. Intell Robot 2023;3(2):131-43 | http://dx.doi.org/10.20517/ir.2023.07
Figure 1. Traditional intelligent methods and transfer learning-based intelligent methods.
Formally, we define $\mathcal{D}_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$ as the labeled training data and $\mathcal{D}_t = \{x_j^t\}_{j=1}^{n_t}$ as the unlabeled test data, where $\mathcal{D}_s$ denotes the source domain task, $\mathcal{D}_t$ denotes the target domain task, and $x_i$ and $y_i$ represent the vectorized representation of the $i$-th sample and the corresponding label. It is worth noting that the target domain task has no corresponding labels $y_j^t$, which means that the labeled data available in the training phase come only from the source domain; this increases the difficulty of transfer. Since the task data in the two domains differ greatly, transfer learning can minimize the difference between them by finding a mapping relationship, thus making the diagnostic ability reusable. When the data distributions of the two domains are close, the assumptions on which existing intelligent fault diagnosis methods depend are satisfied, and effective diagnosis can be realized.
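The notation above can be made concrete with a toy sketch. All shapes and sample counts below are hypothetical, chosen only to illustrate that the source domain carries labels while the target domain does not:

```python
import numpy as np

# Toy illustration of the problem setup (shapes are assumed, not from the paper):
# D_s = {(x_i^s, y_i^s)} is labeled source-domain data,
# D_t = {x_j^t}          is unlabeled target-domain data.
rng = np.random.default_rng(0)

n_s, n_t, d = 100, 80, 64              # sample counts and feature dimension (hypothetical)
Xs = rng.normal(0.0, 1.0, (n_s, d))    # source features x_i^s
ys = rng.integers(0, 4, n_s)           # source labels y_i^s (e.g., 4 fault categories)
Xt = rng.normal(0.5, 1.2, (n_t, d))    # target features x_j^t: shifted distribution, no labels

print(Xs.shape, ys.shape, Xt.shape)    # -> (100, 64) (100,) (80,)
```

Only `Xs` and `ys` are usable for supervised training; `Xt` enters the pipeline unlabeled, which is exactly what makes the transfer setting difficult.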
3. THE PROPOSED ARCHITECTURE
In order to efficiently transfer the diagnostic ability learned from the labeled data, a pre-trained model is obtained by generating pseudo-labels for training. A domain adaptation network, using the joint maximum mean discrepancy (JMMD) criterion and conditional domain adversarial (CDA) learning, is then used to learn a mapping relationship that reduces the difference between the distributions of the two domains. The joint distribution of the aligned features and the predicted labels is aligned through multiple domain adaptation approaches. Meanwhile, the information from the unlabeled data is incorporated in the pre-training phase, resulting in maximum category discrimination and domain adaptation under multimodal conditions.
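JMMD extends the classical maximum mean discrepancy (MMD) to the joint distribution of features and predictions. As a minimal building block, the sketch below computes the plain marginal MMD with a Gaussian kernel; it is a simplification for illustration, not the paper's JMMD implementation, and the kernel bandwidth is an assumed constant:

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Pairwise Gaussian kernel matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def mmd2(source, target, sigma=1.0):
    """Biased estimator of the squared MMD between two samples."""
    k_ss = gaussian_kernel(source, source, sigma).mean()
    k_tt = gaussian_kernel(target, target, sigma).mean()
    k_st = gaussian_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(50, 8)), rng.normal(size=(50, 8)))
shifted = mmd2(rng.normal(size=(50, 8)), rng.normal(2.0, 1.0, size=(50, 8)))
print(same < shifted)  # matched distributions give a smaller discrepancy
```

Minimizing such a discrepancy between source and target representations is what drives the feature distributions of the two domains toward alignment.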
As depicted in Figure 2, the primary architecture of the proposed method is structured as follows: First, enough labeled data in the source domain are collected to train a pre-trained model. After that, the unlabeled data in the target domain are predicted to obtain pseudo-labels, and these data are combined to extract more effective fault features. Second, the feature vectors and label vectors are linearly transformed several times to jointly model the implied relationships between them. Finally, a domain adaptation module is used to align the differences between the two data sets through loss function optimization. The optimization objectives comprise the CDA loss, the label classification loss, and the JMMD loss, and the three components are jointly optimized during training.
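The pseudo-labeling step and the joint objective can be sketched as follows. The hard-argmax pseudo-labeling and the trade-off weights `lam` and `mu` are assumptions for illustration; the paper does not specify how its three loss terms are weighted:

```python
import numpy as np

def pseudo_labels(logits):
    """Hard pseudo-labels for unlabeled target samples: argmax of the model's predictions."""
    return logits.argmax(axis=1)

def total_loss(cls_loss, jmmd_loss, cda_loss, lam=1.0, mu=1.0):
    """Joint objective combining the three terms above; lam and mu are assumed weights."""
    return cls_loss + lam * jmmd_loss + mu * cda_loss

# Two target samples scored over three classes (toy values).
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 1.5, 0.1]])
print(pseudo_labels(logits))       # -> [0 1]
print(total_loss(1.0, 0.5, 0.25))  # -> 1.75
```

In each training step, the classification loss is computed on labeled source data (plus pseudo-labeled target data), while the JMMD and CDA terms pull the two domains' feature/label distributions together.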
3.1. Pre-training
The pre-trained model is built from convolutional neural networks (CNN) combined with bi-directional long short-term memory (BiLSTM). Detailed information on the model structure is given in Table 1. To improve computational efficiency, the raw signal is first downsampled and then fed into the CNN. After that, the features obtained from the CNN are fed into the BiLSTM to better extract the temporal information of the vibration