





[Figure 1. Traditional intelligent methods and transfer learning-based intelligent methods. The left panel shows a conventional intelligent system trained and tested on labeled data from four fault categories; the right panel shows knowledge being transferred from a trained intelligent system to one that must handle unlabeled data.]



Formally, we define $\mathcal{D}_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$ as the labeled training data and $\mathcal{D}_t = \{x_j^t\}_{j=1}^{n_t}$ as the unlabeled test data, where $s$ denotes the source domain task, $t$ denotes the target domain task, and $x_i$ and $y_i$ represent the vectorized representation of the $i$th sample and its corresponding label. It is worth noting that the target domain task has no corresponding labels $y_j^t$, so the labeled data available in the training phase come only from the source domain, which increases the difficulty of transfer. Since the task data of the two fields differ greatly, transfer learning can minimize the difference between them by finding a mapping relationship, thus making the diagnostic ability reusable. When the data distributions of the two domains are close, the assumptions on which existing intelligent fault diagnosis methods depend are satisfied, and effective diagnosis can be realized.
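To make the idea of measuring the domain gap concrete, the sketch below estimates the discrepancy between source and target feature batches with a plain kernel maximum mean discrepancy; this is a simplified stand-in for the joint criterion used in the paper, and the feature dimension, batch size, and bandwidth are illustrative assumptions.

```python
import torch

def rbf_mmd2(xs: torch.Tensor, xt: torch.Tensor, bandwidth: float = 1.0) -> torch.Tensor:
    """Biased estimate of the squared MMD between two feature batches (RBF kernel)."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)           # pairwise squared Euclidean distances
        return torch.exp(-d2 / (2 * bandwidth ** 2))
    return k(xs, xs).mean() + k(xt, xt).mean() - 2 * k(xs, xt).mean()

# Example: features of a source batch and a target batch from a shared encoder.
xs = torch.randn(32, 128)   # source-domain features (assumed 128-dimensional)
xt = torch.randn(32, 128)   # target-domain features
print(rbf_mmd2(xs, xt).item())
```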
               3. THE PROPOSED ARCHITECTURE
In order to efficiently transfer the diagnostic capability learned from the labeled data, a pre-trained model is first obtained and used to generate pseudo-labels for training. A domain adaptation network, built on the joint maximum mean discrepancy (JMMD) criterion and conditional domain adversarial (CDA) learning, is then used to learn a mapping relationship that reduces the difference between the distributions of the two domains. The joint distribution of the aligned features and the predicted labels is aligned through multiple domain adaptation approaches. Meanwhile, the information from the unlabeled data is incorporated in the pre-training phase, resulting in maximum category discrimination and domain adaptation under multimodal conditions.
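As a rough illustration of the pseudo-labelling step, the following sketch assumes a classifier pre-trained on labeled source data and keeps only high-confidence predictions on target batches; the paper does not specify a selection rule in this excerpt, so the confidence threshold and all names here are assumptions.

```python
import torch

@torch.no_grad()
def generate_pseudo_labels(model, target_loader, threshold: float = 0.9):
    """Label unlabeled target-domain batches with the pre-trained model,
    keeping only predictions above an (assumed) confidence threshold."""
    model.eval()
    xs, ys = [], []
    for x in target_loader:                     # batches of unlabeled target data
        probs = torch.softmax(model(x), dim=1)
        conf, pred = probs.max(dim=1)
        keep = conf >= threshold                # retain confident samples only
        xs.append(x[keep])
        ys.append(pred[keep])
    return torch.cat(xs), torch.cat(ys)         # pseudo-labeled training pairs
```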

As depicted in Figure 2, the primary architecture of the proposed method is structured as follows: First, enough labeled data in the source domain are collected to train a pre-trained model. After that, the unlabeled data in the target domain are predicted to obtain pseudo-labels, and these data are combined to extract more effective fault features. Second, the feature vectors and label vectors are linearly transformed several times to jointly model the implied relationships between them. Finally, a domain adaptation module is used to align the differences between the two domains through loss function optimization. The optimization objective combines the CDA loss, the label classification loss, and the JMMD loss so that the three components are trained jointly.
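One way the three terms might be combined into a single objective is sketched below; `jmmd_loss` and `cda_loss` are placeholders standing in for the JMMD and CDA criteria described above, and the weighting coefficients are assumptions, as the paper's values are not given in this excerpt.

```python
import torch.nn.functional as F

def joint_objective(logits_s, y_s, feat_s, feat_t, logits_t,
                    jmmd_loss, cda_loss, lam_jmmd=1.0, lam_cda=1.0):
    """Total loss = label classification + JMMD + conditional domain adversarial terms."""
    l_cls = F.cross_entropy(logits_s, y_s)                      # source classification loss
    l_jmmd = jmmd_loss((feat_s, logits_s), (feat_t, logits_t))  # joint feature/label alignment
    l_cda = cda_loss(feat_s, logits_s, feat_t, logits_t)        # adversarial domain confusion
    return l_cls + lam_jmmd * l_jmmd + lam_cda * l_cda
```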


               3.1. Pre-training
The pre-trained model is constructed from a convolutional neural network (CNN) combined with a bi-directional long short-term memory (BILSTM) network. Detailed information on the model structure is given in Table 1. To improve computational efficiency, the raw signal is first downsampled and then fed into the CNN. After that, the features obtained from the CNN are fed into the BILSTM to better extract the temporal information of the vibration signals.
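For orientation, here is a minimal PyTorch sketch of a CNN front end feeding a BILSTM, in the spirit of the structure described above; since Table 1 is not reproduced in this excerpt, every layer size, kernel width, stride, and the four-class head are assumptions rather than the paper's actual configuration.

```python
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    """1-D CNN for local feature extraction, followed by a bi-directional LSTM
    that models the temporal structure of the vibration signal."""
    def __init__(self, n_classes: int = 4, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(
            # Wide first kernel with a large stride downsamples the raw signal.
            nn.Conv1d(1, 16, kernel_size=64, stride=8, padding=28),
            nn.BatchNorm1d(16), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm1d(32), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.bilstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                 # x: (batch, 1, signal_length)
        f = self.cnn(x)                   # (batch, 32, T)
        f = f.transpose(1, 2)             # (batch, T, 32) for the LSTM
        out, _ = self.bilstm(f)           # (batch, T, 2 * hidden)
        return self.head(out[:, -1])      # last time step -> class logits

logits = CNNBiLSTM()(torch.randn(8, 1, 1024))
print(logits.shape)  # torch.Size([8, 4])
```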