Page 132 | Liu et al. Intell Robot 2023;3(2):131-43 | http://dx.doi.org/10.20517/ir.2023.07
1. INTRODUCTION
The intelligent development of modern industrial technology is making machinery and equipment increasingly
complex and systematized [1]. As essential equipment in modern industrial applications, rotary
machines play a vital role in ensuring efficient and reliable operations. Key components, such as bearings and
gears, are critical to the proper functioning of these machines, and any faults can disrupt the normal rotating
mechanism. In engineering practice, bearings and gears are prone to faults due to improper assembly, corrosion,
overload, poor lubrication, etc. [2]. If the equipment fault is not detected in time, it may affect the regular
operation of the equipment and cause economic losses. In more serious cases, it may even put the lives of operators
at risk. The early detection and prediction of bearing and gear faults in rotary machines will significantly
enhance the safety of machinery production and avoid the loss of lives and property caused by mechanical
faults. Based on the literature [3,4], fault diagnosis methods for rotary machines are divided into two main
categories: traditional fault diagnosis methods that rely on manual signal analysis and newer methods that use
neural network diagnostic models to mine fault features.
In the past few years, deep learning techniques have made significant breakthroughs in artificial intelligence
fields, and their ability to automatically learn and extract valid information from data is gaining
increasing attention. By using sensors to acquire vibration signals and other relevant data and processing the
data with deep learning algorithms to extract features that correspond to fault data, it becomes feasible to
recognize and rectify potential faults [5]. Unlike traditional fault diagnosis methods that use signal processing
techniques combined with machine learning classifiers to perform fault diagnosis [6], deep learning-based fault
diagnosis models can automatically mine and analyze the underlying mechanisms of faults to obtain accurate
fault classification performance with sufficient data [7]. However, in practical engineering scenarios, mechanical
equipment mainly operates normally, and failures are relatively rare. Therefore, the amount of fault data
collected is usually limited. Furthermore, the distribution of data collected under changing operating conditions,
such as speed, load, and surrounding environment of rotary machines, can vary considerably, which
may affect the reliability and stability of diagnostic results [8].
Transfer learning is a machine learning technique that allows for the transfer of knowledge learned from one
task to another, with the aim of improving the performance of the latter task [9]. In the context of diagnostic
tasks, transfer learning allows for the simultaneous application of diagnostic knowledge learned from pre-trained
data to relevant diagnostic tasks in order to achieve good diagnostic results [10]. In this strategy, the
core problem is distribution alignment, in which the objective function constrains the model
so that it satisfies the assumption of distributional consistency and achieves good diagnostic results [11]. Domain
adaptation is the core technique for achieving distribution alignment. It essentially ensures that the feature
spaces of the two tasks are aligned through some kind of transformation [12]. In real-world scenarios, the feature
spaces of the source and target tasks can vary greatly, and distance metric minimization is often utilized
for alignment. Metrics for differences in distribution between domains include Kullback–Leibler (KL) divergence [13],
maximum mean discrepancy (MMD) [14], Wasserstein distance [15], and CORAL loss [16]. These additional
loss measures are introduced into the loss function and then optimized by gradient descent. Notably, this
strategy is acknowledged to achieve effective alignment when the difference in data distribution is small.
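As an illustration of how a distribution metric can enter the loss function, the following minimal sketch estimates the squared MMD between a batch of source features and a batch of target features and adds it to a task loss. It is not the implementation used in the cited works: the Gaussian kernel, the bandwidth `sigma`, the trade-off `weight`, and all function names are assumptions made for this example. It uses only the Python standard library; in practice the same expression would be written in an automatic differentiation framework so that gradient descent can optimize it.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(a, b, sigma=1.0):
    """Average kernel value over all pairs with one sample from a and one from b."""
    total = sum(gaussian_kernel(x, y, sigma) for x in a for y in b)
    return total / (len(a) * len(b))


def mmd2(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two batches of feature vectors."""
    return (
        mean_kernel(source, source, sigma)
        + mean_kernel(target, target, sigma)
        - 2.0 * mean_kernel(source, target, sigma)
    )


def total_loss(task_loss, source_feats, target_feats, weight=1.0):
    """Task loss plus a weighted alignment penalty (weight is a hypothetical trade-off)."""
    return task_loss + weight * mmd2(source_feats, target_feats)


def sample(n, dim, mean, rng):
    """Draw n synthetic feature vectors with unit-variance Gaussian entries."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


# Synthetic stand-ins for features from two operating conditions.
rng = random.Random(0)
source = sample(100, 4, 0.0, rng)
target_same = sample(100, 4, 0.0, rng)     # same distribution as source
target_shifted = sample(100, 4, 2.0, rng)  # mean-shifted distribution

mmd_same = mmd2(source, target_same)
mmd_shifted = mmd2(source, target_shifted)
print(f"same-distribution MMD^2:    {mmd_same:.4f}")
print(f"shifted-distribution MMD^2: {mmd_shifted:.4f}")
```

The estimate is close to zero when both batches come from the same distribution and grows as the target features drift, which is what lets the penalty pull the two feature distributions together during training. Note that it compares the batches as a whole, without reference to class labels.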
However, these methods mainly focus on aligning the marginal probability distributions, which only capture
the variation of global characteristics and ignore differences in the conditional probability distributions in
different domains. This makes it challenging to handle scenarios where the differences in data distribution
between different domains are more complex. Based on the recent literature [17–19], transfer learning fault
diagnosis techniques are preferred by a wide range of researchers. Qian et al. use DenseNet as the baseline
model, combined with a joint distribution adapted regularization term to get the metastable features. In this
way, diagnostic capabilities are effectively migrated [17]. Li et al. use the representational capabilities learned
in supervised learning to obtain target domain feature representations by minimizing the multi-kernel max-