
Shu et al. Intell Robot 2024;4(1):74-86  |  http://dx.doi.org/10.20517/ir.2024.05     Page 76

               latent features, and the classification accuracy depends on the choice of features. In recent years, deep
               learning networks, such as convolutional neural networks (CNN) and long short-term memory (LSTM) networks,
               have begun to be applied to quantifying bradykinesia in PD [17,18]. Classification methods based on deep
               learning networks can overcome the limitations of manual feature extraction and learn low-dimensional
               feature representations, but they are more sensitive to the amount of available data than traditional
               machine learning methods. Clinical data, however, usually come in small samples, and motion capture data
               are typical multidimensional time series signals. Classic data augmentation methods, such as slicing,
               jittering, and rotation, distort temporal information, whereas methods based on dynamic time warping (DTW)
               address this issue by maintaining the temporal relationships of time series signals [19,20].
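
               As a rough illustration of how DTW keeps temporal order intact, the sketch below aligns two series with
               the textbook dynamic-programming recurrence and then blends them along the warping path. The function
               names dtw_path and dtw_merge and the weighted-average merge rule are assumptions made for illustration;
               they are not taken from the paper and may differ from the merge method it uses.

```python
import numpy as np


def dtw_path(a, b):
    """Return (total_cost, path) for the DTW alignment of series a and b.

    a, b: arrays of shape (T,) or (T, D); rows are time steps.
    path: list of (i, j) index pairs, non-decreasing in both i and j.
    """
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    n, m = len(a), len(b)
    # Euclidean distance between every pair of time steps.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Accumulated cost matrix with a border of infinities.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = dist[i - 1, j - 1] + best_prev
    # Backtrack from the last cell to the first to recover the path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n, m], path


def dtw_merge(a, b, weight=0.5):
    """Hypothetical merge: blend a with the samples of b aligned to it.

    The result has the same length as a, so the temporal structure of a
    is preserved while amplitudes are mixed with those of b.
    """
    a2 = np.asarray(a, dtype=float).reshape(len(a), -1)
    b2 = np.asarray(b, dtype=float).reshape(len(b), -1)
    _, path = dtw_path(a2, b2)
    sums = np.zeros_like(a2)
    counts = np.zeros(len(a2))
    for i, j in path:
        sums[i] += b2[j]      # collect every b sample matched to a[i]
        counts[i] += 1
    aligned_b = sums / counts[:, None]
    return weight * a2 + (1.0 - weight) * aligned_b
```

               Because every step of the warping path moves forward in at least one of the two series, the merged
               sample never reorders events in time, unlike slicing or jittering applied independently per sample.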


               In this paper, to address the small sample sizes typical of clinical data, we design a network
               classification approach based on DTW data merge and attentional prototype networks (DTW-TapNet). First,
               a DTW-based data merge method is employed for data augmentation. Second, random grouping is used for
               dimensionality reorganization of the time series, followed by convolution operations to learn features
               from the multivariate time series data. The approach then incorporates an attention mechanism and
               prototype learning to optimize the distance of the class prototypes of the time series, achieving a
               low-dimensional feature representation of the training set and thus reducing the dependency on data
               volume. Based on motion capture data collected from patients with PD and healthy controls during upper
               and lower limb movements, the proposed DTW-TapNet classification method aims to achieve an objective and
               accurate assessment of bradykinesia in PD.
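
               The prototype-learning step can be pictured with a minimal NumPy sketch, given here under stated
               assumptions: each class prototype is an attention-weighted average of the embeddings of that class, and a
               query is classified by a softmax over negative squared Euclidean distances to the prototypes. The names
               attentional_prototypes and classify, the parameters V and w, and the tanh scoring form are illustrative
               choices, not the paper's implementation; in a trained network V and w would be learned rather than
               drawn at random.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attentional_prototypes(emb, labels, n_classes, V, w):
    """Compute one attention-weighted prototype per class.

    emb:    (N, d) embeddings of the training samples
    labels: (N,) integer class labels in [0, n_classes)
    V:      (n_classes, u, d) hypothetical per-class projection matrices
    w:      (n_classes, u) hypothetical per-class scoring vectors
    """
    protos = np.zeros((n_classes, emb.shape[1]))
    for k in range(n_classes):
        h = emb[labels == k]                    # (n_k, d) members of class k
        scores = np.tanh(h @ V[k].T) @ w[k]     # (n_k,) attention scores
        attn = softmax(scores)                  # weights sum to 1
        protos[k] = attn @ h                    # convex combination of members
    return protos


def classify(query, protos):
    """Return class probabilities for each query embedding, shape (Q, K)."""
    d2 = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
    return softmax(-d2, axis=1)                 # closer prototype, higher probability


# Toy usage with synthetic, well-separated clusters (not real patient data).
rng = np.random.default_rng(0)
n_classes, d, u = 4, 8, 5
centers = 10.0 * np.eye(n_classes, d)
labels = np.repeat(np.arange(n_classes), 6)
emb = centers[labels] + 0.1 * rng.standard_normal((labels.size, d))
V = rng.standard_normal((n_classes, u, d))
w = rng.standard_normal((n_classes, u))
protos = attentional_prototypes(emb, labels, n_classes, V, w)
probs = classify(emb, protos)
```

               Since each prototype is a convex combination of its class members, a class can be summarized even when
               it contains only a handful of samples, which is the property that makes prototype learning attractive
               for small clinical datasets.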



               2. METHODS
               This paper designs a DTW-TapNet network based on DTW data merge and attentional prototype network,
               and the network structure is shown in Figure 1. The network structure includes DTW data merge, feature
               extraction, and attentional prototype network, which are discussed in the following sub-sections.

               2.1 Dataset
               The study was approved by the local ethics committee of Tianjin Huanhu Hospital (No. 2019-56). All subjects
               provided written informed consent in accordance with the Declaration of Helsinki to participate in this study.
               The subjects in the dataset consisted of 36 patients with PD (27 male and nine female) and eight age-matched
               healthy controls (three male and five female).

               The experimental paradigm consisted of finger and toe tapping tasks, which follow the MDS-UPDRS and are
               commonly used to assess bradykinesia in PD [8,21,22]. For the finger tapping task, the subjects were
               instructed to rapidly and consistently tap the index finger against the thumb. For the toe tapping task,
               they were instructed to tap their toes on the ground. The subjects were required to complete ten
               consecutive cycles for both tasks, and the ratings were scored by a professional doctor.

               Motion capture data, consisting of the three-dimensional coordinates of the reflective markers, were
               collected during the experiment at a sampling rate of 60 Hz. For the finger tapping task, two markers
               were placed at the tips of the thumb and the index finger, respectively. For the toe tapping task, the
               marker was placed at the toe. The marker location diagram is shown in Figure 2. Considering the hand used
               and the medication state, multiple sets of motion capture data could be collected for each subject, and
               data with missing values caused by obstruction of the markers were removed. In total, the sample sizes
               are 165 and 169 for the finger and toe tapping tasks, respectively. In addition, since patients with a
               rating of 4 could not complete the task, the final dataset contains only ratings of 0, 1, 2, and 3.
               Specifically, the sample sizes of each rating are 33, 74, 44, and 14 for the finger tapping task and
               31, 68, 48, and 22 for toe tapping. Since the sample sizes do not conform to the long-tail distributions,