Heart Sound Classification using the Nonlinear Dynamic Feature Approach along with Conventional Classifiers

Heart sounds show chaotic and complex behavior when murmurs are present, and therefore contain nonlinear and non-Gaussian information. This paper studies ways to extract features from nonlinear dynamic models. The proposed features are derived from nonlinear dynamical modeling of heart sound signals, which is frequently used to describe the underlying dynamics of the heart. This study combines nonlinear dynamic features with conventional classifiers in the analysis of phonocardiograms (PCGs), achieving a significant improvement in classification performance, with a sensitivity of 0.90 and a specificity of 0.92.


I. INTRODUCTION
The World Health Organization (WHO) has reported that cardiovascular diseases (CVDs) and related conditions caused 17.7 million deaths worldwide in 2015, representing 31% of all global mortality [1]. Several techniques can diagnose heart disease. Many sophisticated techniques are available, but they are very expensive and cumbersome, so they are often not available to the majority of people. Another CVD diagnostic method is heart sound auscultation. A stethoscope is usually used to examine patients, and if an abnormality is detected, the patient may be referred to a cardiologist. An early diagnosis of abnormal heart sounds allows physicians to take corrective measures to prevent cardiovascular disruptions and treat the underlying cause.
The phonocardiogram (PCG) represents the heart sound graphically. The PCG signal enables the diagnosis of heart diseases and the evaluation of the cardiovascular system's performance [2,3]. Each PCG contains multiple cardiac cycles, each with 4 heart sound states: S1, systole, S2, and diastole. These sounds are caused by the closing of the valves at each heart period, with the mitral and tricuspid valves closing before systole, and the aortic and pulmonic valves closing before diastole. Despite their importance, heart sounds are often difficult to interpret due to their low intensity and dominant frequencies near the lower limits of human hearing. Therefore, auscultation requires a lot of training and experience to detect abnormalities early [3].
PCG-based diagnosis can be improved by the use of computers and the development of automated diagnostic tools or computer-assisted auscultation tools for physicians. Automated algorithms have been developed over the past decade to assess patients based solely on PCG measurements, without synchronization between the PCG and the electrocardiogram (ECG). This approach has some difficulties due to variations in the heart rate of the same patient: the PCG of an individual patient may fluctuate. Moreover, such models are limited when applied to different types of patients.
In this work, we study ways to extract features from nonlinear dynamic models. The proposed features are based on nonlinear dynamical modeling of heart sound signals, which is frequently used to describe the underlying dynamics of the heart.

II. METHODOLOGY
In this study, we propose nonlinear dynamic features of the heart sound signals as input to a computer-assisted auscultation system, as shown in Figure 1.

A. Preprocessing
To improve the cardiac sound signal, minimizing background noise and eliminating spike noise is critical. A two-stage preprocessing system is used. The first stage is a third-order Butterworth bandpass filter with corner frequencies of 15 and 800 Hz, which selects the useful bandwidth of the heart sound. The second stage applies the spectral subtraction denoising method [24]. An advantage of this method is its adaptive noise estimation when rebuilding the denoised signal. The noise power at frequencies outside the range of heart sounds is measured first. A weighted version of this measured noise power is then subtracted from the power spectrum of the unprocessed heart sounds, which makes the resulting measurements more precise and reliable [24]. Spectral subtraction with a weighting factor of 0.5 was applied in this study.
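The two preprocessing stages can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes SciPy is available, a 2,000 Hz sampling rate, zero-phase filtering, a short-time Fourier transform for the spectral subtraction, and a hypothetical out-of-band noise region of 850 to 1000 Hz. None of these details is specified in the text.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, stft, istft

FS = 2000.0  # assumed sampling rate in Hz


def bandpass(x, fs=FS, low=15.0, high=800.0, order=3):
    """Butterworth band-pass filter with 15 Hz and 800 Hz corner frequencies."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    # Forward-backward (zero-phase) filtering is an assumption of this sketch.
    return sosfiltfilt(sos, x)


def spectral_subtract(x, fs=FS, weight=0.5, noise_band=(850.0, 1000.0), nperseg=256):
    """Subtract a weighted out-of-band noise power estimate from each STFT frame."""
    f, _, Z = stft(x, fs=fs, nperseg=nperseg)
    power = np.abs(Z) ** 2
    # Noise power per frame, averaged over bins outside the heart sound band
    # (the band limits here are hypothetical).
    in_noise_band = (f >= noise_band[0]) & (f <= noise_band[1])
    noise_power = power[in_noise_band].mean(axis=0, keepdims=True)
    # Weighted subtraction, clipped at zero so the power stays non-negative.
    cleaned = np.maximum(power - weight * noise_power, 0.0)
    # Rebuild the signal with the cleaned magnitude and the original phase.
    Z_clean = np.sqrt(cleaned) * np.exp(1j * np.angle(Z))
    _, y = istft(Z_clean, fs=fs, nperseg=nperseg)
    return y[: len(x)]
```

A call such as `spectral_subtract(bandpass(x))` would chain the two stages; whether the noise estimate should be taken before or after the band-pass stage is a design choice this sketch leaves open.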

B. Cardiac Cycle Segmentation
Each PCG signal is divided into cardiac cycles at this stage. Recognizing the systolic and diastolic states is essential for classifying abnormalities in these regions. Different segmentation algorithms have been used. Some rely on a reference signal, such as the ECG, which must be recorded simultaneously with the PCG and makes the heartbeats easier to locate. Other methods do not use the ECG as a reference. This work divided the PCG signal into cardiac cycles using Springer's improved version of Schmidt's segmentation algorithm [22]. This method utilizes information about the expected heart sound state durations and employs a logistic regression Hidden Semi-Markov Model (HSMM) to estimate the most likely sequence of states without the need for ECG synchronization. All subsequent processing was carried out on complete cardiac cycles.
Cardiac cycles vary in duration, and thus in the length of their digital signals. To handle this in later processing stages, the length of all signals was set to that of the longest cardiac cycle observed across all PCG recordings (here, around 2 s). Shorter cardiac cycles were zero-padded to this length. This ensures that all signals have the same frequency resolution.
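The zero-padding step can be illustrated with a short Python sketch. The function name `pad_cycles` and the choice to pad at the end of each cycle are assumptions made for illustration; at a 2,000 Hz sampling rate, a 2 s target corresponds to roughly 4000 samples.

```python
import numpy as np


def pad_cycles(cycles, target_len=None):
    """Zero-pad 1-D cardiac-cycle arrays at the end to a common length.

    If target_len is None, the longest cycle in `cycles` sets the length.
    """
    longest = max(len(c) for c in cycles)
    if target_len is None:
        target_len = longest
    if longest > target_len:
        raise ValueError("a cycle is longer than target_len")
    out = np.zeros((len(cycles), target_len), dtype=float)
    for i, c in enumerate(cycles):
        out[i, : len(c)] = c  # remaining samples stay zero
    return out
```

Each row of the returned array is one cardiac cycle, so later stages can process all cycles with the same array shape.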

C. Feature Extraction
In this stage, features were extracted from the segmented cardiac cycles for optimum classification accuracy. When murmurs are present, heart sounds exhibit chaotic and complex behavior. Nonlinear dynamic techniques can capture a signal's dynamic behavior and handle its nonlinearity and non-Gaussianity. In this work, we therefore study the extraction of features based on nonlinear dynamic modeling of cardiac sound signals, which is commonly used to describe the underlying dynamics of the heart. A multidimensional phase space, or attractor, representing the system dynamics and its states can be reconstructed from the measured signals or time series [21]. The Reconstructed Phase Space (RPS) has the same dynamical characteristics as the actual attractor of the system from which the measurements were obtained [24]. The features calculated for the RPS of the heart sound signals are moment-invariant, distance series-based, and statistical features.
In the nonlinear dynamical modeling analysis of the heart sound signal, the first step is reconstructing the phase space from the heart sound measurements. The time-delay embedding method proposed in [21,24] was used for the phase space reconstruction. Time-delay embedding is a technique used to reconstruct m-dimensional vectors from a time series of observations. Each vector is formed by selecting values of the time series at successive lags, and the process is repeated until all vectors of the phase space are obtained. Time-delay embedding therefore has two parameters: the embedding dimension $m$ and the time lag $\tau$.
Let $\{x_k : k = 1, 2, \ldots, N\}$ be the observed time series. The reconstructed $m$-dimensional phase space $Y$ can be constructed as the following matrix (1):

$$
Y =
\begin{bmatrix}
x_1 & x_{1+\tau} & \cdots & x_{1+(m-1)\tau} \\
x_2 & x_{2+\tau} & \cdots & x_{2+(m-1)\tau} \\
\vdots & \vdots & \ddots & \vdots \\
x_M & x_{M+\tau} & \cdots & x_{M+(m-1)\tau}
\end{bmatrix}
$$

where $M = N - (m-1)\tau$, $N$ is the length of the original time series, $m$ is the embedding dimension, and $\tau$ is the delay time.
We use an embedding dimension $m$ of 18 and a time delay $\tau$ of 10.
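As a concrete illustration of time-delay embedding, the following Python sketch builds the phase space matrix from a 1-D signal. The function name `time_delay_embed` is hypothetical, and the delay is assumed to be expressed in samples.

```python
import numpy as np


def time_delay_embed(x, m=18, tau=10):
    """Return the (N - (m - 1) * tau) x m reconstructed phase space of x."""
    x = np.asarray(x, dtype=float)
    n_points = len(x) - (m - 1) * tau
    if n_points <= 0:
        raise ValueError("time series too short for this m and tau")
    # Row i holds x[i], x[i + tau], ..., x[i + (m - 1) * tau].
    idx = np.arange(n_points)[:, None] + tau * np.arange(m)[None, :]
    return x[idx]
```

Under these assumptions, a 4000-sample cardiac cycle embedded with m = 18 and a delay of 10 samples yields a matrix of 4000 - 17 × 10 = 3830 rows and 18 columns.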
After the reconstruction of the phase space, different features were extracted. We calculated the following statistical features for each reconstructed phase space vector: mean, median, standard deviation, mean absolute deviation, 25th and 75th percentiles, interquartile range, skewness, and kurtosis. In addition, the same 9 statistical features were extracted from a new domain called the Distance Series (DS) domain, defined in [25] as the reduction of the multidimensional phase space into a one-dimensional space. The DS is used to characterize complex variations in the RPS. It is calculated by taking the Euclidean distance between every point $Y_i$ in the phase space and the origin, resulting in a one-dimensional representation of the trajectory:

$$
D_i = \lVert Y_i \rVert = \sqrt{\sum_{j=1}^{m} Y_{i,j}^{2}}
$$

for $i = 1, 2, \ldots, k$, where $k$ is the number of points in the reconstructed phase space. Smooth behavior of successive values of $D_i$, that is, minimal changes between them, indicates a slow trajectory with a small region of support in the phase space. Conversely, significant changes in the values suggest a trajectory that moves in large steps and has a large region of support in the phase space. This mapping captures more information about the trajectory than traditional and more complex measures.
Moreover, moment invariant features were calculated for the RPS of the heart sound signals. Moments are quantitative measures used in statistics to describe the distribution of a random variable. For example, skewness, which describes the asymmetry of the distribution, is represented by the third moment, and kurtosis, which describes the peakedness of the probability distribution of the random variable, is represented by the fourth moment. The entire set of moments from order zero to infinity describes the distribution uniquely. The term invariant denotes that the moments should remain unaffected by translation, rotation, and scaling transforms, so that they describe the shape of the attractor optimally. The steps for calculating the invariant moments were described in [25]: the second-order moments M are computed from the probability density function ρ of the columns of the RPS, the central moments are obtained by applying (5) to the second-order moments, and the O matrix is constructed from these central moments using (6). "The major minors for the O matrix were calculated to represent the moment invariant features of the RPS, given that the number of the moment invariant features is equal to the number of the embedding dimension of the RPS". The total number of extracted features is 207.
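The statistical and distance-series features described above can be sketched in Python. This is an illustration under stated assumptions, not the authors' feature extractor: the nine statistics are computed per column of the RPS, the standard deviation uses the population definition, kurtosis is the non-excess fourth standardized moment, and the moment invariant features are not included, so the sketch does not reproduce the full 207-feature vector.

```python
import numpy as np


def nine_stats(v):
    """Mean, median, std, mean abs. deviation, P25, P75, IQR, skewness, kurtosis."""
    v = np.asarray(v, dtype=float)
    mean = v.mean()
    centered = v - mean
    m2 = np.mean(centered ** 2)  # population variance
    q25, q75 = np.percentile(v, [25, 75])
    # Standardized third and fourth moments; defined as 0 for a constant input.
    skew = np.mean(centered ** 3) / m2 ** 1.5 if m2 > 0 else 0.0
    kurt = np.mean(centered ** 4) / m2 ** 2 if m2 > 0 else 0.0
    return np.array([mean, np.median(v), np.sqrt(m2), np.mean(np.abs(centered)),
                     q25, q75, q75 - q25, skew, kurt])


def distance_series(Y):
    """Euclidean distance of every phase space point (row of Y) from the origin."""
    return np.linalg.norm(Y, axis=1)


def rps_features(Y):
    """Nine statistics per RPS column, followed by nine statistics of the DS."""
    per_column = np.concatenate([nine_stats(Y[:, j]) for j in range(Y.shape[1])])
    return np.concatenate([per_column, nine_stats(distance_series(Y))])
```

For an RPS with m columns, `rps_features` returns 9 × m + 9 values.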

D. Classification
Various classifiers are available, each with its own advantages and disadvantages. One such classifier is k-Nearest Neighbors (KNN), a memory-based learning algorithm [19]. However, it is important to note that KNN requires the training and testing data to be available at all times. For noisy datasets, decision tree classifiers are recommended [26], while boosted ensemble classifiers perform best with imbalanced data [19,26,30]. This study used a cross-validation approach to evaluate the performance of various conventional classifiers on a set of nonlinear dynamic features and then determine the most effective classification model. The classification methods used in this study include SVMs with quadratic, cubic, and Gaussian kernels, KNN with linear, cosine, cubic, and weighted distance metrics, and ensemble classification methods such as bagged trees, subspace KNN, RUSBoosted tree, and boosted tree [27, 28].
The classifiers were trained and tested using the local holdout and cross-validation methods. For the cross-validation test, the feature vector data were divided into five folds. Each fold was held out for testing in turn, and the model was trained on all the data outside that fold. The performance of each model was then evaluated on the held-out fold, and the overall results were computed as the average over all folds. The local holdout technique used a randomly chosen 80% of the feature vector data as the training set and the remaining 20% as the testing set.
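The two train-test protocols can be sketched with NumPy index bookkeeping. The function names are hypothetical, and the sketch shuffles and splits individual feature vectors; the text does not state whether cycles from the same recording were kept together, so that aspect is an assumption.

```python
import numpy as np


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold is held out exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) for a random 80/20-style holdout split."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_test = int(round(test_fraction * n_samples))
    return perm[n_test:], perm[:n_test]
```

With cross-validation, a metric is computed on each held-out fold and the five values are averaged; with the holdout split, the metric is computed once on the 20% test portion.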

E. Dataset Description
Selecting the right dataset is crucial for successful model building and generalization in pattern recognition. A large, diverse, and easy-to-understand dataset is necessary to effectively analyze and compare different algorithms. The PhysioNet/Computing in Cardiology Challenge 2016 dataset was used to test the performance of the proposed system [16]. A total of 3153 recordings are included in the dataset. Only the sure-labeled data in this collection were used, for a total of 2868 recordings obtained from 6 datasets. There are 2249 normal patient records and 619 abnormal patient records, ranging from 5 s to more than 120 s. Each track has been down-sampled to 2,000 Hz and is available in wav format. Recordings with varying noise levels were included from several actual clinical and nonclinical settings. Data were collected from both normal persons and patients, and from both children and adults. One to 6 recordings of the same subject may exist in the dataset. Data were gathered from various body locations (including the aortic, pulmonic, tricuspid, and mitral areas). The fact that there are far more normal than abnormal recordings clearly shows that the data are unbalanced. Eighty percent of the dataset was used for training, while 20% was used for testing.
The heterogeneity of the recordings introduces differences that could make classifier training more challenging.

www.etasr.com Alromema et al.: Heart Sound Classification using the Nonlinear Dynamic Feature Approach along …

F. Performance Evaluation
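To evaluate the success of the classification process, the confusion matrix is produced with the abnormal cases as the positive class, and it is then used to determine the values for sensitivity, specificity, and accuracy:

$$
\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\mathrm{Specificity} = \frac{TN}{TN + FP}, \qquad
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
$$

where TP, TN, FP, and FN denote the numbers of true positive, true negative, false positive, and false negative classifications, respectively. Error rate and accuracy are insufficient for measuring classification performance on imbalanced data since they do not consider the costs of misclassification. They are hence sensitive to class skews and frequently show a considerable bias toward the dominant class [22,29]. As a result, the official evaluation metric for the 2016 PhysioNet/Computing in Cardiology Challenge was an alternative evaluation score based on the average of sensitivity and specificity:

$$
\mathrm{Score} = \frac{\mathrm{Sensitivity} + \mathrm{Specificity}}{2}
$$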
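The evaluation metrics can be computed from the four confusion matrix counts as in the following Python sketch. The function name `evaluate` is hypothetical, and the score is taken as the plain average of sensitivity and specificity, as described above.

```python
def evaluate(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy and their averaged score.

    The abnormal class is treated as the positive class.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    score = (sensitivity + specificity) / 2
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "score": score}


# Hypothetical degenerate classifier that labels everything as normal,
# using the recording counts of the dataset (2249 normal, 619 abnormal).
all_normal = evaluate(tp=0, tn=2249, fp=0, fn=619)
```

In this hypothetical case the accuracy is about 0.78 even though no abnormal recording is detected, while the averaged score is 0.5, which illustrates why accuracy alone is misleading on imbalanced data.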

A. Experimental Verification
Each PCG record was preprocessed using a Butterworth band-pass filter of order 3 with corner frequencies of 15 and 800 Hz. Each record was further enhanced using spectral subtraction denoising with a weight of 0.5. Segmenting the records into cardiac cycles yielded 79492 cardiac cycles. The longest cardiac cycle observed in all PCG recordings is about 2 s long, so each shorter cycle was zero-padded to a length of 2 s. For nonlinear dynamic feature extraction, the first step is the reconstruction of the phase space from the heart sound measurements with embedding dimension m = 18 and a time delay of 10. A total of 207 features were considered for each cardiac cycle. All feature values were normalized to the interval [0, 1] in order to optimize the classification process. The performances of several conventional classifiers, including SVM and KNN, and of the bagged trees, subspace KNN, and RUSBoosted tree ensemble classifiers, were compared.
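Normalization to the interval [0, 1] is commonly done with per-feature min-max scaling, sketched below in Python. The text does not say which normalization was used or on which data the minimum and maximum were computed; fitting them on the training set only is an assumption of this sketch, and the function names are hypothetical.

```python
import numpy as np


def minmax_fit(X_train):
    """Per-feature minimum and range computed on the training data."""
    lo = X_train.min(axis=0)
    hi = X_train.max(axis=0)
    # Constant features get a range of 1 to avoid division by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return lo, span


def minmax_apply(X, lo, span):
    """Scale features with a previously fitted minimum and range."""
    return (X - lo) / span
```

Training rows map exactly into [0, 1]; test rows scaled with the training statistics can fall slightly outside that interval.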

B. Results
The classification results with the 5-fold cross-validation train-test approach are listed in Table I. With unbalanced data, the cross-validation train-test approach is effective in reducing over-fitting and improving model evaluation. It should be noted that in our experiments only the ensemble classifiers were able to achieve relatively balanced sensitivity and specificity values. The RUSBoosted Tree ensemble classifier achieved the highest score, with a sensitivity of 0.90 and a specificity of 0.91. On the other hand, the SVM classifier with a Gaussian kernel had the lowest sensitivity of 0.64. The Bagged tree ensemble classifier and the KNN classifier with weighted distance metric both achieved the highest specificity of 0.95, while the majority of commonly used classifiers achieved a lower specificity of 0.91.
The classification results with the local holdout train-test approach are listed in Table II. Again, only the ensemble classifiers achieved relatively balanced sensitivity and specificity values in our experiments. The RUSBoosted Tree ensemble classifier achieved the highest score of 0.91, with balanced sensitivity and specificity of 0.90 and 0.92. Conversely, the SVM classifier with a Gaussian kernel showed both the lowest sensitivity, at 0.65, and the lowest specificity, at 0.79. The Bagged tree ensemble classifier and the KNN classifier with weighted distance metric both achieved the highest specificity of 0.95.

IV. DISCUSSION
This research demonstrates that nonlinear dynamic features are applicable to phonocardiogram analysis. Heart sounds exhibit chaotic and complex behavior when murmurs are present, hence they contain nonlinear and non-Gaussian information. Nonlinear dynamic approaches can deal with a signal's nonlinearity and non-Gaussianity and project its dynamic behavior. The proposed features are based on nonlinear dynamical modeling of the heart sound signals, which is frequently used to characterize the underlying dynamics of the heart. Applying nonlinear dynamic approaches is also helpful for determining whether murmurs are present.
In this paper, the RUSBoosted Tree ensemble classifier scored the highest, with a sensitivity of 0.90 and a specificity of 0.92, resulting in an overall score of 0.91. It thus achieved the best score under both train-test approaches, together with relatively balanced values for sensitivity and specificity. The classifier performance evaluations in the two train-test approaches were largely comparable. The significant improvements in classification performance obtained with nonlinear dynamic features and conventional classifiers demonstrate the ability of the proposed system to handle phonocardiogram (PCG) recordings of varying type and signal quality.
Observing the effect of the training data's class distribution is essential, as it is a significant factor in determining the accuracy of subsequent classification. Although the challenge database contains many samples, these samples are highly unbalanced, with vastly different proportions of normal and abnormal recordings. Because of this, traditional classifiers frequently develop a bias in favor of the majority class, increasing the misclassification rate for the minority class. As a result, two validation approaches were used to assess the performance of various classifiers on the dataset. The first is the cross-validation train-test method, which reduces the influence of over-fitting caused by noisy records. The second, the local holdout method, simulates real-world analysis by using just 80% of the data to develop the model and the remaining 20% to assess it. Given that the dataset is significantly imbalanced, only the ensemble classifiers performed very well in the proposed system. This is consistent with the general recommendation of ensemble classifiers for imbalanced data [19,26].
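One reason an ensemble such as the RUSBoosted tree copes with class imbalance is that RUSBoost, as generally described, randomly undersamples the majority class before each weak learner is trained. The Python sketch below shows only that undersampling step, not the boosting loop or the configuration used in this study; the function name and the choice to balance the classes exactly are assumptions.

```python
import numpy as np


def random_undersample(X, y, seed=0):
    """Randomly drop rows so that every class has as many rows as the rarest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    # Draw n_min row indices per class without replacement.
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    keep.sort()
    return X[keep], y[keep]
```

In a boosting setting, a fresh balanced subset would be drawn in every round, so different majority-class samples contribute to different weak learners.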

V. CONCLUSION
In this paper, a new classification approach was proposed for heart sounds, in which the proposed features are based on nonlinear dynamics. The proposed methodology and its implementation were described in detail. The results of the experimental verification show a potential to overcome challenges encountered during heart sound classification under different settings. The proposed method provided well-balanced values for sensitivity and specificity, as well as the best score under both train-test methodologies.
TP, TN, FP, and FN are the confusion matrix entries representing True Positive, True Negative, False Positive, and False Negative classifications, respectively.