
Thesis Title: Parkinson’s Disease Detection Using Multimodal Data-Driven Approach

Course Code: BME 4000


Course Name: Thesis/Project

A report submitted to the Department of Biomedical Engineering in the Fourth Year Second Semester

Supervised by:
Dr. Mohiuddin Ahmad
Professor
Department of Electrical and Electronic Engineering

Submitted by:
Md. Ariyan Sarker
Roll: 1815021
Year: 4th, Term: 2nd
Department of Biomedical Engineering

Department of Biomedical Engineering


Khulna University of Engineering & Technology
Khulna-9203, Bangladesh

ABSTRACT

Parkinson's disease (PD) stands as the second most prevalent neurological disorder, lacking a
specific medical exam for definitive diagnosis. This research tackles PD identification using
diverse voice data collected via two channels: Smart Phone (SP) and Acoustic Cardioid (AC).
Each channel yielded four distinct data modalities: sustained phonation (P), regular speech (S),
voiced (V), and unvoiced (U). The study's contributions are twofold. Firstly, it explores the
most effective data modality and features offering greater insight into PD. Secondly, it
introduces a Multi-Modal Data–Driven Ensemble (MMDD-Ensemble) approach for PD
detection.

The MMDD-Ensemble operates in two tiers. The first level establishes various base classifiers
rooted in multimodal voice data. At the second level, predictions from the base classifiers are
combined using blending and voting techniques. To affirm the proposed method's robustness,
accuracy served as the validation metric. Notably, the proposed method outperformed the
optimal unimodal framework's best outcomes across key evaluation metrics, namely accuracy
and AUC. Additionally, it exhibited parity with other cutting-edge ensemble models.
Experimental findings revealed that the proposed multimodal approach achieved a 92%
accuracy rate. These results exhibit promise in contrast to recent reports on PD detection
through multimodal voice data.

Keywords:

PD Parkinson’s Disease
CNS Central Nervous System
PWP People with Parkinsonism
TQWT Tunable Q-factor Wavelet Transform
ELM Extreme Learning Machine
fMRI Functional Magnetic Resonance Imaging
DTI Diffusion Tensor Imaging
ADHD Attention-deficit/Hyperactivity Disorder
CNN Convolutional Neural Networks
MMDD Multi-Modal Data-Driven
AC Acoustic Cardioid
SP Smart Phone
SVM Support Vector Machine
LDA Linear Discriminant Analysis
KNN K-Nearest Neighbor
GNB Gaussian Naïve Bayes
LR Logistic Regression
DTC Decision Tree Classifier
ROC Receiver Operating Characteristic

Table of Contents

Title Page
Abstract
Keywords
Table of Contents
Index
List of Tables
List of Illustrations

Index

CHAPTER 1 Introduction
1.1 Introduction
1.2 Objectives
CHAPTER 2 Background Study
2.1 Related Works
2.2 Motivation
CHAPTER 3 Methodology
3.1 Dataset Description
3.2 Machine Learning Models
3.2.1 SVM
3.2.2 LDA
3.2.3 LR
3.2.4 DTC
3.2.5 GNB
3.2.6 KNN
CHAPTER 4 Progress
4.1 Progress
CHAPTER 5 Results & Discussion
5.1 Results
5.2 Discussion
CHAPTER 6 Conclusion & Future Plan
6.1 Conclusion
6.2 Future Plan
References

LIST OF TABLES
Table No.  Description
4.1  Phonation (AC Channel)
4.2  Speech (AC Channel)
4.3  Unvoiced (AC Channel)
4.4  Voiced (AC Channel)
4.5  Speech (SP Channel)
4.6  Phonation (SP Channel)
4.7  Unvoiced (SP Channel)
4.8  Voiced (SP Channel)
5.1  Summary
5.2  Voting
6.1  Future Plan

LIST OF ILLUSTRATIONS
Figure No.  Description
3.1  Flowchart of the work

Chapter 1

Introduction
1.1 Introduction
Parkinson's disease (PD) is a degenerative condition affecting the central nervous system (CNS),
impacting around 6.3 million individuals worldwide across diverse genders, ethnicities, and
societies. It leads to partial or complete loss of functions such as speech, motor reflexes, and
cognitive and behavioral processes. Back in 1817, Dr. James Parkinson initially characterized and
named this disorder. Common PD symptoms encompass rest tremors, rigidity, bradykinesia,
postural instability, visual challenges, dementia, memory decline, and cognitive function
disturbances involving reasoning and judgment. Notably, individuals with Parkinsonism (PWP)
experience speech-related impairments like dysphonia (flawed voice use), hypophonia (reduced
volume), monotone (limited pitch range), and dysarthria (difficulty articulating sounds or
syllables).

In recent times, the utilization of voice data for detecting PD has garnered substantial attention
for several reasons. Firstly, vocal abnormalities are suggested to manifest as some of the earliest
signs of the disease. Secondly, it's asserted that almost 90% of individuals with Parkinsonism
exhibit voice-related issues. Thirdly, leveraging voice data for PD detection allows for remote
diagnosis of the condition. As of now, there exists no blood test or laboratory examination to
definitively diagnose PD cases. Hence, the development of machine learning-based automated
learning systems is imperative to offer an effective means of assessing the ailment.

1.2 Objectives
The objectives of this work are:

• To explore the optimal data modality and features carrying the most information about Parkinson’s disease (PD).
• To propose a multimodal data-driven ensemble (MMDD-Ensemble) approach for PD detection.
• To apply various machine-learning algorithms and determine which one detects PD most efficiently.
• To achieve better detection accuracy for PD than previous works.
• To provide an overall more trustworthy detection system for PD.

Chapter 2

Background Study
2.1 Related Works
Within the realm of literature, various investigations have been undertaken to automate the
detection of Parkinson's disease (PD) based on voice and speech data. Little et al. introduced
a technique to assess PD by measuring dysphonia in vowel "a" phonation data from a group
of 31 individuals, yielding a 91% accuracy [1]. More recently, Sakar et al. conducted a
comparative analysis of different feature extraction methods for PD identification using
replicated vowel "a" phonation data. They demonstrated that tunable Q-factor wavelet
transform (TQWT) and Mel-frequency cepstral coefficients carry complementary
information about PD [2]. Vaiciukynas et al. gathered multifaceted voice and speech data for
PD detection, encompassing four data modalities and 18 feature sets. Their efforts resulted in
a 79% accuracy for PD detection [3]. Almeida et al. employed multimodal voice and speech
data, investigating the viability of diverse machine learning techniques across the 18 extracted
feature sets. They achieved the highest PD detection accuracy of 94% [4].

In multimodal learning, Kassani et al. introduced a multimodal sparse extreme learning
machine (ELM) classifier for predicting adolescent brain age [5]. Their method outperformed
conventional ELM and sparse Bayesian learning ELM in terms of classification accuracy.
Luo et al. ventured into multimodal neuroimaging (fMRI, DTI, sMRI) data-based prediction
for attention-deficit/hyperactivity disorder (ADHD) [6]. Their findings favored a bagging
ensemble approach with SVM base classifiers. Kumar et al. posited that different
convolutional neural network (CNN) architectures learn varying semantic image
representations [7]. Leveraging this, they constructed an ensemble of fine-tuned CNNs for
medical image classification, which surpassed other established CNNs. Poria et al. tackled
sentiment analysis through a multimodal approach, utilizing audio, video, and textual data
modalities. Textual modality achieved the highest accuracy at 79.14%, while fusion of all
three modalities led to 87.89% accuracy [8]. More recently, Hao et al. proposed emotion
recognition based on visual and audio data, using a blending ensemble approach for their
fusion, which outperformed several state-of-the-art methods [9].

2.2 Motivation

In the proposed method, I explore the feasibility of these approaches for PD detection.
This study revolves around two main research questions. First, which data modality and
features offer superior insight into PD? Second, how can multiple modalities be combined
to enhance the accuracy of PD detection? Accordingly, this study makes a dual
contribution. It develops several machine learning models to investigate the optimal data
modality and the features that provide complementary PD information, and it introduces
the MMDD-Ensemble technique for enhanced PD detection. The proposed MMDD-Ensemble
operates across two levels: the first level comprises diverse base classifiers driven by
distinct voice data types (multimodalities), while the second level combines the
predictions of these base classifiers through a voting mechanism.

Chapter 3

Methodology
3.1 Dataset Description

The dataset contains a variety of extracted features. Some of them are:
• Amplitude Modulation
• Envelope Shape Statistics
• Loudness
• Flatness
The values of these features were characterized by the following parameters (illustrated in the sketch after this list):
• Loudness
• Flatness
• Mean
• Median
• Trimean
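To make the summary parameters concrete, here is a small illustrative Python sketch (not taken from the thesis); the toy values are assumptions. The trimean is Tukey's weighted quartile average (Q1 + 2·Q2 + Q3) / 4.

```python
# Sketch of the summary statistics named above on toy feature values.
import numpy as np

x = np.array([2.1, 3.4, 3.9, 4.2, 4.8, 5.5, 7.0])  # toy feature values (assumed)

q1, q2, q3 = np.percentile(x, [25, 50, 75])  # quartiles of the sample
print("mean:   ", x.mean())
print("median: ", np.median(x))
print("trimean:", (q1 + 2 * q2 + q3) / 4)    # Tukey's trimean
```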

3.2 Machine Learning Models

3.2.1 SVM

Decision Function:

f(x) = sign(wᵀx + b)

Objective Function:

Minimize: 1/2 ||w||² + C Σ max(0, 1 - yᵢ(wᵀxᵢ + b))

Dual Formulation:

Maximize: Σ αᵢ - 1/2 Σ Σ αᵢ αⱼ yᵢ yⱼ xᵢᵀxⱼ

Subject to: Σ αᵢ yᵢ = 0

0 ≤ αᵢ ≤ C for all i
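A minimal Python sketch of such a classifier is given below, assuming scikit-learn and synthetic stand-in data rather than the thesis dataset; the linear kernel, C value, and scaling step are illustrative choices, not the report's settings.

```python
# Sketch only: soft-margin SVM on synthetic data (not the thesis dataset).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=100, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# C weights the hinge-loss term C * Σ max(0, 1 - yᵢ(wᵀxᵢ + b)) in the objective.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
clf.fit(X_tr, y_tr)  # the dual problem is solved internally
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```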

3.2.2 LDA

Linear discriminant analysis models each class as a Gaussian distribution with its own mean vector μₖ but a covariance matrix Σ shared by all classes. A sample x is assigned to the class with the largest discriminant score.

Discriminant Function:

δₖ(x) = xᵀ Σ⁻¹ μₖ - 1/2 μₖᵀ Σ⁻¹ μₖ + ln πₖ

where πₖ is the prior probability of class k, estimated from the training data.

Decision Rule:

ŷ = argmaxₖ δₖ(x)

Because Σ is shared across classes, the resulting decision boundaries are linear in x.
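The following sketch, assuming scikit-learn and synthetic data, fits this discriminant; predict() returns the class with the largest score δₖ(x).

```python
# Sketch only: LDA on synthetic stand-in data (not the thesis dataset).
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=100, n_features=10, random_state=1)
lda = LinearDiscriminantAnalysis()  # assumes a shared covariance matrix
lda.fit(X, y)                       # estimates μₖ, Σ, and priors πₖ
print(lda.predict(X[:5]))           # class with the largest discriminant score
print(lda.predict_proba(X[:5]))     # corresponding posterior probabilities
```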

3.2.3 LR

The logistic regression model predicts the probability of the positive class (class 1) as follows:

P(Y = 1 | X) = σ(β₀ + β₁*X₁ + β₂*X₂ + ... + βₖ*Xₖ)

where σ(z) = 1 / (1 + e⁻ᶻ) is the sigmoid (logistic) function.

The probability P(Y = 0 | X) of the negative class (class 0) can be computed as:

P(Y = 0 | X) = 1 - P(Y = 1 | X)

To make binary predictions, you can set a threshold (usually 0.5) and classify instances as
follows:

- If P(Y = 1 | X) ≥ 0.5, classify as the positive class (Y = 1).

- If P(Y = 1 | X) < 0.5, classify as the negative class (Y = 0).
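Below is a from-scratch Python sketch of this prediction rule; the coefficients and the feature vector are hypothetical values chosen for illustration, not fitted parameters.

```python
# Sketch of the logistic regression prediction rule with illustrative weights.
import numpy as np

def sigmoid(z):
    # σ(z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + np.exp(-z))

beta0 = -1.0                   # intercept β₀ (hypothetical)
beta = np.array([0.8, -0.4])   # coefficients β₁, β₂ (hypothetical)
x = np.array([2.0, 1.0])       # one feature vector X (hypothetical)

p1 = sigmoid(beta0 + beta @ x)  # P(Y = 1 | X)
p0 = 1.0 - p1                   # P(Y = 0 | X)
y_hat = 1 if p1 >= 0.5 else 0   # threshold at 0.5
print(p1, p0, y_hat)
```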

3.2.4 DTC

Splitting Criteria for a Node:

At each decision node in the tree, you evaluate a feature (Xᵢ) against a threshold (θ) to
determine which branch to follow. The split can be represented as:

if (Xᵢ <= θ) then Left Subtree else Right Subtree

This condition is applied to the input features at each internal node of the tree; the (feature, threshold) pair for each split is typically chosen to minimize an impurity measure such as the Gini index.

Leaf Nodes and Class Assignments:

When you reach a leaf node in the tree, you make a class assignment. This can be represented
as:

Class = Cᵢ

Here, Cᵢ represents the class label assigned to the leaf node.
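A short scikit-learn sketch follows; the synthetic data and the max_depth setting are assumptions, and export_text prints the learned "Xᵢ <= θ" splits and leaf classes described above.

```python
# Sketch only: a decision tree classifier on synthetic data.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=100, n_features=4, random_state=2)
tree = DecisionTreeClassifier(max_depth=3, random_state=2)  # Gini impurity by default
tree.fit(X, y)
print(export_text(tree))  # one "feature_i <= threshold" test per node, a class per leaf
```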

3.2.5 GNB

The classifier is based on Bayes' theorem:

P(C=c | X) = (P(X | C=c) * P(C=c)) / P(X)

1. P(C=c) (Class Prior):
- The prior probability of class c, estimated from the class frequencies in the training data.
2. P(X | C=c) (Likelihood):
- The Gaussian probability density function (PDF) is used to compute P(X | C=c) for each feature, assuming the features are conditionally independent:
P(Xᵢ | C=c) = (1 / sqrt(2 * π * σ²ᵢ)) * exp(-(Xᵢ - μᵢ)² / (2 * σ²ᵢ))
3. P(X) (Total Probability of Observing X):
- P(X) is a normalization factor that ensures that the sum of probabilities over all classes equals 1. It can be computed as:
P(X) = Σ (P(X | C=c) * P(C=c))
summing over all possible classes.
4. P(C=c | X) (Posterior Probability):
- P(C=c | X) represents the posterior probability that the instance with feature vector X belongs to class c. This is what GNB calculates for classification.
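The following from-scratch sketch evaluates these four quantities for two classes and a single feature; the priors, means, and variances are hypothetical numbers chosen only for illustration.

```python
# Sketch of the Gaussian Naive Bayes posterior for one feature, two classes.
import numpy as np

def gaussian_pdf(x, mu, var):
    # P(Xᵢ | C=c) under a Gaussian with mean mu and variance var
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

priors = {0: 0.5, 1: 0.5}                 # P(C=c), hypothetical
params = {0: (0.0, 1.0), 1: (2.0, 1.5)}   # per-class (mean, variance), hypothetical

x = 1.2  # observed feature value
joint = {c: gaussian_pdf(x, mu, var) * priors[c]   # P(X | C=c) * P(C=c)
         for c, (mu, var) in params.items()}
evidence = sum(joint.values())                     # P(X), normalization factor
posteriors = {c: j / evidence for c, j in joint.items()}
print(posteriors, "->", max(posteriors, key=posteriors.get))
```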

3.2.6 KNN

1. Calculate the distance: compute the distance from the query point X to every training point, e.g. the Euclidean distance (written here for two dimensions):
d = sqrt((X₁ - X₂)² + (Y₁ - Y₂)²)
2. Sort by distance: sort the data points in D by their distance to the query point X in ascending order.
3. Select the nearest neighbors: take the top k data points from the sorted list. These are the k nearest neighbors of X.

For classification (k-NN classification):
- If k = 1, assign the class label of the single nearest neighbor to X as the predicted class.
- If k > 1, assign the class with the majority vote among the k nearest neighbors as the predicted class.

For regression (k-NN regression):
- If k = 1, assign the target value of the single nearest neighbor to X as the predicted target value.
- If k > 1, assign the mean (average) of the target values of the k nearest neighbors as the predicted target value.
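A from-scratch sketch of the classification branch is given below, with toy two-dimensional points as an assumed stand-in for real features.

```python
# Sketch of k-NN classification: distances, sort, top k, majority vote.
import numpy as np
from collections import Counter

D = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])  # toy training points
labels = np.array([0, 0, 1, 1])
query = np.array([1.2, 1.9])  # the query point X
k = 3

dists = np.sqrt(((D - query) ** 2).sum(axis=1))  # Euclidean distance to each point
nearest = np.argsort(dists)[:k]                  # indices of the k nearest neighbors
vote = Counter(labels[nearest]).most_common(1)[0][0]  # majority vote
print("predicted class:", vote)
```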

The overall flow diagram of the proposed work is shown in Fig. 3.1.

[Figure 3.1 is a flowchart with the following stages: Data Acquisition → two channels (AC Channel, SP Channel) → four modalities per channel (Phonation, Speech, Unvoiced, Voiced) → extraction of 18 feature sets per modality (listed below) → six classifiers (SVM, LDA, KNN, GNB, LR, DTC) applied to every feature set → finding the dataset with the best accuracy for every modality in the two channels (8 in total) → voting between these datasets → final accuracy.]

The 18 feature extraction methods appearing in the flowchart are:

No.  Feat. ext. method          Abbreviation  Tool/Study
1    Avec2011                   AV1           OpenSMILE toolkit, Eyben et al. (2013)
2    Avec2013                   AV2           OpenSMILE toolkit, Eyben et al. (2013)
3    Emo_large                  EL            OpenSMILE toolkit, Eyben et al. (2013)
4    Emobase                    EM1           OpenSMILE toolkit, Eyben et al. (2013)
5    Emobase2010                EM2           OpenSMILE toolkit, Eyben et al. (2013)
6    Essentia_descriptors       ED            Essentia, Bogdanov et al. (2013)
7    IS09_emotion               IS1           OpenSMILE toolkit, Eyben et al. (2013)
8    IS10_paraling              IS2           OpenSMILE toolkit, Eyben et al. (2013)
9    IS10_paraling_compat       IS3           OpenSMILE toolkit, Eyben et al. (2013)
10   IS11_speaker_state         IS4           OpenSMILE toolkit, Eyben et al. (2013)
11   IS12_speaker_trait         IS5           OpenSMILE toolkit, Eyben et al. (2013)
12   IS12_speaker_trait_compat  IS6           OpenSMILE toolkit, Eyben et al. (2013)
13   IS13_ComPare               IS7           OpenSMILE toolkit, Eyben et al. (2013)
14   jAudio_features            JA            jAudio, McEnnis et al. (2006)
15   KTU_features               KTU           KTU (2009)
16   MPEG7_descriptors          MP            MPEG7AudioEnc, Crysandt et al. (2004)
17   Tsanas                     TS            Tsanas (2012)
18   YAAFE_features             YA            YAAFE toolbox, Mathieu et al. (2010)

Fig 3.1: Flowchart of the work
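The search implied by this flowchart can be sketched in Python as follows, for a single modality: all six classifiers are run over every feature set and the best (feature set, model) pair is kept. The synthetic feature_sets dictionary and the 5-fold cross-validation protocol are assumptions standing in for the 18 extracted feature sets and the report's actual evaluation split.

```python
# Sketch of the per-modality grid search over 18 feature sets x 6 models.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# stand-in for the 18 feature sets extracted from one modality
feature_sets = {f"FS{i + 1}": make_classification(n_samples=75, n_features=30,
                                                  random_state=i)
                for i in range(18)}

models = {
    "SVM": SVC(),
    "LDA": LinearDiscriminantAnalysis(),
    "KNN": KNeighborsClassifier(),
    "GNB": GaussianNB(),
    "LR": LogisticRegression(max_iter=1000),
    "DTC": DecisionTreeClassifier(random_state=0),
}

best = ("", "", 0.0)
for fs_name, (X, y) in feature_sets.items():
    for m_name, model in models.items():
        acc = cross_val_score(model, X, y, cv=5).mean()  # mean CV accuracy
        if acc > best[2]:
            best = (fs_name, m_name, acc)
print(best)  # the analogue of the bold entry in one of Tables 4.1-4.8
```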

Chapter 4

Progress
4.1 Progress

In this work, results for all eight modality-channel combinations are reported in
Tables 4.1 to 4.8. In each table, the six machine learning models form the columns and
the 18 feature extraction methods form the rows; each cell is the detection accuracy (%).
The best accuracy of each table is shown in bold.

Table 4.1: Phonation (AC Channel)


Feat SVM LDA KNN GNB LR DTC
1 55.41 70.27 60.81 66.22 68.92 64.86
2 60.81 67.57 58.11 62.16 37.84 72.97
3 63.51 77.03 52.7 44.59 72.97 81.08
4 63.51 79.73 66.22 63.51 63.51 59.46
5 75.68 71.62 75.68 54.05 68.92 70.27
6 67.57 52.7 60.81 35.14 68.92 70.27
7 59.46 59.46 58.11 63.51 48.65 66.22
8 64.86 71.62 70.27 55.41 60.81 70.27
9 62.16 67.57 58.11 64.86 58.11 60.81
10 60.81 72.97 62.16 62.16 55.41 59.46
11 66.22 66.22 58.11 63.51 41.89 74.32
12 67.57 70.27 63.51 66.22 51.35 62.16
13 63.51 74.32 60.81 66.22 54.05 71.62
14 56.76 67.57 68.92 51.35 79.73 79.73
15 68.92 56.76 64.86 41.89 64.86 72.97
16 68.92 55.41 67.57 63.51 70.27 72.97
17 59.46 50 59.46 67.57 68.92 71.62
18 64.86 83.78 63.51 77.03 68.92 72.97

Table 4.2: Speech (AC Channel)

Feat SVM LDA KNN GNB LR DTC


1 68 76 56 68 52 60
2 72 60 52 72 24 68
3 76 76 64 76 24 56
4 44 60 56 68 56 56
5 56 56 64 56 56 64
6 64 76 72 60 68 68
7 64 56 52 68 72 56
8 64 56 76 72 60 60
9 68 64 64 64 68 44
10 80 84 56 40 72 68
11 44 52 44 44 56 64
12 60 60 72 36 56 68
13 48 60 48 44 60 60
14 68 76 64 56 72 56
15 64 72 44 64 48 48
16 56 80 56 72 72 56
17 56 64 40 52 56 60
18 56 68 56 84 64 60

Table 4.3: Unvoiced (AC Channel)

Feat SVM LDA KNN GNB LR DTC


1 68 84 72 32 68 56
2 60 80 48 32 60 84
3 56 68 56 56 44 52
4 60 52 60 44 60 48
5 60 68 80 68 80 48
6 72 80 68 88 84 68
7 72 68 52 60 56 76
8 48 52 60 60 48 44
9 44 48 44 60 44 52
10 56 56 56 44 52 60
11 68 68 68 32 56 64
12 76 84 44 48 48 48
13 52 52 40 40 52 60
14 68 56 72 60 60 68
15 76 64 60 64 72 52
16 64 76 76 60 64 60
17 80 64 72 40 76 64
18 72 60 64 48 56 68

Table 4.4: Voiced (AC Channel)

Feat SVM LDA KNN GNB LR DTC


1 64 64 60 48 52 64
2 68 80 48 72 52 52
3 56 56 60 52 44 64
4 60 64 52 64 72 44
5 56 64 64 56 60 68
6 76 76 68 72 68 64
7 68 48 52 72 72 56
8 76 72 72 60 64 60
9 48 52 36 60 48 48
10 60 72 56 48 48 64
11 64 56 48 56 52 56
12 68 76 60 80 76 52
13 64 56 48 68 64 52
14 56 68 56 40 52 44
15 64 76 64 60 76 72
16 68 52 52 52 52 44
18 76 84 76 76 68 76

Table 4.5: Speech (SP Channel)

Feat SVM LDA KNN GNB LR DTC


1 76 56 72 56 48 56
2 56 80 48 44 56 56
3 76 68 64 48 60 64
4 72 88 72 76 48 64
5 68 76 68 76 68 56
6 64 64 52 40 72 76
7 64 72 48 80 68 72
8 52 68 56 64 60 56
9 56 76 52 64 68 52
10 68 64 68 56 52 48
11 72 76 56 44 44 52
12 72 80 80 40 60 56
13 56 76 48 44 52 60
14 64 72 80 76 72 64
15 64 60 60 44 44 56
16 60 80 72 48 56 64
17 60 76 56 60 52 56
18 60 88 68 60 64 64

Table 4.6: Phonation (SP Channel)

Feat SVM LDA KNN GNB LR DTC


1 68 72 64 65.33 68 74.67
2 68 76 65.33 68 68 80
3 69.33 80 62.67 50.67 66.67 70.67
4 58.67 78.67 48 69.33 52 68
5 68 77.33 76 56 73.33 61.33
6 65.33 70.67 61.33 37.33 64 65.33
7 65.33 66.67 61.33 69.33 66.67 69.33
8 74.67 74.67 72 61.33 68 56
9 74.67 81.33 74.67 62.67 58.67 70.67
10 68 77.33 57.33 33.33 68 72
11 64 74.67 57.33 34.67 36 68
12 61.33 80 64 40 37.33 72
13 74.67 70.67 64 25.33 25.33 70.67
14 60 73.33 66.67 48 73.33 80
15 69.33 69.33 65.33 42.67 62.67 70.67
16 64 69.33 58.67 65.33 65.33 74.67
17 61.33 69.33 57.33 60 56 80
18 57.33 77.33 62.67 64 66.67 64

Table 4.7: Unvoiced (SP Channel)

Feat SVM LDA KNN GNB LR DTC


1 64 76 72 32 40 52
2 88 76 60 80 88 64
3 60 60 48 40 60 52
4 56 84 64 76 68 68
5 72 68 60 68 80 56
6 56 64 60 64 68 56
7 76 56 56 40 76 68
8 56 72 60 60 68 64
9 60 56 76 68 60 64
10 64 68 52 60 52 68
11 72 64 48 68 76 52
12 64 60 40 24 48 52
13 84 76 52 52 56 64
14 72 72 80 68 72 76
15 56 68 60 52 60 60
16 56 72 52 68 56 44
17 60 64 60 56 56 76
18 72 72 68 24 64 64

Table 4.8: Voiced (SP Channel)

Feat SVM LDA KNN GNB LR DTC


1 68 64 56 56 48 60
2 68 72 60 72 64 64
3 76 76 56 40 64 56
4 60 84 64 72 64 64
5 76 80 68 68 80 68
6 76 64 56 44 56 68
7 72 52 64 64 64 68
8 64 72 64 68 68 68
9 56 60 52 64 64 64
10 76 80 52 24 80 72
11 56 68 52 56 44 56
12 72 32 68 44 76 44
13 52 64 44 48 52 76
14 72 80 68 60 60 68
15 72 68 56 44 52 56
16 64 56 76 60 56 60
17 80 64 48 44 64 52
18 68 52 44 48 52 44

Chapter 5

Results & Discussion


5.1 Results
A summary of the best accuracies for the eight modality-channel combinations is given in
Table 5.1. The results of the voting classifiers are shown in Table 5.2.
Table 5.1: Summary
Modality Channel Feat. Set Model Accuracy(%)
P AC 18 LDA 83.78
S AC 18 GNB 84
U AC 6 GNB 88
V AC 18 LDA 84
P SP 9 LDA 81.33
S SP 4 LDA 88
U SP 2 SVM 88
V SP 4 LDA 84

Table 5.2: Voting

Channel Fused Modalities Method Accuracy (%)


AC U+S+P Voting 84
AC V+U+S Voting 84
AC V+U+S+P Voting 88
SP S+P Voting 88
SP V+S Voting 92
SP V+U+S Voting 92
SP V+U+S+P Voting 88

5.2 Discussion

These results were obtained with Python code: each modality was processed and evaluated
separately, and for the voting experiments the predictions of different modalities were
fused before measuring the accuracy.
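A minimal sketch of the fusion step is shown below; the 0/1 prediction arrays are hypothetical outputs of three modality-specific base classifiers for five test subjects, not the experiment's actual predictions.

```python
# Sketch of hard majority voting over per-modality predictions.
import numpy as np

preds = np.array([
    [1, 0, 1, 1, 0],  # best model trained on voiced (V) data, hypothetical
    [1, 1, 1, 0, 0],  # best model trained on unvoiced (U) data, hypothetical
    [0, 0, 1, 1, 0],  # best model trained on speech (S) data, hypothetical
])
y_true = np.array([1, 0, 1, 0, 0])  # hypothetical ground-truth labels

# a subject is labelled PD (1) if more than half of the modalities vote 1
fused = (preds.sum(axis=0) * 2 > preds.shape[0]).astype(int)
print(fused)                          # -> [1 0 1 1 0]
print("accuracy:", (fused == y_true).mean())
```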
Chapter 6

Conclusion & Future Plan
6.1 Conclusion
This work shows that Parkinson’s disease can be detected with good accuracy using a
multimodal approach. Because combining modalities yields better accuracy than any single
one, voting classifiers were used to fuse the different modalities, and this fusion
increased the overall accuracy. The accuracy may be increased further by a blending
approach, in which a machine learning model, rather than a simple vote, combines the base
predictions; a sketch of this idea is given below.
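As a sketch of that blending idea (an assumed illustration, not the planned implementation), a level-2 logistic regression meta-learner can be trained on the base classifiers' held-out predictions instead of a vote:

```python
# Sketch of blending: a meta-learner trained on base-classifier predictions.
import numpy as np
from sklearn.linear_model import LogisticRegression

# rows = held-out subjects, columns = per-modality base predictions (hypothetical)
base_preds = np.array([
    [1, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 0],
])
y_true = np.array([1, 0, 1, 1, 0, 0])  # hypothetical labels

meta = LogisticRegression().fit(base_preds, y_true)  # learns how to weigh modalities
print(meta.predict(base_preds))
```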

6.2 Future Plan


The future plan of this work is shown in Table 6.1.
Table 6.1: Future Plan

Work                             Weeks
Feature selection for datasets   1-4
Blending between datasets        5-7
ROC curve & sensitivity test     8-10
Improvement of overall accuracy  11-13
Cross-checking & final result    14-18
Final testing & writing report   19-21

References:

[1] Little, M. A., McSharry, P. E., Hunter, E. J., Spielman, J., and Ramig, L. O. (2009). Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. IEEE Trans. Biomed. Eng. 56, 1015–1022.
[2] Sakar, C. O., Serbes, G., Gunduz, A., Tunc, H. C., Nizam, H., Sakar, B. E., et al. (2019). A comparative analysis of speech signal processing algorithms for Parkinson’s disease classification and the use of the tunable Q-factor wavelet transform. Appl. Soft Comput. 74, 255–263.
[3] Vaiciukynas, E., Verikas, A., Gelzinis, A., and Bacauskiene, M. (2017). Detecting Parkinson’s disease from sustained phonation and speech signals. PLoS ONE 12:e0185613.
[4] Almeida, J. S., Rebouças Filho, P. P., Carneiro, T., Wei, W., Damaševičius, R., Maskeliūnas, R., et al. (2019). Detecting Parkinson’s disease with sustained phonation and speech signals using machine learning techniques. Pattern Recogn. Lett. 125, 55–62.
[5] Kassani, P. H., Gossmann, A., and Wang, Y.-P. (2019). Multimodal sparse classifier for adolescent brain age prediction. IEEE J. Biomed. Health Inform.
[6] Luo, Y., Alvarez, T. L., Halperin, J. M., and Li, X. (2020). Multimodal neuroimaging-based prediction of adult outcomes in childhood-onset ADHD using ensemble learning techniques. Neuroimage Clin. 26:102238.
[7] Kumar, A., Kim, J., Lyndon, D., Fulham, M., and Feng, D. (2016). An ensemble of fine-tuned convolutional neural networks for medical image classification. IEEE J. Biomed. Health Inform. 21, 31–40.
[8] Poria, S., Peng, H., Hussain, A., Howard, N., and Cambria, E. (2017). Ensemble application of convolutional neural networks and multiple kernel learning for multimodal sentiment analysis. Neurocomputing 261, 217–230. doi: 10.1016/j.neucom.2016.09.117
[9] Hao, M., Cao, W.-H., Liu, Z.-T., Wu, M., and Xiao, P. (2020). Visual-audio emotion recognition based on multi-task and ensemble learning with multiple features. Neurocomputing.
[10] Ahmad, F. S., Ali, L., Khattak, H. A., Hameed, T., Wajahat, I., Kadry, S., et al. (2021). A hybrid machine learning framework to predict mortality in paralytic ileus patients using electronic health records (EHRs). J. Ambient Intell. Human. Comput. 12, 3283–3293.
[11] Ali, L., and Bukhari, S. (2020). An approach based on mutually informed neural networks to optimize the generalization capabilities of decision support systems developed for heart failure prediction. IRBM 42, 345–352.
[12] Ali, L., Khan, S. U., Arshad, M., Ali, S., and Anwar, M. (2019a). “A multimodel framework for evaluating type of speech samples having complementary information about Parkinson’s disease,” in 2019 International Conference on Electrical, Communication, and Computer Engineering (ICECCE) (IEEE), 1–5.
[13] Ali, L., Zhu, C., Golilarz, N. A., Javeed, A., Zhou, M., and Liu, Y. (2019b). Reliable Parkinson’s disease detection by analyzing handwritten drawings: construction of an unbiased cascaded learning system based on feature selection and adaptive boosting model. IEEE Access 7, 116480–116489.
