Computational Framework For Heart Disease Prediction Using Deep Belief Neural Network With Fuzzy Logic
In biomedical diagnosis, the information provided by patients may include redundant and interrelated symptoms and signs, especially when a patient suffers from more than one type of disease of the same category. The health care industry faces many challenges in reducing the severity of a patient's condition and in detecting it earlier and more effectively. The World Health Organization

This paper is organized as follows: Section 2 describes the material and methods, Section 3 presents the proposed method for heart disease prediction, Section 4 discusses the experimental results and their analysis, and Section 5 concludes the current work.
International Journal of Computer Applications (0975 – 8887)
Volume 181 – No. 49, April 2019
2. MATERIAL AND METHODS
Extracting useful hidden knowledge from large amounts of data, and acting on it, is a challenging task because data volumes grow enormously day by day. The need to understand large, complex, information-rich data sets has increased across technology, business, medicine and science. Developing a novel Computer Based Information System (CBIS) for discovering potentially useful knowledge from data is necessary to increase the lifetime of the patient. The data mining techniques introduced so far for heart disease prediction are reviewed below.

In 2010, Anbarasi et al. [2] proposed an enhanced prediction model for heart disease through the hybridization of a genetic algorithm with Decision Tree for feature subset selection, genetic with Naive Bayes, and genetic with classification via clustering. The enhanced prediction was carried out using genetic-algorithm-based feature subset selection with 10 attributes. Prediction accuracy increased when classification via clustering was combined with the feature subset selection algorithms. Classification techniques such as Naive Bayes, Decision Tree and classification by clustering were used for prediction.

In 2011, Kumari et al. [3] applied data mining techniques such as SVM, ANN, Decision Tree and the RIPPER classifier to predict the risk of heart disease. The performance of these algorithms was analyzed using accuracy and error rate. The accuracies of RIPPER, Decision Tree, ANN and SVM were 81.08%, 79.05%, 80.06% and 84.12% respectively, whereas their error rates were 2.756, 0.2755, 0.2248 and 0.1588 respectively. The analysis showed that, of these four classification models, SVM predicted cardiovascular disease with the lowest error rate and the highest accuracy.

Sundar et al. [4] discussed the performance of the classification techniques Naive Bayes and Weighted Association Classifier (WAC) for heart disease prediction, evaluated using a classification matrix, which reveals the frequency of correct and incorrect predictions. The accuracies of WAC and Naive Bayes were 84% and 78% respectively, so WAC was 6 percentage points more accurate than Naive Bayes.

In 2012, Dangare et al. [1] used 13 medical input attributes such as sex, blood pressure and cholesterol for prediction. To obtain more appropriate results, two further attributes, obesity and smoking, were added. A Multilayer Perceptron Neural Network (MLPNN) with the backpropagation algorithm was used for heart disease prediction. The method yielded an accuracy of 99.25% for 13 attributes and nearly 100% for 15 attributes.

In 2012, Peter et al. [5] applied algorithms such as Naive Bayes, Multilayer, J48 and KNN and conducted experiments on a health care data set. Correlation-based Feature Selection and filter subset evaluation methods were adopted to remove irrelevant and redundant attributes, thereby increasing the performance of the classifiers. The accuracies of these algorithms were 85.18%, 78.88%, 88.18% and 85.55% respectively, which provides evidence of a significant improvement in the results.

In 2012, Shouman et al. [6] proposed a K-Nearest Neighbour (KNN) model for the prediction of heart disease. In this model, the value of K ranged between one and thirteen. KNN yielded accuracies between 94% and 97.4% for different values of K. With K = 7, the method achieved its maximum accuracy and specificity, 97.4% and 99% respectively.

In 2012, Nidhi et al. [7] analyzed the Naive Bayes, Decision Tree and Neural Network algorithms. The results showed that the neural network produced the maximum accuracy, nearly 100%, for 15 attributes. Decision Tree also performed well for the same number of attributes and yielded an accuracy of 99.62%. Moreover, Genetic Algorithm and Feature Subset Selection methods were used to reduce the number of attributes to six, which were then given as input to the classification models. With this reduced set, the Decision Tree model produced almost the same accuracy.

In 2012, Pethalakshmi et al. [8] applied algorithms such as Fuzzy Decision Tree, Fuzzy Naive Bayes, Fuzzy Neural Network and Fuzzy k-means to predict heart disease. The accuracies of these algorithms were 90.06%, 89.62%, 91.09% and 99.49% respectively. Fuzzy k-means yielded the best result, at least 8.4 percentage points higher than the others.

In 2013, Abishek [9] proposed J48 and Naive Bayes to predict heart disease and used the Weka tool to implement them. The accuracies of these algorithms were 95.56% and 92.42% respectively, so J48 was 3.14 percentage points more accurate than Naive Bayes.

In 2013, Chitra et al. [10] implemented algorithms such as Artificial Neural Network (ANN), K-Means Clustering and Fuzzy C Means Clustering to predict heart disease. The accuracies of these algorithms were 85%, 88% and 92% respectively. Fuzzy C Means Clustering yielded the best result, at least 4 percentage points higher than the others.

In 2013, Patel et al. [11] applied algorithms such as Decision Tree and Naive Bayes to predict heart disease. The reported accuracies were 99.2%, 96.5% and 88.3% respectively. Decision Tree yielded the best result, at least 2.7 percentage points higher than the others.

In 2013, Vikas [12] applied CART classification to predict heart disease. The accuracy of the algorithm was 84.49% and the time it consumed was 0.23 s.

In 2014, Waghulde et al. [13] proposed a model combining Neural Network and Genetic Algorithm for the prediction of heart disease. The accuracy of the system was 98%.

In 2014, Niranjana Devi et al. presented an Evolutionary-Fuzzy Expert System for the Diagnosis of Coronary Artery Disease (CAD), in which a fuzzy expert system with a Genetic Algorithm is proposed to diagnose CAD. The Genetic Algorithm is used to optimize the membership function parameters. The proposed system was validated on a CAD dataset and achieved an accuracy of 88.79%.

In 2014, Rupali et al. [15] proposed the classification-based algorithms Naive Bayes and Laplace smoothing to predict heart disease. The accuracies of these algorithms were 78% and 86% respectively, so classification using Laplace smoothing yielded the better result.

In 2014, Venkatalakshmi et al. [16] applied algorithms such as Naive Bayes and Decision Tree to predict heart disease. The accuracies of these algorithms were 85.03% and 84.01% respectively. Naive Bayes yielded the higher result, by 1.02 percentage points.
In 2015, D’Souza [17] applied three techniques, Artificial Neural Network, K Means Clustering and the Apriori Algorithm, to classify whether patients have heart disease, and analysed their performance. The results showed that Artificial Neural Networks outperformed the others.

In 2015, Adbar et al. [18] proposed the methods C5.0, Neural Network, SVM and KNN to predict the risk of heart disease. The accuracies of C5.0, Neural Network, SVM and KNN were 93.02%, 89.4%, 86.05% and 80.23% respectively. C5.0 performed best, with an accuracy at least 3.62 percentage points higher than the others.

In 2016, Patel et al. [19] applied the J48 and LMT algorithms to predict heart disease. The performance of these algorithms was analyzed in terms of accuracy and time complexity. The accuracies of J48 and LMT were 56.76% and 55.77% respectively, whereas the time consumed by these algorithms was 0.04 s and 0.39 s respectively.

In 2016, Rajalakshmi et al. [20] proposed Weighted Association Classifier (WAC) and K-Means Clustering to predict heart disease. The accuracies of these algorithms were 93.89% and 92.84% respectively. The combination of the two produced a maximum accuracy of 94.54%.

In 2016, Suganya et al. [21] proposed a CART classifier to predict heart disease. A minimum distance CART classifier was used to classify the data among various groups. The accuracy of the CART classifier was 83%.

From this literature survey, the following issues are identified in heart disease prediction using computational models on health care industry data:

• Prediction accuracy is low with a reduced number of attributes.
• Prediction takes more time.
• More false classifications.
• The choice of the kernel and other parameters.
• Limited ability to categorize correctly.
• Poor clinical decisions lead to mortality.
• Noise and missing values hinder the design of the classification model.
• Removing irrelevant or redundant attributes is a difficult task.
• Accuracy and speed of the model are achieved only to some extent.
• The time to construct the model (training time) is high.
• Lack of handling of noise and missing values.
• Interpretability, i.e. the level of understanding and insight provided by the model.
trained via the RBM by giving a new input as a value of hidden layer 1. As such, learning is sequential up to the last layer.

The supervised classification technique used with the DBN is the backpropagation algorithm, which is configured in the uppermost layer of the DBN. A classification prediction model using the neural network backpropagation-DBN was implemented for heart disease prediction.

Initially, the DBN built a learning model using a training set. Eight input variables were used (age, sex, chest_pain_type, Trestbps, chol, Fbs, Thalach, Ca) and one output variable. DBN training consisted of two phases. The first phase was the construction of the RBM network using unsupervised learning. The RBM parameters epoch, batch size and momentum were set to 1, 10 and 0.5 respectively. In the second phase, the RBM network was trained with the supervised backpropagation algorithm.

Table 1. Algorithm - Deep Belief Network

Deep Belief Network Algorithm for Classifying the Heart Disease Data Set
Input: Heart Disease Dataset
Output: Classified Heart Disease Dataset
Step 1. Collect the heart disease dataset from the Statlog heart disease database
Step 2. Separate into training and test sets using k-fold cross validation
Step 3. Define a Deep Belief Network structure
Step 4. Set parameter values and initialize weights
Step 5. Transform data to network inputs
Step 6. Start DBN training using greedy layer-wise deep training
Step 7. Construct an RBM with an input layer v and a hidden layer h, and train the RBM
Step 8. Stop and test
Step 9. Implementation: use the Deep Belief Network with new cases

For fuzzy sets, the membership function takes values in the interval [0, 1]. A value in this interval is called the membership level or degree of membership.

A fuzzy set A is defined as:

\[ A = \{ (x, \mu_A(x)) \mid x \in X,\ \mu_A(x) \in [0, 1] \} \tag{2} \]

where \mu_A(x) is a membership function taking values in the interval [0, 1].

3.1.2.1 Membership Functions
The membership function describes how the basic elements x belong to the fuzzy set, so it can be taken from a large class of functions. Reasonable choices are often piecewise-linear functions, such as triangular or trapezoidal functions.

Triangular Membership Function: Let a, b and c be the x-coordinates of the three vertices of \mu_A(x) in a fuzzy set A (a: lower limit and c: upper limit, where the degree of membership is zero; b: center, where the degree of membership is 1).

\[ \mu_A(x) = \begin{cases} 0 & \text{if } x \le a \\ \dfrac{x - a}{b - a} & \text{if } a < x \le b \\ \dfrac{c - x}{c - b} & \text{if } b < x < c \\ 0 & \text{if } x \ge c \end{cases} \tag{3} \]
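The triangular membership function of equation (3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation: the function name `triangular_membership` and the cholesterol break points (200, 240, 300) are hypothetical values chosen only to show the shape of the function.

```python
def triangular_membership(x: float, a: float, b: float, c: float) -> float:
    """Degree of membership of x in a triangular fuzzy set with vertices a < b < c."""
    if not a < b < c:
        raise ValueError("expected a < b < c")
    if x <= a or x >= c:
        return 0.0                      # outside the support of the fuzzy set
    if x <= b:
        return (x - a) / (b - a)        # rising edge between a and the center b
    return (c - x) / (c - b)            # falling edge between b and c


# Hypothetical fuzzy set for a cholesterol reading (break points are illustrative only).
for chol in (180, 220, 240, 270, 310):
    print(chol, triangular_membership(chol, a=200.0, b=240.0, c=300.0))
```

The degree is 0 at or below a and at or above c, rises linearly to 1 at the center b, and falls linearly afterwards, which matches the verbal description of the three vertices.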
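The greedy layer-wise training in Steps 6 and 7 of Table 1 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes Bernoulli RBMs trained with one step of contrastive divergence (CD-1), inputs scaled to [0, 1], a learning rate of 0.1 and hidden layer sizes of 6 and 4, none of which are given in the paper. Only the epoch count (1), batch size (10) and momentum (0.5) come from the text. The function names are hypothetical, and the supervised backpropagation phase is omitted.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_probs(v, W, b_h):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v)))) for j in range(len(b_h))]


def visible_probs(h, W, b_v):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b_v[i] + sum(h[j] * W[i][j] for j in range(len(h)))) for i in range(len(b_v))]


def train_rbm(data, n_hidden, epochs=1, batch_size=10, momentum=0.5, lr=0.1, seed=0):
    """Train one RBM with CD-1 and momentum; lr and seed are assumed values."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    b_v, b_h = [0.0] * n_visible, [0.0] * n_hidden
    vel_W = [[0.0] * n_hidden for _ in range(n_visible)]
    vel_v, vel_h = [0.0] * n_visible, [0.0] * n_hidden
    for _ in range(epochs):
        rows = data[:]
        rng.shuffle(rows)
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            g_W = [[0.0] * n_hidden for _ in range(n_visible)]
            g_v, g_h = [0.0] * n_visible, [0.0] * n_hidden
            for v0 in batch:
                h0 = hidden_probs(v0, W, b_h)                        # positive phase
                h0_sample = [1.0 if rng.random() < p else 0.0 for p in h0]
                v1 = visible_probs(h0_sample, W, b_v)                # reconstruction
                h1 = hidden_probs(v1, W, b_h)                        # negative phase
                for i in range(n_visible):
                    g_v[i] += v0[i] - v1[i]
                    for j in range(n_hidden):
                        g_W[i][j] += v0[i] * h0[j] - v1[i] * h1[j]
                for j in range(n_hidden):
                    g_h[j] += h0[j] - h1[j]
            scale = lr / len(batch)
            for i in range(n_visible):                               # momentum updates
                vel_v[i] = momentum * vel_v[i] + scale * g_v[i]
                b_v[i] += vel_v[i]
                for j in range(n_hidden):
                    vel_W[i][j] = momentum * vel_W[i][j] + scale * g_W[i][j]
                    W[i][j] += vel_W[i][j]
            for j in range(n_hidden):
                vel_h[j] = momentum * vel_h[j] + scale * g_h[j]
                b_h[j] += vel_h[j]
    return W, b_v, b_h


def pretrain_dbn(data, hidden_sizes, **rbm_kwargs):
    """Greedy layer-wise pretraining: each RBM is trained on the previous layer's activations."""
    layers, layer_input = [], data
    for n_hidden in hidden_sizes:
        W, _, b_h = train_rbm(layer_input, n_hidden, **rbm_kwargs)
        layers.append((W, b_h))
        layer_input = [hidden_probs(v, W, b_h) for v in layer_input]
    return layers


# Toy stand-in data: 30 rows of 8 inputs in [0, 1] (random values, not the real dataset).
toy_rng = random.Random(1)
toy_data = [[toy_rng.random() for _ in range(8)] for _ in range(30)]
layers = pretrain_dbn(toy_data, hidden_sizes=[6, 4])
print([(len(W), len(W[0])) for W, _ in layers])  # [(8, 6), (6, 4)]
```

Each RBM receives the hidden-unit probabilities of the layer below as its input, which is the sequential learning "up to the last layer" that the text describes. In a full model the stacked weights would then initialize a feed-forward network fine-tuned with backpropagation.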
Trestbps    Continuous    Resting blood pressure (in mm Hg)
Chol        Continuous    Serum cholesterol in mg/dl
Fbs         Discrete      Fasting blood sugar > 120 mg/dl
Thalach     Continuous    Maximum heart rate achieved
Ca          Continuous    Number of major vessels colored by fluoroscopy, ranging between 0 and 3
Diagnosis   Discrete      Diagnosis classes

False Positive (FP) = the number of cases incorrectly identified as patients
True Negative (TN) = the number of cases correctly identified as healthy
False Negative (FN) = the number of cases incorrectly identified as healthy

4.4 Result and Discussion
This section presents the experimental results and analysis for this study. Two classification models, DBN and Fuzzy DBN, were evaluated, using 10-fold cross validation to assess their performance. The proposed algorithms are implemented in MATLAB.
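The confusion-matrix counts above lead to the usual summary measures. The sketch below assumes the standard definitions (accuracy = (TP + TN) / total, error rate = (FP + FN) / total, sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)) and assumes that True Positive (TP) is defined analogously to the other three counts. The function names and the label vectors are hypothetical examples, not results from the paper.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the 'patient' class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1      # patient correctly identified as patient
            else:
                fp += 1      # healthy case incorrectly identified as patient
        else:
            if a == positive:
                fn += 1      # patient incorrectly identified as healthy
            else:
                tn += 1      # healthy case correctly identified as healthy
    return tp, fp, tn, fn


def summarize(tp, fp, tn, fn):
    """Standard measures derived from the four counts (None when undefined)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
    }


# Hypothetical labels: 1 = heart disease present, 0 = healthy.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
print(confusion_counts(actual, predicted))              # (3, 1, 3, 1)
print(summarize(*confusion_counts(actual, predicted)))
```

With these definitions accuracy and error rate always sum to 1, which is a quick sanity check when comparing the per-model figures.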
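The 10-fold validation used to assess the two models can be sketched as below. This is a generic illustration, not the authors' MATLAB procedure: the helper names, the shuffling seed and the majority-class stand-in classifier are assumptions, and the 270 toy labels are placeholders rather than the actual data set.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]            # k nearly equal folds
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_validate(labels, fit_predict, k=10, seed=0):
    """Return per-fold accuracy; fit_predict(train, test) must return predictions for test."""
    accuracies = []
    for train, test in k_fold_indices(len(labels), k, seed):
        predictions = fit_predict(train, test)
        correct = sum(1 for idx, p in zip(test, predictions) if labels[idx] == p)
        accuracies.append(correct / len(test))
    return accuracies


# Placeholder labels and a majority-class baseline standing in for DBN / Fuzzy DBN.
toy_labels = [0] * 150 + [1] * 120


def majority_baseline(train, test):
    ones = sum(toy_labels[i] for i in train)
    majority = 1 if ones * 2 > len(train) else 0
    return [majority] * len(test)


fold_accuracy = cross_validate(toy_labels, majority_baseline, k=10)
print(len(fold_accuracy), sum(fold_accuracy) / len(fold_accuracy))
```

Replacing `majority_baseline` with a function that trains a model on the `train` indices and predicts the `test` indices gives one accuracy value per fold, and the mean of those values is the kind of average figure reported for DBN and FDBN.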
[Figure: line chart comparing Deep Belief Network and Fuzzy Deep Belief Network; horizontal axis 1 to 10, vertical axis 0 to 1.2]
[Figure: line chart of DBN-ERR and FDBN-ERR; horizontal axis 1 to 10, vertical axis 0 to 0.35]
[Figure: AVERAGE ACCURACY, bar chart of ACCURACY for DBN and FDBN; vertical axis 0 to 1]
[Figure: AVERAGE ERROR, bar chart of ERROR for DBN and FDBN; vertical axis 0 to 0.3]
[14] R. Chitra and V. Seenivasagam, “Review of Heart Disease Prediction System using Data Mining and Hybrid Intelligent Techniques”, ICTACT Journal on Soft Computing, Vol. 3, No. 4, pp. 605-609, 2013.

[15] Rupali R. Patil, “Heart Disease Prediction System using Naive Bayes and Jelinek-Mercer Smoothing”, International Journal of Advanced Research in Computer and Communication Engineering, Vol. 3, No. 5, pp. 515-523, 2014.

[16] B. Venkatalakshmi and M.V. Shivsankar, “Heart Disease Diagnosis using Predictive Data mining”, International Journal of Innovative Research in Science, Engineering and Technology, Vol. 3, No. 3, pp. 223-229, 2014.

[17] Andrea D. Souza, “Heart Disease Prediction using Data Mining Techniques”, International Journal of Research in Engineering and Science, Vol. 3, No. 3, pp. 74-77, 2015.

[18] Moloud Adbar et al., “Comparing Performance of Data Mining algorithms in Prediction Heart Diseases”, International Journal of Electrical and Computer Engineering, Vol. 5, No. 6, pp. 1569-1576, 2015.

[19] Jaymin Patel, Teja Upadhyay and Samir Patel, “Heart Disease Prediction using Machine Learning and Data Mining Techniques”, International Journal of Computer Science and Communication, Vol. 7, No. 1, pp. 129-137, 2016.

[20] K. Rajalakshmi and K. Nirmala, “Heart Disease Prediction with Map Reduce by using Weighted Association Classifier and K-Means”, Indian Journal of Science and Technology, Vol. 9, No. 19, pp. 231-237, 2016.

[21] S. Suganya and P. Tamil Selvi, “A Proficient Heart Disease Prediction Method using Fuzzy-Cart Algorithm”, International Journal of Scientific Engineering and Applied Science, Vol. 2, No. 1, pp. 1-6, 2016.

[22] Rishabh Wadhawan, “Prediction of Coronary Heart Disease using Apriori algorithm with Data Mining Classification”, International Journal of Research in Science and Technology, Vol. 3, No. 1, pp. 1-15, 2018.

[23] Suzana Herculano-Houzel and Roberto Lent, “Isotropic Fractionator: A simple, rapid method for the Quantification of Total Cell and Neuron Numbers in the Brain”, The Journal of Neuroscience, Vol. 25, No. 10, pp. 2518-2521, ISSN: 0270-6474, 9 March 2005.

[24] Isabelle Guyon, “A Scaling Law for the Validation Set Training Set Size Ratio”, AT&T Bell Laboratories, 1997.

[25] G. E. Hinton, “Training products of experts by minimizing contrastive divergence”, Neural Computation, Vol. 14, No. 8, pp. 1711-1800, 2002.
IJCATM : www.ijcaonline.org