Heart Failure Prediction EDA & Modeling
Heart failure is a condition in which the heart muscle does not pump blood as well as it
should to meet the body's demands. Blood circulates throughout the body and supplies
oxygen to all of its parts.
Cardiovascular diseases (CVDs) are the number one cause of death globally, taking an
estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Four
out of five CVD deaths are due to heart attacks and strokes, and one-third of these deaths
occur prematurely in people under 70 years of age. Heart failure is a common event caused
by CVDs, and this dataset contains 12 features that can be used to predict death from heart
failure (the DEATH_EVENT column).
localhost:8888/notebooks/Heart_Failure_Prediction_EDA_%26_%26_Modeling_(95_acc%2C_93_f1).ipynb 1/38
5/8/23, 7:04 PM Heart_Failure_Prediction_EDA_&_&_Modeling_(95_acc,_93_f1) - Jupyter Notebook
People with cardiovascular disease, or at high cardiovascular risk due to one or more risk
factors such as hypertension, diabetes, hyperlipidaemia, or already established disease,
need early detection and management, and a machine learning model can be of great help
here.
The dataset contains patient information such as age, sex, blood pressure, smoking status,
diabetes, and ejection fraction.
Library
In [379]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
colors = ['#97C1A9','#FFFFFF']
import imblearn
from collections import Counter
from imblearn.over_sampling import SMOTE
import warnings
warnings.filterwarnings("ignore")
Dataset
In [380]:
data=pd.read_csv('/content/drive/MyDrive/projek/heart_failure_clinical_records_dataset.cs
In [381]:
data.head()
Out[381]:
    age  anaemia  creatinine_phosphokinase  diabetes  ejection_fraction  high_blood_pressure
0  75.0        0                       582         0                 20                    1
1  55.0        0                      7861         0                 38                    0
2  65.0        0                       146         0                 20                    0
3  50.0        1                       111         0                 20                    0
4  65.0        1                       160         1                 20                    0
Dataset Attributes
Data Info
In [382]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 299 entries, 0 to 298
Data columns (total 13 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 age 299 non-null float64
1 anaemia 299 non-null int64
2 creatinine_phosphokinase 299 non-null int64
3 diabetes 299 non-null int64
4 ejection_fraction 299 non-null int64
5 high_blood_pressure 299 non-null int64
6 platelets 299 non-null float64
7 serum_creatinine 299 non-null float64
8 serum_sodium 299 non-null int64
9 sex 299 non-null int64
10 smoking 299 non-null int64
11 time 299 non-null int64
12 DEATH_EVENT 299 non-null int64
dtypes: float64(3), int64(10)
memory usage: 30.5 KB
In [383]:
data.shape
Out[383]:
(299, 13)
In [384]:
data.columns
Out[384]:
In [385]:
data.describe().T
Out[385]:
In [386]:
data.isnull().mean()*100
Out[386]:
age 0.0
anaemia 0.0
creatinine_phosphokinase 0.0
diabetes 0.0
ejection_fraction 0.0
high_blood_pressure 0.0
platelets 0.0
serum_creatinine 0.0
serum_sodium 0.0
sex 0.0
smoking 0.0
time 0.0
DEATH_EVENT 0.0
dtype: float64
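The expression data.isnull().mean()*100 works because the mean of a column of True/False flags is the fraction of True values. A minimal standard-library sketch of the same arithmetic, using a made-up column with one gap (the values are illustrative and not taken from the dataset):

```python
# Hypothetical column with one missing entry (None plays the role of NaN).
toy_column = [75.0, None, 65.0, 50.0]

# One boolean flag per row: True where the value is missing.
is_missing = [value is None for value in toy_column]

# The mean of the flags is the missing fraction; times 100 gives a percentage.
percent_missing = sum(is_missing) / len(is_missing) * 100
print(percent_missing)  # 25.0
```

For the real data every column reports 0.0, so no imputation step is needed.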
EDA
In [387]:
data['age'] = data['age'].astype(int)
data['platelets'] = data['platelets'].astype(int)
df = data.copy(deep = True)
In [388]:
df.loc[df['DEATH_EVENT']==0,'Status']='Survived'
df.loc[df['DEATH_EVENT']==1,'Status']='Not Survived'
In [389]:
col = list(data.columns)
categorical_features = []
numerical_features = []
for i in col:
    # Columns with more than 6 distinct values are treated as numerical
    if len(data[i].unique()) > 6:
        numerical_features.append(i)
    else:
        categorical_features.append(i)
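The rule used above is a heuristic: a column counts as categorical when it has at most 6 distinct values. The sketch below replays that rule on a toy dictionary of columns so the outcome can be checked by hand. The helper name split_features and the toy values are assumptions made for illustration only.

```python
# Toy columns standing in for the DataFrame; values are illustrative.
toy_columns = {
    "anaemia": [0, 1, 0, 1, 1, 0, 0, 1],
    "sex": [1, 1, 0, 1, 0, 0, 1, 1],
    "age": [75, 55, 65, 50, 60, 90, 45, 70],
}

def split_features(columns, max_categories=6):
    """Label a column categorical if it has at most `max_categories` distinct values."""
    categorical, numerical = [], []
    for name, values in columns.items():
        if len(set(values)) > max_categories:
            numerical.append(name)
        else:
            categorical.append(name)
    return categorical, numerical

categorical, numerical = split_features(toy_columns)
print(categorical)  # ['anaemia', 'sex']
print(numerical)    # ['age']
```

Note that the rule looks only at the number of distinct values, so the binary target DEATH_EVENT also lands in the categorical list.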
In [390]:
sns.set(style='white')
plt.subplot(1,2,2)
ax=sns.countplot(data=df, x='Status',palette = colors,edgecolor = 'k')
ax.bar_label(ax.containers[0])
plt.suptitle('Death Event')
Out[390]:
Categorical Features
In [391]:
# Categorical Plot: pie chart, histogram, and count plot split by survival status
def catplot(df,x):
    sns.set(style='white')
    fig = plt.subplots(1,3,figsize = (15,4))
    plt.subplot(1,3,1)
    df[x].value_counts().plot.pie(explode=[0.1,0.1], autopct='%1.1f%%', shadow=True, text
    plt.subplot(1,3,2)
    ax=sns.histplot(data=df,x=x,kde = True,color=colors[0],edgecolor = 'k')
    ax.bar_label(ax.containers[0])
    # ax.set_xlim(-1,2)
    # ax.set_xticks(range(-1,2))
    plt.subplot(1,3,3)
    ax=sns.countplot(data=df, x=x, hue='Status',palette = colors,edgecolor = 'k')
    for container in ax.containers:
        ax.bar_label(container)
    tit = x + ' vs Death Event'
    plt.suptitle(tit)
In [392]:
# for i in range(len(categorical_features)):
# catplot(df,categorical_features[i])
Anemia
In [393]:
catplot(df,'anaemia')
Diabetes
In [394]:
catplot(df,'diabetes')
high_blood_pressure
In [395]:
catplot(df,'high_blood_pressure')
sex
In [396]:
catplot(df,'sex')
smoking
In [397]:
catplot(df,'smoking')
Summary
General Information
Numerical Features
In [398]:
# Numerical Plot: distribution on top, binned counts by survival status below
def numplot(df,x,scale):
    sns.set(style='whitegrid')
    fig = plt.subplots(2,1,figsize = (15,11))
    plt.subplot(2,1,1)
    ax=sns.histplot(data=df, x=x, kde=True,color=colors[0],edgecolor = 'k')
    ax.bar_label(ax.containers[0])
    tit=x + ' distribution'
    plt.title(tit)
    plt.subplot(2,1,2)
    tar=x + '_group'
    Tstr= str(scale)
    tit2=x + ' vs Death Event ( ' + Tstr + ' : 1 )'
    # Bin the feature: each group covers `scale` units of the original value
    df[tar] = [ int(i / scale) for i in df[x]]
    ax=sns.countplot(data=df, x=tar, hue='Status',palette = colors,edgecolor = 'k')
    for container in ax.containers:
        ax.bar_label(container)
    plt.title(tit2)
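The second panel of numplot groups values with int(i / scale), so a group label g covers the original values from g*scale up to (but not including) (g+1)*scale. The sketch below replays that binning on a few made-up ages with scale 5. The helper name to_group and the sample values are assumptions for illustration.

```python
from collections import Counter

def to_group(value, scale):
    """Mirror the `int(i / scale)` binning step used inside numplot."""
    return int(value / scale)

ages = [45, 49, 50, 58, 60, 65, 95]  # illustrative values, not dataset rows
groups = [to_group(age, 5) for age in ages]
print(groups)  # [9, 9, 10, 11, 12, 13, 19]

# Counting the labels gives the bar heights of the grouped count plot.
group_sizes = Counter(groups)
print(group_sizes[9])  # 2

# Translate a group label back to the range of original values it covers.
label = 12
print(label * 5, "to", (label + 1) * 5 - 1)  # 60 to 64
```

This is why the plot titles carry the "( scale : 1 )" suffix: a tick at 12 on the age plot stands for ages 60 to 64, and a tick at 2 on the platelets plot stands for 200,000 to 299,999.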
age
In [399]:
numplot(df,'age',5)
Creatinine Phosphokinase
In [400]:
numplot(df,'creatinine_phosphokinase',100)
ejection_fraction
In [401]:
numplot(df,'ejection_fraction',10)
platelets
In [402]:
numplot(df,'platelets',10**5)
serum_creatinine
In [403]:
numplot(df,'serum_creatinine',1)
serum_sodium
In [404]:
numplot(df,'serum_sodium',5)
time
In [405]:
numplot(df,'time',10)
Summary
Cases of DEATH_EVENT start from the age of 45, with peaks at ages 45, 50, 60, 65, and 70.
DEATH_EVENT cases are frequent for ejection_fraction values from 20 to 60.
serum_creatinine values from 0.6 to 3.0 have a higher probability of leading to a DEATH_EVENT.
serum_sodium values from 127 to 145 point towards a DEATH_EVENT due to heart failure.
DEATH_EVENT cases are high for creatinine_phosphokinase values between 0 (0x100) and
500 (5x100).
platelets values between 0 (0x10^5) and 400,000 (4x10^5) are prone to heart failure leading
to a DEATH_EVENT.
For the time feature, values from 0 (0x10) to 60 (6x10) have a higher probability of leading to a DEATH_EVENT.
age : 50 - 70
creatinine_phosphokinase : 0 - 500
ejection_fraction : 20 - 40
platelets : 200,000 - 300,000
serum_creatinine : 1 - 2
serum_sodium : 130 - 140
time : 0 - 50
General Information
Feature Engineering
Scaling
In [406]:
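# The scalers are not defined in the cells shown; scikit-learn's are assumed here
from sklearn.preprocessing import MinMaxScaler, StandardScaler
mms = MinMaxScaler()   # normalization: rescales a column to the [0, 1] range
ss = StandardScaler()  # standardization: zero mean, unit variance
# Normalization
df['age'] = mms.fit_transform(df[['age']])
df['creatinine_phosphokinase'] = mms.fit_transform(df[['creatinine_phosphokinase']])
df['ejection_fraction'] = mms.fit_transform(df[['ejection_fraction']])
df['serum_creatinine'] = mms.fit_transform(df[['serum_creatinine']])
df['time'] = mms.fit_transform(df[['time']])
# Standardization
df['platelets'] = ss.fit_transform(df[['platelets']])
df['serum_sodium'] = ss.fit_transform(df[['serum_sodium']])
df.head()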
Out[406]:
5 rows × 21 columns
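The two scaling steps differ only in the formula applied to each column. Normalization maps a value v to (v - min) / (max - min), and standardization maps it to (v - mean) / std using the population standard deviation. The sketch below writes both out with the standard library on made-up ejection_fraction values. The helper names min_max and standardize are assumptions for illustration, not part of the notebook.

```python
from statistics import mean, pstdev

def min_max(values):
    """Normalization: rescale so the smallest value is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Standardization: subtract the mean, divide by the population std."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

ejection_fraction = [20, 30, 40, 60]  # illustrative values

normalized = min_max(ejection_fraction)
print(normalized)  # [0.0, 0.25, 0.5, 1.0]

standardized = standardize(ejection_fraction)
print(round(mean(standardized), 6), round(pstdev(standardized), 6))
```

After standardization the column has mean 0 and standard deviation 1 (up to floating-point rounding), while normalization only guarantees the [0, 1] range.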
Correlation
In [407]:
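# Assumed definition: `corr` is not defined in the cells shown in this export
corr = data.corrwith(data['DEATH_EVENT']).sort_values(ascending = False).to_frame('Correlation')
plt.subplots(figsize = (5,5))
sns.heatmap(corr,annot = True,cmap = colors,linewidths = 0.4,linecolor = 'black');
plt.title('DEATH_EVENT Correlation');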
Insight :
Based on the statistical test, we will drop the following features: high_blood_pressure, anaemia,
creatinine_phosphokinase, diabetes, sex, smoking, and platelets.
Based on the general information, we will drop the following features: sex and platelets.
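A correlation heatmap of this kind ranks each feature by its Pearson correlation with DEATH_EVENT, and features whose coefficient is close to zero are candidates for dropping. As a rough illustration of what that coefficient measures, the sketch below computes it by hand for two toy features. The helper name pearson and all values are made up for the example and are not dataset rows.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

death_event = [1, 1, 1, 0, 0, 0]
time = [4, 6, 7, 200, 250, 270]   # short follow-up goes with death in this toy data
smoking = [1, 0, 0, 1, 0, 0]      # same smoking rate in both outcome groups

time_corr = pearson(time, death_event)
smoking_corr = pearson(smoking, death_event)
print(round(time_corr, 2))      # strongly negative
print(round(smoking_corr, 2))   # essentially zero
```

A feature like the toy smoking column carries no linear signal about the target, which is the kind of evidence the drop lists above rely on.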
In [408]:
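# Dataset 1: drop the features listed under the statistical test (assumed from the Insight)
df1=data.drop(columns=['high_blood_pressure','anaemia','creatinine_phosphokinase','diabetes','sex','smoking','platelets'])
# Dataset 2: drop the features listed under the general information (assumed from the Insight)
df2=data.drop(columns=['sex','platelets'])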
Data Balancing
In [409]:
over = SMOTE()
f1 = df1.iloc[:,:5].values
t1 = df1.iloc[:,5].values
f1, t1 = over.fit_resample(f1, t1)
Counter(t1)
Out[409]:
In [410]:
over = SMOTE()
f2 = df2.iloc[:,:10].values
t2 = df2.iloc[:,10].values
f2, t2 = over.fit_resample(f2, t2)
Counter(t2)
Out[410]:
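SMOTE balances the classes by creating synthetic minority samples: it picks a minority point, picks one of its nearest minority neighbours, and places a new point somewhere on the line segment between them. The sketch below is a simplified standard-library illustration of that idea, not imblearn's implementation. The function name smote_like, the neighbour count k=2, and the toy (age, ejection_fraction) points are all assumptions.

```python
import random
from collections import Counter

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating toward a near minority neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k closest other minority points by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

# Toy (age, ejection_fraction) points; class 1 is the minority.
majority = [(60, 38), (55, 40), (50, 35), (65, 45), (70, 60), (45, 50)]
minority = [(75, 20), (80, 25), (72, 30)]

new_points = smote_like(minority, n_new=len(majority) - len(minority))
labels = [0] * len(majority) + [1] * (len(minority) + len(new_points))
class_counts = Counter(labels)
print(class_counts)  # Counter({0: 6, 1: 6})
```

Because every synthetic point is an interpolation, it always lies inside the range spanned by the real minority samples. Counter(t1) and Counter(t2) in the cells above play the same role as class_counts here: they confirm that both classes have equal size after resampling.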
Model
In [411]:
In [412]:
# Imports assumed here; the cell that loaded them is not shown in this export
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.metrics import (RocCurveDisplay, classification_report,
                             confusion_matrix, roc_auc_score)

def model(classifier,x_train,y_train,x_test,y_test):
    sns.set(rc={'figure.figsize':(5,3)})
    sns.set(style='whitegrid')
    # Fit on the training split and predict hard labels for the test split
    classifier.fit(x_train,y_train)
    prediction = classifier.predict(x_test)
    # 10-fold stratified cross-validation, repeated 3 times
    cv = RepeatedStratifiedKFold(n_splits = 10,n_repeats = 3,random_state = 1)
    print("Cross Validation Score : ",'{0:.2%}'.format(cross_val_score(classifier,x_train
    print("ROC_AUC Score : ",'{0:.2%}'.format(roc_auc_score(y_test,prediction)))
    # plot_roc_curve(classifier, x_test,y_test)
    RocCurveDisplay.from_estimator(classifier, x_test,y_test)
    plt.title('ROC_AUC_Plot')
    plt.show()

def model_evaluation(classifier,x_test,y_test):
    # Confusion Matrix, flattened in the order TN, FP, FN, TP
    cm = confusion_matrix(y_test,classifier.predict(x_test))
    names = ['True Neg','False Pos','False Neg','True Pos']
    counts = [value for value in cm.flatten()]
    percentages = ['{0:.2%}'.format(value) for value in cm.flatten()/np.sum(cm)]
    labels = [f'{v1}\n{v2}\n{v3}' for v1, v2, v3 in zip(names,counts,percentages)]
    labels = np.asarray(labels).reshape(2,2)
    sns.heatmap(cm,annot = labels,cmap = 'Blues',fmt ='')
    # Classification Report
    print(classification_report(y_test,classifier.predict(x_test)))
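The classification report printed by model_evaluation is derived from the four confusion-matrix cells. The sketch below works through that arithmetic for the positive class with hypothetical counts that add up to a 61-sample test set. The counts and the helper name summarize are made up for illustration and are not results from this notebook.

```python
def summarize(tn, fp, fn, tp):
    """Accuracy plus precision, recall, and F1 for the positive class."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # of the predicted deaths, how many were real
    recall = tp / (tp + fn)      # of the real deaths, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical cells in the order TN, FP, FN, TP (the order cm.flatten() yields).
metrics = summarize(tn=30, fp=2, fn=4, tp=25)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these counts the accuracy is 55/61, the precision 25/27, the recall 25/29, and the F1 score 50/56. The "macro avg" row of the report averages such per-class values over both classes without weighting, while "weighted avg" weights each class by its number of test samples.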
XGB Classifier
In [413]:
classifier_xgb = XGBClassifier(random_state=1)
model(classifier_xgb,x_train1,y_train1,x_test1,y_test1)
model_evaluation(classifier_xgb,x_test1,y_test1)
accuracy 0.89 61
macro avg 0.88 0.89 0.88 61
weighted avg 0.89 0.89 0.89 61
In [414]:
model(classifier_xgb,x_train2,y_train2,x_test2,y_test2)
model_evaluation(classifier_xgb,x_test2,y_test2)
accuracy 0.90 61
macro avg 0.91 0.89 0.90 61
weighted avg 0.91 0.90 0.90 61
LGBMClassifier
In [415]:
classifier_lgbm = LGBMClassifier(random_state=1)
model(classifier_lgbm,x_train1,y_train1,x_test1,y_test1)
model_evaluation(classifier_lgbm,x_test1,y_test1)
accuracy 0.85 61
macro avg 0.85 0.85 0.85 61
weighted avg 0.85 0.85 0.85 61
In [416]:
model(classifier_lgbm,x_train2,y_train2,x_test2,y_test2)
model_evaluation(classifier_lgbm,x_test2,y_test2)
accuracy 0.92 61
macro avg 0.92 0.92 0.92 61
weighted avg 0.92 0.92 0.92 61
Logistic Regression
In [417]:
classifier_lr = LogisticRegression(random_state = 1)
model(classifier_lr,x_train1,y_train1,x_test1,y_test1)
model_evaluation(classifier_lr,x_test1,y_test1)
accuracy 0.80 61
macro avg 0.80 0.80 0.80 61
weighted avg 0.80 0.80 0.80 61
In [418]:
model(classifier_lr,x_train2,y_train2,x_test2,y_test2)
model_evaluation(classifier_lr,x_test2,y_test2)
accuracy 0.87 61
macro avg 0.87 0.87 0.87 61
weighted avg 0.87 0.87 0.87 61
Support Vector Classifier
In [419]:
classifier_svc = SVC()
model(classifier_svc,x_train1,y_train1,x_test1,y_test1)
model_evaluation(classifier_svc,x_test1,y_test1)
accuracy 0.87 61
macro avg 0.87 0.87 0.87 61
weighted avg 0.88 0.87 0.87 61
In [420]:
model(classifier_svc,x_train2,y_train2,x_test2,y_test2)
model_evaluation(classifier_svc,x_test2,y_test2)
accuracy 0.84 61
macro avg 0.83 0.83 0.83 61
weighted avg 0.84 0.84 0.84 61
Gradient Boosting Classifier
In [421]:
classifier_grad = GradientBoostingClassifier(random_state=1)
model(classifier_grad,x_train1,y_train1,x_test1,y_test1)
model_evaluation(classifier_grad,x_test1,y_test1)
accuracy 0.87 61
macro avg 0.87 0.86 0.87 61
weighted avg 0.87 0.87 0.87 61
In [422]:
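# As printed, this cell re-runs classifier_svc on Dataset 2; classifier_grad appears to be the intended model
model(classifier_svc,x_train2,y_train2,x_test2,y_test2)
model_evaluation(classifier_svc,x_test2,y_test2)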
accuracy 0.84 61
macro avg 0.83 0.83 0.83 61
weighted avg 0.84 0.84 0.84 61
Random Forest Classifier
In [423]:
classifier_rdf = RandomForestClassifier(random_state=1)
model(classifier_rdf,x_train1,y_train1,x_test1,y_test1)
model_evaluation(classifier_rdf,x_test1,y_test1)
accuracy 0.85 61
macro avg 0.86 0.84 0.85 61
weighted avg 0.85 0.85 0.85 61
In [424]:
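# As printed, this cell re-runs classifier_svc on Dataset 2; classifier_rdf appears to be the intended model
model(classifier_svc,x_train2,y_train2,x_test2,y_test2)
model_evaluation(classifier_svc,x_test2,y_test2)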
accuracy 0.84 61
macro avg 0.83 0.83 0.83 61
weighted avg 0.84 0.84 0.84 61
Result
Dataset 1
Dataset 2
From these results it is found that Dataset 2 shows better results and LGBM is the best model
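The comparison can be checked against the accuracy rows of the classification reports above. The sketch below copies those reported accuracies for the four models that were evaluated on both datasets and picks the best model per dataset. Gradient Boosting and Random Forest are left out because their Dataset 2 cells, as printed, evaluate classifier_svc. The dictionary layout and variable names are assumptions for illustration.

```python
# Accuracy values copied from the classification reports printed above.
reported_accuracy = {
    "Dataset 1": {"XGB": 0.89, "LGBM": 0.85, "LogisticRegression": 0.80, "SVC": 0.87},
    "Dataset 2": {"XGB": 0.90, "LGBM": 0.92, "LogisticRegression": 0.87, "SVC": 0.84},
}

summary = {}
for dataset, scores in reported_accuracy.items():
    best_model = max(scores, key=scores.get)
    mean_accuracy = sum(scores.values()) / len(scores)
    summary[dataset] = (best_model, scores[best_model], mean_accuracy)
    print(f"{dataset}: best = {best_model} ({scores[best_model]:.2f}), "
          f"mean accuracy = {mean_accuracy:.4f}")
```

On these numbers Dataset 2 has the higher mean accuracy and LGBM is its top model, while XGB leads on Dataset 1, which is consistent with choosing LGBM on Dataset 2 for tuning.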
Hyperparameter Tuning
In [425]:
In [426]:
# automl = AutoML()
# settings = {
# "time_budget": 1200, # total running time in seconds
# "metric": 'roc_auc', # primary metrics for regression can be chosen from: ['mae','
# "estimator_list": ['lgbm'], # list of ML learners; we tune lightgbm in this exampl
# "task": 'classification', # task type
# "log_file_name": '/content/drive/MyDrive/heart_lg2.log', # flaml log file
# "seed": 1, # random seed
# }
# automl.fit(X_train=x_train2, y_train=y_train2, **settings)
In [427]:
# FLAML hyperparameter tuning with a 20-minute budget returned this configuration
classifier_lgbm = LGBMClassifier(colsample_bytree=0.26649620250942635,
                                 learning_rate=0.02058909150877934, max_bin=127,
                                 min_child_samples=7, n_estimators=184, num_leaves=48,
                                 reg_alpha=0.004090180440029941, reg_lambda=0.0009765625,
                                 verbose=-1)
In [428]:
model(classifier_lgbm,x_train1,y_train1,x_test1,y_test1)
model_evaluation(classifier_lgbm,x_test1,y_test1)
accuracy 0.93 61
macro avg 0.93 0.93 0.93 61
weighted avg 0.93 0.93 0.93 61
In [429]:
model(classifier_lgbm,x_train2,y_train2,x_test2,y_test2)
model_evaluation(classifier_lgbm,x_test2,y_test2)
accuracy 0.95 61
macro avg 0.95 0.95 0.95 61
weighted avg 0.95 0.95 0.95 61
Dataset 1 :
Dataset 2 :