Original Article

Comparison of the Performance of Machine Learning‑based Algorithms for Predicting Depression and Anxiety among University Students in Bangladesh: A Result of the First Wave of the COVID‑19 Pandemic
Downloaded from http://journals.lww.com/shbh by BhDMf5ePHKav1zEoum1tQfN4a+kJLhEZgbsIHo4XMi0hCywCX1AWnYQp/IlQrHD3i3D0OdRyi7TvSFl4Cf3VC4/OAVpDDa8K2+Ya6H515kE= on 06/02/2024

Md. Iqbal Hossain Nayan1, M. Sheikh Giash Uddin2, Md. Ismail Hossain2, Md. Mohibul Alam3, Maliha Afroj Zinnia4, Iqramul Haq5, Md. Moshiur Rahman6, Rejwana Ria4, Md. Injamul Haq Methun7

1Quality Services and Compliance, Square Pharmaceutical Limited, Dhaka, Bangladesh, 2Department of Statistics, Jagannath University, Dhaka, Bangladesh, 3Department of Training, Eskayef Pharmaceuticals Limited, 4Department of Pharmacy, East West University, Departments of 5Agricultural Statistics and 6Pharmacology and Toxicology, Sher‑e‑Bangla Agricultural University, Dhaka, Bangladesh, 7Department of Statistics, Tejgaon College, Dhaka, Bangladesh

Received: 04 January, 2022. Revised: 12 April, 2022. Accepted: 01 May, 2022. Published: 23 May, 2022.

ORCID: Iqramul Haq: https://orcid.org/0000-0002-9183-120X

Address for correspondence: Assist. Prof. Iqramul Haq, Department of Agricultural Statistics, Sher‑e‑Bangla Agricultural University, Dhaka 1207, Bangladesh. E‑mail: [email protected].bd

Access this article online
Website: www.healthandbehavior.com
DOI: 10.4103/shb.shb_38_22

Abstract

Introduction: The purpose of this research was to predict mental illness among university students using various machine learning (ML) algorithms. Methods: A structured questionnaire‑based online survey was conducted on 2121 university students (private and public) living in Bangladesh. After giving informed consent, the participants completed a web‑based survey covering sociodemographic variables and behavioral tests (including the Patient Health Questionnaire (PHQ‑9) scale and the Generalized Anxiety Disorder Assessment‑7 scale). This study applied six well‑known ML algorithms, namely logistic regression, random forest (RF), support vector machine (SVM), linear discriminant analysis, K‑nearest neighbors, and Naïve Bayes, to predict mental illness among university students from Dhaka city in Bangladesh. Results: Of the 2121 eligible respondents, 45% were male and 55% were female, and approximately 76.9% were 21–25 years old. The prevalence of severe depression and severe anxiety was higher for women than for men. Based on various performance parameters, RF outperformed the other models for the prediction of depression (89% accuracy), while SVM gave the best result for the prediction of anxiety (91.49% accuracy). Conclusion: Based on these findings, the RF algorithm and the SVM algorithm were more suitable than the other ML algorithms used in this study for predicting the mental health status of university students in Bangladesh (depression and anxiety, respectively). Finally, this study proposes applying RF and SVM classification when the prediction of mental illness status is the core interest.

Keywords: Anxiety, Bangladesh, COVID‑19, depression, machine learning algorithm, psychological

This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution‑NonCommercial‑ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non‑commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.

For reprints contact: [email protected]

How to cite this article: Nayan MI, Uddin MS, Hossain MI, Alam MM, Zinnia MA, Haq I, et al. Comparison of the performance of machine learning-based algorithms for predicting depression and anxiety among university students in Bangladesh: A result of the first wave of the COVID-19 pandemic. Asian J Soc Health Behav 2022;5:75-84.

Introduction

COVID‑19 started as a local transmission in the city of Wuhan, China, and has become one of the major calamities of the century.[1,2] Having first treated it as a global health emergency, the WHO officially characterized COVID‑19 as a “pandemic” on March 11, 2020 (WHO, 2020). Societies are facing great uncertainty while knowledge is still being developed about the unpredictable spread of this virus and its interaction with societal responses.[3,4]

Mental illness is also increasing at an epidemic rate worldwide, and has been aggravated by fear of COVID‑19.[5,6] Some recent studies have shown that sociodemographic, behavioral, and educational characteristics are the main influencing factors for mental illness, including gender, residence, relationship status, socioeconomic status, loneliness, personal autonomy, and future plans.[7‑11]

From a mental health standpoint, the world population has been affected by COVID‑19 not only through anxiety and trauma but also through unfavorable societal dynamics. In response to the COVID‑19 pandemic, numerous universities across the world have either postponed or canceled all their campus activities. Universities have shifted their programs from face‑to‑face to online delivery.[12,13] While quarantined and away from the university environment and schedule, students may experience stress, anxiety, anger, boredom, loneliness, and other emotions, which have both shorter‑ and longer‑term psychological impacts on health.[14‑16] These affect students’ energy level, concentration, reliability, mental ability, and optimism, thereby affecting student performance.[15,16]

© 2022 Asian Journal of Social Health and Behavior | Published by Wolters Kluwer - Medknow 75
Nayan, et al.: Machine learning‑based algorithms for predicting depression and anxiety level in Bangladesh

The Government of Bangladesh declared a public holiday on March 26, 2020, to lessen the transmission of COVID‑19.[16] Since then, all schools, colleges, and universities have been closed, hampering students’ studies, daily routines, and daily habits, which in turn affects their mental health. Furthermore, home quarantine, physical distancing, and other restrictions also affected students psychologically and hindered their mental well‑being.[17,18] On top of that, the unpredictable situation, news, rumors, and misinformation can also raise negative thoughts among university students about their future.[19,20] All of this together can bring hopelessness, fear of death, and frustration among quarantined students.[19]

Before COVID‑19, mental health studies reported an elevated level of moderate to extremely severe depression (52.2%), anxiety (58.1%), and stress (24.9%) among university students in Bangladesh.[21] Yet very little information is available on the mental health of students during the COVID‑19 pandemic, so acquiring structured measures of depression, anxiety, and stress can help to estimate the need for interventions to diminish the mental health impacts of the pandemic on students.

Mental health is an indicator of a person’s emotional, psychological, and social well‑being. There are many causes of mental illness (depression and anxiety). Academic performance, occupational status, and family status are considered the most important factors leading to depression and anxiety. Furthermore, previous investigations found that different variables, for example, sex, age, and marital status, were significant factors related to depression and anxiety status.[7‑10] In addition, proper diagnosis and intervention reduce the risk of depression and anxiety. Various statistical methods have been used to evaluate the mental health status of university students. The main objective of an estimation procedure is the correct prediction of depression and anxiety status using a machine learning (ML) algorithm. ML, a scientific method at the intersection of artificial intelligence and statistical learning research, can be used to search large amounts of data for unknown associations or trends.[22] ML algorithms that build models for prediction purposes have performed very well on classification problems in comparison to classical statistical models. Furthermore, ML has become common within the field of health and medical research.[23]

Many researchers have used ML algorithms such as random forest (RF) trees, support vector machines (SVMs), and convolutional neural networks to predict anxiety and depression.[24] In Sau et al.’s study, different ML algorithms such as logistic regression (LR), Catboost, Naïve Bayes (NB), RF, and SVM were used for classification.[25] In addition, ML can be used to better predict the risk of depression and anxiety. Therefore, the purpose of this study is to compare the prediction performance of six well‑known ML algorithms, namely LR, RF, SVM, linear discriminant analysis (LDA), K‑nearest neighbors (KNN), and NB, in predicting mental health status (depression and anxiety) among university students in Bangladesh.

Methods

Participants and procedure

The research was a prospective cross‑sectional survey, and the data were collected through an online questionnaire using a Google Form. This self‑administered rapid online survey was conducted between August 2020 and September 2020. Students of public and private universities operating in Dhaka city who had internet access were able to respond to the questionnaire and were eligible for this analysis. To fulfill the objective, the authors designed a structured questionnaire collecting highly pertinent facts about mental health, arranged it into a “Google Form,” and sent it to students of various public and private universities through social media, asking them to provide informative responses.

A multistage sampling technique was applied for this study. In the first stage, four universities (two public and two private) were chosen randomly from Dhaka city. In the second stage, six departments were chosen from each of the selected universities. In the third stage, 100 students were randomly chosen from each of the selected departments from the 1st year to the 5th year (additionally, graduate students for some departments). Here, students were numbered consecutively within each year according to their ID numbers. From this design, a total of 2400 students were selected for an interview, and 2350 of them consented. Of the 2350 respondents, 2121 answered the structured questionnaire through the “Google Form,” and their information was stored accurately. The study sample size was therefore 2121 and the response rate was 90.25%. Almost 10% of the students did not participate in this study after reading the consent information. Among the students, 55.1% were women and 44.9% were men. In this analysis, the students came from all over Bangladesh and could be representative of the entire population.

Ethical consideration

The Research Ethics Committee of the Department of Agricultural Statistics at Sher‑e‑Bangla Agricultural University, Dhaka‑1207, granted ethical approval to conduct the present study. All participants were informed about the purpose of the study, the anonymity of their identity was ensured, and consent was obtained from all of them.

Variables and measures

The main response variables of this study were the depression status and anxiety status of the participants, and two widely used instruments, the Patient Health Questionnaire (PHQ‑9) and the Generalized Anxiety Disorder Assessment (GAD‑7), were used to measure depressive symptoms and anxiety symptoms, respectively.

Patient Health Questionnaire

The Patient Health Questionnaire (PHQ‑9) was used to screen for the presence of depression through nine self‑administered items. Every item is rated on a 4‑point Likert scale ranging from 0 (not at all) to 3 (nearly every day) and asks about the existence and frequency of depressive symptoms experienced by the respondent in the last 14 days.[26,27] The total PHQ‑9 score ranges from 0 to 27, and the recommended severity cut‑off scores are: None (<5), mild (5–9), moderate (10–14), moderately severe (15–19), and severe (>19). The internal consistency of this scale was high in the present study (Cronbach’s alpha = 0.813).

Generalized Anxiety Disorder Assessment

GAD‑7 is a valid and reliable tool for epidemiological surveys. It is a questionnaire consisting of 7 items, each scored on a four‑point Likert scale ranging from 0 (not at all) to 3 (nearly every day).[28,29] It is used to screen for anxiety and to assess its severity. According to scores of 0–4, 5–9, 10–14, and 15–21, anxiety levels are divided into four categories: minimal, mild, moderate, and severe. In the present study, Cronbach’s alpha was 0.878.

Independent variables

As independent variables, we used a set of socioeconomic and demographic factors; all of those related to depressive symptoms were considered covariates. The factors are listed below.
1. Gender (male, female)
2. Current age (<20, 21–25, >25)
3. Current living status (with family, without family)
4. Occupation status (student, job holder, unemployed)
5. Educational year (1st/2nd, 3rd/4th/5th and graduate)
6. Family status (higher class, middle class, lower class)
7. Marital status (married, unmarried)
8. Cumulative grade point average (CGPA) (≤3.00, 3.01–3.50, >3.50).

Statistical analysis

The relationship between the explanatory variables and depression and anxiety status was tested in a bivariate setting using the Chi‑square test of independence. Mathematically, the Chi‑square statistic can be defined as follows:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ and $E_i$ are the observed and expected frequencies of cell $i$, respectively, and $n$ is the number of cells. The statistic follows the Chi‑square distribution with (number of rows – 1) × (number of columns – 1) degrees of freedom.

Machine learning algorithms

We used six different supervised algorithms to predict the level of depression and anxiety among university students.

Logistic regression

LR is a “statistical learning” technique, a “supervised” ML method used specifically for “classification” tasks. It uses the maximum likelihood estimation procedure to estimate the parameters of interest. Let $X_{i1}, X_{i2}, \ldots, X_{ip}$ be a set of explanatory variables, which can be quantitative variables or indicator variables that refer to the levels of categorical variables, and let $Y_i$ be a binary variable that follows a Bernoulli distribution with parameter $\pi_i$. The logit regression model is then

$$\log\left(\frac{\pi_i}{1 - \pi_i}\right) = \beta_0 + \beta_1 X_{i1} + \ldots + \beta_p X_{ip},$$

where $\beta_0, \beta_1, \ldots, \beta_p$ are the parameters.

Random forest

RF is an ensemble learning‑based classification approach in which a large number of decision trees are constructed in the training process and the final output combines the outcome classes of the individual decision trees.[30]

Support vector machine

SVM is one of the most popular classification algorithms and handles nonlinear data well by transforming it. Chen Junli and Jiao Licheng explained the classification strategy of SVM well.[31] A linear SVM model was used in this study to predict mental health status.

Linear discriminant analysis

LDA is a supervised ML technique used to extract important features from a data set. When the classes are well separated, the parameter estimates of the LR model are unstable. In this case, LDA is used.

K‑nearest neighbors

The KNN algorithm is one of the simplest and most widely used classification algorithms. It has proved effective on multiclass classification problems and has good generalization ability.[32] The algorithm stores every available case and classifies new cases based on similarity measures.

Naive Bayes

The NB classifier is a probabilistic classifier based on Bayes’ theorem with a strong (naive) assumption of independence between the features. The NB model is easy to construct because it requires no complicated iterative parameter estimation, which makes it particularly useful in the medical field. Although simple, NB classifiers usually perform well and are widely used because they can outperform more complex classification methods.[33]
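The severity bands quoted for the PHQ‑9 and GAD‑7 totals can be written down as a small lookup. The sketch below is only an illustration of those cut‑offs, not the authors' code; the function names are hypothetical, and the label "minimal" for the lowest GAD‑7 band (0–4) is an assumption based on the usual GAD‑7 convention.

```python
def total_score(item_scores, n_items):
    """Sum Likert item scores (each 0-3) after basic validation."""
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(item_scores)}")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2 or 3")
    return sum(item_scores)


def phq9_severity(total):
    """Map a PHQ-9 total (0-27) to the severity band given in the text."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must lie between 0 and 27")
    if total < 5:
        return "none"
    if total <= 9:
        return "mild"
    if total <= 14:
        return "moderate"
    if total <= 19:
        return "moderately severe"
    return "severe"


def gad7_severity(total):
    """Map a GAD-7 total (0-21) to a band; 'minimal' is an assumed label."""
    if not 0 <= total <= 21:
        raise ValueError("GAD-7 total must lie between 0 and 21")
    if total <= 4:
        return "minimal"
    if total <= 9:
        return "mild"
    if total <= 14:
        return "moderate"
    return "severe"


# Hypothetical respondent: nine PHQ-9 answers and seven GAD-7 answers.
phq_total = total_score([2, 1, 3, 2, 1, 2, 2, 1, 2], 9)
gad_total = total_score([1, 1, 2, 0, 1, 1, 0], 7)
print(phq_total, phq9_severity(phq_total))
print(gad_total, gad7_severity(gad_total))
```

How the study collapsed these bands into the binary "severe" and "nonsevere" outcome is not spelled out in this section, so the sketch stops at the band labels.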

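To make the logit model in the logistic regression subsection concrete, the sketch below inverts the log‑odds to obtain the predicted probability for one respondent. The intercept, coefficients and covariate coding are hypothetical values chosen for illustration; they are not estimates from this study.

```python
import math


def predicted_probability(intercept, coefficients, x):
    """Return pi = 1 / (1 + exp(-(b0 + b1*x1 + ... + bp*xp)))."""
    if len(coefficients) != len(x):
        raise ValueError("coefficients and covariates must have equal length")
    linear_predictor = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical model: two indicator covariates (e.g. female = 1, unemployed = 0).
beta_0 = -1.5            # assumed intercept
betas = [0.9, 0.6]       # assumed slopes
respondent = [1, 0]

pi = predicted_probability(beta_0, betas, respondent)
log_odds = math.log(pi / (1.0 - pi))   # recovers beta_0 + 0.9 * 1 + 0.6 * 0
print(round(pi, 4), round(log_odds, 4))
```

Taking the log‑odds of the returned probability gives back the linear predictor, which is exactly the relationship the model equation states.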

Proposed approach

First, data preparation methods were applied, for example, identifying missing values in the data set and processing them. Subsequently, 75% of the individual samples in each group (the training data set) were used to fit the ML algorithms, and the remaining 25% of the individuals (the test data set) were used for verification. All models were trained with 10‑fold cross‑validation on the training set, and performance was evaluated on the test set.

Model evaluation

Six evaluation parameters were taken into account: accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and Cohen’s kappa. All data analyses were completed using the Statistical Package for Social Sciences (SPSS) version 25 (IBM Corporation, Armonk, New York, NY, USA) and the R programming language (version 4.0.0, R Core Team).

Results

Socio‑demographic characteristics of university students

Table 1 lists basic demographic and psychological characteristics, such as gender, age, residence, occupation, years of education, family status, marital status, and CGPA. In this study, slightly more than half of the respondents (55%) were female, and more than three‑quarters (77%) belonged to the 21–25 years age group. Most of the respondents (86.8%) lived with their families. The majority (approximately 87%) of the respondents reported their occupation as student, while the rest were job holders or unemployed. Nearly half of the respondents (45%) were studying in the 3rd, 4th, or 5th year. Approximately two‑thirds of the respondents (67%) were from households with average income (middle‑income families). Almost 43% of the respondents had a CGPA between 3.01 and 3.50.

Table 1: Percentage distribution of university students to selected sociodemographic characteristics in Bangladesh
Covariates            Frequency (n=2121), n (%)
Gender
  Male                952 (44.9)
  Female              1169 (55.1)
Age (years)
  <20                 276 (13.0)
  21-25               1630 (76.9)
  >25                 215 (10.1)
Residence
  With family         1841 (86.8)
  Without family      280 (13.2)
Occupation
  Student             1843 (86.9)
  Job holder          141 (6.6)
  Unemployed          137 (6.5)
Educational year
  1st/2nd             849 (40.0)
  3rd/4th/5th         947 (44.6)
  Graduate            325 (15.3)
Family status
  Higher class        508 (24.0)
  Middle class        1414 (66.7)
  Lower class         199 (9.4)
Marital status
  Married             281 (13.2)
  Unmarried           1840 (86.8)
CGPA (4.00 Scale)
  ≤3.00               448 (21.1)
  3.01-3.50           920 (43.4)
  >3.50               753 (35.5)
Depression status
  Nonsevere           1554 (73.3)
  Severe              567 (26.7)
Anxiety status
  Nonsevere           1654 (78.0)
  Severe              467 (22.0)
CGPA: Cumulative grade point average

Responses to the PHQ‑9 and generalized anxiety disorder‑7 questionnaire

This section describes the responses to the 9 questions of the Patient Health Questionnaire and the 7 questions of the Generalized Anxiety Disorder questionnaire. Both questionnaires are scored on a 4‑point Likert scale, ranging from 0 (not at all) to 3 (nearly every day). The answers are shown in Figure 1.

Association with depression and anxiety among university students

The prevalence of severe depression and severe anxiety by the background characteristics of the selected covariates is shown in Tables 2 and 3, respectively.

In terms of depression level, the percentage of severely depressed students was higher among female students (37.1%), respondents under the age of twenty (34.1%), respondents living with their families (27.4%), unemployed respondents (41.6%), respondents from poor families (37.2%), 1st/2nd year students (34.0%), unmarried respondents (28.0%), and students with a CGPA of 3.01–3.50 (31.6%). All these selected covariates were significantly related to the depressive symptoms of university students (P ≤ 0.03 for every covariate; Table 2).

Table 3 shows the association between the socioeconomic and demographic characteristics and anxiety status among university students in Bangladesh. According to Table 3, the relationship between the students’ gender and their anxiety status was highly significant (P < 0.001). The highest rate of severe anxiety was found in female


Figure 1: Percentage responses of PHQ-9 and GAD-7. GAD-7: Generalized anxiety disorder-7

students (26.1%). There was also a significant association between the age of the students (in years) and their anxiety status (P = 0.03). Severe anxiety was most common among students aged more than 25 years (31.2%). Severe anxiety was also more frequent among respondents who were currently studying (22.9%) and among 1st‑year/2nd‑year students (29.8%). Table 3 also shows an association between CGPA and anxiety status (P < 0.001): around 30% of the students with the lowest CGPA had a severe anxiety level.

Performance parameter of machine learning algorithms

In this study, six different ML algorithms were used to classify the levels of depression and anxiety among university students in the test data set as severe or nonsevere. The predictive performance of these algorithms is compared using performance parameters such as accuracy, sensitivity, and specificity. In addition, Cohen’s κ statistic was used to determine the discrimination accuracy of each algorithm. Tables 4 and 5 show the prediction results for depression and anxiety status, respectively, with the performance parameters of each ML algorithm (for the training and test data sets).

The performance indicators used to predict depression are shown in Table 4. Using linear discriminant analysis, the accuracy in the test data set was 74.29%, the sensitivity was 22.69%, and the specificity was 93.04%. SVM (linear) showed 74.10% accuracy, 19.14% sensitivity, and 94.07% specificity in predicting the depression status of the test observations. The accuracy, sensitivity, and specificity of LR were 74.48%, 22.69%, and 93.29%, respectively. By contrast, the accuracy, sensitivity, and specificity of the k‑nearest neighbors algorithm were 88.28%, 66.67%, and 96.13%, respectively. Naive Bayes showed an accuracy of 68.24%, a sensitivity of 25.53%, and a specificity of 83.76% in predicting the depression status of the test observations. Among the six classifiers, the best result was achieved by the RF algorithm, with an accuracy of 88.66%, a sensitivity of 68.79%, and a specificity of 95.88%.

The Cohen’s kappa statistics of linear discriminant analysis, SVM (linear), naive Bayes, and LR were 0.1931, 0.1664, 0.1027, and 0.1968, respectively, which corresponds to only slight to fair agreement. However, among all the ML algorithms applied, the RF algorithm showed the greatest discriminative ability (Cohen’s kappa = 0.6903).

Table 5 (performance indicators for predicting anxiety status) shows that using linear discriminant analysis, the accuracy in the test data set was 79.58%, the sensitivity was 14.87%, and the specificity was 98.75%. RF showed an accuracy of 91.30%, a sensitivity of 70.25%, and a specificity of 97.55% in predicting the anxiety level of the test observations. The accuracy, sensitivity, and specificity of the LR classifier were 78.07%, 14.87%, and 96.81%, respectively. However, the k‑nearest neighbor


Table 2: Assessing association between selected covariates and depression status among university students in Bangladesh using the Chi‑square test
Depression level using PHQ‑9 method
Covariates            Severe depression (%)   Nonsevere depression (%)   P
Gender                                                                   <0.001
  Male                14.0                    86.0
  Female              37.1                    62.9
Age (years)                                                              0.01
  <20                 34.1                    65.9
  21-25               24.8                    75.2
  >25                 32.1                    67.9
Residence                                                                0.03
  With family         27.4                    72.6
  Without family      22.1                    77.9
Occupation                                                               <0.001
  Student             25.8                    74.2
  Job holder          24.8                    75.2
  Unemployed          41.6                    58.4
Educational year                                                         <0.001
  1st/2nd             34.0                    66.0
  3rd/4th/5th         19.6                    80.4
  Graduate            28.3                    71.7
Family status                                                            <0.001
  Rich                17.7                    82.3
  Middle              28.5                    71.5
  Poor                37.2                    62.8
Marital status                                                           <0.001
  Married             18.1                    81.9
  Unmarried           28.0                    72.0
CGPA (4.00 Scale)                                                        <0.001
  ≤3.00               22.3                    77.7
  3.01-3.50           31.6                    68.4
  >3.50               23.4                    76.6
CGPA: Cumulative grade point average, PHQ‑9: Patient health questionnaire

Table 3: Assessing the association between selected covariates and anxiety status among university students in Bangladesh using the Chi‑square test
Anxiety level using GAD‑7 method
Covariates            Severe anxiety (%)      Nonsevere anxiety (%)      P
Gender                                                                   <0.001
  Male                17.0                    83.0
  Female              26.1                    73.9
Age (years)                                                              0.03
  <20                 21.7                    78.3
  21-25               20.9                    79.1
  >25                 31.2                    68.8
Residence                                                                0.51
  With family         22.0                    78.0
  Without family      22.1                    77.9
Occupation                                                               0.04
  Student             22.9                    77.1
  Job holder          17.0                    83.0
  Unemployed          15.3                    84.7
Educational year                                                         <0.001
  1st/2nd             29.8                    70.2
  3rd/4th/5th         15.5                    84.5
  Graduate            20.6                    79.4
Family status                                                            0.01
  Rich                16.1                    83.9
  Middle              24.1                    75.9
  Poor                22.1                    77.9
Marital status                                                           0.34
  Married             23.1                    76.9
  Unmarried           21.8                    78.2
CGPA (4.00 Scale)                                                        <0.001
  ≤3.00               29.9                    70.1
  3.01-3.50           21.2                    78.8
  >3.50               18.3                    81.7
GAD‑7: Generalized Anxiety Disorder Assessment, CGPA: Cumulative grade point average
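As a worked instance of the Chi‑square test of independence behind these tables, the sketch below recomputes the statistic for gender against depression status. The cell counts are not reported in the article: they are back‑calculated here (an assumption) from the group sizes in Table 1 (952 men, 1169 women) and the row percentages in Table 2 (14.0% and 37.1% severe), rounded to whole students. No continuity correction is applied, and the function name is hypothetical.

```python
def chi_square_independence(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Rows: male, female. Columns: severe, nonsevere depression.
# Counts are back-calculated from reported percentages, not taken from the source.
gender_by_depression = [
    [133, 819],   # about 14.0% of 952 men severe
    [434, 735],   # about 37.1% of 1169 women severe
]

chi2, dof = chi_square_independence(gender_by_depression)
CRITICAL_0_001_DF1 = 10.828   # chi-square critical value, df = 1, alpha = 0.001
print(round(chi2, 2), dof, chi2 > CRITICAL_0_001_DF1)
```

Under these assumed counts the statistic is far above the 0.001 critical value for one degree of freedom, which agrees with the P < 0.001 reported for gender in Table 2.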
algorithm showed the same accuracy as RF, with accuracy, sensitivity, and specificity of 91.30%, 73.55%, and 96.57%, respectively. Naive Bayes showed an accuracy of 73.53%, a sensitivity of 24.13%, and a specificity of 94.11% when predicting the anxiety status of the test respondents.

Among these six classifiers, the best result was achieved by the SVM (linear) algorithm, with an accuracy of 91.49%, a sensitivity of 67.77%, and a specificity of 98.53%. According to the Cohen’s kappa value of the SVM classifier, the reliability exceeded 60% (κ = 0.7333).

The violin plot shows the relationship of the six classifiers to accuracy. The shaded areas show the distribution of the data for each classifier. In terms of depression, Figure 2a shows that RF provided the highest mean accuracy, followed by KNN, LR, LDA, SVM, and NB. Figure 2b shows that SVM provided the highest mean accuracy for anxiety, followed by RF, KNN, LDA, LR, and NB. Unlike the boxplot, the entire distribution

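The training‑set and test‑set figures in the performance tables come from the procedure described under "Proposed approach": a 75%/25% split followed by 10‑fold cross‑validation on the training portion. The sketch below shows one simple way such index sets could be produced. It is an assumption‑laden illustration rather than the authors' R code: it uses a plain random split (the article's wording, "in each group", suggests the original split was stratified by outcome), and the function names and seed are hypothetical.

```python
import random


def train_test_indices(n, train_fraction=0.75, seed=0):
    """Shuffle 0..n-1 and cut it into training and test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n))
    return indices[:cut], indices[cut:]


def k_fold_indices(indices, k=10):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        training = [idx for f, fold in enumerate(folds) if f != held_out
                    for idx in fold]
        yield training, validation


train_idx, test_idx = train_test_indices(2121)       # 2121 respondents
cv_splits = list(k_fold_indices(train_idx, k=10))
print(len(train_idx), len(test_idx), len(cv_splits))
```

Each model would be fitted on the training part of every fold and scored on the held‑out part, with the untouched test indices reserved for the final evaluation.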

Table 4: Performance indicators of all six machine learning algorithms to predict depression status
Algorithms         LR            RF            SVM           LDA           KNN           NB
Training set
Accuracy (%)       75.63         89.76         74.62         75.31         89.38         70.79
95% CI             73.44-77.72   88.17-91.21   72.41-76.75   73.12-77.42   87.77-90.86   68.49-73.02
κ                  0.2537        0.7220        0.1835        0.2474        0.7099        0.1504
Sensitivity (%)    28.17         71.36         19.95         28.17         69.72         26.06
Specificity (%)    92.97         96.48         94.60         92.54         96.57         87.14
PPV                59.40         88.12         57.43         57.97         88.13         42.52
NPV                77.98         90.22         76.38         77.90         89.72         76.33
Test set
Accuracy (%)       74.48         88.66         74.1          74.29         88.28         68.24
95% CI             70.54-78.14   85.64-91.23   70.15-77.79   70.34-77.97   85.23-90.9    64.09-72.19
κ                  0.1968        0.6903        0.1664        0.1931        0.6769        0.1027
Sensitivity (%)    22.69         68.79         19.14         22.69         66.67         25.53
Specificity (%)    93.29         95.88         94.07         93.04         96.13         83.76
PPV                55.17         85.84         54.00         54.23         86.24         36.36
NPV                76.85         89.42         76.20         76.80         88.81         75.58
LR: Logistic regression, RF: Random forest, SVM: Support vector machine, LDA: Linear discriminant analysis, KNN: K‑nearest neighbors, NB: Naïve Bayes, PPV: Positive predictive value, NPV: Negative predictive value, CI: Confidence interval

Table 5: Performance indicators of the six machine learning algorithms to predict anxiety status
Algorithms         LR            RF            SVM           LDA           KNN           NB
Training set
Accuracy (%)       80.78         93.09         92.27         81.78         92.9          76.26
95% CI             78.76-82.69   91.73-94.29   90.85-93.54   79.8-83.65    91.53-94.11   74.09-78.33
κ                  0.2565        0.7852        0.7525        0.2775        0.7834        0.1448
Sensitivity (%)    22.54         76.59         70.81         22.25         78.61         28.09
Specificity (%)    96.95         97.67         98.23         98.31         96.87         95.18
PPV                67.24         90.14         91.76         78.57         87.46         31.81
NPV                81.84         93.76         92.38         81.99         94.22         78.85
Test set
Accuracy (%)       78.07         91.3          91.49         79.58         91.3          73.53
95% CI             74.3-81.53    88.57-93.56   88.78-93.73   75.89-82.94   88.57-93.56   69.56-77.25
κ                  0.1583        0.7334        0.7333        0.1909        0.7399        0.1827
Sensitivity (%)    14.87         70.25         67.77         14.87         73.55         24.13
Specificity (%)    96.81         97.55         98.53         98.77         96.57         94.11
PPV                58.06         89.47         93.18         78.26         86.41         17.24
NPV                79.31         91.71         91.16         79.64         92.49         76.80
CI: Confidence interval, LR: Logistic regression, RF: Random forest, SVM: Support vector machine, LDA: Linear discriminant analysis, KNN: K‑nearest neighbors, NB: Naïve Bayes, PPV: Positive predictive value, NPV: Negative predictive value
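The six evaluation parameters in these tables all derive from a 2 × 2 confusion matrix. The sketch below computes them for one matrix. The article does not report confusion matrices, so the counts used here are an assumption: they were back‑calculated so that they reproduce the RF test‑set row for depression in Table 4, taking a test set of 529 students with 141 severe cases. The function name is hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, PPV, NPV and Cohen's kappa."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Chance agreement: product of actual and predicted marginal proportions.
    p_positive = ((tp + fn) / n) * ((tp + fp) / n)
    p_negative = ((tn + fp) / n) * ((tn + fn) / n)
    expected = p_positive + p_negative
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Assumed counts ("severe" is the positive class), chosen to match Table 4.
rf_depression_test = binary_metrics(tp=97, fn=44, fp=16, tn=372)
for name, value in rf_depression_test.items():
    print(f"{name}: {value:.4f}")
```

With these assumed counts the function returns 88.66% accuracy, 68.79% sensitivity, 95.88% specificity, 85.84% PPV, 89.42% NPV and κ of about 0.6903, matching the values listed for RF on the depression test set.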

of the 10‑fold accuracy can be visualized in this violin plot [Figure 2].

Discussion

To the best of our knowledge, this is the first study to apply several machine learning classifiers to predict depression and anxiety status among university students during the first wave of COVID‑19 in Bangladesh.

In Bangladesh, mental illness has not been treated as being as important as other public health issues such as malnutrition, yet mental health problems are now becoming a major issue around the world.[34] Few studies have been conducted on mental disorders among college students.[35,36] Thus, the main focus of the study was to predict the state of mental disorders such as depression and anxiety among Bangladeshi university students. To fulfill our study objective, this study used six well‑known ML algorithms. All models were trained with 10‑fold cross‑validation on the training set, and performance was estimated on the test set.

The prevalence of severe depression and severe anxiety was higher for females than for males. A recent systematic

Asian Journal of Social Health and Behavior | Volume 5 | Issue 2 | April-June 2022 81
Nayan, et al.: Machine learning‑based algorithms for predicting depression and anxiety level in Bangladesh

Figure 2: Violin plots showing depression accuracy (a) and anxiety accuracy (b) for each machine learning classifier

review indicates that female gender is significantly associated with increased depressive symptoms.[37] This
study found a high prevalence of depression and anxiety among teenage students who were currently studying.
There are many psychological reasons for the higher percentage of severe mental health problems among
students. For instance, students felt insecure about their future careers during the pandemic.[5] Lower
academic performance is also an important and significant indicator of increasing mental health problems among
university students.[38]

In the comparison of supervised models, the RF algorithm achieved the best accuracy for depression across the
various performance parameters. For anxiety, however, the SVM showed the best predictive performance, with
91% accuracy. Compared with SVMs in previous research, it performed with a reasonable level of accuracy among
all classifiers.[39] In other areas of public health, such as malnutrition and anemia in children, the RF has great
predictive potential.[40,41] In research related to mental health or the medical sector, RF and support vector
algorithms have mostly been used by several researchers to predict psychiatric disorders and disease outcomes,
respectively.[42‑46] Finally, this study proposes that RF and support vector machine classification be extended
to settings in Bangladesh where the concern is predicting mental health problems such as depression and anxiety.

Limitations
Almost all studies have some impediments, and the current research is not without limitations that must be taken
seriously during data interpretation. First, this research is cross‑sectional, so it cannot establish a causal
relationship. In‑person data collection was not possible during the lockdown; for that reason, we used an online
structured questionnaire. Second, the study surveyed online a small number of students from four universities, who
do not represent university students across the whole country. Finally, because the survey was online, a few
students were unable to complete the questionnaire by themselves or may not have provided accurate information
because of shame about their mental state.

However, this investigation established that ML algorithms can be used to predict mental illness based on common
risk factors, which can assist in the development of interventions to prevent severe mental health issues among
students, particularly university students in Bangladesh.

Conclusions
This study focuses mainly on comparing the performance of ML‑based algorithms for predicting depression and
anxiety among university students during the first wave of the COVID‑19 pandemic in Bangladesh. This research
is also important for policymakers, since Bangladesh ranks second‑highest in the world for temporary university
closures imposed in an attempt to contain the spread of the COVID‑19 pandemic. During the first wave of the
COVID‑19 pandemic, students faced severe mental health problems. Based on the study findings, the prevalence of
severe depression and severe anxiety among university students was 26.7% and 22%, respectively. In summary,
we conclude that among ML classifiers, RF is best for predicting depression and SVM is best for predicting
anxiety. Finally, we suggest that researchers focus on RF and SVMs for predicting depression and anxiety,
respectively, among university students.

Acknowledgments
The authors would like to express the deepest gratitude to all the enthusiastic and volunteer respondents who participated


in this research. Many thanks also to the developers of the 9‑item Patient Health Questionnaire depression
scale (PHQ‑9) and the 7‑item GAD‑7 anxiety scale.

Declaration of respondent consent
The authors certify that they have obtained all appropriate university student consent.

Financial support and sponsorship
Nil.

Conflicts of interest
There are no conflicts of interest.

References
1. Eurosurveillance Editorial Team. Note from the editors: World Health Organization declares novel coronavirus (2019-nCoV) sixth public health emergency of international concern. Euro Surveill 2020;25:1-2. doi: 10.2807/1560-7917.ES.2020.25.5.200131e.
2. Bhattacharyya R, Chatterjee S, Bhattacharyya S, Gupta S, Das S, Banerjee B. Attitude, practice, behavior, and mental health impact of COVID-19 on doctors. Indian J Psychiatry 2020;62:257.
3. Atchison CJ, Bowman L, Vrinten C, Redd R, Pristera P, Eaton JW, et al. Early perceptions and behavioural responses during the COVID-19 pandemic: A cross-sectional survey of UK adults. BMJ Open 2021;11:e043577.
4. Verity R, Okell LC, Dorigatti I, Winskill P, Whittaker C, Imai N, et al. Estimates of the severity of coronavirus disease 2019: A model-based analysis. Lancet Infect Dis 2020;20:669-77.
5. Rajabimajd N, Alimoradi Z, Griffiths M. Impact of COVID-19-related fear and anxiety on job attributes: A systematic review. Asian J Soc Health Behav 2021;4:51.
6. Pramukti I, Strong C, Sitthimongkol Y, Setiawan A, Pandin MG, Yen CF, et al. Anxiety and suicidal thoughts during the COVID-19 pandemic: Cross-country comparative study among Indonesian, Taiwanese, and Thai university students. J Med Internet Res 2020;22:e24487.
7. Sumathi B. Prediction of mental health problems among children using machine learning techniques. Int J Adv Comput Sci Appl 2016;7:1-6.
8. Abdallah AR, Gabr HM. Depression, anxiety and stress among first year medical students in an Egyptian Public University. Int Res J Med Med Sci 2014;2:11-9.
9. Alim SA, Rabbani MG, Karim E, Mullick MS, Mamun AA, Fariduzzaman. Assessment of depression, anxiety and stress among first year MBBS students of a public medical college, Bangladesh. Bangladesh J Psychiatry 2017;29:23-9.
10. Beiter R, Nash R, McCrady M, Rhoades D, Linscomb M, Clarahan M, et al. The prevalence and correlates of depression, anxiety, and stress in a sample of college students. J Affect Disord 2015;173:90-6.
11. Alimoradi Z, Ohayon MM, Griffiths MD, Lin CY, Pakpour AH. Fear of COVID-19 and its association with mental health-related factors: Systematic review and meta-analysis. BJPsych Open 2022;8:e73.
12. Gewin V. Five tips for moving teaching online as COVID-19 takes hold. Nature 2020;580:295-6.
13. Sahu P. Closure of universities due to coronavirus disease 2019 (COVID-19): Impact on education and mental health of students and academic staff. Cureus 2020;12:e7541.
14. Boyraz G, Legros DN. Coronavirus disease (COVID-19) and traumatic stress: Probable risk factors and correlates of posttraumatic stress disorder. J Loss Trauma 2020;25:503-22.
15. Islam MS, Ferdous MZ, Potenza MN. Panic and generalized anxiety during the COVID-19 pandemic among Bangladeshi people: An online pilot survey early in the outbreak. J Affect Disord 2020;276:30-7.
16. Islam MS, Sujan MS, Tasnim R, Sikder MT, Potenza MN, van Os J. Psychological responses during the COVID-19 outbreak among university students in Bangladesh. PLoS One 2020;15:e0245083.
17. Rubin GJ, Wessely S. The psychological effects of quarantining a city. BMJ 2020;368:m313.
18. Brooks SK, Webster RK, Smith LE, Woodland L, Wessely S, Greenberg N, et al. The psychological impact of quarantine and how to reduce it: Rapid review of the evidence. Lancet 2020;395:912-20.
19. Zandifar A, Badrfam R. Iranian mental health during the COVID-19 epidemic. Asian J Psychiatr 2020;51:101990.
20. Cherry K. How Does Quarantine Affect Your Mental Health? [Internet]. Verywell Mind. 2020. Available from: https://www.verywellmind.com/protect-your-mental-health-during-quarantine-4799766. [Last accessed on 2021 Jan 27].
21. Mamun MA, Hossain MD, Griffiths MD. Mental health problems and associated predictors among Bangladeshi students. Int J Ment Health Addict 2022;20:657-71.
22. Alghamdi M, Al-Mallah M, Keteyian S, Brawner C, Ehrman J, Sakr S. Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project. PLoS One 2017;12:e0179805.
23. Kotsiantis SB, Zaharakis ID, Pintelas PE. Machine learning: A review of classification and combining techniques. Artif Intell Rev 2006;26:159-90.
24. Priya A, Garg S, Tigga NP. Predicting anxiety, depression and stress in modern life using machine learning algorithms. Procedia Comput Sci 2020;167:1258-67.
25. Sau A, Bhakta I. Screening of anxiety and depression among the seafarers using machine learning technology. Inform Med Unlocked 2019;16:100149.
26. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: Validity of a brief depression severity measure. J Gen Intern Med 2001;16:606-13.
27. Kroenke K, Spitzer RL. The PHQ-9: A new depression diagnostic and severity measure. Psychiatr Ann 2002;32:509-15.
28. Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalized anxiety disorder: The GAD-7. Arch Intern Med 2006;166:1092-7.
29. Haque M, Das C, Ara R, Alam M, Ullah S, Hossain Z. Prevalence of generalized anxiety disorder and its effect on daily living in the rural community of Rajshahi. J Teach Assoc 2018;27:14-23.
30. Zhang T, Su J, Xu Z, Luo Y, Li J. Sentinel-2 satellite imagery for urban land cover classification by optimized random forest classifier. Appl Sci 2021;11:543.
31. Junli C, Licheng J. Classification mechanism of support vector machines. WCC 2000 - ICSP 2000. 2000 5th International Conference on Signal Processing Proceedings. 16th World Computer Congress 2000; 2000.
32. Güvenç E, Cetin GÇ, Koçak H. Comparison of KNN and DNN classifiers performance in predicting mobile phone price ranges. Adv Artif Intell Res 2021;1:19-28.
33. Vembandasamy K, Sasipriya R, Deepa E. Heart diseases detection using naive bayes algorithm. Int J Innov Sci Eng Technol 2015;2:1-4.


34. Bayram N, Bilgel N. The prevalence and socio-demographic correlations of depression, anxiety and stress among a group of university students. Soc Psychiatry Psychiatr Epidemiol 2008;43:667-72.
35. Kumari R, Langer B, Jandial S, Gupta RK, Raina SK, Singh P. Psycho-social health problems: Prevalence and associated factors among students of professional colleges in Jammu. Indian J Community Health 2019;31:43-9.
36. Lun KW, Chan CK, Ip PK, Ma SY, Tsai WW, Wong CS, et al. Depression and anxiety among university students in Hong Kong. Hong Kong Med J 2018;24:466-72.
37. Akanni O, Olashore A, Fela-Thomas A, Khutsafalo K. The psychological impact of COVID-19 on health-care workers in African countries: A systematic review. Asian J Soc Health Behav 2021;4:85.
38. Agnafors S, Barmark M, Sydsjö G. Mental health and academic performance: A study on selection and causation effects from childhood to early adulthood. Soc Psychiatry Psychiatr Epidemiol 2021;56:857-66.
39. Srividya M, Mohanavalli S, Bhalaji N. Behavioral modeling for mental health using machine learning algorithms. J Med Syst 2018;42:88.
40. Talukder A, Ahammed B. Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh. Nutrition 2020;78:110861.
41. Khare S, Kavyashree S, Gupta D, Jyotishi A. Investigation of nutritional status of children based on machine learning techniques using Indian demographic and health survey data. Procedia Comput Sci 2017;115:338-49.
42. Aguiar-Pulido V, Gestal M, Fernandez-Lozano C, Rivero D, Munteanu CR. Applied computational techniques on schizophrenia using genetic mutations. Curr Top Med Chem 2013;13:675-84.
43. Aguiar-Pulido V, Seoane JA, Rabuñal JR, Dorado J, Pazos A, Munteanu CR. Machine learning techniques for single nucleotide polymorphism-disease classification models in schizophrenia. Molecules 2010;15:4875-89.
44. Vivian-Griffiths T, Baker E, Schmidt KM, Bracher-Smith M, Walters J, Artemiou A, et al. Predictive modeling of schizophrenia from genomic data: Comparison of polygenic risk score with kernel support vector machines approach. Am J Med Genet B Neuropsychiatr Genet 2019;180:80-5.
45. Li C, Yang C, Gelernter J, Zhao H. Improving genetic risk prediction by leveraging pleiotropy. Hum Genet 2014;133:639-50.
46. Acikel C, Aydin Son Y, Celik C, Gul H. Evaluation of potential novel variations and their interactions related to bipolar disorders: Analysis of genome-wide association study data. Neuropsychiatr Dis Treat 2016;12:2997-3004.
